p-Index (2020-2025): 0.408
This author has published in the following journal: Telematika
Abadi, Friska
Lambung Mangkurat University

Published: 2 Documents
Articles

Found 2 Documents

Comparative Analysis of Distance Metrics in KNN and SMOTE Algorithms for Software Defect Prediction
Maulidha, Khusnul Rahmi; Faisal, Mohammad Reza; Saputro, Setyo Wahyu; Abadi, Friska; Nugrahadi, Dodon Turianto; Adi, Puput Dani Prasetyo; Hariyady, Hariyady
Telematika Vol 18, No 1: February (2025)
Publisher : Universitas Amikom Purwokerto

DOI: 10.35671/telematika.v18i1.3008

Abstract

As the complexity and scale of software projects increase, new challenges arise in handling software defects. One solution is machine learning-based software defect prediction, for example with the K-Nearest Neighbors (KNN) algorithm. However, KNN's performance can be hindered by its majority-vote mechanism and by the choice of distance or similarity metric, especially on imbalanced datasets. This research compares the effectiveness of the Euclidean, Hamming, Cosine, and Canberra distance metrics on KNN performance, both before and after applying SMOTE (Synthetic Minority Over-sampling Technique). Results show significant improvements in AUC and F1 values across the datasets after applying SMOTE. With SMOTE, Euclidean distance produced an AUC of 0.7752 and an F1 of 0.7311 on the EQ dataset, while Canberra distance with SMOTE yielded an AUC of 0.7707 and an F1 of 0.6342 on the JDT dataset. The LC dataset improved to an AUC of 0.6752 and an F1 of 0.3733, while the ML dataset climbed to 0.6845 and 0.4261 with Canberra distance. Lastly, with SMOTE and Canberra distance, the PDE dataset improved to 0.6580 and 0.3957. The findings confirm that SMOTE, combined with a suitable distance metric, significantly boosts KNN's prediction accuracy, with a P-value of 0.0001.
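
A minimal sketch of the kind of comparison the abstract describes is shown below, assuming scikit-learn and imbalanced-learn. The EQ, JDT, LC, ML, and PDE defect datasets are not reproduced here; the synthetic imbalanced dataset, the choice of k = 5, and the min-max scaling step are illustrative assumptions rather than the authors' exact setup.

# Sketch only: synthetic data stands in for the defect datasets used in the paper.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import roc_auc_score, f1_score
from imblearn.over_sampling import SMOTE

# Imbalanced stand-in data (roughly 10% "defective" modules).
X, y = make_classification(n_samples=1000, n_features=20, weights=[0.9, 0.1],
                           random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)

# Scale features to [0, 1] so the distance metrics operate on comparable ranges.
scaler = MinMaxScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

def evaluate(metric, Xtr, ytr):
    """Fit KNN with the given distance metric and report test AUC and F1."""
    knn = KNeighborsClassifier(n_neighbors=5, metric=metric, algorithm="brute")
    knn.fit(Xtr, ytr)
    auc = roc_auc_score(y_test, knn.predict_proba(X_test)[:, 1])
    f1 = f1_score(y_test, knn.predict(X_test))
    return auc, f1

# SMOTE is applied to the training split only, never to the test split.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)

# Hamming is normally meant for discrete features; it still runs on scaled data.
for metric in ["euclidean", "hamming", "cosine", "canberra"]:
    before = evaluate(metric, X_train, y_train)
    after = evaluate(metric, X_res, y_res)
    print(f"{metric:>10}  before AUC/F1 = {before[0]:.3f}/{before[1]:.3f}  "
          f"after AUC/F1 = {after[0]:.3f}/{after[1]:.3f}")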
Automatic Analysis of Natural Disaster Messages on Social Media Using IndoBERT and Multilingual BERT
Safitri, Yasmin Dwi; Faisal, Mohammad Reza; Kartini, Dwi; Saragih, Triando Hamonangan; Abadi, Friska; Bachtiar, Adam Mukharil
Telematika Vol 18, No 2: August (2025)
Publisher : Universitas Amikom Purwokerto

DOI: 10.35671/telematika.v18i2.3140

Abstract

Information about natural disasters disseminated through social media can serve as an important data source for mitigation processes and early warning systems. Social media platforms, such as X (formerly known as Twitter), have become primary channels for conveying real-time information, especially during disaster emergencies. With the large amount of unstructured disaster-related text that must be processed, the main challenge is accurately filtering and classifying messages into three categories: eyewitness, non-eyewitness, and don't know. This research compares the performance of four BERT-based natural language processing models, namely IndoBERT, IndoBERT with Masked Language Modeling (MLM), Multilingual BERT, and Multilingual BERT with MLM, in classifying Indonesian-language disaster messages. The dataset was obtained from previous research and publicly available data on GitHub, consisting of annotated messages related to floods, earthquakes, and forest fires. The method is a deep learning approach using the hold-out technique with an 80:20 ratio for training and testing data; the same ratio is then used to split the training data into training and validation subsets, with stratification to maintain balanced class proportions. In addition, variations in batch size were explored to evaluate their effect on the stability of model performance. The results show that the IndoBERT model achieved the highest performance on the flood and earthquake datasets, with accuracies of 80.67% and 81.50%, respectively, while IndoBERT with MLM pre-training recorded the highest accuracy on the forest fire dataset at 88.33%. Overall, IndoBERT demonstrated the most consistent and superior performance across datasets compared to the other models. These findings indicate that IndoBERT has strong capabilities in understanding Indonesian disaster-related text, and the results can serve as a foundation for developing automatic classification systems to support real-time disaster monitoring and early warning applications.
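
A minimal sketch of the hold-out fine-tuning setup the abstract describes, assuming the Hugging Face transformers library. The checkpoint name (indobenchmark/indobert-base-p1), the toy Indonesian messages, and the hyperparameters are illustrative assumptions; the annotated flood, earthquake, and forest-fire messages from GitHub are not reproduced, and the separate validation split is collapsed into the test split for brevity.

# Sketch only: toy data and a public IndoBERT checkpoint, not the authors' exact pipeline.
import torch
from sklearn.model_selection import train_test_split
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

LABELS = ["eyewitness", "non-eyewitness", "don't know"]  # classes from the abstract
base_texts = ["Banjir di depan rumah saya, air sudah masuk!",   # hypothetical examples
              "Berita: gempa dilaporkan di wilayah selatan.",
              "Semoga semua selamat dari kebakaran hutan."]
texts, labels = base_texts * 5, [0, 1, 2] * 5  # repeated so stratification has enough samples

# Stratified 80:20 hold-out split, as described in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42)

model_name = "indobenchmark/indobert-base-p1"  # one public IndoBERT checkpoint (assumption)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name,
                                                           num_labels=len(LABELS))

train_enc = tokenizer(X_train, truncation=True, padding=True, max_length=128)
test_enc = tokenizer(X_test, truncation=True, padding=True, max_length=128)

class MessageDataset(torch.utils.data.Dataset):
    """Wraps tokenizer output and integer labels for the Trainer."""
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[idx])
        return item

# Batch size is one of the settings the paper varies; 16 is an arbitrary example.
args = TrainingArguments(output_dir="indobert-disaster", num_train_epochs=3,
                         per_device_train_batch_size=16)
trainer = Trainer(model=model, args=args,
                  train_dataset=MessageDataset(train_enc, y_train),
                  eval_dataset=MessageDataset(test_enc, y_test))
trainer.train()
print(trainer.evaluate())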