Dimensionality Reduction Using Principal Component Analysis and Feature Selection Using Genetic Algorithm with Support Vector Machine for Microarray Data Classification
Dwi Kartini; Rahmat Amin Badali; Muliadi Muliadi; Dodon Turianto Nugrahadi; Fatma Indriani; Setyo Wahyu Saputro
Indonesian Journal of Electronics, Electromedical Engineering, and Medical Informatics Vol. 7 No. 1 (2025): February
Publisher: Jurusan Teknik Elektromedik, Politeknik Kesehatan Kemenkes Surabaya, Indonesia

DOI: 10.35882/mr7x9713

Abstract

DNA microarrays are used to analyze the expression of many genes simultaneously and play a critical role in cancer detection. Preparing a DNA microarray starts with isolating RNA from the sample; the RNA is then converted into cDNA and scanned to generate gene expression data. However, the data produced by this process are high-dimensional, which can degrade the performance of predictive models for cancer detection, so dimensionality reduction is needed to reduce data complexity. This study analyzes the impact of applying Principal Component Analysis (PCA) for dimensionality reduction, Genetic Algorithm (GA) for feature selection, and their combination on microarray data classification using Support Vector Machine (SVM). The datasets used are microarray datasets for breast cancer, ovarian cancer, and leukemia. The research methodology consists of preprocessing, PCA for dimensionality reduction, GA for feature selection, data splitting, SVM classification, and evaluation. In the results, PCA dimensionality reduction combined with GA feature selection and SVM classification achieved the best performance compared with the other classification schemes. For the breast cancer dataset, the highest accuracy was 73.33%, with recall 0.74, precision 0.75, and F1 score 0.73. For the ovarian cancer dataset, the highest accuracy was 98.68%, with recall 0.98, precision 0.99, and F1 score 0.99. For the leukemia dataset, the highest accuracy was 95.45%, with recall 0.94, precision 0.97, and F1 score 0.95. The study concludes that combining PCA for dimensionality reduction with GA for feature selection can simplify microarray data and improve the accuracy of the SVM classification model. These results underline the effectiveness of PCA and GA in enhancing the classification performance of microarray data.
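
The workflow named in the abstract (preprocessing, PCA, GA feature selection, SVM classification, evaluation) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn and NumPy, uses a synthetic stand-in for a microarray matrix, and every parameter (20 principal components, a linear kernel, the population size, generation count, and mutation rate) is a hypothetical choice for illustration. It also splits the data before fitting PCA so the held-out samples do not influence the projection, which differs from the step order listed in the abstract.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for a microarray matrix: few samples, many features.
X, y = make_classification(n_samples=80, n_features=500, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Preprocessing and PCA, both fitted on the training split only.
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=20, random_state=0).fit(scaler.transform(X_train))
Z_train = pca.transform(scaler.transform(X_train))
Z_test = pca.transform(scaler.transform(X_test))


def fitness(mask):
    """Mean 3-fold CV accuracy of a linear SVM on the selected components."""
    if not mask.any():
        return 0.0
    svm = SVC(kernel="linear")
    return cross_val_score(svm, Z_train[:, mask], y_train, cv=3).mean()


def genetic_select(n_features, pop_size=12, generations=8, mutation_rate=0.05):
    """Toy GA over boolean masks: truncation selection, one-point
    crossover, bit-flip mutation. All settings are illustrative."""
    pop = rng.random((pop_size, n_features)) < 0.5
    best_mask, best_score = None, -1.0
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)[::-1]
        if scores[order[0]] > best_score:
            best_mask, best_score = pop[order[0]].copy(), scores[order[0]]
        parents = pop[order[: pop_size // 2]]  # keep the fitter half
        children = []
        while len(children) < pop_size:
            a, b = parents[rng.integers(len(parents), size=2)]
            cut = rng.integers(1, n_features)  # one-point crossover
            child = np.concatenate([a[:cut], b[cut:]])
            child ^= rng.random(n_features) < mutation_rate  # bit flips
            children.append(child)
        pop = np.array(children)
    return best_mask, best_score


mask, cv_score = genetic_select(Z_train.shape[1])
clf = SVC(kernel="linear").fit(Z_train[:, mask], y_train)
test_accuracy = clf.score(Z_test[:, mask], y_test)
print(f"selected {mask.sum()} of {mask.size} components, "
      f"CV accuracy {cv_score:.2f}, test accuracy {test_accuracy:.2f}")
```

Here the GA searches over subsets of principal components rather than raw genes, which is one way to read "PCA combined with GA". Scoring each candidate mask by cross-validated SVM accuracy on the training split keeps the test split untouched until the final evaluation.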