Evaluation of Supervised Learning Algorithms on an Oversampled Heart Disease Dataset
Abstract
Heart disease cases in Indonesia increase every year, and the disease has become the leading cause of death in the country, particularly among people of productive age. An unbalanced diet and an unhealthy lifestyle contribute to its high prevalence. The medical field has begun to adopt computer-based automatic prediction models to support precise and accurate diagnosis. Heart disease data, however, is often imbalanced: the minority class contains fewer records than the majority class. Oversampling techniques such as SMOTE and ADASYN are therefore used to address this problem. In this study, the Random Forest Classifier was the best of the compared models, with an accuracy of approximately 90.71%. Combining SMOTE oversampling with Random Forest raised accuracy to approximately 94.54%, with an area under the ROC curve of 98.4%. An accurate diagnostic model can help medical personnel take appropriate preventive measures and improve the quality of patient care.
Keywords: ADASYN, Classification, Decision Tree, Regression, SMOTE
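The oversampling idea named in the abstract can be illustrated with a small sketch. The code below is not the study's implementation; it is a minimal, pure-Python illustration of SMOTE-style interpolation, in which each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority-class neighbours. The function name `smote_sketch`, its parameters, and the toy data are assumptions made for this example only.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the base sample (itself excluded)
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # the new point lies on the segment between base and the chosen neighbour
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Toy, made-up data: 18 majority samples and 4 minority samples.
majority = [(float(x), float(y)) for x in range(6) for y in range(3)]
minority = [(10.0, 10.0), (11.0, 10.5), (10.5, 11.5), (12.0, 11.0)]

# Generate just enough synthetic points to match the majority class size.
new_points = smote_sketch(minority, n_new=len(majority) - len(minority), k=3)
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints "18 18"
```

In practice a maintained implementation would normally be preferred, for example the `SMOTE` and `ADASYN` classes in the imbalanced-learn package, each of which exposes `fit_resample`. Whichever implementation is used, oversampling is usually applied to the training split only, so that synthetic samples do not leak into the evaluation data.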
DOI: https://doi.org/10.26760/mindjournal.v8i2.242-253
____________________________________________________________
ISSN (print): 2338-8323 | ISSN (electronic): 2528-0902
Published by:
Informatika, Institut Teknologi Nasional Bandung
Address: Gedung 2, Jl. PHH. Mustofa 23, Bandung 40124
Contact: Tel. 7272215 (ext. 181), Fax 7202892
Email: mind.journal@itenas.ac.id
____________________________________________________________
This journal is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.