Evaluation of Supervised Learning Algorithms on an Oversampled Heart Disease Dataset

ANIS FITRI NUR MASRURIYAH, HILDA YULIA NOVITA, CICI EMILIA SUKMAWATI, SITI NOVIANTI NURAINI ARIF, ANGGA RAMDA RAMADHAN

Abstract

Heart disease cases increase every year, and the disease has become the leading cause of death in Indonesia, particularly among people of productive age. Unbalanced diets and unhealthy lifestyles contribute to its high prevalence. Medicine has begun to adopt computer-based automatic prediction models to support precise and accurate diagnosis. Heart disease data, however, is often imbalanced: the minority class contains fewer samples than the majority class. This study therefore applies the oversampling techniques SMOTE and ADASYN to address the imbalance. Among the algorithms compared, the Random Forest classifier performed best, with an accuracy of about 90.71%. Combining SMOTE oversampling with Random Forest raised accuracy to about 94.54%, with a ROC curve score of 98.4%. An accurate diagnostic model can help medical personnel take appropriate preventive measures and improve the quality of patient care.

Keywords: ADASYN, Classification, Decision Tree, Regression, SMOTE
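The abstract does not say how the oversampling was implemented, so the following is only a minimal pure-Python sketch of the basic SMOTE idea: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours. The function names `smote` and `balance`, the parameter defaults, and the toy feature values are assumptions made for illustration; they are not taken from the paper. ADASYN and the Random Forest training step are not shown.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors from one class.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by distance to the base point.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def balance(X, y, minority_label, k=5, seed=0):
    """Oversample minority_label until it matches the largest class."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    counts = {}
    for label in y:
        counts[label] = counts.get(label, 0) + 1
    target = max(counts.values())
    new_points = smote(minority, target - len(minority), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Toy usage with made-up two-feature records (not data from the study).
X_toy = [[63, 145], [58, 130], [45, 120], [50, 128], [39, 118], [70, 160], [67, 155]]
y_toy = [0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X_toy, y_toy, minority_label=1, k=1)
```

In a real experiment the oversampling would normally be applied to the training split only, so that synthetic points do not leak into the data used to measure accuracy. Whether the study did this is not stated in the abstract.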




DOI: https://doi.org/10.26760/mindjournal.v8i2.242-253



____________________________________________________________

ISSN (print): 2338-8323 | ISSN (electronic): 2528-0902

Published by:

Informatika Institut Teknologi Nasional Bandung

Address: Gedung 2, Jl. PHH. Mustofa 23, Bandung 40124

Contact: Tel. 7272215 (ext. 181), Fax. 7202892

Email: mind.journal@itenas.ac.id

____________________________________________________________


This journal is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
