An Unsupervised Learning Approach to Health Segmentation: A Comparison of K-Means and DBSCAN

ANIS FITRI NUR MASRURIYAH, MARDIAH MARDIAH, MUHAMMAD DWI ANANDA, KARENINA NURMELITA MALIK

Abstract

Health segmentation based on medical examination data helps support disease prevention strategies. This study compares two clustering methods, K-Means and DBSCAN, evaluating each with the Silhouette Score and the Davies-Bouldin Index to identify the more effective segmentation approach. Among the configurations tested, K-Means with 8 clusters performs best, with a Silhouette Score of 0.2972 and a Davies-Bouldin Index of 1.2934. DBSCAN attains a Silhouette Score of 0.2837, which indicates that density-based clustering is also effective at grouping the data. On this basis, K-Means with 8 clusters is selected as the best method for health segmentation in this study. These findings can support medical data analysis for more effective and more personalized disease prevention.

Keywords: Health Segmentation, Clustering, K-Means, DBSCAN, Silhouette Score, Davies-Bouldin Index
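The two evaluation metrics named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the helper names `mean_silhouette` and `davies_bouldin`, the toy points, and both labelings are hypothetical and stand in for the medical examination data and the K-Means and DBSCAN outputs. A higher Silhouette Score and a lower Davies-Bouldin Index both indicate more compact, better-separated clusters.

```python
import math
from collections import defaultdict


def _group(points, labels):
    """Collect the points belonging to each cluster label."""
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("both metrics need at least two clusters")
    return clusters


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = _group(points, labels)
    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the point's own cluster
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(point, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def davies_bouldin(points, labels):
    """Davies-Bouldin Index: mean worst-case cluster similarity (>= 0)."""
    clusters = _group(points, labels)
    centroids = {
        label: tuple(sum(axis) / len(members) for axis in zip(*members))
        for label, members in clusters.items()
    }
    # Scatter: mean distance from each member to its cluster centroid
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    worst = [
        max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
        for i in clusters
    ]
    return sum(worst) / len(worst)


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
labels_a = [0, 0, 0, 0, 1, 1, 1, 1]  # labeling that matches the groups
labels_b = [0, 1, 0, 1, 0, 1, 0, 1]  # labeling that mixes the groups

for name, labels in (("A", labels_a), ("B", labels_b)):
    print(name, mean_silhouette(points, labels), davies_bouldin(points, labels))
```

On real data the same comparison is more commonly run with scikit-learn's `silhouette_score` and `davies_bouldin_score`. How DBSCAN noise points are treated before scoring is a separate choice that the abstract does not specify.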



DOI: https://doi.org/10.26760/mindjournal.v10i1.99-113



____________________________________________________________

ISSN (Print): 2338-8323 | ISSN (Online): 2528-0902

Published by:
Department of Informatics, Institut Teknologi Nasional Bandung

Address:
Building 2, Jl. PHH Mustofa No. 23, Bandung 40124, Indonesia

Contact:
Phone: +62-22-7272215 (ext. 181) Fax: +62-22-7202892

Email: mind.journal@itenas.ac.id


This journal is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
