Student Retention Prediction Using the Random Forest Algorithm with Genetic Algorithm Optimization
Abstract
Class imbalance biases conventional models toward the majority class when predicting student retention. This study proposes an early warning model that combines the Synthetic Minority Over-sampling Technique (SMOTE) for data balancing with a Random Forest (RF) classifier. To avoid the inefficiency of manual hyperparameter tuning, a Genetic Algorithm (GA) is applied to search the hyperparameter space globally. Tests on the historical dataset of the 2021 student cohort at STT Terpadu Nurul Fikri show that combining SMOTE with GA-RF is effective. The hybrid model achieved 99% overall accuracy, with a Precision of 1.00 and a Recall of 0.67 on the minority (dropout) class. Feature Importance analysis indicates that student retention is driven mainly by first-year semester grade point average (Indeks Prestasi Semester, IPS) and by an administrative factor, namely admission through the independent selection path.
Keywords: Dropout Prediction, Imbalanced Data, SMOTE, Random Forest, Genetic Algorithm
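The abstract reports high accuracy alongside a lower minority-class Recall. The sketch below illustrates how these figures can coexist on imbalanced data and why accuracy alone can hide majority bias. The confusion-matrix counts are hypothetical values chosen only to be consistent with the reported metrics; they are not taken from the paper, and the `Confusion` class is an illustrative helper, not part of the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Binary confusion matrix with dropout as the positive class."""
    tp: int  # dropouts correctly flagged
    fp: int  # retained students wrongly flagged
    fn: int  # dropouts missed
    tn: int  # retained students correctly passed

    @property
    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)

    @property
    def precision(self) -> float:
        predicted_pos = self.tp + self.fp
        return self.tp / predicted_pos if predicted_pos else 0.0

    @property
    def recall(self) -> float:
        actual_pos = self.tp + self.fn
        return self.tp / actual_pos if actual_pos else 0.0


# Hypothetical counts (NOT from the paper): 300 test students, 9 dropouts.
model = Confusion(tp=6, fp=0, fn=3, tn=291)
# Baseline that labels every student as retained (pure majority bias).
majority = Confusion(tp=0, fp=0, fn=9, tn=291)

for name, cm in (("model", model), ("majority baseline", majority)):
    print(f"{name}: accuracy={cm.accuracy:.2f} "
          f"precision={cm.precision:.2f} recall={cm.recall:.2f}")
```

Under these assumed counts, the majority baseline still reaches 0.97 accuracy while detecting no dropouts, which is why Precision and Recall on the dropout class are the more informative figures for an early warning system.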
DOI: https://doi.org/10.26760/mindjournal.v11i1.115-126
____________________________________________________________
ISSN (Print): 2338-8323 | ISSN (Online): 2528-0902
Published by:
Informatics Study Program (Program Studi Informatika), Institut Teknologi Nasional Bandung
Address:
Gedung 2 Informatika, Jl. PHH Mustofa No. 23, Bandung 40124, Indonesia
Contact:
Tel: +62-22-7272215 (ext. 181) Fax: +62-22-7202892
Email: mind.journal@itenas.ac.id
______________________________
This journal is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.