Student Retention Prediction Using the Random Forest Algorithm with Genetic Algorithm Optimization

PUDY PRIMA, AHMAD RIO ADRIANSYAH, ALFIAN NUR USYAID

Abstract

Class imbalance biases conventional models toward the majority class when predicting student retention. This study proposes an early warning system that combines the Synthetic Minority Over-sampling Technique (SMOTE) for data balancing with a Random Forest (RF) classifier. To avoid the inefficiency of manual hyperparameter search, a Genetic Algorithm (GA) is applied to optimize the hyperparameters globally. In tests on historical data for the 2021 student cohort of STT Terpadu Nurul Fikri, the combination of SMOTE and GA-RF reached 99% overall accuracy, with a precision of 1.00 and a recall of 0.67 on the minority (dropout) class. Feature importance analysis indicates that student persistence is driven mainly by the first-year semester grade point average (Indeks Prestasi Semester, IPS) and by an administrative factor, namely admission through the independent selection path.

Keywords: Dropout Prediction, Imbalanced Data, SMOTE, Random Forest, Genetic Algorithm
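The reported metrics can be read together with a short calculation. The sketch below, in Python, computes the F1 score implied by the reported precision (1.00) and recall (0.67), and then shows why accuracy alone is a weak indicator under class imbalance. The confusion-matrix counts are hypothetical: they were chosen only because they are consistent with the reported figures, and the abstract does not state the actual test-set size or class counts. The function names are likewise illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 for the positive (dropout) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# F1 implied by the precision and recall reported in the abstract.
reported_f1 = f1_score(1.00, 0.67)

# HYPOTHETICAL counts (not from the paper): 100 test students, 3 dropouts.
# The model flags 2 of the 3 dropouts and raises no false alarms.
model = metrics_from_counts(tp=2, fp=0, fn=1, tn=97)

# Baseline that always predicts "retained" on the same hypothetical data.
baseline = metrics_from_counts(tp=0, fp=0, fn=3, tn=97)

print(f"F1 from reported precision/recall: {reported_f1:.2f}")
print("model   :", {k: round(v, 2) for k, v in model.items()})
print("baseline:", {k: round(v, 2) for k, v in baseline.items()})
```

Under these assumed counts, the always-"retained" baseline still scores 97% accuracy while detecting no dropouts, which is why precision and recall on the minority class carry more information than overall accuracy. The reported precision and recall correspond to an F1 of about 0.80.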








DOI: https://doi.org/10.26760/mindjournal.v11i1.115-126



____________________________________________________________

ISSN (Print): 2338-8323 | ISSN (Online): 2528-0902

Published by:
Informatics Study Program, Institut Teknologi Nasional Bandung

Address:
Gedung 2 Informatika, Jl. PHH Mustofa No. 23, Bandung 40124, Indonesia

Contact:
Tel: +62-22-7272215 (ext. 181) Fax: +62-22-7202892

Email: mind.journal@itenas.ac.id


This journal is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
