Poor academic performance among students is a problem in most countries around the world. It has many negative effects on students, such as boredom that can lead to dropping out and feelings of guilt toward friends and other students. Poor academic results also greatly affect teaching effectiveness and the reputation of the school. Predicting the learning outcomes of high school students can therefore help educators find innovative and effective ways to support teachers and students in improving the quality of teaching and learning. In this work, machine learning models were used to predict the academic performance of high school students. The models were built from a dataset of 21,222 student records from high schools in Ca Mau province, Vietnam, comprising 2,545 (11.99%) very good students, 7,859 (37.03%) good students, 8,099 (38.16%) average students, 2,531 (11.93%) poor students, and 188 (0.89%) very poor students. With the Synthetic Minority Oversampling Technique (SMOTE) used to balance the dataset before it was fed into the machine learning models, the Random Forest, XGBoost, and LightGBM models gave the best results, with accuracies of 81.69%, 80.86%, and 80.82%, respectively. In addition, the features that contribute most to predicting academic performance were extracted: Grade Point Average (GPA) of semesters 1 and 2, Age, Class, Academic Performance of semesters 1 and 2, Father’s occupation, Mother’s occupation, and Learning online.
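The balancing step described above can be illustrated with a minimal, standard-library-only sketch of the SMOTE idea: each synthetic record is placed at a random point on the line segment between a minority-class record and one of its nearest neighbours in the same class. This is not the authors' implementation. The function name `smote_oversample`, the parameter `k`, and the toy GPA values are assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def smote_oversample(X, y, k=5, seed=0):
    """Balance classes by synthesizing minority samples (SMOTE-style sketch).

    X: list of numeric feature lists; y: list of class labels.
    Every class is grown to the size of the largest class.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = [list(row) for row in X], list(y)

    for label, n in counts.items():
        members = [row for row, lab in zip(X, y) if lab == label]
        if n == target or len(members) < 2:
            continue  # largest class, or too few points to interpolate
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest neighbours of `base` within the same class
            others = [m for m in members if m is not base]
            others.sort(key=lambda m: math.dist(base, m))
            neighbour = rng.choice(others[:k])
            gap = rng.random()  # position along the segment base -> neighbour
            X_out.append([b + gap * (nb - b) for b, nb in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Toy data (hypothetical values standing in for semester 1 and 2 GPA):
# 6 "average" records versus 2 "very poor" records.
X = [[6.0, 6.2], [5.8, 6.1], [6.4, 6.0], [6.1, 5.9], [5.9, 6.3], [6.2, 6.4],
     [3.0, 3.2], [3.4, 2.9]]
y = ["average"] * 6 + ["very poor"] * 2

X_bal, y_bal = smote_oversample(X, y, k=5, seed=42)
balanced_counts = Counter(y_bal)
```

After the call, both classes in the toy data have six records, and the four synthetic "very poor" records lie between the two original ones. In practice a maintained implementation, such as the `SMOTE` class in the imbalanced-learn package, would likely be used instead of hand-written code, and resampling is usually applied to the training split only so that the reported accuracy reflects the original class distribution. The abstract does not state which tooling or split was used.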
Can Tho University Journal of Science