
Study Of College Student Performance Prediction Based On Machine Learning

Posted on: 2021-01-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L Ma
Full Text: PDF
GTID: 1368330602982461
Subject: Computer Science and Technology
Abstract/Summary:
Educational data mining (EDM) is an interdisciplinary research domain spanning educational psychology, computer science, and statistics. Drawing on the massive data accumulated in the field of education, EDM uses machine learning, data mining, statistics, and visualization technologies to solve various education-related problems. Student performance prediction is one of the most important research topics in EDM. It aims to predict students' future academic performance (i.e., scores, grades, and rankings) from student-related information. In recent years, with the continuous enrollment expansion of colleges, class sizes have grown larger and larger. Given the large number of students in a college course, it has become impossible for teachers to keep track of the performance of individual students, which has negative consequences for teaching quality. Furthermore, there are always a certain number of college students who fail course exams or even drop out, which seriously affects their future development. In these circumstances, it is desirable to automatically predict each student's performance and then develop academic early warning systems.

In recent decades, extensive research effort has been devoted to student performance prediction. However, existing methods still have the following three limitations: 1) Most studies rely heavily on information from the target courses, e.g., attendance and midterm exam scores, and thus need to make predictions while the target courses are in progress or even close to the end. As a result, these methods tend to perform poorly in terms of foreseeability; 2) Most studies predict student performance from self-built data sets and generally suffer from data scarcity. Such data sets cannot meet the needs of training complex and effective machine learning models, which seriously affects model accuracy; 3) Most studies mainly use data about students' learning behavior and ignore other student-related data. Furthermore, they construct predictive models from handcrafted features, which further affects prediction accuracy.

To address the aforementioned issues, this thesis focuses on traditional classroom teaching scenarios and investigates the following three aspects from two perspectives: students' course performance and their grade point average. The main goal of this thesis is to improve both the foreseeability and the accuracy of prediction methods.

Firstly, to improve foreseeability, we propose a pre-course student performance prediction method based on college course correlations. We seek to leverage students' performance in past semesters to predict their performance in each next-term course prior to its commencement. To this end, we cast the task of pre-course student performance prediction as a multi-instance multi-label problem. We represent each student as a "bag" consisting of multiple instances, where each instance holds the information about one of the student's previous courses. In the multi-label prediction, we treat the target courses as labels and predict them simultaneously. In this way, we can predict students' performance before each course starts. Compared with traditional methods, the proposed method achieves better foreseeability.

Secondly, considering the correlations among similar college majors, we develop a novel multi-task learning method, namely "MIML-Circle", which jointly trains models for multiple majors in a unified framework. In MIML-Circle, multiple models can be jointly learned on different data sets. To exploit the benefits of other related tasks, the labels of a sample predicted by all classifiers (i.e., the classifiers of the task itself and those of other tasks) are used as new features of the sample. MIML-Circle then builds predictive models iteratively with these augmented features. In this way, we predict the performance of students from different majors in a unified framework and effectively alleviate the issue of data scarcity, which further improves prediction accuracy.

Lastly, we propose a behavior-driven student performance prediction method. Psychological research shows that students' behavioral habits are highly correlated with their academic performance. Motivated by these findings, we seek to predict student performance from a large volume of campus smart card records. Instead of extracting features manually, we predict students' performance in an end-to-end learning style. We propose a dual-path convolutional neural network method to model three behavioral characteristics (duration, variation, and periodicity) and construct predictive models of students' performance from these three types of information. In addition, given the limited number of student samples in some majors, we introduce multi-task learning into our method and train predictive models for different majors jointly in a unified framework, which improves prediction accuracy.
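As a rough illustration of two ideas from the abstract, the sketch below shows (a) a student represented as a "bag" of past-course instances with one label per target course, and (b) predicted labels from several classifiers appended to a sample as new features. It is a minimal sketch under stated assumptions, not the thesis's implementation: the names `StudentBag`, `mean_pool`, and `augment_with_predictions`, the `[credits, score]` feature layout, the 0/1 label encoding, the mean pooling step, and the threshold "classifiers" are all hypothetical stand-ins chosen for readability.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class StudentBag:
    """One student as a multi-instance bag (hypothetical layout)."""
    student_id: str
    # One instance per previously taken course; here assumed to be [credits, score].
    instances: List[List[float]]
    # One label per target course; assumed encoding: 1 = at risk, 0 = not at risk.
    labels: Dict[str, int]


def mean_pool(bag: StudentBag) -> List[float]:
    """Collapse a bag into one vector by averaging its instances.

    This is a simple baseline for illustration, not the method of the thesis.
    """
    n = len(bag.instances)
    dim = len(bag.instances[0])
    return [sum(inst[d] for inst in bag.instances) / n for d in range(dim)]


def augment_with_predictions(
    features: List[float],
    classifiers: List[Callable[[List[float]], int]],
) -> List[float]:
    """Append every classifier's predicted label to the feature vector."""
    return features + [float(clf(features)) for clf in classifiers]


# Toy data: a student with two past courses and two target courses.
bag = StudentBag(
    student_id="s001",
    instances=[[3.0, 85.0], [4.0, 55.0]],
    labels={"course_A": 0, "course_B": 1},
)

pooled = mean_pool(bag)

# Stand-ins for trained classifiers: one for the student's own major (task)
# and one borrowed from a related major. Real models would replace these.
classifiers = [
    lambda x: int(x[1] < 60.0),
    lambda x: int(x[1] < 75.0),
]

augmented = augment_with_predictions(pooled, classifiers)
print(pooled)
print(augmented)
```

In an iterative scheme of the kind the abstract describes, the augmented vectors would be used to retrain the classifiers, and the augmentation step would then be repeated with the retrained models; the stopping rule is not stated in the abstract and is left open here.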
Keywords/Search Tags:Educational Data Mining, Student Performance Prediction, Multi-Instance Multi-Label Learning, Multi-Task Learning, Convolutional Neural Networks
Related items