With the continuing development of campus informatization, universities have accumulated large volumes of data on students’ campus Internet usage and campus one-card consumption. Making full use of these campus behavior data to mine their value, to predict students’ academic performance and issue early warnings, and to give teaching administrators and teachers an effective basis for strengthening management and learning guidance is therefore an important problem that colleges and universities urgently need to address. After collating and analyzing students’ campus Internet and consumption data and computing their grade points, this thesis identifies important features through a correlation analysis between Internet usage and consumption behavior features and grade points. A variety of classic machine learning algorithms are used to build grade point prediction models based on Internet usage and consumption behavior, a Stacking-based grade point prediction model is built, and a prototype grade point early-warning system for college students is designed and implemented. The main work is as follows.

(1) Acquisition and preprocessing of college students’ campus behavior data. A total of 1,573,233 Internet access records were obtained from the campus network logs of students in two grades of one college at a university; 31 dimensions of Internet access type information were parsed out using JSON, and the traffic units were unified. A total of 549,239 consumption records were obtained from the campus one-card system; multiple card swipes within the same time period were merged into a single dining or shopping record, and 12 dimensions of consumption information were extracted. After missing values were filled in or deleted, 19 campus behavior features were determined, such as time spent online, online-meeting traffic, and number of breakfasts.

(2) Selection of behavioral features of college
students. A total of 8,316 grade records were obtained from the grade system, and each student’s average grade point was calculated by semester. The preprocessed campus Internet usage and consumption behavior data were integrated with the grade point data, and the relationships between Internet usage, consumption features, and grade points were analyzed and displayed through multi-dimensional visualization. Students with more scores have better grades, while students with more Web streaming visits and more supermarket shopping trips have poorer grades. Further, through a Spearman correlation analysis between the campus behavior features and grade points, nine important features, such as breakfast frequency, online-meeting traffic, and Web streaming media, were selected as the input of the grade point prediction model.

(3) Construction of a grade point prediction model based on the Stacking ensemble learning method. Using the nine selected campus behavior features, grade point prediction models were built with five algorithms: random forest, CatBoost, decision tree, naive Bayes, and support vector machine. A grade point prediction model based on the Stacking ensemble learning method was then studied and constructed: random forest, CatBoost, and decision tree were selected as the optimized first-layer algorithms, and, to prevent overfitting, logistic regression was used in the second layer. Furthermore, using sample data oversampled with the ADASYN algorithm, grade point prediction models based on the five data mining algorithms were built; a Stacking-based grade point prediction model was constructed using single-semester behavior feature data, with decision tree and random forest selected as the optimized first-layer algorithms; and a Stacking-based grade point prediction model was constructed using the behavior data of two
semesters, and random forest and CatBoost were selected as the optimized first-layer algorithms. The comparative experiments show that (a) the Stacking-based grade point prediction model generally predicts better than the other five models; (b) the Stacking model built on oversampled data generally predicts better than the Stacking model built on non-oversampled data; and (c) the Stacking model built on oversampled two-semester behavior feature data generally predicts better than the Stacking model built on oversampled one-semester data.

(4) Design and implementation of a prototype academic early-warning system based on grade point classification. The system uses a B/S architecture and is developed with the Spring Boot and Vue frameworks, with MySQL and Redis as its databases. Grade points are divided into three categories: excellent, medium, and poor. Users include teaching secretaries, counselors, head teachers, and students. Teacher users can query early-warning information and statistical analysis within the scope of their authority, and students can query their own academic early-warning information.
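
The Spearman-based feature selection described in item (2) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the helper names, the toy data, the column `unrelated_feature`, and the 0.3 cutoff on the absolute correlation are all assumptions made for illustration, since the abstract does not state the selection criterion. The sketch ranks each behavior feature and the grade points (averaging tied ranks), computes the Pearson correlation of the ranks, and keeps the features whose absolute coefficient meets the cutoff.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0 or std_y == 0:
        return 0.0  # a constant column carries no rank information
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def select_features(features, target, threshold=0.3):
    """Keep features whose |rho| with the target meets the (assumed) threshold."""
    scores = {name: spearman(column, target) for name, column in features.items()}
    selected = [name for name, rho in scores.items() if abs(rho) >= threshold]
    return selected, scores


# Toy data, invented for illustration only (five hypothetical students).
features = {
    "breakfast_count": [10, 25, 40, 55, 60],
    "web_streaming_visits": [90, 70, 65, 30, 20],
    "unrelated_feature": [2, 5, 1, 4, 3],
}
grade_points = [1.8, 2.4, 2.9, 3.5, 3.7]

selected, scores = select_features(features, grade_points)
for name, rho in scores.items():
    print(f"{name}: rho = {rho:.2f}")
print("selected:", selected)
```

Filtering on the absolute value keeps both positively and negatively correlated features, which matches the abstract's observation that some behaviors accompany better grades and others poorer ones. In practice a library routine such as SciPy's `spearmanr` would usually replace the hand-written functions.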