Research On Accurate Identification Of Poor Students In Campus Based On Ensemble Learning Algorithm

Posted on: 2021-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: H Wang
GTID: 2507306095480394
Subject: Applied Statistics
Abstract/Summary:
With the arrival of the big data era, achieving comprehensive digital and information-based management has become a strategic goal for colleges and universities, and in particular a core task of their poverty alleviation work in education. How to use modern information technologies such as big data and artificial intelligence to uncover the characteristics that distinguish poor college students, to identify poor students accurately, to subsidize more of them accurately, and to achieve the goal of "poverty alleviation, motivation and strength" is of great strategic significance. This thesis is based on raw student data from the logistics department, library, education department, and academic work department of a university. After a preliminary analysis of the raw data, dirty data are handled through consistency analysis and the treatment of missing values, outliers, and duplicate values. An index system for identifying poor students is then built along multiple dimensions, and the behavioral differences between poor and non-poor students are analyzed and discussed. Based on the constructed feature variables, the SMOTE algorithm is used to address sample imbalance. To cope with the high-dimensional feature space of the data set and the nonlinearity of the final ensemble classification model, univariate analysis of variance is combined with a random-forest-based Filter feature selection algorithm to reduce data redundancy. Using ensemble learning algorithms, with GridSearchCV combined with 5-fold cross-validation, the important parameters of each model are trained and tuned repeatedly to build an accurate identification model for poor students in colleges and universities. AdaBoost, XGBoost and GBDT were evaluated on the test set using AUC and F1 score, and the XGBoost model, which performed best on both metrics, was selected as the model for accurately identifying poor students. The results show that applying data mining technology to the identification of poor students is feasible and can support accurate financial aid for students.
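The abstract names SMOTE for handling class imbalance and AUC and F1 score for model evaluation, but gives no implementation details. The following is a minimal sketch of those three ideas using only the Python standard library. It is not the thesis's code: the function names `smote`, `f1_score` and `auc`, the neighbour count `k=3`, and the toy data are all assumptions made for illustration, and a real study would more likely rely on an established library.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points (hypothetical helper).

    Each new point lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. Requires at least one sample of each class.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy minority class: four points on the corners of the unit square.
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(smote(minority, n_new=3))

    y_true = [1, 1, 0, 0]
    print(f1_score(y_true, [1, 0, 1, 0]))
    print(auc(y_true, [0.9, 0.4, 0.5, 0.1]))
```

Oversampling of this kind is normally applied to the training split only, so that synthetic points do not leak into the test set on which AUC and F1 score are computed.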
Keywords/Search Tags:Precision funding, ensemble learning, feature construction, the Filter feature selection algorithm based on random forest