Research On Accurate Prediction For College Student Grants Based On Machine Learning

Posted on:2020-07-08

Degree:Master

Type:Thesis

Country:China

Candidate:H C Peng

GTID:2507305981952809

Subject:Master of Engineering

Abstract/Summary:

To promote fairness in education, the state has been working to improve the support system for students from poor families, so that every student can exercise the basic right to an education. However, the current process for certifying grant recipients has many problems. Because universities have no direct knowledge of students' family circumstances, they can judge only from the written materials that students submit. Some students use social connections to obtain grant quotas fraudulently, by forging poverty certificates or exaggerating their families' poverty level, which leaves some genuinely poor students without state funding. As a result, many schools see cases of so-called "falsely identified" poor students every year, which has aroused widespread public concern. As the big data era develops, a growing number of problems that are hard to solve with traditional approaches are finding new solutions through Internet-based thinking. This paper trains a machine learning model on two years of data generated by students at one college, covering their spending, study, and living habits. The model helps administrators understand students' actual consumption and economic level while at school, and offers a way to find "hidden poverty" or "falsely identified" students. The following work was carried out:

(1) Based on the student consumption and behavior dataset, this paper first performs a statistical analysis, which shows differences between the two groups of students in total consumption, consumption patterns, academic ranking, and other measures. Feature engineering is then carried out on the basis of this analysis: the raw data undergoes missing-value processing, one-hot encoding, and normalization, and many derived features are constructed along time, place, and other dimensions. The 68 highest-scoring features are then selected by the stability selection method to form the experimental sample set. Finally, to mitigate the label imbalance in the dataset, the SMOTE algorithm is used to oversample the minority class.

(2) Based on the dataset after feature engineering, this paper runs preliminary experiments with naive Bayes, support vector machine, neural network, random forest, and XGBoost, and evaluates the models with AUC, a metric designed for binary classification. The results show that XGBoost is the best single model on this dataset. Grid search is then used to find the optimal values of the main XGBoost parameters. To address the sensitivity of a single model to the dataset, this paper proposes a hybrid XGBoost model based on the idea of Bagging. The final results show that the Bagging & XGBoost hybrid model is more robust.
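
Two of the steps named in the abstract, SMOTE oversampling and AUC evaluation, can be illustrated with a minimal standard-library Python sketch. This is not the thesis's implementation: the function names `smote_oversample` and `auc_score`, the neighbour count `k`, and the toy data are assumptions made for illustration only, and a real experiment would more likely rely on established libraries.

```python
import random


def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels are 0/1, higher scores mean "more likely positive",
    and a tied pair counts as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def smote_oversample(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling of the minority class (illustrative sketch).

    Each synthetic sample is a random point on the line segment between a
    minority sample and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [m for m in minority if m is not base]
        others.sort(key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


if __name__ == "__main__":
    # Hypothetical toy data: three minority-class feature vectors.
    minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    print(smote_oversample(minority, n_new=4, k=2))
    print(auc_score([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

The pairwise AUC loop is quadratic in the number of samples, which is acceptable for a demonstration but would normally be replaced by a rank-based computation on larger data.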

Keywords/Search Tags:

Grants, Machine Learning, Feature Engineering, XGBoost, Bagging Ideas

Related items

1	Prediction Of Online Education User Behavior Based On Machine Learning
2	A Research On CTR Prediction Based On Ensemble Of RF,XGBoost And FFM
3	Default Risk Prediction Of A P2P Platform Based On Machine Learning
4	MOOC Learning Behavior Analysis And Dropout Prediction Based On Feature Engineering
5	Construction And Empirical Study Of Recovery Model Of Customized Cold-Water Immersion After Exercise Based On XGBoost Algorithm
6	Research On Mining And Analyzing College Students Online Learning Habits Based On Feature Engineering
7	Based On The Research Of Credit Overdue Prediction Under Internet Finance
8	Research On Prediction Of MOOC Dropout Based On Feature Engineering
9	Call Fraud Detection Behavior Analyses Based On Machine Learning
10	Research And Implementation Of Chinese Automatic Abstract Technology Based On Machine Learning