
Research On Key Technologies Of Subsidy Prediction For College Students

Posted on: 2020-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: K H Li
Full Text: PDF
GTID: 2428330596496915
Subject: Computer Science and Technology
Abstract/Summary:
With the expansion of college enrollment and the implementation of the special enrollment plan for poverty-stricken areas, the proportion of poor students in colleges and universities has gradually increased. For these students, financial difficulties can harm their physical and mental health as well as their academic performance. The government and universities have therefore set up various subsidies to help poor students complete their studies. However, "pseudo-poor" students, who pretend to be poor but are not, and human intervention in the selection process mean that subsidies sometimes fail to reach the students in real need. Against this background, research on key technologies of subsidy prediction has important practical significance. This thesis first proposes an improved outlier detection algorithm to identify noisy data caused by fraudulent use and other factors. Then, the frequent pattern mining method PrefixSpan is optimized to extract useful features from students' trajectories. Finally, to increase prediction accuracy, a subsidy prediction model based on confidence fusion is proposed. The main research work of this thesis is as follows:

(1) The outlier detection algorithm LOF produces inaccurate results because it ignores the importance of attributes when detecting outliers. To solve this problem, a new algorithm, EA-LOF, is proposed. The algorithm provides a novel method for calculating attribute weights by combining information entropy, which represents the uncertainty of the data distribution, with an expert matrix, which carries subjective information. Because it considers both objective and subjective factors, this calculation can effectively measure the abnormal information that an attribute provides in outlier detection. A new distance metric is then constructed by introducing the attribute weights. The experimental results show that the EA-LOF algorithm improves the accuracy of outlier detection, and that student data processed by the algorithm improves the performance of subsidy prediction.

(2) The support measure of the traditional PrefixSpan algorithm is limited in application because it considers only the number of occurrences of a pattern: some patterns appear many times yet carry little significance. In view of this, a new support measure, constructed from the frequency of sequences and the average length of suffix sequences, is used as the indicator for measuring sequential patterns. A scoring criterion is also proposed to evaluate and rank the mining results, in order to screen out the frequent patterns that contain the most information about students' behavior. The analysis of the frequent patterns shows that the proposed method can extract patterns that accurately represent students' behavior in school and provide reliable feature input for model training.

(3) To increase prediction accuracy, this thesis proposes a subsidy prediction model based on confidence fusion. Since the calculation of confidence depends on the accuracy of each base learner, two types of accuracy indicators are designed: local classification accuracy and category classification accuracy. The former measures a learner's ability to classify a given sample and its neighbors, while the latter measures its ability to classify all samples in a given class. On this basis, a confidence evaluation function is constructed, with a balance parameter that adjusts the contribution of the two indicators to the confidence. The experimental results show that the accuracy of the proposed model is 0.025 higher than that of a single classification algorithm, 0.027 higher than that of voting fusion, and 0.036 higher than that of mean weighted fusion.
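The confidence fusion described in (3) can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration rather than the thesis's actual method: local classification accuracy is taken as a learner's hit rate on the k validation samples nearest to the query, category classification accuracy as its hit rate on all validation samples of the class it predicts, and the confidence as a linear blend of the two controlled by a hypothetical `balance` parameter. The function names and the toy data are likewise hypothetical.

```python
from collections import defaultdict
from math import dist


def local_accuracy(query, val_X, val_y, val_pred, k=3):
    """Share of the k validation samples nearest to `query` that the learner got right."""
    nearest = sorted(range(len(val_X)), key=lambda i: dist(query, val_X[i]))[:k]
    return sum(val_pred[i] == val_y[i] for i in nearest) / len(nearest)


def category_accuracy(label, val_y, val_pred):
    """Share of validation samples whose true class is `label` that the learner got right."""
    members = [i for i, y in enumerate(val_y) if y == label]
    if not members:
        return 0.0
    return sum(val_pred[i] == label for i in members) / len(members)


def fuse(query, query_preds, val_X, val_y, val_preds, balance=0.5, k=3):
    """Return the label with the largest summed confidence across base learners.

    Assumed confidence: balance * local + (1 - balance) * category.
    """
    scores = defaultdict(float)
    for label, val_pred in zip(query_preds, val_preds):
        local = local_accuracy(query, val_X, val_y, val_pred, k)
        category = category_accuracy(label, val_y, val_pred)
        scores[label] += balance * local + (1 - balance) * category
    return max(scores, key=scores.get)


# Toy validation set: one feature, two classes (0 and 1).
val_X = [(0.0,), (1.0,), (2.0,), (8.0,), (9.0,), (10.0,)]
val_y = [0, 0, 0, 1, 1, 1]

# Validation predictions of three hypothetical base learners.
learner_a = [0, 0, 0, 1, 1, 1]  # correct everywhere
learner_b = [1, 1, 0, 1, 1, 1]  # unreliable near x = 0
learner_c = [1, 0, 1, 1, 1, 1]  # unreliable near x = 0
val_preds = [learner_a, learner_b, learner_c]

# For the query, learner A predicts class 0; learners B and C predict class 1.
query = (0.5,)
query_preds = [0, 1, 1]

fused = fuse(query, query_preds, val_X, val_y, val_preds, balance=0.9)
```

On this toy data a plain majority vote would return class 1, whereas the confidence-weighted result with `balance=0.9` is class 0, because learners B and C are unreliable in the neighborhood of the query. Setting `balance=0.0` ignores the local term and reverts to class 1 here, which shows how the balance parameter shifts the contribution of the two indicators.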
Keywords/Search Tags: Subsidy Prediction, Outlier Detection, Feature Extraction, Frequent Pattern Mining, Model Fusion