The Study Of Multi-classification Cost-sensitive Method For Poor Students Data In Guangxi

Posted on: 2023-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: H Y Huang
GTID: 2557306836464374
Subject: Engineering
Abstract/Summary:
Education is a key element of the national poverty alleviation policy, and identifying poor students is crucial for delivering personalized financial assistance. As colleges and universities digitize, many use information systems to manage student financial aid and big data technology to target support for poor students precisely. However, existing poverty identification methods face three challenges: missing data, attribute redundancy, and class imbalance. Building on a review of existing methods and research results, this paper aims to improve the quality of poor-student data and the classification accuracy of the high-needs student classes. It studies data preprocessing methods and imbalanced classification methods in turn, as the basis for classifying poor-student data in Guangxi. The specific research contents and contributions are as follows:

To address the data quality problem, a feature selection method for incomplete data is proposed. Existing imputation and feature selection methods can hardly cope with large amounts of missing data and attribute redundancy in poverty data at the same time. This paper therefore proposes a feature selection method that combines the Fisher Score algorithm with a Wrapper algorithm to search for feature subsets. Experimental results show that this method, FSIPD, effectively improves both classification accuracy and data quality.

To address the class imbalance problem, a cost-sensitive classification method for poor students is proposed. Existing imbalanced classification methods struggle with two issues in identifying poor students: how to set the cost matrix, and how to balance overall accuracy against the accuracy of each poverty class. This paper therefore proposes a classification optimization model for multi-class poverty, together with a classification method based on a genetic algorithm and cost sensitivity that searches for the cost matrix and builds a cost-sensitive classification model for poor-student data. Experimental results show that the proposed method improves the accuracy of the high-needs student classes while maintaining overall accuracy.
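The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the choice to skip missing values (`None`) when scoring a feature, the three class labels, and the hand-set cost matrix are all assumptions made for illustration. The abstract describes searching for the cost matrix with a genetic algorithm, a step omitted here.

```python
from collections import defaultdict


def fisher_score(values, labels):
    """Fisher score of one feature: between-class scatter / within-class scatter.

    Assumption for this sketch: missing entries are given as None and are
    simply skipped, so the score uses only the observed values.
    """
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        if value is not None:
            groups[label].append(value)

    observed = [v for group in groups.values() for v in group]
    if not observed:
        return 0.0
    overall_mean = sum(observed) / len(observed)

    between = 0.0  # sum over classes of n_k * (class mean - overall mean)^2
    within = 0.0   # sum over classes of n_k * class variance
    for group in groups.values():
        mean = sum(group) / len(group)
        between += len(group) * (mean - overall_mean) ** 2
        within += sum((v - mean) ** 2 for v in group)

    if within == 0.0:
        return float("inf") if between > 0.0 else 0.0
    return between / within


def min_expected_cost_class(probs, cost):
    """Pick the class with the lowest expected misclassification cost.

    cost[i][j] is the cost of predicting class j when the true class is i.
    """
    n_true = len(probs)
    n_pred = len(cost[0])
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_true))
        for j in range(n_pred)
    ]
    return min(range(n_pred), key=expected.__getitem__)


# Hypothetical classes: 0 = not poor, 1 = poor, 2 = high-needs.
labels = [0, 0, 1, 1, 1]
informative = [1, 2, None, 5, 6]   # separates the classes, one value missing
noisy = [1, 5, 2, None, 6]         # overlaps across classes

# Hand-set cost matrix (an assumption): missing a high-needs student is costly.
cost_matrix = [
    [0, 1, 1],
    [2, 0, 1],
    [10, 5, 0],
]
probs = [0.6, 0.3, 0.1]  # classifier output for one student

print(fisher_score(informative, labels), fisher_score(noisy, labels))
print(min_expected_cost_class(probs, cost_matrix))
```

In this toy case the class with the highest probability is 0, but the cost-sensitive rule selects class 2, because misclassifying a high-needs student carries the largest penalty. A higher Fisher score marks a feature whose class means are far apart relative to the spread within each class; a wrapper step would then evaluate candidate subsets of the top-ranked features with an actual classifier.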
Keywords/Search Tags: poor student classification, data preprocessing, cost sensitive, imbalanced dataset, targeted poverty alleviation