Research And Application On The Classification Method For Imbalanced Data

Posted on:2024-03-26

Degree:Master

Type:Thesis

Country:China

Candidate:Z Q Sun

Full Text:PDF

GTID:2568306941464364

Subject:Computer technology

Abstract/Summary:

In data mining, machine learning methods induce rules from raw data, and classification is one of the main ways those rules are put to practical use. Conventional classification methods are designed for balanced data sets, where they achieve good results. When the same problem is moved to an imbalanced setting, however, these methods cannot learn rules for the different classes fairly: some classes are disadvantaged in data volume or misclassification cost, and global performance metrics introduce a guidance bias. As a result, minority-class samples are not predicted accurately, which makes classifying imbalanced data difficult. This dissertation studies the classification of imbalanced data, proposes two improved classification methods for it, and designs a practical campus financial aid application for imbalanced scenarios based on campus big data. The main research contents are as follows:

(1) At the data pre-processing level, existing undersampling methods do not consider the influence of minority-class samples on the majority class, which may cause information loss when the majority class is undersampled. To address this, kernel density estimation is used to learn the density distribution of the minority class, and the majority-class samples are undersampled according to the characteristics of that distribution. The optimized method achieves better classification performance at a lower cost.

(2) At the classification algorithm level, an adaptive weighted extreme learning machine is proposed to address the low overall classification performance that results from ignoring differences between samples within a class when handling imbalanced data. Initial cost weights and extra cost weights are designed to construct an adaptive penalty matrix. This design takes into account the distribution of samples in different classes and effectively improves the overall classification accuracy of the algorithm on imbalanced data sets.

(3) Economically disadvantaged students on campus are far fewer than regular students, so identifying them is likewise an imbalanced classification problem, which makes applying predictive models to campus big data in this area challenging. This dissertation combines the research methods with practical application to design a financial aid decision system for economically disadvantaged students.
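The two methods in items (1) and (2) can be illustrated with a minimal Python sketch. It is not the dissertation's implementation, whose details the abstract does not give. The undersampling step assumes one possible selection rule (keep the majority samples that lie in the densest minority regions). The classifier is a plain class-frequency weighted extreme learning machine, which corresponds only to the "initial cost weights"; the extra cost weights and the adaptive penalty matrix are not reproduced. All names (`kde_undersample`, `train_weighted_elm`, `predict_elm`, `bandwidth`, `n_hidden`, `C`) are hypothetical.

```python
import numpy as np


def gaussian_kde_scores(points, queries, bandwidth):
    """Unnormalised Gaussian kernel density of `points`, evaluated at `queries`."""
    # Pairwise squared Euclidean distances, shape (n_queries, n_points).
    sq_dist = ((queries[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2)).mean(axis=1)


def kde_undersample(X_maj, X_min, n_keep, bandwidth=1.0):
    """Keep the n_keep majority samples with the highest minority density.

    Assumption: majority samples close to the minority class are the
    informative ones. The thesis may use a different selection rule.
    """
    scores = gaussian_kde_scores(X_min, X_maj, bandwidth)
    keep = np.argsort(scores)[::-1][:n_keep]  # indices by descending density
    return X_maj[keep]


def _hidden(X, W_in, b):
    """Sigmoid hidden-layer output of a single-hidden-layer network."""
    return 1.0 / (1.0 + np.exp(-(X @ W_in + b)))


def train_weighted_elm(X, y, n_hidden=50, C=1.0, seed=0):
    """Weighted ELM with per-sample weight 1 / (size of the sample's class)."""
    rng = np.random.default_rng(seed)
    W_in = rng.normal(size=(X.shape[1], n_hidden))  # random, never trained
    b = rng.normal(size=n_hidden)
    H = _hidden(X, W_in, b)

    classes, counts = np.unique(y, return_counts=True)
    w = 1.0 / counts[np.searchsorted(classes, y)]  # diagonal of the weight matrix
    T = np.where(y[:, None] == classes[None, :], 1.0, -1.0)  # +1/-1 targets

    # Closed-form output weights: beta = (I/C + H^T W H)^-1 H^T W T
    HtW = H.T * w  # scales column i of H^T by w[i]
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ T)
    return W_in, b, beta, classes


def predict_elm(model, X):
    """Predict the class whose output node has the largest activation."""
    W_in, b, beta, classes = model
    return classes[np.argmax(_hidden(X, W_in, b) @ beta, axis=1)]
```

In this sketch the two pieces are independent: `kde_undersample` can be applied to the majority class before training any classifier, while `train_weighted_elm` handles imbalance through the weights alone and can be trained on the full data set.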

Keywords/Search Tags:

Imbalanced Data Classification, Machine Learning, Kernel Density Estimation, Extreme Learning Machine, Cost-sensitive Learning

Related items

1	Research On Weighted Extreme Learning Machine Algorithm Based On Imbalanced Data Distribution
2	Study Of Active Learning Algorithms On Imbalanced Data Using Extreme Learning Machine
3	Designing Feature Selection And Classification Methods For Imbalanced Learning And Cost-sensitive Learning Problems
4	Study Of Class Imbalance Learning Based On Extreme Learning Machine
5	Research On Imbalanced Data Classification Algorithm Based On Extreme Learning Machine
6	Research On Classification Methods Based On Extreme Learning Machine
7	Research On Extreme Learning Machine For Imbalanced Data Classification
8	Hybrid Ensemble Learning For Imbalanced Data
9	Imbalanced Classification Methods Based On Extreme Learning Machine And The Application
10	Research On Text Classification Methods Based On Extreme Learning Machine