Research On Imbalanced Data Classification Methods Based On Ensemble Learning

Posted on:2022-11-22

Degree:Master

Type:Thesis

Country:China

Candidate:K C Hou

GTID:2518306611995909

Subject:Automation Technology

Abstract/Summary:

Artificial intelligence based on machine learning is attracting wide attention and increasingly influences people's lives. As an important tool, machine learning has been applied successfully in material performance prediction, network intrusion analysis, medical detection and other fields. In these applications, a common situation is that some classes contain far more samples than others. This problem is usually called data imbalance: the data distribution is sparse and skewed across classes. At present, many models perform poorly on imbalanced data. For example, when a standard support vector machine is applied to imbalanced data, the majority classes receive more attention than the minority ones, so the decision boundary is biased toward the majority classes and the classification results are unsatisfactory. In practical problems, however, the minority classes are often the more important ones, so evaluation criteria are needed that reflect how accurately each class is distinguished. Considering the difficulty and complexity of imbalanced data classification, this thesis proposes a new imbalanced data classification algorithm based on the support vector machine (SVM) and the ensemble learning framework. First, in constructing the base classifier, the SVM is extended to an SVM with a Gaussian kernel function, and the standard ensemble algorithm is modified into a cost-sensitive ensemble algorithm. Second, the misclassification costs of the majority class and minority class data are modified to balance the weights of the two classes, so that the decision boundary is no longer biased toward the majority class and good classification performance is achieved. Finally, in the experimental analysis, 11 data sets from the UCI database are selected to evaluate the proposed algorithm in terms of accuracy, recall and G-mean, and its performance is compared with that of other existing classification algorithms. Experimental results show that the proposed algorithm performs well. This work is expected to contribute positively to further study of imbalanced data classification.
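
The abstract does not give the cost formula or the evaluation code, so the following is only a minimal sketch under stated assumptions, not the thesis's method. It assumes a binary problem, uses the common inverse-class-frequency heuristic to balance the total weight of the two classes, and computes G-mean as the square root of recall times specificity. The function names `balanced_sample_weights` and `evaluate` are hypothetical.

```python
from collections import Counter
from math import sqrt


def balanced_sample_weights(labels):
    """Weight each sample inversely to its class size so that every
    class carries the same total weight (an assumed heuristic, not
    the cost scheme from the thesis)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return [n_samples / (n_classes * counts[y]) for y in labels]


def evaluate(y_true, y_pred, positive=1):
    """Return (accuracy, recall, G-mean), treating `positive` as the
    minority class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return accuracy, recall, sqrt(recall * specificity)


# Toy data: 95 majority samples (label 0) and 5 minority samples (label 1).
y_true = [0] * 95 + [1] * 5
weights = balanced_sample_weights(y_true)

# A degenerate classifier that always predicts the majority class.
always_majority = [0] * 100
acc, rec, gm = evaluate(y_true, always_majority)
```

On this toy data the always-majority classifier reaches an accuracy of 0.95 while its recall and G-mean are both 0, which illustrates why accuracy alone is a weak criterion for imbalanced data. The per-sample weights could be passed to any learner that accepts sample weights, for instance as the initial weights of a boosting-style ensemble.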

Keywords/Search Tags:

Imbalanced data, Support vector machines, Ensemble learning, Cost-sensitive, Weight balance

Related items

1	Improvement And Application Of Ensemble Learning Method Based On Support Vector Machin
2	Research On Ensemble Method Of Structured Support Vector Machine For Imbalanced Data
3	Research On Automatic Diagnosis Methods Of Breast Cancer Based On Cost-Sensitive Learning And Its Application
4	Research On Classification Algorithm For Imbalanced Data Sets Based On Support Vector Machines
5	Research On Cost-sensitive Learning Method Based On Probability Density
6	Anomaly Detection Research For Imbalanced Classes
7	Hybrid Ensemble Learning For Imbalanced Data
8	Research And Applications On Intrusion Detection Based On Support Vector Machines For Imbalanced Datasets
9	Research And Application Of Imbalanced Data Classification Algorithm Based On Ensemble Learning
10	Imbalanced Data Classification And Its Application In The Prediction Of The Mobile Phone Replacement