
Research On Weighted Extreme Learning Machine Algorithm Based On Imbalanced Data Distribution

Posted on: 2020-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: Q S Sun
GTID: 2428330578960291
Subject: Software engineering
Abstract/Summary:
The problem of learning from imbalanced data plays an important role in the field of data mining, and how to deal with imbalanced data effectively has become a research hotspot. With traditional classifiers, class imbalance often leads to critical errors, which makes it difficult to obtain satisfactory classification results. Scholars at home and abroad have proposed a variety of methods for the class imbalance problem, but these methods have not fully considered the impact of data distribution on classifier performance. In view of these problems, and building on cost-sensitive learning, this paper discusses the impact of class imbalance on classification performance. Then, based on the prior distribution characteristics of the sample data, it studies numerical data and image data separately, covering both binary and multi-class problems. The main research content has the following two aspects:

(1) Numerical data refers to data that has been manually screened and digitized and can be used directly for classifier learning. However, a traditional classifier tends to favor the majority classes, resulting in lower classification accuracy for the minority classes. To address this, this paper proposes an algorithm named data distribution based weighted extreme learning machine (D-WELM). The algorithm is based on cost-sensitive learning and considers not only the effect of sample sizes but also the influence of data distribution. The weighting scheme is also designed with the overall loss taken into account. The feasibility and effectiveness of D-WELM were verified on 39 binary-class and multi-class imbalanced data sets. The experimental results show that D-WELM performs well on imbalanced classification problems.

(2) Image data is generally large in scale and high in dimensionality. Images cannot be fed directly into simple models with good results, but image features can be extracted effectively with a convolutional neural network (CNN). This paper proposes a weighted extreme learning machine based on a convolutional neural network and data distribution (CNN-DWELM) for imbalanced image classification. The algorithm is also based on cost-sensitive learning. It combines the strength of the CNN in feature extraction with the fast training speed and high classification accuracy of the ELM. Experimental comparisons on three data sets show that CNN-DWELM achieves better classification performance on imbalanced image data.
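The abstract does not give the D-WELM weighting formula, so the following is only a minimal sketch of a generic cost-sensitive weighted ELM, for orientation. It assumes the common inverse-class-frequency sample weights from the weighted ELM literature rather than the distribution-based weights proposed in the thesis, and solves the regularized weighted least-squares problem beta = (I/C + H^T W H)^(-1) H^T W T. The function names `train_welm` and `predict_welm` and the parameters `n_hidden`, `C`, and `seed` are hypothetical, chosen for illustration.

```python
import numpy as np


def _hidden(X, W_in, b):
    """Sigmoid activations of the random, untrained hidden layer."""
    return 1.0 / (1.0 + np.exp(-(X @ W_in + b)))


def train_welm(X, y, n_hidden=50, C=1000.0, seed=0):
    """Fit a weighted ELM (illustrative sketch, not the thesis's D-WELM).

    Assumption: each sample is weighted by 1 / (size of its class), so
    every class contributes equally to the loss regardless of its size.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y).ravel()
    rng = np.random.default_rng(seed)

    classes, y_idx, counts = np.unique(
        y, return_inverse=True, return_counts=True
    )

    # Random input weights and biases stay fixed; only beta is learned.
    W_in = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = _hidden(X, W_in, b)

    # One-vs-all targets coded as +1 for the true class, -1 otherwise.
    T = -np.ones((X.shape[0], classes.size))
    T[np.arange(X.shape[0]), y_idx] = 1.0

    # Per-sample cost weights (the diagonal of W).
    w = 1.0 / counts[y_idx]

    # H.T * w scales column j of H.T by w[j], i.e. H.T @ diag(w).
    HtW = H.T * w
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ T)
    return classes, W_in, b, beta


def predict_welm(model, X):
    """Return the class whose output node has the largest score."""
    classes, W_in, b, beta = model
    H = _hidden(np.asarray(X, dtype=float), W_in, b)
    return classes[np.argmax(H @ beta, axis=1)]
```

A model is fitted with `model = train_welm(X_train, y_train)` and applied with `predict_welm(model, X_test)`. Replacing the line that computes `w` is the natural place to plug in a different weighting scheme, such as one that also reflects how the samples of each class are distributed.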
Keywords/Search Tags:Imbalanced data classification, Cost-sensitive learning, Extreme learning machine, Convolutional neural network