
Research On Imbalanced Data Classification Algorithm Based On Extreme Learning Machine

Posted on: 2022-06-27
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Wen
Full Text: PDF
GTID: 2518306575968469
Subject: Electronics and Communications Engineering
Abstract/Summary:
With the advent of the big data era, real-world applications produce large amounts of imbalanced data, which makes data mining considerably harder. Classification is a common data mining technique. However, traditional classification algorithms are mostly designed for balanced data distributions, so they struggle to achieve good results on data sets with imbalanced distributions. In response, this thesis studies the Extreme Learning Machine (ELM) for imbalanced data classification and improves ELM from two perspectives. At the data processing level, combining ELM with sampling techniques can effectively improve classification, but because of the complex characteristics of imbalanced data, the sampling techniques currently combined with ELM do not provide a good learning strategy for it. This thesis therefore proposes a resampling strategy that effectively balances the data set and combines it with ELM classification. At the classification algorithm level, cost-sensitive learning is a common technique for addressing class imbalance, so this thesis introduces the idea of cost-sensitive learning into ELM and proposes a cost-sensitive weighted extreme learning machine classification algorithm.

1. Classification algorithm based on a resampling technique combined with ELM. First, a resampling classification model is designed according to the degree of imbalance of the data set. Then, the principle of DPC clustering is introduced into the sampling technique to process both classes of samples: a clipping method is proposed for majority-class samples, and in the minority-class processing stage, a method is designed that adaptively synthesizes minority-class samples according to the local density and proximity distance of each sample, which mitigates intra-class imbalance in the minority clusters. Finally, ELM is trained on the resampled, balanced data set and used for classification. The experimental results show that the classification performance of this algorithm improves on that of other similar algorithms.

2. A cost-sensitive weighted ELM (CW-ELM) classification algorithm. First, taking into account the distribution characteristics of the different classes and the relative importance of samples within the same class, two methods are proposed: a method for setting the misclassification penalty factor based on information theory, and a method for determining within-class sample weights based on DPC clustering. Then, the ELM output-layer weights are computed by formulating and solving a constrained optimization problem. Finally, the algorithm is compared and verified on different data sets. The experimental results show that it has better stability and better classification performance than other similar algorithms.
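For readers unfamiliar with weighted ELM, the following is a minimal sketch of the general idea, not the thesis's CW-ELM. It assumes the standard weighted ELM formulation, in which the output-layer weights minimize a regularized, per-sample weighted squared error, giving the closed form beta = (I/C + H^T W H)^-1 H^T W T. The function names, the sigmoid activation, the parameters `n_hidden` and `C`, and the inverse class frequency weighting are all illustrative assumptions; the abstract's information-theoretic penalty factor and DPC-based sample weights are not specified here and are not reproduced.

```python
import numpy as np


def train_weighted_elm(X, y, n_hidden=40, C=10.0, sample_weight=None, seed=0):
    """Fit a single-hidden-layer ELM with per-sample weights (labels 0/1)."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    # Hidden-layer parameters are drawn at random and never trained.
    W_in = rng.normal(size=(n_features, n_hidden))
    b = rng.normal(size=n_hidden)
    H = 1.0 / (1.0 + np.exp(-(X @ W_in + b)))  # sigmoid hidden outputs
    T = np.where(y == 1, 1.0, -1.0)            # targets in {-1, +1}
    if sample_weight is None:
        sample_weight = np.ones(n_samples)
    # Weighted ridge solution: beta = (I/C + H^T W H)^-1 H^T W T
    HtW = H.T * sample_weight                  # equals H.T @ diag(w)
    beta = np.linalg.solve(np.eye(n_hidden) / C + HtW @ H, HtW @ T)
    return W_in, b, beta


def predict_elm(model, X):
    """Return 0/1 predictions from the sign of the ELM output."""
    W_in, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W_in + b)))
    return (H @ beta > 0).astype(int)


def inverse_frequency_weights(y):
    """Stand-in weighting (assumption): each sample gets 1 / its class size."""
    counts = np.bincount(y)
    return 1.0 / counts[y]


# Toy imbalanced data: 95 majority samples (class 0), 5 minority (class 1).
data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(0.0, 1.0, size=(95, 2)),
               data_rng.normal(2.0, 1.0, size=(5, 2))])
y = np.concatenate([np.zeros(95, dtype=int), np.ones(5, dtype=int)])

weights = inverse_frequency_weights(y)
model = train_weighted_elm(X, y, sample_weight=weights)
pred = predict_elm(model, X)
```

With inverse class frequency weights, each class contributes the same total weight to the loss, so errors on the five minority samples count as much in aggregate as errors on the 95 majority samples. A cost-sensitive scheme such as the one the abstract describes would replace `inverse_frequency_weights` with its own per-class penalties and per-sample weights.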
Keywords/Search Tags:imbalanced data set, resampling, ELM, cost-sensitive learning, classification algorithm