
Weighted Localized Generalization Error Extreme Learning Machine For Multiclass Imbalance Problems

Posted on: 2022-12-19    Degree: Master    Type: Thesis
Country: China    Candidate: J Zhong    Full Text: PDF
GTID: 2518306731494534    Subject: Statistics
Abstract/Summary:
Imbalanced data refers to data sets in which the sample sizes of the different classes differ greatly. The class with few samples is called the minority class, and the class with many samples the majority class. Imbalanced data arises widely in real applications such as credit fraud identification, medical diagnosis, network intrusion monitoring, and industrial fault detection, and correctly identifying minority-class examples in such data is of particular value. However, traditional classification algorithms focus on overall accuracy and tend to misclassify minority samples into the majority class, so the resulting classifiers generalize poorly on the minority class. In addition, existing research on imbalanced classification concentrates mainly on two-class problems; the more general and complex multiclass setting has received little attention, and studies have shown that techniques designed for two-class imbalance may fail completely when applied to multiclass imbalance problems.

As a single-hidden-layer feedforward neural network with randomly assigned hidden weights, the extreme learning machine (ELM) offers a simple structure, extremely fast training, strong generalization ability, and native support for multiclass tasks. Like traditional classifiers, however, its ability to identify the minority class degrades significantly on imbalanced data. This thesis therefore improves ELM for multiclass imbalance problems from two perspectives, loss function transformation and hyperparameter optimization:

1. ELM ignores the different misclassification costs of different classes and is sensitive to fluctuations in the minority class. To address this, a weighted localized generalization error extreme learning machine (WLGE-ELM) is proposed, together with an iterative solution procedure and a proof of its convergence. The algorithm combines the cost-sensitive idea with the localized generalization error model: it suppresses the outputs of hidden-layer nodes that are sensitive to fluctuations in the minority-class samples while increasing the cost of misclassifying the minority class, thereby improving both the rate and the stability with which the classifier identifies the minority class.

2. Because WLGE-ELM involves two important hyperparameters (the sample weights and the sample neighborhood), Bayesian optimization theory is introduced and a Bayesian-optimization-based weighted localized generalization error extreme learning machine (BO-WLGE-ELM) is constructed to realize the classifier's best performance.

Finally, experiments on both simulated and real data were carried out. The simulated-data experiments show that WLGE-ELM improves minority-class identification by adjusting the sample weights and sample neighborhood, and that Bayesian optimization accurately locates the optimal hyperparameter combination. The real-data experiments show that, compared with other improved ELM algorithms, BO-WLGE-ELM achieves better classification performance, especially on highly imbalanced data sets.
Keywords/Search Tags:Multi-classification of imbalanced data, Extreme learning machine, Localized generalization error model, Bayesian optimization