Research On Extreme Learning Machine For Online Sequential Imbalanced Data Classification

Posted on: 2017-12-08
Degree: Master
Type: Thesis
Country: China
Candidate: J W Wang
Full Text: PDF
GTID: 2348330488967328
Subject: Engineering
Abstract/Summary:
In practical engineering, many classification problems are imbalanced, such as fault diagnosis and network intrusion detection. Such problems also have distinct temporal characteristics, especially in large-scale data environments. We refer to them as online sequential imbalanced data classification problems. As a single-hidden-layer feed-forward neural network, the extreme learning machine (ELM) has been successfully applied to pattern recognition, regression estimation, and other tasks because of its very high learning speed and good generalization performance. However, ELM tends to produce a biased classifier when the training data are seriously imbalanced. Therefore, on the basis of thorough data collection and processing, this thesis proposes several algorithms with better performance for the sequential data imbalance problem. The main contents and contributions of this thesis are summarized as follows:

(1) To improve the classification accuracy of the minority class in online sequential imbalanced data problems, a new weighted online sequential extreme learning machine based on imbalanced sample reconstruction is proposed. The algorithm combines a data-level strategy with an algorithm-level strategy. First, the principal curve is introduced to exploit the distribution characteristics of the online data, and an improved SMOTE method is used for over-sampling. Then, to emphasize the importance of individual samples, a new weighting method is proposed that updates the network weights dynamically, with the weight values related to the training errors. The algorithm inherits the speed of the online sequential extreme learning machine and effectively solves online imbalanced data classification without affecting the computational complexity.

(2) To improve the generalization performance of existing algorithms, a new online sequential extreme learning machine based on leave-one-out (LOO) cross-validation is proposed, which estimates the generalization error quickly and efficiently while reducing the time complexity. To address the online imbalance problem, under-sampling is performed according to the LOO error. In addition, an add-delete mechanism is employed to update the network weights. Furthermore, to evaluate the soundness of the algorithm, a theoretical analysis of the information loss in the under-sampling process is provided, which demonstrates the effectiveness of the method in theory.

(3) To improve the classification accuracy of the minority class while reducing the loss of classification accuracy on the majority class, a new hybrid-sampling online extreme learning machine for sequential imbalanced data is proposed, which follows the sample distribution characteristics while taking sample importance into account. First, the sample set is reconstructed using the principal curve. Second, redundant samples are eliminated based on a sample importance index. Then the model is adjusted according to the LOO error to ensure the optimal network structure. Theoretical analysis and experimental results demonstrate the soundness and effectiveness of the algorithm.

The research in this thesis not only advances the theory of the extreme learning machine but also provides a new solution for online sequential imbalanced data classification. The proposed algorithms effectively solve the imbalanced data classification problem and offer a useful reference for other related fields.
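
To make the idea of a weighted online sequential ELM in contribution (1) more concrete, the following is a minimal sketch of the standard weighted recursive least-squares update that such models are usually built on. It is not the thesis's algorithm: the class name `WeightedOSELM`, the sigmoid activation, the regularization constant `C`, and the fixed minority weight of 5.0 are all assumptions made for illustration, and the thesis's error-dependent weighting, principal-curve reconstruction, and improved SMOTE steps are not reproduced here.

```python
import numpy as np


class WeightedOSELM:
    """Illustrative weighted online sequential ELM (hypothetical name)."""

    def __init__(self, n_inputs, n_hidden, n_outputs, C=100.0, seed=0):
        rng = np.random.default_rng(seed)
        # Random hidden-layer parameters are fixed after initialization.
        self.W = rng.normal(size=(n_inputs, n_hidden))
        self.b = rng.normal(size=n_hidden)
        # P starts as the inverse of the ridge term I / C; beta starts at 0.
        self.P = C * np.eye(n_hidden)
        self.beta = np.zeros((n_hidden, n_outputs))

    def hidden(self, X):
        # Sigmoid hidden-layer output matrix H (an assumed activation).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def partial_fit(self, X, T, w):
        """Update with one chunk; w holds one positive weight per sample."""
        H = self.hidden(X)
        PHt = self.P @ H.T
        # Woodbury identity: only a chunk-sized system has to be solved.
        S = np.diag(1.0 / w) + H @ PHt
        self.P = self.P - PHt @ np.linalg.solve(S, PHt.T)
        # Weighted residual correction of the output weights.
        residual = T - H @ self.beta
        self.beta = self.beta + self.P @ H.T @ (w[:, None] * residual)
        return self

    def predict(self, X):
        return self.hidden(X) @ self.beta


# Usage sketch on synthetic data arriving in chunks of 20 samples.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
y = (rng.random(60) < 0.15).astype(int)      # roughly 15% minority class
T = (2 * y - 1).reshape(-1, 1).astype(float)  # targets in {-1, +1}
w = np.where(y == 1, 5.0, 1.0)                # assumed fixed minority weight

model = WeightedOSELM(n_inputs=3, n_hidden=10, n_outputs=1)
for start in range(0, 60, 20):
    chunk = slice(start, start + 20)
    model.partial_fit(X[chunk], T[chunk], w[chunk])
scores = model.predict(X)
```

With this initialization the sequential result matches the batch weighted ridge solution computed on all chunks at once, which is why the update cost per chunk stays independent of how much data has already been seen. An error-driven weighting scheme like the one the abstract describes would replace the fixed `w` with weights recomputed from the current residuals before each update.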
Keywords/Search Tags: extreme learning machine, imbalanced data classification, online sequential data, principal curve, leave-one-out cross-validation