
Research On Online Learning Algorithms For Drifting Imbalanced Data Stream

Posted on: 2020-12-13
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 2518306548495964
Subject: Management Science and Engineering
Abstract/Summary:
With the arrival of the big data era, more and more data are produced by human activity. To explore the knowledge buried in these data, much research has turned to mining different types of data. Data streams are high-speed, real-time and high-dimensional. This thesis studies the classification of data streams with concept drift and class imbalance. The main work is as follows:

1. Background and related research. Starting from data stream mining technology, the thesis reviews in detail methods for data streams with concept drift, methods for data streams with class imbalance, methods for the joint problem of concept drift and class imbalance, and active learning methods for data streams. The shortcomings of existing methods and possible improvements are summarized, followed by the evaluation indicators for the class imbalance problem in data stream learning.

2. For the classification of data streams with missing class labels, concept drift and class imbalance, an online active learning paired ensemble method is proposed. The paired ensemble framework consists of a stable classifier and a dynamic classifier that can handle multiple types of concept drift. The hybrid labeling strategy selects the most representative instances and the minority class instances in the data stream. Experiments on real and synthetic datasets verify the good performance of the algorithm.

3. For the classification of data streams with concept drift and class imbalance, a resample-based ensemble learning method is proposed. The algorithm combines the block-based learning method and the incremental learning method, retaining the advantages of both. It introduces a reinforcement mechanism to determine the weight of each base classifier in the ensemble; through rewards and punishments, the mechanism improves the ability of the ensemble classifier on the minority class. The ensemble consists of a stable classifier and multiple dynamic classifiers, which ensures that it can handle different types of concept drift. A new resampling method is also proposed: it supplements minority class instances when a base classifier is initialized for training, and so improves the classification ability of the base classifier for the minority class. Experiments on both synthetic and real datasets show the superiority of the algorithm in comparison with other recent supervised learning algorithms. In addition, the applicability of the proposed method is verified on client credit assessment datasets.
Keywords/Search Tags:online learning, active learning, supervised learning, concept drift, class imbalance, sampling method, ensemble learning