Incremental Learning Algorithms With Concept Drift Adaptation

Posted on:2018-05-23

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Y Sun

GTID:1318330512485617

Subject:Computer software and theory

Abstract/Summary:

With the arrival of the big data era, processing and learning from large-scale data have attracted much attention from the research community, and this work in turn supports high-quality industrial applications and everyday services. Incremental learning handles large-scale data by updating learning machines (models) when new training data arrive, and it has been widely studied in recent years. However, concept drift, i.e., a change in the joint distribution of the data, degrades the performance of incremental learning and poses a great challenge to applying incremental learning in the real world. To handle concept drift in incremental learning, this thesis proposes two incremental learning algorithms with concept drift adaptation and designs a parallel learning implementation method. The main contributions are as follows.

First, to exploit historical knowledge in incremental learning and so facilitate adaptation to concept drift, a novel ensemble learning method, Diversity and Transfer based Ensemble Learning (DTEL), is proposed. It is assumed that historical knowledge is related to the knowledge in the current learning step. Hence, transfer learning can be applied to concept drift adaptation. On the one hand, it exploits the useful knowledge in models trained on historical data (i.e., historical models); on the other hand, it avoids the negative impact of the inconsistent information they contain. Moreover, because memory is limited, only a fixed number of historical models can be preserved in the learning system. A diversity-based model selection criterion decides which previously trained models to preserve, so as to provide as much knowledge as possible for the transfer operation and the concept drift adaptation task. To verify the effectiveness of DTEL, multiple synthetic and real-world data sets are tested experimentally. The synthetic data involve five different types of concept drift, and the real-world data come from four different real applications. Empirical results show that DTEL handles concept drift effectively and performs satisfactorily on different types of concept drift.

Second, to handle class evolution, a class-based ensemble learning algorithm (CBCE) is proposed. Class evolution, a special type of concept drift, refers to the emergence and disappearance of classes. Existing work on class evolution implicitly assumes that classes emerge or disappear in a transient manner, which is not true of many real-world problems. This work investigates the class evolution problem with gradually evolving classes. To deal with class evolution, the algorithm maintains a base learner for every class. Specifically, it initializes a new model when a class emerges and inactivates the corresponding model when a class disappears. The gradual evolution of classes causes a dynamic class-imbalance problem. To handle this problem, a novel under-sampling method is designed and embedded in each base model. Class evolution has three basic elements, i.e., the emergence of novel classes, the disappearance of outdated classes, and the reoccurrence of disappeared classes. In the experiments, synthetic and real-world data are used to represent different types of class evolution, in order to verify the performance of CBCE comprehensively. Two real-world data sets are processed to simulate class evolution, and data from a social network application are used as real-world data. Empirical studies verify the reliability of CBCE in handling class evolution and show that CBCE can also deal with the dynamic class-imbalance problem caused by gradual class evolution.

Finally, to apply incremental learning algorithms in real-world applications, a parallel learning implementation method for concept drift adaptation in incremental learning is designed. In real-world big data applications, algorithms need not only high prediction accuracy but also enough time efficiency to keep up with rapidly generated data. A parallelizable algorithm is a precondition for building a parallel learning system. In incremental learning, ensemble learning algorithms are naturally parallel. To improve the time efficiency of learning algorithms, this work analyzes the ensemble models used in incremental learning, derives a general parallel learning implementation method, and shows how to implement an algorithm with it. In addition, the two ensemble learning algorithms proposed in this thesis, i.e., DTEL and CBCE, are implemented with the parallel implementation method and tested experimentally. The experimental results show that the parallelized DTEL and CBCE algorithms achieve a high speed-up ratio compared with the original ones, which verifies the effectiveness of the parallel implementation method.
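
The class-based ensemble idea described in the abstract (one base learner per class, a new model when a class emerges, inactivation when a class disappears, and under-sampling inside each base model) can be illustrated with a minimal Python sketch. This is not the thesis's CBCE implementation. The perceptron base learner, the exponentially decayed class prior, the inactivation threshold, and the under-sampling rule are all assumptions chosen for illustration, and every class and parameter name below is hypothetical.

```python
import random


class OneVsRestPerceptron:
    """Hypothetical base learner: a linear scorer for a single class."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def score(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def update(self, x, is_positive):
        target = 1.0 if is_positive else -1.0
        # Perceptron rule: change the weights only on a mistake.
        if target * self.score(x) <= 0:
            self.w = [wi + self.lr * target * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * target


class ClassBasedEnsemble:
    """Illustrative one-model-per-class ensemble (assumed design, not CBCE)."""

    def __init__(self, n_features, decay=0.9, inactive_below=0.01, seed=0):
        self.n_features = n_features
        self.decay = decay                    # assumed forgetting factor
        self.inactive_below = inactive_below  # assumed inactivation threshold
        self.models = {}                      # class label -> base learner
        self.prior = {}                       # class label -> decayed frequency
        self.rng = random.Random(seed)

    def active_classes(self):
        # A class counts as disappeared once its decayed prior is very small.
        return [c for c, p in self.prior.items() if p >= self.inactive_below]

    def partial_fit(self, x, y):
        if y not in self.models:
            # Class emergence: initialize a new base learner.
            self.models[y] = OneVsRestPerceptron(self.n_features)
            self.prior[y] = 0.0
        # Track the gradually changing class proportions.
        for c in self.prior:
            seen = 1.0 if c == y else 0.0
            self.prior[c] = self.decay * self.prior[c] + (1.0 - self.decay) * seen
        for c in self.active_classes():
            if c == y:
                self.models[c].update(x, True)
                continue
            # Under-sample negatives so that a rare class is not swamped:
            # keep a negative with probability p / (1 - p), capped at 1.
            p = self.prior[c]
            keep = min(1.0, p / (1.0 - p)) if p < 1.0 else 1.0
            if self.rng.random() < keep:
                self.models[c].update(x, False)

    def predict(self, x):
        active = self.active_classes()
        if not active:
            return None
        return max(active, key=lambda c: self.models[c].score(x))
```

In this sketch the model of a disappeared class is kept in `models` but excluded from training and prediction, so a class that reoccurs resumes with its earlier model rather than starting from scratch. That is one possible reading of "inactivate"; the thesis may handle reoccurrence differently.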

Keywords/Search Tags:

Incremental Learning, Concept Drift, Ensemble Learning, Online Learning, Data Stream Mining, Supervised Learning

Related items

1	Research On Incremental Learning Algorithm And Application For Data Stream With Concept Drift
2	Research On Online Learning Algorithms For Drifting Imbalanced Data Stream
3	Research On Concept Drift Data Stream Classification Algorithm Based On Ensemble Learning
4	Research On Online Learning Of Big Data Based On Concept Drift Detection
5	The Study Of Selective Adaptive Ensemble Learning Method For Concept Drift Problem
6	Research On Concept Drift Data Stream Classification Based On Ensemble Learning
7	Research On Classification Algorithms For Imbalanced Data Stream With Concept Drift
8	Research On Ensemble Classification Algorithms Of Data Stream Based On Concept Drift
9	Research On Semi-supervised Classification Algorithm For Data Stream With Concept Drift
10	Research On Online Learning For Concept Drift