
The Research Of Incremental Learning Methods For Large-scale Multi-class Data Classification

Posted on: 2018-11-21
Degree: Master
Type: Thesis
Country: China
Candidate: T T Xie
Full Text: PDF
GTID: 2428330569999063
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, dynamically growing data and an increasing number of classes have posed new challenges for large-scale data classification. Retraining a model from scratch on all the data each time is too time-consuming to be practical. In addition, data do not arrive in a regular way and some of them are unlabeled, which hinders the development of incremental learning. To address these problems, we propose incremental learning methods for both supervised and unsupervised learning.

For incremental supervised learning, most traditional methods struggle to balance precision against computational burden as the data and the number of classes grow: some have weak precision, and the others are time-consuming. In this thesis, we propose an incremental learning method, the heterogeneous incremental Nearest Class Mean Random Forest (hi-RF), to handle this issue. It is a heterogeneous method that adaptively either replaces trees or updates the leaves of trees in the random forest when data of new classes arrive, reducing computational time while keeping comparable performance. Specifically, to maintain accuracy, a proportion of the trees are replaced by new NCM decision trees; to reduce the computational load, the remaining trees have only their leaf probabilities updated. Most importantly, out-of-bag estimation and out-of-bag boosting are proposed to balance accuracy and computational efficiency. Experiments under fair conditions demonstrate that hi-RF achieves comparable precision with much less computational time.

For incremental unsupervised learning, traditional methods can perform unsupervised classification but not incremental learning, and in particular not unsupervised incremental learning. To accomplish this goal, the Incremental AutoEncoder (IAE) is proposed. IAE takes an autoencoder as its basic model and uses the original CNN as the encoder. With some added constraints, IAE can enhance the classification ability on unsupervised tasks while maintaining the classification ability on the original tasks. Moreover, IAE does not need the original data to update the model, which greatly reduces the storage required. Experiments show that the classification results of IAE are better than those of k-means and that the incremental procedure is end-to-end.
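As a rough illustration of why a nearest-class-mean (NCM) rule suits class-incremental learning, the sketch below keeps one running mean per class and adds a new class without revisiting earlier data. It is not the hi-RF method described in the abstract: it contains no forest, no tree replacement, and no out-of-bag estimation. The class name `IncrementalNCM`, its methods, and the toy data are all hypothetical and chosen only for this example.

```python
from math import dist


class IncrementalNCM:
    """Hypothetical nearest-class-mean classifier with running class means."""

    def __init__(self):
        self.sums = {}    # label -> per-feature running sum
        self.counts = {}  # label -> number of samples seen so far

    def partial_fit(self, samples, labels):
        # Only the running sums and counts change, so data from
        # earlier batches do not have to be stored or revisited.
        for x, y in zip(samples, labels):
            if y not in self.sums:
                self.sums[y] = [0.0] * len(x)
                self.counts[y] = 0
            self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
            self.counts[y] += 1
        return self

    def means(self):
        # Current centroid of every class seen so far.
        return {y: [s / self.counts[y] for s in self.sums[y]]
                for y in self.sums}

    def predict(self, x):
        # Assign x to the class whose centroid is nearest (Euclidean).
        centroids = self.means()
        return min(centroids, key=lambda y: dist(x, centroids[y]))


# Toy data (assumed for illustration): two initial classes, then a new one.
model = IncrementalNCM()
model.partial_fit([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0]], ["a", "a", "b"])
before = model.predict([9.0, -9.0])

# A new class "c" arrives later; only its own samples are needed.
model.partial_fit([[10.0, -10.0], [12.0, -10.0]], ["c", "c"])
after = model.predict([9.0, -9.0])
```

In this toy run the query point is assigned to class `a` before class `c` exists and to class `c` afterwards, while the means of `a` and `b` are left untouched by the second batch.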
Keywords/Search Tags: Incremental learning, Supervised Learning, Unsupervised Learning, large-scale data classification