Research On Single And Multi Label Data Stream Classification Based On Ensemble Category

Posted on: 2018-03-21
Degree: Master
Type: Thesis
Country: China
Candidate: L L Wang
Full Text: PDF
GTID: 2348330512992122
Subject: Computer Science and Technology
Abstract/Summary:
Compared with traditional static data, modern data is produced and accumulated as streams. Beyond this change in form, multi-label data is increasingly common in real applications. These changes in data form and type pose a great challenge to both single-label and multi-label data stream classification. For single-label data stream classification, this paper proposes two improved algorithms based on previous work. For multi-label data streams, inspired by pioneering work, it proposes two novel classification algorithms. The specific work is as follows:

(1) Most existing single-label data stream classification work ignores the problem of feature evolution, and its classification results are ineffective. To resolve these problems, this paper improves an unsupervised feature selection method that was designed for static data environments and reduces its time complexity so that it suits the stream environment. It then takes the DXMiner algorithm as a prototype and optimizes its feature selection process by adopting the improved unsupervised feature selection method. Finally, a new data stream classification method based on ensemble learning and unsupervised feature extraction is proposed.

(2) The time complexity of the algorithm proposed in work (1) still has room for reduction. This paper adopts a data structure that performs well in high-dimensional scenarios to further improve the algorithm, and proposes an improved data stream classification method based on ensemble learning and unsupervised feature extraction.

(3) Multi-label data stream classification combines the characteristics of multi-label static data classification with the problems of single-label data stream classification. Inspired by work on multi-label static data classification, this paper proposes a dynamically weighted, ensemble-based multi-label data stream classification model. The model uses ML-KNN and the idea of KNN to train the base classifiers, and designs a novel dynamic weighting scheme to integrate them. Finally, the classified data is used to train a new classifier, which replaces the worst-performing classifier in the ensemble.

(4) The ensemble size in work (3) has a great impact on classification performance, yet this parameter is set manually. In addition, the update process discards some useful information. To address both the difficulty of choosing this parameter and the loss of useful information, this paper proposes a multi-label data stream classification algorithm based on an unlimited ensemble size.

The main contributions of this paper are as follows. First, work (1) and (2) solve the feature evolution problem, which most other work ignores, with time complexity low enough for the stream environment. Second, given that little research has been done on multi-label data stream classification, work (3) and (4) provide two practical solutions for other researchers and enrich the research results in related areas. Finally, experiments on real data sets show that the four algorithms proposed in this paper perform well in terms of both classification performance and time consumption.
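The update cycle described in work (3), in which base classifiers are re-weighted and the worst one is replaced by a classifier trained on newly classified data, can be illustrated with a small sketch. This is not the thesis's algorithm: the abstract does not give the weighting formula or the base learner details, so the sketch assumes a toy 1-nearest-neighbour learner in place of ML-KNN, mean Jaccard similarity on the newest chunk as the weight, an initial weight of 1.0 for new members, and a 50% weighted-vote threshold per label. All class and function names are hypothetical.

```python
from typing import Dict, FrozenSet, List, Sequence, Tuple

# One stream sample: a feature vector and its set of labels.
Sample = Tuple[Sequence[float], FrozenSet[str]]


class NearestNeighbour:
    """Toy base learner (assumption): returns the label set of the closest point."""

    def __init__(self, chunk: List[Sample]) -> None:
        self.chunk = list(chunk)

    def predict(self, x: Sequence[float]) -> FrozenSet[str]:
        def sq_dist(sample: Sample) -> float:
            return sum((a - b) ** 2 for a, b in zip(sample[0], x))

        return min(self.chunk, key=sq_dist)[1]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Set similarity between a predicted and a true label set."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


class WeightedEnsemble:
    """Fixed-size ensemble with per-chunk re-weighting (illustrative only)."""

    def __init__(self, max_size: int = 3) -> None:
        self.max_size = max_size
        self.members: List[NearestNeighbour] = []
        self.weights: List[float] = []

    def predict(self, x: Sequence[float]) -> FrozenSet[str]:
        # A label is emitted when it collects at least half of the total weight.
        total = sum(self.weights)
        votes: Dict[str, float] = {}
        for clf, w in zip(self.members, self.weights):
            for label in clf.predict(x):
                votes[label] = votes.get(label, 0.0) + w
        return frozenset(l for l, v in votes.items() if total > 0 and v >= 0.5 * total)

    def update(self, chunk: List[Sample]) -> None:
        # Re-weight every member by its mean Jaccard score on the newest chunk.
        self.weights = [
            sum(jaccard(clf.predict(x), y) for x, y in chunk) / len(chunk)
            for clf in self.members
        ]
        new_member = NearestNeighbour(chunk)
        if len(self.members) < self.max_size:
            self.members.append(new_member)
            self.weights.append(1.0)  # assumed initial weight
        else:
            # Replace the member with the lowest weight on the newest chunk.
            worst = min(range(len(self.weights)), key=self.weights.__getitem__)
            self.members[worst] = new_member
            self.weights[worst] = 1.0
```

In this sketch `max_size` plays the role of the manually chosen ensemble size that work (4) sets out to remove; replacing the lowest-weight member is also where a fixed-size ensemble discards information learned from older chunks.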
Keywords/Search Tags: Data Stream, Classification, Unsupervised Feature Selection, Multi-label Data Stream, Dynamic Weighting, Ensemble Learning