
Research On The Emerging Patterns-based Integrative Weighted Classification Algorithm For Stream Data

Posted on: 2012-11-30
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Mao
Full Text: PDF
GTID: 2248330395485431
Subject: Computer technology
Abstract/Summary:
In recent years, applications such as stock market trading, Internet security testing, telecommunication records, and wireless sensor networks have drawn attention to ever-changing, continuous, large-scale streaming data, namely stream data. Stream data differs from traditional static data in several ways: its distribution patterns change constantly and its data elements arrive continuously, which makes data mining in a stream data environment difficult.

By comparing single classifiers with ensemble classification algorithms in a stream data environment, we find that integrating classifiers improves classification performance on stream data. We also find that a classifier constructed from essential emerging patterns (eEPs) can achieve high classification precision. On this basis, we propose to improve classification precision by integrating and weighting multiple classifiers for stream data, and we adopt essential emerging patterns to construct the base classifiers of the integrated algorithm. The result is the emerging patterns-based integrative weighted classification algorithm for stream data proposed in this thesis. When training a base classifier, each essential emerging pattern is trained so that it carries an adaptive weight. This yields a base classifier with good discriminative ability that converges quickly when concept drift occurs. When integrating the base classifiers, we update them continually before weighting them, which allows the integrated classification algorithm to fit the distribution of the stream data and to adapt well to concept drift.

Experimental results show that, in the same stream data environment, the classification precision of the proposed algorithm is slightly better than that of other integrated algorithms whose base classifiers are constructed by other methods. In addition, the proposed algorithm clearly outperforms the single classifier based on emerging patterns.
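
The general idea of weighting base classifiers over a drifting stream can be illustrated with a minimal Python sketch. It is not the thesis's algorithm: the abstract gives no pseudocode, so every name here (`ToyEPClassifier`, `WeightedStreamEnsemble`, `min_growth`, `max_members`) is hypothetical. As simplifying assumptions, the base classifier uses only single-item patterns selected by growth rate as a stand-in for essential emerging patterns, pattern weights are fixed rather than adaptively trained, and existing members are only re-weighted by their accuracy on the newest chunk rather than updated.

```python
from collections import Counter, defaultdict


class ToyEPClassifier:
    """Toy base classifier built from single-item patterns.

    Assumption: an item counts as an "emerging pattern" for a class when its
    support in that class is at least `min_growth` times its support elsewhere.
    """

    def __init__(self, min_growth=2.0):
        self.min_growth = min_growth
        self.patterns = []   # (item, label, strength) triples
        self.default = None  # majority class of the training chunk

    def fit(self, chunk):
        # chunk: list of (set_of_items, label) pairs
        class_sizes = Counter(label for _, label in chunk)
        total = len(chunk)
        counts = defaultdict(Counter)  # item -> label -> occurrences
        for items, label in chunk:
            for item in items:
                counts[item][label] += 1
        self.default = class_sizes.most_common(1)[0][0]
        self.patterns = []
        for item, by_label in counts.items():
            for label, n in by_label.items():
                supp_in = n / class_sizes[label]
                rest = total - class_sizes[label]
                n_out = sum(by_label.values()) - n
                supp_out = n_out / rest if rest else 0.0
                growth = float("inf") if supp_out == 0 else supp_in / supp_out
                if growth >= self.min_growth:
                    # Fixed strength: the pattern's support in its own class.
                    self.patterns.append((item, label, supp_in))
        return self

    def predict(self, items):
        scores = Counter()
        for item, label, strength in self.patterns:
            if item in items:
                scores[label] += strength
        return scores.most_common(1)[0][0] if scores else self.default


def accuracy(clf, chunk):
    return sum(clf.predict(items) == label for items, label in chunk) / len(chunk)


class WeightedStreamEnsemble:
    """Keeps the best `max_members` classifiers, weighted by recent accuracy."""

    def __init__(self, max_members=3):
        self.max_members = max_members
        self.members = []  # [classifier, weight] pairs

    def update(self, chunk):
        # Re-weight existing members on the newest chunk, so members trained
        # on an outdated concept lose influence.
        for member in self.members:
            member[1] = accuracy(member[0], chunk)
        # The new member is scored on its own training chunk (optimistic).
        new_clf = ToyEPClassifier().fit(chunk)
        self.members.append([new_clf, accuracy(new_clf, chunk)])
        self.members.sort(key=lambda m: m[1], reverse=True)
        del self.members[self.max_members:]

    def predict(self, items):
        votes = Counter()
        for clf, weight in self.members:
            votes[clf.predict(items)] += weight
        return votes.most_common(1)[0][0] if votes else None


# Usage: the concept flips between the two chunks (a simple concept drift).
chunk_before = [({"a"}, 0), ({"a"}, 0), ({"b"}, 1), ({"b"}, 1)]
chunk_after = [({"a"}, 1), ({"a"}, 1), ({"b"}, 0), ({"b"}, 0)]

ensemble = WeightedStreamEnsemble()
ensemble.update(chunk_before)
before = ensemble.predict({"a"})
ensemble.update(chunk_after)
after = ensemble.predict({"a"})
```

In this toy run the member trained before the drift scores zero accuracy on the new chunk, so its vote carries no weight and the ensemble follows the new concept. A fuller treatment would mine multi-item patterns and adapt the per-pattern weights during training, as the abstract describes.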
Keywords/Search Tags:data mining, stream data, concept drifting, classification