Data Stream Classification Algorithm Based on eEP

Posted on: 2007-08-16
Degree: Master
Type: Thesis
Country: China
Candidate: H J Jiang
GTID: 2208360185971219
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of information technology and the great improvement in data collection capability, a new kind of data, the "data stream", has emerged in many applications. Data streams arrive quickly, in large volume, and change over time. Such data is either stored on enterprise disks as static data or appears as time-changing data.

Classification is one of the most important problems in data mining and an important form of data analysis. For traditional, statically stored data, classification algorithms have been researched extensively, but in a data stream environment those algorithms face many new challenges. A number of papers have proposed data stream classification algorithms, but none of them is based on eEPs (essential Emerging Patterns). eEPs discriminate well between classes, and eEP-based algorithms perform comparably with other types of classification algorithms. eEP-based algorithms have also been applied successfully in many domains, such as DNA analysis and automatic text categorization.

Given the above, we studied eEP-based algorithms in the data stream context and propose a classification algorithm, DSCEEP (Data Stream Classification by eEP). The main contents of the paper are as follows. First, after summarizing the characteristics of data streams and analyzing the eEP-based algorithm, we combine the concepts of basic windows and sliding windows with the eEP-based classification algorithm to make it suitable for data streams and to handle concept drift. Second, we propose a "three-layer structure model" for constructing multiple classifiers: mining eEPs and attaching a weight to each eEP, constructing base classifiers, and combining the multiple classifiers. Finally, when classifying incoming test examples, we pay much more attention to the latest data blocks, so...
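
The ideas named in the abstract (emerging patterns, basic and sliding windows, and extra weight on the latest data blocks) can be illustrated with a minimal Python sketch. This is not the DSCEEP algorithm itself, whose details are not given here. The names `support`, `growth_rate`, `SlidingWindowEnsemble`, and the `decay` parameter are hypothetical, and the exponential recency weighting is an assumption chosen only for illustration. The growth rate shown is the standard emerging-pattern measure: a pattern's support in one class divided by its support in the other.

```python
from collections import Counter, deque


def support(pattern, rows):
    """Fraction of rows (each a frozenset of items) containing every item of pattern."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if pattern <= row) / len(rows)


def growth_rate(pattern, target_rows, other_rows):
    """Support in the target class divided by support in the other class."""
    s_target = support(pattern, target_rows)
    s_other = support(pattern, other_rows)
    if s_other == 0.0:
        # Pattern never occurs in the other class: infinite growth rate
        # if it occurs in the target class at all, otherwise zero.
        return float("inf") if s_target > 0.0 else 0.0
    return s_target / s_other


class SlidingWindowEnsemble:
    """Keeps one base classifier per basic window; old windows slide out."""

    def __init__(self, max_blocks, decay=0.5):
        # deque(maxlen=...) drops the oldest classifier automatically.
        self.classifiers = deque(maxlen=max_blocks)
        self.decay = decay  # assumed recency factor in (0, 1], not from the source

    def add_block(self, classifier):
        """Register the base classifier built from the newest data block."""
        self.classifiers.append(classifier)

    def predict(self, example):
        """Weighted vote in which newer blocks count more than older ones."""
        votes = Counter()
        newest = len(self.classifiers) - 1
        for index, classify in enumerate(self.classifiers):
            votes[classify(example)] += self.decay ** (newest - index)
        return votes.most_common(1)[0][0] if votes else None
```

In this sketch each base classifier is any callable that maps an example to a class label. A classifier built from weighted eEPs, as the abstract describes, could be plugged in the same way.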
Keywords/Search Tags:Data Mining, Classification, Data Stream, Emerging Patterns (EP)