Study Of Mixture Ensemble Classifications For Mining Data Streams With Concept Drift

Posted on: 2013-06-22; Degree: Master; Type: Thesis
Country: China; Candidate: L Gui; Full Text: PDF
GTID: 2248330377960917; Subject: Computer software and theory
Abstract/Summary:
With the rapid development and wide application of network communications, computer science and information technology, large volumes of data streams are produced in many application fields, such as financial analysis, network monitoring, telecommunications data processing and sensor networks. These streams contain a great deal of information that urgently needs to be mined. However, streaming data is continuous, high-volume, rapid and open-ended, which makes it a challenge for traditional algorithms and application systems. The concept drift hidden in data streams makes the research more difficult still. This dissertation focuses on the problem of concept drift in data streams, and its main contributions are as follows:
(1) Existing work on data streams and on concept drift in data streams is reviewed, including the definitions, applications and characteristics of the models. Related work on data streams with concept drift is surveyed and analyzed.
(2) To tackle concept drift detection in data streams, a method for mining data streams based on mixture ensemble models is proposed, called WE-DTB. The model uses C4.5 to build an ensemble, adopts a Naive Bayes classifier to filter noise from misclassified instances, and uses the typical Hoeffding Bounds and μ test methods to detect concept drift in a data stream environment. Extensive experiments demonstrate that the proposed WE-DTB method detects concept drift effectively while maintaining good performance in classification accuracy and in time and space consumption.
(3) A Prototype System of Data Stream Classification was designed and implemented. It integrates three data stream classification algorithms (AP, AE and WE-DTB) with three typical concept drift detection methods (Hoeffding Bounds, μ test and DDM). The system makes it possible to compare the differences between data stream classification algorithms.
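As an illustration of the Hoeffding Bounds idea mentioned in (2), the following is a minimal Python sketch of one common way to flag drift: compare the classifier's error rate on two consecutive data chunks and report drift when the increase exceeds the Hoeffding bound. This is not the WE-DTB procedure itself, which the abstract does not specify in detail. The function names, the chunk-based comparison and the default `delta` are assumptions made for the example.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Return epsilon such that, with probability at least 1 - delta,
    the mean of n observations lies within epsilon of the true mean.
    value_range is the width of the range of the observed variable."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def drift_detected(prev_errors, curr_errors, delta=0.05):
    """Hypothetical chunk-based check (an assumption, not the thesis's rule).

    prev_errors and curr_errors are lists of 0/1 flags, where 1 marks a
    misclassified instance in the previous and current chunk respectively.
    Drift is reported when the error rate rises by more than the bound.
    """
    n = min(len(prev_errors), len(curr_errors))
    if n == 0:
        return False
    # Error indicators lie in [0, 1], so the range width is 1.
    epsilon = hoeffding_bound(1.0, delta, n)
    prev_rate = sum(prev_errors) / len(prev_errors)
    curr_rate = sum(curr_errors) / len(curr_errors)
    return (curr_rate - prev_rate) > epsilon


if __name__ == "__main__":
    stable = [0] * 90 + [1] * 10    # 10% error rate
    drifted = [0] * 60 + [1] * 40   # 40% error rate
    print(drift_detected(stable, stable))
    print(drift_detected(stable, drifted))
```

A smaller `delta` gives a larger bound, so the check becomes more conservative and reports fewer false alarms at the cost of slower reaction to real drift.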
Keywords/Search Tags: Data Streams, Ensemble Learning, Concept Drift