
Research And Application Of Online Ensemble Method On Evolving Data Stream With Concept Drift

Posted on: 2017-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: J W Xu
GTID: 2308330485988266
Subject: Software engineering
Abstract/Summary:
Over the past decades, classification algorithms in data mining have developed rapidly, and many excellent algorithms have been proposed. In the era of big data, however, these classical algorithms may no longer be applicable, because data volume, data dimensionality, and the speed of data generation are all increasing. New algorithms are therefore needed that adapt to evolving data streams; our car networking project is one application that requires them.

This thesis studies the improvement and application of ensemble methods and covers three aspects. First, it analyzes existing block-based ensemble methods and online methods and proposes strategies for improving them. Second, based on these strategies, it reworks two classical algorithms, AUE and OSBoost: AUE, a typical block-based ensemble method, is converted into an online ensemble method, and OSBoost is reworked into a better online method. Third, it studies how ensemble methods can be used in practice.

The main achievements of this thesis are as follows:

1) Three common strategies are put forward. First, an incremental weighting mechanism keeps the ensemble highly adaptive to the latest data. Second, using incremental learning algorithms as the components of the ensemble improves its ability to adapt to evolving data streams. Third, a change detector lets the ensemble detect concept drift more precisely. The experiments show that the three strategies improve the accuracy of the ensemble methods to different degrees.

2) Based on the first result, this thesis improves the AUE and OSBoost algorithms. First, four schemes for improving AUE are established and compared. As a result, the OAUEAdwin algorithm is proposed, which takes advantage of incremental weighting mechanisms and change detectors. Then, the OSBoost algorithm is reworked into the OSBoostAdwin algorithm by adding a concept drift detector and modifying its weighting mechanism. The experimental results show that the new algorithm has higher generalization accuracy.

3) To study the feasibility of applying ensemble methods, this thesis implements VRSS-MOA, a real-time data stream mining platform built on the VRSS and MOA platforms. On the VRSS-MOA platform, it compares OAUEAdwin and OSBoostAdwin in the car networking project on the task of learning a driver's driving style.
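
The three strategies listed under achievement 1) can be illustrated with a minimal Python sketch. It is not the thesis's OAUEAdwin or OSBoostAdwin algorithm, and every name in it (OnlinePerceptron, WindowDriftDetector, OnlineWeightedEnsemble) is hypothetical. It assumes binary labels, uses a perceptron as the incremental base learner, takes an exponential moving average of per-example correctness as the incremental weight, and substitutes a simple two-window error comparison for a real change detector such as ADWIN.

```python
import random
from collections import deque


class OnlinePerceptron:
    """Incremental base learner: learns from one example at a time."""

    def __init__(self, n_features, lr=0.1):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        score = self.b + sum(wi * xi for wi, xi in zip(self.w, x))
        return 1 if score >= 0 else 0

    def learn(self, x, y):
        err = y - self.predict(x)  # -1, 0 or +1; update only on a mistake
        if err:
            self.w = [wi + self.lr * err * xi for wi, xi in zip(self.w, x)]
            self.b += self.lr * err


class WindowDriftDetector:
    """Simplified two-window change detector (a stand-in, NOT ADWIN)."""

    def __init__(self, window=50, threshold=0.25):
        self.old = deque(maxlen=window)  # older errors
        self.new = deque(maxlen=window)  # most recent errors
        self.threshold = threshold

    def update(self, error):
        """Add one 0/1 error; return True if the error rate jumped."""
        if len(self.new) == self.new.maxlen:
            self.old.append(self.new[0])  # oldest recent error ages out
        self.new.append(error)
        if len(self.old) < self.old.maxlen:
            return False
        gap = sum(self.new) / len(self.new) - sum(self.old) / len(self.old)
        if gap > self.threshold:
            self.old.clear()
            self.new.clear()
            return True
        return False


class OnlineWeightedEnsemble:
    """Hypothetical ensemble combining the three strategies."""

    def __init__(self, n_features, size=3, decay=0.9):
        self.n_features = n_features
        self.decay = decay
        # Different learning rates keep the members from being identical.
        self.members = [OnlinePerceptron(n_features, lr=0.05 * (i + 1))
                        for i in range(size)]
        self.weights = [1.0] * size
        self.detector = WindowDriftDetector()
        self.drifts = 0

    def predict(self, x):
        vote = sum(w if m.predict(x) == 1 else -w
                   for m, w in zip(self.members, self.weights))
        return 1 if vote >= 0 else 0

    def learn(self, x, y):
        ensemble_error = int(self.predict(x) != y)  # test, then train
        for i, m in enumerate(self.members):
            correct = float(m.predict(x) == y)
            # Incremental weighting: moving average of recent correctness.
            self.weights[i] = (self.decay * self.weights[i]
                               + (1 - self.decay) * correct)
            m.learn(x, y)
        if self.detector.update(ensemble_error):
            # On drift, replace the lowest-weighted member with a fresh one.
            worst = min(range(len(self.members)), key=self.weights.__getitem__)
            self.members[worst] = OnlinePerceptron(self.n_features)
            self.weights[worst] = 1.0
            self.drifts += 1


if __name__ == "__main__":
    rng = random.Random(0)
    model = OnlineWeightedEnsemble(n_features=2)
    hits = 0
    for t in range(1000):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        y = int(x[0] > 0) if t < 500 else int(x[0] < 0)  # concept flips at t=500
        hits += model.predict(x) == y
        model.learn(x, y)
    print("prequential accuracy:", hits / 1000, "drifts:", model.drifts)
```

The demo at the bottom evaluates the model prequentially (predict first, then learn) on a synthetic stream whose concept flips halfway through. Whether the stand-in detector fires there depends on how quickly the perceptrons recover on their own, so the window size and threshold would need tuning for real data.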
Keywords/Search Tags: Data Stream, Ensemble method, Data Mining, Concept drift