
Research And Implementation Of Uncertain Data Streams Classification Technique Based On Distributed Extreme Learning Machine

Posted on: 2015-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: X Zhang
GTID: 2308330473453713
Subject: Computer software and theory
Abstract/Summary:
Data stream classification is an important component of data stream mining. In many practical applications, however, data uncertainty is ubiquitous, and as data volumes grow sharply, traditional centralized classification approaches can no longer learn from such massive data effectively. Classifying uncertain data streams therefore faces three challenges: (1) how to classify an uncertain data stream effectively; (2) how to detect and handle concept drift; and (3) how to process large volumes of data with distributed algorithms.

To address these challenges, this thesis studies the classification of uncertain data streams that exhibit concept drift. First, it surveys the background and characteristics of uncertain data streams and summarizes existing classification algorithms and their core ideas. Second, to handle large-scale data analysis, it presents the distributed extreme learning machine (DELM), which uses MapReduce to optimize the large matrix operations so that the traditional centralized ELM can be applied to large-scale data processing. Third, for the problem of classifying uncertain data streams, it proposes WE-DELM, which is built on the distributed ELM. WE-DELM builds an uncertain streaming data model and handles uncertainty by converting the streaming data from its possible-world representation. It then dynamically adjusts the weight of each base classifier according to that classifier's classification results, so that outdated classifiers are deleted when concept drift occurs, while new classifiers that converge to the new concept faster and more accurately are built.

In addition, because the concepts in a data stream often have complex characteristics, the thesis presents CBWE-DELM, which extends WE-DELM with a concept buffer. CBWE-DELM avoids a limitation of existing classification algorithms, which store only the current concept and must therefore relearn from scratch every time a new concept appears. This makes it suitable for learning recurring concepts.

Finally, the performance of the algorithms is evaluated through extensive experiments. The results show that they effectively solve the classification and concept drift problems of uncertain data streams, and that they are more effective and more accurate on large-volume, high-speed data streams.
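The weight adjustment described in the abstract can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: the class name `WeightedEnsemble`, the `max_size` and `min_weight` parameters, the use of accuracy on the newest labelled chunk as the weight, and the plain Python callables that stand in for ELM base classifiers. It is not the thesis's WE-DELM implementation.

```python
from collections import defaultdict


class WeightedEnsemble:
    """Toy weighted ensemble; a base classifier is any callable x -> label."""

    def __init__(self, max_size=5, min_weight=0.5):
        self.max_size = max_size      # hypothetical cap on ensemble size
        self.min_weight = min_weight  # hypothetical pruning threshold
        self.members = []             # list of [classifier, weight] pairs

    def update(self, chunk, new_classifier):
        """Re-weight members on the newest labelled chunk, prune, then add."""
        for member in self.members:
            clf = member[0]
            correct = sum(1 for x, y in chunk if clf(x) == y)
            member[1] = correct / len(chunk)
        # Drop members whose accuracy on the newest chunk fell below the threshold.
        self.members = [m for m in self.members if m[1] >= self.min_weight]
        # Assumption: the classifier trained on this chunk starts with full weight.
        self.members.append([new_classifier, 1.0])
        # Keep only the highest-weighted members.
        self.members.sort(key=lambda m: m[1], reverse=True)
        del self.members[self.max_size:]

    def predict(self, x):
        """Weighted majority vote over the current members."""
        votes = defaultdict(float)
        for clf, weight in self.members:
            votes[clf(x)] += weight
        return max(votes, key=votes.get)


def old_concept(x):
    return int(x > 0)  # label 1 when x is positive


def new_concept(x):
    return int(x < 0)  # after the drift, the labels are flipped


ensemble = WeightedEnsemble()
chunk_a = [(x, old_concept(x)) for x in (-2, -1, 1, 2)]
ensemble.update(chunk_a, old_concept)
before = ensemble.predict(3)

chunk_b = [(x, new_concept(x)) for x in (-2, -1, 1, 2)]
ensemble.update(chunk_b, new_concept)
after = ensemble.predict(3)
```

In this toy run the classifier for the old concept scores zero on the second chunk and is pruned, so the prediction for the same input changes once the drift has been observed. A real system would train a new base classifier on each chunk instead of receiving it ready-made.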
Keywords/Search Tags: uncertain data stream, classification, concept drift, ensemble classifier, distributed, extreme learning machine, recurring concepts