Analysis Of Clustering Algorithms For Uncertain Data Stream

Posted on: 2013-01-20
Degree: Master
Type: Thesis
Country: China
Candidate: Q Liu
GTID: 2248330395955654
Subject: Computer system architecture
Abstract/Summary:
With the rapid development of modern computing, communication, and Internet technologies, areas such as electronic commerce face massive data flows. At the same time, limited instrument accuracy and interference from the surrounding environment often degrade data quality during collection, producing a considerable amount of uncertain data streams. A clustering algorithm for uncertain data streams must not only cope with the continuous, rapid, unbounded, and unpredictable nature of data streams, but also reduce the impact of uncertain and abnormal data. Exploring efficient clustering algorithms for uncertain data streams has therefore become an important research topic in modern data mining.

This thesis focuses on the characteristics of uncertain data streams and studies the problem of clustering them. Building on density-based and grid-based clustering methods and the two-phase CluStream framework, it proposes the GDU-Stream and EGDU-Stream algorithms to address uncertain data stream clustering and the handling of abnormal data during clustering. The main work is summarized as follows:

1. Several commonly used clustering methods are summarized, and the advantages and disadvantages of combining density-based and grid-based clustering methods are analyzed in detail. The uncertain data stream model and the characteristics and difficulties of clustering such streams are described.
2. The impact of uncertain data on clustering is studied, and conceptual models for uncertain data stream clustering are designed. Based on the two-phase CluStream framework, GDU-Stream, a density- and grid-based clustering algorithm for uncertain data streams, is presented, and simulation results demonstrate its accuracy and efficiency.
3. The performance of density- and grid-based methods in handling abnormal data is studied. Considering the relationship between clustering and abnormal data in uncertain data streams, the thesis presents the EGDU-Stream algorithm, which improves and extends GDU-Stream with an abnormal-data clearing mechanism. Simulation experiments show that the algorithm not only removes abnormal data from the data source effectively but also completes clustering accurately and efficiently.
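The abstract does not give the details of GDU-Stream or EGDU-Stream, so the following is only a minimal sketch of the general idea of density- and grid-based clustering over an uncertain stream, not the thesis's algorithm. It assumes that each point arrives with an existence probability, that a cell's density is the time-decayed sum of those probabilities, and that face-adjacent dense cells form one cluster. The class name `GridDensitySketch` and the parameters `cell_size`, `decay`, and `dense_threshold` are hypothetical.

```python
from collections import deque
from math import floor


class GridDensitySketch:
    """Illustrative grid-density clustering; NOT the thesis's GDU-Stream."""

    def __init__(self, cell_size=1.0, decay=0.98, dense_threshold=2.0):
        self.cell_size = cell_size              # side length of a grid cell (assumed)
        self.decay = decay                      # per-arrival decay factor (assumed)
        self.dense_threshold = dense_threshold  # minimum mass of a dense cell (assumed)
        self.density = {}      # cell -> probability mass at its last update
        self.last_update = {}  # cell -> arrival counter at its last update
        self.t = 0             # number of points seen so far

    def _cell(self, point):
        # Map a point to the integer coordinates of its grid cell.
        return tuple(floor(x / self.cell_size) for x in point)

    def _current(self, cell):
        # Density of a cell decayed to the current arrival counter.
        return self.density[cell] * self.decay ** (self.t - self.last_update[cell])

    def insert(self, point, prob):
        """Online step: absorb one point with existence probability `prob`."""
        self.t += 1
        cell = self._cell(point)
        old = self._current(cell) if cell in self.density else 0.0
        self.density[cell] = old + prob
        self.last_update[cell] = self.t

    def sparse_cells(self):
        """Cells below the threshold: candidates for abnormal data."""
        return sorted(c for c in self.density
                      if self._current(c) < self.dense_threshold)

    def clusters(self):
        """Offline step: group face-adjacent dense cells with a breadth-first search."""
        dense = {c for c in self.density
                 if self._current(c) >= self.dense_threshold}
        seen, result = set(), []
        for start in sorted(dense):
            if start in seen:
                continue
            seen.add(start)
            queue, group = deque([start]), []
            while queue:
                cell = queue.popleft()
                group.append(cell)
                for i in range(len(cell)):
                    for step in (-1, 1):
                        nb = cell[:i] + (cell[i] + step,) + cell[i + 1:]
                        if nb in dense and nb not in seen:
                            seen.add(nb)
                            queue.append(nb)
            result.append(sorted(group))
        return result
```

In this sketch, low-probability points contribute little mass, so a cell fed mostly by uncertain or isolated readings stays below `dense_threshold` and appears in `sparse_cells()` rather than in any cluster. A clearing mechanism in the spirit of the one the abstract mentions could periodically drop such cells, though how EGDU-Stream actually does this is not stated here.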
Keywords/Search Tags: Uncertain Data Stream, Clustering, Abnormal Data, Grid Density