Study And Implementation On Uncertain Data Stream Clustering Algorithms Based On Density

Posted on: 2012-08-30
Degree: Master
Type: Thesis
Country: China
Candidate: M Song
Full Text: PDF
GTID: 2248330395458149
Subject: Computer application technology
Abstract/Summary:
In recent years, the rapid development and deployment of networks and network devices has produced large volumes of uncertain data in both business applications and academic research. For example, Wireless Sensor Networks and Radio Frequency Identification technologies generate large volumes of uncertain streaming data. Cluster analysis in uncertain data stream environments therefore has important applications and good prospects. However, existing clustering algorithms designed for static data or for conventional data streams cannot meet this demand, so research on clustering algorithms for uncertain data streams is needed.

Uncertain data streams raise three problems. First, the uncertainty of the data makes it hard for a clustering algorithm to make full use of the data, and computing only the expected distance does not yield high-quality clustering results. Second, most existing data stream clustering algorithms use a landmark window or a similar window model and often simply delete the least recently updated cluster, so they cannot handle evolving data streams efficiently and cannot analyze the clustering details of the most recent data. Finally, partition-based data stream clustering algorithms can mostly form only spherical clusters and cannot produce clusters of arbitrary shape that follow the actual distribution of the data.

For these reasons, this thesis studies density-based clustering for uncertain data streams. First, it proposes the concept of uncertainty to measure the distribution information of uncertain data and improves DENCLUE, a clustering algorithm for certain data, so that it can handle uncertain data, minimizing the impact of data uncertainty on the clustering results. Second, it proposes USDENCLUE, a density-based clustering algorithm for uncertain data streams under the sliding window model, which achieves rapid deletion of expired features by maintaining an exponential histogram of cluster features. As a result, the algorithm can analyze the details of a specific time window, handle noisy data and evolving data streams efficiently, and generate clusters of arbitrary shape, which improves clustering quality. Finally, USDENCLUE is compared with the well-known CluStream clustering algorithm on real and synthetic data sets. The experimental results show that USDENCLUE clusters uncertain data streams well and deals effectively with noisy data and evolving data streams.
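
The abstract does not describe the internals of USDENCLUE. As a rough illustration of the general idea of an exponential histogram of cluster features under a sliding window, the Python sketch below keeps time-stamped buckets of summary statistics, merges older buckets into larger ones, and drops a bucket as soon as its newest point leaves the window. The class names, the one-dimensional cluster feature (count, linear sum, square sum) and the parameter `max_per_size` are assumptions made for illustration, not the thesis's definitions, and the sketch ignores data uncertainty.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Bucket:
    """Hypothetical 1-D cluster feature covering `count` consecutive points."""
    count: int         # number of points summarised
    linear_sum: float  # sum of the values
    square_sum: float  # sum of the squared values
    newest: int        # timestamp of the most recent point in the bucket


class ExpHistogram:
    """Illustrative exponential histogram of cluster features (oldest bucket first)."""

    def __init__(self, window: int, max_per_size: int = 2) -> None:
        self.window = window              # sliding window length, in timestamps
        self.max_per_size = max_per_size  # buckets allowed per size before merging
        self.buckets: List[Bucket] = []

    def add(self, value: float, timestamp: int) -> None:
        # Each new point starts as a bucket of size 1.
        self.buckets.append(Bucket(1, value, value * value, timestamp))
        # Expired data is deleted a whole bucket at a time.
        while self.buckets and self.buckets[0].newest <= timestamp - self.window:
            self.buckets.pop(0)
        self._merge()

    def _merge(self) -> None:
        size = 1
        while True:
            same = [i for i, b in enumerate(self.buckets) if b.count == size]
            if len(same) <= self.max_per_size:
                return
            # Buckets of equal size are adjacent; merge the two oldest ones.
            i = same[0]
            old, new = self.buckets[i], self.buckets[i + 1]
            self.buckets[i:i + 2] = [Bucket(old.count + new.count,
                                            old.linear_sum + new.linear_sum,
                                            old.square_sum + new.square_sum,
                                            new.newest)]
            size *= 2  # the merge may overflow the next size class

    def summary(self) -> Tuple[int, float, float]:
        """Return (count, mean, variance) over the buckets currently kept."""
        n = sum(b.count for b in self.buckets)
        if n == 0:
            return 0, 0.0, 0.0
        mean = sum(b.linear_sum for b in self.buckets) / n
        var = sum(b.square_sum for b in self.buckets) / n - mean * mean
        return n, mean, var


eh = ExpHistogram(window=4)
for t in range(1, 9):
    eh.add(float(t), t)
count, mean, variance = eh.summary()
```

Because a bucket is removed only when its newest point has expired, the oldest bucket may still contain a few points from just outside the window. The summary is therefore approximate, which is the usual trade-off of this structure: deletion is cheap and the number of buckets grows only logarithmically with the window length.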
Keywords/Search Tags: uncertain, clustering, density, data stream