Research On Grid And Density Based Clustering Algorithm

Posted on: 2012-10-19; Degree: Master; Type: Thesis
Country: China; Candidate: Y F Ma; Full Text: PDF
GTID: 2218330368986906; Subject: Computer software and theory
Abstract/Summary:
Grid- and density-based clustering algorithms are fast and can find clusters of arbitrary shape, which makes them suitable for clustering spatial data. However, currently available grid- and density-based clustering algorithms often require the user to supply two parameters, grid size and density threshold, which increases the burden on users and makes clustering results hard to control. Grid size determines the resolution at which the data are observed, and so largely determines the clustering result; it also affects the speed of the grid clustering algorithm. To set the grid size and density threshold, most currently available algorithms rely on the total number of data points, the average density, and other statistics, and use an empirical formula to obtain the two parameters. These methods are rather simplistic.

After analyzing the main clustering algorithms, especially their approaches to choosing grid size and density threshold, this thesis first puts forward the view that, for a given density threshold, the grid size is optimal when the number of dense grid cells reaches its maximum. On this basis, a new approach is proposed that obtains the optimal grid size and density threshold over a set of given density threshold conditions. The approach selects the optimal grid size and density threshold according to the principle of maximizing the number of dense grid cells and the way dense and sparse grid cells are generated. Because the resulting grid size reflects the internal structure of the data without descending into trivial local detail, it is appropriate for cluster analysis. The resulting grid cells also provide a very good compression of the dataset. The approach greatly reduces the amount of prior knowledge required of the user, making the method an essentially parameter-free clustering algorithm. Experiments show that the method is less time-consuming and able to find the main cluster structures in spatial data.
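
The central idea, that for a fixed density threshold the preferred grid size is the one producing the most dense cells, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names `count_dense_cells` and `pick_cell_size`, the explicit list of candidate sizes, the "at least `density_threshold` points" definition of a dense cell, and the first-wins tie-breaking are all assumptions made for illustration, and the sketch does not cover how the thesis selects the density threshold itself.

```python
from collections import Counter


def count_dense_cells(points, cell_size, density_threshold):
    """Count grid cells holding at least `density_threshold` points.

    Assumption: a cell is "dense" when its point count reaches the threshold.
    """
    # Map every point to the integer index of the cell that contains it.
    counts = Counter(
        tuple(int(coord // cell_size) for coord in point) for point in points
    )
    return sum(1 for n in counts.values() if n >= density_threshold)


def pick_cell_size(points, candidate_sizes, density_threshold):
    """Return (cell_size, dense_cell_count) with the most dense cells.

    Assumption: candidates are supplied by the caller; on a tie the
    earliest candidate in the list wins.
    """
    best_size, best_count = None, -1
    for size in candidate_sizes:
        dense = count_dense_cells(points, size, density_threshold)
        if dense > best_count:
            best_size, best_count = size, dense
    return best_size, best_count


# Two small, well-separated groups of 2-D points (illustrative data only).
points = [
    (0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4),
    (5.1, 5.2), (5.3, 5.1), (5.2, 5.4), (5.4, 5.3),
]
best_size, dense_count = pick_cell_size(points, [0.25, 0.5, 1.0, 8.0], 3)
```

On this toy data, a cell size of 0.25 splits each group across four cells so that none is dense, while a size of 8.0 merges both groups into a single dense cell. The intermediate sizes keep the two groups as two separate dense cells, which is the behavior the maximum-dense-cell criterion is meant to favor.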
Keywords/Search Tags: clustering algorithm, grid granularity, density threshold, nonparametric clustering