Research On An Effective Self Adapted Grid-Density Based Clustering

Posted on: 2012-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: Y K Yao
Full Text: PDF
GTID: 2178330335970093
Subject: Computer software and theory
Abstract/Summary:
Cluster analysis is an important branch of data mining, and further study of clustering algorithms has both theoretical and practical value. In this dissertation, we systematically survey clustering algorithms, with particular emphasis on grid-based and density-based methods. Building on this study, we propose a new clustering algorithm named GDSCLU (Grid-Density based Self-adapted Clustering). Its main innovations are as follows:

(1) By partitioning the data space into a grid, we map the task of clustering data objects onto grid cells, which greatly reduces the complexity of the work.
(2) Several important concepts, such as Grid Cells Density Connected and Grid Cells Density Reachable, are proposed. They are central to GDSCLU and avoid the need to specify the ε value required by DBSCAN.
(3) Based on the natural distribution characteristics of the data, a new and effective method is given for determining the grid cell density threshold, so that dense cells can be distinguished from sparse cells.
(4) To decide where clustering should begin, a new measure named DenRfac (the ratio between the density of a cell and the mean density of all its nonempty adjacent cells) is used to select core cells from among the dense cells, which helps ensure the correctness of the results.
(5) An improved spatial index structure, SP-TREE, is constructed to improve the efficiency of GDSCLU.
(6) Sparse cells adjacent to dense cells are merged with them, which improves the accuracy of the clustering results.
(7) Based on the work above, the GDSCLU algorithm is presented, and its validity, correctness, effectiveness and scalability are examined through both theoretical analysis and experiments.
(8) A novel grid partition method is proposed in an attempt to advance grid-based clustering, and a comprehensive mathematical function is given to describe the edges of the Honeycomb Grid.
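
The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea it describes: bin points into grid cells, separate dense from sparse cells, pick seed cells by a density ratio, grow clusters across adjacent dense cells, and finally attach neighbouring sparse cells. All names here (`cell_of`, `neighbors`, `den_rfac`, `grid_density_cluster`) and the parameters `cell_size`, `density_threshold` and `core_ratio` are assumptions made for illustration. In particular, the sketch uses a fixed, user-supplied density threshold and a plain square grid, whereas the thesis derives the threshold adaptively and also uses an SP-TREE index and a Honeycomb Grid, none of which is reproduced here.

```python
from collections import Counter, deque
from itertools import product
from statistics import mean


def cell_of(point, cell_size):
    """Map a point to the integer index of the square grid cell containing it."""
    return tuple(int(x // cell_size) for x in point)


def neighbors(cell):
    """Yield the indices of all cells adjacent to `cell` (faces and corners)."""
    for offset in product((-1, 0, 1), repeat=len(cell)):
        if any(offset):  # skip the all-zero offset, i.e. the cell itself
            yield tuple(c + o for c, o in zip(cell, offset))


def den_rfac(cell, density):
    """Ratio of a cell's density to the mean density of its nonempty neighbours."""
    adjacent = [density[n] for n in neighbors(cell) if n in density]
    if not adjacent:
        return float("inf")  # assumption: an isolated cell counts as a local peak
    return density[cell] / mean(adjacent)


def grid_density_cluster(points, cell_size, density_threshold, core_ratio=1.0):
    """Return one cluster label per point; -1 marks points left as noise."""
    density = Counter(cell_of(p, cell_size) for p in points)
    # Assumption: a fixed threshold stands in for the thesis's adaptive one.
    dense = {c for c, d in density.items() if d >= density_threshold}
    # Seed (core) cells: dense cells whose density ratio reaches core_ratio.
    cores = sorted(
        (c for c in dense if den_rfac(c, density) >= core_ratio),
        key=lambda c: -density[c],
    )

    labels = {}
    next_label = 0
    for core in cores:
        if core in labels:
            continue
        # Breadth-first growth over adjacent dense cells.
        labels[core] = next_label
        queue = deque([core])
        while queue:
            current = queue.popleft()
            for n in neighbors(current):
                if n in dense and n not in labels:
                    labels[n] = next_label
                    queue.append(n)
        next_label += 1

    # Attach each sparse cell to its densest adjacent clustered dense cell.
    for cell in density:
        if cell not in labels:
            adjacent = [n for n in neighbors(cell) if n in dense and n in labels]
            if adjacent:
                labels[cell] = labels[max(adjacent, key=density.get)]

    return [labels.get(cell_of(p, cell_size), -1) for p in points]
```

Because all work after the initial binning is done per cell rather than per point, the cost of the growth step depends on the number of nonempty cells, which is the kind of complexity reduction the abstract attributes to the grid mapping.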
Keywords/Search Tags:Data Mining, Clustering, Grid, Density, Honeycomb Grid