
The Study Of Clustering Algorithm Based On Grid-Density And Spatial Partition Tree

Posted on: 2007-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: D H Ceng
GTID: 2178360212978271
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Clustering analysis is an important research problem in data mining. It can be used not only as a standalone technique for discovering how data are distributed, but also as a preprocessing step for other data mining operations, so improving the performance of clustering algorithms is worthwhile. This thesis studies a new clustering algorithm based on grid density and a spatial partition tree (CGDSPT), developed through an analysis of representative existing clustering algorithms, especially density-based ones. We design and implement a clustering experimental system (MODE-CES) in C#. Experiments on many data sets show that CGDSPT is efficient. The main research contributions are as follows:
1. Existing clustering algorithms are divided into five classes and discussed systematically, and several density-based clustering algorithms are described in detail.
2. Spatial indexes are reviewed, and a novel spatial index structure (SP-Tree) based on spatial partitioning is presented. The SP-Tree preserves the spatial location of the data efficiently, which facilitates region neighborhood search. It also indexes only the non-empty cells of the partitioned space, which saves memory and improves performance.
3. A clustering algorithm based on grid density and the spatial partition tree (CGDSPT) is presented, combining the advantages of density-based and grid-based clustering algorithms with the spatial index structure. CGDSPT is a high-performance clustering algorithm with linear time complexity. It also has other notable characteristics: it is robust to outliers, can identify clusters of arbitrary shape and widely varying size, and is insensitive to the order of the samples.
4. To address the problem of setting the parameters correctly, we offer a novel method that adapts the parameters to the statistical characteristics of the dataset,...
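The general idea behind items 2 and 3 (index only non-empty grid cells, then join adjacent dense cells into clusters) can be illustrated with a minimal Python sketch. This is not the CGDSPT algorithm or the SP-Tree, whose details are not given in this abstract: a plain hash map stands in for the spatial index, and the function name `grid_density_cluster` and the parameters `cell_size` and `min_pts` are hypothetical names chosen for illustration.

```python
from collections import defaultdict, deque
from itertools import product


def grid_density_cluster(points, cell_size, min_pts):
    """Label points by joining adjacent dense grid cells.

    Returns one label per point; -1 marks points that fall in sparse cells.
    Assumption: a dict keyed by integer cell coordinates replaces the
    SP-Tree described in the thesis.
    """
    # Index only the non-empty cells: cell coordinates -> point indices.
    cells = defaultdict(list)
    for i, p in enumerate(points):
        key = tuple(int(c // cell_size) for c in p)
        cells[key].append(i)

    # A cell is dense when it holds at least min_pts points.
    dense = {k for k, idx in cells.items() if len(idx) >= min_pts}

    # Offsets to every neighboring cell (all-zero offset excluded).
    dim = len(points[0]) if points else 0
    offsets = [o for o in product((-1, 0, 1), repeat=dim) if any(o)]

    labels = [-1] * len(points)
    cell_label = {}
    next_label = 0
    # Visiting cells in sorted order makes labels independent of input order.
    for start in sorted(dense):
        if start in cell_label:
            continue
        cell_label[start] = next_label
        queue = deque([start])
        while queue:  # breadth-first flood fill over adjacent dense cells
            cur = queue.popleft()
            for i in cells[cur]:
                labels[i] = next_label
            for o in offsets:
                nb = tuple(c + d for c, d in zip(cur, o))
                if nb in dense and nb not in cell_label:
                    cell_label[nb] = next_label
                    queue.append(nb)
        next_label += 1
    return labels


if __name__ == "__main__":
    pts = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2),   # dense cell (0, 0)
           (1.1, 0.2), (1.3, 0.4), (1.2, 0.1),   # dense cell (1, 0), adjacent
           (5.0, 5.0), (5.1, 5.2), (5.3, 5.1),   # dense cell (5, 5), isolated
           (9.9, 0.0)]                           # lone point, treated as noise
    print(grid_density_cluster(pts, cell_size=1.0, min_pts=3))
    # prints [0, 0, 0, 0, 0, 0, 1, 1, 1, -1]
```

In this sketch each point is binned once and each dense cell is visited once, so for a fixed dimension the work grows roughly linearly with the number of points, apart from the sort over dense cells. Points in sparse cells are simply left unlabeled, which is one simple way to obtain the outlier tolerance the abstract mentions.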
Keywords/Search Tags: Clustering, Grid-Density, Spatial Partition Tree