Research On Improved Clustering Algorithm Based On Density

Posted on:2008-12-12
Degree:Master
Type:Thesis
Country:China
Candidate:Z H Yu
Full Text:PDF
GTID:2178360242967572
Subject:Software engineering
Abstract/Summary:
With the explosive growth of global information, data mining has become a focus of computer science and technology research in the new century. Clustering is one of the most fundamental and essential data mining tasks and has broad applications. The quality and efficiency of clustering algorithms play a vital role in these applications, and improving them remains an open problem in computer science. Many clustering algorithms have been proposed so far, including hierarchical methods, partitioning methods, grid-based methods, density-based methods and others.

Density-based clustering is an important branch of cluster analysis. Its advantages are the ability to discover clusters of arbitrary shape and insensitivity to noise. However, most density-based methods are not effective on practical data sets because they cannot handle clusters of varying density. The varying density problem has become one of the focuses of density-based clustering research.

To remove this hurdle, this thesis presents CUDL, a novel algorithm based on dispersive degree and combined with classification. The algorithm has two steps. First, the data points are ordered into a sequence according to a new relative density metric, called dispersive degree, which depicts the data distribution intuitively. The sequence is then used to distinguish cores from edges. Second, the edges are assigned to the proper clusters using the KNN-kernel density estimation method. Finally, the clusters are discovered from the information revealed in these steps.

Several experiments were performed. The results suggest that CUDL avoids a shortcoming of the other algorithms, namely that their parameters are fixed without regard to the structure of the data set. The results also suggest that CUDL is effective in handling the varying density problem and is more efficient than well-known density-based algorithms such as DBSCAN, OPTICS and KNNCLUST.
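The abstract does not define dispersive degree, so the following is only a minimal sketch of the general idea behind the two steps, under stated assumptions: a plain k-nearest-neighbor density estimate stands in for the relative density metric, and a k-nearest-neighbor majority vote stands in for the classification of edges. The function names (`kth_neighbor_distance`, `knn_density`, `assign_by_knn`) and the toy data are hypothetical and are not taken from the thesis.

```python
import math

def kth_neighbor_distance(points, k):
    """Distance from each point to its k-th nearest neighbor (itself excluded)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result

def knn_density(points, k):
    """2-D kNN density estimate: k / (n * area of the k-th neighbor disc).

    Assumption: this is a stand-in for a relative density metric, not the
    thesis's dispersive degree. It also assumes no duplicate points, since a
    zero k-th neighbor distance would divide by zero.
    """
    n = len(points)
    return [k / (n * math.pi * r * r) for r in kth_neighbor_distance(points, k)]

def assign_by_knn(point, labeled, k):
    """Assign `point` the most common label among its k nearest labeled points.

    Assumption: a simple majority vote, used here in place of the KNN-kernel
    density estimation step named in the abstract.
    """
    nearest = sorted(labeled, key=lambda item: math.dist(point, item[0]))[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Toy data: one dense cluster near the origin and one sparse cluster far away.
dense = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
sparse = [(10.0, 10.0), (12.0, 10.0), (10.0, 12.0), (12.0, 12.0)]
points = dense + sparse

# Step 1 (sketch): order the points by estimated density, densest first.
density = knn_density(points, k=2)
order = sorted(range(len(points)), key=lambda i: density[i], reverse=True)

# Step 2 (sketch): assign unlabeled edge points using the labeled points.
labeled = [(p, "dense") for p in dense] + [(p, "sparse") for p in sparse]
edge_near_dense = assign_by_knn((0.3, 0.3), labeled, k=3)
edge_near_sparse = assign_by_knn((9.0, 11.0), labeled, k=3)
```

Because the density estimate is local to each point, the two clusters separate in the ordering even though their spacings differ by a factor of twenty. A single global radius, as in a fixed-parameter density method, would have to be tuned to one spacing or the other.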
Keywords/Search Tags:Clustering analysis, Various density, Dispersive degree, Data mining