
Improvement Of Density-based Algorithm In Cluster Analysis

Posted on: 2014-11-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Lin
GTID: 2298330434472506
Subject: Computer software and theory
Abstract/Summary:
Cluster analysis is the task of grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups. It is a main task of exploratory data mining and a common technique for statistical data analysis, used in many fields including statistics, machine learning, pattern recognition, information retrieval, and bioinformatics. Numerous clustering algorithms have been proposed so far. Density-based clustering is one of the more powerful approaches because it can detect arbitrarily shaped clusters in the data space. Existing density-based clustering algorithms such as DBSCAN and DENCLUE are not well suited to clusters of different densities, because they rely on global parameters. SNN is not very efficient because it has to reconstruct the shared nearest neighbor (sNN) graph from the k nearest neighbor (kNN) similarity matrix. In this paper, we propose a clustering algorithm, DEFAT, which is based on a novel model called Density-Flow. In the Density-Flow model, data objects share their local density information to establish a global similarity between objects. Based on that, DEFAT can easily separate dense areas from sparse ones, so it can detect clusters of various shapes, sizes, and densities, even when the clusters overlap. Our experiments on both synthetic and real-world data sets demonstrate that our approach outperforms existing density-based clustering algorithms in both effectiveness and efficiency.
Keywords/Search Tags:Density-Flow, Similarity, Clustering, Data Mining
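
As an illustration of the kNN-to-sNN step that the abstract cites as a cost of SNN, the following is a minimal sketch of shared nearest neighbor similarity in the general Jarvis-Patrick style. It is not DEFAT and not the author's implementation; the function names `knn_lists` and `snn_similarity`, the toy points, and the choice k = 3 are assumptions made only for this example. The brute-force pairwise loops also show why rebuilding the sNN graph can be expensive.

```python
from math import dist

def knn_lists(points, k):
    """For each point, return the set of indices of its k nearest neighbors (brute force)."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors

def snn_similarity(neighbors):
    """Shared nearest neighbor similarity: count of kNN entries two points have in common.

    Assumption for this sketch: two points are linked only if each appears
    in the other's kNN list; otherwise their similarity is 0.
    """
    n = len(neighbors)
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if j in neighbors[i] and i in neighbors[j]:
                sim[i][j] = sim[j][i] = len(neighbors[i] & neighbors[j])
    return sim

# Hypothetical toy data: one dense group (indices 0-3) and one sparse group (indices 4-7).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)]
sim = snn_similarity(knn_lists(points, k=3))
```

In this toy data, pairs inside the dense group and pairs inside the sparse group receive the same similarity, since the measure depends on neighbor ranks rather than absolute distances. That is the property that lets sNN-based methods cope with differing densities without a single global distance threshold.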