Font Size: a A A

Adaptive Spectral Clustering Based On Shared Nearest Neighbors

Posted on:2011-12-05Degree:MasterType:Thesis
Country:ChinaCandidate:J W LiFull Text:PDF
GTID:2178330332961274Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Spectral clustering has become one of the most popular clustering algorithms in recent years. Traditional clustering methods often suppose the data coincide with certain distribution such as k-means. But spectral clustering method doesn't make any assumption. It directly computes the leading eigenvectors of Laplacian matrix and gets an approximation to graph partitioning problems, so it is applicable to a wide range especially when the data set is non convex. As spectral clustering bases directly on the corresponding Laplacian matrix of the similarity matrix, similarity measurement is crucial to the performance of spectral clustering.In this paper it firstly gives an introduction to the mathematical objects used by spectral clustering and explains why spectral clustering algorithms work in two different views. One is graph partitioning and the other is from random walk perspective. Then the mostly used similarity measure, the Gaussian kernel function, and other similarity measures used in spectral clustering are analyzed. With the same Euclidean distance, two points in the same cluster should have higher similarity than two points in different clusters, this is the clustering assumption. Neither the traditional Gaussian kernel function nor the measure using the local scale parameter proposed in Self-Tuning satisfies the clustering assumption. Through analyzing of the strength and weakness of these measurements, we propose a novel similarity measure, namely adaptive Gaussian kernel function based on shared nearest neighbors in this paper. It can detect the intrinsic structure of the cluster embedded in the data sets through exploiting the information about local density embedded in the shared nearest neighbors and obtain higher similarity value when there are many common neighbors between the two data points. This measure holds the clustering assumption, and the affinity matrix has clear block structure. The corresponding spectral clustering method is called adaptive spectral clustering based on shared nearest neighbors. Finally, a number of experiments on some synthetic dataset and four real data sets which comes from the UCI data repository are carried out. Experimental results show that the proposed algorithm achieves considerable improvements over the typical spectral clustering algorithm and the Self-Tuning spectral clustering algorithm. It can reveal the relationship between the data points and find the real clusters.
Keywords/Search Tags:Spectral Clustering, Similarity Measure, Shared Nearest Neighbors
PDF Full Text Request
Related items