Font Size: a A A

The Improvement Of The Spectral Clustering Algorithm

Posted on:2015-12-27Degree:MasterType:Thesis
Country:ChinaCandidate:P YangFull Text:PDF
GTID:2348330518470360Subject:Signal and Information Processing
Abstract/Summary:PDF Full Text Request
At present, clustering analysis is one of the very active research focuses in the international field of Data Mining and Machine Learning. And as a novel clustering algorithm,spectral clustering algorithm has many advantages over traditional clustering algorithms.Spectral clustering algorithm is simple, easy to implement, fall into global optimal solution,and it can analysis data of any shapes.Built on the basis of affinity matrix, traditional spectral clustering algorithm requires to construct the laplacian matrix and to compute its eigenvalues and the corresponding eigenvectors. At the same time, according to some rules it will select one or more eigenvectors to analyze the data. However, three are at least two problems waiting to be solved. The first problem is how to set the scale parameters required by constructing the affinity matrix. The second one is the computational complexity of directing computing the eigenvalues decomposing (EVD) is proportional to O(n3).The two problems severely limit the practical application of the traditional spectral clustering algorithm. The following works to further improve the spectral clustering algorithm was performed in this paper.(1) The studies showed that like the sampling technique, the low-rank approximation method could also be well used to solve the problem of the high computational complexity of EVD. What's more, the approximation error of the low-rank approximation method was lower than the sampling technique. As a result, a novel spectral clustering algorithm named spectral clustering using the low-rank approximation method is proposed in this paper by combining the low-rank approximation method into the traditional spectral clustering algorithm. Experiments show that new algorithm is fast and can gain good quality while reducing the approximation error effectively.(2) Though both the sampling technique and the low-rank approximation method can largely lower the computational complexity of the EVD, they involve sampling technique. As is known, no matter how the samples or representatives are chosen randomly or by using a more expensive selection,the samples may not completely represent the whole dataset and may not correctly capture the cluster geometry structures. So, some other methods of finding the approximation embedding without involving the sampling technique are required. A novel spectral clustering algorithm named spectral clustering using the commute time is proposed in this paper by combining the commute time into the traditional spectral clustering algorithm.Experiments show that the proposed algorithm is fast and can gain good quality.(3) A spectral clustering algorithm using the Nystrom method is improved in this paper to perfect its theoretical supports and improve its efficiency.
Keywords/Search Tags:Clustering Analysis, Data Mining, Machine Learning, Spectral Clustering
PDF Full Text Request
Related items