Font Size: a A A

Improved Spectral Clustering Algorithm And Its Application In Social Networks

Posted on:2015-02-23Degree:MasterType:Thesis
Country:ChinaCandidate:J YanFull Text:PDF
GTID:2268330431957573Subject:Computer application technology
Abstract/Summary:
Cluster analysis is to divide the data into a plurality of clusters (or class) of the process. Compared with the supervised classification, clustering analysis is unsupervised learning, which is the case in the absence of the training, data object is divided into a plurality of clusters, the data object belongs to which cluster and divided into several clusters depends on the data itself, In traditional clustering algorithms, efficient k-means algorithm and the EM algorithm has been widely used. However, both traditional clustering algorithms can only find a convex spherical sample space, when the sample space is not convex spherical, the algorithm will fall into local optimal solution.In recent years, spectral clustering algorithm received extensive attention by the researchers, it was another hot topic in data mining and machine learning fields. Compared with traditional clustering algorithms, spectral clustering algorithm can find the correct clustering in the sample space in any shape, and eventually converge in the global optimal solution. Spectral clustering algorithm based on spectral graph theory, to reconstruct the original data space by Laplacian matrix, reduce the dimension of the cluster analysis object, which makes the distribution of data structures on the subspace clearer. At the same time it can get excellent results, spectral clustering algorithm there are many problems, in order to make the algorithm more widely used, spectral clustering algorithm itself have many places need to be optimized by researchers. This article will briefly introduce these issues. For the issue of the traditional spectral clustering algorithm is sensitive of the Gaussian kernel parameters, inspired by the density sensitive similarity measure, the paper designed two similarity measure, these two methods of similar measure are not introduced Gaussian kernel parameters, the main difference between the two methods is that the shortest path first introduced, while the second does not, the second proved better similarity measure overall performance, verified by experiment its improve the stability of the whole algorithm. Spectral clustering algorithm is matching algorithm, the final stage of the algorithm is to use the k-means (or other traditional clustering algorithms) for the selected feature vector clustering, however k-means algorithm is sensitive to the initial cluster centers, this article also designed a simple but effective k-means algorithm of optimize the initial cluster centers, the method is applied to spectral clustering algorithm which improved in this paper, the experiments proved more stable of the clustering results. Finally, combined with improved spectral clustering algorithm, presents a clustering algorithm framework which is applied in social network. It includes an effective sampling technique which can select a best representative subgraph, both to ensure the quality of the training phase of clustering and reduces the computation time. In addition, use the modularity to choose the best clustering model (ie, select the appropriate parameters, such as the number of clusters k), the experimental results show the effectiveness of the algorithm framework.
Keywords/Search Tags:Cluster analysis, spectral clustering, social network
Related items