Cluster Analysis And Its Applied Research On Social Networks

Posted on: 2021-04-28
Degree: Master
Type: Thesis
Country: China
Candidate: D R Chu
GTID: 2428330611473219
Subject: Control Science and Engineering
Abstract/Summary:
As an unsupervised learning method, cluster analysis is one of the important research directions in machine learning. It has been applied successfully in finance, business, social networking, bioinformatics and other fields. Many mature and effective clustering algorithms now exist. Among them, spectral clustering, which is based on graph theory, has attracted extensive attention because it can partition data of arbitrary shape and is easy to implement. However, its computational complexity and memory overhead are relatively large, which makes it impractical for large-scale data sets. At the same time, the rapid development of information technology, represented by the Internet, has made social network research increasingly urgent, and clustering has become an important and effective way to analyze real social networks. This thesis studies the scalability of the spectral clustering algorithm and applies the improved algorithms to community detection in large-scale social networks. The main research content covers the following four aspects:

(1) Most spectral clustering algorithms use distance to measure the similarity between data points, which results in low clustering efficiency. To address this, an axiomatic fuzzy shared nearest neighbor adaptive spectral clustering algorithm is proposed. First, a fuzzy similarity measure is defined using axiomatic fuzzy set theory, and the identified features are used to construct a more suitable similarity matrix. Then, to further improve clustering accuracy, the shared nearest neighbor method automatically adjusts the scale parameter according to the density of the neighborhood in which each point lies. Simulation experiments show that the improved algorithm achieves better clustering results than distance-based spectral clustering, adaptive spectral clustering, fuzzy clustering and landmark-based spectral clustering.

(2) The computational complexity of spectral clustering is excessive on large-scale data sets. To address this, a weighted PageRank autoencoder spectral clustering algorithm with an improved landmark representation is proposed. First, the nodes with the highest weight in the data affinity graph are selected as landmark points, and the similarities between the selected landmark points and the other data points are used to approximate the similarity matrix, which serves as the input to a stacked autoencoder. Then, a clustering error based on KL divergence is used to update the autoencoder parameters and the cluster centers simultaneously, and the reconstruction error is included to reduce the negative impact that spatial distortion of the embedded representation has on clustering. Experimental results show that the algorithm effectively reduces complexity and is suitable for large-scale data sets.

(3) Most semi-supervised spectral clustering algorithms cannot use constraint information effectively and still need to perform an eigendecomposition of the Laplacian matrix of all the data. To address this, a semi-supervised spectral clustering algorithm based on incomplete Cholesky decomposition is proposed. First, incomplete Cholesky decomposition is used to select a limited number of columns and rows of the similarity matrix, so that the corresponding sparse data set represents the complete data set well and yields an approximate similarity matrix. Then, the approximate similarity matrix is used to improve the objective function of constrained spectral clustering and the scalability of the semi-supervised spectral clustering algorithm. Experiments show that the improved algorithm has better clustering performance than several other semi-supervised spectral clustering algorithms.

(4) Current clustering-based community detection for social networks still requires matrix decomposition, and its high complexity makes it difficult to apply to large-scale social network data sets. To address this, the improved spectral clustering algorithms described above are applied to community detection in large-scale social networks. The experimental results show that the proposed approach improves the efficiency of community division while maintaining its accuracy.
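
As a rough illustration of the baseline idea behind spectral community detection (not the improved algorithms proposed in the thesis), the following minimal sketch splits a small undirected graph into two communities using the sign of the second eigenvector of the normalized adjacency matrix. The function name `spectral_bisection`, the `adj` adjacency-list-of-lists input, and the toy graph are assumptions made for this example. The eigenvector is estimated by plain power iteration, and the sketch assumes a connected graph with at least two nodes.

```python
import math


def spectral_bisection(adj, iters=500):
    """Two-way spectral split of an undirected, connected graph.

    adj is a symmetric 0/1 adjacency matrix given as a list of lists
    (a hypothetical input format chosen for this sketch).
    """
    n = len(adj)
    deg = [sum(row) for row in adj]

    # The top eigenvector of N = D^{-1/2} A D^{-1/2} is proportional to sqrt(deg).
    total = math.sqrt(sum(deg))
    top = [math.sqrt(d) / total for d in deg]

    def step(v):
        # Multiply by (I + N) / 2; its eigenvalues lie in [0, 1], so power
        # iteration converges to the largest remaining eigenvalue of N.
        out = []
        for i in range(n):
            s = sum(v[j] / math.sqrt(deg[i] * deg[j])
                    for j in range(n) if adj[i][j])
            out.append(0.5 * (v[i] + s))
        return out

    def deflate_and_normalize(v):
        # Remove the component along the top eigenvector, then rescale to unit length.
        dot = sum(a * b for a, b in zip(v, top))
        v = [a - dot * b for a, b in zip(v, top)]
        length = math.sqrt(sum(a * a for a in v))
        return [a / length for a in v]

    # Deterministic starting vector (an arbitrary choice for reproducibility).
    v = deflate_and_normalize([float(i) for i in range(n)])
    for _ in range(iters):
        v = deflate_and_normalize(step(v))

    # The sign pattern of the second eigenvector defines the two communities.
    return [0 if x < 0 else 1 for x in v]


# Toy network: two 4-node cliques joined by a single edge between nodes 3 and 4.
adj = [[0] * 8 for _ in range(8)]
for group in (range(0, 4), range(4, 8)):
    for i in group:
        for j in group:
            if i != j:
                adj[i][j] = 1
adj[3][4] = adj[4][3] = 1

labels = spectral_bisection(adj)
print(labels)
```

Each iteration here costs a full pass over the adjacency matrix, so this sketch does not scale. That cost is the kind of bottleneck the landmark and incomplete Cholesky approximations in the abstract are meant to reduce.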
Keywords/Search Tags: spectral clustering, axiomatic fuzzy set theory, landmark, semi-supervised learning, incomplete Cholesky decomposition, social network, community detection