Font Size: a A A

Research On The Algorithm Of Mining Community Structure Based On Social Network

Posted on:2019-06-02Degree:MasterType:Thesis
Country:ChinaCandidate:S P ChenFull Text:PDF
GTID:2370330545977171Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
With the rapid development of Internet,a great deal of data have been generated from the social network.It is of great import theoretical value and practical significance to study the social network,which can help us understand the network structure,analyze the characteristics of the netwok and discover the hidden rules in the netwok.How to mine meaningful social substructure from these social network data has become the one of the main hot researchs of data mining.In this thesis,the real social networks are taken as research objects and two types of community discovery problem from an unweighted and weighted social networks are investigated respectively.For the unweighted networks,we firstly propose a community subgraph discovery algorithm based on the penalty matrix decomposition composing of singual value decomposition and penalty decomposition.For the weighted networks,the similarity of network toplogical features is used to compute the edge weight,a community subgraph mining algorithm based on improved CPM and K-Means is proposed.The main work of thesis is as follows:(1)For the unweighted social networks,we firstly propose a Community Structure Discovery Algorithm based on Penalized Matrix Decomposition(PMDCSDA)with the multifactor.First the network dataset is numbered,and then the network is transformed into adjacency matrix accoding to the network topology,next,the adjacency matrix is normalized,and then the normalized matrix is decomposed by singular value.Finally,multifactor penalty matrix algorithm is used to discovery the community subgraphs.Comparing with the existing algorithms(K-Means,GN,CPM),experimental results show that the proposed method can not only find community subgraphs but also achieve good results in terms of precision and F-measure in the Karate,Dolphins and Football dataset.(2)The performance of PMDCSDA in low density social network is generally good,whearas in the relatively dense network is relative poor,we propose a CPM-KMeans algorithm based on CPM and K-Means.First,the maximum number of cliques are generated combining with dept-first search and breadth-first search,which is used to determine the number of cluster center point.Then we extend the K-Means algorithm based on the clique,and the similary value of points are employed to compute the distance between points.Compared with the PMDCSDA algorithm,the CPM-KMeans showed the excellent performance on the Football dataset in terms of accuracy and seperation.We further validate the effectiveness of our method on the protein interation network of Collins dataset.
Keywords/Search Tags:Social Network, Community Structure, Penalty Matrix, K-Means, CPM
PDF Full Text Request
Related items