Font Size: a A A

Community Clustering Algorithm Based On The Label Propagation And Fitness

Posted on:2016-02-02Degree:MasterType:Thesis
Country:ChinaCandidate:C LiFull Text:PDF
GTID:2180330461467805Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Many network systems in the real word can be abstracted into a social network, in these networks, nodes represent individuals and the edges between nodes indicate the relationship of individuals. Along with the unceasingly thorough development of the social network research, people found that the network has the characteristics of community structure. The links between nodes within the community are close, while the nodes between them connect to each other are sparse. Community structure often represents set of nodes in the network having the same property features and mining this structure of network has the very important theoretical significance on studying the evolution of social network, analyzing the potential capabilities and understanding the topology of network.The related research work of this paper focus on how to mining the community structure quickly and effectively from network. We first analyzes and summarizes the related background and current research status of the community clustering, and studies the important theory about social network, which laying the theoretical foundation for follow-up work. Then we make a deep analysis of the label propagation algorithm which is short for LPA and find LPA algorithm has many problem, it has the " label upstream" phenomenon, has many initial label, the condition to update label is single and it has randomness, which affect the its performance and effectiveness. We give the general solution to solve those problem, and on the basis of that general solution, propose a label propagation community clustering algorithm with label fitness, referred to as the LPA-FA algorithm.First of all, all the nodes in the network are formed an ordered sequence in the basis of the size of the node degree. In each iteration process of LPA-FA algorithm, the nodes sequentially update their labels according to the sequence of nodes, which avoids the " label upstream" phenomenon of LPA algorithm caused by using the random sequence of nodes, and improves the efficiency of the algorithm. In the process of node label initialization, LPA-FA algorithm reduces the number of initial labels in the network by using a simple linear initialization method, which reduces the running time of the algorithm. When the update label is not unique for the current node, we introduce three parameters, the clustering coefficient of adjacency subsystem, the weight of adjacent edges, the similarity of nodes’ attributes, and get the comprehensive evaluation value of the three parameters by Topsis method as the condition of choosing label. In this paper, this comprehensive evaluation value is name as label fitness and we choose the label of the adjacency subsystem which has the biggest label fitness as the update label for the current node, which not only solves the problem that the nodes’ update condition of LPA algorithm is single and incomplete, but also overcomes the randomness of LPA algorithm, thus LPA-FA algorithm reduces the time overhead and improves the quality of clustering community.Finally, LPA-FA algorithm and LPA algorithm are tested on Zachary karate club network and scientist co-authorship network respectively, and we compare and analyse their performance. The results of test show that LPA-FA algorithm is more effective than LPA algorithm, which not only reduces the algorithm’s running time, but also improve the quality of clustering community.
Keywords/Search Tags:social network, label propagation, community clustering, label fitness
PDF Full Text Request
Related items