Font Size: a A A

Research And Implementation Of Self-adaptive Label Propagation Algorithm

Posted on:2017-01-26Degree:MasterType:Thesis
Country:ChinaCandidate:L Y GengFull Text:PDF
GTID:2348330503465913Subject:Engineering
Abstract/Summary:PDF Full Text Request
In recent years, with the rapid development of the Internet, Big Data have been produced, and data mining technology were spawned when processing the Big Data. The community detection is an important part of data mining technology, and has become a hot topic for many scholars. Community detection algorithm can found potential communities in a variety of networks, thus allowing further analysis of the different communities. But the huge amount of data brought by diversity and complexity of the community has also brought inconvenient for community detection. Thus community detection algorithms which can quickly convergence become more and more important. Label propagation algorithm has received much attention for its low time complexity. On the foundation of label propagation algorithm, someone proposed the attenuation label propagation algorithm, and also someone proposed the SLPA(Speaker-Listener label propagation algorithm), but SLPA can easily produce monster communities, and the attenuation label propagation algorithm can lead to over-segmentation of the community.Based on the SLPA and attenuation label propagation algorithm, this paper proposed a label propagation algorithm called self-adaptive distance-control label propagation algorithm which combines the advantages of the two algorithms. The main characteristics of this algorithm are:(1) For each label, we define its propagation distance. Each time a label is propagated, its distance reduced by one, until the propagation distance is zero, in that case the label can no longer propagate to its neighbors, thus preventing the unlimited expansion of the label, and eventually reduce the possibility monster communities;(2) Users can set the number of iterations on user's own will. Increase the number of iterations can stabilize the community detection result, but will also increase the probability of monster communities emerging. Reduce the number of iterations will assure that small communities will not be swallowed, but will increase the randomness of the algorithm. So the algorithm can customize the number of iterations, and find balance between randomness and monsters community.(3) The Algorithm can self-adaptively re-divide the monster community. Users can define the minimum number of nodes in the monster communities, when detected that the node number in communities is greater than the minimum amount of node in monster community, the algorithm will split the monster community founded, until the number of nodes in all communities are smaller than the minimum number of nodes in monsters community, thus preventing the generation of monster community.Through the analysis of the experimental results on different networks, we find that compared with original label propagation algorithm, speaker-listener label propagation algorithm and attenuation label propagation algorithm, the communities found by self-adaptive distance-control label propagation algorithm is more evenly distributed and has a smaller chance to produce monster communities on the premise that the algorithm does not affect the time complexity and quality of the result.In the meantime, this paper makes a friend recommendation system based on self-adaptive distance-control label propagation algorithm, which illustrate the usage of with self-adaptive distance-control label propagation algorithm in real world application with concrete example.
Keywords/Search Tags:data mining, label propagation algorithm, monster community, self-adaptive distance-control, friend recommendation system
PDF Full Text Request
Related items