Research On Clustering Algorithms For Large-Scale Social Networks Based On Structural Similarity

Posted on: 2014-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: J J Chen
GTID: 2298330467479757
Subject: Computer software and theory
Abstract/Summary:
A social network is the complex network structure formed by the relationships between individuals in a social system. With the rapid development of Internet technology in the information age, social networks, and online social networks in particular, have become an integral part of how people share information. As paths of information dissemination, the relationships between individuals in social networks play an essential role in many applications, such as advertising delivery, discovery of potential business opportunities, influence prediction, and early warning of crises. How to obtain valuable information from these huge networks has therefore become an important research topic, and network structure analysis has attracted the attention of many researchers. Network clustering, a form of structural analysis, is an effective means to this end.

However, network clustering algorithms still face major challenges. First, existing network clustering algorithms do not fully consider the characteristics of real social networks. The structure of a social network differs from that of a general network: some individuals usually play special roles, and most relationships are directed. Second, most recent network clustering studies do not treat large-scale data processing as a goal. To address these problems, this thesis proposes a set of network clustering algorithms for large-scale directed social networks.

First, clustering algorithms for directed networks based on structural similarity are proposed, following two different approaches. The first is a two-stage solution: find an undirected approximation of the directed network, then cluster this approximation with a structural clustering method based on structural similarity. The second extends an existing clustering method for undirected networks so that it can handle directed networks directly.

Second, to handle large-scale networks, a parallel network clustering method is proposed. Based on the characteristics of social networks, the algorithm uses a data partitioning strategy and a strategy for exchanging data between machines. The thesis proves that the results of the parallel structural clustering algorithms are consistent with the results of the original algorithms.

Finally, the proposed parallel network clustering algorithms are implemented on the MapReduce parallel architecture. Extensive experimental results show that the proposed algorithms improve the accuracy of network clustering and that the proposed parallel method handles large-scale network clustering problems effectively.

In summary, this thesis advances the parallel clustering of directed social networks. The algorithms have good application prospects in discovering structural information in social networks.
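
The two-stage idea described above can be illustrated with a minimal sketch. It rests on assumptions not stated in this abstract: the undirected approximation is taken to be a naive symmetrization that simply ignores edge direction, and structural similarity is taken to be the cosine-style measure over closed neighborhoods commonly used in structural clustering, |N(u) ∩ N(v)| / sqrt(|N(u)| · |N(v)|). The function names `symmetrize` and `structural_similarity` are hypothetical, and the thesis's own approximation and similarity definitions may differ.

```python
from math import sqrt

def symmetrize(edges):
    """Stage 1 (assumed): drop edge direction and build closed
    neighborhoods, i.e. each node's neighbor set includes itself."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, {u}).add(v)
        nbrs.setdefault(v, {v}).add(u)
    return nbrs

def structural_similarity(nbrs, u, v):
    """Stage 2 building block (assumed): shared closed neighbors,
    normalized by the geometric mean of the neighborhood sizes."""
    common = len(nbrs[u] & nbrs[v])
    return common / sqrt(len(nbrs[u]) * len(nbrs[v]))

# Toy directed network: a triangle a -> b -> c -> a plus a pendant edge c -> d.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
nbrs = symmetrize(edges)
sim_ab = structural_similarity(nbrs, "a", "b")  # nodes inside the triangle
sim_cd = structural_similarity(nbrs, "c", "d")  # triangle node vs. pendant node
```

In this toy network the pair inside the triangle scores higher than the pair joined only by the pendant edge, which is the kind of signal a structural clustering method would threshold to decide which neighbors belong to the same cluster.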
Keywords/Search Tags: Directed networks, Parallel algorithm, Network clustering, Structural clustering, MapReduce