Algorithms in MapReduce for Large-Scale Social Network Mining

Posted on: 2015-02-04
Degree: Master
Type: Thesis
Country: China
Candidate: L H Qian
GTID: 2298330452464073
Subject: Information and Communication Engineering
Abstract/Summary:
With the rise of Web 2.0, online social networks have attracted both domestic and international researchers. These networks share many distinctive topological characteristics, such as power-law degree distributions, very short path lengths, and tightly connected users, which directly or indirectly influence information diffusion and user interactions within the networks. Studying the underlying network structures is of great significance for gaining insight into the architecture and evolution of human society. Currently, the most popular online social networks in the world have hundreds of millions of users and billions of connections. Traditional analytical tools (such as relational databases) and traditional algorithms (designed for a single CPU) are no longer sufficient.

On the structure of online social networks, this thesis explores the topological characteristics comprehensively, including degree distribution, reciprocity of connections, clustering, degree correlation, path length, and community detection. Our research is mainly based on Sina-Weibo and Twitter, and also draws on previous work for comparison and contrast. The Sina-Weibo dataset used in this thesis was obtained by our distributed crawler over a 3-month crawl and consists of 135 million users and 10.4 billion connections.

On mining large-scale online social networks, this thesis proposes several graph mining algorithms in MapReduce. The core of the proposed algorithms is our quasi-parallel breadth-first search algorithm, which outperforms Pegasus, a state-of-the-art graph mining library, in several dimensions. We give theoretical performance analyses of the proposed approaches, as well as experimental results based on the topological characteristics of Sina-Weibo.
Keywords/Search Tags: online social network, MapReduce, breadth-first search, Sina-Weibo
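
For readers unfamiliar with how breadth-first search is usually expressed in MapReduce, the following is a minimal single-machine sketch of the textbook level-synchronous formulation. It is not the quasi-parallel algorithm proposed in the thesis, whose details the abstract does not give, and it is not Pegasus. The function names (`bfs_map`, `bfs_reduce`, `run_round`, `bfs`) and the in-memory shuffle are assumptions made purely for illustration.

```python
from collections import defaultdict
from math import inf


def bfs_map(node, record):
    """Map step: pass the node's own data through and offer dist+1 to neighbours."""
    dist, neighbours = record
    yield node, ("graph", neighbours)   # keep the adjacency list for the next round
    yield node, ("dist", dist)          # keep the node's current best distance
    if dist != inf:                     # only reached nodes can extend the frontier
        for n in neighbours:
            yield n, ("dist", dist + 1)


def bfs_reduce(node, values):
    """Reduce step: keep the adjacency list and the smallest distance seen."""
    neighbours = []
    best = inf
    for kind, payload in values:
        if kind == "graph":
            neighbours = payload
        else:
            best = min(best, payload)
    return node, (best, neighbours)


def run_round(state):
    """One simulated MapReduce job: map, group by key (shuffle), reduce."""
    grouped = defaultdict(list)
    for node, record in state.items():
        for key, value in bfs_map(node, record):
            grouped[key].append(value)
    return dict(bfs_reduce(k, v) for k, v in grouped.items())


def bfs(adjacency, source):
    """Repeat rounds until no distance changes; unreachable nodes stay at inf."""
    state = {n: (0 if n == source else inf, list(nbrs))
             for n, nbrs in adjacency.items()}
    while True:
        new_state = run_round(state)
        if new_state == state:
            return {n: d for n, (d, _) in new_state.items()}
        state = new_state


if __name__ == "__main__":
    # Hypothetical toy "follows" graph, not data from the thesis.
    toy = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": [], "e": ["a"]}
    print(bfs(toy, "a"))
```

In this formulation every round is a full pass over the graph, so the number of MapReduce jobs grows with the number of BFS levels. The very short path lengths noted in the abstract suggest that few rounds are needed on social graphs, although each round still reads every edge.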