
Community Identification Of Social Network Based On The GraphLab Cloud Computing Platform

Posted on: 2016-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Wang
GTID: 2308330476452168
Subject: Computer application technology
Abstract/Summary:
With the rise and popularity of social networks, more and more researchers have turned their attention to studying them. In general, a social network consists of connected nodes and exhibits community structure: groups with a high density of edges inside them and a lower density of edges between them. Community structure plays a very important role in the study of social networks, so it quickly became a hot research topic, and many representative community detection algorithms have appeared. However, as social networks expand and the number of nodes grows, the user bases of many social networking sites, such as Tencent and Sina Microblog, have reached a billion. Most classic community detection algorithms are stand-alone iterative algorithms that can only be applied to small social networks, so they cannot effectively identify the community structure of such large-scale networks. To solve this problem, this thesis proposes community detection algorithms capable of parallel computing, based on the GraphLab cloud computing platform.

The thesis first introduces the relevant theory of social networks and community detection, such as the graph representation of static and dynamic networks, the basic description of community structure, and the measures used to evaluate community detection algorithms.

It then outlines parallel computing frameworks for processing such data, including the MapReduce model under the Hadoop framework and the Pregel framework based on the BSP model, and gives a detailed description of the GraphLab framework, a parallel cloud computing platform based on the Gather-Apply-Scatter computing model that can process massive graph data effectively.

Finally, under the GraphLab parallel computing model, the thesis proposes in detail community detection algorithms for massive static and dynamic social networks respectively: DOCVN (Detecting the Overlapping Community algorithm based on Vital Node Expanding in GraphLab) and PDCI (Parallel Dynamic Community Identification), which is based on the IC algorithm. DOCVN selects vital nodes by the PageRank values of nodes and then expands these vital nodes according to the affiliation degree of other nodes to them, thereby recognizing communities in large-scale static networks. PDCI improves on the parallel IC algorithm, using the incremental related vertex set defined by the IC algorithm and the evaluation function for incremental community recognition. It first performs the pre-processing computation of the incremental related vertices on the parallel framework Spark and then carries out incremental community recognition in parallel on the GraphLab platform. Experiments show that both algorithms are effective at identifying the overlapping community structure of massive static and dynamic social networks, providing a new perspective and new methods for community detection in massive social networks.
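
The vital-node idea described in the abstract can be illustrated with a small single-machine sketch. This is not the thesis's DOCVN implementation and does not use the GraphLab API. It only shows the general pattern under stated assumptions: PageRank is written as gather and apply steps, the "affiliation degree" is assumed to be the fraction of a node's neighbours already in a community, and the function names, the `threshold` parameter, and the rule of skipping seeds adjacent to an already chosen seed are all hypothetical.

```python
def pagerank_gas(adj, damping=0.85, iters=50):
    """PageRank on an undirected graph, phrased as gather/apply steps.

    adj maps each node to the set of its neighbours.
    """
    n = len(adj)
    rank = {v: 1.0 / n for v in adj}
    for _ in range(iters):
        new_rank = {}
        for v in adj:
            # Gather: collect each neighbour's rank share.
            gathered = sum(rank[u] / len(adj[u]) for u in adj[v])
            # Apply: update the vertex value from the gathered sum.
            new_rank[v] = (1.0 - damping) / n + damping * gathered
        # Scatter is implicit here: every vertex is recomputed each round.
        rank = new_rank
    return rank


def pick_vital_nodes(adj, rank, k):
    """Pick up to k high-rank nodes (assumed rule: no two seeds adjacent)."""
    chosen = []
    for v in sorted(adj, key=lambda x: (-rank[x], x)):
        if all(v not in adj[c] for c in chosen):
            chosen.append(v)
        if len(chosen) == k:
            break
    return chosen


def expand_communities(adj, seeds, threshold=0.5):
    """Grow one community per seed; a node may join several communities."""
    communities = {s: {s} for s in seeds}
    changed = True
    while changed:
        changed = False
        for members in communities.values():
            for v in adj:
                if v in members or not adj[v]:
                    continue
                # Assumed affiliation degree: share of v's neighbours
                # that already belong to this community.
                affiliation = len(adj[v] & members) / len(adj[v])
                if affiliation >= threshold:
                    members.add(v)
                    changed = True
    return communities


# Toy graph: two triangles (0-1-2 and 4-5-6) bridged by node 3.
example_graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}

if __name__ == "__main__":
    ranks = pagerank_gas(example_graph)
    seeds = pick_vital_nodes(example_graph, ranks, k=2)
    print(expand_communities(example_graph, seeds))
```

On the toy graph, nodes 2 and 4 are selected as seeds, and the bridge node 3 ends up in both communities, which is the overlapping behaviour the abstract refers to. A distributed version would run the gather and apply steps per vertex in parallel rather than in a Python loop.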
Keywords/Search Tags: GraphLab, social network, overlapping community identification, dynamic incrementation, large-scale graph