| Massive data from social networks contains a wealth of information.Among various methods to mine such information,graph theory is an attractive tool.With the increase of the volume of the graph data,distributed computation of graphs becomes an effective means to deal with large-scale graph data.In distributed graph computation,the time consumed in communications contributes significantly to the overall computation time.A well-designed graph partitioning algorithm can effectively reduce the communications as well as achieve load balancing,thus improving the efficiency of distributed graph computation.Typical examples include the Metis graph partitioning algorithm.However,existing graph partitioning algorithms to deal with social network graphs which involve non-equilibrium graph data will result in imbalance between subgraph communications,thus affecting the computational efficiency.To solve this problem,this paper took a further reserch int the characteristics of social network data,and established a load balance partitioning model which achieved load balance and communication balance based on the traditional partioning model.Then,we design and implement Communication Balanced Lable Exchanging Algrithm based on the communication load balance partitioning model.In order to further reduce the local impact of greedy algorithm,we use the simulated annealing algorithm to optimize the CBLE algorithm.And then we analys is the result which generated by the Hash method,Metis algorithm and the CBLE algorithm respectively in the volume of communication,vertex size and communication balace.To give a more powerful proof that the algorithm improve the effic iency of distributed graph computing framework in dealing with social network data,CBLE algorithm is used in the Hama,a Apache open source graph computing framework.Results show that the communication balance can improve the effic iency of graph computation when us ing social network data.As well as cluster resources utilization improves,the algorithm results on the improvement of computational efficiency is much more obvious. |