Font Size: a A A

Large Scale Dynamic Adaptive Graph Partition Algorithm

Posted on:2016-10-26Degree:MasterType:Thesis
Country:ChinaCandidate:J F XuFull Text:PDF
GTID:2308330476452145Subject:Computer technology
Abstract/Summary:PDF Full Text Request
In recent years, with the rapid popularization of Internet, the number of Internet users has increased explosively, which generates a lot of large scale graphs which are difficult to be dealed with on stand-alone computer. The treatment of the large scale graph has become a development trend of network research. However, ordinary computers cannot resolve the graph because of the limitation of memory. It’s a great challenge for the giant graphs. The best way to solve the problem is under distributed computing environment, which has developed a lot of large scale graph processing platforms. These platforms need to divide the large scale graph into some subgraphs, and put them into each computer node respectively, which ensure the nodes perform tasks parallelly. In order to improve the efficiency of distributed computing system, not only the subgraphs should be balance divided, but also the edge numbers between computer nodes had better reduce enough to decrease the communication costs. However, existed algorithms under current large scale graph processing platform always use hash algorithm to partition the graph, which produces a lot of communication overhead and affects the efficiency of large scale graph processing platforms. In fact, the large scale graph partition has become the hotspot in the field of distributed parallel computing. In this paper, large scale graph partition algorithms are analyzed in detail as follows:Firstly, we study the communication overhead of distributed graph processing system. In recent years, most of the large scale graph partition algorithms usually reduce the communication overhead by moving vertices. We combine the idea of simulated annealing algorithm with traditional vertex transformation strategies, which further reduces the communication overhead and enhances the efficiency of distributed graph processing system.Then, we study the load balancing of distributed graph processing system. In recent years, traditional research on loading balance is always controlling the number of vertices of partitions, and then forcing to balance the load of the partition without considering the internal structure of the large scale graph. In this paper, we combine the workload of partitions with graph topology, and then balance the workload of partition by monitoring the running time of each superstep. Also we analyze the real reason why the running time of the superstep too long. At last we propose two solutions to fit for the change of running time of each superstep dynamically.Finally, the two proposed algorithms are achieved and the experiments shows that they are effective and feasible.
Keywords/Search Tags:Large Scale Graph Partition, Distributed Computing, Communication Overhead, Load Balance
PDF Full Text Request
Related items