Font Size: a A A

Design And Implementation On The Optimization Technique Of NEU-BSP In Vertical Adjacent List-Based Graph Processing System

Posted on:2014-08-29Degree:MasterType:Thesis
Country:ChinaCandidate:Q X WanFull Text:PDF
GTID:2308330473953917Subject:Computer technology
Abstract/Summary:PDF Full Text Request
The recent rapid development of Internet technology makes the Web network, social networking and other large-scale graph analysis become a research focus, such as social network shortest path, PageRank and other web search. Cloud Computing provides a powerful solution for parallel computing. And the most representative is MapReduce, a kind of distributed parallel programming model. However, most of graph algorithms that execute in iterated style are not capable for MapReduce. BSP model become a better solution as Google proposed a large-scale graph procession system, Pregle. In our implementation of BSP model, which called NEU-BSP, we proposed a partitioning algorithm to ensure a balanced number of vertices among sub-tasks. However, this algorithm cannot effectively reduce communication volume. Therefore, it is a challenging task to partition a large graph so that minimize communication and ensure work balance.In order to solve the above problem, we introduce a new data structure to store graph data in distributed environment and develop NEU-BSP+system based on the original NEU-BSP system. The main contributions of this thesis are as follows. Firstly, we take a number of basement optimization techniques that include paralleling the tasks scheduler and improving the method of object serializing and so on. We get 1x speedup in NEU-BSP+ system. Secondly, based on the above data structure, we present a data placement algorithm in distributed cluster. We provide a Vertex-Edge computation model and the interfaces to adapt to the data structure’s change. Thirdly, applying the techniques of data placement method and computation model presented in this article into NEU-BSP system, we give the implementation of PageRank and SSSP algorithms.We take a series of experiment to evaluate the various aspects of performance of NEU-BSP+. Firstly, the experiment results shows that data loading time has increased a little, but the amount the communication among tasks has greatly reduced, making the application of the iterative calculation of the average over time declines 25%-30% than original NEU-BSP. Secondly, we make the performance comparison between NEU-BSP+ and the existing systems such as Hadoop, Giraph, Hama in multiple sets of data. The results turn out that the NEU-BSP+ can run up to 3-7,2-3 and 1.4 faster than Hadoop, Giraph and Hama. Meanwhile NEU-BSP+ processes more data size in memory than Giraph and Hama.
Keywords/Search Tags:BSP, graph processing, vertical partitioning adjacent list, optimization, distributed computing
PDF Full Text Request
Related items