Font Size: a A A

Research On Performance Optimization Of Disk-based Graph Processing Systems On A Single Server

Posted on:2018-01-16Degree:DoctorType:Dissertation
Country:ChinaCandidate:S H ChengFull Text:PDF
GTID:1368330566987973Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Storage-based graph processing system is economical and potentially scalable,thus become popular in the research community.However,these systems are facing problems such as poor locality,irregular structure and so on.As the graph scale increases rapidly,these problems become critical for disk-based graph systems on a single server.On one hand,many basic graph algorithms are hard to perform optimally over large scale graphs.On the other hand,general purpose graph processing systems on a single server have to solve problems like poor I/O efficiency and slow convergence.Therefore,this work starts from BFS,a typical graph traversal algorithm to pitch in the topic,to optimize its performance based on the characteristics on both real-world data and storage device.It covers aspects such as data I/O,graph partitioning and consistency model.The main contribution is as follows:·Fast breadth-first search over social graphs on a single server.By leverage of the preceding access pattern during the BFS iterations over a big graph,Fast BFS can accelerate the traversal.Fast BFS uses an edge-centric graph processing model to obtain the high bandwidth of sequential disk accesses without expensive data preprocessing.With a new asynchronous trimming mechanism,Fast BFS can efficiently reduce the size of a big graph by eliminating useless edges in parallel with the computation.Moreover,Fast BFS schedules I/O streams efficiently and can attain greater parallelism if an additional disk is available.Fast BFS can outperform X-stream and Graph Chi in the computing speed by up to 2.1 and 3.9 times faster respectively.·Fast breadth-first search over web graphs on a single server.Web graphs with large diameters are hard to be traversed efficiently.We propose Fast BFS+ to address the web graph traversal issue.With an asynchronous trimming mechanism which spans two computing iterations,Fast BFS+ leaves very large time window for the edge-trimming task to write the new edge files to disk.With a dynamic and asynchronous trimming mechanism,Fast BFS can turn on and off the trimming efficiently to reduce overheads.Experimental results show that,Fast BFS can be several times faster compared with X-stream and Graph Chi in the computing speed.·Asynchronous graph processing using 2-dimensional asymmetric partitioning on asingle server.Async Stripe adopts a 2-dimensional asymmetric graph partitioning scheme to enable optimized locality and fine selective-scheduling.Second,AsyncStripe utilizes an efficient stripe-based data access strategy,obtaining high disk bandwidth and smaller I/O amount.Third,Async Stripe executes graph algorithms asynchronously with two kinds of consistency models,therefore reducing unnecessary intermediate I/O and accelerating the convergence speed.Experimental results show that Async Stripe has better performance than state-of-the-art graph analysis systems.
Keywords/Search Tags:BFS, storage, I/O optimization, graph processing, asynchronous processing
PDF Full Text Request
Related items