Font Size: a A A

Graph Analysis On Extreme-scale Heterogeneous Architecture

Posted on:2018-06-12Degree:DoctorType:Dissertation
Country:ChinaCandidate:H LinFull Text:PDF
GTID:1368330596452848Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Graph is a powerful data structure to present data in many areas,such as social network,biology information,webpage ranking and transport system.Graph processing system is designed to process graph data efficiently.However,the size of graphs grows too fast to be handled by present graph processing engines.For example,the edge number of Chinese web graph has exceeded ten trillion at the early 2017.Graphs that are extremely large pose a great challenge to graph processing engines,in terms of both hardware and software.We study extreme-scale graph processing problems in this thesis.The hardware platform chosen is Sunway TaihuLight,which ranks the first in latest Top500[2]list(Jun.2017).A fundamental graph algorithm,i.e.,Breadth First Search,would serve as the starting point of this thesis,then generalizing the implementation to other graph algorithms to form a graph processing framework.1.Extreme-scale Breadth First Search.Targeting the SW26010 processor of Sunway TaihuLight,we propose a contention-free data shuffling method that avoids atomic operation in accelerator cores to efficiently use memory bandwidth;a pipelined module mapping method is proposed to assign kernel modules to accelerator cores,and reserve tedious scheduling and communication work to general-purpose cores;and a group-based message batching method that aggregates small messages in a two-stage way that can reduce the number of messages by two orders of magnitude at node scale of 40,000.2.Extreme-scale graph framework.Compare to Breadth First Search,most other graph applications have much more inter-node messaging and atomic updates.Our graph framework,named ShenTu,incorporates differential message passing method to better handle load balance,in order to reduce the amount of message.That method carefully split the edges into three parts,according to their degree of source and destination vertices,and chooses the most suitable processing technique for each type of edges.To overcame the atomic update challenge,ShenTu proposes a push-based vertex state update method.According to Graph500[3]benchmark specification,the performance result of our Breadth First Search achieve 23755.7 GTEPS,sitting at the top of the list for heterogeneous architectures and the second overall in latest rank list(Jun.2017).ShenTu had been tested on both real world and synthetic graph——one iteration of PageRank is only 21 seconds on a Sogou Chinese web graph with 12 trillion edges.
Keywords/Search Tags:Heterogeneous Architecture, Super Computer, Graph Computing, High Performance Computing
PDF Full Text Request
Related items