Design And Implementation Of GPU Accelerated Distributed Graph Query System

Posted on: 2021-02-02
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Wang
Full Text: PDF
GTID: 2518306503973949
Subject: Software engineering
Abstract/Summary:
The Resource Description Framework (RDF) is a standard data model developed by the W3C to represent linked data on the World Wide Web. It describes linked data as a set of triples that form a highly connected graph, and users retrieve RDF data through the SPARQL query language. Online graph query is an important way to access linked data; its goal is to find the vertices in the graph that satisfy the query constraints.

Graph queries fall into two types: light queries and heavy queries. Light queries only need to traverse a small portion of the vertices and edges, while heavy queries usually need to traverse a large part of the graph, which leads to long execution times. Existing systems suffer a dramatic drop in overall performance under hybrid workloads that mix light and heavy queries. The reason is the huge difference in latency between the two types: heavy queries are resource-consuming, and their processing blocks the execution of light queries.

Fortunately, advanced hardware is becoming widespread. In the datacenter, servers are now equipped with GPUs and RDMA-capable NICs. A GPU provides more computing resources and much higher memory bandwidth than a CPU, and RDMA reduces the overhead of cross-machine communication. These hardware features bring new opportunities for designing high-performance graph query systems. This thesis proposes using the GPU to accelerate the processing of heavy queries, while the CPU is dedicated to handling light queries. Based on this idea, this thesis presents Wukong+G, the first graph-based distributed RDF query processing system that efficiently exploits the hybrid parallelism of CPU and GPU.

In summary, this thesis makes the following contributions:
1. By analyzing the characteristics and problems of existing graph query systems, together with the development trend of hardware, we propose the design idea of building a heterogeneous graph query system to handle heterogeneous queries.
2. To achieve efficient query execution on the GPU, we design and implement a GPU-friendly RDF store and RDF cache, and propose a series of optimization techniques for CPU-GPU data transfer, including query-aware prefetching, pattern-aware pipelining and fine-grained data loading.
3. We implement a prototype by extending a state-of-the-art distributed RDF store (i.e., Wukong) with GPU support. Evaluation on a 5-node CPU/GPU cluster (10 GPU cards) with an RDMA-capable network shows that Wukong+G outperforms Wukong by 2.3X-9.0X in single heavy query latency, and improves latency and throughput by more than one order of magnitude under hybrid workloads.
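The split between light and heavy queries can be illustrated with a minimal Python sketch. It is not taken from the thesis: the toy triples, the `classify` heuristic (a pattern that starts from a constant subject is light, one that starts from a variable is heavy) and the `route` function are all hypothetical, and the abstract does not say how Wukong+G actually classifies or dispatches queries.

```python
from collections import defaultdict

# Toy RDF data (hypothetical): each triple is (subject, predicate, object).
TRIPLES = [
    ("alice", "memberOf", "dept1"),
    ("bob", "memberOf", "dept1"),
    ("carol", "memberOf", "dept2"),
    ("alice", "advisor", "dave"),
    ("bob", "advisor", "dave"),
]

def build_index(triples):
    """Index triples by (subject, predicate) and by predicate alone."""
    by_sp = defaultdict(list)
    by_p = defaultdict(list)
    for s, p, o in triples:
        by_sp[(s, p)].append(o)
        by_p[p].append((s, o))
    return by_sp, by_p

def classify(pattern):
    """Assumed heuristic, not the thesis's method: a variable subject
    (prefixed with '?') forces a scan over every edge with that predicate,
    so the pattern counts as heavy; a constant subject counts as light."""
    subject = pattern[0]
    return "heavy" if subject.startswith("?") else "light"

def route(pattern):
    """Send heavy patterns to the GPU path and light ones to the CPU path."""
    return "gpu" if classify(pattern) == "heavy" else "cpu"

def run(pattern, by_sp, by_p):
    """Evaluate one triple pattern whose object is a variable."""
    s, p, _ = pattern
    if classify(pattern) == "light":
        # Light: touch only the neighbours of one known vertex.
        return [(s, obj) for obj in by_sp.get((s, p), [])]
    # Heavy: visit every edge that carries this predicate.
    return list(by_p.get(p, []))

by_sp, by_p = build_index(TRIPLES)
for pattern in [("alice", "advisor", "?x"), ("?s", "memberOf", "?d")]:
    print(pattern, route(pattern), run(pattern, by_sp, by_p))
```

In this sketch both paths run on the CPU; `route` only shows where a dispatcher would decide between the two kinds of processing unit.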
Keywords/Search Tags: Distributed Graph Query, GPU, RDMA, Heterogeneous System