
Improving The Performance Of OLAP In Large-Scale Graph Processing System

Posted on: 2021-11-20 | Degree: Master | Type: Thesis
Country: China | Candidate: S Y Qian | Full Text: PDF
GTID: 2518306104999769 | Subject: Computer technology
Abstract/Summary:
In the era of big data, large-scale graph processing systems are important tools for efficiently managing and analyzing real-life graph data. However, current large-scale graph processing systems built on big data frameworks (e.g., graph databases) generally use CPU-based distributed architectures. The limited number of CPU cores and the communication overhead between cluster nodes make OLAP inefficient. Compared to CPUs, graphics processing units (GPUs) have a large number of computational units and adopt a single-instruction multiple-data architecture, which makes them well suited to improving the OLAP performance of graph processing systems. However, analyzing large-scale graph data on GPUs is challenging because of their small memory size.

To address these issues, the RockGraph system merges a graph database system with a GPU-based graph computing accelerator so that OLAP is accelerated on the GPU. In addition, to improve OLAP efficiency when the data does not fit in GPU memory, RockGraph designs and implements several OLAP optimization techniques, including subgraph extraction, graph partitioning, cyclic calculation, and dynamic scheduling. RockGraph uses the traditional big data system HDFS and the column database HBase as its storage layer, stores and queries graph data through the Gremlin language, and completes large-scale graph processing on the GPU. When processing an OLAP task on the GPU, RockGraph extracts subgraphs containing the core information from the graph database according to the user's input. After converting the data format, RockGraph calls dynamic link libraries through JNI, transfers the subgraphs to the GPU, and analyzes them with the GPU-based computing engine. When a subgraph exceeds the GPU memory space, RockGraph divides it into multiple partitions. Then, following the dynamic scheduling strategy, RockGraph transfers the partitions cyclically to the GPU to complete the OLAP task. Finally, the system writes the results back to the graph database.

Experiments show that RockGraph significantly improves OLAP performance and maintains high online analytical efficiency for large-scale graphs that do not fit in GPU memory. Compared to GraphX, RockGraph's performance on OLAP tasks is about five times higher. When processing large-scale graphs, the OLAP performance of RockGraph is about 3-5 times higher than that of GraphX and 1.5-1.8 times higher than that of the GPU-based graph processing system Totem.
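
The partitioning, cyclic transfer, and dynamic scheduling steps described above can be illustrated with a small CPU-only sketch. It is not RockGraph's implementation: the abstract does not specify the partitioning rule, the scheduling policy, or the analytics kernel, so the choices here (partitioning by source vertex under an edge budget, breadth-first search as the OLAP task, skipping partitions with no active vertices, and the names `partition_edges` and `bfs_out_of_core`) are all assumptions made for illustration. The host-to-GPU copy is simulated by a counter.

```python
from collections import defaultdict


def partition_edges(edges, max_edges_per_partition):
    """Group edges by source vertex into partitions within an edge budget.

    The budget stands in for GPU memory capacity (an assumption). A vertex
    whose degree alone exceeds the budget still gets its own partition.
    """
    by_src = defaultdict(list)
    for src, dst in edges:
        by_src[src].append(dst)

    partitions, current, size = [], {}, 0
    for src in sorted(by_src):
        degree = len(by_src[src])
        # Close the current partition if this vertex would overflow it.
        if current and size + degree > max_edges_per_partition:
            partitions.append(current)
            current, size = {}, 0
        current[src] = by_src[src]
        size += degree
    if current:
        partitions.append(current)
    return partitions


def bfs_out_of_core(edges, source, max_edges_per_partition):
    """BFS that cycles over partitions, one simulated transfer at a time."""
    partitions = partition_edges(edges, max_edges_per_partition)
    level = {source: 0}
    frontier = {source}
    transfers = 0  # counts simulated host-to-GPU partition copies
    depth = 0
    while frontier:
        next_frontier = set()
        for part in partitions:
            # Hypothetical scheduling rule: only load partitions that
            # contain at least one vertex of the current frontier.
            active = frontier.intersection(part)
            if not active:
                continue
            transfers += 1
            for src in active:
                for dst in part[src]:
                    if dst not in level:
                        level[dst] = depth + 1
                        next_frontier.add(dst)
        frontier = next_frontier
        depth += 1
    return level, transfers


if __name__ == "__main__":
    sample = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
    levels, copies = bfs_out_of_core(sample, source=0, max_edges_per_partition=2)
    print(levels, copies)
```

In this sketch the outer `while` loop plays the role of the cyclic calculation, and the `continue` branch is where a scheduling policy decides which partitions are worth transferring in a given round. A real system would replace the inner loops with a GPU kernel launch and would likely overlap transfers with computation.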
Keywords/Search Tags:graph processing system, large-scale graph, GPU, subgraph extraction