
Communication Optimization For Graph Processing System Based On Multi-GPUs

Posted on: 2018-06-21
Degree: Master
Type: Thesis
Country: China
Candidate: X Luo
GTID: 2428330569975187
Subject: Computer software and theory
Abstract/Summary:
Large-scale graph processing has become one of the core technologies of big data processing and is applied in many fields. GPUs offer strong parallel capability and high memory bandwidth, and graph processing must meet high-performance and real-time requirements, so researchers have paid increasing attention to large-scale graph processing on GPUs in recent years. However, the problem size that traditional GPU graph processing systems can handle is limited by GPU memory capacity. These systems are also built on the BSP (Bulk Synchronous Parallel) computing model, which requires computation and communication tasks to execute sequentially; this causes heavy thread synchronization overhead and wastes computing resources. Moreover, communication overhead between the GPU and the CPU restricts the performance of large-scale graph processing on multiple GPUs.

This thesis presents Purin, a multi-GPU graph processing system that aims to address the limited data size, heavy communication overhead, and poor GPU resource utilization of current systems. First, Purin uses a simple data representation, provides random and stream-based graph partitioning methods, and transfers the partitioned data to the GPU devices. Second, the system assigns computation tasks to the subgraph on each GPU device according to the characteristics of the partitioned data, and sets different priorities for different computation tasks. Third, the system manages the execution order of these tasks using the asynchronous streams provided by CUDA. The system also provides two programming models and several optimization strategies for implementing graph algorithms. Finally, Purin terminates execution on the GPUs and collects the results from all devices. Purin gives programmers effective APIs that hide the details of GPU programming.

Experiments show that Purin outperforms the state-of-the-art system CuSha by over 3.0x when running the BFS algorithm on one GPU, and shows significant performance improvement over the multi-GPU system Gunrock on most datasets when running on multiple GPUs. Purin also exhibits good scalability.
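
The abstract names random and stream-based partitioning but does not describe either method in detail. The following Python sketch illustrates the general idea only and is not Purin's implementation: the function names, the greedy "most already-placed neighbours" heuristic, and the equal-capacity limit per partition are all assumptions made for illustration. Each partition stands in for one GPU device.

```python
import random
from collections import defaultdict
from math import ceil


def random_partition(edges, num_parts, seed=0):
    """Assign every vertex to a partition uniformly at random (hypothetical helper)."""
    rng = random.Random(seed)
    owner = {}
    for u, v in edges:
        for x in (u, v):
            if x not in owner:
                owner[x] = rng.randrange(num_parts)
    return owner


def stream_partition(edges, num_parts):
    """Greedy streaming placement (an assumed heuristic, not taken from the thesis).

    Vertices are visited in the order they first appear in the edge list.
    Each vertex goes to the partition that already holds most of its
    neighbours, among partitions that still have spare capacity; ties go
    to the less loaded partition.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    capacity = ceil(len(adj) / num_parts)  # keep partitions balanced
    load = [0] * num_parts
    owner = {}
    for x in adj:  # dict order is first-appearance order, i.e. the stream
        scores = [0] * num_parts
        for n in adj[x]:
            if n in owner:
                scores[owner[n]] += 1
        candidates = [p for p in range(num_parts) if load[p] < capacity]
        best = max(candidates, key=lambda p: (scores[p], -load[p]))
        owner[x] = best
        load[best] += 1
    return owner


def cut_edges(edges, owner):
    """Count edges whose endpoints live on different partitions."""
    return sum(1 for u, v in edges if owner[u] != owner[v])
```

Comparing `cut_edges` for the two assignments on the same edge list gives a rough proxy for how much data would have to cross device boundaries, which is the communication cost the abstract is concerned with.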
Keywords/Search Tags: Graph Processing, GPU, Communication Overhead