Research On Key Technologies Of GPU Architecture Optimization For Graph Computing

Posted on:2019-06-15

Degree:Doctor

Type:Dissertation

Country:China

Candidate:H Guo

Full Text:PDF

GTID:1360330611493110

Subject:Computer Science and Technology

Abstract/Summary:

Large-scale graph processing is a critical component of many data analysis applications. As one of the most basic abstract data structures, graphs represent the relationships between objects and serve as the data representation in applications such as web page ranking, social networks, tracking the effects of drugs on cells, genetics, and the spread of infectious diseases. GPUs have become a mainstream massively parallel computing platform and can achieve higher performance at lower energy cost than multiprocessors. Many graph computing programming models therefore use GPUs to accelerate graph algorithms. To address the load imbalance and control divergence that arise when graph algorithms are accelerated on GPUs, we analyse the instruction execution and memory access behaviour of graph computing algorithms on the GPU. From the GPU architecture point of view, we propose hardware support that addresses load imbalance, low data cache utilization, and high data access latency, and thereby improves the performance of graph computing on GPUs. Our work is summarized as follows:

1. We propose a dynamic multi-grain cache management mechanism for GPUs. Accelerating graph algorithms on a GPU raises several problems, including memory divergence, fine-grain data access, and low utilization of the on-chip data cache, which prevent the GPU from delivering its full computing power. The proposed mechanism targets the mismatch between the small size of the requested data and coarse-grain cache management. We design and implement the hardware cache management unit on a simulator. The mechanism addresses the low utilization of the L1 data cache and improves the throughput of applications with irregular memory accesses on GPUs. The experimental results show that, compared with the current GPU L1 data cache and with other fine-grain cache management mechanisms, the proposed mechanism improves both the space utilization of the L1 data cache and application performance.

2. We propose a data-structure-aware prefetching technique. Data prefetching techniques work well for fixed data access patterns, but no effective prefetching technique has existed for irregular data accesses. We analyse the data structure access pattern of breadth-first search, propose data-structure-aware prefetching, and design and implement a hardware prefetching unit and its programming interface on a simulator. The technique uses explicit information about how the graph data structure is accessed to improve prefetching accuracy, reduce the memory bandwidth overhead of prefetching, and reduce data access latency. The experimental results show that, compared with existing data prefetching techniques, the proposed technique improves prefetching accuracy for irregular data accesses, reduces memory access latency, and substantially improves GPU performance.

3. We propose a novel, highly efficient GPU architecture for graph computing. Although many GPU-based graph computing programming models have been proposed, none of them solves the load balancing problem of graph computing thoroughly. We analyse the software graph computing programming models, propose the new GPU architecture for graph computing, and design and implement the hardware architecture and its programming model on a simulator. The architecture eliminates the overhead of load balancing pre-computation and achieves efficient load balancing both within a GPU core and across GPU cores. The experimental results show that, compared with software-implemented graph computing programming models, the proposed GPU architecture reduces load balancing overhead and substantially improves GPU throughput.
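
As a rough illustration of two problems the abstract names (load imbalance within a warp, and low cache line utilization under fine-grain accesses), the following Python sketch estimates both for a single breadth-first search frontier on a graph stored in compressed sparse row (CSR) form. It is not the dissertation's mechanism or simulator. The warp width of 32, the 128-byte cache line, the 4-byte element size, the vertex-per-thread mapping, and all function names are assumptions made for illustration only.

```python
WARP_SIZE = 32     # assumption: warp width typical of NVIDIA GPUs
LINE_BYTES = 128   # assumption: L1 data cache line size
ELEM_BYTES = 4     # assumption: one 4-byte value stored per vertex


def warp_lane_utilization(row_offsets, frontier, warp_size=WARP_SIZE):
    """Fraction of lane-iterations that do useful work when each thread
    expands one frontier vertex and every warp runs for as many
    iterations as its highest-degree vertex needs."""
    useful = 0
    issued = 0
    for start in range(0, len(frontier), warp_size):
        warp = frontier[start:start + warp_size]
        degrees = [row_offsets[v + 1] - row_offsets[v] for v in warp]
        useful += sum(degrees)
        # Idle lanes (short lists or a partial warp) still occupy slots.
        issued += warp_size * max(degrees)
    return useful / issued if issued else 1.0


def line_utilization(row_offsets, col_indices, frontier):
    """Fraction of fetched cache-line bytes that are actually used when
    one per-vertex value is read for every neighbour of the frontier.
    Optimistic: assumes each line is fetched only once."""
    elems_per_line = LINE_BYTES // ELEM_BYTES
    touched = set()   # distinct per-vertex values read
    lines = set()     # distinct cache lines those values fall in
    for v in frontier:
        for e in range(row_offsets[v], row_offsets[v + 1]):
            u = col_indices[e]
            touched.add(u)
            lines.add(u // elems_per_line)
    if not lines:
        return 1.0
    return (len(touched) * ELEM_BYTES) / (len(lines) * LINE_BYTES)


# Hypothetical 256-vertex graph: vertex 0 has 7 scattered neighbours,
# vertex 1 has a single neighbour, and all other vertices have none.
NUM_VERTICES = 256
col_indices = [32, 64, 96, 128, 160, 192, 224, 33]
row_offsets = [0, 7] + [8] * (NUM_VERTICES - 1)
frontier = [0, 1]

if __name__ == "__main__":
    print("lane utilization:", warp_lane_utilization(row_offsets, frontier))
    print("line utilization:", line_utilization(row_offsets, col_indices, frontier))
```

In this toy frontier one high-degree vertex keeps the whole warp busy while the other lanes idle, and each fetched line contributes only one or two useful values. Those are the conditions the proposed load balancing and multi-grain cache management are meant to address.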

Keywords/Search Tags:

GPU, Large scale graph computing, Data cache management, Data prefetching, Load balance

Related items

1	Energy And Performance Management In Large Data Centers: A Queuing Theory Perspective
2	Distributed Balanced Storage Management Of Massive Spatial-temporal Data Based On Graph Division
3	Segmentation And Computing Platform Of Large-scale Graph
4	Research And Implementation On Distributed Partition Algorithm Based On Heuristic Of Large Graph Data
5	Design And Implementation Of The Streaming Graph Engine On Imbalance Cluster
6	Research On Large-scale Non-negative Matrix Factorization With Graph Regularization
7	The Research To Massive Terrain Data Processing Method Based On Cloud Computing
8	Research And Optimization For Data Management Technology Of MARS
9	Research And Implemention Of K Optimal Path Planning Algorithm Based On GIS For Large Scaledata
10	Research On Campus Electronic Map Based On An Improved Cache Replacement Algorithm