
Research On Key Technologies Of Accelerating Computation Of Large-scale Graph Neural Network

Posted on: 2022-05-17    Degree: Master    Type: Thesis
Country: China    Candidate: L Z Zhang    Full Text: PDF
GTID: 2558307169980109    Subject: Computer Science and Technology
Abstract/Summary:
Graph neural networks (GNNs) can effectively capture the complex dependencies among vertices in a graph and have become a powerful tool for processing graph-structured data, with successful applications in social networks, knowledge graphs, recommendation systems and chemical reactions. As the scale of graph data grows, the computation of large-scale graph neural networks has become a focal and difficult problem in current research, and it still faces severe performance challenges in mini-batch sampling, data loading and embedding computation. Because graphs are complex and irregular, the randomized data access of mini-batch sampling leads to poor locality of the sampled data and an explosion of neighborhood expansion. At the same time, vertices in a graph have complex dependencies: one vertex is often connected to multiple target vertices, so different mini-batches repeatedly select the same vertices. This redundant vertex access pattern makes data loading from CPU to GPU inefficient. Moreover, the redundant access pattern, together with the GNN inference algorithm, causes redundant vertex embeddings to be computed on the GPU. Therefore, this paper studies mini-batch sampling, data loading and embedding computation in both the training and inference stages of graph neural networks.

First, to address the poor data locality and the neighborhood-expansion explosion of mini-batch sampling in the training stage, this paper proposes a locality-aware mini-batch sampling method. By sampling over clustered vertices, it both improves the locality of vertex accesses and limits the range of neighborhood expansion, significantly reducing sampling time. To address the excessive data loading latency in GNN training and the low efficiency of traditional data caching methods, this paper further proposes a GNN layer-aware caching method, which caches all vertices in the l-hop neighborhood of designated target vertices, greatly reducing the number of vertex features that must be cached and achieving better caching efficiency and memory utilization. Test results show that the proposed method can complete training on very large-scale datasets; compared with DGL, it reduces sampling time by 85.5% and data loading time by 90.4% on average, with an overall speedup of up to 5x.
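To make the locality-aware sampling idea concrete, the following is a minimal, illustrative Python sketch rather than the thesis implementation: it assumes an adjacency-list graph and a precomputed vertex clustering (for example from METIS), and the names group_by_cluster, locality_aware_minibatches, cluster_of, adj and fanout are hypothetical. It groups target vertices by cluster and restricts neighbor sampling to the same cluster, which is the mechanism that improves locality and bounds neighborhood expansion.

    import random
    from collections import defaultdict

    def group_by_cluster(targets, cluster_of):
        # Group the training target vertices by the cluster they belong to.
        groups = defaultdict(list)
        for v in targets:
            groups[cluster_of[v]].append(v)
        return groups

    def locality_aware_minibatches(targets, cluster_of, adj, batch_size, fanout):
        # Yield (batch, sampled_neighbors) pairs. Targets in a batch come from
        # one cluster, and neighbor sampling is restricted to that cluster, so
        # vertex accesses stay local and neighborhood expansion stays bounded.
        for cid, verts in group_by_cluster(targets, cluster_of).items():
            random.shuffle(verts)
            for i in range(0, len(verts), batch_size):
                batch = verts[i:i + batch_size]
                sampled = {}
                for v in batch:
                    local = [u for u in adj[v] if cluster_of[u] == cid]
                    sampled[v] = random.sample(local, min(fanout, len(local)))
                yield batch, sampled

In this sketch, the features of each cluster's l-hop neighborhood could then be cached on the GPU, in the spirit of the layer-aware caching method described above.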
Second, to address redundant vertex-embedding computation and redundant data loading in large-scale GNN inference, this paper proposes an inference method that adapts to the graph structure and optimizes it with a feature-partition caching method. The adaptive inference method selects the optimal inference algorithm according to the computation pattern of each inference task, minimizing the amount of vertex-embedding computation. In addition, this paper proposes a feature-partition caching strategy that partitions the target vertices in advance and caches the corresponding feature data on the GPU; once the target vertices of one partition have been processed, the cache is replaced with the feature data of the next partition. This strategy greatly reduces the number of cached vertex features, and the vertices in each mini-batch can be matched against the cached features in GPU memory, enabling efficient data loading. Test results show that, compared with DGL, this method reduces vertex-embedding computation time by 99% and data loading time by 99%.

Finally, to address the large communication volume and poor scalability of distributed large-scale GNN computation, this paper applies the above training and inference methods to distributed computation based on self-contained partitions. An existing clustering-based partitioning method is used to partition the graph and reduce the data dependence among partitions. For each partition, the neighborhood of its target vertices within the subgraph is extended so that it contains all the neighbor information required during sampling, avoiding cross-machine data transfer and feature gathering. The training and inference methods proposed in this paper are then applied in the distributed environment, further accelerating GNN computation and improving the scalability of distributed computing. Experimental results show that the proposed method is 8.7 times faster than DGL and improves scalability by 21%.
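As an illustration of the self-contained partitioning used in the distributed setting, the following minimal Python sketch is not the thesis code: the function name expand_partition and its arguments are hypothetical, and adj is assumed to be an adjacency list. It computes, for one partition, the set of vertices whose features must be replicated locally so that sampling a num_layers-hop neighborhood never requires cross-machine data transfer.

    def expand_partition(adj, part_targets, num_layers):
        # Return the self-contained vertex set for one graph partition: its own
        # target vertices plus every vertex reachable within num_layers hops.
        # Replicating the features of this halo locally lets each machine sample
        # and gather features without cross-machine communication.
        halo = set(part_targets)
        frontier = set(part_targets)
        for _ in range(num_layers):
            nxt = set()
            for v in frontier:
                nxt.update(adj[v])
            nxt -= halo
            halo |= nxt
            frontier = nxt
        return halo

In practice, each machine would hold this expanded subgraph together with the corresponding vertex features, and then run the sampling and caching methods described above entirely locally.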
Keywords/Search Tags: Graph neural network, training, inference, pipeline parallel, data parallel, sampling, data caching, embedding computation