Font Size: a A A

Research On Key Technologies Of Graph Algorithm Optimization For Autonomous Processors

Posted on:2022-04-29Degree:MasterType:Thesis
Country:ChinaCandidate:X L LaiFull Text:PDF
GTID:2518306731453574Subject:Computer technology
Abstract/Summary:PDF Full Text Request
With the development of information technology and the advent of the era of big data,the generation of data information is very rapid.As an effective means of processing big data,graphs have been unanimously recognized by academia and industry.Graph-based computing technology can effectively extract key information from massive amounts of data,but at the same time,the demand for computing power has also increased sharply.At present,there have been a lot of research work based on the X86 CPU and GPU architecture to accelerate the graph algorithm,and have achieved good results,while the work of graph calculation optimization based on the ARM architecture is relatively small.ARM processors have been widely used in high-performance computing and terminal computing equipment.In 2020,the supercomputer Fugaku developed by Japan reached the top and won the TOP500 champion twice,which will greatly promote the use and development of application of computing systems based on the ARM architecture.Therefore,it is right and necessary to study graph algorithm optimization technology based on ARM architecture.This article mainly conducts the research of graph computing optimization based on ARM architecture from two aspects.First of all,this article optimizes the Kruskal and Prim algorithms based on the ARM architecture and proposes a Graph500 BFS validation algorithm based on the minimum spanning tree.In order to speed up the verification process,corresponding optimizations have been made at the data structure and algorithm level.The CSR format is used to store sparse graphs,which accelerates data access while reducing storage space.The function takes CSR sparse graphs and BFS spanning trees as input to generate a set of edges(source vertices,target vertices,weights),and further judgments are made based on the set of edges with corresponding logic to verify the correctness of the BFS spanning tree.Before and after optimization,the Kruskal,PRIM and the original verification algorithm are tested and compared on the FT-2000 + multi-core high-performance processor single machine platform and cluster system.The results show that the Kruskal algorithm can accelerate the BFS validation process in Graph500,and to a certain extent can ensure the reliable and stable operation of the program.Secondly,this article analyzes and optimizes the performance of Graph Mat using the vertex programming model under the ARM architecture.This paper analyzes the characteristics of the Graph Mat framework calculation mode and memory access method,and performs detailed statistics and analysis on the proportion of execution time of each part.It is found that the performance bottleneck of Graph Mat on FT-2000+is mainly the low efficiency of multi-core utilization and remote data access.In response to this discovery,this paper proposes and implements a virtual multi-node-based multi-level parallel optimization strategy and a core correlation-based data localization development optimization strategy.The experimental results show that the performance of the optimized program is greatly improved,and the effect is better in the time-consuming SPMV stage.In addition,the performance and scalability tests of the optimization methods proposed in this paper are performed on four graph algorithms.Graph Mat has excellent performance on the FT-2000+/64-core processor platform and the scalability can reach 41.3 times the speedup ratio.
Keywords/Search Tags:FT-2000+, NUMA, Kruskal algorithm, Prim algorithm, GraphMat
PDF Full Text Request
Related items