Graph is a data structure with strong description ability,it can represent the rela-tionship between objects in the real world,such as the road connection between cities and the blood relationship between people.Many problems can be represented by graphs,resolved by graph algorithm.Graph and graph algorithms have been widely used in social networks,genetic analysis and other fields,with the rapid growth of graph data size,how to efficiently process graph data has become a research hotspot.The breadth-first search(BFS)algorithm is a representative graph search algorithm,Graph500 benchmark uses it to evaluate the ability of a computer system to run big data applications,researching its optimization techniques can provide reference for the optimization of other graph algorithms.To research optimization techniques of BFS algorithm on many-core processors,this paper optimizes the breadth-first search algo-rithm targeting hardware features of the new generation Many-core processors(KNL).The main research work and achievements of this paper are as follows:1.A method for BFS algorithm optimization targeting the on-chip network of KNL is proposed.KNL differs from previous generations of products in that its processor cores are interconnected via a 2D on-chip network,which has many operating modes.This paper proposes a method to optimize the BFS algorithm targeting the network,which can improve the performance of BFS algorithm.2.Design and implement a fully optimized hybrid BFS algorithm targeting KNL.Based on the optimization method targeting the on-chip network,the hybrid BFS al-gorithm is further optimized from two aspects:a.Vectorization.The advanced vector instruction set extensions provided by the KNL can process 512 bits of data simulta-neously.In order to make full use of the parallelism of the BFS algorithm,we use the vector instruction to search adjacent list in parallel in the bottom-up search stage of the breadth-first search algorithm;b.Preprocess the input data to reduce the workload.Renumbering the vertices of graph to remove the isolated vertices,reduce size of the bitmap data structure and the workload of bitmap scanning;Sorting the adjacent list to reduce the number of check edge operation.Experimental results show that the proposed optimizing method targeting on-chip network can effectively improve the performance of BFS algorithm.The fully-optimized hybrid BFS algorithm targeting hardware features designed by this paper has excellent performance.It can be used as a module of other graph algorithms and has some prac-tical value. |