Font Size: a A A

The Design, Implementation And Optimization Of Data Path On Stream Accelerator For Graph Search

Posted on:2015-03-14Degree:MasterType:Thesis
Country:ChinaCandidate:J F LiFull Text:PDF
GTID:2348330509960897Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
In recent years, the rapid development of the information society led to the explosive growth of data storage, big data problems gradually attracted researchers' much attention. However, the big data problems exhibit different load characteristics with traditional scientific computing problems, making the general-purpose processor facing challenges. Graph theory problem is typical of big data problems, playing an important role on analyzing the relationship of data objects in big data sets, especially the BFS algorithm as the representative of graph search problem, which is widely used in social networks, artificial intelligence, etc. many other fields. Considering the stream processor high performance, high efficiency and other characteristics, exploring the solutions to the problems of graph search based on stream architecture has important significance. Therefore, this paper uses the specific stream accelerator as research platform for graph search problem, and study the design and optimization of data path on chip.For the stream accelerator, on the basis of deep analyzing the load characteristics of its execution of graph search algorithms, this paper focuses the attention on the design and optimization of its on-chip data path, mainly to expand research in the following four areas:Firstly, for the SRF data path in the existing stream processor, at the basis of analyzing its lack of fierce competition of multiple threads accessing SRF in solving the graph search problem, we proposed multi-bank & multi-controller SRF structure, increasing the degree of SRF parallelism, and implemented the interconnect between the banks of SRF and the cores of processors based on AXI protocol. The data path can deal with the requests of multi-threads accessing SRF balanced, and has better throughput bandwidth, which adapts to the cores' fine-grained accessing SRF effectively.Secondly, through deep analyzing the load characteristics of the stream accelerator's execution of graph search algorithms, for further optimizing the SRF data path we proposed the method of the core owning the SRF bank privately, which avoids the complex cross-connect between them, and enhances the system's performance of the stream accelerator.Thirdly, with the analysis the data path of accessing memory of the existing stream processors, we realized that it is not suitable for multi-threads' fine-grained accessing memory, and designed the data path of high concurrency and low-latency based on packet-type format of requests of. This method not only can deal with cores' request of accessing memory balanced, but also has better performance of bandwidth and latency, which adapts to the need of multi-threads processing graph search problem.Finally, for the shared-data's consistency in graph search,at first,we explored the method of adding lock to the SPM to protect its consistency, but found that it would lead to fiercely conflict of accessing the SPM. And then, we further researched the characteristics of multi-threads accessing memory at the time of executing graph search algorithms, and proposed the mechanism of transactional memory access in the data path, which protects the shared-data consistency by accessing the shared-memory atomically, and avoids conflicts of accessing SPM frequently. At the same time, we also designed the module to access memory effectively. This method effectively reduces the block time of multi-threads access to shared-data, improves the parallelism between threads, and increases the bandwidth utilization of memory, which improves the overall performance of the stream accelerator.
Keywords/Search Tags:Big data, Graph search, Stream accelerator, Data path
PDF Full Text Request
Related items