CPU/GPU Asynchronous Computing Patterns On CUDA

Posted on: 2011-08-13
Degree: Master
Type: Thesis
Country: China
Candidate: P Yao
Full Text: PDF
GTID: 2178360308455368
Subject: Computer system architecture
Abstract/Summary:
CUDA (Compute Unified Device Architecture) has opened a new era of general-purpose computing on GPUs by giving developers a friendly environment for exploiting the GPU's computing power, but it also introduces a new requirement: efficient cooperative computation between the CPU and the GPU. This requirement has two parts: (1) the load across GPU threads must stay balanced; (2) the utilization of both the CPU and the GPU must stay high. This thesis analyzes the advantages and disadvantages of the CPU/GPU synchronous computing pattern on CUDA and proposes a CPU/GPU asynchronous computing pattern for applications that process large-scale parallel data with an uneven distribution of effective computation. Both patterns are implemented and compared using the bioinformatics application HMMER. This research will help in developing application-driven CPU/GPU cooperative computation methods on CUDA.

The major research contents and achievements are as follows. First, we design a generic data structure, consisting of a main data management structure and an auxiliary data management structure, to manage large-scale parallel data with an uneven distribution of effective computation. Data elements with the same property are dispatched together to GPU threads to keep the load balanced. Second, we propose a CPU/GPU asynchronous computing pattern to overcome the waste of CPU and GPU computing power that is common in the synchronous pattern. The asynchronous pattern combines the generic data structure above with a multithreaded code partitioning method that lets the CPU and GPU work fully in parallel, improving the utilization of both. Third, we implement HMMER, an important bioinformatics application, on CUDA using both the synchronous and asynchronous patterns and compare their performance. The results show the advantages of the asynchronous pattern over the synchronous one.
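The idea of grouping data elements with the same property before dispatch can be illustrated with a minimal Python sketch. It is not the thesis's data structure, whose layout the abstract does not describe. The sketch assumes that the "property" is an element's estimated cost (for example, a sequence length) and that elements are binned into intervals of that cost. The names `INTERVAL_BOUNDS`, `interval_index`, and `BucketedStore`, as well as the boundary values, are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict

# Hypothetical boundaries between cost intervals (e.g. sequence lengths).
INTERVAL_BOUNDS = [64, 128, 256, 512]


def interval_index(cost, bounds=INTERVAL_BOUNDS):
    """Return the index of the cost interval that `cost` falls into."""
    return bisect_right(bounds, cost)


class BucketedStore:
    """Groups elements of similar cost so each dispatched batch is balanced.

    `buckets` holds partially filled groups, one per cost interval.
    `ready` holds full batches waiting to be handed to the device.
    This two-part split is an assumption made for illustration only.
    """

    def __init__(self, batch_size):
        self.batch_size = batch_size
        self.buckets = defaultdict(list)
        self.ready = []

    def add(self, element, cost):
        """File `element` under its cost interval; emit a batch when full."""
        idx = interval_index(cost)
        bucket = self.buckets[idx]
        bucket.append(element)
        if len(bucket) == self.batch_size:
            self.ready.append((idx, bucket))
            self.buckets[idx] = []  # start a fresh group for this interval

    def flush(self):
        """Move any partially filled groups to `ready` at end of input."""
        for idx in sorted(self.buckets):
            if self.buckets[idx]:
                self.ready.append((idx, self.buckets[idx]))
        self.buckets.clear()


if __name__ == "__main__":
    store = BucketedStore(batch_size=2)
    for seq in ["A" * 10, "C" * 300, "G" * 20, "T" * 290, "A" * 600]:
        store.add(seq, cost=len(seq))
    store.flush()
    for idx, batch in store.ready:
        print(idx, [len(s) for s in batch])
```

Because every batch contains only elements from one cost interval, the threads that process a batch would receive roughly equal amounts of work. How wide the intervals should be is a tuning choice that this sketch does not address.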
The thesis also discusses how the design of the effective computation intervals, the communication method among CPU threads, the rates of data production and consumption, and the data migration method affect performance.
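The interaction between data production and consumption can be sketched as a two-thread pipeline connected by a bounded queue. This is a structural illustration in Python, not the thesis's implementation. It assumes one CPU-side producer thread that prepares batches and one consumer thread that stands in for the code launching GPU work. The function `fake_kernel` is a hypothetical placeholder for a real kernel launch, and `run_pipeline`, `batch_size`, and `queue_depth` are names invented for this example.

```python
import queue
import threading

SENTINEL = None  # marks the end of the batch stream


def producer(items, batch_size, out_q):
    """CPU side: pack items into batches and hand them to the consumer."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            out_q.put(batch)  # blocks while the queue is full (back-pressure)
            batch = []
    if batch:
        out_q.put(batch)  # final, possibly short, batch
    out_q.put(SENTINEL)


def fake_kernel(batch):
    """Hypothetical stand-in for a GPU kernel: sum of squares of the batch."""
    return sum(x * x for x in batch)


def consumer(in_q, results):
    """Device side: take batches as they arrive and process each one."""
    while True:
        batch = in_q.get()
        if batch is SENTINEL:
            break
        results.append(fake_kernel(batch))


def run_pipeline(items, batch_size=4, queue_depth=2):
    """Run producer and consumer concurrently; return per-batch results."""
    q = queue.Queue(maxsize=queue_depth)
    results = []
    t_prod = threading.Thread(target=producer, args=(items, batch_size, q))
    t_cons = threading.Thread(target=consumer, args=(q, results))
    t_prod.start()
    t_cons.start()
    t_prod.join()
    t_cons.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(range(10), batch_size=4))
```

The bounded queue makes the production and consumption rates visible: if the producer is faster, it stalls on `put`; if the consumer is faster, it waits on `get`. In either case one side idles, which is the kind of imbalance the discussion above is concerned with. Python threads are used here only to show the structure; they do not run CPU-bound code in parallel the way a native CPU thread and a GPU would.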
Keywords/Search Tags: CUDA asynchronous computing pattern, load balance, HMMER