Research On Performance Optimization Of Heterogeneous Platform Based On CPU-GPU And Multicore Parallel Programming Model

Posted on:2012-06-19

Degree:Master

Type:Thesis

Country:China

Candidate:B Chen

GTID:2178330338492039

Subject:Computer application technology

Abstract/Summary:

With the computing power and programmability of the graphics processing unit (GPU) increasing continuously, general-purpose computing on GPUs (GPGPU) is gradually becoming a research hotspot. GPGPU computing usually runs in a heterogeneous mode that combines a CPU and a GPU. Although a CPU-GPU heterogeneous system can achieve good performance gains, program development and performance optimization for it are more complex than for a homogeneous system. Computing on a CPU-GPU heterogeneous system encounters many performance bottlenecks, such as load balancing, synchronization and delay, data locality, and task division. Handling these factors well is essential to improving program performance. On the other hand, although the CUDA programming model has greatly reduced the difficulty of programming CPU-GPU heterogeneous systems, the development requirements are still high for most developers of serial programs. Moreover, when the underlying hardware changes, software developers have to learn a new programming model and rewrite their programs for the new hardware platform, which increases the burden on the programmer. It is therefore very significant to design a simple and platform-independent multicore parallel programming model.

We mainly did the following research:

(1) We analyzed the key factors that affect the performance of CUDA programs, comprehensively summarized the existing optimization methods, and proposed our own optimization methods and optimization strategy, such as using atomic functions to achieve synchronization between different thread blocks. For each optimization method, we carried out experiments and theoretical analysis to verify its effectiveness. Our method of using atomic functions to synchronize different thread blocks is 4 to 5 times faster than the existing method of restarting the kernel function.

(2) To further validate the effectiveness of the various optimization methods, and to describe the development process (algorithm design, programming, performance optimization) on a CPU-GPU heterogeneous platform, we used the CPU-GPU heterogeneous computing platform to solve the bioinformatics problem of DNA or protein sequence alignment. Specifically, we designed and implemented a new column-based parallel Smith-Waterman algorithm on the CUDA platform. The optimized parallel program is 37 times faster than the serial program.

(3) After analyzing the OpenMM parallel programming framework in depth, we proposed a library-based and hardware-independent multicore parallel programming model. To verify the feasibility and simplicity of the model, we implemented and tested a prototype system for scientific computing. Through a rationally designed hierarchy of APIs, it shields upper-level users from the details of the underlying hardware. To parallelize a serial program, programmers only need to select the appropriate dynamic-link library for the underlying hardware at compile time.
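
For readers unfamiliar with the sequence alignment problem named in item (2), the following is a minimal serial sketch of Smith-Waterman local alignment scoring. It is not the thesis's CUDA implementation: the function name `smith_waterman`, the linear gap penalty, and the scoring values (`match=2`, `mismatch=-1`, `gap=-1`) are assumptions chosen for illustration. The sketch fills the score matrix one column at a time and keeps only the previous column in memory, which shows the data dependencies that any parallel version has to respect.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between sequences a and b.

    Scoring values and the linear gap model are illustrative assumptions,
    not parameters taken from the thesis.
    """
    rows = len(a) + 1
    # prev holds column j-1 of the score matrix; curr holds column j.
    prev = [0] * rows
    best = 0
    for j in range(1, len(b) + 1):
        curr = [0] * rows
        for i in range(1, rows):
            # Substitution score for aligning a[i-1] with b[j-1].
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[i] = max(
                0,                # start a new local alignment
                prev[i - 1] + s,  # diagonal: align the two characters
                prev[i] + gap,    # left neighbour: gap in sequence a
                curr[i - 1] + gap # upper neighbour: gap in sequence b
            )
            best = max(best, curr[i])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman("GGACGTCC", "TTACGTAA"))
```

Each cell depends on its left, upper, and diagonal neighbours, so the cells of one column cannot all be computed independently in this plain form. How the thesis restructures the column computation for the GPU is not described in the abstract.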

Keywords/Search Tags:

GPU, GPGPU, CUDA, Heterogeneous Computing Platform, Performance Optimization, Parallel Programming Model

Related items

1	Research On Optimized Programming For Heterogeneous Multi-core Platform
2	Study On Multi-thread Parallel Programming Method Based On Multi-core Environment
3	Broadband And Narrowband Radar Signal Processing Based On GPGPU
4	The Design Of Parallel Programming Simplification Based On CUDA
5	Design And Implementation Of High Performance Computing Platform Based On SLURM Scheduling And Heterogeneous Programming
6	Parallel Implementation Of K-means Algorithm And Performance Optimization
7	Implementation Of Radar Signal Processing Algorithms Based On GPGPU
8	Research And Implementation Of Transplant CUDA Program Based On Android
9	The Implement And Performance Analysis Of Parallel Computing Platform
10	Research On Programming Models And Optimizations For Petascale CPU-GPU Heterogeneous Computing Systems