Research On Key Technologies Of Code Generation For CPU-GPU Heterogeneous Parallel Computing

Posted on:2018-02-23

Degree:Master

Type:Thesis

Country:China

Candidate:Y D Zhao

Full Text:PDF

GTID:2348330512489102

Subject:Computer software and theory

Abstract/Summary:


General-purpose computing on graphics processing units (GPGPU), which has recently become widely popular among researchers and engineers, provides a convenient platform for accelerating compute-bound applications. Programming frameworks for CPU-GPU heterogeneous parallel computing platforms, such as CUDA and OpenCL, expose relatively low-level interfaces that use the GPU in SIMT and SPMD fashion to solve problems that can be expressed as data-parallel computations. With these vendor-provided low-level languages, however, researchers and engineers must understand the hardware architecture, the memory hierarchy, and the execution model of CPU-GPU heterogeneous systems. They must also handle many details of synchronization and of optimizing the use of the various kinds of device memory. These burdens make programming CPU-GPU heterogeneous platforms for scientific computing difficult and error-prone.

Advances in machine learning, data mining, image processing, and other domains place high demands on the efficiency of large-scale scientific computation. Although Nvidia CUDA provides C-like interfaces for programming Nvidia GPUs, programming CPU-GPU heterogeneous parallel computing platforms remains challenging for many programmers. We therefore consider it necessary to devise a high-level programming model that is both efficient and easy to use. In this thesis, we introduce a tool chain for programming CPU-GPU heterogeneous parallel computing platforms:

1. We design a programming language, RoyaL, which supports large-scale linear algebra operations. This scripting language provides matrix types and related operations in its type system and supports strongly typed semantic checking.
2. We develop a compiler framework, Roya, which translates RoyaL programs into optimized Nvidia CUDA code. The Roya compiler framework implements key modules such as the abstract syntax tree, the symbol table, and the intermediate representation. On this basis, several optimization methods are applied to the input source code at different stages of compilation.

These tools hide the complexities of the hardware architecture, the memory hierarchy, and the execution model of CPU-GPU heterogeneous systems, and they handle the tedious and error-prone tasks automatically. Together, the RoyaL programming language and the Roya compiler framework provide a high-level programming model for users. They not only perform domain-specific optimizations involving matrix chain multiplication and addition, but also find parallelization patterns in nested loops and automatically extract them as kernel functions. On this basis, we conduct a series of comparative experiments to validate the optimization methods and present the performance gains in execution time under different scenarios. Finally, we summarize the drawbacks of the type system and of the block-parallel methods in the Roya compiler framework, and propose directions for future research.
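
To make the matrix chain multiplication optimization mentioned in the abstract more concrete, the following is a minimal sketch of the textbook dynamic-programming algorithm for choosing a multiplication order that minimizes scalar multiplications. The abstract does not say which algorithm Roya uses, so this is an illustrative assumption only; the function name `matrix_chain_order` and the `A0, A1, ...` labels are hypothetical.

```python
def matrix_chain_order(dims):
    """Pick the cheapest parenthesization for a chain of matrix products.

    dims has length n + 1; matrix i has shape dims[i] x dims[i + 1].
    Returns (minimum number of scalar multiplications, parenthesization).
    This is the classic textbook algorithm, not necessarily Roya's method.
    """
    n = len(dims) - 1
    if n < 1:
        raise ValueError("need at least one matrix")

    # cost[i][j]: cheapest cost of multiplying matrices i..j (inclusive).
    cost = [[0] * n for _ in range(n)]
    # split[i][j]: index k where the best split (i..k)(k+1..j) occurs.
    split = [[0] * n for _ in range(n)]

    for length in range(2, n + 1):          # length of the sub-chain
        for i in range(n - length + 1):
            j = i + length - 1
            best = None
            for k in range(i, j):
                candidate = (cost[i][k] + cost[k + 1][j]
                             + dims[i] * dims[k + 1] * dims[j + 1])
                if best is None or candidate < best:
                    best = candidate
                    split[i][j] = k
            cost[i][j] = best

    def render(i, j):
        # Rebuild the parenthesization from the recorded split points.
        if i == j:
            return f"A{i}"
        k = split[i][j]
        return f"({render(i, k)} * {render(k + 1, j)})"

    return cost[0][n - 1], render(0, n - 1)
```

For example, for three matrices of shapes 10x30, 30x5, and 5x60, `matrix_chain_order([10, 30, 5, 60])` selects `((A0 * A1) * A2)` at 4500 scalar multiplications, compared with 27000 for the other order. A compiler can apply this kind of reordering before emitting GPU kernels because matrix multiplication is associative.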

Keywords/Search Tags:

CPU-GPU Heterogeneous Platform, Compiler Framework, Optimization, Code Generation, CUDA


Related items

1	Heterogeneous Platform Research Based On Streamit Compiler
2	Optimizing Compiler Code Generation
3	Research And Implementation On Compiler Framework For Translating Ansic C Into CUDA C
4	Researches On Two Important Topics Of Certifying Compiler
5	Research On Automatic Code Generation And Optimization In Parallelizing Compiler
6	Parallel Compiler Code Generation And Communication Optimization
7	CUDA-CHiLL: A programming language interface for GPGPU optimizations and code generation
8	Research On Some Key Compiler Techniques For Embedded Processors
9	Research And Development Of The Cross Compiler Based On GCC For Embedded System
10	Compiler transformations for automatic generation of VHDL from C for code acceleration on reconfigurable devices