A Compile-time Optimization Method For Heterogeneous Computing Platforms Based On LLVM

Posted on:2014-09-07

Degree:Master

Type:Thesis

Country:China

Candidate:G Pei

Full Text:PDF

GTID:2268330401487029

Subject:Computer application technology

Abstract/Summary:

The Khronos Group, an industry standards consortium, released the OpenCL (Open Computing Language) 1.0 specification in late 2008, with the first implementations following in 2009. OpenCL is a framework for running parallel programs on heterogeneous platforms. Unlike vendor-specific models such as CUDA, OpenCL is standardized and cross-platform, and it is used to develop portable parallel programs for heterogeneous platforms. The demand for high-performance platforms drives the development of OpenCL. OpenCL addresses complex memory hierarchies, data-parallel execution, and some limitations of earlier programming models.

A CPU devotes much of its resources to large caches and complex control logic, which suits single-threaded programs. A GPU is better suited to high-throughput parallel programs, because it has a large number of cores with lower clock frequencies and smaller caches, but more registers and higher memory bandwidth. The floating-point throughput of a GPU is about ten times that of a CPU, and the same holds for video memory bandwidth over the same period. The result is reversed, however, when a single CPU core is compared with a single GPU core.

This thesis proposes a new compile-time optimization method for heterogeneous platforms, based on the OpenCL programming model. The method moves redundant computation forward, either within the device-side kernel or to the host side, according to the characteristics of each side, in order to reduce the number of memory reads and writes. Compared with traditional methods it is more flexible: it reduces the dependence of heterogeneous programs on the hardware and produces higher-performance code. The new framework mainly contains two modules, a runtime-check module and a compile-time optimization module.

The main work of this thesis is as follows. It extends the loop-invariant code motion algorithm so that data dependence analysis determines which load and store instructions can be moved out of a loop. It also designs a variable analyzer for OpenCL that computes, per dimension, how each instruction depends on the results of get_global_id and get_local_id. It shows that dependences on the loops over work-item dimensions need not be considered when computing dependences within a kernel. Consequently, only non-loop dependences need to be considered when moving load and store instructions out of a loop in the OpenCL model.

Finally, the thesis presents experiments. On the LLVM (Low Level Virtual Machine) platform, the method effectively improves the speed of OpenCL programs in most scenarios, and the improvement over previous results is substantial.
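The effect of moving loads and stores out of a loop can be illustrated with a small model. The sketch below is not the thesis's implementation and does not use LLVM or OpenCL: Python stands in for a kernel body, and the hypothetical names `CountingBuffer`, `kernel_naive`, `kernel_hoisted` and `run` are assumptions made only for illustration. It compares the number of memory accesses before and after a loop-invariant load is hoisted and a store is sunk, assuming each work-item touches only its own `out[gid]` element.

```python
# Toy model only: Python stands in for an OpenCL kernel, and CountingBuffer
# stands in for global memory. None of these names come from the thesis.


class CountingBuffer:
    """A list wrapper that counts element reads (loads) and writes (stores)."""

    def __init__(self, values):
        self.values = list(values)
        self.loads = 0
        self.stores = 0

    def __getitem__(self, index):
        self.loads += 1
        return self.values[index]

    def __setitem__(self, index, value):
        self.stores += 1
        self.values[index] = value


def kernel_naive(gid, a, b, out, n):
    """Kernel body for one work-item; every access goes to 'global memory'."""
    for k in range(n):
        # The addresses a[gid] and out[gid] do not change with k.
        out[gid] = out[gid] + a[gid] * b[k]


def kernel_hoisted(gid, a, b, out, n):
    """Same computation with the invariant load hoisted and the store sunk."""
    a_gid = a[gid]   # load moved in front of the loop
    acc = out[gid]   # running value kept in a local variable ("register")
    for k in range(n):
        acc = acc + a_gid * b[k]
    out[gid] = acc   # single store after the loop


def run(kernel, global_size, n):
    """Run the kernel once per work-item; return (result, loads, stores)."""
    a = CountingBuffer(range(1, global_size + 1))
    b = CountingBuffer(range(n))
    out = CountingBuffer([0] * global_size)
    for gid in range(global_size):  # gid plays the role of get_global_id(0)
        kernel(gid, a, b, out, n)
    loads = a.loads + b.loads + out.loads
    return out.values, loads, out.stores


naive = run(kernel_naive, global_size=4, n=8)
hoisted = run(kernel_hoisted, global_size=4, n=8)
print("naive   (result, loads, stores):", naive)
print("hoisted (result, loads, stores):", hoisted)
```

With 4 work-items and 8 loop iterations, both versions compute the same result, but the naive version performs 96 loads and 32 stores while the hoisted version performs 40 loads and 4 stores. A real compiler pass would have to prove that no other access in the loop aliases `a[gid]` or `out[gid]` before making this change, which is the role the abstract assigns to the dependence analysis.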

Keywords/Search Tags:

OpenCL, Compilation Optimization, LLVM, GPU

Related items

1	Research On LLVM JIT Compilation And Optimization Technology Based On Domestic Platform
2	Research On Parallel Compilation Technology Of Sunway Processor Based On LLVM
3	Research On Heterogeneous OpenCL Code Generation And Optimization Methods For Many-core Accelerators
4	Porting And Implementation Of OpenCL Based On FT-M7002
5	LLVM-based Energy Consumption Optimization Method For Embedded Software At Compile Time
6	Research On Heuristic Optimization Of OpenCL Program Based On Graph Network
7	Practical Formal Techniques and Tools for Developing LLVM's Peephole Optimization
8	Research On Compiler Optimization Based On OpenCL
9	The Improvement And Research Of DEM Based On OpenCL
10	Research And Implementation Of RISC-V And Post-compilation Technology