A Compile-time Optimization Method For Heterogeneous Computing Platforms Based On LLVM

Posted on: 2014-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: G Pei
Full Text: PDF
GTID: 2268330401487029
Subject: Computer application technology
Abstract/Summary:
The Khronos Group, an industry standards consortium, introduced OpenCL (Open Computing Language) in 2009. OpenCL is a framework for running parallel programs on heterogeneous platforms. Unlike vendor-specific models such as CUDA, OpenCL is standardized and cross-platform, and it is used to develop portable parallel programs for heterogeneous platforms. The demand for high-performance platforms drives the development of OpenCL. OpenCL addresses complex memory hierarchies, data-parallel execution, and some limitations of earlier programming models.

A CPU devotes much of its hardware to large caches and complex control logic, and it excels at single-threaded programs. A GPU is better suited to the throughput of parallel programs: it has a large number of cores with lower clock frequency and less cache, but more registers and higher memory bandwidth. The floating-point throughput of a GPU is about ten times that of a contemporary CPU, and the same holds for its video memory bandwidth. For a single core, however, the comparison is reversed.

This thesis proposes a new compile-time method for optimizing programs on heterogeneous platforms, based on the OpenCL programming model. The method moves redundant computation ahead into a new kernel on the device side or on the host side, according to the characteristics of each side, in order to reduce memory read and write requests. Compared with traditional methods, it is more flexible, reduces the dependence of heterogeneous programs on the hardware, and produces higher-performance code. The framework consists mainly of two modules: a runtime-check module and a compile-time optimization module.

The main work of this thesis is as follows. The thesis extends the loop-invariant code motion algorithm so that data dependence analysis determines which load and store instructions can be moved out of a loop.
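The abstract does not show code, so the following is only a minimal sketch in plain C of what moving a load and a store out of a loop looks like at the source level. The function names `accumulate_naive` and `accumulate_hoisted` are hypothetical, and the no-alias precondition is an assumption of this example, not a statement of the analysis the thesis performs.

```c
#include <assert.h>
#include <stddef.h>

/* Before: each iteration loads *scale, loads *sum, and stores *sum. */
void accumulate_naive(const float *in, size_t n, const float *scale, float *sum)
{
    for (size_t i = 0; i < n; ++i) {
        *sum += in[i] * *scale;
    }
}

/* After: the loads are hoisted above the loop and the store is sunk below it.
 * This is only valid when sum does not alias scale or any element of in,
 * which is the kind of fact a data dependence analysis has to establish.
 * Note that *sum is read and written even when n == 0. */
void accumulate_hoisted(const float *restrict in, size_t n,
                        const float *restrict scale, float *restrict sum)
{
    assert((const float *)sum != scale); /* assumed precondition: no aliasing */

    float s = *scale;   /* hoisted loop-invariant load */
    float acc = *sum;   /* hoisted load of the accumulator */
    for (size_t i = 0; i < n; ++i) {
        acc += in[i] * s;   /* loop body now works on registers only */
    }
    *sum = acc;         /* single store after the loop */
}
```

Both versions compute the same result when the precondition holds; the second issues one store and two scalar loads in total instead of repeating them on every iteration.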
The thesis also designs a variable analyzer for OpenCL that computes, for every instruction, the dimensions in which it depends on the functions get_global_id and get_local_id. It shows that the dependences of the dimension loop need not be considered when computing dependences within a kernel. Consequently, only non-loop dependences need to be considered when moving load and store instructions out of a loop in the OpenCL model.

Finally, the thesis presents experiments. On the LLVM (Low Level Virtual Machine) platform, the method effectively speeds up OpenCL programs in most scenarios and achieves a substantial improvement over previous results.
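One possible reading of the dimension-loop claim can be sketched in plain C by emulating a one-dimensional NDRange with an outer loop. All names here (`row_sum_work_item`, `run_row_sum`) are hypothetical, the parameter `gid` merely stands in for `get_global_id(0)`, and the code is a serial host-side emulation, not OpenCL and not the analyzer described in the thesis.

```c
#include <stddef.h>

/* Body of one work-item. The address out[gid] depends on the work-item
 * dimension only, not on the explicit loop variable k, so the load and
 * store of out[gid] can be placed outside the k-loop. */
static void row_sum_work_item(size_t gid, const float *in, float *out,
                              size_t width)
{
    float acc = out[gid];                 /* load placed above the k-loop */
    for (size_t k = 0; k < width; ++k) {
        acc += in[gid * width + k];
    }
    out[gid] = acc;                       /* store placed below the k-loop */
}

/* Serial emulation of the implicit dimension loop: one call per work-item.
 * Different gid values touch different out[] elements, so this outer loop
 * adds no dependence that would constrain the motion above. */
void run_row_sum(const float *in, float *out, size_t global_size, size_t width)
{
    for (size_t gid = 0; gid < global_size; ++gid) {
        row_sum_work_item(gid, in, out, width);
    }
}
```

In this sketch, deciding whether the accesses to `out[gid]` may leave the `k`-loop only requires reasoning about the kernel body for a fixed `gid`, which is the spirit of the claim as understood here.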
Keywords/Search Tags: OpenCL, Compilation Optimization, LLVM, GPU