
Method And Implementation Of Control Flow In GPGPU Based On Intel Gen

Posted on: 2016-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhang
Full Text: PDF
GTID: 2298330467994079
Subject: Computer software and theory
Abstract/Summary:
As CPU development has moved past the era of raising clock frequency, the performance of single-core CPUs is improving more and more slowly, and manufacturers have turned to multi-core CPUs. However, current CPUs usually contain only a few cores because of cost, heat dissipation, architectural complexity and other factors, so the parallelism available on a CPU is limited. Under these circumstances, interest in GPUs is growing, since GPUs are massively parallel by nature. With the popularity of PCs and smartphones, GPUs have become cheap and common computing devices, which has driven the development of GPGPU technology.

To make GPUs more general-purpose, their execution units are usually programmable cores. Each core executes SIMD instructions and processes multiple input data elements in one computing cycle, and a GPU contains tens or even hundreds of such cores, which gives GPUs their massive parallelism. Several frameworks, such as OpenCL, simplify GPGPU programming. OpenCL provides a compiler that translates source code written in a C99-based language into binaries that run on the corresponding devices, together with APIs that control the execution of those binaries. Since the execution units in GPUs are usually SIMD processors, mapping the control flow of a program onto SIMD execution correctly and efficiently has become a focus of OpenCL compilers. In this thesis, we discuss the problems that arise in this mapping on the Intel Gen architecture.

We first introduce concepts such as SIMD, the basic architecture of GPUs, GPGPU, OpenCL and Gen, and describe control flow in the SIMD setting in detail, including the distinction between structured and unstructured control flow and the sources of unstructured control flow. We then present a method that maps the control flow of an OpenCL kernel to Intel Gen GPUs. The method linearizes the control flow graph by inserting extra instructions into the original instruction stream, and it handles both structured and unstructured control flow. We also evaluate the method.

After describing the method, we present two ways of optimizing it. One reduces the number of instructions actually executed, while the other reduces the number of instructions generated by the compiler. The first inserts instructions that skip a whole basic block when the block has no possibility of being executed. The second treats structured and unstructured control flow separately and handles structured control flow with native control flow instructions. Finally, we test the correctness and performance of all the methods described using several test suites, and the results show that the methods are correct, robust and adaptable. This thesis also summarizes current practice in control flow analysis and discusses the future of control flow handling in OpenCL compilers.
Keywords/Search Tags: SIMD, Control Flow, OpenCL Compiler, Structural Analysis
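
The abstract does not give the details of the linearization method, so the following is only a toy model of the general idea, not the thesis's implementation. It assumes a hypothetical scheme in which basic blocks are laid out in a fixed order, each SIMD lane records the id of the block it wants to run next, and a block executes only for the lanes targeting it. A block that no lane targets is skipped whole, which loosely corresponds to the first optimization described above. The kernel, the block functions and the names `run_simd`, `BLOCKS` and `EXIT` are all made up for illustration, and the single forward pass only covers loop-free control flow.

```python
"""Toy model of divergent control flow on a SIMD machine (illustrative only)."""

EXIT = 99  # hypothetical sentinel: the lane has finished the kernel


def block0(lane):
    # Entry block: branch on the parity of the input.
    return 1 if lane["x"] % 2 == 0 else 2


def block1(lane):
    # "then" side of the branch.
    lane["y"] = lane["x"] * 2
    return 3


def block2(lane):
    # "else" side of the branch.
    lane["y"] = lane["x"] + 1
    return 3


def block3(lane):
    # Join block, reached by every lane.
    lane["y"] += 10
    return EXIT


BLOCKS = [block0, block1, block2, block3]  # fixed linear layout


def run_simd(inputs):
    """Run the toy kernel over all lanes; return (outputs, issued block ids)."""
    lanes = [{"x": x, "y": 0} for x in inputs]
    target = [0] * len(lanes)  # per-lane id of the next block to run
    issued = []                # blocks that were actually executed
    for block_id, block in enumerate(BLOCKS):
        mask = [t == block_id for t in target]
        if not any(mask):
            continue           # no lane targets this block: skip it whole
        issued.append(block_id)
        for i, active in enumerate(mask):
            if active:         # inactive lanes are masked off
                target[i] = block(lanes[i])
    return [lane["y"] for lane in lanes], issued
```

With inputs `[1, 2, 3, 4]` the lanes diverge, so all four blocks are issued and the outputs are `[12, 14, 14, 18]`. With inputs `[2, 4]` every lane takes the "then" side, so block 2 is skipped and only blocks 0, 1 and 3 are issued. Supporting loops would additionally require jumping backward when some lane targets an earlier block, which this sketch leaves out.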