Software Optimization Scheme On Control Flow Divergence For NVIDIA Maxwell GPGPU Applications

Posted on: 2018-06-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Ye
Full Text: PDF
GTID: 2428330545961138
Subject: Integrated circuit engineering
Abstract/Summary:
Owing to the increasing computing power and enhanced programmability of GPUs, a large number of general-purpose applications have been ported to the GPU platform in recent years. However, the performance of these applications widely suffers from control flow divergence because of the inherent SIMD architecture of the GPU. Control flow divergence reduces the utilization of the SIMD units in the GPU, leading to application performance degradation. Software optimization, that is, modifying the source code to re-group the GPU threads, is an effective way to mitigate this problem. Its implementation relies on three key pieces of information: the performance degradation brought by control flow divergence, the original control flow of the GPU threads, and the corresponding source code. Existing methods for obtaining this information are, respectively, difficult to implement, time-consuming, or unable to collect sufficient data.

Therefore, in this thesis the author proposes an innovative scheme for collecting the key information, together with a corresponding optimization procedure, based on binary instrumentation, control flow graph analysis and debug info parsing, with which control flow divergence can be quickly pinpointed and then optimized. The performance degradation brought by control flow divergence is first analyzed quantitatively and expressed as a combination of a series of runtime variables and the lengths of the branch paths. These two kinds of proxy information are obtained through binary instrumentation and control flow graph analysis. The original control flow of the threads is acquired by instrumentation as well: the warp ID and the execution times of the branch instruction are preserved to generate the pattern for thread re-grouping. As for the source code corresponding to a specific branch instruction, debug information is parsed to find the mapping between source code locations and machine instructions so that the target source code can be extracted. Finally, the author illustrates the detailed optimization procedure using the information obtained above.

The scheme is implemented on an NVIDIA GTX 960 GPU and evaluated against 24 benchmarks in Rodinia. The three key pieces of information are extracted and analyzed for these benchmarks, and Back Propagation is selected as the case study because of its significant control flow divergence. The output of the instrumentation is verified through static analysis of the program, and software optimization based on this information is applied to Back Propagation. The experimental results show that the optimization reduces divergent branches by 76.7% and increases IPC by 19.5% compared with the original version. This shows that the proposed scheme can obtain reliable key information rapidly and mitigate control flow divergence effectively to improve application performance.
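
To make the effect of thread re-grouping concrete, the following is a minimal Python sketch of a toy divergence model. It is not the thesis's instrumentation scheme or its cost formula. It assumes a warp of 32 threads, a single two-way branch, and that a divergent warp executes both paths one after the other with inactive lanes masked off. The function names (`divergent_warps`, `simd_utilization`, `regroup`) and the path lengths are hypothetical and chosen only for illustration.

```python
from typing import List

WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def divergent_warps(takes_branch: List[bool], warp_size: int = WARP_SIZE) -> int:
    """Count warps whose threads disagree on the branch outcome."""
    count = 0
    for start in range(0, len(takes_branch), warp_size):
        warp = takes_branch[start:start + warp_size]
        if any(warp) and not all(warp):
            count += 1
    return count


def simd_utilization(takes_branch: List[bool], len_taken: int,
                     len_not_taken: int, warp_size: int = WARP_SIZE) -> float:
    """Fraction of issued lane-slots that do useful work.

    Assumption: a warp issues every path that at least one of its
    threads needs, one path after the other (serialized execution).
    """
    useful = 0
    issued = 0
    for start in range(0, len(takes_branch), warp_size):
        warp = takes_branch[start:start + warp_size]
        n_taken = sum(warp)
        n_not_taken = len(warp) - n_taken
        useful += n_taken * len_taken + n_not_taken * len_not_taken
        instructions = (len_taken if n_taken else 0) + \
                       (len_not_taken if n_not_taken else 0)
        issued += instructions * warp_size
    return useful / issued if issued else 1.0


def regroup(takes_branch: List[bool]) -> List[int]:
    """Return thread indices ordered so that equal outcomes are adjacent."""
    return sorted(range(len(takes_branch)), key=lambda i: takes_branch[i])


# 64 threads whose branch outcome alternates: every warp diverges.
outcomes = [i % 2 == 0 for i in range(64)]
order = regroup(outcomes)
regrouped = [outcomes[i] for i in order]

print(divergent_warps(outcomes), simd_utilization(outcomes, 10, 10))
print(divergent_warps(regrouped), simd_utilization(regrouped, 10, 10))
```

In this toy case the alternating pattern makes both warps divergent, and sorting threads by outcome removes the divergence entirely. Real kernels rarely re-group this cleanly, since the re-mapping must also preserve the data each thread works on.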
Keywords/Search Tags: General Purpose GPU (GPGPU), Control flow divergence, Binary instrumentation, Control flow graph, Debug info