Research On Key Technologies For Pipeline Programs On Heterogeneous Systems

Posted on:2020-10-14

Degree:Doctor

Type:Dissertation

Country:China

Candidate:Z Zheng

GTID:1368330626964470

Subject:Computer Science and Technology

Abstract/Summary:

Heterogeneous systems with GPUs as accelerators are important platforms for high-performance programs. Current GPU programming models provide excellent support for data-parallel programs but lack good support for pipeline programming. The pipeline programming model can simplify programming, make better use of the computational advantages of multi-device platforms, and expose parallelism in more dimensions, so it is used in many fields. On complex architectures, exploiting the parallelism between different task phases of a pipeline program (i.e., task parallelism) lets the program make better use of the hardware's computing power. Because the architecture of GPU heterogeneous systems is complex, task parallelism should help pipeline programs fully exploit hardware performance. However, current GPU programming models lack good support for pipeline parallelism, which makes it difficult to develop high-performance pipeline programs. First, they lack good support for the task-parallel mode: existing task-parallel methods suffer from low GPU computing resource utilization and reduced program parallelism. Second, there is little research on cross-device pipeline optimization on CPU-GPU systems, and current programming models have difficulty supporting pipelined data transmission across devices. To address these problems, this paper studies optimization techniques for pipeline programs on heterogeneous systems from three aspects: computing, communication, and programming. The main research contents and contributions of this paper are as follows:

(1) This paper systematically analyzes and summarizes the performance bottlenecks of existing pipeline execution models on GPUs. Building on the existing time-dimension task scheduling models, it proposes the concept of space-dimension pipeline task scheduling, along with two new pipeline task execution models. The new scheduling models effectively support task parallelism and improve program parallelism by reducing the GPU threads' occupation of register and shared memory resources, ultimately helping the program make fuller use of the computing resources of the CPU-GPU system. Experiments show that the new scheduling models bring up to a 6.9 times (2.88 times on average) performance improvement for pipeline programs.

(2) This paper proposes a method for efficient pipelined data transmission across devices. The method implements a cross-device queue data structure that supports fine-grained, small-batch, dynamic, asynchronous data transmission between multiple devices. On this basis, this paper solves the problem of multi-threaded access contention on shared data structures on the GPU, optimizes the performance of cross-device data transmission for Stencil pipelines, and proposes an efficient mechanism for checking the pipeline termination condition. Experiments show that this method brings a 1.22 times to 2.13 times performance improvement for cross-device pipeline programs.

(3) This paper summarizes eight key metrics of pipeline programs and proposes a hybrid model that can combine the advantages of all pipeline execution models according to the load characteristics of the pipeline program. To address the problem that the hybrid model's parameter configuration space is too large, this paper proposes a performance tuning method that combines dynamic and static approaches. This paper also implements a pipeline programming framework for CPU-GPU heterogeneous systems, including a programming interface, performance tuning tools, and a runtime library, which helps developers write high-performance pipeline programs efficiently. Experiments show that developers only need to modify dozens of lines of code to port existing pipeline programs to the framework, and that the performance tuning tools can successfully find the optimal or a near-optimal parameter configuration.
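
The queue-based, small-batch transmission described in contribution (2) can be illustrated with a minimal host-side sketch. It is not the dissertation's implementation: two CPU threads stand in for the two devices, Python's standard `queue.Queue` stands in for the cross-device queue, and the names `SENTINEL`, `producer`, `consumer`, and `run_pipeline` are hypothetical. The sentinel value is one simple way to signal pipeline termination and is an assumption, not the termination-checking mechanism proposed in the work.

```python
import queue
import threading

# Hypothetical end-of-stream marker; an assumption for this sketch only.
SENTINEL = None


def producer(out_q, items, batch_size):
    """Upstream stage: push small batches downstream as soon as they are ready."""
    for start in range(0, len(items), batch_size):
        # put() blocks when the bounded queue is full, giving back-pressure.
        out_q.put(items[start:start + batch_size])
    out_q.put(SENTINEL)  # tell the downstream stage that no more data will arrive


def consumer(in_q, results):
    """Downstream stage: process each batch while the producer keeps running."""
    while True:
        batch = in_q.get()
        if batch is SENTINEL:
            break  # termination condition reached
        results.extend(x * x for x in batch)  # placeholder per-item work


def run_pipeline(items, batch_size=4, capacity=2):
    """Run the two stages concurrently, connected by a bounded queue."""
    q = queue.Queue(maxsize=capacity)
    results = []
    stages = [
        threading.Thread(target=producer, args=(q, items, batch_size)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in stages:
        t.start()
    for t in stages:
        t.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(list(range(10)), batch_size=3))
```

Because batches are handed over as they are produced, the downstream stage can start work before the upstream stage finishes, which is the overlap that pipelined transmission aims for. A real cross-device queue would additionally have to handle device memory, asynchronous copies, and many concurrent GPU threads, none of which this sketch models.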

Keywords/Search Tags:

pipeline program, pipelined communication, performance optimization, GPU

Related items

1	Study And Design Of 12-bit Pipelined SAR ADC
2	High-speed And Low-power Research On Pipelined-SAR ADC
3	Research And Design Of High Performance CMOS Pipeline ADC
4	Pipelined multithreading transformations and support mechanisms
5	Game Analysis, Debugging And Performance Optimization Based On Graphics Pipeline
6	Research On Key Circuits Of High Performance Pipelined ADC
7	An Optimization Design Of Pipeline ADC For DSP
8	Research Of Novel Resolution Improvement Technique For High-speed Pipelined ADC
9	Based On The 0.18 Mu Mcmos Process 12-100 - Bit Installed Base Design Of Pipeline Adc
10	External Broadband Greatly Dither Technology Research In Pipeline Adc