
Program Feature Analysis And Optimization On CPU&GPU Heterogeneous Many-core Integrated Processors

Posted on: 2017-08-12    Degree: Doctor    Type: Dissertation
Country: China    Candidate: Q Zhu    Full Text: PDF
GTID: 1368330569998410    Subject: Computer Science and Technology
Abstract/Summary:
Recent years have witnessed a processor development trend of integrating the CPU and GPU onto a single chip. The integration eliminates some of the host-device data copying that a discrete GPU usually requires, but it also introduces deep resource sharing and possible interference between the CPU and GPU. This work investigates the performance implications of independently co-running CPU and GPU programs on such platforms and studies a number of key theoretical and technical issues. The main contributions and innovations are summarized as follows.

1. Understanding co-run performance and power on CPU-GPU integrated processors. On CPU-GPU integrated processors, integration creates deeper resource sharing: the CPU and GPU share much of the memory system, including the last-level cache (LLC) space, memory capacity, and memory bus bandwidth. Existing research focuses on proposing shared-memory management designs evaluated on simulators rather than offering a systematic study of co-run performance on actual systems, so a comprehensive understanding of co-run contention on CPU-GPU integrated processors is still lacking. Through comprehensive measurement and analysis, this work reveals the important effects of context switches, power management, and process scheduling, as well as the subtle influence of CPU-GPU data copying. It also points out several desirable properties that future runtime systems or operating systems should have to better support such integrated heterogeneous systems.

2. Anticipatory wakeup: CPU sleep management on CPU-GPU integrated processors. Current operating systems can already support many types of peripheral devices, but much work remains before the OS can unlock the performance potential of an integrated GPU. Integrated-GPU systems have two distinctive features. First, the integration requires the CPU to respond to GPU requests promptly: interrupts may introduce nontrivial overhead because workloads on an integrated GPU are usually very short, while polling incurs additional power, so neither is a power-efficient means of timely synchronization. Second, because the GPU is a computing unit, the execution length of a GPU job is very hard to predict. This work illustrates a fundamental dilemma among GPU responsiveness, energy efficiency, and co-run interference. Through anticipatory wakeup, proposed to resolve this dilemma, this work maximizes energy efficiency and obtains better cooperativeness for co-running jobs on CPU-GPU integrated processors (a toy sketch of the idea follows this abstract).

3. Co-run scheduling with a power cap on integrated CPU-GPU processors. To meet the power constraints of a processor, hardware can directly adjust the frequency and power states of processor cores. However, this hardware policy cannot exploit application-level runtime information and may in some cases hurt application performance. To overcome this shortcoming, our work jointly considers the runtime and hardware factors that affect job co-scheduling performance, including memory contention, power contention, and job length. It unveils the complexity of the problem and proposes heuristic algorithms that efficiently find locally optimal co-schedules, distributing power reasonably among all processor cores and producing performance close to the optimum (a toy greedy sketch follows this abstract).

The proposed optimizations have been implemented on AMD/Linux and Intel/Windows systems. Experimental results show that they improve program performance on CPU-GPU integrated processors with negligible power overhead, and consequently improve the energy efficiency of the entire system.
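The following is a minimal C sketch of the anticipatory-wakeup idea from contribution 2: after launching a GPU job, the CPU sleeps through most of the job's predicted duration and polls only in a short window around the expected completion, combining the low latency of polling with the low power of sleeping. The routines gpu_launch_async, gpu_poll_done, and predict_kernel_us are hypothetical placeholders for driver/runtime hooks, not the dissertation's actual interface.

#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Placeholder stubs standing in for real driver/runtime hooks. */
static void     gpu_launch_async(int job_id)  { (void)job_id; }              /* enqueue a GPU job */
static bool     gpu_poll_done(void)           { return true; }               /* non-blocking completion check */
static uint64_t predict_kernel_us(int job_id) { (void)job_id; return 200; }  /* history-based length prediction */

static void sleep_us(uint64_t us)
{
    struct timespec ts = { (time_t)(us / 1000000), (long)((us % 1000000) * 1000) };
    nanosleep(&ts, NULL);   /* the core may enter a low-power state here */
}

/* Neither pure interrupt wait (latency) nor pure polling (power) alone:
 * sleep through most of the predicted run, then busy-poll a short tail. */
void run_with_anticipatory_wakeup(int job_id)
{
    uint64_t predicted   = predict_kernel_us(job_id);
    uint64_t poll_window = 50;               /* microseconds of tail polling */

    gpu_launch_async(job_id);
    if (predicted > poll_window)
        sleep_us(predicted - poll_window);   /* anticipatory sleep phase */
    while (!gpu_poll_done())
        ;                                    /* short, bounded-cost polling phase */
}

The split between the sleep phase and the polling tail is the key knob: a longer tail tolerates prediction error at some power cost, while a shorter tail relies more heavily on the accuracy of the job-length prediction.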
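Below is a toy C sketch in the spirit of contribution 3's power-capped co-scheduling heuristic: a greedy pass pairs each CPU job with the unused GPU job that yields the lowest estimated co-run time while the pair's combined power stays under the cap. The job fields, the numbers, and the simple interference model are illustrative assumptions, not the dissertation's actual algorithm or cost model.

#include <stdio.h>

#define NJOBS 4
#define POWER_CAP_W 45.0

typedef struct {
    const char *name;
    double base_time_s;   /* estimated solo run time */
    double power_w;       /* estimated power draw at nominal frequency */
    double mem_intensity; /* 0..1, proxy for memory-bandwidth demand */
} job_t;

/* Simple contention model: two memory-hungry jobs slow each other down. */
static double corun_time(const job_t *cpu_job, const job_t *gpu_job)
{
    double interference = 1.0 + cpu_job->mem_intensity * gpu_job->mem_intensity;
    double t_cpu = cpu_job->base_time_s * interference;
    double t_gpu = gpu_job->base_time_s * interference;
    return t_cpu > t_gpu ? t_cpu : t_gpu;   /* a pair finishes when its slower job does */
}

int main(void)
{
    job_t cpu_jobs[NJOBS] = {
        {"cpu-compute", 10, 20, 0.2}, {"cpu-stream", 12, 18, 0.9},
        {"cpu-mix",      8, 22, 0.5}, {"cpu-light",   6, 10, 0.1},
    };
    job_t gpu_jobs[NJOBS] = {
        {"gpu-dense",     9, 25, 0.3}, {"gpu-bandwidth", 11, 23, 0.8},
        {"gpu-short",     4, 15, 0.4}, {"gpu-long",      14, 26, 0.6},
    };
    int used[NJOBS] = {0};

    /* Greedy: for each CPU job, pick the unused GPU partner with the lowest
     * estimated co-run time whose combined power stays under the cap. */
    for (int i = 0; i < NJOBS; i++) {
        int best = -1;
        double best_t = 1e9;
        for (int j = 0; j < NJOBS; j++) {
            if (used[j] || cpu_jobs[i].power_w + gpu_jobs[j].power_w > POWER_CAP_W)
                continue;
            double t = corun_time(&cpu_jobs[i], &gpu_jobs[j]);
            if (t < best_t) { best_t = t; best = j; }
        }
        if (best >= 0) {
            used[best] = 1;
            printf("pair %-12s + %-13s  est. %.1fs\n",
                   cpu_jobs[i].name, gpu_jobs[best].name, best_t);
        }
    }
    return 0;
}

A real scheduler would replace the toy interference model with measured LLC and bandwidth contention and would also adjust per-core frequencies, but the greedy structure is enough to show how power and contention constraints can be traded off jointly.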
Keywords/Search Tags: Heterogeneous Fused System, GPU, Resource Contention, Performance Analysis, Power Analysis, Compiler Optimization, Runtime Optimization, Operating System