
Research On Efficient Data Communication Between CPU And GPU Through Transparent Partial-Page Migration

Posted on: 2019-08-08
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Zhang
Full Text: PDF
GTID: 2428330611993388
Subject: Computer Science and Technology

Abstract/Summary:
In recent years, CPU-GPU heterogeneous systems have become an important tool for scientific computing, owing to the outstanding performance of GPUs on complex scientific workloads. Currently, the CPU and GPU in such systems are usually connected via PCI-E, an interconnect with low bandwidth and high latency. However, applications running on the GPU typically require large data throughput, so the efficiency of data communication has a significant impact on the performance of the entire CPU-GPU heterogeneous system.

As the memory management model has evolved from separate memories, through a unified address space, to today's unified memory, the data communication mechanism between the CPU and the GPU has also developed, from transfers explicitly controlled by the programmer to a page migration mechanism in which pages are migrated automatically between the CPU and the GPU according to the data requirements of the program. In high-performance workloads, small pages introduce high address translation overhead, so page sizes in CPU-GPU heterogeneous systems have grown in recent years. However, because of the low-bandwidth, high-latency interconnect between the CPU and the GPU, migration latency increases with page size, causing GPU computation to stall while waiting for data and leading to severe performance degradation. This paper studies these issues; the main work is as follows:

(1) This paper analyzes the address translation overhead and migration latency introduced by the whole-page migration mechanism in CPU-GPU heterogeneous systems and proposes a transparent partial-page migration mechanism. This mechanism automatically migrates only the requested portion of a page on demand, without modifying program code or runtime libraries, thereby limiting both address translation overhead and migration latency.

(2) This paper defines a new "partially valid" page state, adds a migrated-scope record to the TLB and page table entries, proposes two migrated-scope management strategies, and modifies the address translation operation to support the new partial-page migration communication mechanism (an illustrative sketch of this page state is given after the abstract). It also evaluates the impact of page size and migration unit size on the performance of the partial-page migration mechanism.

(3) This paper proposes a new partial-page migration operation that extends the GPU memory management unit so that it can specify the migration scope and merge requests when generating migration requests, again to support the new partial-page migration communication mechanism. It also adds a pre-migration optimization to the GPU memory management unit and compares its effect with that of pre-migration applied to the current whole-page migration mechanism.

Experiments show that with a 2 MB page size and a PCI-E bandwidth of 16 GB/s, partial-page migration can largely hide the performance overhead of whole-page migration. Compared with the whole-page migration mechanism, the partial-page migration mechanism proposed in this paper achieves a speedup of about 94 times.
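To make the "partially valid" page state concrete, the C sketch below shows one possible way to record the migrated scope of a large page with a per-unit bitmap, together with the transfer-time arithmetic implied by the 2 MB page size and 16 GB/s PCI-E bandwidth reported above. The 64 KB migration unit, the field names, and the helper functions are assumptions for illustration only; the abstract does not specify the thesis's actual TLB or page-table layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative sketch of a "partially valid" page-table entry.
 * Assumed parameters: 2 MB pages split into 64 KB migration units
 * (32 units per page); the real design in the thesis may differ.  */
#define PAGE_SIZE       (2u << 20)                    /* 2 MB large page    */
#define MIGRATION_UNIT  (64u << 10)                   /* assumed 64 KB unit */
#define UNITS_PER_PAGE  (PAGE_SIZE / MIGRATION_UNIT)  /* = 32 units         */

typedef struct {
    uint64_t gpu_frame;        /* GPU-side frame the page maps to            */
    uint64_t migrated_bitmap;  /* bit i set => unit i is resident on the GPU */
} pte_t;

/* "Partially valid": some, but not all, units have been migrated. */
static bool pte_partially_valid(const pte_t *pte)
{
    const uint64_t full = (1ull << UNITS_PER_PAGE) - 1;
    return pte->migrated_bitmap != 0 && pte->migrated_bitmap != full;
}

/* Address translation only needs to trigger a migration for the
 * missing unit that the faulting access actually touches.         */
static bool unit_resident(const pte_t *pte, uint32_t offset_in_page)
{
    uint32_t unit = offset_in_page / MIGRATION_UNIT;
    return (pte->migrated_bitmap >> unit) & 1u;
}

/* Why partial migration helps, using the numbers from the abstract:
 * moving a whole 2 MB page over a 16 GB/s PCI-E link takes roughly
 * 2 MB / 16 GB/s ~= 128 microseconds, while a single 64 KB unit
 * takes only about 4 microseconds, so the GPU stalls far less.     */
```

Recording the migrated scope at per-unit granularity is what allows address translation to succeed for the already-migrated portions of a page while the remaining units are still being transferred, which is the behavior the partial-page migration mechanism relies on.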
Keywords/Search Tags:Heterogeneous system, unified memory, data communication, partial-page migration