
Heterogeneous Processors and Memory Systems

Posted on: 2016-04-29
Degree: Ph.D.
Type: Thesis
University: The University of Wisconsin - Madison
Candidate: Wang, Hao
Full Text: PDF
GTID: 2478390017981328
Subject: Electrical engineering
Abstract/Summary:
With aggressive technology scaling, chip manufacturers have been integrating both the CPU and the GPU in a single chip. This heterogeneous integration improves overall throughput and energy efficiency, as the serial and parallel portions of a workload can be efficiently executed on the CPU and the GPU, respectively. Such a single-chip heterogeneous processor (SCHP) introduces unique challenges in managing the resources shared between the CPU and the GPU.

First, the CPU and the GPU share a total chip power budget, and thus the power budget allocation can impact the throughput of an SCHP. A power budget partitioning technique is demonstrated that substantially improves the overall throughput when the workload is partitioned between the CPU and the GPU. An effective algorithm is proposed to determine the optimal power budget partitioning at runtime. Second, the CPU and the GPU share the off-chip memory bandwidth, and therefore the memory scheduling policy should properly manage the shared memory channel to maximize the throughput of an SCHP. A detailed analysis of memory access characteristics and various optimization techniques are presented for a special yet important scenario, in which a single workload is partitioned between the CPU and the GPU.

Motivated by the heterogeneous integration of processing units, this thesis also studies heterogeneous memory systems. First, recent high-speed serial interfaces demonstrate a much higher bit rate, but notably longer latency, than current parallel interfaces. A hybrid memory channel architecture is proposed that consists of both parallel and serial channels. Coupled with a hybrid-channel-aware memory page mapping technique, the performance of an SCHP can be improved by mapping the memory pages of latency-sensitive applications to parallel channels and those of bandwidth-consuming applications to serial channels. Second, heterogeneous memory systems generally require a judicious page placement mechanism. Specifically, an asymmetric memory device with fast and slow regions requires frequently accessed pages to be placed in the fast region. Considering the difficulties in implementing such a smart page placement mechanism, a lightweight page migration mechanism is proposed that can transfer a page between the fast and slow regions with minimal performance overhead.
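
The page placement requirement described in the abstract can be illustrated with a small Python sketch. It is a toy model only and does not reproduce the mechanism proposed in the thesis, which the abstract does not detail. The class name `AsymmetricMemory`, the epoch-based access counters, and the swap-with-coldest policy are all assumptions made for illustration.

```python
from collections import Counter


class AsymmetricMemory:
    """Toy model (hypothetical): a small fast region; all other pages are slow."""

    def __init__(self, fast_capacity):
        self.fast_capacity = fast_capacity  # pages that fit in the fast region
        self.fast_pages = set()             # pages currently in the fast region
        self.access_counts = Counter()      # per-epoch access counters (assumed)
        self.migrations = 0                 # pages moved into the fast region

    def access(self, page):
        """Record one access to a page during the current epoch."""
        self.access_counts[page] += 1

    def end_epoch(self):
        """Move hot slow-region pages into the fast region, then reset counters."""
        # Visit pages from most to least frequently accessed in this epoch.
        for page, count in self.access_counts.most_common():
            if page in self.fast_pages:
                continue
            if len(self.fast_pages) < self.fast_capacity:
                # Free space in the fast region: promote without evicting.
                self.fast_pages.add(page)
                self.migrations += 1
                continue
            # Fast region is full: compare against its coldest resident page.
            coldest = min(self.fast_pages, key=lambda p: self.access_counts[p])
            if self.access_counts[coldest] >= count:
                break  # remaining candidates are no hotter, so stop early
            # Swap: the cold page goes to the slow region, the hot page comes in.
            self.fast_pages.remove(coldest)
            self.fast_pages.add(page)
            self.migrations += 1
        self.access_counts.clear()


if __name__ == "__main__":
    mem = AsymmetricMemory(fast_capacity=2)
    for page, n in (("A", 5), ("B", 3), ("C", 1)):
        for _ in range(n):
            mem.access(page)
    mem.end_epoch()
    print(sorted(mem.fast_pages), mem.migrations)
```

In this sketch a migration happens only when a slow-region page was accessed more often than the coldest fast-region page in the same epoch, which keeps the number of transfers low. A real design would also have to account for the cost of each transfer, which the model ignores.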
Keywords/Search Tags: Memory, GPU, CPU, Heterogeneous, Page, Power budget, SCHP