
Research On Dynamic Cache Partition For Fused CPU-GPU Architecture

Posted on: 2016-03-23    Degree: Master    Type: Thesis
Country: China    Candidate: C W Sun    Full Text: PDF
GTID: 2308330473461600    Subject: Computer system architecture
Abstract/Summary:
Recently, with the rapid development of GPU hardware and the availability of easy-to-adopt programming models, using GPUs to accelerate non-graphics CPU applications has become increasingly popular and is an inevitable trend. Integrating CPUs and GPUs on the same chip has therefore emerged as a new architectural direction. Compared with traditional discrete systems, such fused architectures offer several advantages, including lower communication cost, more efficient resource utilization, and better performance. However, because the hierarchical cache structure filters accesses at multiple levels, the locality seen by the last-level cache (LLC) is poor; the traditional LRU replacement policy therefore uses the LLC inefficiently, and performance suffers. Managing the LLC well is thus critical to overall performance. First, the CPU and GPU have different architectures and differ in their sensitivity to LLC capacity. Second, the GPU has a large number of cores, and under LRU, GPU applications read and write the cache heavily, yet their performance does not improve significantly as cache capacity grows. Meanwhile, CPU applications are left with less cache and their performance degrades severely. These different characteristics of CPU and GPU applications pose new challenges for managing the shared LLC between the CPU and the GPU.

In this thesis, we analyze the characteristics of GPU applications and, drawing on previous cache management schemes, propose I-M CP, a dynamic cache partitioning scheme for the fused architecture. The main research contents and contributions are as follows:

1. To manage the LLC of the CPU-GPU fused architecture efficiently, we analyze the similarities and differences between the CPU and GPU architectures, including thread-switching cost, the number of parallel cores, memory bandwidth, and cache access patterns. After briefly introducing the advantages and the programming model of the CPU-GPU fused architecture, we put forward a cache sensitivity evaluation method for GPU applications and use it to classify them.

2. We analyze the importance of the LLC and the shortcomings of the LRU replacement policy on multi-core processors. Based on LLC optimization strategies and the characteristics of CPU and GPU applications, we propose the I-M CP dynamic cache partitioning scheme for the fused architecture.

Through a detailed experimental design, we evaluate static cache partitioning along multiple dimensions. The experiments indicate that cache partitioning avoids interference between CPU and GPU applications. The results show that, after partitioning, overall system performance improves significantly, which demonstrates the effectiveness of the proposed method. Compared with LRU, the optimal static cache partition and the I-M CP dynamic cache partition improve performance by 11.68% and 13.68%, respectively, while GPU application performance degrades by only 3.27% and 0.87%, respectively.
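The abstract does not describe the internals of the I-M CP scheme, so the sketch below is only a hedged illustration of the general idea of dynamic LLC way partitioning between a CPU and a GPU partition: a controller samples per-partition hit statistics each epoch and shifts cache ways toward the partition that makes better use of them, mirroring the observation above that cache-insensitive GPU applications should not crowd out cache-sensitive CPU applications. All names here (WayPartitioner, rebalance, the per-way hit-rate heuristic, the epoch granularity) are hypothetical and are not taken from the thesis.

```cpp
// Minimal sketch of dynamic LLC way partitioning between CPU and GPU
// (an illustration only, not the thesis's actual I-M CP algorithm).
// Each epoch the controller compares per-way hit rates and moves one way
// toward the partition that extracts more hits per way, so a streaming,
// cache-insensitive GPU workload gradually cedes capacity to
// cache-sensitive CPU workloads.
#include <cstdint>
#include <iostream>

struct PartitionStats {
    uint64_t hits = 0;      // LLC hits observed in the current epoch
    uint64_t accesses = 0;  // total LLC accesses in the current epoch
    double hitRate() const {
        return accesses ? static_cast<double>(hits) / accesses : 0.0;
    }
};

class WayPartitioner {
public:
    WayPartitioner(int totalWays, int minWaysEach)
        : totalWays_(totalWays), minWays_(minWaysEach),
          cpuWays_(totalWays / 2), gpuWays_(totalWays - totalWays / 2) {}

    // Called once per epoch (e.g., every few million cycles) with the
    // hit statistics gathered by the cache controller.
    void rebalance(const PartitionStats& cpu, const PartitionStats& gpu) {
        double cpuPerWay = cpu.hitRate() / cpuWays_;
        double gpuPerWay = gpu.hitRate() / gpuWays_;
        // Grant one extra way to the partition with the higher per-way
        // hit rate, while keeping at least minWays_ ways per partition.
        if (cpuPerWay > gpuPerWay && gpuWays_ > minWays_) {
            ++cpuWays_; --gpuWays_;
        } else if (gpuPerWay > cpuPerWay && cpuWays_ > minWays_) {
            ++gpuWays_; --cpuWays_;
        }
    }

    int cpuWays() const { return cpuWays_; }
    int gpuWays() const { return gpuWays_; }

private:
    int totalWays_;
    int minWays_;
    int cpuWays_;
    int gpuWays_;
};

int main() {
    WayPartitioner part(16, 2);   // 16-way LLC, at least 2 ways per partition
    PartitionStats cpu;           // cache-sensitive CPU workload
    cpu.hits = 800; cpu.accesses = 1000;
    PartitionStats gpu;           // streaming GPU workload, low hit rate
    gpu.hits = 100; gpu.accesses = 5000;
    part.rebalance(cpu, gpu);
    std::cout << "CPU ways: " << part.cpuWays()
              << ", GPU ways: " << part.gpuWays() << "\n";
}
```

In a real design the repartitioning decision would be enforced by the replacement logic (e.g., restricting which ways each requester may allocate into), and the sensitivity classification from contribution 1 could seed the initial split; those details are beyond what the abstract specifies.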
Keywords/Search Tags: GPU cache sensitivity, fused architecture, shared last-level cache, dynamic cache partition