Research On Performance Optimization Of Cache Replacement Algorithm For Heterogeneous Multi-core Systems

Posted on:2018-01-10

Degree:Master

Type:Thesis

Country:China

Candidate:Q W Fan

Full Text:PDF

GTID:2348330563952343

Subject:Computer Science and Technology

Abstract/Summary:

The arrival of big data has brought new challenges in handling massive amounts of data. Traditional multi-core architectures struggle to meet the needs of large-scale computation, and combining a GPU and a CPU on a single chip is a trend for relieving that pressure. In traditional multi-core processors, as the number of cores on a chip keeps growing, the imbalance between the development speed of the processor and that of the memory system leads to the well-known "memory wall", which puts great pressure on the storage system. Today, in addition to the dramatic increase in the number of cores per chip, the heterogeneity of the cores brings further difficulties. In a CPU-GPU heterogeneous multi-core system, CPU and GPU applications each have their own characteristics. The CPU is responsible for task execution and serial logic control, while the GPU has a great advantage in parallel computing thanks to its thread-level parallelism. The architecture therefore lets each processor play to its strengths in its own class of applications. However, it also puts considerable pressure on the management of the various shared resources, especially the last-level cache (LLC), which is shared between the CPU and the GPU and strongly affects system performance and power consumption.

This thesis analyzes current optimization schemes for shared memory in heterogeneous multi-core systems and finds that the existing cache replacement algorithms used in heterogeneous multi-core environments are all thread-blind: they ignore the respective characteristics of GPU and CPU applications, which leads to low cache utilization. Most GPU applications use both thread-level parallelism and caching to reduce the negative impact of memory latency, whereas CPU applications can compensate for memory latency only through caching. Hence, cache allocation for CPU and GPU applications should be treated separately.

We present a cache replacement algorithm based on LLC misses for heterogeneous multi-core systems to improve system performance. The algorithm takes the runtime characteristics of CPU and GPU applications into account. CPU and GPU applications share the last-level cache in different ways, which ensures that CPU applications with high cache sensitivity have a longer lifetime in the cache. Furthermore, the algorithm considers both the most recent access time and the access frequency of each cache block, and dynamically selects the LRU or the LFU algorithm to fit the current operating state by comparing the number of cache misses in the last-level cache. The algorithm thus not only considers the operating characteristics of CPU and GPU applications in a heterogeneous multi-core system, but also combines the two classical replacement algorithms, LRU and LFU, which is of great significance for the efficient use of the last-level cache in such systems.

To take full advantage of the two classical algorithms, this thesis also proposes a novel method that dynamically alters the replacement algorithm based on the reusability of data blocks. The algorithm sets different priorities for CPU and GPU applications when they access the cache, and adds an array that records information about buffered cache blocks in order to avoid the problems caused by the LRU algorithm. It considers not only the memory characteristics of the different core types but also the locality of the program while it runs, so as to improve system performance in a heterogeneous multi-core system.

To measure performance accurately, we evaluate our proposal using gem5-gpu as the base architectural simulator, with workloads constructed from the SPEC CPU 2006 benchmarks and Rodinia. Experimental results show that our methodology improves performance by up to 9.1%, and by 6.7% on average, for CPU programs, and by up to 7.5%, and by 7.0% on average, for GPU programs. The LLC miss rate is reduced by 44% and 23% on average for CPU applications and by 54.7% and 34.7% on average for GPU applications. The experimental results indicate that our optimization methods can effectively improve system performance in heterogeneous multi-core systems.
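
The abstract gives no implementation details, so the following is only a rough sketch of the general idea of choosing between LRU and LFU by comparing miss counts while giving CPU blocks a longer lifetime than GPU blocks. Everything specific here is an assumption for illustration and not the thesis's actual design: the names `Block`, `ShadowSet` and `HybridSet`, the use of two tag-only shadow sets to count the misses each policy would incur, the tie-break in favour of LRU, and the rule that GPU-owned blocks are evicted before CPU-owned ones.

```python
from dataclasses import dataclass


@dataclass
class Block:
    tag: int
    owner: str       # "cpu" or "gpu" (hypothetical labelling of the requester)
    last_used: int   # logical time of the most recent access (used by LRU)
    freq: int = 1    # number of accesses since insertion (used by LFU)


class ShadowSet:
    """Tag-only set that always applies one fixed policy and counts its misses."""

    def __init__(self, ways, policy):
        self.ways, self.policy, self.misses = ways, policy, 0
        self.blocks = {}  # tag -> [last_used, freq]

    def access(self, tag, now):
        if tag in self.blocks:
            self.blocks[tag][0] = now
            self.blocks[tag][1] += 1
            return
        self.misses += 1
        if len(self.blocks) >= self.ways:
            if self.policy == "lru":
                victim = min(self.blocks, key=lambda t: self.blocks[t][0])
            else:  # LFU, with the older block losing ties
                victim = min(self.blocks,
                             key=lambda t: (self.blocks[t][1], self.blocks[t][0]))
            del self.blocks[victim]
        self.blocks[tag] = [now, 1]


class HybridSet:
    """One cache set whose victim choice follows the policy with fewer misses."""

    def __init__(self, ways):
        self.ways, self.now = ways, 0
        self.blocks = {}  # tag -> Block
        self.shadow = {p: ShadowSet(ways, p) for p in ("lru", "lfu")}

    def policy(self):
        # Assumption: ties fall back to LRU.
        lru, lfu = self.shadow["lru"].misses, self.shadow["lfu"].misses
        return "lfu" if lfu < lru else "lru"

    def access(self, tag, owner):
        """Return True on a hit and False on a miss."""
        self.now += 1
        for shadow in self.shadow.values():
            shadow.access(tag, self.now)
        block = self.blocks.get(tag)
        if block is not None:
            block.last_used = self.now
            block.freq += 1
            return True
        if len(self.blocks) >= self.ways:
            del self.blocks[self._victim()]
        self.blocks[tag] = Block(tag, owner, self.now)
        return False

    def _victim(self):
        # Assumption: GPU blocks are evicted first so CPU blocks live longer.
        gpu = [b for b in self.blocks.values() if b.owner == "gpu"]
        candidates = gpu or list(self.blocks.values())
        if self.policy() == "lru":
            return min(candidates, key=lambda b: b.last_used).tag
        return min(candidates, key=lambda b: (b.freq, b.last_used)).tag
```

In this sketch, a two-way `HybridSet` that sees a CPU block followed by two different GPU blocks evicts the first GPU block rather than the older CPU block, which is where it departs from plain LRU. A real LLC would need a cheaper mechanism than full shadow sets and some safeguard against starving the GPU; neither is modelled here.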

Keywords/Search Tags:

heterogeneous multi-core, CPU-GPU, cache replacement algorithm, performance

Related items

1	Research On Cache Optimization Technology Based On CPU-GPU Heterogeneous Architecture
2	The Design Of Shared L2-Cache Structure Based On Heterogeneous Multicore System
3	Design Of Multi-level Cache Replacement Strategy In Multi-Core Processors
4	Research On LLC Replacement Policy For Heterogeneous Chip Multi-processors
5	Research On Shared Cache Management Technology In Heterogeneous Multi-core Environment
6	Study On Cache Partition Optimization Based On Non-stacked Cache Replacement Algorithm
7	Research On Key Technologies For Cache Power And Performance Optimization On Many-core Heterogeneous Architecture
8	Research On Cache Optimization Mechanism In Heterogeneous Memory Environment
9	Application Research Of Data Cache Technology In MIS
10	Optimization Of Internal Storage Architecture Of Heterogeneous Multi-core SOC Processors