
Research On Performance Optimization Of Cache Replacement Algorithm For Heterogeneous Multi-core Systems

Posted on: 2018-01-10
Degree: Master
Type: Thesis
Country: China
Candidate: Q W Fan
GTID: 2348330563952343
Subject: Computer Science and Technology
Abstract/Summary:
The arrival of big data has brought new challenges in processing massive amounts of data. Traditional multi-core architectures struggle to meet the needs of large-scale computation, and integrating the GPU and CPU on a single chip is one trend for relieving that pressure. In traditional multi-core processors, as the number of cores on a chip grows, the gap between the development speed of processors and that of memory systems produces the well-known "memory wall", which puts great pressure on the storage system. Today, in addition to the dramatic increase in core counts, the heterogeneity of the cores brings further difficulties. In a CPU-GPU heterogeneous multi-core system, CPU and GPU applications each have their own characteristics. The CPU is responsible for task execution and serial logic control, while the GPU has a great advantage in parallel computing owing to its thread-level parallelism. The architecture therefore lets each processor play to its strengths in its own class of applications. However, it also puts considerable pressure on the management of the various types of resources, especially the last-level cache (LLC), which is shared between the CPU and the GPU and strongly affects system performance and power consumption.

This paper analyzes current optimization schemes for shared memory in heterogeneous multi-core systems and finds that all present cache replacement algorithms, as implemented in heterogeneous multi-core environments, are thread-blind: they ignore the respective characteristics of GPU and CPU applications, which leads to low cache utilization. Most GPU applications use both thread-level parallelism and caching to reduce the negative impact of memory latency, whereas CPU applications can rely only on caching. Hence, cache allocation for CPU and GPU applications should be handled separately.

Here we present a cache replacement algorithm based on LLC misses for heterogeneous multi-core systems to improve system performance. The algorithm takes the runtime characteristics of CPU and GPU applications into account. CPU and GPU applications share the last-level cache in different ways, which ensures that CPU applications with high cache sensitivity keep their data in the cache longer. Furthermore, the algorithm considers both the most recent access time and the access frequency of each cache block, and dynamically selects the LRU or LFU algorithm to fit the current operating state by comparing the numbers of misses in the last-level cache. The algorithm thus not only considers the behavior of CPU and GPU applications under heterogeneous multi-core, but also combines the two classical replacement algorithms, LRU and LFU, which is significant for the efficient use of the last-level cache. To exploit the advantages of both classical algorithms, this paper proposes a novel method that dynamically switches the replacement algorithm based on the reusability of data blocks. The algorithm assigns different cache-access priorities to CPU and GPU applications and adds an array that records information about buffered cache blocks, to avoid the problems caused by the LRU algorithm. It considers both the memory characteristics of the different core types and the locality of the program while it runs, so as to improve system performance in heterogeneous multi-core systems.

To measure performance accurately, we evaluate our proposal using gem5-gpu as the base architectural simulator, with workloads constructed from the SPEC CPU 2006 benchmarks and Rodinia. Experimental results show that our methodology improves performance by up to 9.1% and by 6.7% on average for CPU programs, and by up to 7.5% and by 7.0% on average for GPU programs. The LLC miss rate is reduced by 44% and 23% on average for CPU applications and by 54.7% and 34.7% on average for GPU applications. These results indicate that our optimization methods can effectively improve system performance in heterogeneous multi-core systems.
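The abstract does not spell out how the LRU/LFU selection works, so the following Python sketch is only an illustration under stated assumptions, not the thesis's algorithm. It assumes a small fully associative cache, two tag-only shadow copies (one per policy) whose miss counts are compared every `interval` accesses, and a crude CPU/GPU priority in which GPU blocks are evicted before CPU blocks. All names (`Block`, `PolicyCache`, `AdaptiveCache`, `hot_scan_trace`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block:
    tag: int
    source: str     # "cpu" or "gpu": which kind of core brought the block in
    last_use: int   # time of the most recent access (recency, used by LRU)
    freq: int       # number of accesses since insertion (frequency, used by LFU)


class PolicyCache:
    """Small fully associative cache whose victim policy is "lru" or "lfu"."""

    def __init__(self, capacity, policy):
        self.capacity = capacity
        self.policy = policy
        self.blocks = {}    # tag -> Block
        self.clock = 0
        self.misses = 0

    def _victim(self):
        blocks = list(self.blocks.values())
        # Assumption (not from the abstract): GPU blocks are evicted first,
        # so CPU blocks stay resident longer.
        gpu_blocks = [b for b in blocks if b.source == "gpu"]
        candidates = gpu_blocks or blocks
        if self.policy == "lru":
            return min(candidates, key=lambda b: b.last_use)
        # LFU: lowest access count, ties broken by oldest access.
        return min(candidates, key=lambda b: (b.freq, b.last_use))

    def access(self, tag, source):
        """Access one block; return True on a hit, False on a miss."""
        self.clock += 1
        block = self.blocks.get(tag)
        if block is not None:
            block.last_use = self.clock
            block.freq += 1
            return True
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            del self.blocks[self._victim().tag]
        self.blocks[tag] = Block(tag, source, self.clock, 1)
        return False


class AdaptiveCache:
    """Main cache that follows whichever shadow policy missed less recently."""

    def __init__(self, capacity, interval=7):
        self.main = PolicyCache(capacity, "lru")
        self.shadows = {p: PolicyCache(capacity, p) for p in ("lru", "lfu")}
        self.interval = interval
        self.seen = 0

    @property
    def misses(self):
        return self.main.misses

    def access(self, tag, source):
        for shadow in self.shadows.values():
            shadow.access(tag, source)
        hit = self.main.access(tag, source)
        self.seen += 1
        if self.seen % self.interval == 0:
            lru_m = self.shadows["lru"].misses
            lfu_m = self.shadows["lfu"].misses
            if lfu_m < lru_m:
                self.main.policy = "lfu"
            elif lru_m < lfu_m:
                self.main.policy = "lru"
            # On a tie the current policy is kept. Start a new interval.
            for shadow in self.shadows.values():
                shadow.misses = 0
        return hit


def hot_scan_trace(rounds=4):
    """Two reused 'hot' blocks interleaved with blocks that are used once."""
    tags = []
    for i in range(rounds):
        tags += [0, 1, 0, 1]
        tags += [100 + 3 * i + k for k in range(3)]
    return [(tag, "cpu") for tag in tags]


if __name__ == "__main__":
    caches = {"lru": PolicyCache(3, "lru"), "lfu": PolicyCache(3, "lfu"),
              "adaptive": AdaptiveCache(3)}
    for tag, source in hot_scan_trace():
        for cache in caches.values():
            cache.access(tag, source)
    for name, cache in caches.items():
        print(name, cache.misses)
```

On this synthetic trace the streaming blocks keep pushing the two hot blocks out under LRU, while LFU retains them, so the adaptive cache starts under LRU and moves to LFU once the shadow LFU copy reports fewer misses. A real LLC would be set-associative and would sample only a few sets for the comparison; the fully associative model here is chosen purely to keep the sketch short.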
Keywords/Search Tags:heterogeneous multi-core, CPU-GPU, cache replacement algorithm, performance