
Research on Cache Optimization Mechanism in Heterogeneous Memory Environment

Posted on: 2022-01-20
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Chen
GTID: 1488306575451974
Subject: Computer system architecture

Abstract/Summary:
With the development of big data, artificial intelligence, and the mobile Internet industry, the data scale is growing explosively. Applications and computing architectures significantly improve the efficiency of data transmission, storage, and computing by keeping more data in memory, which makes the demand for large-capacity main memory ever more urgent. Traditional Dynamic Random Access Memory (DRAM) technology faces challenges of high price, low density, and high energy consumption when scaling memory capacity, making it difficult to keep pace with the growing data scale. Non-volatile Memory (NVM) technology, with its advantages of low price, high density, and low static power consumption, has become a promising alternative to DRAM. However, compared with DRAM, current NVM technology still suffers from high write latency and high write power consumption, so it is not feasible for NVM to fully replace DRAM in the short term. Heterogeneous memory, which combines the advantages of both DRAM and NVM, has therefore become a highly competitive solution for memory capacity expansion.

Because the access latencies of DRAM and NVM differ greatly, the memory access overhead on a Last Level Cache (LLC) miss is asymmetric. This asymmetry makes traditional LLC optimization mechanisms inefficient and poorly suited to heterogeneous memory systems. Focusing on on-chip cache optimization in the heterogeneous memory environment, we optimize the LLC along three directions, cache replacement, cache bypass, and cache allocation, which reduces memory access overhead and significantly improves the throughput and fairness of heterogeneous memory systems.

The cache hit rate is no longer a valid measure of LLC efficiency in heterogeneous memory systems, so existing cache replacement strategies that aim only to improve the hit rate are likewise inefficient there. Taking the asymmetry of the memory access cost into account, we advocate the average memory access time as the evaluation metric of LLC efficiency in heterogeneous memory systems and propose MALRU, a cache miss penalty aware replacement strategy. MALRU preferentially evicts DRAM cache blocks, whose memory access latency is low, to reduce the overhead of LLC misses, while frequently accessed DRAM and NVM blocks are kept resident in the LLC to prevent cache thrashing. Experimental results show that MALRU improves system performance by up to 13.1% compared to the state-of-the-art HAP policy.
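To make the miss-penalty-aware replacement idea concrete, the following minimal Python sketch models one set of a set-associative LLC in which each block is tagged with its backing memory type. It is an illustrative reconstruction, not the dissertation's implementation: the names (`MALRUSet`, `MemType`, `protect_ratio`) are hypothetical, and the protected MRU region is only a stand-in for however MALRU actually keeps hot DRAM and NVM blocks resident.

```python
from collections import OrderedDict
from enum import Enum

class MemType(Enum):
    DRAM = 0  # low miss penalty: cheap to refetch
    NVM = 1   # high miss penalty: expensive to refetch

class MALRUSet:
    """One set of a miss-penalty-aware LLC (illustrative sketch)."""

    def __init__(self, ways, protect_ratio=0.5):
        self.ways = ways
        # Keys ordered from LRU (front) to MRU (back).
        self.blocks = OrderedDict()  # tag -> MemType
        # The MRU-most fraction of the set is never victimized, so
        # frequently reused DRAM and NVM blocks stay resident.
        self.protected = min(ways - 1, max(1, int(ways * protect_ratio)))

    def access(self, tag, mem_type):
        """Touch a block; returns True on hit, False on miss."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)  # promote to MRU
            return True
        if len(self.blocks) >= self.ways:
            self._evict()
        self.blocks[tag] = mem_type
        return False

    def _evict(self):
        # Only the LRU-most, unprotected blocks are eviction candidates.
        candidates = list(self.blocks)[: self.ways - self.protected]
        # Prefer a DRAM victim: a later miss on it costs far less than a
        # miss served from NVM, so the expected miss penalty goes down.
        for tag in candidates:
            if self.blocks[tag] is MemType.DRAM:
                del self.blocks[tag]
                return
        del self.blocks[candidates[0]]  # no DRAM candidate: plain LRU
```

A full implementation would also track reuse to decide which blocks deserve protection; here the LRU order itself stands in for that signal.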
Due to the inconsistent memory access latency, competition among applications is no longer the only factor that affects the throughput and fairness of heterogeneous memory systems. Traditional LLC bypass strategies do not take the characteristics of heterogeneous memory into account, and thus cannot accurately weigh cache blocks or identify the applications that are at a competitive disadvantage. To address this problem, we propose CMC, a cache-memory coordinated LLC bypass strategy for heterogeneous memory systems. CMC identifies the three main factors that affect system fairness in heterogeneous memory systems and advocates the average service time of memory requests as a measure of an application's competitiveness. At the memory layer, CMC prioritizes the scheduling of lightweight applications and applications with a high average service time. At the LLC layer, CMC leverages applications' memory access behaviors in heterogeneous memory systems to guide the LLC to bypass dead blocks. Experimental results show that, compared with the state-of-the-art TCM strategy, CMC improves system throughput and fairness by up to 8.76% and 8.99%, respectively.

Existing cache allocation strategies usually require additional hardware or intrusive modifications to existing hardware, and therefore cannot be applied to commercial processors. Moreover, applications with a high percentage of write requests are blocked more severely in heterogeneous memory systems, which further degrades system throughput and fairness. To address this problem, we leverage Intel Cache Allocation Technology (CAT) to allocate LLC space to applications on physical machines and propose ACCA, an application-clustering-based cache allocation strategy for heterogeneous memory systems. ACCA groups applications and preferentially satisfies the LLC space demand of two kinds of applications: memory-lightweight and recency-friendly. In addition, ACCA quantifies the competitiveness of applications in two dimensions, processor operation and memory access, and allocates more LLC space to applications with a high proportion of write requests. Experimental results show that, compared to the state-of-the-art Dunn policy, ACCA improves system throughput and fairness by 9.83% and 43.19% on average, respectively.
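ACCA's key practical property is that it runs on stock processors by driving Intel CAT from software. On Linux, CAT is exposed through the resctrl filesystem, and the sketch below shows the kind of plumbing an allocator like ACCA could build on; the clustering and sizing logic is omitted, and the group name, bitmask, and PIDs are illustrative values, not taken from the dissertation. It requires root privileges and a mounted resctrl filesystem.

```python
import os

RESCTRL = "/sys/fs/resctrl"  # Linux mount point for Intel CAT control

def allocate_llc(group, l3_mask_hex, pids):
    """Pin `pids` to an LLC partition described by a capacity bitmask.

    Hypothetical helper: contiguous set bits in the mask select which
    cache ways the group may fill (e.g. "00ff" = 8 of 16 ways).
    """
    path = os.path.join(RESCTRL, group)
    os.makedirs(path, exist_ok=True)
    # The schemata file maps each L3 cache domain to its capacity
    # bitmask; only domain 0 is set here for brevity.
    with open(os.path.join(path, "schemata"), "w") as f:
        f.write(f"L3:0={l3_mask_hex}\n")
    # resctrl expects one PID per write() call to the tasks file.
    for pid in pids:
        with open(os.path.join(path, "tasks"), "w") as f:
            f.write(str(pid))

# Example: give a write-heavy application half of a 16-way LLC.
# allocate_llc("write_heavy", "00ff", [1234])
```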
Keywords: Non-volatile Memory, Heterogeneous Memory, Hybrid Memory, Last Level Cache, Cache Replacement, Cache Bypass, Cache Allocation