
Model Driven Cache Management

Posted on: 2020-07-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C C Ye
Full Text: PDF
GTID: 1368330590458834
Subject: Computer system architecture
Abstract/Summary:
Modern computers use a variety of cache systems built from multiple kinds of memory: for example, SRAM and DRAM for the CPU cache, DRAM and disk for the memory cache, and DRAM and PCM (phase-change memory) for emerging hybrid memory caches. The ubiquity of cache systems makes their performance tuning critical. Unfortunately, performance depends on many considerations, including cache space needs, cache miss ratio, and bandwidth utilization. These considerations constitute a high-dimensional problem space and hinder further optimization. Cache sharing in multi-core systems introduces still more dimensions: the performance of individual programs and the overall performance. The problem space consequently expands, making it a fundamental challenge to tune the performance of shared cache systems effectively and systematically.

There are two difficulties in cache performance tuning. First, the problem space is too large to search exhaustively. Second, the performance considerations interact with each other, so optimizing one can harm another; for example, optimizing overall performance may degrade the performance of individual programs. Model-driven cache management is a practical solution: it makes performance predictable without actually running the program, so the management can be specifically designed or tuned. Since the tuning process relies only on predicted values, it is efficient, and a large number of management strategies can be inspected to derive the optimal solution. The management consists of three components: (1) a performance model for a specific cache architecture, (2) a parameterized management that tunes the performance of a single program with multiple considerations, and (3) a management that trades off between overall performance and the performance of individual programs. The first component provides the prediction of cache performance and is the basis of the latter two managements, which solve the tuning problem from different aspects.

For the first component, a theorem called the victim footprint (VFP) is proposed to model the shared exclusive cache. It predicts the performance of any exclusive cache hierarchy, any cache size, and any program combination. More importantly, a proof of the uniqueness and correctness of the VFP is presented. In particular, the performance-modeling problem of the shared exclusive cache, as used on AMD processors, is described and formalized as a set of constraints called the victim cache requirement (VCR), and the VFP is proved to be the only solution to the VCR. The VFP is evaluated by using it to predict the miss ratios of parallel mixes of sequential programs.

For the second component, a memory management called fraction cache (FCache) and its tuning technique are presented as a solution to the performance-tuning problem of a single program with multiple considerations. The DRAM-PCM hybrid memory architecture is employed as an application scenario to demonstrate the effectiveness of FCache in tuning two performance considerations simultaneously: DRAM space needs and DRAM-PCM migration cost. FCache divides the data into fractions and stores them in different types of memory; a fraction can be cached in DRAM and evicted to PCM. By parameterizing the fractions, FCache is flexible and encompasses a broad solution space. Furthermore, its parameters can be tuned automatically for different performance objectives.

For the third component, a memory management called the Rochester Elastic Cache Utility (RECU) is designed to trade off between overall performance and the performance of individual programs. RECU bounds the performance loss of individual programs while optimizing overall performance via cache partitioning. It defines two performance baselines for individual programs: a cache space baseline and a cache miss ratio baseline. Elasticity is then defined over each baseline; for example, a 20% cache miss ratio elasticity indicates that the miss ratio of an individual program should not rise above 120% of its baseline during tuning. In addition, the cache performance model quantifies the correlation between miss ratio and cache space, so the cache miss ratio elasticity can be reduced to a cache space elasticity, which bounds the cache space loss of individual programs and is more intuitive for cache partitioning. A proposed algorithm then finds the optimal cache partition subject to the elasticity constraints. Finally, RECU is evaluated on a trade-off problem on shared servers.

The VFP and the two managements target different aspects of the performance-tuning problem; they are orthogonal and modularized, and can therefore be composed arbitrarily. For example, RECU can use a cache performance model other than the VFP, and FCache can be composed with RECU to tune a program with additional performance considerations. Such modularization is critical for solving a broad range of problems.
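The performance models above build on footprint-based locality theory. As an illustrative sketch only (not the VFP itself, which extends the theory to shared exclusive caches), the basic footprint model predicts a miss ratio curve from an access trace: the footprint fp(w) is the average number of distinct blocks touched across all windows of w consecutive accesses, and the miss ratio of a fully associative LRU cache of size c = fp(w) is approximately fp(w+1) - fp(w). A naive Python version:

```python
def footprint(trace, w):
    """Average number of distinct blocks over all length-w windows.

    Naive O(n*w) computation for illustration; production footprint
    analysis uses linear-time methods based on reuse-time histograms.
    """
    windows = len(trace) - w + 1
    total = sum(len(set(trace[i:i + w])) for i in range(windows))
    return total / windows

def miss_ratio_curve(trace):
    """Predicted (cache_size, miss_ratio) points from the footprint:
    at cache size c = fp(w), miss ratio ~ fp(w+1) - fp(w)."""
    return [(footprint(trace, w), footprint(trace, w + 1) - footprint(trace, w))
            for w in range(1, len(trace))]

# Cyclic trace over 3 blocks: a cache of size 1 misses every access,
# while a cache holding the whole working set (3 blocks) never misses.
trace = list("abc" * 4)
```

On this trace the model gives the intuitive answers: fp(1) = 1, fp(2) = 2, so the predicted miss ratio at cache size 1 is 1.0, and fp(4) - fp(3) = 0, i.e., a cache of size 3 eliminates all misses.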
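RECU's elasticity-constrained partitioning can also be sketched as a small optimization problem. The brute-force version below is a hypothetical illustration, not the dissertation's algorithm: given each program's predicted miss-ratio curve (e.g., from a footprint model), it searches allocations of cache ways for the one minimizing total misses, while keeping every program's miss ratio within (1 + elasticity) of its equal-share baseline.

```python
from itertools import product

def optimal_partition(miss_curves, total_ways, elasticity=0.2):
    """Find the way allocation minimizing summed miss ratios, subject to
    each program staying within (1 + elasticity) of its baseline.

    miss_curves[p][w] = predicted miss ratio of program p with w ways.
    Baseline = miss ratio under an equal split of the cache.
    Exhaustive search; fine for a few programs and a few ways.
    """
    n = len(miss_curves)
    baseline_ways = total_ways // n
    bounds = [(1 + elasticity) * curve[baseline_ways] for curve in miss_curves]
    best, best_total = None, float("inf")
    for alloc in product(range(1, total_ways + 1), repeat=n):
        if sum(alloc) != total_ways:
            continue
        if any(miss_curves[p][w] > bounds[p] for p, w in enumerate(alloc)):
            continue  # violates the elasticity constraint
        total = sum(miss_curves[p][w] for p, w in enumerate(alloc))
        if total < best_total:
            best, best_total = alloc, total
    return best
```

With two hypothetical miss-ratio curves, the constraint visibly changes the answer: unconstrained optimization would starve the cache-insensitive program to feed the cache-hungry one, while a 20% miss ratio elasticity forces a less extreme split.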
Keywords/Search Tags: Locality Theory, Locality Optimization, Cache Performance Modeling, Cache Sharing, Cache Management, Hybrid Memory Architecture