
Cache Design And Runtime Performance Optimization Based On Utilization Characteristics

Posted on: 2011-12-20
Degree: Master
Type: Thesis
Country: China
Candidate: L X Xiang
Full Text: PDF
GTID: 2178360302474610
Subject: Computer application technology
Abstract/Summary:
The memory system has long been one of the major bottlenecks of computer system performance: a visit to main memory typically costs several hundred CPU cycles. To narrow the gap between the processor and main memory, cache memories are widely deployed, and the cache has become even more critical due to increasing memory latency and the growing memory requirements of emerging applications. The foundation of caching is the data locality in applications' access streams. However, current cache designs pay little attention either to the locality characteristics of different cache levels or to the varying access behaviors of applications and application phases. As a result, cache performance under such designs is limited by the difficulty of adapting the cache to access behavior. This thesis analyzes the utilization characteristics of different cache levels and proposes corresponding cache designs and runtime optimizations.

For the first-level (L1) cache, this thesis investigates miss locality. Using short miss phases as the metric of program phases, we observe that L1 cache misses are mainly caused by a few leaky sets, which exhibit both good continuity and good predictability. This thesis proposes a structure called the Leaky Set Cache (LSC) to eliminate conflict misses in caches with low associativity. By predicting the locations of leaky sets, the LSC adaptively buffers victims evicted from those sets, reducing conflict misses without lengthening the access latency.

In the L2 cache, the traditional LRU policy behaves poorly for workloads whose working set is larger than the L2 cache, producing a large number of less reused lines that are never reused or reused only a few times. In this case, cache performance can be improved by retaining a portion of the working set in the cache long enough. Previous schemes approach this by bypassing never-reused lines; however, because they are strictly constrained by the number of never-reused lines, they sometimes deliver no benefit at all. This thesis proposes a new filtering mechanism that filters out less reused lines rather than only never-reused lines. The extended scope of bypassing provides more opportunities to fit the working set into the cache, overcoming the limitation of previous schemes. This thesis also proposes the Less Reused Filter (LRF), a separate structure placed in front of the L2 cache, to implement this mechanism. The LRF employs a reuse-frequency predictor to accurately identify less reused lines among incoming lines. Meanwhile, based on our observation that most less reused lines have a short life span, the LRF places filtered lines into a small filter buffer so that they can still be fully utilized, avoiding extra misses. Our evaluation on 24 SPEC 2000 benchmarks shows that augmenting a 512KB LRU-managed L2 cache with an LRF having a 32KB filter buffer reduces the average MPKI by 27.5%, narrowing the gap between LRU and OPT by 74.4%. With an equal number of data lines overall, the LRF outperforms other recent proposals, including the V-Way cache, the dynamic insertion policy, and the shepherd cache.
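The LRF idea above can be illustrated with a minimal simulation sketch. This is not the thesis's actual hardware design: the class names, the fully associative LRU model, and the crude per-address reuse counter standing in for the reuse-frequency predictor are all simplifying assumptions for illustration. The point it shows is the routing decision: lines predicted to be less reused are diverted into a small filter buffer, so they cannot evict the working set retained in the main cache.

```python
from collections import OrderedDict, defaultdict

class LRUCache:
    """Minimal fully associative cache of `capacity` lines with LRU replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, addr):
        """Return True on a hit; on a miss, insert addr, evicting the LRU line if full."""
        if addr in self.lines:
            self.lines.move_to_end(addr)      # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)    # evict the least recently used line
        self.lines[addr] = True
        return False

class FilteredCache:
    """A main LRU cache guarded by a small filter buffer (LRF-style sketch).

    Lines whose observed reuse count is at or below `threshold` are
    predicted 'less reused' and held in the filter buffer instead of
    the main cache, so streaming lines cannot pollute the main cache.
    """
    def __init__(self, main_capacity, filter_capacity, threshold=1):
        self.main = LRUCache(main_capacity)
        self.filter = LRUCache(filter_capacity)
        self.reuse_count = defaultdict(int)   # stand-in for the reuse-frequency predictor
        self.threshold = threshold

    def access(self, addr):
        self.reuse_count[addr] += 1
        if addr in self.main.lines:
            return self.main.access(addr)
        if addr in self.filter.lines:
            return self.filter.access(addr)
        # Miss in both: route by predicted reuse frequency.
        if self.reuse_count[addr] <= self.threshold:
            self.filter.access(addr)          # bypass the main cache
        else:
            self.main.access(addr)            # promote a frequently reused line
        return False
```

Driving this with a trace of a few hot lines followed by a long cold stream shows the intended effect: once the hot lines have been promoted, the stream is absorbed by the filter buffer and the hot lines still hit in the main cache afterwards.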
Keywords/Search Tags: cache performance, utilization characteristics, cache filtering, leaky sets, less reused lines