
An adaptive chip multiprocessor cache hierarchy

Posted on: 2008-03-07
Degree: Ph.D
Type: Thesis
University: University of Colorado at Boulder
Candidate: Settle, M. W. Alexander
Full Text: PDF
GTID: 2448390005969989
Subject: Engineering
Abstract/Summary:
System performance is increasingly coupled to cache hierarchy design as chip multiprocessors (CMPs) grow in core count across generations. Higher core counts require large last-level cache (LLC) capacities to avoid costly off-chip memory bandwidth and the inherent bottleneck of memory requests from multiple active cores. There are currently two schools of thought in CMP cache design: shared versus private last-level caches. At the center of the issue is that CMP systems must serve different classes of workload: the throughput of multiple independent single-threaded applications and the high-performance demands of parallel multi-threaded applications. Maximizing CMP performance in each of these domains requires opposing design choices. As a result, it is necessary to investigate the behavior of both shared and private LLC designs, as well as an adaptive LLC approach that works across multiple workloads.

This thesis proposes a scalable CMP cache hierarchy design that shields applications from inter-process interference while offering generous per-core last-level cache capacity and low round-trip memory latencies. A combination of parallel and serial data mining applications from the NU-MineBench suite, along with scientific workloads from the SPEC OMP suite, is used to evaluate cache hierarchy performance. The results show that the scalable CMP cache hierarchy decreases the average memory latency of the parallel workloads by 45% relative to the private cache configuration and by an average of 15% relative to the shared cache. In addition, memory bandwidth demand is 25% lower than that of the private cache for parallel applications and 30% lower for the serial workloads.
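The shared-versus-private tradeoff described in the abstract can be illustrated with a toy average-memory-access-time (AMAT) model: a private LLC slice is fast to reach but small (lower hit rate), while a shared LLC exposes its full capacity to every core (higher hit rate) at the cost of longer, contended access. This is a minimal sketch, not the thesis's methodology; all latencies and hit rates below are hypothetical placeholders.

```python
# Toy AMAT model for a two-level hierarchy (L1 + LLC + memory).
# All parameters are hypothetical, chosen only to illustrate the tradeoff.

def amat(l1_hit_rate, llc_hit_rate, l1_lat, llc_lat, mem_lat):
    """Average memory access time in cycles."""
    return l1_lat + (1 - l1_hit_rate) * (llc_lat + (1 - llc_hit_rate) * mem_lat)

# Private LLC: short access latency, but a small per-core slice -> lower hit rate.
private_amat = amat(l1_hit_rate=0.90, llc_hit_rate=0.60,
                    l1_lat=2, llc_lat=10, mem_lat=200)

# Shared LLC: full capacity visible to every core -> higher hit rate,
# but a longer (and contended) access path.
shared_amat = amat(l1_hit_rate=0.90, llc_hit_rate=0.80,
                   l1_lat=2, llc_lat=25, mem_lat=200)

print(f"private LLC AMAT: {private_amat:.1f} cycles")  # 11.0 with these numbers
print(f"shared  LLC AMAT: {shared_amat:.1f} cycles")   # 8.5 with these numbers
```

With these example numbers the shared design wins, but shrinking the shared hit rate or raising its contention latency flips the result, which is why an adaptive hierarchy that captures the strengths of both is attractive.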
Keywords/Search Tags:Cache hierarchy, Private cache, Workloads, Applications, Last level cache, Parallel