
Research On Compressed Cache Technology For Performance Optimization

Posted on: 2008-07-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X H Tian
Full Text: PDF
GTID: 1118360242999234
Subject: Electronic Science and Technology
Abstract/Summary:
Since innovations in CMOS technology in recent years have widened the performance gap between processor and memory, modern processors use one or more levels of on-chip caches to alleviate the ever-increasing pressure of memory accesses. In addition, as chip density increases, chip multiprocessors (MP) and multithreading (MT) are becoming the mainstream architectures of current processor design. Both architectures can greatly improve processor performance and throughput by exploiting thread-level as well as instruction-level parallelism, but the growing memory access demand in an MP/MT environment challenges the throughput of the memory subsystem. The processor designer must balance cores against caches within a fixed area budget so that neither becomes the sole performance bottleneck. Compressed cache technology can change this tradeoff and allow a design in which more on-chip area is allocated to processor cores, since on-chip cache compression increases the effective cache size, and thereby avoids some misses, without significantly increasing cache area. Unfortunately, cache compression also has a negative side effect: compressed cache lines must be decompressed before the processor can use them, so storing compressed lines increases cache hit latency. This dissertation therefore studies compressed cache technology for performance optimization. Methods such as optimizing the compressed cache hierarchy, simplifying the compression algorithm, and improving the cache replacement policy are proposed to improve the performance of compressed caches.

The main contributions of this dissertation are as follows:

1. By simplifying the Frequent Pattern Compression (FPC) algorithm used for L2 cache compression and dividing the decompression of a compressed cache line into two stages, we propose a novel decompression process for L2 compressed cache lines based on a Simple Frequent Pattern Compression (S-FPC) algorithm. The proposed scheme reduces L2 decompression latency by one cycle and also supports compressing the L1 data cache. We evaluate the scheme by simulation and describe its hardware implementation in detail.

2. We propose a unified compressed cache hierarchy (UCCH) that uses a single compression algorithm, S-FPC, in both the L1 D-cache and the L2 cache. UCCH increases the effective capacity of the L1 D-cache and the L2 cache without any sacrifice of L1 cache access latency. The layout of compressed data in the L1 data cache of UCCH enables partial cache line prefetching without introducing prefetch buffers or increasing cache pollution and memory traffic. Experiments show that UCCH distinctly improves performance.

3. We propose a novel modified LRU replacement policy for compressed caches (MLRU-C). MLRU-C uses the extra tags in a compressed cache to construct a shadow tag structure, which identifies and records mistaken replacement decisions made by the LRU policy. The mistaken replacements recorded by the shadow tag structure are stored in a Mistake Record Table (MRT), and MLRU-C corrects subsequent replacement decisions according to the records in the MRT. Experiments show that MLRU-C evidently decreases the L2 compressed cache miss rate.

4. We propose using compressed cache technology to improve multithreaded processor performance. Because sharing the on-chip cache hierarchy between threads hurts the data locality of the L1 D-cache and the L2 cache, MT technology distinctly increases the cache miss rate and memory traffic, and the demands on cache capacity and on data bus bandwidth between cache levels rise accordingly. Because our UCCH scheme increases the capacity and distinctly decreases the miss rate of both the L1 D-cache and the L2 cache, it can alleviate the bandwidth demand between L1, L2, and main memory and improve the performance of MT processors.
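To illustrate the kind of per-word pattern encoding that FPC-style compression performs, the following Python sketch classifies each 32-bit word of a cache line by a compressible pattern. The abstract does not specify the exact pattern set of S-FPC, so the patterns below (zero word, sign-extended byte, sign-extended halfword, uncompressed) are a hypothetical reduced subset chosen for illustration, not the dissertation's actual encoding.

```python
# Sketch of a frequent-pattern word classifier in the spirit of FPC.
# Pattern set and 2-bit prefix values are illustrative assumptions.

def classify_word(word: int) -> tuple[int, int]:
    """Return (prefix, data_bits) for one 32-bit word."""
    word &= 0xFFFFFFFF
    # Reinterpret as a signed 32-bit value to test sign extension.
    signed = word - (1 << 32) if word & 0x80000000 else word
    if word == 0:
        return (0b00, 0)        # zero word: prefix only, no data bits
    if -128 <= signed <= 127:
        return (0b01, 8)        # value fits in a sign-extended byte
    if -32768 <= signed <= 32767:
        return (0b10, 16)       # value fits in a sign-extended halfword
    return (0b11, 32)           # incompressible: stored uncompressed

def compressed_size_bits(line_words) -> int:
    """Encoded size of a cache line: 2-bit prefix plus data per word."""
    return sum(2 + classify_word(w)[1] for w in line_words)
```

For example, a 64-byte line of sixteen zero words encodes in 32 bits of prefixes, and a line of small integers shrinks to roughly a quarter of its size, which is the effect that lets a compressed cache hold more lines in the same area.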
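The MLRU-C mechanism described in contribution 3 can be sketched as follows. A shadow tag structure remembers tags recently evicted by plain LRU; if an access hits in the shadow tags, that earlier eviction was a mistake and the tag is recorded in a Mistake Record Table (MRT); on later replacements the policy prefers victims not listed in the MRT. The structure sizes and the exact correction rule below are assumptions for illustration, not the dissertation's parameters.

```python
from collections import OrderedDict

# Minimal one-set sketch of an MLRU-C-style policy: LRU with a shadow tag
# store and a Mistake Record Table (MRT). Sizes are illustrative.

class MLRUCSet:
    def __init__(self, ways=4, shadow=4, mrt=8):
        self.ways, self.shadow_cap, self.mrt_cap = ways, shadow, mrt
        self.lines = OrderedDict()   # resident tags, LRU order (oldest first)
        self.shadow = OrderedDict()  # tags recently evicted by LRU
        self.mrt = OrderedDict()     # tags whose eviction proved mistaken

    def access(self, tag) -> bool:
        """Access a tag; return True on hit, False on miss."""
        if tag in self.lines:
            self.lines.move_to_end(tag)          # refresh LRU position
            return True
        if tag in self.shadow:
            # Re-reference shortly after eviction: LRU made a mistake.
            del self.shadow[tag]
            self.mrt[tag] = None
            if len(self.mrt) > self.mrt_cap:
                self.mrt.popitem(last=False)
        if len(self.lines) >= self.ways:
            self._evict()
        self.lines[tag] = None
        return False

    def _evict(self):
        # Correct the LRU decision: skip MRT-protected lines when a
        # non-protected victim exists.
        victim = next((t for t in self.lines if t not in self.mrt),
                      next(iter(self.lines)))
        del self.lines[victim]
        self.shadow[victim] = None               # remember the eviction
        if len(self.shadow) > self.shadow_cap:
            self.shadow.popitem(last=False)
```

In a 2-way set, the access sequence A, B, C, A, D, A evicts A at C, detects the mistake on the second access to A, and then protects A so that the final access to A hits; plain LRU would miss there.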
Keywords/Search Tags:S-FPC, Compressed Cache, Partial Cache Line Prefetching, Compressed Cache Replacement Policy, SMT, MLRU-C Replacement Policy