Exploring speculative techniques to improve the memory system performance

Posted on:2004-02-02

Degree:Ph.D

Type:Dissertation

University:University of Minnesota

Candidate:Sendag, Resit

GTID:1468390011474194

Subject:Engineering

Abstract/Summary:

As processor clock speeds have increased along with microarchitectural innovations, the gap between processor and memory performance has become a greater bottleneck, and improvements in memory system design have become more important. This dissertation focuses on improving memory performance by adding novel functionality to the memory system. Specifically, we propose two techniques to hide the latency of memory accesses: Incorrect Speculation and Address Correlation.

The speculative execution of threads in a multithreaded architecture, combined with the branch prediction used in each thread's execution unit, allows many instructions to be executed speculatively, that is, before it is known whether the program will actually need them. We have found that incorrect speculation (wrong execution) at both the instruction level and the thread level provides an indirect prefetching effect for the later correct execution paths and threads. By continuing to execute mispredicted load instructions even after the instruction- or thread-level control speculation is known to be incorrect, the cache misses observed on the correctly executed paths can be reduced. However, we also found that these extra loads can increase memory traffic and pollute the cache. We introduce the small, fully associative Wrong Execution Cache (WEC) to eliminate the pollution that the mispredicted load instructions could otherwise cause. Our simulation results show that the WEC can improve the performance of a concurrent multithreaded architecture by reducing the number of cache misses.

In another approach, we investigate a program phenomenon, Address Correlation, which links addresses that reference the same data. This work shows that different addresses containing the same data can often be correlated at run time to eliminate a load miss or a partial hit. For the programs tested, a great majority of the L1 data cache load misses and partial hits can be supplied from a correlated address already found in the cache. Our source-code-level analysis shows that semantically equivalent information, duplicated references, and frequent values are the major causes of address correlations.
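The Wrong Execution Cache idea described above can be illustrated with a toy model. The following is only a minimal sketch under stated assumptions, not the dissertation's simulator: the class names (`LRUCache`, `MemorySystem`), the 64-byte block size, the cache sizes, the LRU replacement policy, the fully associative L1, and the rule that a correct-path WEC hit moves the block into L1 are all hypothetical choices made for illustration. The abstract states only that the WEC is small and fully associative and that it holds the effects of wrong-execution loads so they do not pollute the cache.

```python
from collections import OrderedDict

BLOCK_SIZE = 64  # assumed block size in bytes (not given in the abstract)


class LRUCache:
    """Fully associative, LRU-replaced set of block numbers (tags only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def lookup(self, block):
        if block in self.blocks:
            self.blocks.move_to_end(block)  # mark as most recently used
            return True
        return False

    def insert(self, block):
        self.blocks[block] = True
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used

    def remove(self, block):
        self.blocks.pop(block, None)


class MemorySystem:
    """Toy L1 plus optional WEC; L1 is fully associative only for brevity."""

    def __init__(self, l1_blocks=4, wec_blocks=2, use_wec=True):
        self.l1 = LRUCache(l1_blocks)
        self.wec = LRUCache(wec_blocks)
        self.use_wec = use_wec
        self.correct_path_misses = 0

    def load(self, address, wrong_path=False):
        block = address // BLOCK_SIZE
        if self.l1.lookup(block):
            return "l1_hit"
        if self.use_wec and self.wec.lookup(block):
            if not wrong_path:
                # Assumed policy: promote the block to L1 on a correct-path hit.
                self.wec.remove(block)
                self.l1.insert(block)
            return "wec_hit"
        if wrong_path:
            if self.use_wec:
                # Wrong-path fills go to the WEC only, so L1 is not polluted.
                self.wec.insert(block)
                return "wrong_path_fill"
            return "squashed"  # baseline: the mispredicted load is dropped
        self.correct_path_misses += 1
        self.l1.insert(block)
        return "miss"


def run(use_wec):
    """Replay a tiny hand-made trace and report correct-path misses."""
    mem = MemorySystem(use_wec=use_wec)
    for addr in (0x0000, 0x0040, 0x0080, 0x00C0):  # cold misses fill L1
        mem.load(addr)
    for addr in (0x1000, 0x1040):  # loads on a mispredicted path
        mem.load(addr, wrong_path=True)
    results = [mem.load(0x1000)]  # correct path later needs the same block
    results += [mem.load(addr) for addr in (0x0040, 0x0080, 0x00C0)]
    return mem.correct_path_misses, results
```

On this hand-made trace, `run(True)` reports 4 correct-path misses and `run(False)` reports 5: the block touched by the mispredicted load is already waiting in the WEC when the correct path asks for it, and the blocks that were in L1 before the misprediction (other than the one LRU victim displaced by the promoted block) are still there afterwards. The trace is contrived to show the mechanism and says nothing about the speedups measured in the dissertation.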

Keywords/Search Tags:

Memory, Performance, Address

Related items

1	The Design Of TLD In High Performance Embedded CPU
2	Research And Implementation Of The Embedded Memory System Based On High Performance VIM Architecture
3	Performance Analysis Of Off-Chip Memory Architecture
4	The Optimization Of Memory Controller For High Performance CPU
5	HPC Application Address Stream Compression, Replay and Scaling
6	Research On High Performance Cache And Memory System
7	The Realization Of Embedded CPU Memory Management Unit
8	Address Randomization For Dynamic Memory Allocators On The GPU
9	Research On Memory Forensics For Memory Injection In Windows 10 User Address Space
10	Design Of Memory Management Unit In 32-bit MIPS Processor