
Optimizing the cache performance of non-numeric applications

Posted on: 2001-01-10
Degree: Ph.D
Type: Dissertation
University: University of Toronto (Canada)
Candidate: Luk, Chi-Keung
Full Text: PDF
GTID: 1468390014456524
Subject: Computer Science
Abstract/Summary:
The latency of accessing instructions and data from the memory subsystem is an increasingly crucial performance bottleneck in modern computer systems. While cache hierarchies are an important first step, they alone cannot solve the problem. Further, though a variety of latency-hiding techniques have been proposed, their success has been largely limited to regular, numeric applications. Few promising latency-hiding techniques that can handle irregular, non-numeric codes have been proposed, in spite of the popularity of such codes in computer applications.

This dissertation investigates hardware and software techniques for coping with instruction-access and data-access latency in non-numeric applications. To deal with instruction-access latency, we propose cooperative instruction prefetching, a novel technique which significantly outperforms state-of-the-art instruction prefetching schemes by prefetching more aggressively and much further ahead of time while substantially reducing the number of useless prefetches.

To cope with data-access latency, we investigate three complementary techniques. First, we study how to use compiler-inserted data prefetching to tolerate the latency of accessing pointer-based data structures. To schedule prefetches early enough, we design three prefetching schemes to overcome the pointer-chasing problem associated with these data structures, and we automate them in an optimizing research compiler. Second, we study how to safely perform an important class of locality optimizations, namely dynamic data layout optimizations, in non-numeric codes. Specifically, we propose the use of an architectural mechanism called memory forwarding, which can guarantee the safety of data relocation, thereby enabling many aggressive data layout optimizations (which also facilitate prefetching) that cannot be safely performed using current hardware or compiler technology. Finally, in an effort to minimize the overheads of latency tolerance techniques, we propose new cache miss prediction techniques based on correlation profiling. By correlating cache miss behaviors with dynamic execution contexts, these techniques can accurately isolate dynamic miss instances and so pay the latency tolerance overhead only when there would have been cache misses.

Detailed design considerations and experimental evaluations are provided for our proposed techniques, confirming them as viable solutions for coping with memory latency in non-numeric applications.
Keywords/Search Tags: Latency, Non-numeric, Applications, Techniques, Cache, Data, Memory