Font Size: a A A

Improving Program Cache Performance By Program Analysis And Optimization

Posted on:2008-10-05Degree:DoctorType:Dissertation
Country:ChinaCandidate:X FuFull Text:PDF
GTID:1118360212998675Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
In the past 40 years, the ever-increasing speed gap between processor and memory forms the "Memory Wall" and has resulted in one of the primary bottlenecks in computer system. Now modern computer architecture widely employs cache to decrease the impact of the gap and cache performance has an increasing influence on system speed, cost and energy usage. The utility of cache mostly depends on program locality, especially program data locality, including the temporal locality and the spatial locality. Thus, optimizing program data locality has become an important method to improve cache performance.Due to its independence of hardware and platform support, optimizing program locality by program analysis and optimization has become an active area to improve cache performance. There used to be two primary methods to optimize program data locality: the first is to analyze cache behavior of program data, then basing on cache behavior model, build a program analysis tool which shows the bottlenecks of data cache performance and hence directs programmer to tune performance of data cache; the second is program transformation by an optimizing compiler or an optimizing tool, and data cache performance is optimized in the transformation.This thesis introduces the following works basing on the above two methods:(1) Research on Program Cache Behavior ModelThough reuse distance has become an important metric of program cache behavior, problems such as high complexity and possible memory overflow have largely restricted its application. By limiting the max cache size and the area of reuse distance analysis, this paper introduces a limited reuse distance analysis method. This method efficiently avoids possible memory overflow problem in normal reuse distance analysis, and at the same time reduces complexity of reuse distance analysis to linear. Experiments on some integer and floating-point programs have demonstrated the feasibility and correctness of cache miss rate analyzed based on this reuse distance analysis.(2) Design and Implementation of Cache Behavior Analysis ToolProgram cache behavior analysis and comprehension help programmer to find the bottlenecks of program cache performance, optimize locality by program trans- formation and improve program cache performance. This thesis introduces a reuse distance based program cache behavior analysis tool, employing a source level instrumentation method based on intermediate information stack to collect program data access information, and using a limited reuse distance model to analyze program cache behavior. An example shows that this cache behavior analysis tool is capable of not only locating the cache performance bottlenecks in source codes to guide the code transformation, but also analyzing the cache behavior relationship among variables, thus to direct the data reorganization.(3) Data Layout Optimization Using Reuse Distance DistributionDuring cache behavior analysis, much similarity among reuse distance signatures of variables which are often accessed together can be easily observed. If layout of this kind of variables is optimized by reorganization, the cache performance can be improved. Motivated by this consideration, this thesis outlines a data layout optimization framework. A variable relation model based on reuse distance signature is used to find variables that are often accessed together in the framework. The framework implements a dynamic array regrouping method to reorganize the dynamic arrays. Experiments on some benchmarks from SPEC CPU2000 show this framework is feasible and effective to optimize data locality and improve cache performance.(4) Improving Cache Performance by Structure SplittingStructure splitting is a common method to improve cache performance by data reorganization, and it is of certain specialities since structure splitting only needs to analyze the relative relation of fields. To overcome the complexity of reuse distance analysis, this thesis proposes a field relation model based on a simple distance to quantitate the relation of structure fields, then searches an optimized layout for structure, optimizes program data layout and improves program data cache performance by structure splitting. Experiments show that this method is effective in optimizing program data layout, improving program data cache performance and thus improving the performance of the whole program.
Keywords/Search Tags:Cache Behavior, Reuse Distance, Source Level Instrumentation, Data Layout, Spatial Locality, Program Transform, Dynamic Array Regrouping, Structure Splitting
PDF Full Text Request
Related items