A tuning framework for software-managed memory hierarchies

Posted on:2010-09-09

Degree:Ph.D

Type:Thesis

University:Stanford University

Candidate:Ren, Manman

GTID:2448390002984066

Subject:Engineering

Abstract/Summary:

New architectures are emerging at a rapid pace. Architectures with multiple processing units on a chip and deep memory hierarchies have become pervasive, while architectures with software-managed memory hierarchies (such as the Sony/Toshiba/IBM Cell processor) have gained popularity. Because of this increased architectural complexity, re-targeting a legacy application to a new architecture requires substantial time for porting and tuning. To achieve both portability and high performance on modern machines, we propose a programming environment that includes a portable language (Sequoia), a portable runtime, and a tuning framework. In this thesis, we focus on the design and implementation of the tuning framework.

Achieving good performance on a modern machine with a multi-level memory hierarchy, and in particular on a machine with software-managed memories, requires meticulous tuning of programs to the machine's particular characteristics. Further, the choices made when tuning a program for one machine will typically be very different from those made when tuning the same program for a different machine. A large program on a multi-level machine can easily expose tens or hundreds of inter-dependent parameters that require tuning, ranging (for example) from subarray sizes to compiler flags to loop optimizations to decomposition strategies. Manually searching the resulting large, non-linear space of program parameters is a tedious process of trial and error. These challenges motivate the design of an automatic tuning framework.

In this dissertation, we present a general framework for automatically tuning arbitrary applications to machines with software-managed memory hierarchies. The tuning framework matches the decomposition strategies to the memory hierarchies. It uses a search algorithm, specialized to software-managed memory hierarchies, that achieves good performance quickly due to the smoothness of the search space. The framework also applies a novel fusion algorithm that considers multiple outermost loop levels in a single step. The knowledge learned when searching the tunable space is used to guide the selection of a fusion configuration.

We evaluate our framework by measuring the performance of benchmarks that are tuned for a range of machines with different memory hierarchy configurations: a cluster of Intel P4 Xeon processors, a single Cell processor, and a cluster of Sony PlayStation 3s. The tuning framework gives similar or better performance than that achieved by the best available hand-tuned version coded in Sequoia.
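
The abstract does not specify the search algorithm, so the following is only a minimal illustrative sketch of why a smooth parameter space can be searched cheaply. It uses a generic greedy coordinate search, not the thesis's method. The parameter names (tile_rows, tile_cols), the candidate values, and the synthetic cost function toy_cost are all assumptions made for illustration; a real tuner would compile and time the program instead.

```python
import math
from typing import Callable, Dict, List, Tuple

Point = Dict[str, int]


def coordinate_search(
    space: Dict[str, List[int]],
    cost: Callable[[Point], float],
    max_sweeps: int = 20,
) -> Tuple[Point, float, int]:
    """Greedy coordinate search over ordered candidate lists.

    Starts from the middle of each list and walks one parameter at a time
    toward lower cost. Returns (best point, best cost, evaluations used).
    """
    idx = {name: len(values) // 2 for name, values in space.items()}

    def point(indices: Dict[str, int]) -> Point:
        return {name: space[name][i] for name, i in indices.items()}

    best = cost(point(idx))
    evals = 1
    for _ in range(max_sweeps):
        improved = False
        for name in space:
            for step in (-1, 1):
                # Keep stepping in this direction while the cost improves.
                while True:
                    j = idx[name] + step
                    if not 0 <= j < len(space[name]):
                        break
                    trial = dict(idx)
                    trial[name] = j
                    c = cost(point(trial))
                    evals += 1
                    if c >= best:
                        break
                    best, idx, improved = c, trial, True
        if not improved:
            break  # no neighbor is better: a local minimum
    return point(idx), best, evals


def toy_cost(p: Point) -> float:
    """Hypothetical smooth cost with its minimum at 64 x 16 tiles."""
    return (
        (math.log2(p["tile_rows"]) - 6) ** 2
        + (math.log2(p["tile_cols"]) - 4) ** 2
        + 1.0
    )


if __name__ == "__main__":
    # Hypothetical tunables: power-of-two tile sizes from 2 to 1024.
    space = {
        "tile_rows": [2 ** k for k in range(1, 11)],
        "tile_cols": [2 ** k for k in range(1, 11)],
    }
    best_point, best_cost, evals = coordinate_search(space, toy_cost)
    print(best_point, best_cost, evals)
```

On this synthetic cost the search reaches the minimum after far fewer evaluations than the 100 points an exhaustive sweep would time. That saving depends on the cost surface being smooth; on a rugged surface the same greedy walk can stop at a poor local minimum.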

Keywords/Search Tags:

Tuning framework, Memory, Performance, Architectures

Related items

1	Optically-Connected Memory: Architectures and Experimental Characterizations
2	Array syntax compilation and performance tuning
3	A journey through performance evaluation, tuning, and analysis of parallelized applications and parallel architectures: Quantitative approach
4	Comparison of MDO architectures within a universal framework
5	Decoupled memory access architectures with speculative pre-execution
6	Researches On Processing And Memory Integrated Architectures
7	Research Of Oracle Performance Tuning And Real-Time Monitoring
8	Studies On The PIM Architectures And Techniques For Scientific Applications
9	Improving energy and performance of data cache architectures by exploiting memory reference characteristics
10	High-performance packet processing engines using set-associative memory architectures