Font Size: a A A

A tuning framework for software-managed memory hierarchies

Posted on:2010-09-09Degree:Ph.DType:Thesis
University:Stanford UniversityCandidate:Ren, ManmanFull Text:PDF
GTID:2448390002984066Subject:Engineering
Abstract/Summary:
New architectures are emerging at a rapid pace, architectures with multiple processing units on a chip and with deep memory hierarchies have become pervasive; while architectures with software-managed memory hierarchies (such as the Sony/Toshiba/IBM Cell processor) have gained popularity. Due to the increased complexity of architectures, re-targeting a legacy application to a new architecture requires lots of time porting and tuning. To achieve both portability and high performance on modern machines, we propose a programming environment that includes a portable language (Sequoia), a portable runtime and a tuning framework. In this thesis, we focus on the design and implementation of the tuning framework.;Achieving good performance on a modern machine with a multi-level memory hierarchy, and in particular on a machine with software-managed memories, requires the meticulous tuning of programs to the machine's particular characteristics. Further, the choices made when tuning a program for one machine will typically be very different to those made when tuning the same program for a different machine. A large program on a multi-level machine can easily expose tens or hundreds of inter-dependent parameters which require tuning, ranging (for example) from subarray sizes to compiler flags to loop optimizations to decomposition strategies, and manually searching the resultant large, non-linear space of program parameters is a tedious process of trial-and-error. These challenges entail the design of an automatic tuning framework.;In this dissertation, we present a general framework for automatically tuning arbitrary applications to machines with software-managed memory hierarchies. The tuning framework matches the decomposition strategies to the memory hierarchies. It uses a search algorithm, I specialized to software-managed memory hierarchies, that achieves good performance quickly due to the smoothness of the search space. The framework also applies a novel fusion algorithm that considers multiple outermost loop levels in a single step. The knowledge learned when searching the tunable space is used to guide the selection of a fusion configuration.;We evaluate our framework by measuring the performance of benchmarks that are tuned for a range of machines with different memory hierarchy configurations: a cluster of Intel P4 Xeon processors, a single Cell processor and a cluster of Sony Playstation 3s. The tuning framework gives similar or better performance than what is achieved by the best-available hand-tuned version coded in Sequoia.
Keywords/Search Tags:Tuning framework, Memory, Performance, Architectures
Related items