A dual address space architecture: Implementation and evaluation

Posted on: 2009-10-13 | Degree: Ph.D. | Type: Thesis
University: The University of Wisconsin - Madison | Candidate: McCurdy, Collin B | Full Text: PDF
GTID: 2448390005451368 | Subject: Computer Science
Abstract/Summary:
While a hardware-supported shared address space offers programmability advantages and better performance for fine-grained applications, the very mechanisms that create those advantages appear to prevent the machines that use them from scaling to large numbers of processors. For this reason, hardware-based distributed shared memory (DSM) platforms have largely been dismissed in the context of scientific computing. This dissertation proposes changes to a hardware-based DSM architecture that allow users to use two address spaces to gain the scalability of distributed architectures while retaining the benefits of the shared address space architecture.

The thesis of this dissertation is that the Dual Address Space Architecture provides both an efficient mechanism for enabling increased user and compiler control over data consistency and data locality, and a high-level interface for programmers to conceptualize and take advantage of the mechanism. The Dual Address Space Architecture thus provides users and compilers an efficient means, particularly appropriate to the structure of scientific applications, of keeping data consistent and local, enabling improved performance over current global address space implementations.

The dissertation evaluates its thesis in two parts. The first part demonstrates the feasibility of the ideas by describing the details of the high-level architecture, including the programming model, and its implementation via extensions to a standard directory-based cache coherence protocol. The second part evaluates the performance of Dual Address Space implementations of two applications that currently underperform on distributed address space platforms. First, cycle-accurate simulation results indicate that a Dual Address Space version of HYCOM, an ocean model that features an irregular data decomposition, significantly reduces the number of last-level cache misses that cause communication, resulting in substantial improvement in performance. Second, experiments with two implementations (distributed memory and shared memory) of the Fast Multipole Method, a tree-based algorithm for solving N-body problems, expose weaknesses in both that would be cleanly and efficiently addressed by a Dual Address Space implementation of the algorithm.
Keywords/Search Tags:Address space, Implementation, Performance