
A high-throughput, low-power asynchronous Mesh-of-Trees interconnection network for the eXplicit Multi-Threading (XMT) parallel architecture

Posted on: 2009-07-08
Degree: M.S.
Type: Thesis
University: University of Maryland, College Park
Candidate: Horak, Michael N
Full Text: PDF
GTID: 2448390002995005
Subject: Engineering
Abstract/Summary:
As the number of processing cores per chip continues to grow, the on-chip network connecting processors to memory becomes increasingly crucial for performance. Future architectures will face scalability concerns, as networks will require more die area and consume more power. The Mesh-of-Trees interconnection network is a high-throughput, low-latency network that uses pipelined routing decisions to achieve high performance for single-chip parallel processors that require high bandwidth to on-chip memory resources. The network has area requirements similar to those of other existing networks, but can utilize more bandwidth due to its unique topology.

Current single-chip parallel processors are developed as synchronous (clocked) circuits. A recent trend has emerged towards implementing GALS (globally asynchronous, locally synchronous) architectures, which do not require a clock tree spanning the entire chip, thus avoiding the considerable challenges of designing such a tree and managing its power consumption.

This thesis presents an asynchronous (clockless) implementation of the Mesh-of-Trees network that features lower power and area demands, while maintaining the high-throughput and low-latency properties of the synchronous network. Two new asynchronous designs are proposed for the fundamental pipelined components of the Mesh-of-Trees network (routing and arbitration), which are optimized for power, area, latency and throughput. Performance and power consumption are evaluated for the asynchronous components in isolation, as well as for a projected full network layout.

Two issues top the agenda of CPU design in the emerging many-core era: programmers' productivity and power consumption. Through its reliance on the richest available theory of parallel algorithms, the eXplicit Multi-Threading (XMT) parallel architecture addresses programmers' productivity. The motivation for this work is to provide an effective interconnection network for the XMT architecture in terms of both performance and power consumption.

To provide communication between the asynchronous and synchronous timing domains, mixed-timing interfaces are implemented. The network, coupled with mixed-timing interfaces, can be used to implement a GALS architecture, where different timing domains communicate via the same asynchronous network. Performance of the XMT processor with the asynchronous network and mixed-timing interfaces is measured for several applications.
Keywords/Search Tags:Network, XMT, Asynchronous, Power, Parallel, Mixed-timing interfaces, Mesh-of-trees, Performance