Dynamically reconfigurable on- and off-chip networks

Posted on: 2012-10-10
Degree: Ph.D
Type: Dissertation
University: University of Southern California
Candidate: Zafar, Bilal
Full Text: PDF
GTID: 1458390008992702
Subject: Engineering
Abstract/Summary:
Computing systems are increasingly becoming communication-bound. From megawatt servers in data centers to mobile and embedded systems, the cost, performance and reliability of the communication fabric have rapidly become first-order design concerns in digital systems at every scale. At the same time, traditional solutions to connecting resources are fast approaching their natural limits, beyond which they will not be able to offer the requisite performance at acceptable area, power or dollar cost.

The growing need for a low-latency, high-bandwidth interconnect fabric for cluster systems has made InfiniBand Architecture (IBA)-compliant devices the fabric of choice in high-performance systems. IBA is an open, industry-standard specification designed specifically for high performance, quality-of-service guarantees and high robustness. For high robustness, availability and dependability, IBA-compliant networks must be able to adapt to changes in the topology caused by, for example, hot-swapping of components or faults in the network, without negatively impacting the end-to-end latency of application traffic. In this work, we present a mechanism for InfiniBand networks to reconfigure the routing algorithm without stopping injection of packets, discarding routable packets already in the network or significantly reducing the available bandwidth. The proposed mechanism uses a deadlock-free dynamic reconfiguration scheme called the Double Scheme, together with features available in IBA, to seamlessly update the active routing function in seven steps with very little overhead. Simulation results show that, using the proposed mechanism, InfiniBand networks remain available for application traffic during the entire reconfiguration process. This availability in the presence of failure comes at the cost of only a slight increase in the reconfiguration time.

Challenges facing traditional interconnects are not limited to off-chip interconnect fabrics like InfiniBand networks. Point-to-point wires, fully connected crossbars and shared buses, which have been used as communication fabrics in multi-core processors, also face a myriad of challenges such as scalability, power dissipation and wiring complexity. Packet-switched on-chip interconnection networks have been proposed as a solution, and this work prunes the design space of topologies for these on-chip interconnection networks using relevant quantitative and qualitative metrics. Our analysis shows that 2D mesh, torus and hierarchical ring topologies meet most of the requirements for interconnects in moderate-sized many-core CMPs.

As the number of cores per die grows, power dissipation in the on-chip interconnection network will increasingly become a key barrier to scalability. Studies have shown that on-chip networks can consume up to 36% of the total chip power, while analysis of network traffic reveals that, in many applications, network load is well below the network capacity for extended periods of execution time. We analyze seven representative parallel applications to quantify the temporal variability in their bandwidth demand. The analysis shows that, in most of the characterized applications, the demand for network bandwidth varies both across applications and over time during the execution of a single application. In five of the seven analyzed applications, the bandwidth demand is close to zero for over 90% of the execution time. More importantly, the off periods are 1 ms or longer, which means that there is a significant opportunity to turn network resources on and off based on the temporal variations in the bandwidth demand of the application.

To address the power dissipation problem by exploiting the temporal variability in the bandwidth demand of applications, this work proposes the Polymorphic Cubic Ring (pcRing) topology. Enabled by a deadlock-free dynamic reconfiguration protocol, a pcRing network provides an elegant way to trade off network bandwidth for lower (static) power without a significant latency penalty. A complete formalism for the proposed Cubic Ring topology, the associated routing algorithm and a deadlock-free dynamic reconfiguration mechanism is presented in this work. Performance of 16- and 64-node cRing networks in various configurations is evaluated via simulation, and the power savings are quantified by synthesizing a polymorphic router with power-gated ports.
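
The bandwidth-demand characterization summarized above can be illustrated with a minimal sketch. This is not the dissertation's tooling. The function name `idle_profile`, the 5% idle threshold, the 0.5 ms sampling interval and the trace values are all assumptions made for illustration. Only the 1 ms off-period figure comes from the abstract, and using it as a cutoff for power gating is likewise an assumption.

```python
from itertools import groupby


def idle_profile(demand, sample_ms, idle_threshold=0.05):
    """Summarize how idle a sampled bandwidth-demand trace is.

    demand: per-sample demand as a fraction of network capacity (hypothetical data).
    sample_ms: duration of one sample in milliseconds (assumed).
    idle_threshold: demand at or below this counts as "close to zero" (assumed).
    Returns (idle_fraction, off_periods_ms), where off_periods_ms lists the
    length of each run of consecutive idle samples.
    """
    idle = [d <= idle_threshold for d in demand]
    idle_fraction = sum(idle) / len(idle)
    off_periods_ms = [
        sum(1 for _ in run) * sample_ms
        for is_idle, run in groupby(idle)
        if is_idle
    ]
    return idle_fraction, off_periods_ms


# Hypothetical trace: short bursts of traffic separated by quiet phases.
trace = [0.9, 0.7] + [0.0] * 12 + [0.5] + [0.0] + [0.6] + [0.0] * 3
fraction, periods = idle_profile(trace, sample_ms=0.5)

# Assumed policy: only off periods of at least 1 ms are treated as
# candidates for switching network resources off.
gateable = [p for p in periods if p >= 1.0]
print(fraction, periods, gateable)
```

Grouping consecutive idle samples, rather than only counting them, matters here because the opportunity described in the abstract depends on how long each off period lasts and not just on the total idle time.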
Keywords/Search Tags:Network, Power, Deadlock-free dynamic reconfiguration, Systems, Bandwidth demand