Research On The Key Techniques Of Routing Algorithm And Flow Control Optimizations For Cache-Coherent Networks-on-Chip

Posted on: 2013-11-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: S Ma
Full Text: PDF
GTID: 1268330422973742
Subject: Computer Science and Technology
Abstract/Summary:
Advances in semiconductor technology keep increasing core counts, and traditional bus or point-to-point communication mechanisms face several challenges as a result, including low bandwidth, high latency, high power consumption and poor scalability. Networks-on-Chip (NoCs) were proposed to address these challenges and are regarded as a simple, efficient and scalable communication paradigm for future many-core platforms. At the same time, because parallel programming is difficult and legacy code must remain compatible, cache coherence protocols will continue to exist in many-core platforms. In cache-coherent many-core platforms, the traffic delivered by the NoC is largely determined by the cache coherence protocol in use. To provide efficient communication support for coherent traffic, it is necessary to analyze the traffic characteristics and then optimize the NoC design accordingly. The main contributions of this thesis are as follows.

1. Efficient routing algorithm to support workload consolidation scenarios

Due to hierarchical cache coherence protocols and limited application parallelism, it is quite likely that multiple applications will run concurrently on a many-core platform. These workload consolidation scenarios require the routing algorithm to provide both sufficient adaptivity and dynamic isolation. This thesis proposes Destination-Based Adaptive Routing (DBAR). By leveraging a low-cost congestion propagation network, DBAR uses both local and non-local network status to avoid congestion efficiently. More importantly, by integrating destination information into the output port selection procedure, DBAR dynamically isolates multiple concurrent applications. DBAR offers better performance than the best baseline algorithm for many measured configurations.

2. Efficient design of fully adaptive routing algorithm for cache coherent traffic

Due to area and power consumption limitations, cache-coherent NoCs are generally configured with a small number of virtual channels (VCs).
Limited VCs pose several challenges to the design of fully adaptive routing algorithms. Previous deadlock avoidance theories require a conservative VC re-allocation scheme, which strongly limits the performance of routing algorithms. This thesis proposes a novel VC re-allocation scheme, whole packet forwarding (WPF). We prove that WPF does not induce deadlock; WPF is thus an important extension of previous deadlock avoidance theories. WPF can greatly improve the performance of fully adaptive routing algorithms. To utilize WPF efficiently in VC-limited networks, we design a novel fully adaptive routing algorithm that maintains packet adaptivity without significant hardware cost.

3. Flit bubble flow control for torus cache coherent NoCs

Short and long packets commonly co-exist in cache-coherent networks-on-chip (NoCs). Existing deadlock avoidance designs for torus networks do not handle this mix of packet sizes efficiently. These previous designs either leverage two VCs or regard each packet as a maximum-length packet. We propose a novel deadlock avoidance theory, flit bubble flow control (FBFC). The insight of FBFC is that maintaining one free flit-size buffer slot inside a ring can avoid deadlock for wormhole torus networks. Only one VC is required. FBFC does not treat short packets as long ones, which yields high buffer utilization. Based on this theory, we present two implementations, and both show large performance improvements over previous designs.

4. Efficient support of collective communication in cache coherence protocols

Cache coherence protocols use reduction and multicast communication. Hardware support is necessary to prevent these collective communications from becoming a system bottleneck. This research explores support for reduction and multicast communication operations in a directory-based cache coherence protocol. It makes two primary contributions: an efficient framework to support the reduction of ACK packets, and a novel Balanced, Adaptive Multicast (BAM) routing algorithm.
By combining ACK packets during transmission, this framework not only reduces packet latency but also improves the network saturation throughput with little overhead. The balanced buffer resource configuration of BAM yields some additional saturation throughput improvement.

In summary, this thesis aims at optimizing the design of NoCs for cache coherence protocols. Based on an analysis of the characteristics of coherent traffic, we optimize the design of routing algorithms and flow control mechanisms for NoCs. The proposed optimizations not only improve performance but also extend deadlock avoidance theories. Thus, this thesis has both engineering value and theoretical significance.
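
The flit bubble insight stated in contribution 3 (keep one free flit-size buffer slot inside a ring) can be illustrated with a toy model. The sketch below is not the thesis's implementation: the `Ring` class, its fields, and the global admission rule "free slots must be at least packet length plus one" are simplifying assumptions made only for illustration. The abstract does not describe how the two actual FBFC implementations track the free slot.

```python
from dataclasses import dataclass


@dataclass
class Ring:
    """Toy model of one unidirectional torus ring (hypothetical, single VC)."""
    total_slots: int     # flit-size buffer slots summed over the whole ring
    used_slots: int = 0  # slots currently occupied by flits

    @property
    def free_slots(self) -> int:
        return self.total_slots - self.used_slots

    def can_inject(self, packet_flits: int) -> bool:
        # Assumed rule: the packet must fit AND one flit-size slot
        # (the "bubble") must remain free after the injection.
        return self.free_slots >= packet_flits + 1

    def inject(self, packet_flits: int) -> bool:
        """Inject a packet if the bubble is preserved; report success."""
        if not self.can_inject(packet_flits):
            return False
        self.used_slots += packet_flits
        return True

    def eject(self, packet_flits: int) -> None:
        """Remove a packet's flits from the ring (never below zero)."""
        self.used_slots -= min(packet_flits, self.used_slots)


ring = Ring(total_slots=8)
long_ok = ring.inject(5)         # long packet: 8 free >= 5 + 1
blocked = ring.can_inject(3)     # 3 free < 3 + 1, would consume the bubble
short_ok = ring.inject(1)        # short packet needs only 2 free slots
```

In this model a one-flit packet is admitted as soon as two slots are free, rather than being charged for a maximum-length packet, which mirrors the buffer utilization argument made for FBFC above.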
Keywords/Search Tags: Networks-on-Chip, Cache Coherence Protocol, Workload Consolidation, Fully Adaptive Routing, VC Re-allocation, Flit Bubble Flow Control, Reduction Communication, Multicast Communication