
Fault-Tolerant Methods For Permanent Failures On Network-on-Chip

Posted on: 2021-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Chen
Full Text: PDF
GTID: 2428330605480086
Subject: Information security
Abstract/Summary:
With advances in integrated circuit manufacturing and semiconductor technology, the number of Processing Elements (PEs) integrated on a single chip keeps growing. To meet the requirements of high bandwidth, low latency, and low power for communication between on-chip PEs, the Network-on-Chip (NoC) is expected to replace the bus and become the mainstream on-chip interconnect. However, as feature sizes shrink, metal interconnect technology reaches its limits, and chip scale expands, the probability of failures also increases. NoCs therefore face many performance and reliability challenges. This thesis studies the tolerance of permanent faults in NoCs. The main work and results are as follows:

(1) We propose a fault-tolerant scheme for NoC-based SoCs that addresses the isolated PEs and partitioned regions caused by permanent faults, in order to reduce wasted computational resources and improve NoC reliability. The scheme is called the Router-Shared-Pair mesh (RSPmesh). In the RSPmesh topology, each pair of neighboring PEs shares a pair of routers, and MUXs provide diverse link connections between routers. We also propose the corresponding topology reconfiguration algorithm and routing algorithm. When routers or links fail, an RSPmesh-based NoC can be reconfigured into a new 2D mesh NoC that may be smaller but is regular and fault-free, and that still serves all healthy PEs. The RSPmesh uses no spare routers; it only disables some routers, as needed, during topology reconfiguration. Evaluation and simulation results show that the proposed scheme achieves significant reliability improvements over schemes reported in the literature.

(2) We propose a fault-tolerant scheme for NoC-based SoCs that addresses the communication performance degradation caused by re-routing, in order to tolerate multiple faults while maintaining communication performance. The scheme is called Big-hop Multi-faults-tolerant Minimal routing Methods. A modified router architecture is used so that the vertical links and the horizontal links attached to a faulty router each remain connected. Rectangular Faulty-zones are also constructed, and routers separated by a Faulty-zone can connect to each other. We then introduce the concept of a Big-hop and design a routing algorithm, based on Double-Y routing, that suits the new network topology. Simulation results show that the proposed method can tolerate multiple faulty routers while maintaining communication performance.

In summary, our research focuses on reconfigurable redundant topologies and fault-tolerant routing algorithms to improve NoC reliability. On the one hand, for the isolated PEs and partitioned regions caused by permanent faults, we first design a redundant topology architecture and then propose the corresponding topology reconfiguration algorithm and routing algorithm. On the other hand, for the communication performance degradation caused by re-routing, we study a routing algorithm that tolerates multiple faults while maintaining communication performance. Evaluation and simulation results show that both studies achieve good results.
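The abstract does not say how the rectangular Faulty-zones are built. As an illustration only, the sketch below uses a common fault-block approach and is not the thesis's algorithm: each faulty router starts as a 1x1 rectangle, and rectangles that overlap or touch (including diagonally) are merged into their bounding rectangle until none touch. The names `faulty_zones`, `_touch`, and `_merge`, the coordinate convention, and the diagonal-adjacency rule are all assumptions.

```python
from typing import Iterable, List, Tuple

Coord = Tuple[int, int]            # (x, y) router position in the 2D mesh
Rect = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive


def _touch(a: Rect, b: Rect) -> bool:
    """True if two rectangles overlap or are adjacent (diagonals included)."""
    return (a[0] <= b[2] + 1 and b[0] <= a[2] + 1 and
            a[1] <= b[3] + 1 and b[1] <= a[3] + 1)


def _merge(a: Rect, b: Rect) -> Rect:
    """Bounding rectangle that covers both inputs."""
    return (min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3]))


def faulty_zones(faulty: Iterable[Coord]) -> List[Rect]:
    """Group faulty routers into disjoint, non-adjacent rectangles.

    Assumption (not from the thesis): zones are merged whenever they
    overlap or touch, so healthy routers may end up inside a zone.
    """
    zones: List[Rect] = [(x, y, x, y) for x, y in set(faulty)]
    merged = True
    while merged:                      # repeat until no pair can be merged
        merged = False
        for i in range(len(zones)):
            for j in range(i + 1, len(zones)):
                if _touch(zones[i], zones[j]):
                    zones[i] = _merge(zones[i], zones[j])
                    del zones[j]
                    merged = True
                    break
            if merged:
                break
    return sorted(zones)


if __name__ == "__main__":
    # Two diagonal neighbours form one 2x2 zone; the distant fault stays alone.
    print(faulty_zones([(1, 1), (2, 2), (5, 5)]))
```

A routing algorithm could then treat each returned rectangle as one obstacle to bypass. How the thesis's Big-hop links cross such a zone is not described in the abstract and is not modeled here.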
Keywords/Search Tags:Network-on-Chip, fault-tolerant methods, reconfigurable, redundant architecture, fault-tolerant routing, minimal routing