Research On Cross-Layer Design And Optimization Of Fault-Tolerant Network-on-Chip In Deep Submicron

Posted on: 2018-07-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J S Wang
Full Text: PDF
GTID: 1318330542477537
Subject: Communication and Information System
Abstract/Summary:
With advances in technology and design methodology, a large number of components and functions have been integrated into a single chip. Integrated circuits have entered the era of multi-core and many-core Systems-on-Chip (SoCs). Such a large number of IP cores puts great pressure on the on-chip interconnection architecture. Buses have become the bottleneck of system performance because of their limited bandwidth and scalability. Networks-on-Chip (NoCs), by contrast, have been widely adopted in SoCs because they offer higher bandwidth and better scalability.

In the deep submicron era, the reliability of integrated circuits is a serious concern. Since building perfect components is not feasible, designers attempt to build reliable systems from imperfect components. Because the NoC is an important part of an SoC, its reliability directly affects the performance and reliability of the whole system. A NoC failure disrupts data exchange between IP cores, so they cannot obtain the necessary data correctly and in time. As a result, the execution of applications is interrupted, and the performance and reliability of the SoC are reduced. Fault-tolerant NoC design has therefore always been a challenging issue in this domain. A fault-tolerant design requires additional circuits, which introduce area and power overhead. These circuits are as prone to failure as the circuits they protect. Reducing the overhead and the failures introduced by fault-tolerant designs has thus become a new trend in fault-tolerant NoC research. This thesis studies the fault-tolerant design of NoCs under deep submicron technology, aiming to reduce the area overhead and limit the performance loss. Its contributions fall into three parts:

1) This thesis proposes a design model for cross-layer fault-tolerant designs. Based on a comprehensive study and analysis of state-of-the-art fault diagnosis and recovery methods, this thesis observes that these methods differ largely in their protected targets and failure types. Integrating fault-tolerant methods from different layers could therefore increase the reliability of NoCs. The design model solves the problems of exchanging failure information, controlling and configuring the fault diagnosis and recovery methods, and scheduling them. The final design model protects all the components in NoCs and covers the typical fault diagnosis and recovery methods. Based on the requirements of actual scenarios, the design model can generate reliable fault-tolerant designs by removing unnecessary and high-overhead fault-tolerant methods. Finally, this thesis applies the proposed fault-tolerant design model to link failures. Simulation results show that the design model can generate reasonable designs that improve the reliability of NoCs.

2) This thesis explores the design space of Error Correcting Codes (ECC) in NoCs, taking into account the failures introduced by the ECC units themselves. In NoCs, the ECC units are located in the data paths and correct faulty information according to the protection strategy. First, the thesis presents a reliability model of data paths with ECC. It then proposes a method to calculate the reliability of routing information and payload information passing through a data path with ECC units. The method yields the correct-delivery rate of routing and payload information based on the failure modes of routers, links and ECC units. Finally, this thesis calculates the delivery rates of routing information and payload information under a typical scenario. The theoretical calculation shows that increasing the number of ECC units in data paths does not keep improving reliability. This thesis suggests that different placement strategies should be applied to routing information and payload information: routing information should be protected by Switch-to-Switch ECC, while payload information is better protected by End-to-End ECC. An analysis of state-of-the-art designs supports this conclusion.

3) This thesis proposes a testing strategy, ESYTest, which significantly reduces the influence of Built-In Self-Test (BIST) on system performance. BIST can test a circuit completely and in a fine-grained manner. However, BIST has to separate the circuit under test from the NoC, so the integrity of the network is affected and performance degrades. ESYTest applies different strategies to the data path and the control path. To test the data path, test packets carrying test vectors are injected into the free slots of the data path. While the control path is being tested, the data paths are fixed so that data packets can still be delivered through the routers under test with the help of a fault-tolerant routing algorithm. ESYTest therefore guarantees that all IP cores can still access the NoC, which preserves the computation capacity. At the same time, by optimizing the router architecture, the fault-tolerant routing algorithm and the testing sequence, ESYTest ensures that the testing procedure does not greatly affect network performance. Simulations show that ESYTest has a negligible influence on SoC performance, so the testing frequency can be increased significantly to improve the reliability of NoCs and SoCs.
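
The observation in contribution 2) that adding ECC units to a data path does not keep improving reliability can be illustrated with a toy calculation. The sketch below is not the reliability model of the thesis; the function name, the single-error-correcting assumption, the even spacing of ECC units and the probabilities used are all hypothetical choices made for illustration.

```python
def delivery_probability(hops, ecc_units, p_hop, q_ecc):
    """Probability that a word arrives intact over `hops` hops.

    Assumptions (hypothetical, for illustration only):
      * each hop introduces one bit error with probability p_hop,
        independently, and two errors never cancel each other;
      * each ECC unit corrects at most one accumulated bit error;
      * each ECC unit itself fails with probability q_ecc, and a failed
        unit corrupts the word beyond repair;
      * ECC units are evenly spaced, the last one at the destination.
    """
    if ecc_units < 1 or hops % ecc_units != 0:
        raise ValueError("ecc_units must be a positive divisor of hops")
    m = hops // ecc_units  # hops per protected segment
    # A segment survives if it accumulates zero or exactly one bit error ...
    segment_ok = (1 - p_hop) ** m + m * p_hop * (1 - p_hop) ** (m - 1)
    # ... and the ECC unit that closes the segment works correctly.
    return (segment_ok * (1 - q_ecc)) ** ecc_units


if __name__ == "__main__":
    # ecc_units = 1 is end-to-end protection; ecc_units = 8 is one unit per hop.
    for k in (1, 2, 4, 8):
        print(k, delivery_probability(8, k, p_hop=0.01, q_ecc=0.001))
```

Under these assumed numbers, going from one ECC unit to two raises the delivery probability, but placing a unit after every hop lowers it below the end-to-end case, because each extra unit adds its own failure risk.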
Keywords/Search Tags: Network-on-Chip, Fault-tolerant Design, Cross-layer Design, Overhead, Performance Loss