
The Design Of Autonomous Opportunistic Protection Mechanism In Neural Network Accelerator Architecture

Posted on: 2022-02-13
Degree: Master
Type: Thesis
Country: China
Candidate: B Dong
Full Text: PDF
GTID: 2518306602466544
Subject: Master of Engineering

Abstract/Summary:
Reliability plays a central role in deep sub-micron and nano-scale IC manufacturing. Permanent and transient faults, device aging, and thermal issues have prevented chips with near-atomic feature sizes from reaching their full performance potential. In recent years, artificial neural networks have proven highly successful in machine-learning applications. However, recent studies have shown that the inherent fault tolerance of neural networks is quite limited. Moreover, the computing fabric of neural network engines, which consists of large arrays of processing elements (PEs), has been growing dramatically to accommodate the huge size and heterogeneity of rapidly evolving neural network algorithms, and it is commonly observed that zero-valued activations reduce PE utilization. Reliability research on neural network accelerators is therefore of great importance.

Motivated by this state of research, this thesis focuses on fault-tolerant design mechanisms for the large-scale computing arrays of neural network accelerators, targeting their low reliability. It presents a novel, low-cost approach, named opportunistic redundancy, that enhances the reliability of generic neural network accelerators by opportunistically exploiting the chances of runtime redundancy offered by adjacent PEs. To exploit computing parallelism, mainstream hardware accelerators generally process data streams on a two-dimensional array of PEs. In contrast to redundant replication of physical resources, the opportunistic redundancy protection mechanism makes full use of how activations propagate through the array's pipeline and of the accelerator's parallel design. The thesis therefore proposes three effective protection strategies: two adjacent PEs with the same input activation dynamically form a mutual-verification pair; a PE whose input activation is zero (and thus ignorable) forms a directed-verification pair with its neighboring PE; and two adjacent PEs whose activations are both zero are self-isolated from faulty MAC operations. Compared with traditional redundancy techniques, the proposed approach introduces no additional computing resources, which greatly reduces implementation overhead while significantly improving the level of protection.

To evaluate the feasibility and effectiveness of the proposed fault-tolerant design, this thesis carries out the following work. In the modeling phase, the mainstream neural network framework Darknet is used to simulate the parallel computing architecture of a neural network accelerator; by inspecting adjacent activation streams at runtime, the opportunities for redundancy protection are explored. In the platform-building phase, an FPGA-based fault-injection simulation platform is proposed and a corresponding software tool chain is customized; compared with software-level simulated error injection, this platform achieves more accurate error evaluation. In the verification phase, mainstream neural network models such as ResNet and YOLO are executed on the platform and faults are injected under both protected and unprotected conditions; the results show that reliability is indeed enhanced. In the physical-estimation phase, the extended architecture is fully implemented in Verilog and synthesized with the Synopsys Design Compiler; compared with the baseline architecture, the design introduces only 3.55% area overhead. Overall, the proposed fault-tolerant mechanism delivers strong performance while achieving low overhead and high robustness.
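The three pairing rules can be illustrated with a small sketch. The function below is purely hypothetical (the names `classify_pair` and the mode strings are not from the thesis); it only shows, under the abstract's description, how a pair of adjacent PEs might be assigned one of the three protection modes based on their input activations:

```python
def classify_pair(act_a, act_b):
    """Illustrative decision logic for two adjacent PEs, following the
    three opportunistic-redundancy strategies described in the abstract.

    act_a, act_b: the input activations of the two neighboring PEs.
    Returns the assumed protection mode as a string.
    """
    if act_a == 0 and act_b == 0:
        # Both activations are zero: the pair self-isolates, so a
        # faulty MAC unit cannot corrupt any output.
        return "self-isolate"
    if act_a == 0 or act_b == 0:
        # Exactly one activation is zero: the otherwise-idle PE
        # re-executes its neighbor's MAC as a directed check.
        return "directed-verification"
    if act_a == act_b:
        # Identical non-zero activations: both PEs compute the same
        # product and compare results (mutual verification).
        return "mutual-verification"
    # Different non-zero activations: no redundancy opportunity arises.
    return "unprotected"
```

This is only a behavioral model of the pairing decision; in the actual architecture the choice is made at runtime by detecting the activation streams flowing through the PE array's pipeline.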
Keywords/Search Tags:Fault tolerance, neural network accelerator, opportunistic redundancy, FPGA