Software Implemented Control Flow Error Detection For Transient Failures In On-board Computers

Posted on: 2010-03-19
Degree: Master
Type: Thesis
Country: China
Candidate: J M Li
Full Text: PDF
GTID: 2178360278957235
Subject: Computer Science and Technology
Abstract/Summary:
As space exploration activities increase, on-board computers play an increasingly important role. However, transient hardware faults have a severe impact on computers in the space environment. Traditionally, radiation-hardened devices are used for on-board computers, but they are expensive and difficult to purchase. COTS (Commercial-Off-The-Shelf) components, with their low price and high performance, are better suited to space computers. Because COTS components have limited radiation tolerance, they must be reinforced by SHIFT (Software Implemented Hardware Fault Tolerance). In general, SHIFT detects transient hardware faults by duplicating computations and comparing the results. Instructions for the redundant computation can be inserted at compile time, which achieves fault tolerance simply and effectively, so compiler-based fault tolerance is a good way to implement SHIFT. In this work, we study the design of compiler-based fault-tolerance algorithms with high fault detection coverage and high performance. Based on an in-depth analysis of the advantages and disadvantages of current fault-tolerance techniques, we propose CFCPT (Control Flow Checking by Path Tracking), a control flow checking algorithm implemented by path tracking. By tracking the path with two variables at the same time, CFCPT achieves higher performance than the traditional approach that uses a single variable. We also propose and implement an enhanced algorithm, ERSCFC, based on RSCFC (Relationship Signatures for Control Flow Checking). RSCFC is a control flow checking approach for transient hardware faults in which the maximum number of basic blocks is limited by the machine word length. Our optimized method removes this limitation through segmented encoding of the signatures, and is markedly more usable than RSCFC. Finally, based on how data faults propagate through a program, we propose an optimized strategy that reduces the number of checkpoints in EDDI (Error Detection by Duplicated Instructions).
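
The word-length limit described above can be illustrated with a minimal sketch of signature-based control flow checking. This is not the thesis's CFCPT or ERSCFC algorithm. It assumes a simple encoding in which each basic block owns one bit and each block's signature is the bitmask of its legal successors, which is one way such a limit can arise. The control flow graph `CFG` and the names `BIT`, `SUCC_MASK`, and `check_path` are hypothetical.

```python
# Hypothetical control flow graph: block name -> set of legal successors.
CFG = {
    "entry": {"loop"},
    "loop": {"body", "exit"},
    "body": {"loop"},
    "exit": set(),
}

# Assumed encoding: every basic block owns exactly one bit.
BIT = {name: 1 << i for i, name in enumerate(CFG)}

# Signature of a block: the bits of all its legal successors combined.
# The bits are distinct, so summing them is the same as OR-ing them.
SUCC_MASK = {name: sum(BIT[s] for s in succs) for name, succs in CFG.items()}


class ControlFlowError(Exception):
    """Raised when execution enters a block that is not a legal successor."""


def check_path(path):
    """Replay a sequence of block names and check every transition."""
    allowed = BIT[path[0]]  # at start, only the first block is legal
    for block in path:
        if not (allowed & BIT[block]):
            raise ControlFlowError(f"illegal jump into {block}")
        allowed = SUCC_MASK[block]  # what may legally execute next
    return True
```

Under this assumed encoding, a 32-bit signature variable can distinguish at most 32 basic blocks, which is the kind of restriction a segmented encoding would need to lift.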
To evaluate the fault-tolerance capability and performance of the proposed algorithms, we carried out simulation experiments using fault injection. The results show that CFCPT achieves an average fault detection rate of 98.8% with an average performance overhead of 60.5%, and that ERSCFC achieves an average fault detection rate of 98.7% with an average performance overhead of 59.3%.
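
A fault injection campaign of the kind described above can be sketched as follows. This is only a toy illustration under stated assumptions, not the thesis's experimental setup. The successor table `LEGAL`, the reference trace `GOLDEN_PATH`, and the single-bit-flip fault model are all hypothetical.

```python
import random

# Hypothetical target: legal successors for a four-block program.
LEGAL = {0: {1}, 1: {2, 3}, 2: {1}, 3: set()}
GOLDEN_PATH = [0, 1, 2, 1, 3]  # fault-free execution trace


def detected(path):
    """Return True if the path contains a transition the table forbids."""
    return any(b not in LEGAL.get(a, set()) for a, b in zip(path, path[1:]))


def inject(path, rng):
    """Flip one random bit (out of two) in one random block index."""
    faulty = list(path)
    pos = rng.randrange(1, len(faulty))
    faulty[pos] ^= 1 << rng.randrange(2)
    return faulty


def campaign(runs, seed=0):
    """Inject one fault per run and return the fraction that was detected."""
    rng = random.Random(seed)
    hits = sum(detected(inject(GOLDEN_PATH, rng)) for _ in range(runs))
    return hits / runs
```

Calling `campaign(1000)` returns a detection rate for this toy checker only. Some injected faults turn one legal path into another legal path and go undetected, which is why measured coverage stays below 100%. The number is not comparable to the figures reported in the abstract.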
Keywords/Search Tags: SHIFT, COTS, On-board Computer, Transient Fault, Control Flow Checking, Failure Injection