
The Research On Improving The Fault Tolerance Capability Of Programs In Radiation Environment

Posted on: 2011-08-28
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Xiong
GTID: 2178360308985687
Subject: Computer Science and Technology
Abstract/Summary:
Onboard computers are the core equipment of a satellite and the basic platform for information processing in space, so studying their reliability is of strategic importance. Ensuring the reliability of an onboard computer in space remains a challenge in the face of transient hardware faults caused by various kinds of radiation. Improving reliability with radiation-hardened components brings high cost, high power consumption and low performance, which makes that approach unsuitable for onboard computers with a limited power budget. Recent research indicates that software-implemented tolerance of transient hardware faults on COTS (commercial off-the-shelf) components can provide space computers with high reliability, high performance, low cost and low power dissipation.

Based on an analysis of the advantages and disadvantages of classical fault-tolerant computing techniques, this thesis proposes a method for evaluating the fault tolerance capability of a program. The method measures that capability through a reliability analysis of registers in different program structures. Building on this method, FTCO (Fault Tolerance by Compiler Optimization) is presented as a technique for improving the fault tolerance capability of programs. Compared with redundancy-based software fault tolerance, FTCO improves fault tolerance capability effectively while its overhead stays the same as that of the original program. Finally, PPRF (Fault Tolerance by Partial Protected Register Files) is described. PPRF gives a quantitative measurement of a key indicator for each register, using instruction classification and a propagation model of data flows and errors. The key indicator is combined with the number of operations on each register to assign a weight to every register, and this weight provides the basis for register reallocation. Compared with hardware protection of all registers, PPRF not only protects the system against transient faults in a more targeted way but also reduces power consumption. In this way, PPRF makes more efficient use of the power spent on program fault tolerance.

As the experiments show, FTCO improves program fault tolerance capability by 7.14%-10.7% without any additional redundancy cost over the original program, and PPRF improves fault tolerance capability by 11.3%-60.05% on the basis of a 10%-100% improvement in power.
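
The register weighting step that PPRF relies on can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the names `RegisterProfile`, `register_weight` and `select_protected` are hypothetical, the key indicator is treated as a score between 0 and 1, the sample values are invented, and the "combination" of key indicator and operation count is modelled as a simple product because the abstract does not state the actual formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisterProfile:
    """Per-register data assumed to come from a prior program analysis."""
    name: str
    key_indicator: float  # hypothetical score in [0, 1]; higher = more critical
    op_count: int         # number of operations that touch this register


def register_weight(profile: RegisterProfile) -> float:
    # ASSUMPTION: the abstract only says the two quantities are "combined";
    # a product is one simple choice, not the thesis's formula.
    return profile.key_indicator * profile.op_count


def select_protected(profiles, protected_slots):
    """Return the names of the registers with the highest weights.

    These are the candidates that register reallocation would map onto
    the hardware-protected part of the register file.
    """
    # Sort by descending weight; break ties by name for a stable result.
    ranked = sorted(profiles, key=lambda p: (-register_weight(p), p.name))
    return [p.name for p in ranked[:protected_slots]]


# Invented sample data, for illustration only.
profiles = [
    RegisterProfile("r0", 0.9, 40),
    RegisterProfile("r1", 0.2, 100),
    RegisterProfile("r2", 0.5, 10),
    RegisterProfile("r3", 0.7, 60),
]
protected = select_protected(profiles, protected_slots=2)
print(protected)
```

With only two protected slots, the heavily used but low-indicator register `r1` is left unprotected in this sketch, which is the kind of trade-off between protection coverage and power that a partially protected register file is meant to expose.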
Keywords/Search Tags: software fault tolerance, program fault tolerance, compiler optimization, partial protected register files