
Instruction Level Fault-Tolerant Mechanism Design Based On The Time Redundancy

Posted on: 2016-01-08 | Degree: Master | Type: Thesis
Country: China | Candidate: X Z Lv | Full Text: PDF
GTID: 2308330461992676 | Subject: Computer Science and Technology
Abstract/Summary:
As semiconductor manufacturing technology advances, both transistor size and chip supply voltage keep decreasing, which makes microprocessors more susceptible to soft errors. Transient faults are usually caused by energetic particles present in space, or by secondary particles such as alpha particles, which are produced when neutrons interact with materials at ground level. Consequently, we use fault-tolerant (redundant) execution, and schedule ALUs by a priority determined by the degree to which instructions are related, to enhance reliability against soft errors. These techniques provide soft error detection and recovery, while also improving the performance of error recovery and reducing the power consumption of all operations.

Because time redundancy has a lower implementation overhead than hardware redundancy, in terms of both performance cost and design complexity, time redundancy is the better choice. As electronic terminal devices become ubiquitous, improving system performance through redundant execution and prediction techniques has both theoretical significance and wide practical value. The design and implementation of redundancy and prediction, and the use of probability and statistics to estimate the transient error probability so as to reduce unnecessary redundancy and improve prediction accuracy, are the core of such solutions and a current research hot spot.

The design presented in this thesis belongs to the time-redundancy category. It performs better than traditional double instruction execution in both power consumption and instructions executed per cycle.

First, we propose the reliable station (RS), which is based on the structure of the reservation station, to improve the recovery efficiency of traditional double execution.
In this way, we guarantee the reliability of the computation while keeping the system performance loss lower than that of traditional double execution. The role of the RS is to detect errors and recover the correct state once a soft error occurs. It aims to ensure the reliability of computing systems by buffering the dual results of previously executed instructions. When the first result of an instruction is produced, it is stored in the RS along with other required information. Verification is performed when the redundant result is produced. If the two results are equal, the instruction is retired; otherwise it is marked and issued back to the pipeline. Finally, the record is released and marked as available.

Second, we propose a fault-tolerance measure for errors that occur during operand transfer. Such a fault arises while source operands are propagated from the registers to the issue window, where the stored operands are then used by both the primary and the duplicate instruction. Although the two copies receive the same operands, the data has already been corrupted by a single-event upset, so the incorrect result goes undetected. We present an enhancement that provides additional redundancy for these paths. The principle is similar to double execution: with a redundant path, the broadcast tags and data can be compared with those on the original path after the last entry has received them. For example, with a duplicate path, the first arrival of an operand is stored temporarily until the second arrival completes. This requires one additional comparison at the end of the broadcast.

Last, we propose an ALU scheduling algorithm based on the depth of instruction dependence; that is, the degree of dependence serves as the priority for ALU allocation. The higher an instruction's degree of dependence, the earlier it is allocated an ALU.
Because the dependence on different instructions in a sequence differs, the degree of dependence is counted for each instruction, and at issue time the execution units are scheduled according to these weights. The priorities are compared if and only if the instructions need the same type of ALU.
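The reliable station check described in the abstract (buffer the first result, compare it with the redundant result, then retire or reissue and free the record) can be illustrated with a small software model. This is only a sketch under stated assumptions: the class and method names, the integer `tag` used to identify an instruction, and the fixed `capacity` are hypothetical and do not come from the thesis, which describes a hardware structure.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class RSEntry:
    """One RS record: the buffered first result of an instruction (hypothetical layout)."""
    tag: int
    first_result: int


class ReliableStation:
    """Toy model of the RS: buffers first results and verifies redundant ones."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: Dict[int, RSEntry] = {}

    def record_first(self, tag: int, result: int) -> bool:
        """Buffer the first result; return False if no record is available."""
        if len(self.entries) >= self.capacity:
            return False
        self.entries[tag] = RSEntry(tag, result)
        return True

    def verify(self, tag: int, redundant_result: int) -> str:
        """Compare the redundant result with the buffered one, then free the record."""
        entry = self.entries.pop(tag)  # the record is released in either case
        if entry.first_result == redundant_result:
            return "retire"
        return "reissue"  # mismatch: the instruction goes back to the pipeline


rs = ReliableStation(capacity=4)
rs.record_first(tag=1, result=42)
print(rs.verify(tag=1, redundant_result=42))         # prints "retire"
rs.record_first(tag=2, result=7)
print(rs.verify(tag=2, redundant_result=7 ^ 0b100))  # simulated bit flip, prints "reissue"
```

The model frees the record on both outcomes, matching the statement that the record is released and marked as available after verification; how the reissued instruction is re-tracked is left out of this sketch.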
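The dependence-based ALU allocation described above can be sketched as follows. The abstract does not define how the degree of dependence is computed, so this example assumes it is the number of instructions in the window that directly consume an instruction's result. The `Instr` type, the ALU type labels `"int"` and `"fp"`, and the function names are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, NamedTuple, Tuple


class Instr(NamedTuple):
    name: str
    alu_type: str             # hypothetical labels such as "int" or "fp"
    sources: Tuple[str, ...]  # names of instructions whose results it consumes


def dependence_degree(window: List[Instr]) -> Counter:
    """Count how many instructions in the window consume each instruction's result."""
    degree: Counter = Counter()
    for instr in window:
        for src in instr.sources:
            degree[src] += 1
    return degree


def allocate(ready: List[Instr], degree: Counter, free_alus: Dict[str, int]) -> List[str]:
    """Choose instructions to issue; priorities are compared only within one ALU type."""
    by_type = defaultdict(list)
    for instr in ready:
        by_type[instr.alu_type].append(instr)
    issued: List[str] = []
    for alu_type, candidates in by_type.items():
        # Highest degree first; the sort is stable, so ties keep their input order.
        ranked = sorted(candidates, key=lambda i: -degree[i.name])
        issued.extend(i.name for i in ranked[: free_alus.get(alu_type, 0)])
    return issued


window = [
    Instr("i1", "int", ()),
    Instr("i2", "int", ()),
    Instr("i3", "fp", ("i2",)),
    Instr("i4", "int", ("i2",)),
    Instr("i5", "int", ("i1", "i4")),
]
degree = dependence_degree(window)
ready = [window[0], window[1]]               # i1 and i2 have no pending sources
print(allocate(ready, degree, {"int": 1}))   # prints ['i2']
```

Here `i2` has two consumers and `i1` has one, so with a single free integer ALU `i2` is issued first. A transitive (depth-based) weight would be an equally plausible reading of the abstract and would only change `dependence_degree`.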
Keywords/Search Tags: soft error, time tolerance, double instruction execution, reliability, station, ALU scheduling