
Modeling And Mitigation Of Microprocessor Soft Error Vulnerability

Posted on: 2017-05-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Tang
GTID: 1108330503992401
Subject: Computer application technology
Abstract/Summary:
Soft errors are a major reliability barrier for high-performance processor designs. The vulnerability of such systems to soft errors grows exponentially with technology scaling. To meet reliability requirements in a cost-effective way, it is critical to model soft error vulnerability in the early design stages.
It has been observed that a large part of the raw errors that occur at the device and circuit levels may be masked at the architecture level. The architectural vulnerability factor (AVF) is the probability that a soft error ultimately produces a visible error in the program output. It is an important reliability metric that quantifies the soft error masking effect at the architectural level. Modeling soft error vulnerability may help to improve the accuracy of reliability prediction; achieve a better trade-off among performance, power, area and reliability; and develop cost-effective soft error mitigation techniques early in the design cycle.
Unfortunately, the relationship of program and microarchitecture parameters to AVF has not been sufficiently studied. A growing number of researchers recognize that both the machine and the program affect AVF. The instruction occupancy of a structure is the critical metric for modeling AVF, but the existing literature does not show precisely how a change in structure size or other parameters affects instruction occupancy. Moreover, low-overhead soft error mitigation that exploits the masking effect needs further research.
To address these issues, this dissertation covers two aspects. First, the AVF of pipeline structures is modeled by tracking instruction occupancy cycle by cycle; the relationship among program characteristics, microarchitecture parameters and AVF is analyzed; and the relationship between instruction occupancy and structure size is explored.
Second, two techniques are proposed to mitigate soft errors, based on dynamically resizing the IQ and on adjusting the instruction flow mix.
This dissertation makes the following contributions:
1. An analytical model to describe and analyze the AVF of a microprocessor structure. Compared with existing analytical models, the proposed model suits more general processors and yields a lower prediction error. It considers the constraints the machine places on instruction-level parallelism, captures the overlap of miss events in the pipeline via cycle-by-cycle tracking, and accounts for contention for functional units. Moreover, it combines more program characteristics and microarchitecture parameters for a more precise analysis of their relationship with AVF.
2. A piecewise curve to represent the relationship between program characteristics and microarchitecture parameters. We demonstrate that several parameter pairs affect AVF during the front-end, issue/execution and retire stages of the pipeline. The effect of a variable microarchitecture parameter on AVF is bounded. This bound is computable and depends on the program's instruction-level parallelism.
3. The relationship among a structure's size, its instruction occupancy and its AVF. Based on the proposed analytical model, we use a logistic function to analyze the relationship between instruction occupancy and structure size. Theoretical derivation and detailed simulation show that this relationship follows a logistic curve.
4. A method for mitigating soft errors by resizing the IQ. At compile time, the dynamic IQ size is determined by analyzing the critical path of each basic block. An instruction that sets the IQ size is then inserted before the basic block. Soft errors are mitigated as instruction occupancy is reduced.
This technique is easy to implement, requiring almost no modification of the underlying circuit and adding almost no complexity to the system. Experimental results show an average AVF reduction of 19.7%, with the average performance loss limited to 5.7%.
5. A technique for controlling IQ AVF. It assesses and improves the match between the instruction mix and the configuration of the functional units, which reduces the stall time caused by large mismatches between the two and thereby lowers AVF. Further optimizations are made for long-latency instructions and long data chains. Experiments show a best-case AVF reduction of 6.6%, with the effect on performance kept under 2%. The technique effectively improves the balance between reliability and performance. It works by adjusting instruction fetch at the front end and is easy to implement with little hardware overhead.
This dissertation studies the modeling and mitigation of microprocessor soft error vulnerability and proposes practical methods for reliability-aware microprocessor architecture design.
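As a rough illustration of two ideas in the abstract, the sketch below estimates the AVF of a structure from a cycle-by-cycle occupancy trace and evaluates a generic logistic curve of occupancy against structure size. It is not the dissertation's model. The trace format, the function names, and the parameters `saturation`, `rate` and `midpoint` are assumptions made for illustration, and the AVF estimate assumes every entry has the same width and that only entries holding state needed for correct execution are counted.

```python
import math


def avf_from_occupancy(ace_entries_per_cycle, num_entries):
    """Average fraction of a structure's entries holding vulnerable state.

    ace_entries_per_cycle[i] is the number of entries that hold state
    required for correct execution in cycle i (hypothetical trace format).
    """
    if num_entries <= 0:
        raise ValueError("num_entries must be positive")
    if not ace_entries_per_cycle:
        raise ValueError("trace must cover at least one cycle")
    cycles = len(ace_entries_per_cycle)
    return sum(ace_entries_per_cycle) / (num_entries * cycles)


def logistic_occupancy(size, saturation, rate, midpoint):
    """Generic logistic curve: occupancy rises with size, then saturates.

    saturation, rate and midpoint are illustrative fitting parameters,
    not values taken from the dissertation.
    """
    return saturation / (1.0 + math.exp(-rate * (size - midpoint)))


# Four-cycle trace of an 8-entry structure: (2 + 4 + 4 + 2) / (8 * 4).
print(avf_from_occupancy([2, 4, 4, 2], 8))  # 0.375
```

In this sketch, shrinking the occupancy trace (for example, by limiting how many entries may be filled) lowers the estimate directly, which is the intuition behind mitigation by reducing instruction occupancy.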
Keywords/Search Tags: microprocessor reliability, architectural vulnerability factor, analytical model, instruction occupancy, soft error mitigation