
Tolerating radiation-induced transient faults in modern processors

Posted on: 2006-06-28
Degree: Ph.D.
Type: Dissertation
University: University of California, Irvine
Candidate: Li, Xiaobin
GTID: 1452390008465385
Subject: Engineering
Abstract/Summary:
As technologies continue scaling down, smaller device sizes, reduced voltage levels, and higher integrated transistor counts raise concerns about higher transient fault rates. In particular, radiation-induced transient faults, a.k.a. Single-Event Upsets (SEUs), are predicted to become increasingly significant in the near future. To handle these inevitable errors, we must integrate fault-tolerant features into our designs so that microprocessors can continue to perform their specified tasks correctly despite the occurrence of errors.

The main goal of our research is to develop architecture mechanisms that protect microprocessors against radiation particle strikes. To that end, we first work bottom-up through the microprocessor design abstraction levels to examine the associated SEU issues, and conclude that the Soft Error Rate (SER) of microprocessors depends not only on the particle flux and energy but also on the design abstraction levels: program, instruction set architecture, microarchitecture, VLSI, and fabrication technology. Consequently, we adopt functional fault models of microprocessors, which consist mainly of two aspects: (i) control flow errors, i.e., violations of the program's instruction sequencing; (ii) data errors, i.e., incorrect data computations. Based on the adopted fault models, we develop two architecture mechanisms that target the identified failure behaviors: (i) assigned-signature control flow checking; (ii) redundant execution on a Simultaneous Multi-Threading (SMT) platform. We present the developed control flow checking algorithm and its cost-effective implementation. We also address the design trade-offs and deadlock issues associated with redundant execution on the SMT platform.
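To make the idea of assigned-signature control flow checking concrete, the following is a minimal software sketch of the general technique, not the algorithm developed in the dissertation. It assumes a hypothetical four-block control flow graph (`CFG`), arbitrary integer signatures, and a hypothetical `check_trace` helper; a real implementation would typically embed the check in hardware or in instrumented code.

```python
# Hypothetical control flow graph: basic block -> set of legal successor blocks.
# The block names and edges are illustrative assumptions only.
CFG = {
    "entry": {"loop"},
    "loop": {"body", "exit"},
    "body": {"loop"},
    "exit": set(),
}

# Assign every basic block a unique signature (arbitrary distinct integers).
SIGNATURES = {name: i + 1 for i, name in enumerate(CFG)}

# For each block, precompute the signatures of its legal predecessors.
LEGAL_PREDECESSORS = {name: set() for name in CFG}
for src, successors in CFG.items():
    for dst in successors:
        LEGAL_PREDECESSORS[dst].add(SIGNATURES[src])


class ControlFlowError(Exception):
    """Raised when an executed transition is not an edge of the CFG."""


def check_trace(trace, start="entry"):
    """Replay a sequence of executed blocks and check every transition.

    A run-time signature register holds the signature of the block that
    executed last; on entry to a new block it must match one of that
    block's legal predecessors.
    """
    if not trace or trace[0] != start:
        raise ControlFlowError("trace does not begin at the start block")
    sig_register = SIGNATURES[trace[0]]
    for block in trace[1:]:
        if block not in SIGNATURES or sig_register not in LEGAL_PREDECESSORS[block]:
            raise ControlFlowError(f"illegal transition into {block!r}")
        sig_register = SIGNATURES[block]
    return True
```

In this sketch, a fault that diverts execution from `entry` directly into `body` is detected because the signature of `entry` is not among the legal predecessors of `body`. Errors that follow a legal but wrong edge (for example, a corrupted branch condition) pass this check and would need a data-oriented mechanism such as redundant execution.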
Keywords/Search Tags: Transient faults