DESIGNING FAULT-TOLERANT ALGORITHMS FOR DISTRIBUTED SYSTEMS USING COMMUNICATION PRIMITIVES (BYZANTINE, AGREEMENT)

Posted on: 1987-11-10
Degree: Ph.D
Type: Dissertation
University: Cornell University
Candidate: SRIKANTH, T. K
Full Text: PDF
GTID: 1478390017458711
Subject: Computer Science
Abstract/Summary:
Fault tolerance is an important requirement in distributed computing systems. However, designing applications for distributed systems is a difficult task, particularly when components of the system can fail. The difficulty of this task increases with the severity of the failures encountered. Arbitrary process failures are generally much harder to overcome than restricted failures, e.g., those where processes fail only by halting. Thus, techniques that restrict the disruptive behavior of faulty processes can greatly simplify the design of fault-tolerant algorithms. Such techniques effectively provide reduction mechanisms from one class of failures to a more benign class.

Message authentication is an example of a technique that imposes restrictions on the externally visible behavior of faulty processes. This technique has been used to derive simple solutions to many fault-tolerance problems in systems with arbitrary faults. To exploit the simplicity provided by authentication, we present communication primitives that provide the properties of authentication without using digital signatures. These primitives can also be extended to provide properties beyond those of authentication, thereby further restricting the types of faults that have to be overcome.

These communication primitives lead to a general methodology for designing fault-tolerant algorithms. We first design an algorithm assuming that messages are signed. Then, replacing signed communication in this algorithm with our broadcast primitive automatically results in an equivalent non-authenticated algorithm. We illustrate this methodology by deriving new solutions to the problems of distributed agreement and clock synchronization in the presence of faults. Our solutions to the problems of Byzantine Agreement, early-stopping Byzantine Agreement, Byzantine Elections, and clock synchronization are simpler and more efficient than those previously known. Furthermore, the clock synchronization algorithm that we propose is the first one that achieves optimal accuracy with respect to real time.
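To make the idea of a signature-free broadcast primitive concrete, the following is a minimal sketch of an echo-style broadcast in Python. It is an illustration under stated assumptions, not the dissertation's own primitive: the abstract gives no thresholds or message formats, so the bound n > 3f, the echo threshold f + 1, the accept threshold 2f + 1, and the names `EchoBroadcastNode` and `simulate` are all hypothetical choices made here. Round numbers and timing are omitted.

```python
from collections import defaultdict


class EchoBroadcastNode:
    """One correct process in a hypothetical echo-based broadcast.

    Assumed thresholds (not taken from the abstract): n > 3f,
    echo after f + 1 echoes, accept after 2f + 1 echoes.
    """

    def __init__(self, node_id, n, f):
        assert n > 3 * f, "this sketch assumes n > 3f"
        self.node_id = node_id
        self.n = n
        self.f = f
        self.echoers = defaultdict(set)  # (sender, msg) -> ids that echoed it
        self.echoed = set()              # keys this node has already echoed
        self.accepted = set()            # keys this node treats as authentic

    def on_init(self, sender, msg):
        """Handle a message received directly from its claimed sender."""
        return self._echo_once((sender, msg))

    def on_echo(self, echoer, sender, msg):
        """Handle an echo; return any messages this node must broadcast."""
        key = (sender, msg)
        self.echoers[key].add(echoer)
        out = []
        # f + 1 echoes imply at least one correct process vouches for the key.
        if len(self.echoers[key]) >= self.f + 1:
            out = self._echo_once(key)
        # 2f + 1 echoes play the role of a verified signature.
        if len(self.echoers[key]) >= 2 * self.f + 1:
            self.accepted.add(key)
        return out

    def _echo_once(self, key):
        if key in self.echoed:
            return []
        self.echoed.add(key)
        return [("echo", self.node_id, key[0], key[1])]


def simulate(n, f, sender, msg, silent=()):
    """Run one broadcast by a correct sender; nodes in `silent` send nothing."""
    nodes = {i: EchoBroadcastNode(i, n, f) for i in range(n) if i not in silent}
    queue = []
    for node in nodes.values():          # sender's initial message reaches all
        queue.extend(node.on_init(sender, msg))
    while queue:                         # deliver every echo to every node
        _, echoer, s, m = queue.pop()
        for node in nodes.values():
            queue.extend(node.on_echo(echoer, s, m))
    return nodes
```

In this sketch, `simulate(4, 1, sender=0, msg="v", silent={3})` leaves every correct node with `(0, "v")` in its `accepted` set, while a single faulty process echoing a message the sender never sent cannot reach the f + 1 threshold at any correct node. That is the sense in which such a primitive can stand in for signatures: a message is accepted only when enough distinct processes vouch for it.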
Keywords/Search Tags:Algorithm, Systems, Distributed, Communication primitives, Designing, Clock synchronization, Byzantine, Agreement