Fast low-cost failure recovery for real-time communication in multi-hop networks

Posted on: 1999-09-03
Degree: Ph.D.
Type: Dissertation
University: University of Michigan
Candidate: Han, Seungjae
Full Text: PDF
GTID: 1468390014473298
Subject: Computer Science
Abstract/Summary:
Best-effort communication is inadequate for QoS-sensitive applications (such as multimedia), since such applications require bounded message delay and predictable throughput. Instead, real-time communication, which can provide QoS guarantees through resource reservation, has been actively researched. As the application domain of real-time communication expands to include business- or mission-critical applications, network dependability becomes essential.

This dissertation addresses how to make real-time communication dependable. We have developed an integrated scheme for restoring real-time connections after network component failures. Because applications with different dependability requirements share the same network, the dependability level and its associated cost should be chosen flexibly, depending on the criticality of each application. Our scheme is based on five key design principles: per-connection dependability guarantee, fast failure recovery, small fault-tolerance overhead, robust failure handling, and high interoperability and scalability.

To quickly restore failed connections, cold-standby backup channels are set up in advance along with each primary channel. Upon failure of a primary channel, one of its backups is promoted to replace it. To minimize the resource overhead of maintaining backup channels, resources for backups are shared judiciously so that connection dependability is not compromised. By choosing the degree of resource sharing and the number of backups, the network can control the dependability of a connection in accordance with the application's request.

Our scheme covers all aspects of connection failure recovery, including backup routing, failure detection, channel switching, and resource reconfiguration after failure recovery. In particular, we develop two behavior-based failure-detection schemes that do not require any special hardware support, and experimentally evaluate their effectiveness using a testbed implementation. We also develop a novel protocol that provides distributed and robust handling of detected failures. Good coverage in recovering from failures is shown to be achievable with low degradation in network utilization under reasonable failure conditions.

Our distributed architecture scales well, and the procedures for backup establishment, failure detection, and channel switching are independent of the underlying communication system, so our scheme is interoperable with various real-time communication schemes.
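
The idea of sharing backup resources and promoting a backup on failure can be illustrated with a minimal sketch. It is not taken from the dissertation: the `Connection` type, the link names, the bandwidth figures, and the single-link-failure assumption are all hypothetical, chosen only to show why backups whose primaries fail independently can share spare capacity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """Hypothetical record of one real-time connection (illustrative only)."""
    name: str
    bandwidth: int
    primary: tuple  # links used by the primary channel
    backup: tuple   # links used by the cold-standby backup channel


def spare_needed(connections, link):
    """Spare bandwidth to reserve on `link`, ASSUMING at most one link fails.

    Backups are activated together only if their primaries share the failed
    link, so the reservation is the worst case over single failures rather
    than the sum over all backups crossing `link`.
    """
    demand = defaultdict(int)  # failed link -> backup bandwidth activated on `link`
    for c in connections:
        if link in c.backup:
            for failed in c.primary:
                demand[failed] += c.bandwidth
    return max(demand.values(), default=0)


def promote_backups(connections, failed_link):
    """Route in use for each connection after `failed_link` fails."""
    return {
        c.name: c.backup if failed_link in c.primary else c.primary
        for c in connections
    }


# Hypothetical topology: three connections whose backups all cross link "d-c".
conns = [
    Connection("A", 5, ("a-b", "b-c"), ("a-d", "d-c")),
    Connection("B", 3, ("e-f",), ("e-d", "d-c")),
    Connection("C", 4, ("b-c",), ("b-d", "d-c")),
]

shared = spare_needed(conns, "d-c")                            # worst single failure
unshared = sum(c.bandwidth for c in conns if "d-c" in c.backup)  # no sharing at all
routes = promote_backups(conns, "b-c")
```

In this example the worst single failure is link "b-c", which activates the backups of A and C at once, so the shared reservation on "d-c" is smaller than the sum of all three backup bandwidths. Tolerating more simultaneous failures, or using several backups per connection, would require less aggressive sharing, which is the dependability/cost trade-off the abstract describes.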
Keywords/Search Tags: Communication, Failure, Network, Applications, Channel, Scheme