
Incorporating application-transparent node-crash tolerance to a soft real-time self-planned agent framework

Posted on: 2006-05-28
Degree: M.Comp.Sc
Type: Thesis
University: Concordia University (Canada)
Candidate: Madhavan, Ramprasad
Full Text: PDF
GTID: 2458390005993712
Subject: Computer Science
Abstract/Summary:
Besides correctness and timeliness, fault tolerance is essential to any soft real-time distributed system. Traditionally, system designers are required to consider both real-time and fault-tolerance requirements while building real-time applications, which is a complex task. Fault tolerance has been well researched for general distributed systems, but significantly less work has been done on fault tolerance in soft real-time systems. This thesis focuses on achieving application-transparent fault tolerance in a soft real-time system framework and addresses the issue of redundancy management in the presence of deadlines. Specifically, it incorporates application-transparent node-crash tolerance into a soft real-time self-planned agent framework (SPAF). A SPAF application is decomposed into several missions, and each mission is completed by successfully completing multi-agent tasks through a sequence of phases. Each task in a mission can have many solutions, and the choice of solution depends on the remaining time and the available resources. Fault tolerance is achieved by using the conventional primary-backup approach in conjunction with the dynamic task planning feature of SPAF. A cold backup is used for application recovery and a hot backup for system recovery after a node crash. The model and the design of the fault-tolerance solution are presented in detail. The functionality of the design is illustrated through an implementation, and its efficiency through simulations using a custom-built application. The test results are encouraging: application performance remains almost the same after the fault-tolerance mechanisms are included.
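The statement that the choice of solution depends on the remaining time and the available resources can be illustrated with a minimal sketch. This is not SPAF's planner: the `Solution` fields (`est_time`, `nodes_needed`, `quality`) and the `choose_solution` function are hypothetical names introduced only for illustration, and the selection rule (best quality among feasible candidates) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Solution:
    """One candidate way to complete a task (all fields are hypothetical)."""
    name: str
    est_time: float    # estimated execution time, in seconds
    nodes_needed: int  # number of nodes the solution requires
    quality: float     # higher is better


def choose_solution(
    solutions: Sequence[Solution],
    remaining_time: float,
    available_nodes: int,
) -> Optional[Solution]:
    """Return the best-quality solution that fits the time and node budget.

    Returns None when no candidate is feasible, i.e. the deadline
    cannot be met with the resources that are left.
    """
    feasible = [
        s for s in solutions
        if s.est_time <= remaining_time and s.nodes_needed <= available_nodes
    ]
    return max(feasible, key=lambda s: s.quality, default=None)
```

Under these assumptions, a node crash can be modelled as calling `choose_solution` again with a smaller `available_nodes` and a shorter `remaining_time`, so a cheaper solution may be selected after recovery.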
Keywords/Search Tags:Tolerance, Soft real-time, Application, System