Research On The Real-Time Fault-Tolerant Scheduling Algorithms For Distributed Systems

Posted on:2009-09-07

Degree:Doctor

Type:Dissertation

Country:China

Candidate:W Luo

Full Text:PDF

GTID:1118360278462363

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

With the development of high speed network and high performance computers, distributed systems have been widely applied for critical real-time systems, in which correctness of systems depends not only on the results of a computation, but also on the time instant at which the results are available. If the real-time systems can not produce the correct results before deadlines, it can cause catastrophic results. As such, several fault-tolerance mechanisms must be integrated to guarantee timing constrains of real-time tasks even in presence of failures.Real-time fault-tolerant scheduling plays a key role in implementing the fault-tolerantce in real-time distributed systems. Aiming the characteristics of the modern real-time distributed systems, the real-time fault-tolerant scheduling algorithms based on homogeneous as well as heterogeneous distribted systems are designed to meat many requirements targets. The goal of this dissertation's research is to satisfy diverse performance metrics of real-time applications, including schedulability, reliability and the acceptance ratios by high-quality scheduling algorithms.Modern real-time distributed systems are always applied to the systems with strong time and spaces contraints, such as the control systems in secondary planet. Increasing the number of processors leads to more spaces and energy consumptions. Therefore, one major research interests in this thesis is to reseach the schedulability in homogeneous real-time distributed systems, that is to try to reduce the redundancies without jeopardizing the fault-tolerance abilityTraditional FTRMFF leverage both active backup copy and passive backup schemes to achieve fault-tolerance and adopt the first-fit strategy to try to reduce the worst case response time(referrrd as WCRT), which make as many backup copies as possible to be executed as passive wasy in order to reduce the system recundancies. Practical examples reveal that the FTRMFF will exhibit considerable redundancies if the WCRTs of primary copies are larger that those of backup copies. To remedy this deficiency, the terminates the execution of active backup copies (referred as Tercos) algorithm is proposed. Tercos terminates the execution of active backup copies when corresponding primary copies are successfully completed. Tercos aims at reducing scheduling lengths in fault-free scenario to enhance schedulability by virtue of executing portions of active backup copies in passive forms.The second algorithm (referred to as Debus) uses a deferred active backup scheme to further minimize schedule lengths to improve the schedulability performance. Debus schedules active backup copies as late as possible while terminating active backup copies when their primary copies are completed. Experimental results show that compared with existing algorithms in literature, Tercos can significantly improve schedulability over FTRMFF. Furthermore, empirical results reveal that Debus can enhance schedulability over Tercos when the WCRT of primary copies are nearly equl to the WCRT of active backup copies.The homogeneous distributed systems require that each node in the distributed systems should be identical. However, this is very hard to be implemented in practical. Therefore, heterogeneous distributed systems are more widely adopted in practical. Different from homogeneous distributed systems, the most important feature of heterogeneous distributed systems is their great variety of the computing power and the reliabilities of different processors. So reliability issues should be taken into account when designing algorithms for heterogeneous distributed systems. Most existing real-time fault-tolerant scheduling algorithms, based on heterogeneous distributed systems, either only support non-preemptive, aperiodic tasks or only consider one type of backup-copy. To address these problems, in this thesis we propose two novel reliability models based on preemptive periodic tasks. Moreover, three real-time fault-tolerant scheduling algorithms based heterogeneous distributed systems, namely NRFTAHS and RDFTAHS which are based on the first reliability model and IRDFTAHS which are based on the second reliability model, are presented. The first reliability model defines the system reliability which does not consider the reliability after fault occurs. NRFTAHS tries to assign tasks with the aim of improving schedulabilty of the systems, while RDFTAHS with the aim of improving reliability of system. The second reliability modle, however, fully consider the system reliability when fault occurs. Thereore, IRDFTAHS is flexible and practical than existing algorithms. Finally, simulation experiments are carried out to compare the three algorithms in several aspects, the experiments results show that the RDFTAHS performs significantly better than NRFTAHS with respect to overall performance and IRDFTAHS is superior to RDFTAHS.Different from static algorithm, dynamic algorithm is more complex because ti has to adjust its scheduling strategy according to current situation. We propose a dynamic and reliability-driven real-time fault-tolerant scheduling algorithm on heterogeneous distributed systems (DYFARS). DYFARS employs reliability costs as its main objective to dynamically schedule independent, non-preemptive aperiodic tasks, therefore system reliability is enhanced without additional hardware costs. In addition, scheduling time and dispatch time are both incorporated into our scheduling scheme so as to make scheduling results more realistic and precise.

Keywords/Search Tags:

Distributed systems, Real-Time scheduling, Fault-tolerance, Primary/Backup copy, dynamic priority scheduling

PDF Full Text Request

Related items

1	Research On The Fault-Tolerant Scheduling Algorithms For Hard-Real-Time Systems
2	Research Of Some Scheduling Problems For Real-Time Tasks On Heterogeneous Clusters
3	The Study Of Self-Adaptive Fault-Tolerant Scheduling For Real-Time Tasks On Cluster Computing
4	Research On Fault-tolerant Scheduling Algorithm For Real-Time Tasks In Cloud Computing
5	Research On The Fault-tolerance Oriented Task Scheduling And Resource Optimization Methods In Virtualized Clouds
6	Scheduling Algorithms For Multiprocessor Hard Real-Time Systems
7	Fault-Tolerant Task Scheduling In Real-Time Industrial Systems And Its Applications
8	Real-time Distributed Fault-tolerant Scheduling Algorithm
9	Real-Time Scheduling And Load Balancing Algorithms In Fault Tolerant Systems
10	Research On The Dynamic-Priority Scheduling Algorithms Of Tasks In Real-Time Systems