A model for efficient real-time distributed computing middleware incorporating a fine-grained program-segment-level deadline-based scheduling policy and an efficient checkpoint-based replication scheme

Posted on:2006-08-19

Degree:Ph.D

Type:Dissertation

University:University of California, Irvine

Candidate:Li, Yuqing

GTID:1458390008965762

Subject:Computer Science

Abstract/Summary:

Since the mid-1990s, the field of real-time (RT) computing applications has grown rapidly. However, current practice in the development of RT systems has encountered severe challenges in handling large-scale distributed RT applications.

Conventional RT programming styles and associated methodologies have been error-prone and inefficient in dealing with complex modern RT systems. Moreover, the lack of design-time guarantees of service and response time for computing components, inherent in conventional approaches, has been one of the major obstacles preventing the industry from producing predictable and certifiable RT systems. Existing RT execution engines, such as RT operating systems and middleware, are often limited in practice by insufficient design consideration of system flexibility, scalability, and extensibility. In addition, the timing-based scheduling policies widely adopted in current RT execution engines, such as the Earliest-Deadline First (EDF) and the Time Slice based Least Laxity First (TS-LLF) policies, require extensive work from system engineers to make them practical. Fault tolerance has been an integral part of most RT systems; however, fault tolerance techniques based on active replication schemes and those based on semi-active replication schemes have yet to be realized in cost-effective and integrated forms capable of dealing with a wide range of RT applications.

To address the above challenges, this dissertation proposes a model for RT distributed computing middleware with the following highlights:

1. This dissertation presents a hierarchical resource management scheme, which attempts to meet the complex requirements of modern RT systems. This scheme is aware of the inherently hierarchical physical nature of complex RT systems. It allocates resources systematically from the top system level down to the program-segment (PS) level, based on the resource requirements of application components, especially Time-Triggered Message-Triggered Objects (TMOs). In addition, a PS-Level Deadline-based Scheduling Policy (PSLDSP) is introduced, with a formal proof that PSLDSP does not compromise the resource utilization potential of the system compared to EDF.

2. This dissertation proposes a low-cost and efficient checkpoint-based replication scheme which is able to support a wide range of soft RT applications. Moreover, a formal performance analysis of the proposed scheme is reported. It reveals that the recovery time is directly related to the CPU utilization of the primary TMO and to the checkpoint period: the recovery time increases as the CPU utilization of the primary TMO increases, and it also increases as the checkpoint period increases.

3. This dissertation presents an implementation model for RT middleware which supports the execution of high-level real-time distributed programs, i.e., Time-Triggered Message-Triggered Object (TMO) programs. The existing TMO Support Middleware (TMOSM) has been enhanced and reconstructed with analyzable, pluggable, and configurable software components. The enhanced TMOSM provides a safe and secure means of extending TMOSM and incorporating new features into it, which makes TMOSM an open experimental research platform for a much broader audience in the RT research community. The enhanced features mainly include a customizable scheduling framework, a Kernel Adaptation Layer (KAL) component, and an enhanced Real-time Multicast and Memory Replication Channel (RMMC) component.
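
The abstract uses EDF as the baseline against which PSLDSP is compared. As a rough illustration of what that baseline does, the following is a minimal sketch of preemptive EDF on a single CPU, simulated in unit time steps. It is not the dissertation's PSLDSP, nor code from TMOSM; the names `Job` and `edf_schedule`, the integer time model, and the single-CPU setting are all assumptions made for this example.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:
    # Hypothetical job record for illustration; only `deadline` takes part
    # in ordering, so the heap always yields the earliest deadline first.
    deadline: int                            # absolute deadline
    name: str = field(compare=False)
    release: int = field(compare=False)      # time the job becomes ready
    remaining: int = field(compare=False)    # execution time still needed


def edf_schedule(jobs, horizon):
    """Simulate preemptive EDF on one CPU for `horizon` unit time steps.

    Returns (timeline, missed): `timeline[t]` is the name of the job that
    ran during step t (None if the CPU was idle), and `missed` lists jobs
    that completed after their deadline. Jobs still unfinished at the
    horizon are not checked.
    """
    pending = sorted(jobs, key=lambda j: j.release)
    ready = []          # min-heap keyed on deadline
    timeline = []
    missed = []
    next_idx = 0
    for t in range(horizon):
        # Move every job released by time t into the ready heap.
        while next_idx < len(pending) and pending[next_idx].release <= t:
            heapq.heappush(ready, pending[next_idx])
            next_idx += 1
        if not ready:
            timeline.append(None)
            continue
        job = ready[0]                  # earliest deadline among ready jobs
        job.remaining -= 1              # run it for one time step
        timeline.append(job.name)
        if job.remaining == 0:
            heapq.heappop(ready)
            if t + 1 > job.deadline:    # completion time is t + 1
                missed.append(job.name)
    return timeline, missed
```

For instance, with a job A released at time 0 (3 units of work, deadline 10) and a job B released at time 1 (2 units of work, deadline 4), this sketch lets B preempt A at time 1 because B's deadline is earlier, and A resumes once B has finished. Re-evaluating priorities at every time step is a simplification chosen for readability; it says nothing about how program-segment-level scheduling is realized in the dissertation.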

Keywords/Search Tags:

Time, Replication, RT systems, Computing, Scheduling, Middleware, RT applications, Distributed

Related items

1	Research On Data Transfer In Real-time Supervision Information Systems
2	Flexible scheduling in middleware for distributed rate-based real-time applications
3	A QoS-driven resource allocation framework based on the risk incursion function and its incorporation into a middleware architecture and mechanisms supporting distributed fault-tolerant real-time computing applications
4	Distributed systems middleware: A framework for parallel and distributed computing on heterogeneous systems
5	Research On Energy-Efficient Scheduling Algorithm Of Distributed Real-Time Systems
6	Research And Implementation Of Replication Scheme In Distributed System
7	Research On Primary-Backup Based Fault-Tolerant Scheduling Algorithms For Cloud Computing
8	Research On Grid Scheduling
9	Researches On Some Key Issues In Grid Computing Environments
10	The Design And Implementation Of Database Replication System Based On BeyonDB