
The Optimization Strategy Study Of Fault-Tolerant Scheduling In Cloud Computing Resource Management

Posted on: 2012-10-27
Degree: Master
Type: Thesis
Country: China
Candidate: Z S Luo
GTID: 2178330332483131
Subject: Management Science and Engineering
Abstract/Summary:
With the development of the Internet and data centers, more and more real-time systems are used in various distributed environments. Cloud computing in particular has attracted increasing attention in science and business. The main idea of cloud computing is to integrate a variety of computing resources on the Internet. These computing resources are heterogeneous, so effective management of large-scale computing resources is urgently needed. Communication efficiency and reliability are inherent requirements of cloud computing and are also important measures of the quality of service that the system provides to users. When considering processor failure, however, current research assumes that only one processor fails during the execution of a task. Compared with distributed computing and grid computing, large-scale resources in cloud computing are highly dynamic and heterogeneous. The resources are also unreliable, so the possibility of large-scale resource failure in cloud computing is much greater. For the fault-tolerance problem in cloud computing, the assumption that only one processor fails is therefore clearly a significant limitation.
We therefore first survey fault-tolerant scheduling strategies, reviewing research at home and abroad and introducing the strategies for both single-processor failure and multiprocessor failures, and we analyze the outstanding problems of existing research. We then briefly introduce the general framework of the cloud computing model and, for the fault-tolerance requirements of resource management in cloud computing, analyze the currently popular fault-tolerance and reliability technologies.
On the basis of this background we derive the objectives of this thesis and, to meet them, propose fault-tolerant scheduling policies based on communication efficiency and reliability. For the cloud computing system, we first propose a communication model, obtain the set of communicated messages from the model, and analyze the relationship between the backups. For each case, we derive important constraints that limit the earliest start time of a backup and its eligible processors. Considering communication efficiency, we develop an algorithm, called the Fault-tolerant Maximum Communication Efficiency-Driven Algorithm (FMCED), to dynamically schedule dependent, non-preemptive, non-periodic real-time tasks in the case of one processor failure. Extending this to multiple processor failures, we propose a reliability model for evaluating the fault-tolerant performance of the system and define the priority of a task, so that a critical task is the task with the highest priority. We then determine conditions under which scheduling a task does not affect the earliest start time of its successors. Thus, based on active replication, we develop the Dynamic and Reliability-driven Real-time Fault-tolerant Scheduling Algorithm (DRFACS) for the case of massive resource failures; it aims to maximize reliability while dynamically scheduling dependent, non-preemptive, non-periodic real-time tasks, trying to improve quality of service through scheduling.
Finally, we conduct extensive simulation experiments on schedulability, latency, communication efficiency, and reliability. Comparing eFRD, MCT-LRC, FTBAR, and FTSA with the proposed strategy, the experimental results show that the proposed algorithms perform well on quality of service.
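The abstract does not state the form of the reliability model, so the following is only an illustrative sketch of how reliability under active replication is often computed, not the model used in the thesis. It assumes that processor failures are independent and follow a Poisson process with a constant failure rate, so a replica running for time t on a processor with rate λ succeeds with probability exp(-λt). The function names and the (failure_rate, exec_time) representation are hypothetical.

```python
import math
from typing import Sequence, Tuple

# A replica is described by (failure_rate, exec_time) of the processor it runs on.
# ASSUMPTION: constant failure rates and independent failures; this is a common
# textbook model, not necessarily the one proposed in the thesis.
Replica = Tuple[float, float]


def replica_reliability(failure_rate: float, exec_time: float) -> float:
    """Probability that a single replica completes without a processor failure."""
    return math.exp(-failure_rate * exec_time)


def task_reliability(replicas: Sequence[Replica]) -> float:
    """Reliability of a task executed as active replicas.

    With active replication every replica runs, so the task fails only
    if all of its replicas fail.
    """
    all_fail = 1.0
    for failure_rate, exec_time in replicas:
        all_fail *= 1.0 - replica_reliability(failure_rate, exec_time)
    return 1.0 - all_fail


def system_reliability(tasks: Sequence[Sequence[Replica]]) -> float:
    """Reliability of a schedule: every task must succeed."""
    result = 1.0
    for replicas in tasks:
        result *= task_reliability(replicas)
    return result
```

Under these assumptions, adding a replica to a task can never lower its reliability, which is why a reliability-driven scheduler can trade extra resource usage for tolerance of more than one processor failure.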
Keywords/Search Tags: cloud computing, resource management, fault-tolerant scheduling, strategy optimization, communication efficiency, reliability