Study On Multiple-Objective And Fault-Tolerant Resource Scheduling For Workflow Task In Cloud Computing Systems

Posted on: 2017-12-31
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G S Yao
Full Text: PDF
GTID: 1318330536950359
Subject: Control Science and Engineering
Abstract/Summary:
In recent years, Cloud computing has been increasingly recognized and adopted because of its high extensibility, low cost, and on-demand service provisioning. As Cloud computing grows in popularity, the workload of Cloud data centers becomes heavier and heavier. To cope with this growing load, it is particularly important to design a reasonable scheduling strategy for the large number of heterogeneous resources, so that the Cloud data center operates efficiently and stably. The scheduling result directly affects the quality and efficiency of the Cloud computing service as well as user satisfaction, especially when the submitted job is a workflow with precedence constraints among its tasks.

This dissertation focuses on workflows and studies multi-objective and fault-tolerant resource scheduling in Cloud systems. Its main contributions are as follows.

(1) An Endocrine-based Coevolutionary Multi-Swarm for Multi-Objective Optimization algorithm (ECMSMOO) is proposed and applied to multi-objective resource scheduling in Cloud systems. ECMSMOO uses multiple swarms, and each swarm employs an improved multi-objective particle swarm optimization (MOPSO) to search for non-dominated solutions with respect to one objective. To avoid falling into local optima, which is common in traditional heuristic algorithms, an endocrine-inspired mechanism is embedded in the particles' evolution process. Furthermore, a competition and cooperation technique among swarms is designed. All these strategies improve the performance of ECMSMOO. The experimental results highlight the effectiveness of the proposed approach for multi-objective scheduling in Cloud systems.

(2) To handle resource failures during task execution, an immunological-mechanism-inspired rescheduling algorithm for workflows in Cloud systems (IRW) is proposed. IRW contains four units that imitate the immune system. The surveillance unit monitors each Virtual Machine (VM) in the resource pool for possible faults. Once a resource fault is detected, the response unit is triggered to search for an appropriate strategy, either in the memory unit or in the learning unit, for rescheduling onto the available resources. In the learning unit, the available resources are grouped into multiple clusters by K-means to narrow the search scope. If none of the available VMs can meet the Quality of Service requirements, a new VM is created to replace the faulty resource. The simulation results highlight the feasibility and effectiveness of IRW.

(3) For fault-tolerant scheduling before task execution, this dissertation proposes a fault-tolerant elastic scheduling algorithm for workflows in Cloud systems (FTESW) based on the Primary-Backup model. After analyzing the constraints that the dependences among tasks in a submitted workflow impose on primary-backup scheduling in Cloud systems, an elastic resource provisioning mechanism is designed in the context of fault tolerance. It dynamically adjusts resource provisioning according to the resource requests by means of resource migration. FTESW is then proposed to achieve both fault tolerance and high resource utilization for workflows in Cloud systems. The simulation results demonstrate that FTESW effectively provides fault-tolerant scheduling strategies for workflows while keeping resource utilization high.

(4) Also addressing fault tolerance, this dissertation presents a novel fault-tolerant scheduling algorithm (IMWSFW) for unbalanced workflows in Cloud systems. It combines replication and resubmission to exploit their respective advantages for fault tolerance while trying to meet the soft deadline of the workflow. First, it divides the soft deadline of the workflow into sub-deadlines, one for each task. Then, it selects a fault-tolerant strategy and reserves suitable resources for each task, taking into account the imbalance of sub-deadlines among tasks and the on-demand resource provisioning of Cloud systems. Finally, an online scheduling and reservation adjustment scheme is designed with two roles: it selects a suitable resource for a task that uses the resubmission strategy when the resource fails during the task's initial execution, and it adjusts the sub-deadlines of some unexecuted tasks while the workflow is running. The proposed algorithm is evaluated on both real-world workflows and randomly generated workflows, and the results demonstrate that it outperforms some well-known approaches on the corresponding metrics.

The dissertation closes with a conclusion, which also discusses some inadequacies of the current work and directions for future work that require further investigation.
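As a small illustration of the "non-dominated solutions" mentioned in contribution (1), the following sketch filters a set of candidate schedules down to its Pareto front. It is not the ECMSMOO algorithm. The two objectives (makespan and cost, both minimized), the candidate values, and the function names `dominates` and `non_dominated` are assumptions made for this example only; the abstract does not name the objectives.

```python
from typing import List, Tuple

# Hypothetical objective vector for one schedule: (makespan, cost).
# Both objectives are assumed to be minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one (Pareto dominance)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Objectives]) -> List[Objectives]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up candidate schedules, for illustration only.
candidates = [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0), (11.0, 6.0), (8.0, 9.0)]
front = non_dominated(candidates)
print(front)  # [(10.0, 5.0), (8.0, 7.0), (12.0, 4.0)]
```

Here (11.0, 6.0) is dropped because (10.0, 5.0) is better in both objectives, and (8.0, 9.0) is dropped because (8.0, 7.0) has the same makespan at lower cost. A multi-objective scheduler would typically return such a front and leave the final trade-off to the user.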
Keywords/Search Tags: Cloud computing, resource scheduling, workflow task, multi-objective scheduling, fault-tolerant scheduling, rescheduling, Primary-Backup scheduling, artificial immune system