
Research On Optimization Of Scientific Workflow Scheduling

Posted on: 2018-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: W H Dong
Full Text: PDF
GTID: 2359330512988280
Subject: Engineering
Abstract/Summary:
As e-science becomes a new research paradigm that supports complex scientific research, analyzing workflow structure and resources, and allocating workflow tasks to the corresponding resources so that user-defined timing constraints are met, has become an important scientific workflow scheduling problem. Because scientific workflows involve high-density computing and massive data, they are large-scale, multi-granularity and error-prone, which makes them very different from ordinary commercial workflows. Research on scientific workflow scheduling is therefore still at an early stage. This thesis optimizes scheduling algorithms, clustering algorithms and fault-tolerant algorithms, aiming to improve the efficiency of scientific workflow scheduling, reduce workflow cost and system overhead, and improve fault tolerance under the clustering model. The specific research work is as follows:

1. A scientific workflow scheduling model is established, comprising a scheduling environment model and a scheduling algorithm model. Two existing classical algorithms are analyzed. To address the lack of consideration of communication cost in the heuristic priority formula, and the tendency of the greedy processor-allocation strategy to fall into a local optimum, a heuristic scheduling algorithm based on input and output data streams and key parent/child tasks is designed. Simulation results show that the algorithm remains stable and efficient when the environment and conditions change.

2. Through in-depth analysis of the scientific workflow scheduling system, a mathematical model of task and system overhead is established, and the method for calculating system overhead is described. Analysis of the classical level clustering and parent-child clustering algorithms shows that both lack consideration of the relationship between task granularity control and workflow subtasks. A granularity control algorithm and an influence control algorithm are therefore proposed to address the mismatch between task granularity control and clustered tasks. Simulation results show that the two new algorithms effectively reduce both system overhead and scheduling time.

3. A scientific workflow failure model under clustering is presented, and the shortcomings of existing fault-tolerant algorithms based on primary and secondary versions under clustering are analyzed. A dynamic clustering algorithm is then proposed, which changes the clustering factor in real time based on the failure rate, together with a selective clustering algorithm, which isolates failed tasks and re-clusters them. Simulation results show that the dynamic clustering algorithm effectively reduces the impact of faults, and that the selective clustering algorithm effectively prevents a failed task from affecting other tasks, so its efficiency is higher than that of the other two algorithms.

4. Building on the above research, the scientific workflow scheduling algorithm, clustering algorithm and fault-tolerant mechanism are integrated into the CloudSim scheduling module, and the combination is compared in simulation with existing scheduling algorithms as a whole. The results show that the improved algorithms effectively reduce the run time of scientific workflows.
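For readers unfamiliar with level clustering, the classical baseline analyzed in the clustering part of the abstract, the following is a minimal sketch of the general idea and not the thesis's own algorithm. It assumes the workflow is given as a dictionary mapping each task to its list of parent tasks, and that a single hypothetical parameter `cluster_size` caps how many same-level tasks are merged into one clustered job. The function names and the example workflow are illustrative assumptions.

```python
from collections import defaultdict


def task_levels(parents):
    """Return each task's level: 1 for entry tasks, else 1 + deepest parent level."""
    levels = {}

    def level(task):
        if task not in levels:
            deepest = max((level(p) for p in parents.get(task, [])), default=0)
            levels[task] = 1 + deepest
        return levels[task]

    for task in parents:
        level(task)
    return levels


def level_clustering(parents, cluster_size):
    """Group tasks on the same level into clusters of at most cluster_size tasks."""
    levels = task_levels(parents)
    by_level = defaultdict(list)
    for task in sorted(parents):  # sorted only to make the output deterministic
        by_level[levels[task]].append(task)

    clusters = []
    for lvl in sorted(by_level):
        tasks = by_level[lvl]
        for start in range(0, len(tasks), cluster_size):
            clusters.append(tasks[start:start + cluster_size])
    return clusters


# Hypothetical fork-join workflow: one split task, three parallel tasks, one merge.
workflow = {
    "split": [],
    "map1": ["split"],
    "map2": ["split"],
    "map3": ["split"],
    "merge": ["map1", "map2", "map3"],
}
clusters = level_clustering(workflow, cluster_size=2)
print(clusters)
```

Merging same-level tasks reduces the number of jobs the scheduler must dispatch, which is where the system-overhead saving comes from. A fixed `cluster_size` ignores task runtimes and dependencies between subtasks, which is the kind of limitation the abstract attributes to the classical algorithms.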
Keywords/Search Tags:Scientific workflow, Scheduling, Clustering, Fault-tolerant, CloudSim