Research On Key Techniques Of Scientific Workflows In IaaS Environment

Posted on: 2017-09-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z M Zhu
GTID: 1318330512971839
Subject: Computer application technology
Abstract/Summary:
Scientific workflows are one of the most popular models for representing large-scale scientific computations. As science continues to advance, the demand for computing resources in scientific research is growing explosively. Infrastructure as a Service (IaaS), probably the most important service model of the Cloud, delivers on-demand computational resources, provisioned as virtual machines, over the Internet. It is therefore a promising computing platform for scientific workflow computations. However, as a novel computing model, IaaS platforms differ significantly from existing computing environments such as local clusters and grids, which brings new challenges for scheduling and executing scientific workflows in IaaS. In this dissertation, we study several key problems related to the construction, scheduling, and execution of scientific workflows in IaaS environments. The main contributions of our work are as follows:

1) We study the characteristics of the IaaS model and formulate the problem of scheduling scientific workflows in IaaS environments. We also highlight the challenges that arise when existing scheduling algorithms are used to schedule and execute scientific workflows in IaaS environments. These challenges stem from the unique characteristics of IaaS, including virtual-machine-based resource management, complex pricing schemes, and various data sharing options.

2) To solve the budget-constrained, performance-effective scheduling problem in IaaS environments, we propose a new heuristic scheduling algorithm named BHI. The proposed algorithm uses heuristic information, including each task's finish time, used budget, and least reservation, to assign tasks to suitable virtual machines. We also introduce a novel approach to the problem that list-based heuristic scheduling algorithms cannot be applied directly in IaaS because of its dynamic resource model. Experiments show that, under the same budget constraints, the proposed algorithm achieves smaller makespans and higher scheduling success rates than state-of-the-art algorithms.

3) For the scheduling problem that must consider both the makespan and the cost of workflow executions in IaaS, we point out that most existing meta-heuristics might not perform as expected. For evolutionary algorithms in particular, the specific properties of the problem make existing genetic operations hard to adopt. We therefore design the basic genetic operations, including encoding, fitness evaluation, and population initialization. In particular, we introduce two novel crossover and mutation operators that can effectively explore the whole search space while simultaneously exploiting the regions that have already been explored. The results show that our algorithm achieves significantly better solutions than state-of-the-art scheduling algorithms in most cases.

4) We also propose a novel meta-heuristic to address the multi-objective scheduling problem in IaaS. Unlike most existing evolutionary algorithms, which encode scheduling schemes into chromosomes and use classical string-based crossover/mutation operators to generate new offspring, we introduce a novel splitting-merging based mechanism for reproduction during evolution. The initialization of the first-generation individuals is also improved to achieve faster search in the solution space. Extensive experiments show that the proposed algorithm performs better at finding Pareto-optimal schedules for IaaS platforms than several state-of-the-art multi-objective scheduling algorithms. The results also indicate that the presented designs achieve a faster convergence rate than those of the other meta-heuristics.

5) To simplify the modeling and construction of large scientific workflows, and to make full use of IaaS resources for workflow executions, we present the Brick scientific workflow platform. The platform enables users to easily script complex workflow applications in Python. It also includes several static and dynamic scheduling engines and can be easily adapted to different services/providers, including VMs/IaaS, containers/CaaS, and processes/clusters. A complete example of using Brick to construct, execute, and analyze a real workflow application in IaaS is demonstrated.

6) Brick also includes a component named Briareus. For existing Python computing programs, by simply adding several descriptive comments, Briareus can accelerate them by automatically restructuring part of the program into a workflow and transparently migrating the execution of specific tasks to VMs in IaaS. The effectiveness and simplicity of the framework are shown by real-world use cases.
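
To make the notion of Pareto-optimal schedules mentioned above more concrete, the following is a minimal Python sketch of filtering candidate schedules by makespan and cost. It is not the dissertation's algorithm. The names `Schedule`, `dominates`, and `pareto_front`, as well as the sample values, are assumptions introduced purely for illustration.

```python
from typing import List, Tuple

# Hypothetical representation: (makespan, cost), both to be minimized.
Schedule = Tuple[float, float]


def dominates(a: Schedule, b: Schedule) -> bool:
    """Return True if a is no worse than b in both objectives
    and strictly better in at least one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_front(schedules: List[Schedule]) -> List[Schedule]:
    """Return the non-dominated schedules, sorted by makespan."""
    front = [s for s in schedules
             if not any(dominates(other, s) for other in schedules)]
    return sorted(set(front))


# Made-up candidate schedules, for illustration only.
candidates = [(10.0, 8.0), (12.0, 5.0), (11.0, 9.0), (15.0, 5.0), (9.0, 12.0)]
front = pareto_front(candidates)
print(front)
```

In this sketch a schedule is kept only if no other candidate is at least as good in both makespan and cost and strictly better in one of them. A multi-objective scheduler would typically report such a set to the user rather than a single schedule.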
Keywords/Search Tags:Cloud Computing, Infrastructure as a Service(IaaS), scientific workflow, scheduling algorithm, heuristic, meta-heuristic, scientific workflow management system