Scientific Workflow Data Placement Method Based On Task Assignment And Dataset Replicas In Cloud Environment

Posted on: 2021-02-18
Degree: Master
Type: Thesis
Country: China
Candidate: L Shang
Full Text: PDF
GTID: 2428330614965767
Subject: Software engineering
Abstract/Summary:
Scientific workflows provide researchers not only with visual programming interfaces but also with the ability to collaborate through distributed systems that include computing resources and datasets. This enables researchers to deploy large-scale scientific experiments and knowledge exploration. Cloud computing, with its on-demand payment model and strong scalability, has received a lot of attention since its appearance and provides a well-suited execution environment for scientific workflows. How to place datasets for scientific workflows in the cloud environment has become a hot issue in scientific workflow research. In the cloud environment, data centers are distributed around the world, so transmitting data across data centers is unavoidable during scientific workflow execution. Different data placement solutions incur different costs, which greatly affects workflow execution expenses. To reduce the data placement cost while taking the load balance of data centers into consideration, this thesis proposes a scientific workflow data placement method based on task assignment and dataset replicas. Tasks are first assigned based on quantified dependency degrees among tasks, which are calculated according to the relationships between tasks. Considering the execution characteristics of scientific workflows in the cloud environment, data placement is carried out in two stages, the building stage and the running stage. In the building stage and the running stage, the original and the generated datasets, respectively, are placed on multiple data centers based on the task assignment result. Moreover, dataset replicas are constructed on data centers according to different replica construction conditions, to further reduce the transmission cost and ultimately optimize the data placement cost of scientific workflow execution. The feasibility and validity of the method are verified through simulation experiments.
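
The abstract does not give the formulas behind the method, so the following is only a minimal illustrative sketch, not the thesis's algorithm. It assumes that the dependency degree between two tasks is the total size of the datasets both tasks read, and that the transmission cost is the volume of data a task must fetch from another data center. The workflow, the dataset sizes, the data center names, and the function names `dependency_degree` and `transmission_volume` are all hypothetical.

```python
from itertools import combinations

# Hypothetical toy workflow: task -> datasets it reads (names and sizes are made up).
TASK_INPUTS = {
    "t1": {"d1", "d2"},
    "t2": {"d2", "d3"},
    "t3": {"d3"},
}
DATASET_SIZE_GB = {"d1": 4, "d2": 10, "d3": 2}


def dependency_degree(task_a, task_b):
    """Assumed measure: total size (GB) of the datasets both tasks read."""
    shared = TASK_INPUTS[task_a] & TASK_INPUTS[task_b]
    return sum(DATASET_SIZE_GB[d] for d in shared)


def transmission_volume(task_site, dataset_sites):
    """Data (GB) fetched across data centers for one run of the workflow.

    task_site maps each task to its data center; dataset_sites maps each
    dataset to the set of data centers holding a copy (original or replica).
    """
    moved = 0
    for task, inputs in TASK_INPUTS.items():
        for dataset in inputs:
            if task_site[task] not in dataset_sites[dataset]:
                moved += DATASET_SIZE_GB[dataset]
    return moved


# Pairwise dependency degrees under the assumed measure.
degrees = {
    pair: dependency_degree(*pair)
    for pair in combinations(sorted(TASK_INPUTS), 2)
}

# One possible task assignment, with and without a replica of d2 on dc2.
task_site = {"t1": "dc1", "t2": "dc2", "t3": "dc2"}
no_replica = {"d1": {"dc1"}, "d2": {"dc1"}, "d3": {"dc2"}}
with_replica = {"d1": {"dc1"}, "d2": {"dc1", "dc2"}, "d3": {"dc2"}}

print(degrees)
print(transmission_volume(task_site, no_replica))
print(transmission_volume(task_site, with_replica))
```

In this toy setting, placing a replica of `d2` on the second data center removes the only cross-center transfer. The sketch ignores replica storage cost and load balance, both of which the abstract names as considerations.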
Keywords/Search Tags:cloud environment, scientific workflow, task assignment, dataset replicas, data placement, transmission cost