Research On The Intermediate Data Management For Scientific Workflow Systems In Cloud

Posted on: 2018-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: S Meng
Full Text: PDF
GTID: 2348330518999105
Subject: Computer application technology
Abstract/Summary:
As a data-intensive application, a scientific workflow often generates a large number of closely dependent intermediate datasets during its execution. The management of these intermediate datasets has a direct impact on the quality and efficiency of the scientific workflow, and it has become increasingly complicated. In the cloud environment, using scientific workflows to perform computational or scientific experimental tasks requires paying for both computation and data storage. Therefore, in order to improve the efficiency and reduce the overhead of scientific workflows, this paper studies the storage of scientific workflow intermediate data in the cloud environment. The main contributions of this paper can be summarized as follows. Firstly, this paper analyzes the CTT-SP algorithm for solving the data storage problem in scientific workflows. The analysis shows that the CTT-SP algorithm suffers from high time complexity, sensitivity to the choice of main path, and instability. This analysis is verified by experiments on linear and nonlinear scientific workflows. Secondly, to address the main path sensitivity and instability of the CTT-SP algorithm, a CTT-SP algorithm based on the critical path is proposed, and three nonlinear scientific workflows of different complexity are designed to verify the validity and correctness of the improved algorithm. Thirdly, for the situation in which multiple cloud providers offer services at the same time, this paper puts forward a deployment strategy for scientific workflows and a storage strategy for intermediate datasets in the multi-cloud environment, and validates them on linear and nonlinear scientific workflows using five cloud providers. The experimental results show that the proposed strategy is better than the existing one.
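The abstract does not spell out how the storage decision is computed, so the following is only a minimal sketch of the underlying trade-off between storing an intermediate dataset and regenerating it on demand, for a linear workflow. It is written in the spirit of shortest-path formulations such as CTT-SP, but it is not the thesis's implementation. The function name `min_cost_storage` and the parameters `x` (cost to generate a dataset from its direct predecessor), `y` (storage cost per time unit), and `v` (usage frequency per time unit) are assumptions introduced for illustration.

```python
from math import inf
from typing import List, Tuple


def min_cost_storage(
    x: List[float], y: List[float], v: List[float]
) -> Tuple[float, List[int]]:
    """Pick which datasets of a linear workflow to store (hypothetical helper).

    x[i]: cost to generate dataset i from its direct predecessor
    y[i]: cost per time unit to keep dataset i stored
    v[i]: how often dataset i is used per time unit
    Returns (minimum cost per time unit, indices of stored datasets).
    """
    n = len(x)
    # Node 0 is a virtual start, nodes 1..n are datasets, node n+1 is a virtual end.

    def edge(i: int, j: int) -> float:
        # Cost rate of storing dataset j and deleting every dataset strictly
        # between i and j; deleted datasets are regenerated from node i.
        cost = y[j - 1] if j <= n else 0.0
        regen = 0.0
        for k in range(i + 1, j):
            regen += x[k - 1]          # regeneration cost accumulates along the chain
            cost += regen * v[k - 1]   # paid each time dataset k is used
        return cost

    best = [inf] * (n + 2)
    prev = [-1] * (n + 2)
    best[0] = 0.0
    # Shortest path over a DAG whose nodes are already in topological order.
    for j in range(1, n + 2):
        for i in range(j):
            candidate = best[i] + edge(i, j)
            if candidate < best[j]:
                best[j] = candidate
                prev[j] = i

    # Walk back from the virtual end to recover the stored datasets.
    stored: List[int] = []
    node = prev[n + 1]
    while node > 0:
        stored.append(node - 1)
        node = prev[node]
    stored.reverse()
    return best[n + 1], stored


# Illustrative numbers only: datasets 0 and 2 are expensive to regenerate,
# dataset 1 is cheap to regenerate but expensive to store.
cost, stored = min_cost_storage(x=[10, 1, 10], y=[1, 5, 1], v=[1, 1, 1])
print(cost, stored)
```

With these made-up inputs the sketch keeps the two expensive-to-regenerate datasets and deletes the cheap one. The double loop over node pairs, each with an inner sum, also hints at why time complexity is a concern for this family of algorithms.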
Keywords/Search Tags: cloud environment, scientific workflow, intermediate data, optimal storage, multiple cloud environment