
Research Of Replication And Placement Strategies For The Intermediate Data Of Scientific Workflow In Cloud

Posted on: 2015-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: S Peng
Full Text: PDF
GTID: 2298330452450762
Subject: Computer application technology
Abstract/Summary:
Scientific workflows automate the procedures of scientific experiments. As data-intensive applications, they are widely used to process and analyze large-scale experimental data. More and more scientific workflows are now deployed in the cloud, since cloud computing can provide the computing and storage resources they need during execution. However, this paradigm introduces new challenges. First, performance is a concern because scientific workflows take a long time to execute. Second, under the pay-as-you-go model of the cloud, the deployment cost of a scientific workflow must be considered. Third, data security has become a serious problem, since resources are shared among all customers.

The main research work of this thesis addresses these challenges as follows:

1. A replica placement strategy for frequently used intermediate data is proposed to improve the efficiency of multiple scientific workflows. First, intermediate data are selected for replication according to a frequency threshold. Then the number of replicas of each selected item is set according to its size. Finally, to improve workflow performance and balance the load of the data centers, a genetic algorithm is used to solve the replica placement problem for intermediate data, with transmission time as the main objective.

2. To preserve the efficiency of multiple scientific workflows while minimizing the placement cost of replicas, a cost-aware replication mechanism for intermediate data in the cloud is proposed. First, a transmission time model and a placement cost model are derived by analyzing the efficiency and deployment cost of multiple scientific workflows. Second, data transfer time is treated as the primary objective and replica placement cost as the secondary objective, and a genetic algorithm is used to solve the resulting cost-aware replication problem. In this way the replica placement cost is reduced while workflow efficiency is maintained. Finally, the strategy is evaluated in simulation in terms of the data transfer time and placement cost of multiple scientific workflows. Compared with other related strategies, it improves efficiency while keeping the placement cost in check.

3. A cost-aware intermediate data placement strategy is proposed to improve data security while minimizing the placement cost. The strategy first introduces a security model and a deployment cost model. Then an algorithm based on ant colony optimization dynamically selects appropriate data centers for the intermediate data, with data security and deployment cost as the optimization objectives. Experimental results show that, compared with other strategies, this strategy has certain advantages in improving data security during the execution of scientific workflows while reducing the placement cost.

In summary, this thesis optimizes replica placement, the replication mechanism, and the placement of intermediate data by analyzing the performance, cost, and security issues of scientific workflows in the cloud. The results help fill, to some extent, the gap in research on data replication and data security in the cloud, especially on cost-aware intermediate data replication strategies for multiple scientific workflows.
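
The first contribution above (selection by frequency threshold, size-dependent replica counts, and a genetic algorithm that minimizes transmission time) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The toy data, the threshold of 5, the replica-count rule, the transfer-time model, and all function names are assumptions made for illustration, and load balancing and placement cost are not modeled.

```python
import random

# Hypothetical toy inputs; none of these numbers come from the thesis.
SIZES = {"d1": 4.0, "d2": 1.0, "d3": 2.0}      # intermediate data sizes (GB)
ACCESS = {"d1": 9, "d2": 2, "d3": 6}            # access counts across workflows
CONSUMERS = {"d1": [0, 2], "d3": [1, 2]}        # data centers whose tasks read each item
BANDWIDTH = [[0, 1.0, 0.5],                     # GB/s between data centers
             [1.0, 0, 2.0],                     # (diagonal entries are never used)
             [0.5, 2.0, 0]]


def select_frequent(access, threshold):
    """Pick the intermediate data whose access count reaches the threshold."""
    return sorted(d for d, n in access.items() if n >= threshold)


def transfer_time(placement, sizes, consumers, bandwidth):
    """Total time for each consumer to fetch each item from its fastest replica."""
    total = 0.0
    for d, replicas in placement.items():
        for c in consumers[d]:
            if c in replicas:
                continue  # a local replica needs no transfer
            total += min(sizes[d] / bandwidth[r][c] for r in replicas)
    return total


def genetic_placement(items, n_replicas, n_dc, fitness,
                      generations=40, pop_size=20, seed=0):
    """Small genetic algorithm: an individual maps each item to its replica sites."""
    rng = random.Random(seed)

    def random_sites(d):
        return tuple(sorted(rng.sample(range(n_dc), n_replicas[d])))

    population = [{d: random_sites(d) for d in items} for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]   # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each item's sites come from one of the two parents.
            child = {d: (a if rng.random() < 0.5 else b)[d] for d in items}
            if rng.random() < 0.2:              # mutation: redraw one item's sites
                d = rng.choice(items)
                child[d] = random_sites(d)
            children.append(child)
        population = parents + children
    return min(population, key=fitness)


items = select_frequent(ACCESS, threshold=5)
# Assumed rule: items of at most 2 GB get two replicas, larger items get one.
n_replicas = {d: 2 if SIZES[d] <= 2.0 else 1 for d in items}


def fitness(placement):
    return transfer_time(placement, SIZES, CONSUMERS, BANDWIDTH)


best = genetic_placement(items, n_replicas, n_dc=3, fitness=fitness)
```

Here `best` maps each selected item to a tuple of data center indices. For comparison, the hand-picked placement `{"d1": (1,), "d3": (1, 2)}` has a total transfer time of 6.0 under this toy model. A cost-aware variant in the spirit of the second contribution could rank individuals by the pair (transfer time, placement cost) instead of transfer time alone.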
Keywords/Search Tags: Scientific Workflow, Cloud Computing, Intermediate Data, Replica Placement, Deployment Cost