
Research on the Minimum Cost Data Storage Problem in Multi-Clouds

Posted on: 2020-10-20
Degree: Master
Type: Thesis
Country: China
Candidate: J H Zhang
GTID: 2428330572984280
Subject: Computer Science and Technology
Abstract/Summary:
In recent years, the development of cloud computing has given users on-demand, flexible, cheap and scalable ways to deploy applications, and many cloud service providers (CSPs) have appeared on the market, such as Aliyun, AWS and Azure. This "multi-cloud" environment lets users exploit the differences between CSPs (such as resource price, QoS and bandwidth) to deploy and run their applications across several clouds, reducing cost and improving QoS. Because of these characteristics, more and more data-intensive applications (such as astronomical applications) are being deployed in the cloud. These applications usually contain complex workflows that process data step by step through many tasks and generate a large amount of intermediate data.

However, under the pay-as-you-go model of cloud computing, storing all of this data in the cloud incurs a very high storage cost. If instead all generated data is deleted and has to be regenerated from the original data whenever it is reused, users incur a high computing cost. Unreasonable data storage methods also waste a great deal of cloud computing resources. These problems pose major challenges to deploying and running data-intensive applications in the cloud. A reasonable data storage and placement strategy can therefore both save users considerable cost and reduce the waste of cloud computing resources.

To address these problems, this thesis models the minimum cost data storage problem and studies the relationship between the data storage strategy and the total cost, as well as minimum cost data storage algorithms for the multi-cloud environment. We divide data dependency graphs (DDGs) into linear-DDGs and complex-DDGs according to the characteristics of the data generation relationship, and then study a minimum cost data storage algorithm for each. Specifically:

1) To solve the data storage problem for a linear-DDG, a linear-PCE algorithm with linear time complexity is proposed. Linear-PCE exploits the property that an overly long data generation chain incurs a very high computing cost. It uses dynamic programming and reduction rules to quickly find the optimal origin data for each deleted data item, that is, the stored data from which the deleted item is regenerated. It then obtains the minimum cost data storage strategy by traversing the optimal origin data in reverse. Linear-PCE also uses incremental computation to reduce the time complexity of the algorithm.

2) For the data storage problem with a complex-DDG, this thesis proposes an efficient data storage algorithm, the PCE algorithm. PCE calculates the minimum cost data storage strategy for complex data dependencies by specifying the origin data for the branches of the DDG and finding the optimal combination of origin data for the data in the merged branches. By using the property that a linear-DDG segment has at most polynomially many optimal storage strategies, and by saving intermediate results for reuse, PCE can calculate the minimum cost data storage strategy quickly.

This thesis evaluates the approach against an astronomy background: it constructs a real DDG from a real astronomical application, and the experiments use CSPs generated according to existing mainstream CSPs. The results show that the data storage strategy obtained by the algorithm reduces not only the cost of the application but also its response time. In addition, extensive experiments on simulated data show that, for the data storage problem with a linear-DDG, the algorithm completes within 50 ms, which is 2 to 4 orders of magnitude faster than existing algorithms. For the data storage problem with complex data dependencies, the algorithm completes within 30 s.
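The linear-DDG case described above can be illustrated with a small dynamic program. The sketch below is not the thesis's linear-PCE algorithm: it is a plain O(n^2) dynamic program without the reduction rules, for a single provider, under an assumed cost model that the abstract does not spell out. In that assumed model, keeping data item i costs `storage[i]` per unit time, and deleting it costs `freq[i]` times the sum of the generation costs `gen[...]` along the chain from its nearest stored predecessor (its "origin data"). The function name and all parameter names are hypothetical. The sketch does show the three ideas named in the abstract: recording an optimal origin for each position, accumulating the cost of a deleted run incrementally, and recovering the strategy by walking the origins in reverse.

```python
from typing import List, Tuple


def min_cost_linear_storage(
    storage: List[float], gen: List[float], freq: List[float]
) -> Tuple[float, List[int]]:
    """Return (minimum cost rate, 1-based indices of data items to store).

    Assumed model (not taken from the thesis): items d1..dn form a chain,
    d_k is generated from d_(k-1) at cost gen[k-1], and the original input
    d0 is always available at no cost.
    """
    n = len(storage)
    # Position 0 is the original data; position n + 1 is a virtual end marker.
    s = [0.0] + list(storage) + [0.0]
    g = [0.0] + list(gen) + [0.0]
    f = [0.0] + list(freq) + [0.0]

    inf = float("inf")
    best = [inf] * (n + 2)   # best[j]: min cost of d1..dj given d_j is stored
    origin = [-1] * (n + 2)  # origin[j]: nearest stored predecessor of d_j
    best[0] = 0.0

    for i in range(n + 1):
        regen = 0.0    # cost to regenerate the latest deleted item from d_i
        deleted = 0.0  # total cost rate of the deleted run d_(i+1)..d_(j-1)
        for j in range(i + 1, n + 2):
            candidate = best[i] + deleted + s[j]
            if candidate < best[j]:
                best[j] = candidate
                origin[j] = i
            # Incremental update: extend the deleted run to include d_j.
            regen += g[j]
            deleted += f[j] * regen

    # Walk the origin pointers in reverse to recover the stored items.
    stored = []
    j = origin[n + 1]
    while j > 0:
        stored.append(j)
        j = origin[j]
    stored.reverse()
    return best[n + 1], stored


if __name__ == "__main__":
    # d2 is expensive to store but cheap to regenerate from d1.
    print(min_cost_linear_storage([1, 100, 1], [10, 1, 10], [1, 1, 1]))
```

In the example call, the middle item is the only one whose regeneration is cheaper than its storage, so the sketch keeps the first and third items and deletes the second. A multi-cloud version would additionally have to choose a provider for each stored item and account for transfer costs, which this sketch leaves out.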
Keywords/Search Tags:Cloud Computing, Data Intensive Workflow, Multi-Cloud Environment, Data Storage Strategy