Research On Optimization Techniques Of Multi-datacenter Cloud Services

Posted on:2021-03-14

Degree:Doctor

Type:Dissertation

Country:China

Candidate:X P Xu

Full Text:PDF

GTID:1488306302961229

Subject:Computer application technology

Abstract/Summary:


Cloud computing is a service-oriented computing capability, generally divided into three layers: infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS). With the rapid development and iteration of cloud computing, a single datacenter is no longer sufficient to support the growing number of cloud services, and the computing infrastructure has gradually evolved into a multi-datacenter architecture. However, providing new IaaS/PaaS/SaaS services and the corresponding new computing models on a multi-datacenter infrastructure still faces many problems and challenges. The wide area network between datacenters is one of the main bottlenecks. Because network resources between datacenters are scarce and dynamic, and because transmission costs are high and prices differ widely, deploying and providing cloud services across datacenters faces three major challenges: at the IaaS level of on-demand scaling, the cost of virtual cluster expansion is difficult to reduce; at the PaaS level, typified by big data analysis, the performance of geo-distributed data analysis is difficult to guarantee; and at the SaaS level, it is difficult to distribute service requests in a way that benefits both users and cloud providers. To this end, this dissertation focuses on the optimization of cloud services over multiple datacenters. Starting from the basic theory and key technologies of inter-datacenter networks, it studies three aspects: virtual cluster expansion, geo-distributed data analysis, and user request allocation. The specific contents and contributions are as follows.

At the IaaS level of on-demand elasticity, a key problem is how to implement efficient virtual machine (VM) placement and migration strategies that support the virtual cluster expansion needs of cloud tenants. Existing methods mainly target the inside of a single datacenter and rarely consider the scarce transmission resources and relatively expensive bandwidth of inter-datacenter networks, so they cannot simply be applied to multi-datacenter scenarios. This dissertation therefore studies the problem of scaling a virtual cluster while minimizing bandwidth cost and satisfying bandwidth requirements across datacenters. Specifically, it first proposes an efficient algorithm that scales the virtual cluster without changing its initial VM placement. Observing that keeping the initial VM placement fixed may hinder the scalability of virtual clusters, it then proposes an optimization algorithm with VM migration that minimizes the sum of bandwidth cost and migration cost. Finally, extensive simulations verify the effectiveness of the proposed algorithms, especially in terms of bandwidth cost and the acceptance rate of requests with bandwidth guarantees.

For a typical big data PaaS platform, two important problems are how to implement cross-datacenter coflow scheduling and how to query geo-distributed data sets so as to greatly improve the performance of geo-distributed data analysis tasks. This dissertation studies these two problems with a focus on two objectives: the "cost-performance" trade-off and the "cost-throughput" trade-off. A coflow is an abstraction of a set of parallel data flows with dependencies; a coflow finishes only when all of its flows have completed. First, current methods can reduce either the average coflow completion time or the average transmission cost between datacenters, but not both. This dissertation formulates an optimization problem that minimizes a combination of average coflow completion time and average transmission cost, and proposes an online coflow-aware optimization framework, Lever, to balance these two conflicting objectives. Lever operates without any prior knowledge of future coflows. Results from large-scale simulations demonstrate that Lever has a non-trivial competitive ratio and can significantly reduce the average transmission cost while also speeding up the completion of coflows. Second, existing methods for query analysis over data sets that are geo-distributed across multiple datacenters do not attempt to solve the throughput problem caused by such operations. Considering the transmission cost between datacenters, system throughput, and maximum queuing delay, this dissertation uses Lyapunov optimization techniques to design and analyze a two-timescale online control framework, 2TGDA, that trades off cost against throughput. Without prior knowledge of future query requests, the framework makes online decisions on input data placement and on admission control of query requests. Rigorous theoretical analysis shows that the framework achieves a near-optimal solution while maintaining system stability and robustness. Extensive trace-driven simulation results further demonstrate that the framework reduces inter-datacenter traffic cost, improves system throughput, and guarantees a maximum delay for each query request.

At the SaaS level, with its massive numbers of conversations and transactions, a key problem is how to design an effective adaptive request allocation algorithm that simultaneously guarantees user delay requirements and minimizes the network cost of the cloud service provider. This dissertation finds that existing methods have limitations: they either optimize the interests of only one side, or ignore essential factors in joint optimization, such as delay requirements and the diversity of bandwidth costs. It therefore first formulates an integer programming problem and then relaxes it into a continuous convex optimization problem that can be solved in practice. Next, it uses random sampling in the design of the request allocation algorithm to ensure that the resulting solution is feasible for the original integer program. Through rigorous theoretical analysis, it proves that the algorithm provides a tight upper bound on the total bandwidth cost. Extensive simulations demonstrate that the proposed algorithm efficiently reduces the total bandwidth cost of service providers while guaranteeing the latency requirements of all requests.
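
The relax-then-sample idea described for SaaS request allocation can be illustrated with a minimal sketch. This is not the dissertation's algorithm: the function names, the example fractional solution, the demands, and the per-unit prices below are all hypothetical. The sketch assumes a relaxed solver has already produced, for each request, a fractional split over datacenters, and that datacenters violating the request's latency bound received zero weight. Sampling one datacenter per request in proportion to those weights yields an integer allocation that still avoids the zero-weight (latency-infeasible) datacenters.

```python
import random


def round_allocation(fractional, rng):
    """Turn a fractional allocation into an integer one by random sampling.

    fractional[i][j] is the (hypothetical) share of request i that the
    relaxed problem sends to datacenter j. Each request is assigned to
    exactly one datacenter, drawn with probability proportional to its share.
    """
    assignment = []
    for row in fractional:
        # Only datacenters with positive weight are candidates, so a
        # datacenter excluded by the relaxation is never selected.
        candidates = [j for j, w in enumerate(row) if w > 0]
        if not candidates:
            raise ValueError("each request needs at least one positive weight")
        weights = [row[j] for j in candidates]
        assignment.append(rng.choices(candidates, weights=weights, k=1)[0])
    return assignment


def total_bandwidth_cost(assignment, demand, price):
    """Sum demand[i] * price[j] over requests i assigned to datacenter j."""
    return sum(demand[i] * price[j] for i, j in enumerate(assignment))


# Hypothetical relaxed solution: 3 requests, 3 datacenters.
fractional = [
    [0.7, 0.3, 0.0],  # datacenter 2 is latency-infeasible for request 0
    [0.0, 0.5, 0.5],  # datacenter 0 is latency-infeasible for request 1
    [1.0, 0.0, 0.0],  # the relaxation already gives an integral answer
]
demand = [10, 20, 5]     # hypothetical traffic volume per request
price = [1.0, 2.0, 4.0]  # hypothetical per-unit bandwidth price per datacenter

rng = random.Random(0)   # fixed seed so the sketch is reproducible
assignment = round_allocation(fractional, rng)
cost = total_bandwidth_cost(assignment, demand, price)
```

Because each request is sampled from its own distribution, the expected cost of the rounded allocation equals the cost of the fractional solution under this simple linear cost model; any bound of the kind stated in the abstract would depend on the actual formulation in the dissertation.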

Keywords/Search Tags:

Inter-Datacenter Networking, Virtual Cluster Scaling, Geo-Distributed Analytics, Request Allocation


Related items

1	Methods Of Optimizing Anycast Strategy And Resource Allocation For Elastic Optical Inter-datacenter Networks
2	Research On Cloud Network Resource Sharing And Isolation Methods For Virtual Datacenter
3	Research On Non-structure Data Replication Method Of Multi-Datacenter
4	Scaling Analytics via Approximate and Distributed Computin
5	Research On Reliable Resource Scheduling Mechanism In Cloud Datacenters
6	Research On Key Issues Of Applying Software-Defined Networking To Multi-Tenants Datacenter
7	Bandwidth-Guaranteed Path Planning Based On OpenFlow In Inter-datacenter Networks
8	On Fast And Coordinated Data Backup In Geo-Distributed Optical Inter-Datacenter Networks
9	The Design And Implementation Of Virtual Network Inter-Datacenters
10	Investigation Of Dynamic Resource Allocation In Inter-Datacenter Networks Over Optical Infrastructure