
Research On Resources Provision Technologies In Distributed Cloud Computing

Posted on: 2017-05-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J T Zhang
Full Text: PDF
GTID: 1108330503969870
Subject: Computer application technology
Abstract/Summary:
Efficient resource provisioning that guarantees satisfactory cloud computing services to end users is one of the key success factors for a cloud service provider (CSP). In the Infrastructure as a Service (IaaS) model, a CSP mainly delivers services in the form of virtual machines (VMs), so the resource provisioning problem becomes one of placing VMs to support the requested services. Following the deployment sequence from a top-down viewpoint, this dissertation divides the deployment process into three phases: cloud network selection, data center (DC) selection, and server selection. Specific resource provisioning schemes are then explored for services with various requirements in each phase.

In the cloud network selection phase, cloud brokers strive to minimize cost by exploiting the pricing benefit of long-term reserved resources and multiplexing gains when purchasing infrastructure resources from public CSPs. Because providers offer reserved instances with various terms and prices, a broker must choose appropriate reservation terms from the candidates in time to meet the dynamic requirements of users at the lowest cost. This dissertation addresses the challenge with two offline algorithms, a longest-term-preferred multi-level reservation heuristic and a set-cover-based approximation algorithm, as well as an online algorithm based on historic resource utilization. Extensive evaluations driven by real-world traces show that the heuristic runs about twice as fast as the approximation algorithm. Compared to the scenario where all instances are on-demand, both offline algorithms save almost the same cost, up to 27%; in the online scenario, the proposed algorithm saves up to 14%.
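As an illustration of the longest-term-preferred idea, the sketch below greedily covers the sustained portion of a demand trace with the longest reserved term first and serves the remainder on demand. This is a minimal toy, not the dissertation's exact algorithm: the term list, prices, aligned reservation windows, and demand trace are all illustrative assumptions.

```python
# Toy longest-term-preferred reservation heuristic (illustrative only).
# Longer terms are assumed cheaper per hour; windows are aligned for simplicity.

def reservation_cost(demand, terms, on_demand_price):
    """demand: instances needed per hour; terms: (length_hours, price_per_hour),
    tried longest first; returns the total cost of the greedy plan."""
    horizon = len(demand)
    residual = list(demand)
    total = 0.0
    for length, price in sorted(terms, reverse=True):
        t = 0
        while t + length <= horizon:
            # reserve the level of demand sustained throughout this window
            level = min(residual[t:t + length])
            if level > 0:
                total += level * price * length
                for i in range(t, t + length):
                    residual[i] -= level
            t += length
    # any leftover demand in each slot is served on demand
    total += sum(residual) * on_demand_price
    return total
```

For a flat demand of 3 instances over 4 hours with a 4-hour term at half the on-demand rate, the plan reserves 3 instances for the whole horizon and buys nothing on demand.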
The proposed algorithms are more practical than the existing algorithm, which considers only one kind of reserved-instance term.

In the data center selection phase, this dissertation explores two problems: data center selection for clustering-based VM placement, and data center selection for moving big data to the cloud. The first problem is to minimize inter-DC delay and bandwidth. Since the VMs of one large task or of one organization may span multiple DCs, the longest distance between the selected DCs should be minimized, so that communication latency is reduced and expensive long-distance inter-DC bandwidth is saved. In contrast to an existing method that considers only the distances between data centers, a more efficient clustering-based 2-approximation algorithm is first developed by making full use of the topology and density properties of the cloud network and the capacity information of the DCs; its execution time decreases by about 15%-72%. Then, with the introduction of a half communication model (HCM), a novel HCM-based heuristic is presented to partition VMs across the selected DCs. Simulations show that the proposed algorithm not only further reduces inter-DC bandwidth consumption but also runs about twice as fast as the existing method. All algorithms apply whether the VMs are heterogeneous or homogeneous, overcoming a limitation of existing algorithms. The second problem is to select DCs for moving distributed big data to the cloud at minimum cost while guaranteeing fast local data access. Four objectives are analyzed and recommended: fair data placement, preferential data placement, transmission-cost-minimizing data placement, and overall-cost-minimizing data placement. The problem is then formulated on a bipartite graph, and a tight 3-approximation algorithm is proposed to address the former two objectives.
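The clustering-based selection in this phase can be illustrated with a classic min-diameter sketch: grow a cluster outward from every candidate seed DC, nearest first, until the combined capacity meets the request, and keep the cluster with the smallest longest pairwise distance. This seed-and-grow greedy is a standard 2-approximation for the minimum-diameter objective; it is a simplified stand-in for the dissertation's algorithm, and `dist`, `capacity`, and `demand` are illustrative inputs.

```python
# Illustrative min-diameter DC selection via seed-and-grow (2-approximation).
from itertools import combinations

def select_dcs(dcs, capacity, dist, demand):
    """dcs: list of ids; capacity: id -> VM slots; dist: (a, b) -> distance.
    Returns (chosen DCs, their longest pairwise distance)."""
    best, best_diam = None, float("inf")
    for seed in dcs:
        # add DCs in order of distance from the seed until demand is met
        ordered = sorted(dcs, key=lambda d: dist(seed, d))
        chosen, cap = [], 0
        for d in ordered:
            chosen.append(d)
            cap += capacity[d]
            if cap >= demand:
                break
        if cap < demand:
            continue  # this seed cannot satisfy the request
        diam = max((dist(a, b) for a, b in combinations(chosen, 2)), default=0)
        if diam < best_diam:
            best, best_diam = chosen, diam
    return best, best_diam
```

Intuitively, the seed lying inside the optimal cluster reaches enough capacity within a radius no larger than the optimal diameter, so the returned diameter is at most twice optimal.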
The latter two objectives are addressed by a heuristic that prefers the nearest DC. Both algorithms also apply when not all DCs are available because of legal requirements or user preference. Extensive simulations comparing against the optimal method and other schemes demonstrate that the proposed algorithms find good solutions in less time, and hence are better suited to large-scale applications.

In the server selection phase, servers should be selected to realize SLA (service level agreement) aware, cost-efficient VM placement. Servers and the network contribute about 60% of the total cost of a cloud data center, so placing VMs to save as much cost as possible while guaranteeing quality of service plays a critical role in enhancing a CSP's competitiveness. Considering heterogeneous servers and the random nature of the VMs' multiple resource requirements, the problem is formulated as a multi-objective nonlinear program. By exploiting the topology information of DCs, VM clusters with heavier traffic are kept together, which reduces communication delay while saving inter-server bandwidth. At the same time, statistical multiplexing and a newly defined "similarity" technique are leveraged to consolidate VMs, keeping the probability of resource-capacity violation below any designated threshold; quality of service is thus preserved while server and network costs are saved. An offline and an online algorithm are proposed for this problem. Compared to several baseline algorithms, experiments show the validity of the new algorithms: more cost is cut at less computational effort.

This dissertation also explores the co-selection of data centers and servers for big data analytics across DCs.
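The statistical-multiplexing idea in the server selection phase can be sketched as a probabilistic admission check: model each VM's demand as an independent random variable and admit a VM onto a server only if, under a normal approximation of the aggregate demand, the capacity-violation probability stays below a designated threshold. The mean/variance demand model and the admission rule below are illustrative assumptions, not the dissertation's exact formulation.

```python
# Illustrative chance-constrained capacity check via normal approximation.
from statistics import NormalDist

def fits(vms, capacity, epsilon):
    """vms: list of (mean, variance) demand pairs co-located on one server.
    Returns True if P(total demand > capacity) <= epsilon (approximately)."""
    mu = sum(m for m, _ in vms)
    var = sum(v for _, v in vms)
    if var == 0:
        return mu <= capacity  # deterministic demands
    # P(demand > C) <= eps  <=>  mu + z_(1-eps) * sigma <= C  (normal approx.)
    z = NormalDist().inv_cdf(1 - epsilon)
    return mu + z * var ** 0.5 <= capacity
```

Because independent variances add while safety margins grow only with the square root of the total variance, packing many bursty VMs together needs less headroom than provisioning each for its own peak, which is the multiplexing gain the dissertation exploits.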
Considering that it is not always practical to store worldwide data in only one DC, and that Hadoop, the commonly adopted framework for big data analytics, can deal only with data within one DC, the distribution of data necessitates the study of Hadoop across DCs. A novel architecture and a key-value-based scheme are proposed that respect the locality principle of traditional Hadoop as much as possible while enabling big data analytics across DCs. The problem is formalized as a bi-level program and solved by a tailored two-level group genetic algorithm. Extensive simulations demonstrate the effectiveness of the tailored algorithm: it outperforms the baseline and state-of-the-art mechanisms by 49% and 40%, respectively.
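To give a feel for the group-encoded genetic approach, the toy below evolves assignments of data blocks to DCs, where crossover copies whole DC groups between parents rather than individual genes. This is only a sketch of the group-encoding idea under an arbitrary fitness function; the dissertation's tailored two-level bi-level algorithm, its operators, and its objective are not reproduced here.

```python
# Toy group-encoded genetic algorithm for block-to-DC assignment (illustrative).
import random

def group_ga(n_blocks, n_dcs, cost, pop_size=20, generations=100, seed=0):
    """cost(assign) -> objective to minimize; assign[i] is block i's DC index."""
    rng = random.Random(seed)
    pop = [[rng.randrange(n_dcs) for _ in range(n_blocks)]
           for _ in range(pop_size)]

    def crossover(a, b):
        # group-level crossover: the child inherits one whole DC group from b
        child = a[:]
        dc = rng.randrange(n_dcs)
        for i, g in enumerate(b):
            if g == dc:
                child[i] = dc
        return child

    for _ in range(generations):
        pop.sort(key=cost)                 # elitism: keep the fitter half
        survivors = pop[:pop_size // 2]
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            c = crossover(a, b)
            if rng.random() < 0.2:         # mutation: move one block
                c[rng.randrange(n_blocks)] = rng.randrange(n_dcs)
            children.append(c)
        pop = survivors + children
    return min(pop, key=cost)
```

Operating on whole groups preserves good partial groupings across generations, which is why group encodings are favored for partitioning-type problems like this one.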
Keywords/Search Tags: Distributed Cloud Computing, Resource Provision Algorithms, Cloud Computing Networks Selection, Data Centers Selection, Servers Selection