
Workload Optimal Scheduling For Large-scale Cloud Data Centers

Posted on: 2015-07-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X J Lv
Full Text: PDF
GTID: 1108330470467807
Subject: Computer Science and Technology
Abstract/Summary:
Cloud data centers are the infrastructure of large-scale cloud computing systems and platforms, and one of the most important foundations for delivering large and complex cloud computing services. With the rapid worldwide proliferation of cloud data centers, optimal workload scheduling has become a focus of academic research, and it is one of the most important factors influencing the quality of service of a cloud data center. Traditional workload scheduling methods cannot cope with the new workload characteristics and the large scale of cloud computing. This dissertation studies an optimal workload scheduling framework for large-scale cloud data centers, optimal scheduling methods for Web and Job workloads, and an optimal scheduling method for bulk data transfers between cloud data centers, as described below.

First, a flexible, efficient, and intelligent workload scheduling system, JTangWOS, is proposed for large-scale cloud data centers. JTangWOS aims to overcome the limitations of traditional workload scheduling systems, e.g. low scalability and limited support for intelligent scheduling decisions. It consists of a workload monitoring system based on the Data Distribution Service (DDS), a scheduling decision support platform based on complex event processing (CEP), and the concrete workload scheduling system. It copes with the scale of cloud data centers, supports the efficient collection and transmission of huge volumes of monitoring data, supports decision making for various kinds of scheduling and management in cloud data centers, and scales well.
Experiments demonstrate the high efficiency of JTangWOS (concurrency and throughput increase by 4.8 times and 20 times, respectively) and its intelligence in optimal workload scheduling.

Second, an efficient bursty and self-similar workload generation method, BURSE, is proposed for the cloud benchmark Cloudstone, based on a superposition of 2-state Markov Modulated Poisson Processes (MMPP2s). It overcomes the limitation that workloads produced by existing bursty or self-similar generation methods do not match practical conditions well. Compared with traditional methods, BURSE fits practical workloads better and is more straightforward. We also develop a workload balancing method for large-scale cloud data centers that takes bursty and self-similar workload characteristics into account. Experiments demonstrate the high accuracy (the average deviation is below 10% for all combinations of burstiness and self-similarity) and robustness (the accuracy does not increase as the number of samples increases) of the workload generation method, as well as the high performance and efficiency of the load balancing method.

Third, an efficient distributed method based on the Alternating Direction Method of Multipliers (ADMM) is proposed to deal with the heterogeneity of servers and of Job resource requirements in cloud data centers. The method exploits the geographic diversity of large-scale cloud data centers: by considering the differing electricity prices of the data centers and the differing network latency to each of them, we implement a cost-minimizing Job scheduling method that jointly selects data centers and servers. Experiments show that the method converges to an acceptable near-optimal solution within several tens of iterations (the maximum is 60, and just 33 for 80% of the time).
Compared with existing non-joint scheduling methods, our method can ensure quality of service (QoS). Compared with joint scheduling methods that focus only on minimizing either energy cost or utility loss, it achieves a better balance between the two, which minimizes the total cost of the data centers.

Finally, an energy-cost-efficient two-stage scheduling method is proposed for inter-data-center bulk data transfers (Inter-DC BDTs), exploiting the delay tolerance of bulk data transfers and the diversity of electricity prices across geographically distributed cloud data centers. We systematically study how to route and schedule inter-data-center bulk data transfers so as to minimize energy cost in a multi-electricity-market environment, model the problem as a min-cost multi-commodity flow problem, and develop an efficient two-stage optimization method to solve it. Extensive evaluations on a real-life inter-data-center network with real-life electricity prices show that the two-stage method brings significant energy cost savings over existing bulk data transfer methods: the energy cost reduction reaches 31% and 48% for off-peak (early morning) and busy (early evening) periods, respectively. The experiments also show that the method has lower time complexity (the time consumed is linear in the deadline), so it achieves a better balance between performance and time complexity.
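
The idea behind the last contribution (delaying a delay-tolerant transfer into cheaper electricity periods) can be illustrated with a deliberately simplified sketch. This is not the two-stage method or the min-cost multi-commodity flow model of the dissertation: it assumes a single link, a fixed per-slot capacity, and an energy cost that is linear in the data sent during each slot. The function name, parameters, and price values are hypothetical.

```python
def schedule_transfer(volume, capacity, prices, deadline):
    """Greedy single-link schedule: fill the cheapest slots before the deadline.

    volume   -- amount of data to move (hypothetical unit, e.g. GB)
    capacity -- maximum data that can be sent in one time slot
    prices   -- assumed energy price per unit of data for each slot
    deadline -- number of usable slots (slots 0 .. deadline - 1)
    Returns (plan, cost), where plan[t] is the amount sent in slot t.
    """
    if deadline > len(prices):
        raise ValueError("no price known for some slots before the deadline")
    if volume > capacity * deadline:
        raise ValueError("transfer cannot finish before the deadline")

    plan = [0.0] * deadline
    remaining = volume
    # Visit slots from cheapest to most expensive and fill each one up.
    for t in sorted(range(deadline), key=lambda s: prices[s]):
        if remaining <= 0:
            break
        sent = min(capacity, remaining)
        plan[t] = sent
        remaining -= sent

    cost = sum(p * x for p, x in zip(prices, plan))
    return plan, cost


# Illustrative prices only; slots 4 and 5 lie beyond the deadline.
plan, cost = schedule_transfer(volume=25, capacity=10,
                               prices=[5, 3, 1, 2, 4, 6], deadline=4)
print(plan, cost)  # [0.0, 5, 10, 10] 45.0
```

With these made-up prices, sending the same data as early as possible (10, 10, and 5 units in slots 0 to 2) would cost 85, which shows why a later deadline leaves room for savings. Routing across several links and data centers, which the dissertation addresses, is outside the scope of this sketch.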
Keywords/Search Tags: cloud data center, bursty workload, self-similar workload, workload generation, workload monitoring, optimal workload scheduling