
Scheduling Coflows In The Data Center Networks

Posted on: 2020-11-05  Degree: Master  Type: Thesis
Country: China  Candidate: B Q Wang  Full Text: PDF
GTID: 2428330575958035  Subject: Computer Science and Technology
Abstract/Summary:
The last decade has seen rapid growth in cloud computing. As the infrastructure underlying the cloud, data center networks play an important role. How to schedule network resources efficiently in today's data centers is a problem shared by academia and industry. Studies have found that data transfer across the network accounts for more than 50% of job completion time. Managing and optimizing network resources in the data center is therefore critical to shortening job completion time. Earlier research on network scheduling in data centers focused on flow-level scheduling. However, the flow abstraction cannot capture the semantics of communication between two groups of machines in a cluster application. The coflow abstraction is a major step forward for application-aware network scheduling: coflows make it easier for applications to convey their communication semantics to the network. In this paper, we focus on how to schedule coflows in the data center to minimize the total job completion time.

For single-stage jobs, we focus on the multicast communication pattern. Efficient multicast algorithms can greatly improve the performance of applications in data centers. However, multicast support at the data link layer and the network layer is usually disabled in data center networks for management reasons, so we target application-layer multicast. There are two major challenges. First, how can we accurately infer the topology of a data center network? Second, given an accurately inferred topology, how can we design an efficient multicast algorithm? We use hierarchical clustering, which can accurately infer the network topology even in a hybrid wired and wireless data center network architecture. We then use the hierarchical topology information to propose an Inter-Rack First Multicast (IRFM) algorithm. The results show that IRFM is 3.7x to 11.2x faster than other multicast algorithms in purely wired data center networks, and 4.8x to 14.6x faster in hybrid wired and wireless data centers.

For multi-stage jobs, there are dependencies among coflows. As a result, coflow completion time and job completion time (JCT) can diverge widely. To the best of our knowledge, this is the first work to systematically study how to schedule the dependent coflows of multi-stage jobs so that the total weighted job completion time is minimized. We first present the original formulation of the multi-stage coflow scheduling problem and prove that it is strongly NP-hard. We then design MCS, a polynomial-time algorithm that solves this problem with an approximation ratio of (2M+1) in the general case and 3 in a special case, where M is the number of hosts. Oversubscription is quite common in data centers, but oversubscribed topologies are not easy to analyze, and as a result none of the existing theoretical work considers this scenario. We extend MCS to oversubscribed network architectures. Finally, we evaluate MCS on testbeds and in large-scale simulations. On the testbeds, we design and implement an application-layer scheduling framework and reduce JCT by up to 81.65% compared with pure DCTCP. In the large-scale simulations, we use an event-based flow-level simulator and compare MCS with two classical algorithms, Aalo and LP-OV-LS. MCS reduces the average JCT by up to 33.48% compared with Aalo, a heuristic multi-stage coflow scheduler, and reduces the total weighted JCT by up to 83.58% compared with LP-OV-LS, the state-of-the-art approximation algorithm for coflow scheduling. The evaluation results show that the largest gap between our algorithm and the lower bound is only 9.14%.
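
To make the scheduling objective concrete, the following minimal sketch computes coflow completion time, job completion time, and total weighted JCT for a toy workload. It is an illustration of the definitions only, not the MCS algorithm: the class names, weights, and finish times are hypothetical, and it assumes the usual convention that a coflow finishes when its slowest flow finishes and a multi-stage job finishes when its last coflow finishes.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Coflow:
    name: str
    flow_finish_times: List[float]  # hypothetical finish time of each flow

    def completion_time(self) -> float:
        # A coflow completes only when its slowest flow completes.
        return max(self.flow_finish_times)


@dataclass
class Job:
    name: str
    weight: float            # hypothetical job weight
    coflows: List[Coflow]    # one coflow per stage of the job

    def completion_time(self) -> float:
        # A multi-stage job completes only when its last coflow completes.
        return max(c.completion_time() for c in self.coflows)


def total_weighted_jct(jobs: List[Job]) -> float:
    # The objective named in the abstract: sum of weight * JCT over all jobs.
    return sum(j.weight * j.completion_time() for j in jobs)


def average_cct(jobs: List[Job]) -> float:
    # Average coflow completion time, ignoring which job a coflow belongs to.
    ccts = [c.completion_time() for j in jobs for c in j.coflows]
    return sum(ccts) / len(ccts)


# Toy workload (all numbers are made up for illustration).
jobs = [
    Job("A", weight=2.0, coflows=[
        Coflow("A-stage1", [1.0, 2.0]),
        Coflow("A-stage2", [3.0, 5.0]),
    ]),
    Job("B", weight=1.0, coflows=[
        Coflow("B-stage1", [4.0]),
    ]),
]

print("total weighted JCT:", total_weighted_jct(jobs))
print("average CCT:", average_cct(jobs))
```

In this toy workload the first-stage coflow of job A finishes early and lowers the average coflow completion time, yet job A's completion time is set entirely by its second stage. This is the kind of gap between coflow-level and job-level metrics that the abstract refers to.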
Keywords/Search Tags: Data Center Networks, Coflow Scheduling, Multicast, Approximation Algorithm