Font Size: a A A

Study Of Load-balanced Scheduling Algorithms In Multi-stage Packet Switching Networks

Posted on:2015-06-13Degree:DoctorType:Dissertation
Country:ChinaCandidate:Y GaoFull Text:PDF
GTID:1228330431962476Subject:Communication and Information System
Abstract/Summary:PDF Full Text Request
The rapid development of the Internet as well as the emergence of new gener-ation data centers and cloud computing applications have placed higher demands onswitches and routers, both of which set up the infrastructure of the Internet. It requiresthe switch fabric, the core of network devices, to develop towards larger capacity andbandwidth, higher performance and scalability, as well as more sophisticated QoS as-surance to meet a variety of demands of new services and applications. Single-stagecrossbar, which is widely used in the backbone routers, can’t be scaled up to largecapacity because of the implementation limits (such as power supply, chip volume,port density).Clos-network switches have attracted much attention because of their modularity,scalability and non-blocking. The research on three-stage Clos networks in literaturesis mainly extended from that on the single-stage crossbar, but many of them do notperform well for high-speed larger capacity switches/routers because of complicat-ed scheduling process and high implementation cost. Problems such as high com-munication overhead between stages, out-of-sequence (OOS) problem, and lack ofmulticast support limit the scalability and development of multi-stage networks. Con-sidering cells can be forwarded to one output port through different routes, dispatchingalgorithm in Clos network needs to balance traffic among these paths without causingout-of-sequence problems or adding complexity. The two-stage load-balanced switch,which we can draw lessons from, has advantages in simplifying scheduling processand providing stable throughput. Thus based on the load balancing strategy, we inves-tigate the key issues of three-stage Clos networks and tries to resolve the problemsmentioned above. The main contributions of this dissertation are listed as follows:1. The use of buffers in the central stage of memory-memory-memory (MM-M) switch can potentially cause the forwarding of packets to the outputs in out-of-sequence order. Traditional schemes delivers packets in order by introducing complexmatching process or adopting feedback control mechanism, which impair the scalabili-ty of MMM switch and fail to provide100%throughput. We propose two padded-framebased in-sequence dispatching schemes for MMM Clos network, named EnhancedPadded Frames (EPF) scheme and Frame-based In-sequence scheme in MMM switch(FIM3), respectively. Both EPF and FIM3schemes balance cells per flow among all central modules at the first stage. So every cell of the same frame suffers the samequeuing delay independently of the path it takes. Thus, no out-of-sequence occurs.Besides, to prevent starvation of lightly loaded queues, fake packets are inserted foran incomplete frame to get full. Adopting a predetermined cyclic shift configuration atthe first and third stages, EPF scheme doesn’t need to perform any matching algo-rithm, and it’s decentralized and scalable. To further improve the delay performance,FIM3scheme adopts crosspoint queued switches with oldest-cell-first scheduling atthe third stage. It has been shown, by means of both analysis and simulation, thatboth the proposed algorithms can provide in-sequence service and are stable for anyadmissible traffic patterns.2. The multicast scheduling algorithms in MMM Clos network are investigated.Existing multicast scheduling algorithms applied to the central stage buffered Clos net-work can eliminate the in-packet OOS problem, but still suffer from inter-packet OOSproblem due to varied packet length and the unexpected packet arrivals. Besides, theperformance of these existing schemes is restricted by the HoL(head-of-line) blockingwith the FIFO queue structure. Considering the characters of the fanout set, it is im-practical to balance the per flow multicast traffic, thus these in-sequence dispatchingschemes for unicast traffic cannot fit well. In this paper, we propose a frame-basedmulticast scheduling algorithm for MMM Clos networks (FMClos). Multicast cells arereplicated at the IMs and OMs based on address-copy technology, which eliminatesHoL blocking and improves the throughput performance. The frame-based schedulingscheme performed at the input module, as well as the buffered crossbar switch ele-ment adopted by the central module, contributes to reducing the mis-sequence cells.Simulations show the padded-frame based multicast scheduling algorithm can achievenearly100%throughput. And compared with existing algorithms, the proposed onedecreases the proportion of mis-sequence cells and reduces the resequencing delaygreatly.3. With the objectives of achieving high throughput with reasonable hardwareand algorithm complexity, we propose a distributed weight matching dispatching (D-WMD) scheme for Memory-Space-Memory (MSM) Clos-network switches, which haslower complexity, less communication overhead and higher matching efficiency. In theproposed scheme, each input module balances request tokens among central mod-ules, where virtual token counters are adopted to keep counting the request tokens that have been received. Each central module performs an iterative weight matchingalgorithm or a randomized weight matching algorithm concurrently and independentlybased on the counter values maintaining locally. The proposed algorithm keeps theadvantages of the load balanced switch and the weight matching scheme without rais-ing out-of-order problems or adding communication overhead. Simulation results showthat this scheme has higher matching efficiency than its weight matching counterpart,and can provide100%throughput under uniform and unbalanced traffic.4. Previously proposed dispatching schemes for MSM Clos-network switcheslack efficient support for multicast traffic forwarding. In this paper, we propose a novelscheduling scheme, called Multi-/Unicast Static Round Robin Dispatching (MUSRRD).The proposed scheme allocates a small number of virtual output module queues ateach input module for multicast traffic, which isolate multicast traffic from unicast one.Multicast cells are replicated at the IMs based on address copy technology, which e-liminates HoL blocking and improves the throughput performance. Compared with theunicast dispatching algorithm, MUSRRD scheme does not increase any schedulingoverhead between stages, thus only the slave arbiters of the input module need to beredesigned. The proposed scheme keeps the simplicity and efficiency of the SRRDalgorithm in the pointers’ initializing and updating, and is fair and easy to implement.
Keywords/Search Tags:Three-stage Clos network, Load-Balanced Birkhoff-von Neu-mann switch, Communication Overhead, Maximal-Weight Matching, MulticastScheduling, In-Order Delivery
PDF Full Text Request
Related items