
Highly Scalable Switches And Network Topologies For Large Scale Data Center

Posted on: 2014-02-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J F Zhang
Full Text: PDF
GTID: 1228330395996909
Subject: Computer system architecture
Abstract/Summary:
To meet increasing business demands, the scalability of data centers has recently attracted growing attention. In a data center, the interconnection network is the “skeleton”, and the switches are the “joints” of that skeleton. The structures of existing data centers are already fixed and are hard to change, and their original designs may not scale well. Such data centers are usually enlarged by replacing the switching equipment, so that a data center can include many more computing units without changing its topology. In contrast, for a data center that is yet to be built, a well-designed topology is undoubtedly an important factor. These two aspects motivate our work on scalable switches and on topology structures for data center interconnection networks.

For switches: the contention-tolerant crossbar switch, CTC(N), is a novel and agile switch architecture proposed in previous work on this topic. Through the contention-tolerant mechanism, the schedulers in CTC(N) can be fully distributed over all input ports, which decouples the control. This feature greatly reduces scheduling and wiring complexity, so that larger switches can be built easily. However, previous work has shown that the asymmetry of the switching fabric creates a throughput bottleneck.

In this dissertation, we first propose an improved contention-tolerant crossbar switch architecture named the Diagonalized Contention-Tolerant Crossbar Switch, DiaCTC(N) for short. DiaCTC(N) fully inherits the advantage of distributed control from CTC(N), but it overcomes the throughput bottleneck by modifying the link pattern in the switch fabric. Simulation results show that the performance of DiaCTC(N) is significantly improved at no additional cost.
We note that all of the scheduling algorithms can be run on DiaCTC(N) without any changes.

Second, we find that interceptions in CTC(N), which are caused by the contention-tolerant mechanism, lead to an out-of-sequence problem: cells may arrive at the output ports out of their original order. This problem is present in both CTC(N) and DiaCTC(N). We propose a new fully distributed scheduling algorithm, called the Self-Adjust scheduling algorithm, which effectively avoids out-of-sequence delivery. Simulation results indicate that DiaCTC(N) achieves high throughput when running the Self-Adjust scheduling algorithm.

Third, building on the loosely coupled control of the Contention-Tolerant Crossbar (CTC), we investigate two large-scale switching architectures constructed from small-scale CTCs: the Parallelized Contention-Tolerant Crossbar Switch, denoted PCTC(N), and the Two-stage Contention-Tolerant Crossbar, denoted TCTC(N, k). In PCTC(N), the entire fabric is partitioned into several non-overlapping regions, each of which is a small CTC(N) switch module. All regions operate independently and in parallel, which improves switching performance. Both theoretical analysis and simulations under various traffic models show that PCTC(N) can achieve high throughput. In TCTC(N, k), the switching process is split into two stages, an input stage and an output stage, each consisting of multiple CTC modules. The parameter k is the number of input/output ports within each CTC module of the input/output stage. There is one path between the input and output stages. When a cell arrives at an input module through an input port, it is switched to the corresponding output module according to its destination output number. The cell is then sent to its destination output port through the output module. By developing a queueing model, we find that TCTC(N, k) can achieve 100% throughput when k is far less than N.
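The two-stage path described above can be sketched as a simple port-to-module mapping. This is only an illustration under stated assumptions, not the dissertation's implementation: the function name `tctc_route` is hypothetical, ports are assumed to be numbered contiguously from 0, k is assumed to divide N, and each module is assumed to own k consecutive ports. Scheduling, queueing, and interceptions are not modeled.

```python
def tctc_route(src_port: int, dst_port: int, n: int, k: int):
    """Return (input_module, output_module, local_output_port) for one cell.

    Assumptions (not taken from the source): ports are numbered 0..n-1,
    k divides n, and module j owns ports j*k .. j*k + k - 1.
    """
    if k <= 0 or n % k != 0:
        raise ValueError("this sketch assumes k > 0 and k divides N")
    if not (0 <= src_port < n and 0 <= dst_port < n):
        raise ValueError("port numbers must lie in range(N)")
    input_module = src_port // k    # input-stage module that receives the cell
    output_module = dst_port // k   # output-stage module chosen by destination
    local_output = dst_port % k     # port index inside that output module
    return input_module, output_module, local_output


if __name__ == "__main__":
    # A cell entering port 5 of a 16-port switch built from 4-port modules,
    # destined for output port 14.
    print(tctc_route(5, 14, n=16, k=4))
```

With N = 16 and k = 4 there are four modules per stage, so a cell only ever contends inside two 4-port modules rather than across all 16 ports.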
TCTC(N, k) also reduces the average number of interceptions and the latency caused by the out-of-sequence problem, which makes it suitable for constructing large-scale switching architectures.

For topology structure: with a well-designed network topology, a data center should connect as many computing units as possible and be able to expand quickly to meet growing business requirements. It also needs an appropriately short diameter to guarantee communication performance.

Our studies are based on hypergraph theory and combinatorial block design theory. First, we design an innovative data center (DC) topology named the Multi-layer Steiner Triple System, abbreviated MSTS. We prove theoretically that MSTS can connect many more computing units and can be expanded at a double-exponential rate. At the same time, MSTS keeps a short diameter to guarantee communication performance. MSTS can be implemented with a number of small-scale commercial connection devices without sacrificing communication performance, which greatly reduces cost.

Second, we present an efficient construction method and a fault-tolerant routing algorithm for MSTS that exploit its characteristics. We examine the faulty-path problem caused by computing-unit failures in a simulation environment. The results indicate that computing-unit failures do produce faulty paths, but the increase in faulty paths is smaller than the increase in failed computing units. This means that communication in MSTS is uniform and that no bottleneck exists.

Third, we develop a hypergraph model of DC topology and convert the problem of constructing a data center into the problem of constructing a hypergraph. The 3-step method can construct a topology with diameter 2. In this dissertation, we give a proof of the number of disjoint paths between any pair of nodes in a hypergraph produced by the 3-step method.
Further, we propose a general 3-step method and prove theoretically that it can construct larger topologies with diameter 2.

Last but not least, we construct indirect hypergraphs using merging methods. Whether the 3-step method or the general 3-step method is used, the result is a direct network. Every node in a direct network not only processes data but also carries communication traffic. When a node fails, the communication links are affected as well. With this in mind, we propose a merging method: multiple direct networks produced by the 3-step method or the general 3-step method are merged to construct an indirect network with diameter 5. For different optimization goals we design different merging methods, and these merging methods form a family of topology structures. The family offers trade-offs among inter-cluster communication capacity, reliability, the number of switching devices, and the number of computing units.
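The abstract does not say how MSTS layers its Steiner triple systems, so the following is only a sketch of the basic building block: a single Steiner triple system, in which every pair of points lies in exactly one triple. It uses the standard Bose construction for n = 6t + 3 points, which is a textbook method and is not claimed to be the construction used in the dissertation. The function names `bose_sts` and `is_steiner_triple_system` are hypothetical.

```python
from itertools import combinations


def bose_sts(n: int):
    """Build a Steiner triple system on points 0..n-1 (requires n % 6 == 3)."""
    if n % 6 != 3:
        raise ValueError("the Bose construction needs n = 6t + 3")
    m = n // 3                 # m = 2t + 1 is odd
    half = (m + 1) // 2        # multiplicative inverse of 2 modulo m

    def point(x: int, level: int) -> int:
        # Encode the pair (x, level), with level in {0, 1, 2}, as one integer.
        return x + m * level

    # Type 1: one "vertical" triple per x, joining the three levels.
    triples = [frozenset(point(x, lv) for lv in range(3)) for x in range(m)]
    # Type 2: two points on one level plus their "midpoint" on the next level.
    for lv in range(3):
        for x, y in combinations(range(m), 2):
            z = ((x + y) * half) % m   # idempotent commutative quasigroup
            triples.append(
                frozenset({point(x, lv), point(y, lv), point(z, (lv + 1) % 3)})
            )
    return triples


def is_steiner_triple_system(n: int, triples) -> bool:
    """Check that every pair of the n points lies in exactly one triple."""
    seen = set()
    for t in triples:
        if len(t) != 3:
            return False
        for pair in combinations(sorted(t), 2):
            if pair in seen:       # a pair covered twice violates the design
                return False
            seen.add(pair)
    return len(seen) == n * (n - 1) // 2


if __name__ == "__main__":
    system = bose_sts(9)
    print(len(system), is_steiner_triple_system(9, system))
```

If each triple is read as a hyperedge joining three points, any two points share exactly one hyperedge, which is the kind of short-diameter property a hypergraph-based topology can build on; how the dissertation maps triples to switches and computing units is not stated in the abstract.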
Keywords/Search Tags: Data Center, Switches, Network Topology, Scalability