
Robustness in data center networks

Posted on: 2014-07-05
Degree: Ph.D
Type: Dissertation
University: Polytechnic Institute of New York University
Candidate: Tam, Adrian Sai-wah
Full Text: PDF
GTID: 1458390008958328
Subject: Engineering
Abstract/Summary:
Data center networking is a new subject. The way a data center network is used can differ vastly from the way networks were used over the past decade. Because a data center network has a single owner, adopting new networking technology is easy, and new uses of the network become feasible. Most notably, centralized control of various network functions becomes practical.

In this dissertation, we investigate several ways to improve the robustness of data center networks. We begin at the network layer and look for ways to provide resilience to network failures. The latest technique for mitigating network failures is fast reroute: making use of the large amount of redundant connectivity in a network, it preconfigures detours for different routes so that, whenever a failure occurs, the detour replaces the malfunctioning route. This technique is well developed for unicast routing. However, multicast is also important in data center networks, as it is the only efficient way to disseminate data to many destinations at once. We develop a fast reroute scheme for multicast. Our scheme first handles single-link failures, and we then extend it to cover shared-risk link groups (SRLGs).

Besides resilience to failures, the redundant connectivity in a data center network can also be leveraged to increase throughput. While existing techniques such as equal-cost multipath forwarding use this idea to provide more bandwidth, we can go further to cope with the dynamic behavior of traffic and provide better performance. Taking advantage of the capabilities of the newly developed Convergence Enhanced Ethernet, we investigate dynamic rerouting based on the congestion feedback provided by the network. Dynamic rerouting is used in addition to the prioritized flow control advocated in the Convergence Enhanced Ethernet standard. We see prioritized flow control, which dynamically adjusts the bandwidth provided to traffic, as merely a spectral solution to congestion. We propose dynamic rerouting, which relocates traffic, as a spatial solution. We argue that combining the spectral and spatial solutions yields a significant improvement in network throughput. We evaluate this combination over a wide range of traffic patterns.

An emerging use of the data center network is to provide computation power, and the map-reduce computation pattern is the most favored approach today. However, this use also shows that the old transport layer protocol cannot serve the new purpose. We recently found that the incast traffic pattern, of which map-reduce computation is an example, highlights a robustness problem of TCP: its time-out mechanism sabotages the performance of cluster computation. We carefully examine the design of TCP and find three independent causes of the performance problem under the incast traffic pattern. We also propose three solutions to them, which are surprisingly simple, effective, and compatible with non-incast traffic.

Finally, we move on to the highest layer of the networking stack. Many proposals for data center networking rely on a controller, which then becomes a single point of failure. We accept that many welcome features of data center networks would not be possible without a controller. Interestingly, however, we are the first to address the robustness of the control plane in the data center network. We propose the concept of "devolved controllers": a set of non-identical controllers that provide resilience and load balancing for one another. We use the example of managing dynamic rerouting to show how devolved controllers can replace a single omniscient controller in a data center network. We provide the operation mechanism and configuration algorithm of devolved controllers. The effects of different design parameters are also investigated.
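
As a rough illustration of the fast reroute idea summarized in the abstract, the Python sketch below precomputes one detour per link of a unicast primary path, so that a failed link can be bypassed without recomputing routes at failure time. It is not the multicast or SRLG scheme developed in the dissertation. The function names (shortest_path, precompute_detours), the hop-count BFS routing, and the four-node topology are all assumptions made for this example.

```python
from collections import deque


def shortest_path(adj, src, dst, banned=frozenset()):
    """Hop-count shortest path via BFS, skipping undirected links in `banned`.

    `adj` maps each node to a list of neighbors; `banned` holds links as
    frozenset({u, v}). Returns a list of nodes, or None if dst is unreachable.
    """
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in adj[node]:
            if nxt not in parent and frozenset((node, nxt)) not in banned:
                parent[nxt] = node
                queue.append(nxt)
    return None


def precompute_detours(adj, src, dst):
    """Return the primary path and, for each of its links, a backup path.

    The backup for link (u, v) starts at u, the node that detects the
    failure, and reaches dst without using that link. Illustrative only:
    this is plain unicast, single-link protection.
    """
    primary = shortest_path(adj, src, dst)
    detours = {}
    if primary is None:
        return None, detours
    for u, v in zip(primary, primary[1:]):
        failed = frozenset((u, v))
        detours[(u, v)] = shortest_path(adj, u, dst, banned={failed})
    return primary, detours


# Hypothetical topology: two disjoint two-hop paths between s and t.
example_adj = {
    "s": ["a", "b"],
    "a": ["s", "t"],
    "b": ["s", "t"],
    "t": ["a", "b"],
}
example_primary, example_detours = precompute_detours(example_adj, "s", "t")
```

In this toy topology the redundant path through b is what makes a detour available for both links of the primary path; a link whose removal disconnects the destination would get None as its detour.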
Keywords/Search Tags: Data center network, Different, Robustness, Provide, Controller, New