Systems for Improving Internet Availability and Performance

Posted on: 2013-04-23
Degree: Ph.D.
Type: Dissertation
University: University of Washington
Candidate: Katz-Bassett, Ethan B
Full Text: PDF
GTID: 1458390008985870
Subject: Computer Science
Abstract/Summary:
The Internet's role in our lives continues to grow, but it often fails to provide the availability and performance demanded by our increasing reliance on it. The problems largely stem from the fact that network operators at an Internet service provider (ISP) have little visibility into, and even less control over, the routing of other ISPs on which they depend to deliver global connectivity.

In this dissertation, I design, build, and evaluate practical distributed systems that ISPs can use today to understand availability and performance problems. I develop reverse traceroute, a system to measure reverse paths back to the local host from arbitrary destinations. While tools have long existed to measure the forward direction, the reverse path has been largely opaque, hindering troubleshooting efforts. I show how content providers such as Google could use reverse traceroute to troubleshoot their clients' performance problems. The rest of the dissertation focuses on long-lasting routing outages. My measurements show that they occur frequently and contribute significantly to overall unavailability. To address these long-term problems, I develop a system, LIFEGUARD, for automatic failure localization and remediation. First, the system builds on reverse traceroute to locate faults, even in the presence of asymmetric paths and failures. Second, I develop a technique that enables edge ISPs to steer traffic destined to them around failures, without requiring the involvement of the network causing the failure. Deploying LIFEGUARD on the live Internet, I find that it can effectively route traffic around particular ASes without causing widespread disruption.
Keywords/Search Tags:Internet, Availability, Performance, System