Font Size: a A A

Probabilistic methods for Web caching and performance prediction of IP networks and Web farms

Posted on:2004-09-11Degree:Ph.DType:Thesis
University:Stanford UniversityCandidate:Psounis, KonstantinosFull Text:PDF
GTID:2468390011958447Subject:Computer Science
Abstract/Summary:
Over the past ten years, the Internet has grown from a small-scale research network to an immense world-wide infrastructure. But its great success is also the source of overwhelming problems; it is very hard to design scalable algorithms for, or analyze and predict the performance of such a large-scale, heterogeneous network.; Traditional methods often yield complex, non-scalable solutions to these problems. To reduce the implementation complexity, one may employ probabilistic algorithms that require only a small random sample of the “state” and input. When this random sample has enough information for achieving good performance, the outcome is a large reduction in implementation complexity for a small sacrifice in performance. This thesis uses this approach in two important areas of Internet research: web caching and performance prediction of networks and web farms.; The first part the thesis investigates how to improve the performance and reduce the complexity of web caching. First, a method to randomize any web cache replacement policy is designed, leading to reduced-complexity implementations at a small performance cost. The novelty of the method is a general procedure to improve the quality of a sample without increasing its size. Second, a parsimonious model for generating synthetic web traces is introduced, which provides insights for optimally designing replacement and prefetching algorithms. Third, a scalable scheme for exploiting the redundancy in web documents that change frequently over time is designed.; The second part of the thesis reduces the complexity of performance prediction of large systems by introducing a scalable, widely applicable method called SHRiNK (Small-scale High-fidelity Reproduction of Network Kinetics). The method uses a sample of the actual traffic to drive a small-scale replica of the original system. Then, it exploits a scaling law to extrapolate from the performance of the replica to that of the original system. Network simulations and web farm experiments show that SHRiNK accurately predicts the distribution of a large number of performance measures of the original system by the small-scale replica. A theoretical argument reveals that the method is widely applicable for any network topology, flow transfer protocol, and queue management scheme.
Keywords/Search Tags:Network, Performance, Method, Web, Small-scale
Related items