Scalable distribution of data across autonomous systems

Posted on: 2002-01-25
Degree: Ph.D
Type: Thesis
University: University of Illinois at Urbana-Champaign
Candidate: Gupta, Vijay Shivshanker
Full Text: PDF
GTID: 2468390011998026
Subject: Computer Science
Abstract/Summary:
In the past few years, the Internet has emerged as one of the primary means for distribution of information. The Internet is composed of several thousand autonomous systems, each of which is administered independently. Today, most distributed applications use unicast pull to distribute data across autonomous system (AS) boundaries. This does not favor applications that need to quickly distribute, or multicast, data and event information from a machine in one AS to several machines in other ASes. The primitive for multicasting on the current Internet, namely IP Multicast, is unsuitable for several reasons. First, IP Multicast is unreliable. Second, Internet Service Providers do not provide IP Multicast service because IP Multicast allows any sender to flood any multicast session. Third, IP Multicast requires flooding of session information and large routing tables. To gain a better understanding of the issues involved in quick large-scale data distribution, this thesis examines three Internet-scale problems: (i) freshness of search engines, (ii) web cache consistency, and (iii) distribution of video.

The first two problems can benefit from quicker propagation of update information from the millions of web servers to search engines and web proxies, respectively. The third problem can profit from the creation of inter-AS multicast distribution trees. The unifying theme is that all three problems deal with large-scale distribution requiring coordination among (potentially) thousands or millions of principals. Based on the common requirements for solving the three example problems, we propose the Global Rendezvous Architecture (GRA), an architecture for creating the control plane that facilitates quick data distribution across AS boundaries. GRA makes use of application-layer centralized entities called middlemen to facilitate Internet-scale multicast of control messages. The middlemen facilitate the creation of distribution trees for control messages, and can also help with the creation of the data plane.

We use GRA to develop architectures for the problems of search engine freshness, web cache consistency, and video distribution. For search engine freshness and web cache consistency, we first present the design and implementation of algorithms for web servers. Second, we present the design and implementation of architectures at middlemen to demonstrate the feasibility of using a single middleman for the entire Internet.

For the video distribution problem, we first propose a solution approach to mitigate the flooding concerns that ISPs have with IP Multicast. Second, we show how a few (potentially competing) middlemen can be used for creating inter-AS distribution trees for large-scale video distribution.
Keywords/Search Tags:Distribution, IP multicast, Data, Web cache consistency, Internet, Across, Autonomous, Middlemen