A scalable content-addressable network

Posted on:2003-03-20

Degree:Ph.D

Type:Thesis

University:University of California, Berkeley

Candidate:Ratnasamy, Sylvia Paul

Full Text:PDF

GTID:2468390011482002

Subject:Computer Science

Abstract/Summary:

PDF Full Text Request

In May 1999, Shawn Fanning, then a freshman at Northeastern University, launched the first “peer-to-peer” or P2P file-sharing application—Napster. Napster allowed individual end-users (called peers) to share the MP3-encoded music stored on their local computers directly with one another over the Internet. Within a year, Napster had grown to a user population of over 50 million users making it the fastest growing Internet application to date. Three years later, despite the closure of Napster, the phenomenon of file-sharing continues its dramatic growth and appears set to remain an important feature of the Internet for the foreseeable future. The sheer scale of these file-sharing applications make them important in their own right. And yet, as this thesis will argue, P2P is much more than just a way to trade MP3s over the Internet. The P2P architecture with its use of low-cost, grass-roots resources and its decentralized nature that does not rely on any form of centrally managed infrastructure, represents a significant departure from the client-server architecture of the Web. These unique characteristics, we believe, allow P2P systems to support the rapid and low-cost deployment of powerful large-scale applications in a manner that would not be possible with the current architecture of the Web.; There are two key pieces to a P2P system: the lookup mechanism used to locate a desired file and the actual file download. The decentralized storage in P2P systems makes the file transfer process inherently scalable; the hard part is finding the peer(s) from which to retrieve the desired file. This thesis addresses this problem of scalable indexing in P2P systems, i.e., given a file identifier, how can we find the IP address of the peer(s) holding the file? Ideally, a solution to this indexing problem must be scalable to millions of users, must find files quickly, and must be resilient to the frequent arrival and departure of participant peers. As a solution, we introduce the concept of a Content-Addressable Network (CAN) as a distributed system that provides hash table functionality—mapping “keys” onto “values”—on Internet-like scales. Our CAN design is completely distributed (requiring no form of centralized control, coordination or configuration), scalable (nodes maintain only a small amount of control state that is independent of the number of nodes in the system), and fault-tolerant (nodes can route around failures).; The Distributed Hash Table (DHT) functionality supported by CAN serves as a useful substrate for a range of large distributed systems; for example, Internet-scale facilities such as global file systems, application-layer multicast, event notification, and chat services can all be layered over a DHT system such as CAN.

Keywords/Search Tags:

P2P, CAN, File, Scalable, Napster, Over

PDF Full Text Request

Related items

1	Scalable I / O Research And Reference Implementation
2	A study for scalable directory in parallel file systems
3	Research On Key Issues Of Scalable Distributed File System Metadata Service In Large-scale Networked Storage Systems
4	Scalable And Unified Benchmarking Framework For Distributed File Systems
5	Media Store Of Scalable Video
6	Scalable file systems and operating systems support for big data applications
7	Scalable capability-based authorization for high-performance parallel file systems
8	The Research On Scalable Metadata Service Of Parallel File System For High Performance Computing Systems
9	Research Of Internet File Storage Service System
10	Live Streaming System Based On P2p Architecture Design And Realization