Font Size: a A A

Managing large scale distributed data with peer-to-peer search trees

Posted on:2008-07-19Degree:Ph.DType:Thesis
University:Princeton UniversityCandidate:Zhang, ChiFull Text:PDF
GTID:2448390005978328Subject:Computer Science
Abstract/Summary:
Massive data sets have been increasingly shared by distributed users following the success of broad-band Internet services. Indexing and complex queries are essential for such data collections in applications like multimedia retrieval, data mining, and spatial databases. Traditional service infrastructures rely on centralized management and query processing. This approach does not scale with the rapidly increasing amount of application data available on massively distributed systems like the Internet. In recent years, peer-to-peer computing has emerged as a promising model for building large, scalable systems. A number of peer-to-peer systems have been proposed by the research community, typically taking the form of Distributed Hash Tables (DHT). Such systems provide excellent scalability and resilience for storage applications. However, they do not provide complex query capabilities beyond exact matching. This thesis proposes Brushwood, a peer-to-peer infrastructure for building distributed search trees to support complex queries. Users define how their tree nodes handle index queries and insertion requests. The execution of query/insertion operations and maintenace of the search tree are implemented by Brushwood infrastructure, which guarantees logarithmic cost with decentralized tree navigation and maintenance, even in the face of skewed data sets and unbalanced tree structures. The platform is flexible to support diversified applications. I present two examples in this thesis: an index service for high dimensional data, and a content-based publish/subscribe system. The high-dimensional index supports range queries and nearest neighbor search based on Brushwood tree navigation. The publish/subscribe system supports complex sementics for matching dynamic events to user subscriptions. They use significantly different index structures built upon the same Brushwood platform. Experimental results demonstrated the versatility and scalability of the systems.
Keywords/Search Tags:Data, Distributed, Index, Tree, Peer-to-peer, Search, Systems, Complex
Related items