Data Management In Peer-to-Peer Systems

Posted on: 2005-04-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W N Qian
Full Text: PDF
GTID: 1118360125967272
Subject: Software and theory
Abstract/Summary:
With the growing use of the Internet in real-life applications and the development of data management technologies, more and more computing nodes share their data or services with other nodes. At the same time, sensor networks play an increasingly important role in monitoring applications, owing to the maturing of data collection techniques and the continuing decrease in hardware cost. Both large-scale data or service sharing and sensor network data transmission or processing applications are characterized by two features: each node can play the role of a client (i.e., service consumer), a server (i.e., service provider), or both; and a node may establish a connection or communicate with any other node at the application level. Systems with these two features are generalized as systems following the peer-to-peer (P2P) model.

This thesis is devoted to data management in P2P environments. It studies data and query routing, locating and search, query answering, indexing, and data placement techniques that support complex query processing under a unified query processing system framework, CON-QuerP, in dynamic peer-to-peer environments with large-scale distributed, autonomous yet symmetric peers. The major contributions of this thesis are:

1. A query processing system framework for unstructured peer-to-peer environments, called CON-QuerP, is proposed. The framework enables query processing over peers, based on an unstructured peer-to-peer platform and the back-end relational database engine of each peer. CON-QuerP differs from existing systems in that its collaborative view mechanism and its distributed hash table (DHT) based resource locating and search mechanism, named CON, provide more efficient query processing with the help of multi-granularity views in a collaborative environment.

2. A search technique that takes advantage of the small-world phenomenon, called SHINOV, is investigated.
SHINOV combines similarity-based heuristic search with the node-to-visit (NOV) control technique, which determines the degree of query broadcast from the similarity between the query and the data objects shared on a peer, replacing the widely used time-to-live (TTL) technique. Under the assumption that peer-to-peer networks exhibit the small-world phenomenon, SHINOV outperforms traditional breadth-first search (BFS). Simulation shows that SHINOV has an advantage over BFS in a dynamic peer-to-peer environment with autonomous peers, especially when the number of searchable topics is increasing and the topics being searched are changing.

3. A new concept, clustering-based query answering (CBQA), is introduced, which generalizes the problems of nearest-neighbor search and data clustering. A general approach to evaluating CBQA in peer-to-peer systems is proposed. It is proved that, under certain conditions, this approach generates exactly the same result as in the centralized setting. Three concrete algorithms that instantiate the proposed CBQA evaluation approach are designed for three contexts: k-nearest-neighbor search, distance-based clustering, and density-based clustering, respectively. The consistency of the results obtained by these three algorithms with those obtained in a centralized fashion is proved, and their respective communication and computation costs are analyzed. These algorithms can be combined with the SHINOV search technique in the implementation of CON-QuerP, as basic communication and routing techniques supporting the data object search tasks issued by the high-level query processing module.

4. Negotiation-based techniques for materialized view selection are studied to support SQL-like query processing. Under the CON-QuerP framework with its collaborative view mechanism, a cost model for evaluating the network transmission cost of query plans is given. The cost of the query plans is estimated cheaply using an empirical method.
The cost model is combined with CON in the negotiation between requesters, i.e., the querying peers, and coordinators, which are automatically selected peers in CON. The negotiation pr...
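
For the k-nearest-neighbor case mentioned in contribution 3, the idea that a distributed evaluation can return the same answer as a centralized one can be illustrated with a minimal Python sketch. This is not the thesis's algorithm, which the abstract does not spell out. It assumes a hypothetical coordinator that can contact every peer directly, Euclidean distance, and points stored as tuples; the names `local_knn`, `distributed_knn`, and `centralized_knn` are illustrative only. The sketch relies on a simple fact: every point in the global top k is also in the top k of the peer that holds it, so merging the per-peer answers loses nothing.

```python
import heapq
import math


def _rank(query):
    # Sort key: distance to the query, with the point itself as a
    # deterministic tie-breaker (an assumption made for this sketch).
    return lambda p: (math.dist(p, query), p)


def local_knn(points, query, k):
    """Return the (at most) k points held by one peer closest to the query."""
    return heapq.nsmallest(k, points, key=_rank(query))


def distributed_knn(peers, query, k):
    """Collect each peer's local top k, then keep the best k overall."""
    candidates = []
    for points in peers:
        # Each peer ships at most k candidates to the coordinator.
        candidates.extend(local_knn(points, query, k))
    return heapq.nsmallest(k, candidates, key=_rank(query))


def centralized_knn(peers, query, k):
    """Reference answer: pool all data in one place and search it."""
    all_points = [p for points in peers for p in points]
    return heapq.nsmallest(k, all_points, key=_rank(query))


# Hypothetical data: three peers, each holding a few 2-D points.
peers = [
    [(0.0, 0.0), (5.0, 5.0), (1.0, 1.0)],
    [(0.5, 0.5), (9.0, 9.0)],
    [(2.0, 2.0), (0.1, 0.2), (7.0, 1.0)],
]
query = (0.0, 0.0)

result = distributed_knn(peers, query, k=3)
assert result == centralized_knn(peers, query, k=3)
print(result)
```

In this sketch each peer sends at most k candidates, so the coordinator receives at most k times the number of peers points, however much data each peer holds. Reaching the peers without contacting all of them is the job of a search technique such as SHINOV and is not modeled here.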
Keywords/Search Tags: peer-to-peer computing, query processing, search, clustering, view selection