Research On Similarity Search In Hybrid Peer-to-Peer Systems

Posted on: 2011-12-06
Degree: Master
Type: Thesis
Country: China
Candidate: T L Tang
Full Text: PDF
GTID: 2178360302974670
Subject: Computer application technology
Abstract/Summary:
With the development of multimedia, the web, and related technologies, multimedia databases have become important in many applications. One important line of research models these data objects as high-dimensional vectors and answers similarity queries over them, with uses in content-based video and audio retrieval, data stream matching, digital image processing, and text processing. Because similarity queries are computationally complex, centralized processing leads to overload and performance bottlenecks on a single computer. With the rise of P2P systems, distributed processing has therefore become a focus of research.

This dissertation focuses on the two most important similarity query types: the window-based range query and the k-nearest-neighbor (kNN) query. First, we select a super-peer P2P system as the underlying network overlay and design a distributed query framework. Then, we propose two algorithms for efficiently processing similarity queries over data that is horizontally distributed across the P2P system. In each peer, data is mapped to a one-dimensional key by a dimension reduction algorithm. In the super-peers, a network query tree is built so that a query can reach the whole network in a systematic way, and statistical information about the data is maintained in order to prune the peers accessed by each query. For kNN query processing, a radius estimation algorithm estimates the distance from the query point to its kth nearest neighbor, so that the complex kNN query can be converted into a simple range query.

In the experiments, we use synthetic data, including uniformly distributed and Gaussian-distributed data, to test efficiency and correctness. The results show the high efficiency of the query algorithms and the effectiveness of the radius estimation algorithm for kNN queries.
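The idea of turning a kNN query into a range query and pruning peers by per-peer statistics can be illustrated with a small sketch. This is not the dissertation's algorithm: the `Peer` class, the bounding-box statistics, the uniform-density formula in `estimate_radius`, and the radius-doubling fallback are all assumptions made for illustration, and the one-dimensional key mapping and the network query tree are left out.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class Peer:
    """Hypothetical peer holding a horizontal partition of the data."""

    def __init__(self, points):
        self.points = points
        dims = range(len(points[0]))
        # Assumed statistics a super-peer might keep: a bounding box per peer.
        self.lo = [min(p[d] for p in points) for d in dims]
        self.hi = [max(p[d] for p in points) for d in dims]

    def min_dist(self, q):
        """Smallest possible distance from q to any point inside the box."""
        return math.sqrt(sum(max(l - x, 0.0, x - h) ** 2
                             for x, l, h in zip(q, self.lo, self.hi)))

    def range_query(self, q, r):
        """Local range query: all points within distance r of q."""
        return [p for p in self.points if dist(p, q) <= r]


def estimate_radius(k, n, dim):
    """Assumed estimator: radius of the ball expected to hold k of n points
    spread uniformly over a unit hypercube (not the thesis's estimator)."""
    unit_ball = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)
    return (k / (n * unit_ball)) ** (1.0 / dim)


def knn_via_range(peers, q, k):
    """Answer a kNN query with range queries, skipping peers whose bounding
    box cannot intersect the query ball. The radius is doubled whenever the
    estimate turns out to be too small."""
    n = sum(len(p.points) for p in peers)
    k = min(k, n)
    r = estimate_radius(k, n, len(q))
    while True:
        hits = []
        for peer in peers:
            if peer.min_dist(q) <= r:  # prune peers outside the query ball
                hits.extend(peer.range_query(q, r))
        if len(hits) >= k:
            return sorted(hits, key=lambda p: dist(p, q))[:k]
        r *= 2


if __name__ == "__main__":
    random.seed(1)
    data = [tuple(random.random() for _ in range(4)) for _ in range(400)]
    peers = [Peer(data[i:i + 100]) for i in range(0, 400, 100)]
    print(knn_via_range(peers, (0.5, 0.5, 0.5, 0.5), 3))
```

Once a range query with radius r returns at least k points, the k closest of them are the exact k nearest neighbors, because every point within r of the query was either returned or belongs to a peer whose box lies entirely outside the ball. A good radius estimate therefore matters for cost rather than correctness: too small forces another round, too large contacts more peers than needed.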
Keywords/Search Tags: dimension reduction, super-peer P2P systems, similarity query, high-dimensional data, radius estimation