
P2P Based Research On The Hyperrectangle-Range Query Of High-Dimensional Data

Posted on: 2010-11-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Zhang
Full Text: PDF
GTID: 2178360302460893
Subject: Computer application technology
Abstract/Summary:
In recent years, the growth of the Internet has produced a large number of P2P systems, and P2P technology has become an active research topic. P2P networks were originally conceived for sharing multimedia files, which creates a need for multimedia file retrieval. Typically, users want to fetch multimedia files whose attributes fall within given ranges, i.e. a hyperrectangle-range query over high-dimensional data.

This paper studies information retrieval in structured P2P environments and indexing algorithms for high-dimensional data. We identify the following issues: current structured P2P networks support complex queries such as range queries poorly; dimensionality reduction methods and approximate vector techniques cause many false hits in high-dimensional data retrieval; and the super-sphere retrieval method cannot restrict the query range on each individual dimension. To address them, this paper proposes a hyperrectangle-range retrieval method for the Chord system based on the clustering-pyramid. First, high-dimensional data are converted into one-dimensional index values using the clustering-pyramid technique. Second, every index value is assigned a unique identifier using a locality-preserving hash function. Finally, the P-Chord system is built by storing the identifiers and iMinMax index values in Chord nodes. A data filtering strategy and a range query algorithm are given on that basis. Experimental results show that P-Chord reduces false hits and improves precision.

In unstructured P2P networks, range queries under the Gnutella protocol generate a large number of query messages, and the super-sphere retrieval method cannot set the retrieval range on each dimension. To tackle these problems, a simplified cluster-pyramid index (SCPI) is introduced. High-dimensional data points are mapped to pyramids and a routing table is added to the index, so each node can not only execute range queries but also forward query requests to the most stable nodes. On this basis, a range query algorithm and a network self-configuration algorithm are given, and an update strategy for the index is proposed. Experimental results show that the SCPI approach improves query performance and reduces the system cost of constructing and maintaining the index.
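
The mapping from high-dimensional points to one-dimensional index values can be illustrated with the basic Pyramid Technique, on which cluster-pyramid indexes are built. The following is only a minimal sketch under stated assumptions, not the thesis's algorithm: it omits the clustering step, assumes data normalized to [0, 1]^d, computes conservative (not tight) candidate intervals, and uses a plain sorted list in place of Chord nodes. The function names and the `chord_key` scaling are hypothetical.

```python
import bisect

def pyramid_value(point):
    """Basic Pyramid Technique: map a point in [0, 1]^d to i + h, where i is
    the pyramid number (0 .. 2d-1) and h is the height within that pyramid."""
    d = len(point)
    j_max = max(range(d), key=lambda j: abs(0.5 - point[j]))
    i = j_max if point[j_max] < 0.5 else j_max + d
    return i + abs(0.5 - point[j_max])

def chord_key(pv, d, bits=16):
    """Hypothetical order-preserving map from [0, 2d) to a bits-wide identifier
    (a stand-in for a locality-preserving hash; an assumption of this sketch)."""
    return min(int(pv / (2 * d) * (1 << bits)), (1 << bits) - 1)

def _min_dist_to_center(lo, hi):
    """Smallest |0.5 - x| over x in [lo, hi]."""
    if lo > 0.5:
        return lo - 0.5
    if hi < 0.5:
        return 0.5 - hi
    return 0.0

def candidate_intervals(q_min, q_max):
    """Conservative 1-D intervals that cover every point inside the query
    hyperrectangle [q_min, q_max]; they may also contain false hits."""
    d = len(q_min)
    dists = [_min_dist_to_center(q_min[j], q_max[j]) for j in range(d)]
    intervals = []
    for i in range(2 * d):
        k = i % d
        if i < d:                      # pyramid on the low side of dimension k
            if q_min[k] > 0.5:
                continue
            h_lo, h_hi = 0.5 - min(q_max[k], 0.5), 0.5 - q_min[k]
        else:                          # pyramid on the high side of dimension k
            if q_max[k] < 0.5:
                continue
            h_lo, h_hi = max(q_min[k], 0.5) - 0.5, q_max[k] - 0.5
        # The height is at least the distance to the center in every other dimension.
        h_lo = max([h_lo] + [dists[j] for j in range(d) if j != k])
        if h_lo <= h_hi:
            intervals.append((i + h_lo, i + h_hi))
    return intervals

def build_index(points):
    """Sorted list of (pyramid value, point); stands in for the distributed index."""
    return sorted((pyramid_value(p), p) for p in points)

def range_query(index, q_min, q_max):
    """Scan the candidate intervals, then filter out false hits exactly."""
    keys = [pv for pv, _ in index]
    hits = []
    for lo, hi in candidate_intervals(q_min, q_max):
        start = bisect.bisect_left(keys, lo)
        end = bisect.bisect_right(keys, hi)
        for _, p in index[start:end]:
            if all(a <= x <= b for a, x, b in zip(q_min, p, q_max)):
                hits.append(p)
    return hits

if __name__ == "__main__":
    pts = [(0.1, 0.8, 0.3), (0.6, 0.55, 0.45), (0.9, 0.2, 0.7)]
    index = build_index(pts)
    print(range_query(index, (0.5, 0.5, 0.4), (0.7, 0.6, 0.5)))
```

The final filtering pass in `range_query` plays the role of a false-hit filter: the one-dimensional intervals only narrow the search, and the per-dimension comparison decides the actual result.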
Keywords/Search Tags: hyperrectangle-range query, Chord, Cluster-pyramid, Network self-configuration