A Two-dimensional Index Structure Based P2P Query Of Multi-dimensional Data

Posted on:2010-01-16

Degree:Master

Type:Thesis

Country:China

Candidate:H B Lu

GTID:2178360302960682

Subject:Computer application technology

Abstract/Summary:

With the development of network technology in recent years, many P2P systems have emerged, and P2P technologies are receiving more and more attention. They are mainly used in information retrieval, file sharing, distributed computing, electronic commerce and so on. Information retrieval, the primary means of searching for information on the web, is currently the most common application of P2P technologies.

High-dimensional data has long been an active research topic in the database field, with many practical applications such as data mining and multimedia information retrieval. Similarity retrieval is a critical problem: finding the data in a large data set that are most similar to a given object. In high-dimensional retrieval, distance calculation is an important factor affecting retrieval efficiency. To reduce the number of distance calculations, several solutions have been proposed in recent years, mainly based either on an approximate vector representation or on a one-dimensional index for the data. The former finds an approximate vector representation of the high-dimensional data in order to simplify the search space, as in the VA-file. The latter transforms the high-dimensional data into one-dimensional values so as to reduce the effects of dimensionality; a typical representative of this approach is iDistance.

Unlike the low-dimensional spaces we are familiar with, high-dimensional space has its own characteristic data distribution: it is virtually hollow, which prevents most multivariate density estimation methods from producing accurate results. This is because low-density regions account for a significant portion of the distribution volume, while high-density regions lack sufficient observations.
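
The one-dimensional indexing idea mentioned above can be illustrated with the key usually attributed to iDistance: each point is assigned to its nearest reference point, and its key is the partition number times a constant plus the distance to that reference point. The following is a minimal sketch only, not the thesis's implementation; the function name `idistance_key`, the reference points, and the constant `c` are assumptions chosen for illustration.

```python
import math


def idistance_key(point, refs, c):
    """Map a multi-dimensional point to a one-dimensional key.

    point: sequence of coordinates
    refs:  list of reference points (assumed to be chosen beforehand)
    c:     constant assumed larger than any point-to-reference distance,
           so that the key ranges of different partitions do not overlap
    """
    # Find the nearest reference point and the distance to it.
    best_i, best_d = min(
        ((i, math.dist(point, ref)) for i, ref in enumerate(refs)),
        key=lambda pair: pair[1],
    )
    # Partition index selects the key range; distance orders points within it.
    return best_i * c + best_d
```

With reference points `(0, 0)` and `(10, 10)` and `c = 100`, the point `(3, 4)` falls in partition 0 with key `5.0`, and the point `(10, 7)` falls in partition 1 with key `103.0`. A range search on such keys only needs distance calculations for candidates whose keys fall inside the queried interval.
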
Based on an analysis of these distribution characteristics of high-dimensional space, this thesis splits the space into several sub-spaces according to the amount of data, so that the data within each sub-space are evenly distributed. This division into sub-spaces is a vertical division of the data space. Each sub-space is then further divided into district partitions, which is a horizontal division of the data space. After the space is divided, a two-dimensional index value is created for the data set, combining the approximate vector representation with a one-dimensional data index, and the data index is mapped to the identifiers of the peers in the structured P2P network Chord. During retrieval, a two-layer filter is applied to the query, which reduces the number of distance calculations and yields high query performance. The experimental results show that the two-dimensional index structure performs well in both precision and search efficiency.
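
The mapping from an index value to a Chord peer can be sketched as follows. The abstract does not specify how the two-dimensional index value is encoded or mapped, so this example makes loud assumptions: the index is a hypothetical pair `(subspace_id, zone_code)`, it is hashed with SHA-1 into an `m`-bit identifier space (as Chord does for ordinary keys), and the responsible peer is the first peer whose identifier is equal to or follows the key on the ring. A plain hash does not preserve locality, so the thesis may well use a different, order-preserving mapping for range queries.

```python
import hashlib
from bisect import bisect_left

M_BITS = 16  # size of the identifier space in bits (assumption for the sketch)


def index_key(subspace_id, zone_code):
    """Hypothetical string encoding of a two-dimensional index value."""
    return f"{subspace_id}:{zone_code}"


def chord_id(key, m=M_BITS):
    """Hash a string key into the m-bit Chord identifier space using SHA-1."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << m)


def successor(sorted_peer_ids, ident):
    """Return the first peer id >= ident, wrapping around the ring."""
    i = bisect_left(sorted_peer_ids, ident)
    return sorted_peer_ids[i % len(sorted_peer_ids)]


def responsible_peer(sorted_peer_ids, subspace_id, zone_code):
    """Peer that would store the data indexed by (subspace_id, zone_code)."""
    return successor(sorted_peer_ids, chord_id(index_key(subspace_id, zone_code)))
```

For example, with peers `[10, 200, 4000]`, an identifier of `150` is assigned to peer `200`, and an identifier of `5000` wraps around to peer `10`. A real Chord node would locate the successor through finger tables rather than a global sorted list; the list is used here only to keep the sketch short.
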

Keywords/Search Tags:

Range query, Chord, Sub-space, Zone bit code

Related items

1	P2P Based Research On The Hyperrectangle-Range Query Of High-Dimensional Data
2	Research Of High-Dimensional Space Query Algorithm Based On Space-Filling Curves
3	Data Integrity Verification Technology Research And Implementation Of Range Query In Location-based Service
4	Research On Orthogonal Range Query Based On BC-iDistance
5	Chord-based P2p Query Methods
6	The Research And Implementation Of Indexing And Query Techniques Based On Range Query
7	DR-Chord: An Efficient Double-Ring Chord Protocol
8	A Declarative Code Query Technology For Heterogeneous Code Repositories
9	I3 Network Information Query And Storage Mechanism
10	Research On Range Query Authentication Technology In LBS