Research On High-dimensional Index Structures Of Large Data

Posted on:2010-05-17

Degree:Master

Type:Thesis

Country:China

Candidate:J Y Yu

Full Text:PDF

GTID:2178360275956556

Subject:Applied Mathematics

Abstract/Summary:

As a rich and intuitive medium, images are widely used in many areas, such as digital libraries, geographic information systems, DNA databases in bioinformatics, and medical diagnosis. Fast content-based similarity search in large image databases is therefore becoming increasingly important. High-dimensional indexing is both a fundamental problem and an active research topic in content-based similarity retrieval, so the study of high-dimensional index structures for large data sets has theoretical and practical significance. However, because of the "curse of dimensionality", the performance of traditional index structures drops drastically as the data dimension grows. To address these problems, this thesis takes the high-dimensional data of massive image databases as its background and focuses on the high-dimensional nature of image features. It analyzes experimentally the distance distribution of high-dimensional image data and, on this basis, designs a new high-dimensional index structure, the KVP-tree. The main contents are as follows.

First, feature vectors of different types and different dimensions are extracted from classified image libraries and from a mixed image library, and are then normalized. The distance between every pair of images in the database is calculated, and the distance distribution of the high-dimensional data is obtained: distances in high-dimensional space have a large mean and a small variance, that is, the distance distribution is concentrated. It follows that a high-dimensional index structure based on a "balanced tree" is not necessarily the best option.

Second, a novel high-dimensional index structure, the KVP-tree, is proposed. It improves on the VP-tree by combining the K-means clustering algorithm with the node structure of the M-tree. The design ideas, the node structure, the tree construction process, and the query method of the KVP-tree are introduced, and its performance is then analyzed experimentally. A detailed comparison of the query performance of the KVP-tree and the VP-tree shows that the KVP-tree improves the output capacity of nodes, reduces the number of distance calculations, and improves query efficiency.
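The distance-concentration observation in the first part of the abstract can be illustrated with a small sketch. It does not reproduce the thesis experiments: it assumes synthetic points drawn uniformly from the unit hypercube instead of image feature vectors, uses the Euclidean distance, and the function name `pairwise_distance_stats` is hypothetical. Under these assumptions, the mean pairwise distance grows with the dimension while the spread relative to the mean shrinks.

```python
import math
import random
import statistics


def pairwise_distance_stats(num_points, dim, seed=0):
    """Return (mean, standard deviation) of all pairwise Euclidean distances.

    Assumption: points are sampled uniformly from [0, 1]^dim. The thesis
    works with image feature vectors, which are not available here.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(num_points)]
    distances = [
        math.dist(points[i], points[j])
        for i in range(num_points)
        for j in range(i + 1, num_points)
    ]
    return statistics.mean(distances), statistics.pstdev(distances)


if __name__ == "__main__":
    # Compare a low, a medium, and a high dimension.
    for dim in (2, 16, 128):
        mean, std = pairwise_distance_stats(num_points=200, dim=dim)
        print(f"dim={dim:4d}  mean={mean:.3f}  std={std:.3f}  std/mean={std / mean:.3f}")
```

When the ratio `std/mean` is small, most points lie at nearly the same distance from any chosen vantage point, so a distance-based partition prunes few candidates. This is one plausible reading of why a balanced split may not be the best choice for such data.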

Keywords/Search Tags:

High-dimensional index structures, feature extraction, K-means, VP-tree, M-tree

Related items

1	Research On High-Dimensional Data Tree Index Based On Soft Subspace
2	NAQ-tree: Effective index structure for similarity search in high dimensional space
3	R-Tree Index Construction Of Dynamic K-Means Algorithm
4	The Three-Dimensional Index Structure Of R~*-tree Based On The Minimum Bounding Box And The Adaptive Clustering
5	Image Retrieval Method Based On Depth Learning Feature Extraction And Tree-hash Mixed Index
6	K-means Based On Binary And Svm Decision Tree Algorithm Of Data Mining Research
7	Research On Feature Selection Algorithms Based On Decision Tree For High-dimensional Data
8	Research On Multidimensional Cloud Data Index Structure Based On KD Tree And R Tree
9	Design And Implementation Of The Auto-adapted And Dynamic-balanced Spatial Index-QER~+-tree
10	Research On Algorithms Of High-dimensional Multimedia Data Indexing