Research On High-dimensional Index Structures Of Large Data

Posted on: 2010-05-17
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Yu
Full Text: PDF
GTID: 2178360275956556
Subject: Applied Mathematics
Abstract/Summary:
Images are a rich and intuitive medium and are widely used in areas such as digital libraries, geographic information systems, DNA databases in bioinformatics, and medical diagnosis. Fast content-based similarity search over large image databases is therefore becoming increasingly important. High-dimensional indexing is both a fundamental problem and an active research topic in content-based similarity retrieval, so the study of high-dimensional index structures for large data sets has theoretical and practical significance. However, because of the "curse of dimensionality", the performance of traditional index structures drops drastically as the data dimension grows. To address this problem, this thesis takes the high-dimensional data of large image databases as its setting, focuses on the high-dimensional nature of image features, and experimentally studies and analyzes the distance distribution of high-dimensional image data. On this basis, it designs a new high-dimensional index structure, the KVP-tree. The main contents are as follows. First, feature vectors of different types and different dimensions are extracted from classified image libraries and from a mixed image library, and are then normalized.
Then the distance between every pair of images in the database is calculated. Finally, the distance distribution of the high-dimensional data is characterized: distances in high-dimensional space have a larger mean and a smaller variance, that is, the distance distribution is concentrated. It follows that a high-dimensional index structure built as a "balanced tree" is not necessarily the best option.

Second, a novel high-dimensional index structure, the KVP-tree, is proposed as an improvement of the VP-tree; it combines the K-means clustering algorithm with the node structure of the M-tree. The design rationale, node structure, tree construction procedure, and query method of the KVP-tree are introduced, and its performance is then analyzed in depth by experiment. A detailed comparison of the query performance of the KVP-tree and the VP-tree shows that the KVP-tree increases the output capacity (fan-out) of nodes, reduces the number of distance calculations, and improves query efficiency.
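The distance-distribution observation can be illustrated with a small experiment. The sketch below is not the thesis's experiment: it assumes synthetic points drawn uniformly from the unit hypercube instead of real image features, and the function name `pairwise_distance_stats` and its parameters are hypothetical. It computes all pairwise Euclidean distances and reports their mean, standard deviation, and the ratio of the two, which is one common way to quantify how concentrated a distance distribution is.

```python
import math
import random
import statistics
from itertools import combinations


def pairwise_distance_stats(dim, n_points=200, seed=0):
    """Return (mean, std, std/mean) of all pairwise Euclidean distances.

    Assumption: points are sampled uniformly from [0, 1]^dim as a stand-in
    for normalized image feature vectors.
    """
    rng = random.Random(seed)
    points = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    # One distance per unordered pair of points.
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    mean = statistics.fmean(dists)
    std = statistics.pstdev(dists)
    return mean, std, std / mean


if __name__ == "__main__":
    for dim in (2, 8, 32, 128):
        mean, std, ratio = pairwise_distance_stats(dim)
        print(f"dim={dim:4d}  mean={mean:.3f}  std={std:.3f}  std/mean={ratio:.3f}")
```

On this synthetic data the mean distance grows with the dimension while the spread relative to the mean shrinks, so most pairs of points end up at nearly the same distance. Whether real image features behave the same way depends on the features and the normalization used.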
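For context on the baseline, here is a minimal sketch of a classic VP-tree with a range query that counts distance computations, the cost measure mentioned above. It is not the KVP-tree: the abstract does not give the details of the K-means clustering step or the M-tree-style nodes, so they are not reproduced here. The names `VPNode`, `build_vp_tree`, and `range_search` are hypothetical, and the choice of a random vantage point with a median split is an assumption.

```python
import math
import random
import statistics


class VPNode:
    """One vantage point plus the median radius that splits the remaining points."""

    def __init__(self, vantage, radius, inside, outside):
        self.vantage = vantage
        self.radius = radius
        self.inside = inside    # subtree of points with distance <= radius
        self.outside = outside  # subtree of points with distance > radius


def build_vp_tree(points, rng):
    """Build a VP-tree by picking a random vantage point and splitting at the median."""
    if not points:
        return None
    points = list(points)
    vantage = points.pop(rng.randrange(len(points)))
    if not points:
        return VPNode(vantage, 0.0, None, None)
    dists = [math.dist(vantage, p) for p in points]
    radius = statistics.median(dists)
    inside = [p for p, d in zip(points, dists) if d <= radius]
    outside = [p for p, d in zip(points, dists) if d > radius]
    return VPNode(vantage, radius,
                  build_vp_tree(inside, rng), build_vp_tree(outside, rng))


def range_search(node, query, r, found, counter):
    """Collect all points within distance r of query; counter[0] counts distance calls."""
    if node is None:
        return
    d = math.dist(query, node.vantage)
    counter[0] += 1
    if d <= r:
        found.append(node.vantage)
    # Triangle inequality: the inside region can hold a match only if d - r <= radius.
    if d - r <= node.radius:
        range_search(node.inside, query, r, found, counter)
    # The outside region can hold a match only if d + r > radius.
    if d + r > node.radius:
        range_search(node.outside, query, r, found, counter)


if __name__ == "__main__":
    rng = random.Random(1)
    data = [tuple(rng.random() for _ in range(4)) for _ in range(300)]
    tree = build_vp_tree(data, rng)
    found, counter = [], [0]
    range_search(tree, (0.5, 0.5, 0.5, 0.5), 0.3, found, counter)
    print(len(found), "matches using", counter[0], "distance computations")
```

When distances are concentrated, `d - r <= radius` and `d + r > radius` both tend to hold at most nodes, so few subtrees are pruned. This is consistent with the observation that a balanced split is not necessarily the best choice in high dimensions.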
Keywords/Search Tags: High-dimensional index structures, feature extraction, K-means, VP-tree, M-tree