Font Size: a A A

High-dimensional Data Indexing Structure

Posted on:2006-03-04Degree:DoctorType:Dissertation
Country:ChinaCandidate:D G DongFull Text:PDF
GTID:1118360155460691Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
Today, more and more multimedia information sources are available in digital form due to the development of computer and Internet. The issue to speed up the similiarity query of multimedia object becomes much important. Mostly, similarity can not be measured on the multimedia objects directly, but rather, on abstractions of objects termed feature vectors, the distances of which are used to represent the similarity. The brute force approach for similarity query is to sequentially scan all the feature vectors. In typical multimedia applications, databases usually contain a large number of feature vectors, and sequential scan will incur extremely high disk I/O cost.Many index structures have been proposed to solve this difficult problem, such as R-Tree and its variants, VA-File, Pyramid-Technique etc.. From the published results, it can be concluded that most of these methods could achieve good query performance when the dimensionality is less than 20. However, the query performance will suffer greatly as the dimensionality increases, which is so-called "the curse of dimensionality".In this dissertation, we first introduced four novel high-diemsional index structures to overcome the curse of dimensonality: 1) Angle-Tree, which can support the kind of similarity search whose similarity is measured by the cosine of the angle between two vectors; 2) VAR-Tree, which integrates VA-File and R-Tree, and employs R-Tree to manage approximation vectors; 3)VA-Trie, the key idea behind which is adopting the idea of quantization to compress the vectors and then employing the Trie structure to organize and manage the approximations; 4) OVA-File, which inserts the approximations of VA-File into a suitable position so that the approximations close to each other in data space are placed in neighboring positions of approximation file.Based on VA-File and OVA-File, we proposed two fast video clip query algorithms, which enable VA-File and OVA-File to support both video shot similarity query and video clip similarity query.Finally, we presented the application of our proposed index structure OVA-File in a multimedia information retrieval system, which has been put into practical use. In this system, OVA-File was employed to manage and index large scale high diemsional feature vectors extracted from video database.
Keywords/Search Tags:information retrieval, index structure, similarity query, multimedia database
PDF Full Text Request
Related items