Quantization techniques for similarity search in high-dimensional data spaces

Posted on: 2003-02-25
Degree: M.Sc
Type: Thesis
University: University of Toronto (Canada)
Candidate: Garcia-Arellano, Christian Marcelo
Full Text: PDF
GTID: 2468390011487975
Subject: Computer Science
Abstract/Summary:
In recent years, several indexing techniques have been developed for efficient similarity search in high-dimensional data spaces. Some of these techniques, based on the idea of vector quantization, have proven to be the most successful in terms of efficiency. The concept of vector approximations was first introduced with the VA-file; the IQ-tree and A-tree were developed more recently and are known to outperform their common ancestor.

In this thesis, we present an extensive experimental evaluation and analysis of the query performance of state-of-the-art quantization techniques. We compare their behavior on real data sets by performing K-nearest neighbor queries under different distance metrics.

We also propose a new static similarity indexing strategy called the Quantized Clustering Tree, or QC-tree. The QC-tree strategy integrates the best characteristics observed in the IQ-tree and the A-tree. It achieves better query performance than the former and more stable results than the latter.
Keywords/Search Tags: Techniques, Similarity, Data, Quantization