Font Size: a A A

Iterative Quantization-based Hashing For Approximate Nearest Neighbor Retrieval

Posted on:2016-06-06Degree:MasterType:Thesis
Country:ChinaCandidate:Z Q HuangFull Text:PDF
GTID:2308330479493946Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
In a content-based image retrieval(CBIR) system, images are usually represented by very high dimensional feature vectors. A large-scale CBIR system needs to perform efficient nearest neighbor retrieval in a high dimensional space. Traditional tree-based indexing methods are inefficient in a high dimensional space. Hashing-based approximate nearest neighbor(ANN) retrieval methods provide approximate solutions to search nearest neighbors in a large scale and high dimensional dataset. In a hashing-based ANN algorithm, data items are encoded into hash codes. Then, the similarity between two hash codes is used to estimate the similarity between the two corresponding data items. A hashing algorithm yielding high quality hash codes for data is very important to a hashing-based approach. The iterative quantization is a simple but effective hashing algorithm which generates good hash codes by minimizing the quantization error between data items and their hash codes. In this thesis, an improvement of the iterative quantization using a hyperspherical embedding is proposed. Firstly, the target values of geodesic distances between data items are computed according to their pairwise similarities. Then, the data items are embedded onto a hypersphere according to the target values of geodesic distances using an iterative algorithm. Finally, we apply the iterative quantization to the embedded points and compute their hash codes. For a large scale dataset, a landmark-based approximate hyperspherical embedding algorithm is proposed. Firstly, the embedding solution of a small set of landmark points is computed using the aforementioned algorithm. Then, the embedding positions of data items are estimated according to the similarities between data items and landmark points using the least square method. Experimental results show that the proposed algorithm generates better hash codes for the CIFAR10 dataset in comparison to the original iterative quantization method.
Keywords/Search Tags:approximate nearest neighbor retrieval, hashing, iterative quantization, spherical embedding
PDF Full Text Request
Related items