| Image data plays a central role in storing and disseminating information on the Internet. As the number of Internet users and the volume of information exchanged increase, the scale of image data is growing exponentially, which makes efficient image retrieval difficult. The high dimensionality of image feature vectors makes the retrieval task even more challenging. To address these problems, Approximate Nearest Neighbor (ANN) search based on Vector Quantization (VQ) is used for image retrieval. However, for large-scale, high-dimensional image features, the accuracy and efficiency of vector quantization methods still need to be improved. This thesis therefore studies vector quantization methods for approximate nearest neighbor search over large-scale image features and carries out the following research: (1) To reduce the time cost of quantization while keeping accuracy unchanged, a Projection-based Enhanced Residual Vector Quantization (PERVQ) method is proposed. Building on previous research on Enhanced Residual Vector Quantization (ERVQ), the feature vectors used for training and quantization are projected into a low-dimensional vector space to improve efficiency. The overall errors generated by projection and quantization are jointly optimized to increase the discriminability of the codebooks. (2) A method called Codewords-expanded Enhanced Residual Vector Quantization (CERVQ) is proposed to further improve codebook accuracy and reduce quantization loss. CERVQ combines ERVQ with the computation of mean-equisection vectors to reduce training error and improve quantization accuracy. In the quantization stage, the codebook of each layer is expanded with the mean-equisection vectors to generate new codewords, which are then used to quantize the input feature vectors. (3) In such residual quantization structures, the error produced by training the previous layers is used to train the codebook of the next layer, so the accuracy of the codebooks of the previous layers affects the codebook of the next layer. To address this, a method called Aggregate Vector Quantization (AVQ) is proposed. In this method, feature vectors are divided into partial vectors of equal dimension, and parallel quantizers are constructed on them to reduce error propagation between codebooks. An iterative optimization strategy combined with the codeword expansion of CERVQ is then used to improve codebook accuracy. Finally, based on these three methods, a fast method for computing the approximate Euclidean distance between feature vectors is designed, and the ANN search performance of the three proposed methods is evaluated on public datasets. The experimental results demonstrate the feasibility of these methods. |
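
The residual quantization structure and the lookup-table distance computation referred to in the abstract can be illustrated with a minimal sketch. It shows plain residual vector quantization, not PERVQ, CERVQ, or AVQ, and it does not reproduce the thesis's own distance method. The codebooks `CODEBOOKS` are small hand-written values chosen for illustration (real codebooks would be trained, for example with k-means on residuals), and all function names are hypothetical. The fast distance assumes the common expansion ||q - x'||^2 = ||q||^2 - 2<q, x'> + ||x'||^2, where x' is the sum of the selected codewords and ||x'||^2 is stored alongside the codes.

```python
# Minimal sketch of residual vector quantization (RVQ) with a
# lookup-table distance. Illustrative only: the codebooks are
# hand-written assumptions, not trained and not from the thesis.

# Two layers, four 2-D codewords per layer (hypothetical values).
CODEBOOKS = [
    [[0.0, 0.0], [4.0, 4.0], [4.0, 0.0], [0.0, 4.0]],  # layer 1 (coarse)
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],  # layer 2 (residual)
]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rvq_encode(vec, codebooks):
    """Quantize vec layer by layer; each layer encodes the residual
    left over by the previous layers."""
    residual = list(vec)
    codes = []
    for book in codebooks:
        idx = min(range(len(book)), key=lambda i: sq_dist(residual, book[i]))
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, book[idx])]
    return codes


def rvq_decode(codes, codebooks):
    """Reconstruct the vector as the sum of the selected codewords."""
    out = [0.0] * len(codebooks[0][0])
    for idx, book in zip(codes, codebooks):
        out = [o + c for o, c in zip(out, book[idx])]
    return out


def inner_product_tables(query, codebooks):
    """Per-query tables: tables[l][k] = <query, codeword k of layer l>."""
    return [[dot(query, c) for c in book] for book in codebooks]


def fast_sq_dist(query_norm_sq, tables, codes, recon_norm_sq):
    """Approximate squared Euclidean distance from table lookups only:
    ||q||^2 - 2 * sum_l <q, c_l> + ||x'||^2."""
    ip = sum(tables[layer][k] for layer, k in enumerate(codes))
    return query_norm_sq - 2.0 * ip + recon_norm_sq


# Usage: encode one database vector, then score it against a query.
database_vec = [4.9, 1.1]
codes = rvq_encode(database_vec, CODEBOOKS)
recon = rvq_decode(codes, CODEBOOKS)
recon_norm_sq = dot(recon, recon)  # stored once per database vector

query = [5.0, 2.0]
tables = inner_product_tables(query, CODEBOOKS)  # built once per query
approx = fast_sq_dist(dot(query, query), tables, codes, recon_norm_sq)
```

With this layout, scoring a database vector costs one table lookup per layer instead of a full-dimensional distance computation, which is why the tables are built once per query and reused across the whole database.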