Font Size: a A A

Approximate Nearest Neighbor Search Based On Optimized Bit Allocation And Residual Rotation

Posted on:2017-04-24Degree:MasterType:Thesis
Country:ChinaCandidate:F XuFull Text:PDF
GTID:2308330485993919Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
The human society has entered the information age in 21 st century, information will constitute an important technology material basis of human life and production, and to replace the traditional means of production into the first factor of production. Here, the information includes not only the narrow sense of "audio message", which essentially refers to the information resources of the whole society can be shared. New challenges are raised for access, storage, transmission and query to information resources, with the further development of Internet technology and the popularity of the computer.Similarity search is often referred approximate nearest neighbor search, it has been widely used in various fields of informatization life. For related methods to approximate nearest neighbor search, have also been a hot issue which is researched by domestic and foreign scholars. Which in addition to the method that is streamlining information retrieval structure as a means to improve, the other is data approximate method with the represent of hash and vector quantization. Vector quantization by quantization similar data to the same word and maping each data to the system codebook, thereby reducing the complexity of the neighborhood search.This article is based on the method of vector quantization. We start by analysising the quantization distortion of the vector quantization method, to derive the necessary that resolve the contradiction between codebook size and quantization complexity if we want to reduce quantization distortion. In order to obtain the optimal size of the codebook number of clusters relative to the size of the codebook, we must find the clustering number which meet the conditions as a parameter. As a starting point, we have analyzed the existing excellent method for determining the clustering number and raised a indicator which applies to our issue for determining the clustering number. The indicator is for the unbalanced energies distribution which generate from the data processing. The indicator combines difference degree of data distribution and quantization distortion. We propose the optimized bit allocation method by applying this indicator, in order to build a relatively optimal codebook.In addition to the quantification process improvements, the paper also presents residual rotation method, optimizing quantization distortion after the end of the quantized. By analyzing the iterative Quantization method, based on previous research we have come to the fact that the quantization error can be reduced by orthogonal rotation. Accordingly, we obtain residual vector by differencing the data and their quantization results, using the residual matrix instead of random orthogonal matrix, optimize the quantization error by an iterative method of rotating.Both of the methods presented in this paper were carried out experimental analysis by three different dimensions database, by this way we verify the effectiveness of the indicator. By contrasting with the retrieve performance of existing excellent methods, we discuss the validity and applicability of our approach.
Keywords/Search Tags:Approximate nearest neighbor search, vector quantization, quantization error, optimized bit allocation method, residual rotation
PDF Full Text Request
Related items