Optimized K-Means Hashing Quantization For Approximate Nearest Neighbor Search

Posted on:2020-06-19

Degree:Master

Type:Thesis

Country:China

Candidate:Y L Chen

Full Text:PDF

GTID:2428330602958015

Subject:Computer Science and Technology

Abstract/Summary:

PDF Full Text Request

With the advent of the era of big data,the Internet is generating massive high-dimensional data,how to retrieve the target data quickly and accurately is an important issue in the computer field.This requires people to store and query the data index efficiently.ANN has become an effective technology of solving this problem in various fields.The ANN search techniques for high-dimensional is mainly divided into two categories,one is based on prototypes,such as product quantization(PQ),additive quantization(AQ),etc.The other is based on projection hashing,such as local sensitive hash(LSH),Iterative quantization(ITQ),etc.Both two types of methods are developing rapidly,and each of them has its own advantages in different application scenarios.At the same code length,the prototypes methods is usually more accurate than the projection hashing methods,but the distance calculation is slower.In recent years,prototypes based binary hashing methods have become research focus,such as k-means hash(KMH),adaptive binary quantization(ABQ),etc.These methods have high precision due to the idea of prototypes,and their neighbor search is very efficient as Hamming distance calculation and sorting.Besides,they are more advantageous when processing high-dimensional data.However,this type of method is limited by the binary representation of the hypercube characteristics.And the quantization result can not match with the real data distribution with a large quantization error.This paper delved into the typical method of binary hashing:K-means hash(KMH),and proposed Optimized K-means Hashing(OKMH)model aiming at the shortcomings of KMH with large quantization error.Furthermore,the adaptive coding technique is added to the above optimization model to obtain a more refined binary code subset and codebook,this paper proposed Refined Optimized K-means Hashing(ROKMH),which further reduces the quantization error of OKMH.Finally,the proposed OKMH and ROKMH were evaluated on four public datasets(SIFT,CIFAR-320,CIFAR-512 and CALTECH256),and compared with several state-of-the-art binary algorithms such as LSH,KMH,ABQ,etc.In the experiment,the evaluating indicators(quantization error,recall rate,accuracy rate and MAP value)have shown that the proposed methods outperform many binary hashing methods by about 10%.

Keywords/Search Tags:

ANN, Massive high-dimensional, KMH Quantization, Optimized Model

PDF Full Text Request

Related items

1	Channel Optimized Scalar Quantization over Orthogonal Multiple Access Channels with Memory
2	Research On High-order Modulations Signal Transmitting Performance For One-Bit Massive MIMO Systems
3	Finding optimized bounding boxes of polytopes in d-dimensional space and their properties in k-dimensional projections
4	Research And Application Of High Dimensional Query Technology
5	Research On Key Technologies Of Channel Estimation In Massive MIMO Systems
6	Research On The Algorithm Of All Zero Quantization Block Detection In H.265/HEVC
7	Research On All-zero Block Detection For Efficient Video Coding With RDO Quantization Support
8	The Research Of Retrieval Optimized On Optimized Retrieval Cache Method Which Base On Multi-level Of Massive Sensor Data
9	Toward High-Dimensional Data Analysis In Massive MIMO Systems: A Random Matrix Theory Inspired Perspective
10	R-Simp: Model simplification in reverse, a vector quantization approach