Hashing For Large Scale Similarity Search

Posted on:2016-09-02

Degree:Master

Type:Thesis

Country:China

Candidate:K Zhao

Full Text:PDF

GTID:2308330476453322

Subject:Computer Science and Technology

Abstract/Summary:

PDF Full Text Request

The explosive growth of the multimedia data (including texts, images and videos) on the internet has posed a great challenge to many applications in terms of fast simi-larity search. Content Based Image Retrieval is one representative example. To handle this problem, Hashing based approximate nearest neighbor search techniques have re-cently become more and more popular because of their improvements in computational speed and storage reduction.This paper focuses on the Hashing algorithms for large scale similarity search. It begins with optimizing and improving the traditional Spectral Hashing. Then it propos-es new Hashing model to deal with the problems that traditional Hashing framework may encounter. Finally, it shows a new method to model the semi-supervised Hashing problem to improve the retrieval accuracy. The contributions and innovation of this paper include:(1) Local Linear Spectral Hashing. It improves the Spectral Hashing from the fol-lowing perspectives:(1) Spectral Hashing only takes neighbor relations into con-sideration. The proposed method considers not only neighbor relations, but also non-neighbor relations; (2) Spectral Hashing needs to compute an x n matrix, which is very time-consuming when n is very large. The proposed method uses a. m×m (m<<n) local matrix instead; (3) Spectral Hashing assumes that data distributes uniformly, and it uses a little complicated solution. In most cases, the assumption of uniform data distribution is usually not true. The proposed method ignores this assumption, and adopts a simple linear model to solve the proposed problem. Experiments demonstrate the proposed methods are simple and efficient.(2) Locality Preserving Hashing. In general, Hashing methods will take two step- s:projection+quantization. In projection stage, several projected dimensions are generated in certain ways. Then in quantization stage, the projected values will be quantized into binary codes. These methods may destroy the locality structure of dataset that is preserved in projection stage, since them usually di-rectly thresholding the projected values to get the binary codes. By contrast, the proposed method combines projection and quantization stage, by using a joint optimization framework to complete projection and quantization simultaneous-ly. Consequently, the locality structure of the data in the feature space can be well preserved. The experimental results show the proposed locality preserved strategy is more reasonable.(3) Locality Preserving Discriminative Hashing. Hashing algorithms can be divided into three categories:supervised, semi-supervised and unsupervised. The semi-supervised methods have a promising performance since it integrates labeled and unlabeled data. One representative approach is Semi-supervised Hashing. How-ever, it only considers the pair-wise relations of labeled data, ignoring the global labeled information. What’s worse, it fails to preserve the neighbor relationship. Differently, the proposed method combines Linear Discriminant Analysis and Locality Preserving Projection to preserve the local and global structures of data simultaneously. The experiments on three standard datasets show the effective-ness and efficiency of the proposed method.

Keywords/Search Tags:

PDF Full Text Request

Related items

1	Image Hashing Algorithms Based On Locally Linear Embedding And Locality Preserving Projection
2	Research On Similarity Image Retrieval Based On Locality Sensitive Hashing And Structured P2P Network
3	Research And Implementation Of LSH-based Large Scale Similar Image Retrieval Technology
4	Research On Hashing Methods Towards Large-scale Similarity Computation And Search
5	Clustering And Locality Sensitive Hashing Algorithms On Text Stream Data Under Classification-oriented Measure
6	Dynamic Similarity Search Over Encrypted Data
7	Research On Locality Sensitive Hashing-Based Similarity Search
8	Research On Audio Similarity And Their Applications In Music Retrieval
9	Research About Image Retrieval Based On Hashing Technology
10	Research Of Approximate Nearest Neighbor Search Based On Locality Sensitive Hashing