
Min-hash Sketch Construction Via Nonparametric Clustering

Posted on: 2016-05-13
Degree: Master
Type: Thesis
Country: China
Candidate: K H Li
Full Text: PDF
GTID: 2308330479482174
Subject: Software engineering
Abstract/Summary:
Image retrieval is widely studied in computer vision, and the rise of big data in recent years has accelerated the development of retrieval techniques. An image retrieval system searches a large image database for images relevant to a query image or to other related information. In partial-duplicate image retrieval systems, min-Hash algorithms are widely used because of their high efficiency and robustness. Most min-Hash algorithms treat min-Hash functions as independent and group them into tuples called sketches, which limits the discriminative power of the sketches. By modeling the correlations between min-Hash functions, we propose a novel sketch construction method called Nonparametric Clustering min-Hash (NCmH). In NCmH, the randomly generated min-Hash functions are clustered before being grouped into sketches, and spatial information is used throughout this process. The constructed sketches preserve rich spatial information between visual words, so NCmH achieves higher retrieval accuracy than the standard min-Hash. Furthermore, our method can be combined with other min-Hash algorithms such as GVP mH, PmH, and TmH to further improve accuracy. The contributions of this thesis are as follows: (1) spatial coupling phrases, which preserve both co-occurrence and spatial information between visual words, are used to evaluate correlations between min-Hash functions; (2) a nonparametric Bayesian method is proposed for clustering min-Hash functions, which can be used in sketch construction to improve the retrieval accuracy of both the standard and advanced min-Hash algorithms. In experiments, we show that our method outperforms the standard min-Hash and improves the state-of-the-art min-Hash algorithm on the Oxford 5K dataset and the University of Kentucky dataset.
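To make the baseline concrete, the following is a minimal sketch of standard min-Hash sketch construction, the scheme the abstract describes as treating min-Hash functions as independent. It is not the NCmH method itself: the clustering step and the spatial coupling phrases are not reproduced here. It assumes each image is represented as a set of integer visual-word IDs and each min-Hash function is a random permutation of the vocabulary; all function names and parameters are hypothetical and chosen only for illustration.

```python
import random


def make_minhash_functions(num_functions, vocab_size, seed=0):
    """Build random permutations of the visual vocabulary (assumed setup).

    Each permutation acts as one min-Hash function: perm[w] is the rank
    assigned to visual word w.
    """
    rng = random.Random(seed)
    functions = []
    for _ in range(num_functions):
        perm = list(range(vocab_size))
        rng.shuffle(perm)
        functions.append(perm)
    return functions


def minhash(perm, words):
    """Return the visual word in `words` with the smallest rank under `perm`."""
    return min(words, key=lambda w: perm[w])


def build_sketches(functions, words, sketch_size):
    """Group consecutive min-Hash values into fixed-size tuples (sketches).

    This is the standard grouping, where functions are taken in their
    original random order; leftover functions that do not fill a whole
    sketch are dropped.
    """
    values = [minhash(perm, words) for perm in functions]
    usable = len(values) - len(values) % sketch_size
    return [tuple(values[i:i + sketch_size]) for i in range(0, usable, sketch_size)]


def sketch_collisions(sketches_a, sketches_b):
    """Count positions at which two images have identical sketches."""
    return sum(1 for a, b in zip(sketches_a, sketches_b) if a == b)
```

Two images are treated as candidate near-duplicates when at least one of their sketches collides. In this baseline the assignment of functions to sketches is arbitrary; per the abstract, NCmH replaces that arbitrary grouping with one derived from clustering correlated min-Hash functions.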
Keywords/Search Tags:Image Retrieval, Min-Hash, Non-parametric Bayesian, Chinese Restaurant Process, Topic Model