
Feature Fusion-based Visual Instance Retrieval Research

Posted on: 2019-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: W Z Xue
Full Text: PDF
GTID: 2428330572959002
Subject: Computer system architecture
Abstract/Summary:
The continuous emergence and growth of massive web and mobile image collections have made large-scale, visual-feature-based image retrieval a vital research subject. High-quality feature representation is essential in determining both the precision and the efficiency of image retrieval. Existing visual retrieval methods fall into two general categories: transforming local features into a global representation via visual-vocabulary techniques, and learning highly expressive global features with deep learning models. Both are very effective in image recognition and classification. However, in visual retrieval applications, and especially in visual instance retrieval tasks, both are limited by many factors and their accuracy is far from satisfactory. The reason stems from the limitations of these kinds of features. Traditional low-level features are robust to geometric distortion, but they lack spatial geometric information and cannot sufficiently express high-level semantic content. The deep features prevailing in recent years can carry rich high-level semantic information learned from pre-labeled image datasets, but they often lack low-level content, certain geometric invariances, and generality. Moreover, retrieval efficiency can also degrade greatly because building a visual vocabulary is slow, training a deep model is difficult, and the resulting features are high-dimensional.

To address these issues, this paper proposes a feature fusion-based visual instance retrieval scheme. The main contributions are as follows: 1) designing a novel fused feature that covers both low-level information and high-level semantic information, so as to enhance the discriminative ability and robustness of the feature; 2) encoding the novel features into product quantization codes and constructing a multi-inverted index structure for them, to reduce the computational cost of visual retrieval; 3) exploring the mapping of fused features into Locality Sensitive Hashing codes to further improve the index structure.

To enrich the information carried by visual features, this paper adopts an unsupervised mode to extract and fuse traditional low-level features and deep semantic features, together covering four levels of image content: the color layer, i.e., color features in HSV space; the point layer, i.e., locally aggregated descriptors based on RootSIFT; the scene layer, i.e., global pooling-layer features extracted from the GoogLeNet model; and the target layer, i.e., improved regional maximum activation features from the VGG network model. By precomputing the similarity between all single features over the image library, we assign a different weight to each corresponding single feature and then propose a concatenation of the weighted features, namely CCRC (i.e., Complementary CNN, RootSIFT and CN), to further improve retrieval precision.

To speed up visual instance retrieval, this paper designs a multi-inverted index structure to organize the CCRC features efficiently. First, we divide the original fused feature space into four subspaces and perform coarse clustering in each to construct the multi-index. Then we divide the original space again into the Cartesian product of low-dimensional subspaces, encode the original data into product quantization (PQ) codes, and fill the PQ codes into the pre-built index lists. We also prebuild the codebook and lookup table, transforming the complex Euclidean distance calculation between two original high-dimensional vectors into simple, fast lookup-table operations between two compact codes. During instance retrieval, we first perform a preliminary match on the multi-inverted index to obtain a coarse candidate set of CCRC features; by reranking these candidates, we obtain the final retrieval results. We further optimize the multi-inverted index structure by replacing PQ codes with hash codes mapped by an LSH (i.e., Locality Sensitive Hashing) function, and directly compute Jaccard distances between these hash codes to eliminate complex preprocessing.

We validate our method on four real-world image datasets. Extensive experimental analysis and comparison show that retrieval precision with fused features is higher than with single features, and that our indexing scheme significantly reduces retrieval time while preserving retrieval precision. Comparisons with several state-of-the-art methods further demonstrate the effectiveness of the fused feature and the efficiency of the retrieval scheme proposed in this paper.
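The weighted fusion of single features into one concatenated vector can be sketched as follows. This is a minimal illustration only: the feature dimensions and the weights are made-up placeholders, and the thesis derives its weights from precomputed similarities over the image library rather than fixing them by hand.

```python
import numpy as np

def fuse_features(feature_list, weights):
    """Weighted concatenation of L2-normalized single features.

    A sketch of CCRC-style fusion: each single feature (color, point,
    scene, target) is normalized, scaled by its weight, and concatenated.
    """
    parts = []
    for f, w in zip(feature_list, weights):
        f = np.asarray(f, dtype=np.float64)
        f = f / (np.linalg.norm(f) + 1e-12)   # L2-normalize each single feature
        parts.append(w * f)
    fused = np.concatenate(parts)
    return fused / (np.linalg.norm(fused) + 1e-12)  # renormalize the fused vector

# Toy example: four layers with hypothetical dimensions and weights.
rng = np.random.default_rng(0)
color, point = rng.random(64), rng.random(128)
scene, target = rng.random(256), rng.random(512)
fused = fuse_features([color, point, scene, target], [0.1, 0.2, 0.3, 0.4])
print(fused.shape)  # (960,)
```

Normalizing each single feature before weighting keeps one high-dimensional layer from dominating the concatenation regardless of the chosen weights.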
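The product quantization step described above can be sketched as follows. This is an illustrative toy, not the thesis's implementation: it uses a tiny in-line k-means with only 16 centroids per subspace, omits the coarse multi-index layer, and shows only the core PQ idea of encoding vectors into compact codes and computing distances through per-subspace lookup tables.

```python
import numpy as np

def train_pq(data, n_sub=4, n_centroids=16, iters=10):
    """Train a tiny PQ codebook: split dimensions into n_sub subspaces
    and run a few k-means iterations in each (illustrative scale only)."""
    d = data.shape[1] // n_sub
    rng = np.random.default_rng(0)
    codebooks = []
    for s in range(n_sub):
        sub = data[:, s * d:(s + 1) * d]
        cents = sub[rng.choice(len(sub), n_centroids, replace=False)]
        for _ in range(iters):
            dists = ((sub[:, None, :] - cents[None]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for k in range(n_centroids):
                pts = sub[assign == k]
                if len(pts):
                    cents[k] = pts.mean(0)
        codebooks.append(cents)
    return codebooks

def pq_encode(x, codebooks):
    """Encode one vector as the index of its nearest centroid per subspace."""
    d = len(x) // len(codebooks)
    return np.array([((cb - x[s * d:(s + 1) * d]) ** 2).sum(1).argmin()
                     for s, cb in enumerate(codebooks)], dtype=np.uint8)

def asymmetric_distance(query, codes, codebooks):
    """Distance from a raw query to all PQ codes via lookup tables:
    the squared distance to every centroid is precomputed once per
    subspace, so each database item costs only n_sub table lookups."""
    d = len(query) // len(codebooks)
    tables = [((cb - query[s * d:(s + 1) * d]) ** 2).sum(1)
              for s, cb in enumerate(codebooks)]
    return np.array([sum(tables[s][c[s]] for s in range(len(codebooks)))
                     for c in codes])

# Toy usage on random 64-dimensional vectors split into 4 subspaces.
rng = np.random.default_rng(1)
data = rng.standard_normal((200, 64))
codebooks = train_pq(data)
codes = np.array([pq_encode(x, codebooks) for x in data])
dists = asymmetric_distance(data[0], codes, codebooks)
print(codes.shape)  # (200, 4): each vector compressed to 4 one-byte codes
```

The lookup tables are exactly the trick mentioned in the abstract: the expensive Euclidean distance between two high-dimensional vectors is replaced by a handful of table lookups and additions per compact code.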
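The LSH variant of the index can be sketched as follows. The abstract does not specify which hash family the thesis uses, so this sketch substitutes a common one, random-hyperplane (sign) hashing, and then computes the Jaccard distance between the resulting binary codes by treating their set bits as sets; treat both choices as assumptions for illustration.

```python
import numpy as np

def lsh_codes(features, n_bits=32, seed=0):
    """Binary LSH codes via random hyperplane projections: each bit is
    the sign of the feature's projection onto one random direction.
    (One common LSH family, used here as an illustrative stand-in.)"""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((features.shape[1], n_bits))
    return features @ planes > 0          # boolean array, one row per feature

def jaccard_distance(a, b):
    """Jaccard distance between two binary codes, viewing set bits as sets."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return 1.0 - inter / union if union else 0.0

# Toy usage: hash a few random feature vectors and compare their codes.
rng = np.random.default_rng(2)
feats = rng.standard_normal((5, 64))
codes = lsh_codes(feats, n_bits=32)
print(codes.shape)                         # (5, 32)
print(jaccard_distance(codes[0], codes[0]))  # 0.0 (a code vs. itself)
```

Because the codes are compact bit vectors, the Jaccard distance reduces to cheap bitwise AND/OR counts, which is what lets this variant drop the codebook and lookup-table preprocessing that PQ requires.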
Keywords/Search Tags:Visual Instance Retrieval, Feature Fusion, Multi-Inverted Index, Product Quantization, Locality Sensitive Hashing