Towards Deep Compact Visual Descriptor Via Fisher Network With Binary Embedding

Posted on:2020-06-23

Degree:Master

Type:Thesis

Country:China

Candidate:J Q Qian

Full Text:PDF

GTID:2428330575963612

Subject:Pattern Recognition and Intelligent Systems

Abstract/Summary:

In large-scale mobile visual search, the compactness of the visual descriptor is of fundamental importance for retrieval efficiency. The Fisher Vector (FV) is a highly discriminative global descriptor that has achieved excellent performance in large-scale visual search. However, on resource-limited devices such as mobile phones and embedded devices, descriptor compactness is crucial, and the high-dimensional FV is not suitable. Hashing has been widely used to embed high-dimensional global descriptors into low-dimensional binary codes, but these codes are less discriminative than the original descriptors. The conventional way to obtain a compact visual descriptor is to extract the FV first and then apply hash encoding. Learning hash codes from the high-dimensional FV is therefore a two-stage process, learning the FV codebook and then learning the hash encoding, which makes the final binary codes sub-optimal. In recent years, a growing number of researchers have focused on end-to-end deep neural networks that map an image directly to binary codes, but such codes are not very discriminative and are not optimal for large-scale visual search.

To address these problems, we propose a novel compact image description scheme based on an end-to-end deep neural network for large-scale image retrieval. The proposed network consists of two blocks: the Fisher network and the binary embedding network. The Fisher network is a learnable network that mimics the traditional FV encoding scheme and can be trained jointly with other neural networks. The binary embedding network encodes the high-dimensional FV produced by the Fisher network into a medium-length binary code. The two modules are trained end-to-end, so the codebook and the hash encoding are optimized jointly rather than in separate stages. The network takes the local feature descriptors of an image as input and outputs an image-level binary signature. The model is trained with image labels in a supervised manner. The output binary signature preserves the semantic similarity between images while being as short as possible. Experiments on MPEG-7 CDVS and ILSVRC2010 show that the proposed compact image description scheme performs better than the traditional two-stage encoding method.
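
The data flow described in the abstract (local descriptors, then an FV, then a binary signature) can be illustrated with a minimal sketch. This is not the thesis's network: the sketch assumes a diagonal-covariance Gaussian mixture, a first-order FV (gradient with respect to the means only), and a sign-of-linear-projection binarizer, and all parameters are random rather than learned. The function names `soft_assign`, `fisher_vector`, `binary_embed`, and `hamming` are hypothetical and chosen only for illustration.

```python
import math
import random


def soft_assign(x, means, variances, weights):
    """Posterior of each diagonal-Gaussian component for one local descriptor."""
    log_p = []
    for mu, var, w in zip(means, variances, weights):
        ll = math.log(w)
        for xd, md, vd in zip(x, mu, var):
            ll -= 0.5 * (math.log(2 * math.pi * vd) + (xd - md) ** 2 / vd)
        log_p.append(ll)
    top = max(log_p)  # subtract the max for numerical stability
    exp_p = [math.exp(lp - top) for lp in log_p]
    total = sum(exp_p)
    return [e / total for e in exp_p]


def fisher_vector(descriptors, means, variances, weights):
    """First-order Fisher Vector (gradient w.r.t. the means), L2-normalised."""
    n = len(descriptors)
    k, d = len(means), len(means[0])
    fv = [0.0] * (k * d)
    for x in descriptors:
        gamma = soft_assign(x, means, variances, weights)
        for j in range(k):
            for t in range(d):
                sigma = math.sqrt(variances[j][t])
                fv[j * d + t] += gamma[j] * (x[t] - means[j][t]) / sigma
    for j in range(k):
        scale = 1.0 / (n * math.sqrt(weights[j]))
        for t in range(d):
            fv[j * d + t] *= scale
    norm = math.sqrt(sum(v * v for v in fv)) or 1.0
    return [v / norm for v in fv]


def binary_embed(fv, projection):
    """One bit per projection row: 1 if the projected value is non-negative."""
    return [1 if sum(w * v for w, v in zip(row, fv)) >= 0 else 0
            for row in projection]


def hamming(a, b):
    """Number of differing bits between two equal-length binary codes."""
    return sum(x != y for x, y in zip(a, b))


# Assumed toy sizes: K mixture components, D-dimensional local descriptors.
rng = random.Random(0)
K, D, BITS = 2, 4, 8
means = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(K)]
variances = [[1.0] * D for _ in range(K)]
weights = [1.0 / K] * K
# Random projection standing in for a learned binary embedding layer.
projection = [[rng.gauss(0, 1) for _ in range(K * D)] for _ in range(BITS)]

image = [[rng.gauss(0, 1) for _ in range(D)] for _ in range(20)]
fv = fisher_vector(image, means, variances, weights)
code = binary_embed(fv, projection)
print(len(fv), code)
```

In the scheme the abstract describes, the mixture parameters and the projection would be learned jointly from image labels; here they are fixed so that only the forward pass is shown. Retrieval would then compare signatures with `hamming`.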

Keywords/Search Tags:

Large-scale Mobile Visual Search, Compact Visual Descriptor, Aggregated Descriptor, Binary Coding

Related items

1	Compact Aggregated Descriptors For Mobile Visual Search
2	Research On Dominant Descriptor Selection Based Mobile Visual Search Algorithm
3	Mobile Visual Search Based On CRBM And NetFisher
4	Local Visual Information Based Large-Scale Image Retrieval
5	Progressive Transmission Used In Mobile Visual Search
6	Mobile Visual Search Based On Saliency
7	Research And Hardware Design Of MPEG-7 Compact Colour Descriptor
8	Research On Some Issues For Decentralized Connective Stabilization Of Large-scale Descriptor Systems With Expanding Construction
9	Research On Key Techniques Of Content-Based Large-Scale Image Retrieval
10	The decentralized control of large-scale descriptor systems