
Zero-Shot Retrieval Based on Cross-Modal Semantic Guidance

Posted on: 2019-08-23    Degree: Master    Type: Thesis
Country: China    Candidate: Y X Sun    Full Text: PDF
GTID: 2428330623462500    Subject: Information and Communication Engineering
Abstract/Summary:
With the explosive growth of multimedia data, fast and accurate image retrieval has become an urgent issue. However, typical image retrieval methods are trained on massive labeled data, and acquiring labels is time-consuming and labor-intensive. To address this problem, zero-shot retrieval has been proposed to handle image retrieval over "unseen" categories. In this work, we devise an effective strategy for zero-shot hashing using a semantic Softmax loss (SSL) and an attribute-guided network (AgNet). Different from typical zero-shot learning (ZSL) methods, which integrate the visual features and the class semantic descriptors into a multimodal framework with a linear or bilinear model, we propose a nonlinear approach that casts ZSL as a multi-class classification problem via a semantic Softmax loss, embedding the class semantic descriptors into the Softmax layer of a multi-class classification network. To narrow the structural differences between the visual features and the semantic descriptors, we further apply an L2 normalization constraint to the differences between the visual features and the visual prototypes reconstructed from the semantic descriptors. Results on three benchmark datasets (AwA, CUB, and SUN) demonstrate that the proposed approach steadily boosts performance and achieves state-of-the-art results for both zero-shot classification and zero-shot retrieval.

The combination of zero-shot retrieval and hashing is another topic of this work. Existing efforts in zero-shot hashing mainly focus on single-modal retrieval tasks, especially Image-Based Image Retrieval (IBIR). However, as a highlighted research topic in the field of hashing, cross-modal retrieval is more common in real-world applications. To address the Cross-Modal Zero-Shot Hashing (CMZSH) retrieval task, we propose a novel Attribute-Guided Network (AgNet), which aligns data from different modalities in a higher-level semantic space, i.e., the attribute space. In addition, category similarity is utilized to construct the relationships between different modalities, while attribute similarity is introduced to regularize the distance between similar categories within a single modality. Experimental results on both cross-modal and single-modal retrieval tasks demonstrate the superiority of the proposed approach. To further evaluate the performance of AgNet in each category, a confusion matrix and t-SNE are used to visualize the neighbor relationships between the textual and visual hash codes produced by AgNet.
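The core idea of the semantic Softmax loss, scoring an image against every class's semantic descriptor rather than against learned per-class weights, can be sketched as follows. This is a minimal illustrative sketch, not the thesis's actual implementation: the projection matrix `W`, the descriptor values, and the simple squared-L2 penalty are all assumptions for demonstration.

```python
import numpy as np

def semantic_softmax_loss(visual_feat, class_descriptors, label, W):
    """Cross-entropy where the classifier "weights" are the fixed class
    semantic descriptors, so unseen classes can be scored at test time.

    visual_feat:       (d_vis,)      feature of one image
    class_descriptors: (C, d_sem)    one semantic descriptor per class
    label:             int           ground-truth class index
    W:                 (d_sem, d_vis) learned visual-to-semantic projection
    """
    projected = W @ visual_feat                    # embed image in semantic space
    logits = class_descriptors @ projected         # compatibility with each class
    logits = logits - logits.max()                 # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax over classes
    return -np.log(probs[label])

def l2_reconstruction_penalty(visual_feat, prototype):
    """Squared-L2 constraint between a visual feature and the visual
    prototype reconstructed from its class's semantic descriptor
    (a simplified stand-in for the thesis's normalization constraint)."""
    return float(np.sum((visual_feat - prototype) ** 2))
```

Because the descriptors, not learned weight vectors, occupy the Softmax layer, adding an unseen class at test time only requires appending its descriptor row.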
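The cross-modal hashing step, mapping each modality into a shared attribute space and binarizing so that images and text can be compared by Hamming distance, might be sketched as below. The linear projections and sign-based binarization here are illustrative assumptions; the thesis's AgNet learns these mappings with a deep network.

```python
import numpy as np

def attribute_hash(features, projection):
    """Map modality-specific features into the shared attribute space,
    then binarize with sign() to obtain hash codes in {-1, +1}."""
    attributes = features @ projection              # (N, d_attr)
    return np.where(attributes >= 0, 1, -1)

def hamming_rank(query_code, gallery_codes):
    """Rank gallery items by Hamming distance to the query code."""
    distances = (gallery_codes != query_code).sum(axis=1)
    return np.argsort(distances, kind="stable")
```

Since both modalities land in the same attribute space before binarization, a text query can rank image codes (and vice versa) with nothing more than bitwise comparison.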
Keywords/Search Tags:Zero-Shot Learning, Cross-Modal Hashing, Image Retrieval, Attribute-Guided, Semantic Embedding