
Cross-modal Retrieval Based On Cyclic Generative Adversarial Networks

Posted on: 2020-03-31
Degree: Master
Type: Thesis
Country: China
Candidate: L H Ni
Full Text: PDF
GTID: 2518305897470744
Subject: Information security
Abstract/Summary:
With the widespread use of multimedia-enabled devices and mobile networks, a massive amount of multimedia data is generated, communicated, and processed every day. Extracting useful information from this data is a central concern for Internet users, and cross-modal retrieval is one of the important solutions: given a query in one modality, cross-modal retrieval techniques search for semantically relevant data in other modalities. Cross-modal retrieval has a wide range of applications; the most common scenario is mutual retrieval between images and text.

Nevertheless, traditional methods do not work well for cross-modal retrieval. Unsupervised approaches, lacking the help of labeling information, achieve low accuracy. Some supervised approaches consider only the feature correlation in the common subspace, and some methods perform retrieval directly on real-valued feature vectors, which is inefficient.

To address the shortcomings of existing work, we combine dual learning with generative adversarial networks (GANs) and, for the first time, exploit the idea of cyclic generative adversarial networks to learn a common subspace for different modalities, constructing a novel two-way closed-loop neural network. In the supervised setting, we exploit triplet constraints to enlarge the separation between different classes and the similarity within the same class across modalities. In the unsupervised setting, we introduce a manifold structure to capture meaningful nearest-neighbor information for each instance within its modality, so that similar instances lie closer together in the common space. Finally, we encode the feature vectors in the common subspace into compact hash codes and obtain the Hamming distance between hash codes via XOR operations for efficient cross-modal retrieval. The proposed scheme overcomes the problems of existing supervised and unsupervised methods. In our experiments, quantitative comparisons on three widely used datasets against six state-of-the-art methods demonstrate the superiority of our approach.
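The cycle-consistency idea borrowed from cyclic GANs can be illustrated with a minimal sketch: a feature mapped from one modality's space to the other and back should reconstruct the original. The linear "generators" below are hypothetical stand-ins (a random matrix and its exact inverse), not the thesis's actual networks.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical linear "generators" between 4-d image and text feature spaces.
G_it = rng.normal(size=(4, 4))   # image features -> text subspace
G_ti = np.linalg.inv(G_it)       # text subspace -> image features (exact inverse here)

def cycle_loss(x, forward, backward):
    """L1 cycle-consistency: mapping x forward then back should reconstruct x."""
    return np.abs(backward(forward(x)) - x).mean()

x = rng.normal(size=4)
loss = cycle_loss(x, lambda v: v @ G_it, lambda v: v @ G_ti)
print(loss)  # near zero, since G_ti exactly inverts G_it
```

In training, the two generators are learned networks rather than exact inverses, and this reconstruction penalty is minimized alongside the adversarial losses.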
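A triplet constraint of the kind described for the supervised setting can be sketched as a standard hinge-style triplet loss: the anchor should end up closer to a same-class instance from the other modality than to a different-class instance, by at least a margin. The vectors and margin below are illustrative placeholders.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss: pull same-class cross-modal pairs together,
    push different-class pairs at least `margin` further apart."""
    d_pos = np.sum((anchor - positive) ** 2)  # squared distance to same class
    d_neg = np.sum((anchor - negative) ** 2)  # squared distance to other class
    return max(0.0, d_pos - d_neg + margin)

a = np.array([0.0, 0.0])  # e.g. an image feature in the common subspace
p = np.array([0.1, 0.0])  # same-class text feature
n = np.array([2.0, 0.0])  # different-class text feature
print(triplet_loss(a, p, n))  # 0.0: the negative is already far enough away
```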
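The final retrieval step, comparing compact hash codes by XOR, can be shown with a small sketch. The 8-bit codes here are made up for illustration; real codes would come from binarizing the learned common-subspace features.

```python
def pack_bits(bits):
    """Pack a 0/1 bit list into a Python int so XOR compares all bits at once."""
    return int("".join(str(b) for b in bits), 2)

def hamming(a, b):
    """Hamming distance between two packed hash codes: XOR, then count set bits."""
    return bin(a ^ b).count("1")

# Hypothetical 8-bit hash codes for an image and a text instance.
img_code = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
txt_code = pack_bits([1, 0, 0, 1, 0, 1, 1, 0])
print(hamming(img_code, txt_code))  # codes differ in 2 bit positions
```

Because XOR and popcount are cheap bit operations, ranking candidates by Hamming distance is far faster than computing distances between real-valued feature vectors.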
Keywords/Search Tags: cross-modal retrieval, generative adversarial networks, dual learning, hash code