
Research On Cross-modal Retrieval Method Based On Adversarial Network

Posted on: 2021-04-25
Degree: Master
Type: Thesis
Country: China
Candidate: F Shang
Full Text: PDF
GTID: 2428330602464582
Subject: Computer software and theory
Abstract/Summary:
With the rapid advance of communication and Internet technology, multi-modal data has grown explosively. Massive multi-modal data not only benefits users but also poses new challenges to information retrieval technology. To better satisfy users' requirements for retrieval across modalities, and to give computers the ability to simulate cognition, learning, and decision-making over multi-modal data, cross-modal retrieval has emerged as the times require. A deep neural network (DNN) acts as a multi-layer nonlinear projection with stronger mapping ability than shallow models, and can fully extract multi-level abstract representations of different modalities. In particular, a generative adversarial network (GAN) can effectively fit the distribution of multi-modal data and better learn shared representations of different modalities. On the basis of adversarial networks, this thesis integrates the ideas of dictionary learning, metric learning, and dual subspaces, effectively captures the structural and semantic information of multi-modal data, and bridges the heterogeneity gap and the semantic gap. The main works and contributions are as follows:

1. This thesis proposes a Semantic Consistency cross-modal Dictionary learning algorithm with rank Constraint (SCDC), which integrates the l2,1-norm and a rank constraint into dictionary learning. It then introduces the generative adversarial mechanism and proposes Adversarial Cross-Modal Retrieval Based on Dictionary Learning (DLA-CMR), which utilizes dictionary learning to reconstruct discriminative features and takes advantage of adversarial learning to mine the complex statistical characteristics of multi-modal data. Specifically, the method constructs two antagonists: feature preservation and modality classification. The former ensures that the transformed features (features projected into the common space) have maximum correlation while retaining the statistical characteristics inherent to their own modality, effectively eliminating the heterogeneity gap. The latter is essentially a binary classifier that predicts the original modality of a transformed feature. Feature preservation and modality classification pursue opposite goals; through their continual contest both improve, and the model finally learns a common space that effectively crosses the heterogeneity gap and the semantic gap.

2. This thesis proposes cross-modal Dual Subspace learning with an Adversarial Network (DSAN), which considers dual subspaces, metric learning, and adversarial learning simultaneously. In particular, the dual subspaces effectively mine the structural information of each modality and make full use of modality-specific information. An improved quadruplet loss is proposed that considers both relative and absolute distances, pushing the boundary between positive and negative samples further apart to some extent. Hard sample mining is introduced, which effectively reduces training complexity and improves model performance. An intra-modal constrained loss is also proposed, which enlarges the distance between the closest cross-modal negative instances and the corresponding cross-modal positive instances. In addition, adversarial learning enables the different modalities to learn better feature representations in the dual subspaces, effectively improving cross-modal retrieval accuracy.
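The feature-preservation/modality-classification contest described above can be sketched as a minimal two-player training loop. This is an illustrative toy only, not the thesis's actual model: the linear projectors, the logistic modality classifier, the MSE pair-alignment term standing in for the correlation objective, and all dimensions and learning rates are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d_img, d_txt, d_common, n = 8, 6, 4, 32
X_img = rng.normal(size=(n, d_img))           # image features (toy data)
X_txt = rng.normal(size=(n, d_txt))           # matched text features
W_img = rng.normal(scale=0.1, size=(d_img, d_common))  # image projector
W_txt = rng.normal(scale=0.1, size=(d_txt, d_common))  # text projector
w_cls = rng.normal(scale=0.1, size=d_common)  # modality classifier weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def invariance_loss(Z_img, Z_txt):
    # Feature-preservation stand-in: matched pairs should coincide
    # in the common space (mean squared pair distance).
    return np.mean(np.sum((Z_img - Z_txt) ** 2, axis=1))

init = invariance_loss(X_img @ W_img, X_txt @ W_txt)
lr = 0.05
for _ in range(200):
    Z_img, Z_txt = X_img @ W_img, X_txt @ W_txt
    p_img, p_txt = sigmoid(Z_img @ w_cls), sigmoid(Z_txt @ w_cls)
    # Classifier step: binary cross-entropy, label 1 = image origin.
    grad_w = (Z_img.T @ (p_img - 1) + Z_txt.T @ p_txt) / n
    w_cls -= lr * grad_w
    # Projector step: fool the classifier AND keep matched pairs aligned.
    p_img, p_txt = sigmoid(Z_img @ w_cls), sigmoid(Z_txt @ w_cls)
    g_img = (1 - p_img)[:, None] * w_cls / n + 2 * (Z_img - Z_txt) / n
    g_txt = -p_txt[:, None] * w_cls / n + 2 * (Z_txt - Z_img) / n
    W_img -= lr * (X_img.T @ g_img)
    W_txt -= lr * (X_txt.T @ g_txt)

final = invariance_loss(X_img @ W_img, X_txt @ W_txt)
print(final < init)  # matched pairs should move closer in the common space
```

The alternating updates mirror the "constant fight" in the abstract: the classifier sharpens its modality prediction, while the projectors are pushed both to confuse it and to keep paired features close, so the common space becomes modality-indistinguishable.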
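The abstract does not give the exact form of the improved quadruplet loss, so the following sketch shows one common quadruplet formulation, a relative term (anchor-positive vs. anchor-negative) plus an "absolute" term over a pair of negatives, together with batch-hard negative mining. The function names, margins m1/m2, and the mining rule are assumptions for illustration, not the thesis's actual definitions.

```python
import numpy as np

def pairwise_dist(A, B):
    """Squared Euclidean distances between rows of A and rows of B."""
    return np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=2)

def quadruplet_loss(anchor, positive, neg1, neg2, m1=1.0, m2=0.5):
    """Relative term: anchor-positive closer than anchor-negative by m1.
    Absolute term: anchor-positive also closer than the neg1-neg2 pair
    by m2, pushing an absolute bound between positives and negatives."""
    d_ap = np.sum((anchor - positive) ** 2, axis=1)
    d_an = np.sum((anchor - neg1) ** 2, axis=1)
    d_nn = np.sum((neg1 - neg2) ** 2, axis=1)
    rel = np.maximum(0.0, d_ap - d_an + m1)   # relative-distance hinge
    ab = np.maximum(0.0, d_ap - d_nn + m2)    # absolute-distance hinge
    return np.mean(rel + ab)

def hardest_negatives(anchors, candidates):
    """Hard sample mining: for each anchor pick the closest candidate,
    so each update focuses on the most violating negatives."""
    d = pairwise_dist(anchors, candidates)
    return candidates[np.argmin(d, axis=1)]

rng = np.random.default_rng(1)
a = rng.normal(size=(5, 4))                   # anchors (toy embeddings)
p = a + 0.1 * rng.normal(size=(5, 4))         # positives near anchors
negs = rng.normal(loc=3.0, size=(8, 4))       # pool of negatives
n1 = hardest_negatives(a, negs)               # hardest w.r.t. anchors
n2 = hardest_negatives(p, negs)               # hardest w.r.t. positives
# (n1 and n2 may coincide here; a real sampler would draw n2 from a
# different class than n1.)
loss = quadruplet_loss(a, p, n1, n2)
print(float(loss) >= 0.0)  # hinge terms are nonnegative -> True
```

Mining only the hardest in-batch negatives is what keeps the quadruplet construction tractable: instead of enumerating all quadruplets, each anchor contributes one maximally violating quadruplet per update.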
Keywords/Search Tags:Cross-modal retrieval, Adversarial network, Dictionary learning, Metric learning, Deep learning