
Cross-Modal Retrieval Of Image-Text Based On Deep Learning

Posted on: 2021-02-11    Degree: Master    Type: Thesis
Country: China    Candidate: G Q Tian    Full Text: PDF
GTID: 2428330614958453    Subject: Computer technology
Abstract/Summary:
With the rapid development of next-generation information technologies such as the Internet, big data, and artificial intelligence, core theories such as big data analysis, cross-media computing, swarm intelligence, collaboration and optimization, machine learning, and brain-like intelligence continue to deepen. As an important topic and application in cross-media computing, cross-modal retrieval has received growing attention. To a certain extent, most existing cross-modal retrieval methods share two problems: the feature representations of each modality are not expressive enough, and the feature correlation model needs further improvement. To address these problems, this thesis proposes a feature correlation method based on adversarial networks, called FCMAN. The method first strengthens the representation of the image modality by fusing different image features. Second, building on feature correlation modeling with a single adversarial network, it introduces two new adversarial networks that model the real labels and the predicted labels of the projected image and text features, respectively. The feature correlation between the image and text modalities is thus learned through a combination of correlation models driven by multiple adversarial networks.

At the same time, to evaluate the performance of FCMAN and visually demonstrate its retrieval quality, an image-text cross-modal retrieval system is designed and implemented. With this system, users can submit a query in either modality, image or text, for retrieval. On top of the initial retrieval, accuracy is further improved by incorporating relevance feedback. Experimental analysis and application results show that the proposed FCMAN learns the feature correlation between image and text modalities more effectively and improves the accuracy of image-text cross-modal retrieval. On this basis, the retrieval system combined with relevance feedback further demonstrates the effectiveness of FCMAN. The research in this thesis provides new ideas and references for applying cross-modal retrieval technology, and has strong theoretical value and application prospects.
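The retrieval-with-feedback loop described above can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: it assumes image and text items have already been projected into a shared embedding space (as FCMAN's correlation model would produce), uses cosine similarity for ranking, and stands in for the unspecified "relevance feedback technology" with the classic Rocchio update. All names, vectors, and parameters (`gallery`, `alpha`, `beta`, `gamma`) are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query, gallery, top_k=3):
    """Rank gallery items (id -> shared-space embedding) by similarity to the query."""
    ranked = sorted(gallery, key=lambda item: cosine(query, gallery[item]), reverse=True)
    return ranked[:top_k]

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio update: move the query toward relevant embeddings, away from non-relevant ones."""
    dim = len(query)
    def centroid(vecs):
        if not vecs:
            return [0.0] * dim
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    return [alpha * q + beta * r - gamma * n for q, r, n in zip(query, rel_c, non_c)]

# Toy shared embedding space: a text query ranked against image embeddings.
gallery = {
    "img_cat": [0.9, 0.1, 0.0],
    "img_dog": [0.7, 0.3, 0.1],
    "img_car": [0.0, 0.2, 0.9],
}
query = [0.5, 0.5, 0.5]          # initial text embedding (hypothetical)
first = retrieve(query, gallery)
# The user marks img_cat relevant and img_car non-relevant; the query is refined.
refined = rocchio(query, [gallery["img_cat"]], [gallery["img_car"]])
second = retrieve(refined, gallery)
```

After one round of feedback, the refined query ranks the marked-relevant image higher and pushes the marked-irrelevant one to the bottom, which is the behavior the system's feedback step is meant to provide on top of the initial retrieval.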
Keywords/Search Tags:cross-modal retrieval, image and text, feature correlation, adversarial network, relevance feedback