
Cross-Modal Retrieval Of Image-Text Based On Deep Learning

Posted on: 2021-02-11    Degree: Master    Type: Thesis
Country: China    Candidate: G Q Tian    Full Text: PDF
GTID: 2428330614958453    Subject: Computer technology
Abstract/Summary:
With the rapid development of next-generation information technologies such as the Internet, big data, and artificial intelligence, core theories such as big data analysis, cross-media computing, swarm intelligence, collaboration and optimization, machine learning, and brain-like intelligence continue to deepen. As an important topic and application in cross-media computing, cross-modal retrieval has received growing attention. To a certain extent, most existing cross-modal retrieval methods share two problems: the feature representations of each modality are not expressive enough, and the feature correlation model needs further improvement. To address these problems, this thesis proposes a feature correlation method based on adversarial networks, called FCMAN. The method first strengthens the representation of the image modality by fusing different image features. Second, building on feature correlation modeling with a single adversarial network, it introduces two new adversarial networks that model the real labels and the predicted labels of the projected image and text features, respectively. The feature correlation between the image and text modalities is thus learned through a combination of correlation models driven by multiple adversarial networks.

At the same time, to evaluate the performance of FCMAN and visually demonstrate its retrieval quality, an image-text cross-modal retrieval system is designed and implemented. With this system, users can submit a query in either modality, image or text, for retrieval. On top of the initial retrieval, accuracy is further improved by incorporating relevance feedback. Experimental analysis and application results show that the proposed FCMAN learns the feature correlation between image and text modalities more effectively and improves the accuracy of image-text cross-modal retrieval. On this basis, the retrieval system combined with relevance feedback further demonstrates the effectiveness of FCMAN. The research in this thesis provides new ideas and references for applying cross-modal retrieval technology, and has strong theoretical value and application prospects.
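The retrieval-with-feedback loop described above can be sketched in a few lines. This is a minimal illustration, not the thesis's implementation: it assumes image and text items have already been projected into a shared embedding space (as FCMAN's correlation model would produce), uses cosine similarity for ranking, and stands in for the unspecified "relevance feedback technology" with the classic Rocchio update. All names, vectors, and parameters (`gallery`, `alpha`, `beta`, `gamma`) are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def retrieve(query, gallery, top_k=3):
    """Rank gallery items (id -> shared-space embedding) by similarity to the query."""
    ranked = sorted(gallery, key=lambda item: cosine(query, gallery[item]), reverse=True)
    return ranked[:top_k]

def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio update: move the query toward relevant embeddings, away from non-relevant ones."""
    dim = len(query)
    def centroid(vecs):
        if not vecs:
            return [0.0] * dim
        return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]
    rel_c = centroid(relevant)
    non_c = centroid(non_relevant)
    return [alpha * q + beta * r - gamma * n for q, r, n in zip(query, rel_c, non_c)]

# Toy shared embedding space: a text query ranked against image embeddings.
gallery = {
    "img_cat": [0.9, 0.1, 0.0],
    "img_dog": [0.7, 0.3, 0.1],
    "img_car": [0.0, 0.2, 0.9],
}
query = [0.5, 0.5, 0.5]          # initial text embedding (hypothetical)
first = retrieve(query, gallery)
# The user marks img_cat relevant and img_car non-relevant; the query is refined.
refined = rocchio(query, [gallery["img_cat"]], [gallery["img_car"]])
second = retrieve(refined, gallery)
```

After one round of feedback, the refined query ranks the marked-relevant image higher and pushes the marked-irrelevant one to the bottom, which is the behavior the system's feedback step is meant to provide on top of the initial retrieval.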
Keywords/Search Tags:cross-modal retrieval, image and text, feature correlation, adversarial network, relevance feedback