
Research On Cross-modal Retrieval Of Images And Texts Based On Deep Hashing Learning

Posted on: 2021-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: X Tian
Full Text: PDF
GTID: 2428330614458438
Subject: Computer technology
Abstract/Summary:
With the rapid development of the Internet, multi-modal data, including images, text, audio, and video, has grown explosively and enriched people's lives. Single-modal search, such as retrieving images with images or texts with texts, no longer satisfies users. As multi-modal data increases, people want to retrieve across modalities, for example retrieving texts with images or images with texts. Cross-modal retrieval has therefore become a research hotspot in recent years. Because data from different modalities lie in different feature spaces, measuring the similarity between them is a key difficulty. In practice, unlabeled data is easier to obtain than labeled data, and labeling it manually takes considerable effort. How to mine the label information contained in unlabeled data, given some labeled data, is a further difficulty. To make better use of unlabeled data, this thesis studies semi-supervised cross-modal retrieval methods.

To better preserve the similarity between multi-modal data while separating irrelevant data, this thesis proposes a semi-supervised deep hashing model based on modal similarity preservation. Building on this work, to better retain feature information and reduce the negative interference caused by redundant noise, the thesis proposes a semi-supervised deep hashing model based on a denoising autoencoder, which further improves the accuracy of cross-modal retrieval. The specific research work is as follows:

1. Some existing models neither preserve the similarity between modalities well nor separate irrelevant data. To address this, the thesis proposes a semi-supervised deep hashing model based on modal similarity preservation (SS-LPDP), together with its learning algorithm. The model has three parts: label prediction, hash code learning, and distance preservation. First, deep neural networks extract the features of images and texts, the corresponding hash functions are learned to project features of different dimensions into a common space, and the label information of unlabeled data is predicted from the feature distribution of some labeled data. Then, the label information and the extracted features are used as input for hash code learning and distance preservation. Finally, the label information of the unlabeled data is updated dynamically according to the parameter changes in each training iteration. Experimental results show that, compared with some recent models, the SS-LPDP model achieves some improvement in cross-modal retrieval accuracy.

2. Redundant noise causes negative interference during training of the SS-LPDP model. To address this, building on the existing work and the idea of the denoising autoencoder, the thesis proposes a semi-supervised deep hashing model based on a denoising autoencoder (SS-DAE). First, the features of images and texts are extracted with a deep neural network and fed into a denoising autoencoder, which consists of a random noise adding part, an encoding part, and a decoding part. Then, the label information of unlabeled data is predicted from the feature distribution of some labeled data, and the label information and the features extracted by the encoding part are used as input for hash code learning and distance preservation. Meanwhile, the reconstruction loss function of the denoising autoencoder is defined from the features produced by the decoding part and the features extracted by the neural network. Finally, the label information of the unlabeled data is updated dynamically according to the parameter changes in each training iteration. Experimental results show that, compared with some recent models, the SS-DAE model achieves some improvement in cross-modal retrieval accuracy.
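
The steps named in the abstract (noise corruption, projection into a common hash space, label prediction for unlabeled data, and retrieval by code distance) can be illustrated with a small Python sketch. This is not the thesis's implementation: the fixed projection matrices stand in for learned deep hash functions, masking noise is one assumed form of the "random noise adding part", nearest-centroid assignment is an assumed stand-in for the label prediction step, and all function names and values are hypothetical.

```python
import random


def add_masking_noise(features, drop_prob, rng):
    """Corrupt an input by zeroing each entry with probability drop_prob."""
    return [0.0 if rng.random() < drop_prob else x for x in features]


def to_hash_code(features, projection):
    """Project features into the common space and binarize with sign()."""
    code = []
    for row in projection:
        activation = sum(w * x for w, x in zip(row, features))
        code.append(1 if activation >= 0 else -1)
    return code


def hamming_distance(code_a, code_b):
    """Number of bit positions where two equal-length codes differ."""
    return sum(1 for a, b in zip(code_a, code_b) if a != b)


def predict_label(feature, labeled_features, labels):
    """Pseudo-label a sample with the label of the nearest class centroid
    (an assumed simplification of the thesis's label prediction part)."""
    sums, counts = {}, {}
    for vec, label in zip(labeled_features, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1

    def sq_dist_to_centroid(label):
        centroid = [s / counts[label] for s in sums[label]]
        return sum((x - c) ** 2 for x, c in zip(feature, centroid))

    return min(sums, key=sq_dist_to_centroid)


def rank_by_hamming(query_code, database_codes):
    """Return database indices ordered from nearest to farthest code."""
    return sorted(range(len(database_codes)),
                  key=lambda i: hamming_distance(query_code, database_codes[i]))


# Hypothetical hand-picked projections standing in for learned hash functions.
IMAGE_PROJ = [[1.0, 0.0, 0.0], [0.0, 1.0, -1.0]]  # 3-d image feature -> 2 bits
TEXT_PROJ = [[1.0, 0.0], [0.0, 1.0]]              # 2-d text feature -> 2 bits

# Image-to-text retrieval: hash one image query and rank three text items.
image_query = to_hash_code([0.5, 2.0, 1.0], IMAGE_PROJ)
text_codes = [to_hash_code(t, TEXT_PROJ)
              for t in ([1.0, 1.0], [-1.0, 1.0], [-1.0, -1.0])]
ranking = rank_by_hamming(image_query, text_codes)

# Corrupted copy of the image feature, as a denoising autoencoder would see it.
noisy = add_masking_noise([0.5, 2.0, 1.0], 0.3, random.Random(0))
```

Because both modalities are mapped to codes of the same length, an image query can be compared directly with text items through Hamming distance, which is the property that makes hashing attractive for cross-modal retrieval. In a trained model the projections would be learned so that semantically related image and text pairs receive nearby codes.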
Keywords/Search Tags:cross-modal retrieval, deep hashing, semi-supervised, similarity preservation, denoising autoencoder