
Research On Deep Cross-modal Hashing Retrieval Method Based On Features Fusion

Posted on: 2022-11-04
Degree: Master
Type: Thesis
Country: China
Candidate: H F Wang
Full Text: PDF
GTID: 2518306743474154
Subject: Computer technology
Abstract/Summary:
With the continuous development of Internet technology and artificial intelligence, cross-modal retrieval has become a hot research topic, adapting to the increasingly diverse forms of online information. Owing to their small storage footprint and fast retrieval speed, hashing methods have attracted extensive attention in cross-modal retrieval. This thesis addresses cross-modal retrieval from both the supervised and the unsupervised hashing perspective; the research work is as follows:

A supervised cross-modal hashing method, called Shared Semantics based on Triple-Fusion Hashing for deep cross-modal retrieval (SSTFH), is proposed. Most existing cross-modal hashing methods preserve similarity only by imposing constraints on the learned hash codes, and rarely consider maintaining similarity among intermediate features. In addition, much semantic information is lost as features pass through many fully-connected and pooling layers. To address these problems, a triple-fusion strategy is employed. The first and second fusions combine the global abstract features and local detailed features from the different modalities, reducing the semantic information lost in the fully-connected and pooling layers. The third fusion then fuses the local features from the different modalities to construct a shared semantic space, which strengthens the association between modal features and narrows the heterogeneity gap between modalities, benefiting the learning of discriminative hash codes. Extensive experiments on the MS COCO and IAPR TC-12 datasets verify the effectiveness of the proposed method.

An unsupervised cross-modal hashing approach, named Multi-scale Fused Dual GAN Cross-modal Hashing retrieval (MFDCH), is also put forward. Most unsupervised works use a two-stream network to learn hash codes for the two modalities, but introduce no correlation between the two streams during learning. Similarly, most
unsupervised methods also suffer from semantic information loss during feature extraction. Based on these considerations, this thesis first proposes extracting semantic features at different scales from different network layers, then fusing them to obtain semantically richer features and reduce the loss of semantic information. Second, an inner-outer double-cycle generative adversarial network is designed: the outer cycle increases the correlation between modalities and learns more powerful modal representations, while the inner cycle further enhances the relevance among modality-specific features to learn high-quality hash codes. Extensive experiments on the two widely used MIRFlickr-25K and IAPR TC-12 datasets show that the proposed model improves retrieval performance.
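The core step shared by both methods — projecting each modality into a common space, fusing the representations, and binarizing the fused result into hash codes — can be sketched as follows. This is a minimal illustration with NumPy: the random projection matrices stand in for learned network weights, and all dimensions, the tanh activation, and the averaging fusion are assumptions for illustration, not the thesis's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not taken from the thesis).
d_img, d_txt, d_shared, n_bits = 512, 300, 128, 64

# Random matrices stand in for the learned projection networks.
W_img = rng.standard_normal((d_img, d_shared)) * 0.01
W_txt = rng.standard_normal((d_txt, d_shared)) * 0.01
W_hash = rng.standard_normal((d_shared, n_bits)) * 0.01

def to_shared_space(img_feat, txt_feat):
    """Project each modality into a common space and fuse by averaging."""
    z_img = np.tanh(img_feat @ W_img)
    z_txt = np.tanh(txt_feat @ W_txt)
    return (z_img + z_txt) / 2.0  # fused shared-semantic representation

def hash_codes(shared):
    """Binarize the shared representation into +/-1 hash codes via sign()."""
    return np.sign(shared @ W_hash)

# A mini-batch of 4 paired image/text feature vectors.
img = rng.standard_normal((4, d_img))
txt = rng.standard_normal((4, d_txt))
codes = hash_codes(to_shared_space(img, txt))
print(codes.shape)  # (4, 64): one 64-bit code per image-text pair
```

In practice the projections would be trained end-to-end with similarity-preserving (supervised) or adversarial (unsupervised) losses, and the non-differentiable sign() would be relaxed (e.g. with tanh) during training.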
Keywords/Search Tags: Cross-modal retrieval, Deep hashing, Features fusion, Shared semantics, Generative adversarial networks (GAN)