Zero-Shot Learning For Deep Cross-Modal Hashing

Posted on: 2022-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y Gao
Full Text: PDF
GTID: 2518306605467884
Subject: Circuits and Systems
Abstract/Summary:
With the rapid development of information technology, the era of big data has arrived. Massive multi-modal data from abundant sources and in diverse modalities affect every aspect of society, and how to use these data effectively has become an important problem. Information retrieval, a mainstream form of data utilization, plays an increasingly important role. Within multimedia information retrieval, cross-modal retrieval has become a research hotspot because it can bring together images, text, and other modalities. It aims to explore and characterize the semantic relationships among multi-modal data, so that a sample from one modality can be used to query semantically similar samples in other modalities. Owing to their low storage cost and high retrieval efficiency, hashing-based cross-modal retrieval methods have become one of the mainstream approaches in multimedia data retrieval.

In recent years, many powerful supervised cross-modal hash retrieval algorithms have been proposed. However, as new concepts and new categories of data emerge in large numbers, the labor cost of labeling data keeps rising. Moreover, after newly added data have been manually labeled, a traditional cross-modal hash retrieval model must be retrained, which further increases the time cost. To improve the transferability of the model to new data, this thesis introduces zero-shot learning. Zero-shot learning trains a model on samples of known classes and uses additional prior knowledge so that the model can complete the corresponding task on samples of unknown classes. Building on the transferability of zero-shot learning to unknown classes, and taking into account the challenges of both zero-shot learning and cross-modal hash retrieval, this work proposes two cross-modal hash retrieval methods based on zero-shot learning. The main contributions are as follows.

First, we propose a zero-shot cross-modal hash retrieval method based on attribute and tag semantic constraints. To address the difficulty of associating known and unknown samples in zero-shot learning, we design a common-space mapping network based on attribute features and build the semantic association between known and unknown samples through the attribute space. In addition, we construct a feature encoding network based on tag semantics that mines the rich semantic information in tags and aligns cross-modal features with it. This reduces both the heterogeneity gap between modalities and the loss of semantic information incurred when high-dimensional data are mapped to a low-dimensional representation. Experiments on three popular datasets show that the method achieves competitive results.

Second, we propose a zero-shot cross-modal hash retrieval method based on semantic feature reconstruction. To reduce the semantic gap between the low-dimensional discrete hash codes and the high-dimensional real-valued features of the original data, we reconstruct the learned hash codes back into the feature semantic space and design semantic-based constraint loss functions for both the intra-modal and inter-modal cases, ensuring that the reconstructed semantic features remain consistent with the feature semantics of the original data. As a result, the learned hash codes better preserve the semantic similarity relationships in the original data, and the semantic gap is further reduced. Experiments on three popular datasets show that the method achieves competitive results.
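The low storage cost and high retrieval efficiency attributed to hashing-based retrieval can be illustrated with a minimal Python sketch. It is not the method proposed in the thesis: it assumes that features from both modalities have already been mapped into a shared real-valued space, binarizes them by sign (a common simplification), and ranks by Hamming distance. The function names (`binarize`, `hamming_distance`, `retrieve`) and the toy feature values are hypothetical and chosen only for illustration.

```python
def binarize(features):
    """Pack the signs of a real-valued feature vector into one integer hash code.

    Assumption: sign thresholding stands in for a learned hashing network.
    """
    code = 0
    for value in features:
        code = (code << 1) | (1 if value >= 0 else 0)
    return code


def hamming_distance(code_a, code_b):
    """Count the bit positions in which two hash codes differ."""
    return bin(code_a ^ code_b).count("1")


def retrieve(query_code, database_codes, top_k=2):
    """Return indices of the top_k database codes closest to the query."""
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return ranked[:top_k]


# Toy example: one image query against three text items (made-up values).
image_features = [0.9, -0.2, 0.4, -0.7]
text_features = [
    [0.8, -0.1, 0.3, -0.5],
    [-0.6, 0.2, 0.5, -0.9],
    [-0.3, 0.7, -0.4, 0.6],
]

query_code = binarize(image_features)
text_codes = [binarize(f) for f in text_features]
ranking = retrieve(query_code, text_codes, top_k=2)
print(ranking)  # [0, 1]
```

Each item is stored as a single short bit string, and comparison reduces to an XOR and a bit count, which is where the storage and speed advantages come from. In this toy case the first text item matches the query in every bit and is ranked first.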
Keywords/Search Tags: Deep neural network, Hashing, Zero-shot learning, Cross-modal retrieval