
Multimedia Retrieval Based On Deep Hashing With Associated Feature

Posted on: 2022-10-26    Degree: Master    Type: Thesis
Country: China    Candidate: J H Wang    Full Text: PDF
GTID: 2558307070952469    Subject: Computer application technology
Abstract/Summary:
With the popularity of handheld intelligent terminals and social multimedia platforms, multimedia has become the main form of media through which people communicate. How to quickly retrieve the desired content from massive multimedia data has therefore become a hot research issue. With the rapid rise of deep learning, and given the unique advantages of hashing methods in computation and storage, deep hashing has won the favor of researchers. Although existing deep hashing methods can serve multimedia retrieval well, they still have shortcomings. This thesis therefore focuses on how to make efficient use of the associated information in multi-modal data. The main work and contributions are summarized as follows:

(1) A cross-modal hashing method based on multi-modal joint information is proposed. Most existing methods do not make full use of the consistency information between modalities, so the generated hash codes carry insufficient semantic information. This thesis therefore proposes a cross-modal hashing method focused on multi-modal joint information, which jointly learns modality-specific hashing and multi-modal joint hashing in an end-to-end network to preserve inter-modal and intra-modal similarity and to bridge the semantic gap between modalities. By matching the conditional probability distributions of the modality-specific hash codes and the multi-modal joint hash codes in the feature space, the image and text hash codes are enriched with multi-modal joint information. On this basis, the discriminativeness of the hash codes is further improved by aligning the conditional probability distributions across hash-code lengths. The method achieves excellent retrieval performance on public datasets.

(2) A cross-modal hashing method based on associated knowledge distillation is proposed. To ensure retrieval efficiency, mainstream hashing methods mostly use models with simple structures, so high-level semantic knowledge is difficult to mine fully. In view of this, this thesis proposes a cross-modal hashing method based on associated knowledge distillation, which divides the framework into a teacher hash network and a student hash network. The teacher network uses Transformer-based self-attention and co-attention mechanisms to capture inter-modal and intra-modal context information, and then obtains a joint representation containing rich visual-semantic associated information. The modality-specific latent semantic information and the multi-modal associated complementary knowledge learned by the teacher network are then transferred to a lightweight student network, so that the student can generate hash codes that retain multi-source knowledge while ensuring retrieval speed. Experiments on several public datasets demonstrate the effectiveness of the method.
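To make the distribution-matching idea in (1) concrete, the following is a minimal sketch (PyTorch assumed): relaxed hash codes from hypothetical image, text, and joint branches are turned into row-wise conditional probability distributions over a mini-batch, and the modality-specific distributions are KL-aligned to the joint one. The branch names, the similarity-softmax construction, and the single-term loss are illustrative assumptions, not the thesis's exact formulation.

```python
import torch
import torch.nn.functional as F

def conditional_distribution(codes: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Row-wise conditional distribution P(j|i) from pairwise similarities
    between relaxed hash codes within one mini-batch."""
    sim = codes @ codes.t() / codes.shape[1]    # inner products, scaled by code length
    return F.softmax(sim / temperature, dim=1)  # self-pairs kept for simplicity

def joint_alignment_loss(img_codes, txt_codes, joint_codes):
    """KL-align each modality-specific distribution to the multi-modal joint
    distribution so that image/text codes absorb joint information."""
    p_joint = conditional_distribution(joint_codes).detach()
    loss = 0.0
    for codes in (img_codes, txt_codes):
        q = conditional_distribution(codes)
        loss = loss + F.kl_div(q.log(), p_joint, reduction='batchmean')
    return loss

# Toy usage with relaxed (tanh) codes standing in for three network branches.
B, K = 32, 64                                   # batch size, hash code length
img_codes = torch.tanh(torch.randn(B, K))
txt_codes = torch.tanh(torch.randn(B, K))
joint_codes = torch.tanh(torch.randn(B, K))
print(float(joint_alignment_loss(img_codes, txt_codes, joint_codes)))
```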
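Likewise, the teacher-to-student transfer in (2) can be sketched as a standard distillation loss: a lightweight student mimics the teacher's relaxed codes and the pairwise similarity structure the teacher induces over a batch. The student architecture, loss terms, and temperature below are hypothetical placeholders; the thesis's Transformer-based teacher is represented only by stand-in codes.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class StudentHashNet(nn.Module):
    """Lightweight modality-specific encoder mapping pre-extracted features
    to relaxed hash codes in (-1, 1)."""
    def __init__(self, in_dim: int, code_len: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(inplace=True),
            nn.Linear(512, code_len),
        )

    def forward(self, x):
        return torch.tanh(self.mlp(x))

def distillation_loss(student_codes, teacher_codes, temperature=2.0):
    """Transfer teacher knowledge to the student: (a) match the relaxed codes
    directly, (b) match the pairwise similarity structure over the batch."""
    t = teacher_codes.detach()                  # teacher provides fixed targets
    code_loss = F.mse_loss(student_codes, t)
    s_sim = F.log_softmax(student_codes @ student_codes.t() / temperature, dim=1)
    t_sim = F.softmax(t @ t.t() / temperature, dim=1)
    rel_loss = F.kl_div(s_sim, t_sim, reduction='batchmean')
    return code_loss + rel_loss

# Toy usage: teacher codes would come from the frozen Transformer-based teacher.
B, D, K = 16, 2048, 64
student = StudentHashNet(D, K)
img_feats = torch.randn(B, D)
teacher_codes = torch.tanh(torch.randn(B, K))   # stand-in for teacher output
loss = distillation_loss(student(img_feats), teacher_codes)
loss.backward()
print(float(loss))
```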
Keywords/Search Tags:Deep hashing, Cross-modal retrieval, Conditional probability distribution, Knowledge distillation, Transformer