Attention-based Fusion Triplet Hashing For Cross-modal Retrieval

Posted on: 2022-09-23 | Degree: Master | Type: Thesis
Country: China | Candidate: Y B Hu | Full Text: PDF
GTID: 2518306605972039 | Subject: Signal and Information Processing
Abstract/Summary:
With the advancement of science and technology and the popularization of the Internet, big data is widely spread in the form of multimedia data such as images, texts, and videos. How to store and analyze massive multimedia data quickly and efficiently is a problem of widespread concern for researchers in many fields. Hash-based retrieval methods offer low storage consumption and fast retrieval speed, making them an important technical means of storing and analyzing massive multimedia data. Among them, hash-based cross-modal retrieval is a focus and a difficulty of current research. Because deep learning has powerful feature representation capabilities, many scholars have proposed cross-modal retrieval methods that incorporate deep learning to improve retrieval performance. However, existing deep cross-modal hashing methods still face two unsolved problems. First, deep networks have difficulty extracting the correlation information of features across modalities, so they cannot effectively express the consistency of features between modalities. Second, the similarity matrix has difficulty measuring the degree of similarity of samples across modalities, so it cannot effectively express the ambiguity of samples between modalities. To address these problems, this thesis proposes two deep cross-modal hashing methods:

(1) A cross-modal hashing method based on an attention fusion mechanism. Since deep networks have difficulty extracting the correlation information of features between modalities, this thesis studies existing deep network models and designs an attention fusion module that enhances the semantic relevance of features between modalities and effectively reduces redundant information in the different modalities (a sketch of such a module is given after this abstract). In addition, the method uses two modality classifiers for adversarial learning, which enhances the consistency of the feature and hash-code distributions and significantly improves cross-modal retrieval performance.

(2) A cross-modal hashing method based on an improved triplet loss. Since the similarity matrix has difficulty measuring the similarity of samples, this method proposes a new triplet loss function with two parts, an inter-modal loss with adaptive weights and an intra-modal center-distance loss, which together model the similarity relationships among samples of different modalities (a second sketch follows the abstract). The adaptively weighted inter-modal loss captures similarity information between samples of different modalities, and the intra-modal center-distance loss retains similarity information among samples within each modality. The method effectively alleviates the semantic gap between samples of different modalities and significantly improves the accuracy of cross-modal retrieval.

This thesis conducts experiments on three public datasets and compares the proposed methods with several commonly used cross-modal retrieval algorithms. The experimental results verify the effectiveness of the two proposed deep cross-modal hashing methods.
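To make the attention fusion idea in (1) concrete, the following is a minimal PyTorch sketch. The module layout, layer sizes, and names (AttentionFusion, ModalityClassifier, img_dim, txt_dim, common_dim) are illustrative assumptions, not the thesis's actual architecture.

```python
# A minimal sketch of an attention-based fusion module, assuming image and
# text features have already been extracted by backbone networks.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Projects both modalities into a common space and fuses them with
    learned attention weights, down-weighting redundant information."""
    def __init__(self, img_dim=4096, txt_dim=1024, common_dim=512):
        super().__init__()
        self.img_proj = nn.Linear(img_dim, common_dim)
        self.txt_proj = nn.Linear(txt_dim, common_dim)
        # Attention scores over the two modalities, computed jointly.
        self.attn = nn.Sequential(
            nn.Linear(common_dim * 2, common_dim),
            nn.Tanh(),
            nn.Linear(common_dim, 2),
            nn.Softmax(dim=-1),
        )

    def forward(self, img_feat, txt_feat):
        h_img = torch.tanh(self.img_proj(img_feat))
        h_txt = torch.tanh(self.txt_proj(txt_feat))
        w = self.attn(torch.cat([h_img, h_txt], dim=-1))  # (batch, 2)
        fused = w[:, :1] * h_img + w[:, 1:] * h_txt
        return fused, h_img, h_txt

class ModalityClassifier(nn.Module):
    """Predicts which modality a feature came from; training it adversarially
    against the encoders pushes the two feature distributions to align."""
    def __init__(self, common_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(common_dim, 128), nn.ReLU(), nn.Linear(128, 2))

    def forward(self, h):
        return self.net(h)
```

A hash layer (e.g. tanh during training, sign at inference) would then map the fused representation to binary codes; the adversarial game between the classifier and the encoders is what encourages consistent feature and hash-code distributions across modalities.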
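Similarly, the improved triplet loss in (2) might look like the sketch below. The adaptive weighting via a sigmoid of the distance gap, the margin value, and the function names are assumptions made for illustration, not the thesis's exact formulation.

```python
# A sketch of a triplet loss with an adaptive inter-modal term and an
# intra-modal center-distance term, under assumed definitions.
import torch
import torch.nn.functional as F

def inter_modal_triplet(anchor, pos, neg, margin=0.5):
    """Anchor comes from one modality (e.g. image codes); pos/neg from the
    other. Harder triplets receive larger adaptive weights."""
    d_pos = F.pairwise_distance(anchor, pos)
    d_neg = F.pairwise_distance(anchor, neg)
    w = torch.sigmoid(d_pos - d_neg).detach()  # adaptive weight per triplet
    return (w * F.relu(d_pos - d_neg + margin)).mean()

def intra_modal_center(features, labels, centers):
    """Pulls each sample toward its class center within its own modality,
    preserving intra-modal similarity structure."""
    return ((features - centers[labels]) ** 2).sum(dim=1).mean()

# Hypothetical usage with 64-bit relaxed codes.
img_codes = torch.randn(8, 64)
txt_pos, txt_neg = torch.randn(8, 64), torch.randn(8, 64)
loss = inter_modal_triplet(img_codes, txt_pos, txt_neg)
```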
Keywords/Search Tags:Attention-based fusion mechanism, Triplet loss, Hashing, Cross-modal retrieval, Deep learning