
Research On Cross-modal Retrieval Method Based On Deep Semantic Hashing

Posted on: 2022-10-05
Degree: Master
Type: Thesis
Country: China
Candidate: W W Weng
GTID: 2518306557968469
Subject: Computer technology
Abstract/Summary:
Cross-modal retrieval is retrieval across different modalities: data from one modality is used as a query to retrieve relevant data from another. Because of their low storage cost and fast query speed, hashing-based methods have attracted growing attention and are widely used in cross-modal retrieval. However, many cross-modal hashing methods rely on traditional hand-crafted features, whose low quality greatly reduces retrieval accuracy. In recent years, with the rapid development of deep learning, high-quality feature extraction based on deep models has significantly improved retrieval accuracy and has drawn the attention of many researchers. A difficulty remains: the heterogeneity of modalities leads to a semantic gap, which limits further improvement in retrieval performance. How to handle the semantic gap between heterogeneous modalities is therefore a major challenge for cross-modal retrieval.

To address these problems, this thesis studies cross-modal retrieval based on deep semantic hashing. First, the relevant deep learning background is studied and analyzed, laying the foundation for constructing a deep cross-modal model. Next, hash learning algorithms are studied to provide theoretical guidance for cross-modal hashing retrieval. Attention mechanisms and related techniques are also briefly studied. Finally, a deep hashing model framework for cross-modal retrieval is proposed on the basis of this theory.

On this basis, label-based deep semantic hashing for cross-modal retrieval (LDSH) is proposed. In this method, feature learning and hash code learning for each modality are integrated into a single framework through deep neural networks. Moreover, multi-label annotations are used to construct a similarity-preserving matrix that describes the degree of similarity between modalities, preserving the rich semantic information of each modality's data to the greatest extent. Furthermore, a block structure (B-Structure) is introduced into the model to address redundancy between hash bits. Experiments show that the method is effective in improving the accuracy of cross-modal retrieval.

Building on LDSH, deep semantic hashing with dual attention for cross-modal retrieval (DSHDA) is then proposed. In this method, multi-label data is used to train a semantic label network (SeLabNet), which extracts consistent semantic information to guide the training of each modality network and maximize the semantic correlation between modalities. In addition, Lo-attention is used to extract the local key information of each modality to improve the quality of the extracted features, and Co-attention is used to address the semantic gap caused by the heterogeneity of modalities. The experimental results show that this method can improve the accuracy of cross-modal retrieval.

The research results of this thesis offer new ideas for cross-modal retrieval across the semantic gap and can also be applied in practice, giving the work both theoretical value and broad application prospects.
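
As a rough illustration of two ideas mentioned in the abstract, the sketch below builds a similarity matrix from multi-label annotations and ranks database items by Hamming distance between hash codes. It is not the thesis's LDSH or DSHDA implementation: the use of cosine similarity over multi-hot label vectors, the function names (`label_similarity`, `hamming`, `rank_by_hamming`), and the toy labels and 4-bit codes are all assumptions made for illustration. In a real system the hash codes would come from trained image and text networks.

```python
from math import sqrt


def label_similarity(labels_a, labels_b):
    """Pairwise cosine similarity between multi-hot label vectors.

    Assumption: cosine similarity is used here only as one plausible way
    to grade similarity from multi-labels; the thesis may define it differently.
    """
    def cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
        return dot / norm if norm else 0.0

    return [[cosine(u, v) for v in labels_b] for u in labels_a]


def hamming(code_a, code_b):
    """Number of bit positions where two hash codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))


def rank_by_hamming(query_code, database_codes):
    """Database indices sorted by increasing Hamming distance to the query."""
    return sorted(
        range(len(database_codes)),
        key=lambda i: hamming(query_code, database_codes[i]),
    )


# Hypothetical toy data: 2 images and 3 texts annotated over 3 label classes.
image_labels = [[1, 1, 0], [0, 0, 1]]
text_labels = [[1, 0, 0], [0, 0, 1], [1, 1, 0]]
S = label_similarity(image_labels, text_labels)  # S[i][j]: image i vs text j

# Hypothetical 4-bit codes in {-1, +1}; an image code queries the text database.
text_codes = [[1, -1, 1, 1], [-1, -1, -1, 1], [1, 1, 1, -1]]
query = [1, 1, 1, 1]
ranking = rank_by_hamming(query, text_codes)
```

A graded matrix of this kind distinguishes pairs that share all labels from pairs that share only some, which a binary similar/dissimilar matrix cannot do. The Hamming ranking step shows why hash codes keep queries fast: comparing two codes only requires counting differing bits.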
Keywords/Search Tags: Cross-Modal Retrieval, Deep Learning, Hashing Learning, Multi-Label Data, Dual Attention