
Design And Implementation Of Retrieval System Oriented To Cross-modal Hashing

Posted on: 2022-07-15  Degree: Master  Type: Thesis
Country: China  Candidate: F Li  Full Text: PDF
GTID: 2518306764494274  Subject: Internet Technology
Abstract/Summary:
With the rapid development of artificial intelligence, the variety of multimedia data has grown quickly, and cross-modal retrieval has attracted increasing attention. The goal of cross-modal retrieval is to retrieve instances of one modality given a query instance of another modality, so that different types of multimedia data can be searched against each other. However, because the amount of multimedia data keeps growing rapidly, fast and efficient cross-modal retrieval has become a major challenge. To address this problem, many hashing-based methods have been proposed. This thesis therefore studies the key technologies of cross-modal hashing. The main work of the thesis includes:

(1) An unsupervised cross-modal hashing method with auxiliary knowledge is proposed. To address the large apparent differences between data of different modalities, this thesis introduces auxiliary knowledge from the object detection task into unsupervised cross-modal hash learning. This auxiliary knowledge helps to construct correlations among the media data and promotes unsupervised hash learning. The thesis proposes a framework that combines the correlations constructed from the auxiliary knowledge with those constructed from the media data, thereby exploiting the complementary information in both. Experiments on two widely used datasets, MIRFlickr and NUS-WIDE, verify the effectiveness of the proposed method.

(2) An object-level visual-text correlation graph hashing approach is proposed to mine fine-grained, object-level similarity in cross-modal data while suppressing noise interference. Specifically, a novel intra-modality correlation graph module is designed to learn graph-level representations of different modalities in an unsupervised manner, and the constructed graph structure retains the global information of the original semantic structure. A visual-text dependency building module is then designed to capture semantic correlations between modalities by modeling the dependency between image object regions and text tags. Extensive experiments on two widely used datasets verify the effectiveness of the proposed approach.

(3) An intelligent image and text retrieval system based on deep learning is developed. Building on the two research results above, this thesis designs an intelligent image and text retrieval system that implements three functions: feature generation, searching text by image, and searching images by text. The system meets users' needs for multimodal data retrieval and requires no additional manual data annotation for network training. It therefore meets the needs of practical application and has good application prospects.
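To illustrate why hashing makes cross-modal retrieval fast, the following is a minimal sketch of the retrieval step only, not the thesis's implementation. It assumes that some trained model has already mapped images and texts into a shared binary code space; the function names (`binarize`, `hamming`, `retrieve`), the 8-bit code length, and the toy database are all hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def binarize(features: Sequence[float]) -> int:
    """Pack the sign of each feature into one integer hash code (bit = 1 where value > 0)."""
    code = 0
    for value in features:
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code: int, database: Dict[str, int], top_k: int = 3) -> List[Tuple[str, int]]:
    """Rank items of the other modality by Hamming distance to the query code."""
    ranked = sorted(
        database.items(),
        key=lambda item: (hamming(query_code, item[1]), item[0]),  # ties broken by name
    )
    return [(name, hamming(query_code, code)) for name, code in ranked[:top_k]]


# Hypothetical 8-bit text codes; a trained text encoder would produce these.
text_db = {
    "a dog on the beach": 0b10110010,
    "city skyline at night": 0b01001101,
    "puppy playing in sand": 0b10110110,
}

# Assumed real-valued output of an image encoder, binarized by sign.
image_query = binarize([0.9, -0.2, 0.7, 0.4, -0.1, -0.3, 0.8, -0.5])

# "Search text by image": the nearest text codes in Hamming space.
results = retrieve(image_query, text_db, top_k=2)
print(results)
```

Because each comparison is an XOR followed by a bit count, ranking a large database is far cheaper than comparing real-valued feature vectors, which is the efficiency argument behind hashing-based retrieval. Searching images by text works the same way with the roles of the two modalities swapped.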
Keywords/Search Tags:Unsupervised Learning, Cross-modal Retrieval, Hash Learning, Graph Convolutional Neural Network, Intelligent System