
Coupled-hashing For Cross-modal Retrieval

Posted on: 2018-03-10
Degree: Master
Type: Thesis
Country: China
Candidate: Y H Liu
Full Text: PDF
GTID: 2348330518499374
Subject: Engineering
Abstract/Summary:
With the rapid development of the mobile Internet, the Internet of Things, and cloud storage, multimodal data (including image, text, video, and audio) have become the main information carriers on the Internet, and tremendous amounts of multimedia data have accumulated. These enormous multimedia data contain rich economic and social value and bring new challenges and opportunities for national economic and social development. For these reasons, how to retrieve across data from different modalities has become a hot issue in information retrieval. Cross-modal hashing methods use hash functions to encode the high-dimensional features of multimodal data into low-dimensional binary hash codes while preserving the similarity of the high-dimensional features. Owing to its low storage cost and fast query speed, cross-modal hashing has attracted intensive attention in the retrieval field. The semantic gap between low-level features and high-level semantics is one of the main difficulties in cross-modal retrieval, yet most existing methods are limited to hand-crafted features with limited representation ability. In this thesis, two cross-modal hashing methods based on coupled relationships are proposed. The main contributions are as follows:

(1) A cross-modal hashing method based on a joint coupled relationship is proposed. Considering that multimodal data have heterogeneous structures, we reject the strategy of projecting data from different modalities directly into a common Hamming space. Instead, the multimodal data are projected into opposing Hamming spaces, and the different modalities are then coupled. Meanwhile, exploiting the fact that matrix decomposition can mine the latent semantic space, we use the hash codes to reconstruct the original data via matrix decomposition. By mining the latent semantic space, the method not only improves the representation ability of the features but also overcomes the semantic gap, so that the multimodal data can be tightly coupled. The experimental results show that the proposed method achieves high precision and also improves retrieval efficiency compared with other methods.

(2) A cross-modal hashing method based on a deep coupled relationship is proposed. To overcome the limited representation ability of hand-crafted features, we first use two kinds of deep networks (CNN-F and MLP) to extract the features of the multimodal data, respectively, and then output the hash codes at the end of the networks, integrating feature extraction and hash learning into a unified framework. The multimodal data are coupled from two aspects: on the one hand, the deep convolutional networks mine features with stronger representation ability; on the other hand, a similarity matrix generated from the category labels constrains the hash codes of the different modalities from two angles. Experiments show that the proposed approach outperforms several state-of-the-art methods in terms of retrieval effect.
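The pipeline described above (binary hash codes that preserve feature similarity, plus a similarity matrix derived from category labels as supervision) can be sketched as follows. This is an illustrative toy example under stated assumptions, not the thesis's actual method: random projections stand in for the learned hash functions, and all dimensions, variable names, and data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy features for two modalities (hypothetical sizes and dimensions).
n, d_img, d_txt, n_bits = 6, 8, 5, 16
X_img = rng.standard_normal((n, d_img))   # image features
X_txt = rng.standard_normal((n, d_txt))   # text features
labels = np.array([0, 0, 1, 1, 2, 2])     # category labels

# Label-based similarity matrix: S[i, j] = 1 iff samples i and j share a label.
S = (labels[:, None] == labels[None, :]).astype(int)

# Linear hash functions; random projections stand in for learned ones.
W_img = rng.standard_normal((d_img, n_bits))
W_txt = rng.standard_normal((d_txt, n_bits))

def hash_codes(X, W):
    """Project features and binarize with the sign to get {0, 1} codes."""
    return (X @ W > 0).astype(np.uint8)

B_img = hash_codes(X_img, W_img)
B_txt = hash_codes(X_txt, W_txt)

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.sum(a != b))

# Cross-modal retrieval: rank text items by Hamming distance to an image query.
query = B_img[0]
ranking = sorted(range(n), key=lambda j: hamming(query, B_txt[j]))
print(ranking)
```

In the actual methods, the projections would be learned rather than random (via matrix decomposition with reconstruction in method (1), or as the output layers of the CNN-F and MLP networks in method (2)) so that pairs with `S[i, j] = 1` end up with small Hamming distance between their codes.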
Keywords: Cross-Modal Hashing, Matrix Decomposition, Reconstruction Embedding, Latent Semantic Space, Deep Learning