Triplet-Based Deep Hashing Network For Cross-Modal Retrieval

Posted on: 2019-11-05

Degree: Master

Type: Thesis

Country: China

Candidate: Z J Chen

GTID: 2428330572952225

Subject: Circuits and Systems

Abstract/Summary:

Over the past decade, with the rapid development of Internet technology and social networks, millions of multimedia items have been generated every day. Multimedia data on the Internet exists in different forms and comes from heterogeneous sources. For example, a web page may contain data from multiple modalities, such as text, pictures, and videos. Although these data come from different modalities, they are strongly correlated semantically. Cross-modal retrieval is designed for scenarios in which the queries and the retrieval results come from different modalities. It faces two main technical problems: first, how to extract features from samples of different modalities so that they carry richer semantic information; second, how to bridge the semantic gap between modalities. Many cross-modal retrieval methods have been proposed to solve these problems. Among them, hashing methods have attracted extensive attention from industry and academia because of their fast retrieval speed and low memory cost. Cross-modal hashing methods map the high-dimensional original data into compact hash codes, and then measure the similarity between cross-modal data by computing the Hamming distance between codes with fast bit-wise XOR operations. To address the two problems above, we propose two cross-modal hashing retrieval methods:

(1) A cross-modal retrieval method based on a triplet deep hashing network. To extract effective cross-modal sample features, we integrate feature learning and hash code learning into a unified end-to-end deep neural network. The method uses triplet labels as supervised information; triplet labels can capture multiple high-level similarities more flexibly and generate different constraints. Furthermore, organizing the data into triplets enlarges the number of training samples, which alleviates the over-fitting problem. This method effectively improves the accuracy of cross-modal retrieval.

(2) A cross-modal retrieval method based on a graph regularized triplet deep hashing network. Building on the first method, we use the triplet labels to establish several loss functions: an inter-modal triplet loss function, an intra-modal triplet loss function, and a graph regularization loss function. The inter-modal triplet loss function constructs the semantic relationship between different modalities. The intra-modal triplet loss function enhances the discriminability of the hash codes. The graph regularization loss function establishes the semantic similarity between the original space and the Hamming space. This method alleviates the semantic gap between cross-modal data and effectively improves the retrieval accuracy.
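As a rough illustration of two ideas mentioned in the abstract, the Python sketch below computes the Hamming distance between hash codes with a bit-wise XOR and evaluates a generic hinge-style triplet loss in Hamming space. It is not the thesis's formulation: the function names, the 8-bit example codes, and the margin value are all hypothetical, and the loss shown is the standard margin-based triplet form applied directly to binary codes. Training a network in practice usually requires a differentiable relaxation, which this sketch omits.

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the differing bits between two hash codes stored as integers."""
    # XOR leaves a 1 in every bit position where the two codes disagree.
    return bin(a ^ b).count("1")


def triplet_hinge_loss(anchor: int, positive: int, negative: int,
                       margin: int = 2) -> int:
    """Generic margin-based triplet loss on binary codes (illustrative only).

    The loss is zero once the negative is at least `margin` bits farther
    from the anchor than the positive is.
    """
    d_pos = hamming_distance(anchor, positive)
    d_neg = hamming_distance(anchor, negative)
    return max(0, margin + d_pos - d_neg)


# Hypothetical 8-bit codes: an image query, a semantically similar text,
# and a dissimilar text.
image_query = 0b10110010
text_similar = 0b10110110
text_dissimilar = 0b01001101

print(hamming_distance(image_query, text_similar))     # 1
print(hamming_distance(image_query, text_dissimilar))  # 8
# Satisfied triplet: the similar text is much closer than the dissimilar one.
print(triplet_hinge_loss(image_query, text_similar, text_dissimilar))  # 0
# Violated triplet: swapping the roles yields a positive loss.
print(triplet_hinge_loss(image_query, text_dissimilar, text_similar))  # 9
```

In this toy triplet the anchor is an image code while the positive and negative are text codes, which mirrors the inter-modal case; drawing all three codes from one modality would correspond to the intra-modal case.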

Keywords/Search Tags:

Deep neural network, Hashing, Triplet labels, Cross-modal retrieval, Graph regularization

Related items

1	Research On Single-modal And Cross-modal Retrieval By Hashing Technology
2	Research On Key Technologies Of Deep Cross-Modal Hashing
3	Heterogeneous Graph Hashing For Cross-Modal Audio-Image Retrieval
4	Cross-modal Retrieval Research Based On Correlation Analysis And Structure Preserving
5	Research On Multi-Kernel Learning And Graph Regularization Based Cross-modal Hashing Retrieval
6	Attention-based Fusion Triplet Hashing For Cross-modal Retrieval
7	Graph Convolutional Network Hashing For Cross-Modal Retrieval
8	Research On Cross-modal Hashing Algorithms For Large-scale Multimedia Retrieval
9	Research On Cross-modal Retrieval Of Images And Texts Based On Deep Hashing Learning
10	Deep Label-based Hashing For Cross-modal Retrieval