
Research On Cross-modal Hashing Algorithms For Large-scale Multimedia Retrieval

Posted on: 2021-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Shen
GTID: 2428330614450002
Subject: Computer Science and Technology
Abstract:
Cross-modal retrieval aims to provide flexible retrieval across different types of multimedia data (such as images, text, or video). Compared to traditional uni-modal retrieval tasks such as image-to-image retrieval, cross-modal retrieval enables a more adaptable retrieval experience, such as using a video to retrieve its detailed textual explanation. Cross-modal retrieval is a challenging problem because data from different modalities typically have different statistical properties and cannot be compared directly, a problem usually referred to as the heterogeneity gap. To address it, most methods project data from different modalities into a common space. With the emergence of big data, existing cross-modal retrieval methods also suffer from high computation and storage costs. Cross-modal hashing (CMH) was proposed to confront this scalability issue: it integrates hashing techniques to learn compact hash codes for the different modalities. In this work, we aim to propose more efficient cross-modal hashing methods.

First, we propose a new method called Semi-supervised Graph Convolutional Hashing Network (SGCH). Most traditional cross-modal hashing methods are supervised; they perform better but require tedious human effort to label the training data. In contrast, semi-supervised CMH methods, which leverage both labeled and unlabeled data, are more practical in real applications. We first model each modality as a graph and use graph convolution to preserve high-order intra-modal similarity and to propagate semantic information from labeled samples to unlabeled ones. We then use a siamese network to project the learned graph representations into compact hash codes. To further bridge the inter-modality gap, an adversarial loss, which learns modality-independent features by confusing a modality classifier, is incorporated into the overall loss function. Extensive results on the large-scale multimedia datasets NUS-WIDE-10K and Wiki demonstrate the effectiveness of SGCH.

Second, since the graph structure is of vital importance to the final retrieval performance, we further propose a method called Adaptive Semi-supervised Graph Convolutional Hashing (ASGCH). ASGCH uses the GraphSAGE algorithm to learn graph representations for the different modalities, which scales better to large graphs. Meanwhile, ASGCH trains a semantic classifier to predict labels for unlabeled data and adds the most confident predictions to the labeled set. It then uses the predicted labels to reconstruct the graph structures of the different modalities, repeating this process recursively. Experiments on three real-world datasets (MIRFLICKR-25K, NUS-WIDE-10K, and Wiki) show that ASGCH outperforms state-of-the-art methods.
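To make the SGCH ingredients described above concrete, the following is a minimal PyTorch sketch of graph convolution over a modality graph, a tanh-relaxed hashing layer, and a modality classifier used as the adversary. It is illustrative only: the module names, layer sizes, and two-layer depth are assumptions, not the thesis's actual implementation.

# Illustrative sketch only -- not the authors' code. Assumes PyTorch;
# module names, dimensions, and depth are assumptions for demonstration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    """One graph convolution step: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, a_hat, h):
        # a_hat: normalized adjacency (n x n); h: node features (n x in_dim)
        return F.relu(a_hat @ self.linear(h))

class ModalityEncoder(nn.Module):
    """Two GCN layers followed by a tanh-relaxed hashing layer;
    taking sign() of the output at retrieval time gives binary codes."""
    def __init__(self, in_dim, hid_dim, code_len):
        super().__init__()
        self.gc1 = GCNLayer(in_dim, hid_dim)
        self.gc2 = GCNLayer(hid_dim, hid_dim)
        self.hash = nn.Linear(hid_dim, code_len)

    def forward(self, a_hat, x):
        h = self.gc2(a_hat, self.gc1(a_hat, x))
        return torch.tanh(self.hash(h))  # continuous relaxation in [-1, 1]

class ModalityClassifier(nn.Module):
    """Adversary that tries to tell image codes from text codes; the
    encoders are trained to fool it, yielding modality-independent codes."""
    def __init__(self, code_len):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(code_len, 64), nn.ReLU(),
                                 nn.Linear(64, 2))

    def forward(self, codes):
        return self.net(codes)  # logits over {image, text}

In such a setup, one encoder per modality trained with a shared hashing objective plays the role of the siamese network, and the adversarial term alternates between training the classifier to separate the modalities and training the encoders to confuse it.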
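Similarly, here is a small NumPy sketch of an ASGCH-style self-training round: a semantic classifier pseudo-labels the unlabeled pool, only the most confident predictions are promoted to the labeled set, and the graph is rebuilt from the enlarged set. The confidence threshold, the single-label assumption, and the kNN graph rule are illustrative choices, not the thesis's exact procedure.

# Illustrative sketch only -- not the authors' code. Assumes single-label
# data and a classifier exposing fit/predict_proba (scikit-learn style).
import numpy as np

def knn_graph(features, k=10):
    """Rebuild a symmetric kNN adjacency matrix from current features."""
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, 1:k + 1]   # k nearest, skipping self
    adj = np.zeros_like(d)
    adj[np.arange(len(features))[:, None], idx] = 1.0
    return np.maximum(adj, adj.T)             # symmetrize

def self_training_round(clf, x_lab, y_lab, x_unlab, threshold=0.95):
    """One round: predict labels for unlabeled data, keep the most
    confident predictions, and reconstruct the graph structure."""
    clf.fit(x_lab, y_lab)
    proba = clf.predict_proba(x_unlab)
    keep = proba.max(axis=1) >= threshold     # most confident predictions
    x_new = np.vstack([x_lab, x_unlab[keep]])
    # map probability columns back to class labels (scikit-learn convention)
    y_new = np.concatenate([y_lab, clf.classes_[proba[keep].argmax(axis=1)]])
    adj = knn_graph(x_new)                    # graph rebuilt from new labels
    return x_new, y_new, x_unlab[~keep], adj

Any probabilistic classifier with fit/predict_proba, such as scikit-learn's LogisticRegression, can stand in for the semantic classifier; repeating the round until no prediction clears the threshold mirrors the recursive graph reconstruction described in the abstract.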
Keywords: Cross-Modal Retrieval, Hashing Learning, Graph Convolutional Network, Semi-supervised Learning