Owing to its low storage cost and high retrieval efficiency, hashing has been widely applied to large-scale multimedia retrieval. With the explosive growth of multimodal data, multimodal hashing has recently received increasing attention. Multimodal data describes the same object through different modalities and can therefore provide richer information than unimodal data. However, learning powerful feature representations from heterogeneous multimodal data and performing hash retrieval on them still deserve further research. In this thesis, we investigate multimodal hashing for multi-source and cross-modal retrieval, and implement a cross-modal retrieval prototype system that provides training, retrieval, and performance-analysis functions. The main contributions of this thesis are as follows:

1. As an important branch of hashing methods, multi-source hashing incorporates features from multiple modalities for hash learning. However, most existing multi-source hashing methods are based on shallow models, which cannot fully capture the internal associations between heterogeneous data. Meanwhile, the semantic structure similarity between data points, which is very useful for generating semantics-preserving hash codes, is often overlooked. In this thesis, the proposed MGCH method exploits graph convolutional networks (GCNs) to explore the inherent structural similarity between data points and adopts an asymmetric training strategy to improve training efficiency.

2. Deep-network-based cross-modal retrieval has recently made significant progress. However, bridging the modality gap to further improve retrieval accuracy remains a crucial bottleneck. Moreover, existing cross-modal hashing methods generally do not fully exploit label dependencies in multi-label scenarios. To make full use of this information, we propose a multi-label graph convolutional cross-modal hashing (GCCMH) approach. The method focuses on the cross-modal retrieval task of multimodal hashing and mines label relevance in multi-label scenarios. Globally relevant label embeddings are first learned from the original label features through the mapping function of a GCN. The generated label embeddings are then fed into a semantic encoder to produce semantic codes that guide the feature-encoding process; semantic similarity is preserved during feature learning so that the final hash codes are more discriminative and exhibit a smaller modality heterogeneity gap.

3. A cross-modal retrieval prototype system is designed and implemented. The system encapsulates the hashing methods proposed in this thesis and provides complete functions for parameter setting, result viewing, and performance analysis. We also develop an iOS client application for interacting with the system, which lowers the barrier to use for end users and improves the user experience.
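The GCN-based hash learning referred to above can be sketched in miniature. The following toy example (pure Python, with hypothetical names; it is not the thesis implementation) shows the two generic building blocks such methods rely on: one GCN propagation step, H' = ReLU(ÂHW) with a normalized adjacency matrix Â, followed by sign() binarization of the continuous embeddings into ±1 hash codes. The concrete adjacency, features, and weights are made-up illustrative values.

```python
# Hypothetical sketch of GCN propagation + sign binarization
# (not the MGCH/GCCMH code; all matrices below are illustrative).

def matmul(a, b):
    # Plain list-of-lists matrix multiplication.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(a_hat, h, w):
    # One GCN propagation step: H' = ReLU(A_hat @ H @ W),
    # where A_hat is the normalized adjacency (with self-loops).
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]

def binarize(h):
    # sign() step: map continuous embeddings to {-1, +1} hash bits.
    return [[1 if v > 0 else -1 for v in row] for row in h]

# Toy graph: 3 nodes, 2-dim features, 2-bit hash codes.
a_hat = [[0.5, 0.5, 0.0],
         [0.5, 0.34, 0.16],
         [0.0, 0.5, 0.5]]   # row-normalized adjacency with self-loops
h = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]            # node features
w = [[1.0, -1.0],
     [-1.0, 1.0]]           # learnable weights (fixed here for illustration)

codes = binarize(gcn_layer(a_hat, h, w))
```

In a real system the GCN weights would be trained so that semantically similar data points receive similar codes before binarization; this sketch only illustrates the forward pass that turns graph-structured features into binary hash codes.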