Research On Cross-modal Retrieval Based On Deep Asymmetric Hashing

Posted on: 2023-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Wang
Full Text: PDF
GTID: 2558307118990859
Subject: Mathematics
Abstract/Summary:
With the explosive growth of multi-modal data such as images, texts, and videos, mining the semantic associations among modalities and matching data across them has become a central topic in cross-modal retrieval research. Because hashing is efficient for storing and searching large-scale data, and deep learning offers powerful feature extraction, deep hashing has gradually become the mainstream approach to cross-modal retrieval. Semantic information is the key to learning discriminative hash codes, yet the traditional semantic similarity matrix is not sufficient to mine the semantic information of multi-label data. In addition, most deep cross-modal hashing methods learn hash codes with a symmetric strategy, which leaves the supervised information underutilized. Moreover, using relaxation strategies to handle the discrete constraint on hash codes may introduce large quantization errors, resulting in suboptimal hash codes. To address these problems, this thesis proposes two cross-modal hashing algorithms.

1. Chapter 3 proposes an algorithm called Deep Semantically Consistent for Cross-modal Hashing Retrieval (DSCH). It introduces cosine distance to reconstruct the semantic similarity measure, which mines the semantic information of multi-label data more deeply and provides finer-grained association information. The semantic consistency module preserves inter-modal and intra-modal similarity simultaneously, and aligns the features of image data and text data using relevant alignment to reduce the semantic gap between modalities. The label consistency module maps the features of the different modalities to a common semantic representation space, so that the learned hash codes are consistent with the label information of the training data.

2. Chapter 4 proposes another algorithm, named Deep Discrete Asymmetric Hashing for Cross-modal Retrieval (DDAH). It uses an asymmetric learning framework to learn the hash codes of query instances and database instances, which mines the supervision information of the data more effectively and reduces the training time of the model. A discrete optimization algorithm optimizes the hash code matrix column by column to reduce the quantization error introduced when binarizing the hash codes. To fully mine the semantic information of the data, a label layer is added to the neural network for label prediction, and semantic information embedding is used to embed different discriminative information into the hash codes through a linear mapping, making the hash codes more discriminative.
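
To illustrate why a cosine-based measure can give finer-grained association information than the traditional semantic similarity matrix, the following minimal sketch compares the two on multi-hot label vectors. The abstract does not give the exact formula used in DSCH, so the function names, the "share at least one label" definition of the traditional measure, and the toy label vectors are all assumptions made for illustration only.

```python
import math

def binary_similarity(labels_a, labels_b):
    """Assumed traditional measure: 1 if the instances share any label, else 0."""
    return 1.0 if any(a and b for a, b in zip(labels_a, labels_b)) else 0.0

def cosine_similarity(labels_a, labels_b):
    """Cosine similarity of two multi-hot label vectors (a value in [0, 1])."""
    dot = sum(a * b for a, b in zip(labels_a, labels_b))
    norm_a = math.sqrt(sum(a * a for a in labels_a))
    norm_b = math.sqrt(sum(b * b for b in labels_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an unlabeled instance is treated as dissimilar to everything
    return dot / (norm_a * norm_b)

def similarity_matrix(image_labels, text_labels, measure):
    """Pairwise image-to-text similarity matrix under the given measure."""
    return [[measure(a, b) for b in text_labels] for a in image_labels]

# Hypothetical toy data: 4 possible labels, 2 images, 3 texts.
image_labels = [[1, 1, 0, 0], [1, 0, 0, 0]]
text_labels = [[1, 1, 0, 0], [1, 0, 1, 1], [0, 0, 0, 1]]

binary_s = similarity_matrix(image_labels, text_labels, binary_similarity)
cosine_s = similarity_matrix(image_labels, text_labels, cosine_similarity)

for row_b, row_c in zip(binary_s, cosine_s):
    print(row_b, [round(v, 3) for v in row_c])
```

In this toy data, the first image is equally "similar" to the first two texts under the binary measure, while the cosine measure ranks the text with the identical label set higher than the text that shares only one label.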
Keywords/Search Tags:Cross-modal retrieval, Deep neural network, Semantic consistency, Asymmetric hashing, Discrete optimization