
Research On Cross-Modal Retrieval Algorithm For Similarity Preservation In Deep Adversarial Learning

Posted on: 2024-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: G K Li
Full Text: PDF
GTID: 2568307136975729
Subject: Computer technology
Abstract/Summary:
With the development of the Internet, data in many forms has grown explosively, and retrieval across multiple data types has become a pressing requirement. Cross-modal retrieval aims to explore the semantic associations between different modalities and to establish semantic connections between data that differ in modality but share the same semantic information, so that data of one modality can be used to retrieve data of the other. This thesis takes image-text cross-modal retrieval as an example and studies the problem in depth. The main research contents are as follows:

(1) Triplet Cross-Modal Retrieval Algorithm based on Deep Adversarial Learning (ATCMR). Most existing cross-modal retrieval algorithms focus only on the similarity relationships between samples of different modalities. When nearest-neighbor retrieval is affected by noise, wrong samples that lie too close within the same modality are easily retrieved, which reduces cross-modal retrieval accuracy. In addition, when the data distribution is modeled, the feature distributions of the different modalities are not aligned. This thesis proposes a triplet cross-modal retrieval algorithm based on deep adversarial learning and designs a triplet similarity preservation function that maintains both inter-modal and intra-modal similarity relations in the common space and in each modality's own space, making the algorithm robust to noise. A cross-modal generative adversarial network is also designed to learn the features of the different modalities during training, ensuring that the feature distributions of samples from different modalities are aligned. Comparative cross-modal retrieval experiments were carried out on two benchmark datasets, Pascal Sentence and Wikipedia. The results show that ATCMR achieves better cross-modal retrieval performance in these comparisons.

(2) Adversarial Triplet Cross-Modal Retrieval Algorithm based on Multi-Layer Self-Attention Mechanism (MACMR). Most existing supervised cross-modal retrieval algorithms rely too heavily on the label information of the original data to maintain semantic consistency and are not robust to mislabeled data. They also take only the top-layer output of the neural network as the feature representation of a sample and pay no attention to the details captured in the intermediate layers. This thesis proposes an adversarial triplet cross-modal retrieval algorithm based on a multi-layer self-attention mechanism. Constructed features and original features are trained against each other to minimize the difference between the semantic information of the generated features and the original semantic information, which reduces the algorithm's dependence on the original label information of the samples and improves its robustness to mislabeled data. Comparative cross-modal retrieval experiments were carried out on the same two benchmark datasets, Pascal Sentence and Wikipedia, and the results show that MACMR achieves better cross-modal retrieval performance in these comparisons.

(3) To analyze the two proposed algorithms comprehensively, experiments are designed around retrieval accuracy, training time, noise resistance, and training set size. Retrieval accuracy directly shows how well an algorithm completes the cross-modal retrieval task; training time reflects its consumption of training resources; noise resistance reflects its robustness; and the training set size experiments show whether it can maintain good retrieval performance on datasets of different sizes. Finally, the experimental results are summarized, and the advantages and disadvantages of the two algorithms in different application scenarios are analyzed.
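
The abstract does not give the form of the triplet similarity preservation function in (1). As a rough illustration only, the sketch below combines a generic margin-based triplet loss over an inter-modal triplet (image anchor against texts) with one over an intra-modal triplet (image anchor against images). The function names, the cosine distance, the margin value, and the `intra_weight` parameter are all assumptions made for this example and are not taken from the thesis.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss: zero once the positive is closer than the negative by `margin`."""
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(0.0, gap + margin)


def similarity_preserving_loss(img_anchor, txt_pos, txt_neg, img_pos, img_neg,
                               margin=0.2, intra_weight=1.0):
    """Hypothetical combination of inter-modal and intra-modal triplet terms.

    All vectors are assumed to be embeddings in a shared common space.
    """
    # Inter-modal term: a matching text should be closer to the image than a non-matching text.
    inter = triplet_loss(img_anchor, txt_pos, txt_neg, margin)
    # Intra-modal term: a same-class image should be closer than a different-class image.
    intra = triplet_loss(img_anchor, img_pos, img_neg, margin)
    return inter + intra_weight * intra
```

With well-separated embeddings both terms vanish, while swapping the matching and non-matching text makes the inter-modal term positive. A full system would minimize such a loss over mini-batches alongside the adversarial objective, which this sketch leaves out.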
Keywords/Search Tags: Cross-modal retrieval, Deep learning, Generative adversarial network, Self-attention mechanism