
Research On Zero-shot Image Retrieval Optimization Based On Deep Metric Learning

Posted on: 2022-04-14 | Degree: Master | Type: Thesis
Country: China | Candidate: W H Li
GTID: 2518306323462374 | Subject: Computer software and theory
Abstract/Summary:
Zero-shot image retrieval is an important research problem in computer vision. In this task, the model is trained on a very limited data set and must then compare features of categories it has never seen before. Mainstream zero-shot image retrieval models are mostly based on deep metric learning (DML), which aims to learn an embedding space in which semantically similar samples lie close together and semantically dissimilar samples lie far apart.

This thesis analyzes two shortcomings of existing DML-based zero-shot image retrieval methods. First, training a zero-shot image retrieval model can be regarded as a dictionary look-up process, and traditional methods can hardly guarantee a large dictionary and a consistent dictionary at the same time. Second, in traditional proxy-based deep metric learning, many proxies lack similar samples with which to form positive pairs during training, so these proxies are updated in unreliable directions and the network falls into a suboptimal solution.

To address these two shortcomings, the thesis proposes two optimization schemes:

1. By introducing MoCo (Momentum Contrast) from unsupervised learning, the model decouples the query encoder from the key encoder, which expands the dictionary capacity, and uses a momentum-updated network as the key encoder to keep the dictionary consistent. Experiments show that the MoCo-based model achieves excellent performance on multiple public zero-shot retrieval data sets. Further comparison experiments show that it greatly surpasses the same method trained with contrastive loss, increasing Recall@1 by an average of 15%.

2. To alleviate the second shortcoming, the thesis proposes a proxy gradient consistency model, which removes the harmful part of the proxy gradient and ensures that the gradient is effective during updates. The proxy gradient consistency model significantly improves the performance of deep metric learning methods. The proposed Proxy-Consistency loss function is compared with multiple baselines on multiple zero-shot image retrieval data sets: Top-1 accuracy improves by 1%-1.5% on the CUB and SOP data sets, and the results on the Cars data set are very competitive.
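The MoCo mechanism and the Recall@1 metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function and class names (`momentum_update`, `KeyQueue`, `recall_at_1`), the momentum value, the flat weight lists standing in for encoder parameters, and the use of dot-product similarity are all assumptions made for illustration. The sketch shows how a key encoder can follow the query encoder slowly, how a fixed-size queue lets the dictionary hold more keys than one batch, and how Recall@1 can be computed from embeddings and labels.

```python
from collections import deque


def momentum_update(key_params, query_params, m=0.999):
    """Move key-encoder weights slowly toward the query-encoder weights.

    Both arguments are flat lists of floats (a stand-in for real network
    parameters, assumed here for illustration). A momentum m close to 1
    keeps the keys in the dictionary consistent across training steps.
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


class KeyQueue:
    """Fixed-size FIFO dictionary of key embeddings (hypothetical helper).

    The capacity is independent of the batch size, so the dictionary can
    be much larger than a single batch; the oldest keys are dropped first.
    """

    def __init__(self, capacity):
        self.keys = deque(maxlen=capacity)

    def enqueue(self, batch_keys):
        self.keys.extend(batch_keys)

    def __len__(self):
        return len(self.keys)


def dot(a, b):
    """Dot-product similarity between two embedding vectors."""
    return sum(x * y for x, y in zip(a, b))


def recall_at_1(embeddings, labels):
    """Fraction of queries whose nearest other sample shares their label.

    Similarity is the dot product, which assumes the embeddings are
    comparable in norm (for example, L2-normalized).
    """
    hits = 0
    for i, query in enumerate(embeddings):
        best_j = max(
            (j for j in range(len(embeddings)) if j != i),
            key=lambda j: dot(query, embeddings[j]),
        )
        hits += labels[best_j] == labels[i]
    return hits / len(embeddings)
```

In a training loop, `momentum_update` would be called after each optimizer step on the query encoder, and the keys produced for the current batch would be passed to `KeyQueue.enqueue`; `recall_at_1` would be evaluated on embeddings of the held-out, unseen classes.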
Keywords/Search Tags: computer vision, zero-shot image retrieval, deep metric learning, representation learning