Research On Zero-shot Image Retrieval Optimization Based On Deep Metric Learning

Posted on:2022-04-14

Degree:Master

Type:Thesis

Country:China

Candidate:W H Li

GTID:2518306323462374

Subject:Computer software and theory

Abstract/Summary:

Zero-shot image retrieval is an important research problem in computer vision. In this task, a network is trained on a very limited data set and must then compare features from categories it has never seen. Mainstream zero-shot image retrieval models are mostly based on deep metric learning (DML), which aims to learn an embedding space in which semantically similar samples lie close together and semantically dissimilar samples lie far apart. This thesis analyzes the shortcomings of existing DML-based zero-shot image retrieval methods from two aspects. First, training a zero-shot image retrieval model can be regarded as a dictionary lookup process, and traditional methods can hardly guarantee a large dictionary capacity and dictionary consistency at the same time. Second, in traditional proxy-based deep metric learning, many proxies lack similar samples with which to form positive pairs during training, so the proxies are updated in unreliable directions and the network falls into a suboptimal solution. To address these two shortcomings, the thesis proposes two optimization schemes:

1. By introducing MoCo (Momentum Contrast), a technique from unsupervised learning, the thesis decouples the query encoder from the key encoder, expands the dictionary capacity, and uses a momentum-updated network as the key encoder to keep the dictionary consistent. Experiments show that the MoCo-based model achieves excellent performance on multiple public zero-shot retrieval data sets. Further comparison experiments show that the model greatly surpasses the same method trained with contrastive loss, increasing Recall@1 by an average of 15%.

2. To alleviate the unreliable proxy updates, the thesis proposes a proxy gradient consistency model. It removes the harmful part of the proxy gradient and ensures that the gradient is effective during updates. The proxy gradient consistency model can significantly improve the performance of deep metric learning methods. The proposed Proxy-Consistency loss function is compared with multiple baselines on multiple zero-shot image retrieval data sets: Top-1 accuracy improves by 1%-1.5% on the CUB and SOP data sets, and very competitive results are achieved on the Cars data set.
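The two MoCo ingredients named in the abstract, a slowly updated key encoder and a large dictionary of keys, can be illustrated with a minimal sketch. This is not the thesis's implementation: the names `momentum_update` and `KeyQueue`, the momentum value, and the toy parameters are all assumptions chosen for illustration, and encoder weights are reduced to a flat list of floats.

```python
from collections import deque


def momentum_update(query_params, key_params, m=0.999):
    """Move the key-encoder parameters a small step toward the query encoder.

    Hypothetical helper: parameters are flat lists of floats here. With m
    close to 1 the key encoder changes slowly, which keeps keys produced at
    different training steps comparable (dictionary consistency).
    """
    return [m * k + (1.0 - m) * q for q, k in zip(query_params, key_params)]


class KeyQueue:
    """Fixed-size FIFO dictionary of key embeddings (hypothetical name).

    Because the queue is decoupled from the mini-batch, its capacity can be
    much larger than the batch size (dictionary capacity).
    """

    def __init__(self, capacity):
        # deque with maxlen drops the oldest entries automatically.
        self.items = deque(maxlen=capacity)

    def enqueue(self, keys):
        self.items.extend(keys)

    def __len__(self):
        return len(self.items)


# Toy usage: illustrative numbers only.
query_params = [1.0, 2.0]
key_params = [0.0, 0.0]
key_params = momentum_update(query_params, key_params, m=0.9)

queue = KeyQueue(capacity=4)
queue.enqueue([[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]])  # first batch of keys
queue.enqueue([[0.7, 0.8], [0.9, 1.0]])              # oldest key is evicted
```

In this sketch only the query encoder would receive gradients; the key encoder is updated solely through `momentum_update`, and the queue supplies the negatives for the contrastive comparison.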

Keywords/Search Tags:

computer vision, zero-shot image retrieval, deep metric learning, representation learning

Related items

1	Research Of Few-shot Learning Algorithms Based On Metric Learning
2	Few-shot Image Classification Based On Deep Metric Learning
3	Metric Learning And Indexing For Large-Scale Image Retrieval
4	Research And Application Of Few-shot Image Classification Based On Metric Learning
5	Image Retrieval And Zero-shot Object Detection Based On Deep Metric Learning
6	Research On Blurry Image Matching For Imaging Guidance
7	Research On Few-shot Learning Method Based On Deep Feature Metric
8	Deep Metric Learning For Zero-Shot Image Classification
9	Research Of Few-Shot Image Classification Based On Deep Representation Learning
10	Research And Application Of Image Representation Based On Large-scale Datasets