Deep Metric Learning For Zero-Shot Image Classification

Posted on: 2020-04-20
Degree: Master
Type: Thesis
Country: China
Candidate: H Wang
Full Text: PDF
GTID: 2518306518967179
Subject: Master of Engineering

Abstract/Summary:
With the success of deep learning in computer vision, deep models are now widely used for image classification. However, these classifiers rely on large amounts of labeled training data: samples of some classes are difficult to obtain, and labeling them consumes considerable resources. Collecting enough samples for every class to be recognized, and annotating them sufficiently, therefore becomes a bottleneck for deep learning based classification. To address this problem, researchers proposed zero-shot learning (ZSL), which completes training with semantic information about the classes, such as word2vec embeddings of the label names or attribute annotations. Because semantic information is shared between seen and unseen classes, zero-shot learning builds a model that aligns the distributions of semantic and visual information and thereby transfers knowledge from the seen classes to the unseen ones. How to measure the similarity between visual features and semantic features during this alignment is therefore an important research problem in zero-shot learning. This dissertation studies zero-shot learning based on deep metric learning in depth and proposes two specific deep metric learning methods for the zero-shot learning problem.

The dissertation first summarizes the historical development and research results of existing zero-shot learning algorithms. It discusses the characteristics and difficulties of zero-shot learning, reviews classical and advanced zero-shot learning methods, and describes in detail the features of the different modalities (visual, text, and attribute features) and how each is obtained. In particular, it examines existing metric learning methods for zero-shot learning and analyzes their advantages and disadvantages.

The dissertation then rethinks the zero-shot learning problem from the perspective of deep metric learning and explores a unified learning model, which first embeds visual features and semantic features into a common space (the visual space or a shared space). A deep metric network reconstructs a more discriminative metric space; by focusing on the negative samples of the different modalities, it aligns the distributions of the two modalities, and the final classification is carried out in that space. Specifically, two deep metric learning methods for zero-shot learning are proposed, built around the characteristics of the deep metric network and its loss function.

The first method uses a dual triplet network to mine the characteristics of the data distribution. It uses an attribute-oriented triplet network to train on negative attribute samples and a visually-oriented triplet network to train on negative visual samples, and it introduces an efficient hard triplet mining strategy to improve training efficiency and performance.

The second method uses a cross-modal N-pairs network for efficient information interaction between the two modalities. The network consists of two parts: an N-pairs network for the attribute modality, trained with several negative attribute samples, and an N-pairs network for the visual modality, trained with several negative visual samples. On this basis, a multi-class N-pairs strategy with hard positive sample mining is introduced to improve the classification accuracy of the cross-modal N-pairs network and to reduce training overhead.

Finally, both algorithms are evaluated extensively on three benchmark datasets, AwA, CUB, and aPY, to verify the contribution of each component, and are compared with other algorithms to demonstrate their effectiveness and advantages.
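The abstract does not give the exact loss functions, so the following is only a minimal NumPy sketch of the two ideas it names, under stated assumptions: a hinge triplet loss applied in both directions (attribute negatives and visual negatives) with batch-hardest negative mining, and a multi-class N-pairs loss applied in both directions. The function names, the squared Euclidean distance, the inner-product similarity, the `margin` value, and the mining rule are all illustrative choices, not the thesis's implementation. The hard positive mining mentioned for the N-pairs method is not shown. Inputs are assumed to be already embedded in the common space.

```python
import numpy as np


def pairwise_sq_dists(a, b):
    """Squared Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sum(diff ** 2, axis=-1)


def dual_triplet_loss(visual, class_emb, labels, margin=1.0):
    """Two-direction hinge triplet loss with hardest-negative mining (assumed form).

    visual:    (n, d) image features in the common space
    class_emb: (c, d) class attribute embeddings in the same space
    labels:    (n,)   integer class index of each image
    """
    n = visual.shape[0]
    idx = np.arange(n)
    d = pairwise_sq_dists(visual, class_emb)      # d[i, k]: image i to class k
    pos = d[idx, labels]                          # image i to its own class

    # Attribute-oriented: anchor is the image, negatives are other classes' attributes.
    attr_mask = np.ones_like(d, dtype=bool)
    attr_mask[idx, labels] = False
    hardest_attr = np.where(attr_mask, d, np.inf).min(axis=1)
    loss_attr = np.maximum(0.0, margin + pos - hardest_attr)

    # Visual-oriented: anchor is the class embedding of image i,
    # negatives are images that belong to other classes.
    d_cv = d[:, labels].T                         # d_cv[i, j]: class of i to image j
    vis_mask = labels[:, None] != labels[None, :]
    hardest_vis = np.where(vis_mask, d_cv, np.inf).min(axis=1)
    loss_vis = np.maximum(0.0, margin + pos - hardest_vis)

    return float(np.mean(loss_attr + loss_vis))


def cross_modal_npairs_loss(visual, class_emb, labels):
    """Two-direction multi-class N-pairs loss on inner-product similarity (assumed form)."""
    n = visual.shape[0]
    idx = np.arange(n)
    s = visual @ class_emb.T                      # s[i, k]: image i vs class k

    # Attribute negatives: -log softmax of the true class over all classes,
    # which equals log(1 + sum_k exp(s_neg_k - s_pos)).
    z = s - s.max(axis=1, keepdims=True)
    loss_attr = -(z[idx, labels] - np.log(np.exp(z).sum(axis=1)))

    # Visual negatives: anchor is the class embedding of image i; candidates are
    # image i itself (the positive) plus all images of other classes.
    s_cv = s[:, labels].T                         # s_cv[i, j]: class of i vs image j
    keep = (labels[:, None] != labels[None, :]) | np.eye(n, dtype=bool)
    z = np.where(keep, s_cv, -np.inf)
    z = z - z.max(axis=1, keepdims=True)
    loss_vis = -(z[idx, idx] - np.log(np.exp(z).sum(axis=1)))

    return float(np.mean(loss_attr + loss_vis))
```

In this sketch, an image that sits exactly on its own class embedding and far from the others incurs zero triplet loss, while a mismatched embedding is penalized in both directions. At test time, a zero-shot classifier of this kind would assign an image to the unseen class whose embedding is nearest in the same space.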
Keywords/Search Tags: Zero-Shot Learning, Deep Metric Learning, Image Recognition, Triplet Loss, N-pairs Loss