Study Of Zero-shot Image Classification Based On Robust Metric Learning

Posted on: 2020-12-11
Degree: Master
Type: Thesis
Country: China
Candidate: S J Wang
Full Text: PDF
GTID: 2428330602952090
Subject: Communication and Information System
Abstract/Summary:
With the development of Internet technology, a large amount of data is generated every day, and labeling is indispensable for making good use of this data in classification. However, manually labeling all of the data is impractical because it is time-consuming and laborious. Moreover, we want a learned model to generalize not only to new samples but also to new categories, which requires that the model classify unknown categories without being retrained whenever a new category is encountered. Therefore, how to train a model that can still identify unknown categories, that is, zero-shot learning, has attracted the attention of a large number of experts and scholars. Semantic embedding models and common space embedding models are two classic research directions in the field of zero-shot learning. The former aim to learn the mapping between the visual feature space and the semantic feature space. The latter aim to learn mappings from the visual feature space to a common space and from the semantic feature space to that common space. However, both kinds of methods generally ignore the potential low-rank structure in the data, and their measurement methods are not robust enough. This thesis studies this issue in depth, as follows:

1. Semantic Autoencoder (SAE) is an effective zero-shot learning method and belongs to the semantic embedding models. The model introduces semantic information into a linear autoencoder and solves the problem of zero-shot image classification by minimizing the reconstruction error. However, the model ignores the potential low-rank structure in the data, so the learned projection cannot capture the discriminative information of the samples. In addition, SAE measures the reconstruction error with the squared F-norm. On the one hand, this norm is too sensitive to noise and outliers, so its robustness is poor. On the other hand, this measure ignores the sparse characteristics of the reconstruction error. Based on these considerations, we add a low-rank constraint based on the nuclear norm to SAE, which learns a low-rank mapping and thereby discovers the low-rank structure of the data, captures discriminative features, and removes redundant information. At the same time, we use the L1-norm instead of the squared F-norm to measure the reconstruction error, which better reconstructs the original data and makes the model more robust. The resulting model is the Robust Semantic Autoencoder (RSAE). Experimental results on four commonly used data sets demonstrate that RSAE has better classification performance than other mainstream zero-shot learning methods.

2. Common space embedding models mostly treat labels as the unique identifiers of different categories, while ignoring their powerful discriminating ability. In response to this problem, we designate the common space as the label space and propose the Label Activation Framework (LAF). During the training phase, the space consists of the discrete labels of the known categories. During the testing phase, the space is defined as continuous, and the projection of the sample features into the space is the label, so the label is activated and has a discriminative meaning. Unlike Indirect Attribute Prediction (IAP), LAF defines the known and unknown categories in the same space, which is very beneficial for generalized zero-shot learning. Based on this framework, we propose a specific robust model, which learns the mapping from the visual feature space to the label space through ridge regression and learns the mapping from the semantic feature space to the label space through the Robust Label Autoencoder (RLAE). A large number of experiments show that LAF has better classification performance than many state-of-the-art methods, especially on generalized zero-shot learning tasks.
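The robustness argument in contribution 1 can be illustrated with a small numerical sketch. This is not the RSAE implementation, since the abstract gives neither the objective function nor the solver; it only compares the three norms named in the text on toy matrices. The function names and the toy data are hypothetical, and the L1-norm is assumed here to be the entrywise sum of absolute values.

```python
import numpy as np


def squared_frobenius(E):
    """Squared F-norm: sum of squared entries (the error measure used by SAE)."""
    return float(np.sum(E ** 2))


def l1_norm(E):
    """Entrywise L1-norm: sum of absolute entries (assumed form of the robust measure)."""
    return float(np.sum(np.abs(E)))


def nuclear_norm(W):
    """Nuclear norm: sum of singular values, a convex surrogate for matrix rank."""
    return float(np.sum(np.linalg.svd(W, compute_uv=False)))


# Toy reconstruction residual: every entry is a small error of 0.1.
E_clean = np.full((4, 5), 0.1)

# Same residual with a single corrupted entry acting as an outlier.
E_outlier = E_clean.copy()
E_outlier[0, 0] = 10.0

# How much one outlier inflates each error measure.
ratio_f = squared_frobenius(E_outlier) / squared_frobenius(E_clean)
ratio_l1 = l1_norm(E_outlier) / l1_norm(E_clean)

# Two 3x3 matrices with the same Frobenius norm but different rank.
low_rank = np.ones((3, 3)) / np.sqrt(3.0)  # rank 1
full_rank = np.eye(3)                      # rank 3

if __name__ == "__main__":
    print("squared F-norm inflation:", ratio_f)
    print("L1-norm inflation:       ", ratio_l1)
    print("nuclear norm, rank 1:    ", nuclear_norm(low_rank))
    print("nuclear norm, rank 3:    ", nuclear_norm(full_rank))
```

On this toy data the single outlier inflates the squared F-norm by a factor of roughly 500 but the L1-norm by a factor of roughly 6, which is the sensitivity gap the abstract refers to. The two matrices in the second comparison have equal Frobenius norm, yet the rank-1 matrix has the smaller nuclear norm, which is why penalizing the nuclear norm favors low-rank mappings.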
Keywords/Search Tags: Zero-Shot Learning, Low-rank, Autoencoder, F-norm, Nuclear-norm