Research And Application Of Zero-shot Learning In Image Classification

Posted on: 2021-01-11; Degree: Master; Type: Thesis
Country: China; Candidate: C Y Wang; Full Text: PDF
GTID: 2428330620465761; Subject: Computer Science and Technology
Abstract/Summary:
With the development of machine learning, computer vision and speech recognition have achieved remarkable results. However, as living standards improve, people are no longer content with traditional supervised learning, which depends on large amounts of manually labeled training data. Machines are expected to be more intelligent and to handle more realistic and varied problems. In practice, a large number of new categories appear every day, and keeping classification performance high would require many labeled samples of each new category to retrain the model. It is well known that collecting a large number of labeled samples is difficult: for unseen classes, labeled samples may be scarce, or expensive to obtain and annotate. Traditional machine learning approaches therefore struggle to classify new categories, and it is important to develop recognition models that can classify classes with few or no labeled samples.

Zero-shot learning has been widely used in recent years. It uses prior information about seen classes to help classify unseen classes, and it has significant research value. Zero-shot learning can be regarded as a kind of transfer learning. The source domain contains a large amount of labeled data, while the data in the target domain are unlabeled. The labeled seen categories in the source domain serve as training data, and the unlabeled unseen categories in the target domain, which are not observed during training, serve as test data. The label sets of the seen classes and the unseen classes are disjoint. How to transfer the knowledge learned from seen classes to unseen classes is the central question in zero-shot learning. Zero-shot learning uses a semantic space, shared by the seen and unseen classes, as an intermediate bridge between them. Although zero-shot learning has developed quickly in recent years, some problems remain. We study the existing problems of zero-shot learning in image classification and propose two different zero-shot learning approaches to address them.

First, zero-shot learning aims to transfer knowledge learned from seen classes to unseen classes, but the seen and unseen classes differ, which leads to the domain shift problem. Moreover, directly learning a mapping function from the visual space to the semantic embedding space leads to information loss. To address information loss and domain shift, we propose a zero-shot learning approach for image classification based on subspace learning and reconstruction (Zero-Shot Learning based on Subspace learning and Reconstruction, ZSLSR). To make full use of unseen-class information and mitigate domain shift, the approach first transfers the relationships between the seen and unseen classes from the semantic embedding space into the visual space and obtains visual prototypes of the unseen classes. Then, from the visual prototypes and semantic prototypes of all categories, both seen and unseen, the model learns a latent subspace that aligns the visual and semantic spaces. The latent space contains both the discriminative information of the visual space and the inter-category relationship information of the semantic embedding space. Meanwhile, a reconstruction constraint reduces information loss in subspace learning. Finally, at zero-shot recognition time, test samples of unseen classes can be classified by nearest neighbor search in different spaces.

Second, the semantic information commonly used in zero-shot learning is manually annotated attributes. Previous work mostly assumes that all attributes are equally important for zero-shot classification. In fact, different attributes have different properties and carry different amounts of information, which may have a considerable impact on zero-shot learning accuracy. In addition, the semantic space is made up of manually labeled attributes or word embeddings extracted by NLP methods, whereas the visual space is made up of features extracted by a CNN. Semantic and visual features therefore have different distributions and different dimensionalities, so directly learning a mapping function from the visual space to the semantic space does not yield a good classification model and may cause the semantic gap problem. To address both the unequal information content of attributes and the semantic gap, we propose a zero-shot learning approach based on Attribute Selection and Nonnegative Matrix Factorization. Attributes are selected through a weighting mechanism: good attributes, which contain more information and lead to better classification accuracy, receive higher weights, while bad attributes, which contain less information, receive lower weights. At the same time, the approach uses Nonnegative Matrix Factorization to find a set of latent coefficient vectors that represent both visual and semantic features. The visual feature vector and the semantic feature vector of the same class can be represented by the same or similar coefficient vectors, which makes visual and semantic features easier to compare and alleviates the semantic gap problem.
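The general idea of classifying unseen classes by nearest neighbor search in a shared semantic space can be illustrated with a minimal sketch. This is a generic baseline (a ridge-regression mapping from visual features to semantic vectors), not the ZSLSR or attribute-selection method of the thesis. The function names, the synthetic data, and every parameter value below are assumptions made purely for illustration.

```python
import numpy as np


def fit_visual_to_semantic(X, S, lam=1.0):
    """Ridge regression from visual features X (n, d) to semantic targets S (n, k).

    Returns W (d, k) minimizing ||X W - S||^2 + lam * ||W||^2.
    """
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ S)


def predict_unseen(X_test, W, unseen_prototypes):
    """Project test features into the semantic space and return, for each
    sample, the index of the most cosine-similar unseen-class prototype."""
    Z = X_test @ W
    Z = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    P = unseen_prototypes / (
        np.linalg.norm(unseen_prototypes, axis=1, keepdims=True) + 1e-12
    )
    return np.argmax(Z @ P.T, axis=1)


def demo(seed=0):
    """Synthetic experiment (hypothetical data): train on seen classes only,
    then classify samples from classes never observed during training."""
    rng = np.random.default_rng(seed)
    k, d = 5, 20                      # semantic and visual dimensions (assumed)
    n_seen, n_unseen, per_class = 8, 3, 30
    A = rng.normal(size=(k, d))       # toy linear link between the two spaces
    seen_protos = rng.normal(size=(n_seen, k))
    unseen_protos = rng.normal(size=(n_unseen, k))

    def sample(protos):
        labels = np.repeat(np.arange(len(protos)), per_class)
        noise = 0.05 * rng.normal(size=(len(labels), d))
        return protos[labels] @ A + noise, labels

    X_train, y_train = sample(seen_protos)    # seen classes: labeled
    X_test, y_test = sample(unseen_protos)    # unseen classes: test only
    W = fit_visual_to_semantic(X_train, seen_protos[y_train])
    preds = predict_unseen(X_test, W, unseen_protos)
    return float(np.mean(preds == y_test))


print(f"unseen-class accuracy: {demo():.2f}")
```

The seen and unseen label sets are disjoint here, as in the setting the abstract describes. The only link between them is the shared semantic space in which the class prototypes live.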
Keywords/Search Tags: zero-shot learning, image classification, transfer learning, semantic representation