
Zero-shot Learning Methods For Image Recognition

Posted on: 2020-10-30
Degree: Master
Type: Thesis
Country: China
Candidate: Y Chen
GTID: 2428330620460022
Subject: Information and Communication Engineering
Abstract/Summary:
Recent significant achievements in classification, detection, segmentation, and other areas of computer vision are mostly based on supervised learning, which requires a large number of labeled samples for each class. As the scope of research expands, however, collecting large amounts of labeled data for every new class costs considerable time and effort. How to make use of the vast number of unlabeled classes, given only a few labeled classes, is therefore an important and challenging problem. Zero-shot learning was proposed by analogy with the human ability to recognize unseen objects from high-level descriptions alone. Given a set of labeled data with semantic descriptions, zero-shot learning aims to transfer information from seen classes to unseen classes and to recognize objects from the unseen classes. This thesis addresses the problem of zero-shot learning.

Existing zero-shot learning methods usually embed images into a semantic embedding space composed of attributes, and transfer information between seen and unseen classes by learning a projection function across the different sources. Although this approach has made some progress, problems remain. First, projection-function-based methods usually ignore the association information between seen classes in the embedding space, and they tend to suffer from domain shift when the distributions of the training and test data are inconsistent. Second, existing zero-shot learning algorithms are all designed for traditional zero-shot learning, which requires that all test samples come from unseen classes. Under generalized zero-shot learning, by contrast, the source of the test data is unconstrained: test data may come from either seen or unseen classes. In this setting, the ability of existing algorithms to recognize unseen classes degrades significantly. This thesis studies both problems, specifically:

(1) Observing that existing methods ignore class correlations in the embedding space and are prone to domain shift, this thesis proposes a new zero-shot method that explores and exploits more of the structural relations in the semantic embedding space. The method unifies the structure of the embedded attribute space with constrained classification reasoning. It assumes that semantic representations of similar classes are projected to neighboring locations in the embedding space, an assumption that helps to predict classifiers for unseen classes. We therefore construct a semantic embedding space by extracting attributes from the input image; the structurally constrained correlations in this space are then mined and transferred to constrain the classifiers trained on seen classes; finally, classifiers for the unseen classes are synthesized. The method retains the global structure of the semantic space while mining local relations, strengthens the influence of neighborhood embeddings, and obtains a more effective semantic representation. Experimental results on benchmark datasets demonstrate the effectiveness of the proposed model and show that it can outperform state-of-the-art methods.

(2) To address the poor performance of existing zero-shot methods in the more practical generalized zero-shot scenario, this thesis proposes to combine posterior probability estimation and a decision threshold with existing traditional zero-shot learning methods, solving generalized zero-shot recognition by estimating the source of the test data. By modeling the data distribution of the seen classes, a test sample can be identified as coming from a seen or an unseen class. We assume that the classifier outputs for the training data are all lower-bounded, which helps to estimate the class-inclusion probability of test data that come from seen classes. We formulate the generalized zero-shot problem as that of modeling the decision boundary of the training-data outputs, so as to estimate an unnormalized posterior probability for each test sample. Once the source of a test sample (seen or unseen classes) has been determined, existing zero-shot learning classifiers can be applied to assign it to a specific class. The method helps to distinguish the positive classes from the known negative classes, and it adjusts the decision boundary so that test data from unknown classes are less often misclassified as known classes. Experimental results demonstrate the effectiveness of the method and show that it improves the performance of existing zero-shot learning methods in generalized zero-shot learning scenarios.
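
The seen/unseen routing idea in contribution (2) can be illustrated with a minimal sketch. This is not the thesis's implementation: the abstract does not give the exact model, and the keywords suggest an extreme-value-theory fit, which is replaced here by a plain empirical quantile of the training scores. The function names `fit_seen_threshold` and `predict_gzsl`, the `quantile` parameter, and the score matrices are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_seen_threshold(train_seen_scores, quantile=0.05):
    """Estimate a lower bound on seen-class confidence from training data.

    train_seen_scores: array (n_train, n_seen) of seen-class classifier outputs
    for training samples, all of which come from seen classes.
    ASSUMPTION: an empirical quantile stands in for the boundary model
    described in the abstract.
    """
    max_scores = np.asarray(train_seen_scores).max(axis=1)
    return float(np.quantile(max_scores, quantile))


def predict_gzsl(seen_scores, unseen_scores, threshold):
    """Route each test sample to the seen or the unseen classifier.

    seen_scores:   array (n_test, n_seen), outputs of the seen-class classifier.
    unseen_scores: array (n_test, n_unseen), outputs of any existing
                   zero-shot classifier over the unseen classes.
    Returns (is_seen, labels); each label indexes into the class set
    selected by the matching is_seen entry.
    """
    seen_scores = np.asarray(seen_scores)
    unseen_scores = np.asarray(unseen_scores)
    # A sample whose best seen-class score falls below the bound is
    # treated as coming from an unseen class.
    is_seen = seen_scores.max(axis=1) >= threshold
    labels = np.where(
        is_seen,
        seen_scores.argmax(axis=1),
        unseen_scores.argmax(axis=1),
    )
    return is_seen, labels


if __name__ == "__main__":
    # Toy scores, invented purely for demonstration.
    train = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])
    thr = fit_seen_threshold(train, quantile=0.0)
    test_seen = np.array([[0.95, 0.05], [0.40, 0.45]])
    test_unseen = np.array([[0.1, 0.2, 0.7], [0.6, 0.3, 0.1]])
    print(thr, *predict_gzsl(test_seen, test_unseen, thr))
```

Separating the gate from the classifiers mirrors the abstract's claim that the approach can be added to existing zero-shot methods: only the threshold is fitted on seen-class training outputs, and the unseen-class classifier is used unchanged.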
Keywords/Search Tags: Object recognition, zero-shot learning, manifold learning, extreme value theory