Multimodal Cycle-consistent Zero-Shot Learning Based On Unbiased Embedding

Posted on: 2021-05-18
Degree: Master
Type: Thesis
Country: China
Candidate: R Han
GTID: 2428330620968758
Subject: Computer Science and Technology
Abstract/Summary:
In traditional target classification tasks, samples of every class are available when the model is trained. In real life, however, objects often follow a long-tailed distribution, so samples of some classes cannot be obtained and traditional target classification methods become infeasible. Zero-shot learning (ZSL) differs from traditional target classification: its goal is to recognize instances of new categories that have never been seen before. In the zero-shot task, the categories seen in the training set and the unseen categories in the test set are disjoint. Zero-shot learning has been widely studied in recent years as a way to recognize unseen objects, but some problems remain unsolved.

This paper first analyzes two problems of existing zero-shot learning methods. The first is that instances of unseen target classes are often classified as one of the seen source classes. The second is that unseen classes are mapped to their correct semantic features with low probability. To address both problems, this paper adopts a simple and effective generalized zero-shot learning method. We assume that both the labeled source images and the unlabeled target images can be used for testing. The introduced strong bias loss maps labeled source images to several fixed points in the semantic embedding space that are specified by the source categories, while unlabeled target images are forced onto other points specified by the target categories.

At the same time, this paper adds a zero-shot learning method with multimodal cycle-consistent training. Current methods approach zero-shot learning by learning a mapping from visual space to semantic space. Such a mapping often converts the visual representation of an unseen test sample into the semantic features of a seen class rather than the correct semantic features of its unseen class, which lowers zero-shot classification accuracy. Existing methods use generative adversarial networks to synthesize visual representations of unseen classes from semantic features, and then train a zero-shot classifier on the synthesized representations of unseen classes together with the representations of seen classes. This approach has been shown to improve zero-shot classification accuracy, but an important constraint is missing: there is no guarantee that a synthesized visual representation can reproduce its corresponding semantic features in a multimodal cycle-consistent manner. As a result, the synthesized visual representations may not represent their semantic features well, which also means that adding this constraint can improve methods based on generative adversarial networks. On this basis, this paper introduces a multimodal cycle-consistency loss that forces the generated visual features to reconstruct their original semantic features. Combining the solutions to the two problems above, the proposed multimodal cycle-consistent zero-shot learning model based on unbiased embedding was tested on three public datasets and achieved better results than existing methods.
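The two losses described in the abstract can be illustrated with a minimal sketch. The abstract does not give their exact form, so everything below is an assumption made for illustration: the bias loss is taken to be the negative log of the total probability assigned to unseen classes, the cycle-consistency loss is taken to be a mean squared error, and the names `class_probs`, `bias_loss`, `cycle_loss`, `prototypes`, and `unseen_idx` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def class_probs(embedding, prototypes):
    """Class probabilities for an image embedded in semantic space.

    Scores are dot products with fixed class prototypes, one per class
    (seen and unseen together). This scoring rule is an assumption.
    """
    return softmax([dot(embedding, p) for p in prototypes])


def bias_loss(embedding, prototypes, unseen_idx):
    """Assumed bias loss for an unlabeled target image.

    Negative log of the probability mass on unseen classes: small when the
    embedding lies near any unseen prototype, large when it lies near a
    seen one.
    """
    probs = class_probs(embedding, prototypes)
    return -math.log(sum(probs[i] for i in unseen_idx))


def cycle_loss(semantic, reconstructed):
    """Assumed cycle-consistency loss.

    Mean squared error between the original semantic vector and the one
    regressed back from a synthesized visual feature.
    """
    return sum((a - b) ** 2 for a, b in zip(semantic, reconstructed)) / len(semantic)


# Toy setup: class 0 is seen, class 1 is unseen.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
unseen_idx = [1]

near_unseen = bias_loss([0.0, 3.0], prototypes, unseen_idx)
near_seen = bias_loss([3.0, 0.0], prototypes, unseen_idx)
print(near_unseen < near_seen)  # embedding near the unseen prototype is penalized less
```

In this toy setup, a target embedding that drifts toward the seen prototype incurs a larger bias loss, and `cycle_loss` is zero only when the semantic vector is reconstructed exactly.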
Keywords/Search Tags: Target Classification, Zero-Shot Learning (ZSL), Strong Bias Loss, Generalized Zero-Shot Learning, Generative Adversarial Networks, Regularization Constraint, Multimodal Cycle-consistent