Research On Zero-Shot Image Learning

Posted on: 2020-05-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L Yu
Full Text: PDF
GTID: 1488306131967179
Subject: Information and Communication Engineering
Abstract/Summary:
With the renaissance of deep convolutional neural networks (CNNs), significant breakthroughs have been achieved on supervised tasks that require sufficient training data for each target category. However, assuming that sufficient data are available for every category is unrealistic: in practice, a learning system is often required to classify categories for which it has little or no training data. In this thesis, we focus on zero-shot learning (ZSL), which enables a learning system to recognize unseen categories, that is, categories not observed in the training stage. Inspired by the human ability to recognize unseen categories from class semantic descriptions, ZSL is achieved by resorting to a class semantic space derived from external semantic information. Although significant progress has been made in the last decade, ZSL remains a challenging task because of issues such as the heterogeneity gap, hubness, and domain shift. This thesis is dedicated to addressing these issues for ZSL and makes the following contributions.

First, we propose to address ZSL by synthesizing visual features from the class semantic prototypes (the feature representations in the class semantic space) with a dictionary framework. To alleviate the domain shift issue, we formulate prediction as a refinement process: under the transductive setting, a self-training strategy selects reliable instances from the unseen data and treats them as annotated instances to refine the previously trained model. Through this bootstrapping-based mechanism, the classification model is progressively reinforced for the unseen categories.

Second, we propose an encoder-decoder framework for ZSL that finds a compact latent space encoding the principal semantic characteristics shared across modalities, so that knowledge can be transferred. By enforcing all modalities of the same concept to share the same latent space, the common principal semantic characteristics across modalities are effectively explored. With the common latent space as a bridge, information from one modality can be easily transferred to another.

Third, we present a semantics-guided attention model for fine-grained ZSL. We represent the visual features with part features extracted from the key regions of the target objects, and develop a semantics-guided attention model that obtains a more semantics-related feature representation by assigning different weights to different regions according to their relatedness to the class semantics.

Fourth, inspired by the success of generative approaches for ZSL, we propose a bi-adversarial network that formulates feature generation and class semantics inference as a cyclic model, so that the two can improve each other. To encode discriminative information into the visual space, we further design a classification network that constrains both the real and the generated visual features to be correctly classified. This approach generates semantics-related and discriminative visual features for unseen categories, which are then used for label prediction.

Last, we propose a category classifier generation network that generates classifiers for unseen categories from their corresponding class semantic prototypes. To ensure that the generated classifiers are discriminative, the real category classifiers trained on visual data are used to supervise the classifier generation network. The whole training process is designed under a meta-learning umbrella. This approach provides a new perspective on ZSL.

Extensive experimental results on ZSL benchmark datasets demonstrate that the approaches proposed in this thesis perform competitively against state-of-the-art approaches in terms of efficiency, effectiveness, and stability.
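
The transductive self-training idea mentioned in the first contribution can be illustrated with a minimal sketch. This is not the dictionary-based model of the thesis: a plain nearest-prototype classifier is swapped in for it, and all names (`nearest_prototype`, `self_train`, `top_k`, the toy data) are hypothetical. It only shows the general bootstrapping loop: predict on unlabeled unseen-class data, keep the most confident predictions as pseudo-labeled instances, and use them to refine the model.

```python
import math

def nearest_prototype(x, prototypes):
    """Return (label, margin): the closest prototype's label and the gap
    between the second-closest and closest distances (used as confidence)."""
    dists = sorted((math.dist(x, p), label) for label, p in prototypes.items())
    best_dist, best_label = dists[0]
    margin = dists[1][0] - best_dist if len(dists) > 1 else float("inf")
    return best_label, margin

def self_train(prototypes, unlabeled, rounds=3, top_k=2):
    """Refine class prototypes using confident pseudo-labels (assumed scheme)."""
    protos = {c: list(p) for c, p in prototypes.items()}
    for _ in range(rounds):
        scored = [(nearest_prototype(x, protos), x) for x in unlabeled]
        for c in protos:
            # Keep the top_k most confident instances predicted as class c.
            picked = sorted((s for s in scored if s[0][0] == c),
                            key=lambda s: -s[0][1])[:top_k]
            if not picked:
                continue
            members = [x for _, x in picked]
            mean = [sum(col) / len(members) for col in zip(*members)]
            # Move the prototype halfway toward the pseudo-labeled mean.
            protos[c] = [(a + b) / 2 for a, b in zip(protos[c], mean)]
    return protos

# Toy data: the initial prototypes are shifted away from the true clusters,
# mimicking domain shift between seen and unseen categories.
initial = {"a": [1.0, 1.0], "b": [3.0, 3.0]}
unseen = [[0.0, 0.0], [0.2, -0.1], [-0.1, 0.1],
          [4.0, 4.0], [4.1, 3.9], [3.9, 4.2]]
refined = self_train(initial, unseen)
```

In this toy setting each refined prototype ends up closer to its cluster than the initial one. The confidence measure (distance margin) and the halfway update rule are illustrative choices, not details taken from the thesis.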
Keywords/Search Tags:Zero-Shot Learning, Feature Representations, Knowledge Transfer, Domain Shift, Hubness, Generative Adversarial Network, Meta-Learning