Unseen Attribute-Object Pair Recognition Via Disentanglement

Posted on: 2021-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: H Chen
Full Text: PDF
GTID: 2518306050473334
Subject: Circuits and Systems
Abstract/Summary:
The rapid development of computer vision has brought huge progress in image recognition, enabling face recognition and object recognition to be applied extensively in daily life. This progress is inseparable from deep neural networks, which require a large volume of data to learn the mapping between visual features and categories. As the range of applications expands, such data-hungry learning becomes increasingly difficult to sustain. In recent years, zero-shot learning has become a hot topic, and unseen attribute-object pair recognition is of particular significance because of the special structure of its datasets. The task faces two main difficulties: (1) the domain gap between unseen and seen attribute-object pairs means that a model learned on the training set cannot be applied effectively to the testing set; (2) the visual appearance of an attribute is closely tied to the object it is combined with, which makes it difficult to learn discriminative features for the attribute. To overcome these problems, and drawing on the way humans learn visual concepts, this thesis proposes a model based on the idea of disentanglement. The main research contents are as follows:

1. We propose an unseen attribute-object pair recognition model based on disentanglement. Research shows that traditional methods either consider attributes and labels separately or consider only the concept of the "attribute-object" pair as a whole, and thus fail to learn complex concepts by composing simple ones. In this thesis, the distribution of visual features is transformed from different views to construct two independent disentangled subspaces, so as to decouple the visual features. Semantic features are then projected into the two subspaces and aligned with the distribution of the features projected from the image. During testing, the feature distributions in the two subspaces are considered jointly to identify unseen attribute-object pairs, so the discrepancy between any two compositions can be taken into account. This ensures that the discrepancy among features is preserved in one subspace while the relevance among them is explored in the other, so that complex concepts can be recognized.

2. We propose an unseen attribute-object pair recognition model based on discrimination and construction. To further improve recognition accuracy, new modules are introduced to optimize the disentangled subspaces. The disentanglement-based recognition model aims to learn the attribute concept and the object concept separately, but the relative-distance constraint alone cannot guarantee that each subspace represents its own specific concept. Therefore, this thesis introduces a discriminative model that classifies the attributes and the objects of the visual features in the two subspaces separately, so as to learn features that are more discriminative with respect to each specific concept. In addition, a semantic reconstruction module is introduced to reconstruct the word vectors of the attribute and the object from the visual features in the two subspaces, which improves the semantic consistency between semantic features and visual features in the attribute subspace. At the same time, the decoupled visual features are combined and cross-combined to reconstruct the input visual features, which ensures the discriminability of the input visual features and promotes the learning of compositional concepts.

Extensive experiments on the MIT-States and UT-Zappos datasets demonstrate that our method is effective and significantly improves performance compared with the previous state of the art.
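
The test-time idea described above, scoring a candidate pair jointly in the two subspaces, together with the relative-distance constraint, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the function names, the use of Euclidean distance, the sum of the two distances as the combined score, the margin value, and the toy two-dimensional vectors, which stand in for features that a trained model would already have projected into the attribute and object subspaces.

```python
import math


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style relative-distance constraint (assumed form): the anchor
    should be closer to the positive than to the negative by `margin`."""
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)


def recognize_pair(img_attr, img_obj, attr_embed, obj_embed, candidate_pairs):
    """Return the (attribute, object) candidate whose summed distance to the
    image feature across both subspaces is smallest."""
    def score(pair):
        attr, obj = pair
        return distance(img_attr, attr_embed[attr]) + distance(img_obj, obj_embed[obj])
    return min(candidate_pairs, key=score)


# Hypothetical semantic features, already projected into each subspace.
attr_embed = {"wet": [1.0, 0.0], "dry": [0.0, 1.0]}
obj_embed = {"dog": [1.0, 1.0], "road": [-1.0, 0.0]}

# Hypothetical image feature projected into the attribute and object subspaces.
img_attr = [0.9, 0.1]
img_obj = [-0.8, 0.1]

# ("wet", "road") plays the role of a composition never seen in training.
candidates = [("wet", "dog"), ("dry", "road"), ("wet", "road")]
predicted = recognize_pair(img_attr, img_obj, attr_embed, obj_embed, candidates)

# Loss is zero when the correct attribute is already closer by the margin.
loss_correct = triplet_loss(img_attr, attr_embed["wet"], attr_embed["dry"])
loss_wrong = triplet_loss(img_attr, attr_embed["dry"], attr_embed["wet"])
print(predicted, loss_correct, loss_wrong)
```

Because each subspace is scored independently, a pair such as ("wet", "road") can be selected even if only ("wet", "dog") and ("dry", "road") appeared during training, which is the compositional behaviour the abstract describes.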
Keywords/Search Tags: Recognition of Unseen Attribute-Object Pairs, Disentanglement, Zero-Shot Learning, Transfer Learning, Deep Learning