Research On Zero-Shot Learning Algorithm Based On Deep Model

Posted on: 2021-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: P R Ma
Full Text: PDF
GTID: 2428330611954122
Subject: Electronic and communication engineering
Abstract/Summary:
In recent years, zero-shot learning (ZSL) has become an active research area in computer vision and machine learning because it can recognize new categories without any training samples for them. It uses a shared semantic space (such as an attribute space) to transfer knowledge from seen classes to unseen classes, and thereby recognizes the unseen classes. Zero-shot learning can be applied to many problems that lack training samples, such as object recognition, video understanding, and natural language processing, which gives it considerable research significance and value. Building on deep neural networks and generative model theory, this thesis designs a deep embedding model (DeSAE) and a deep generative model (DE-VAE) to improve the performance of zero-shot learning.

Traditional linear embedding models can learn only linear mappings between multi-modal data, have limited capacity to represent complex targets, and generalize poorly. To address these problems, this thesis proposes the DeSAE model. DeSAE is a simple deep modification of the classic linear semantic autoencoder (LiSAE) that introduces non-linearity through an artificial neural network. Compared with LiSAE, the mapping function learned by DeSAE generalizes better to unseen classes. Experimental results on four benchmark datasets show that DeSAE outperforms LiSAE on every dataset. The detailed results for DeSAE vs LiSAE are as follows. On the ZSL task: 42.1% vs 33.3% (CUB), 43.3% vs 40.3% (SUN), 53.8% vs 53.0% (AWA1), 54.7% vs 54.1% (AWA2). On the generalized zero-shot learning (GZSL) task: 20.1% vs 13.6% (CUB), 15.3% vs 11.8% (SUN), 10.6% vs 3.5% (AWA1), 11.6% vs 2.2% (AWA2).

Methods based on embedding models alone cannot fundamentally alleviate the domain shift and hubness problems in zero-shot learning. This thesis therefore combines the advantages of deep embedding models and generative models and proposes the DE-VAE model. DE-VAE learns a latent space shared by image features and class embeddings to support classification. First, a deep embedding network learns the mapping from the semantic space to the visual feature space. Then, both the features obtained by mapping the class embeddings and the original image features are fed into a modified variational autoencoder (VAE) for cross-modal alignment. Finally, the trained deep embedding network and the VAE encoder transform the image features and class embeddings of the seen and unseen classes into latent features, which are used to train and test the final softmax classifier. By generating latent features for the unseen classes, DE-VAE turns zero-shot learning into a conventional classification task, which fundamentally alleviates the domain shift and hubness problems. It achieves state-of-the-art performance on four zero-shot learning benchmark datasets. The results for DE-VAE vs CADA-VAE, the current state of the art, are as follows. On the ZSL task: 63.1% vs 60.6% (CUB), 64.0% vs 62.8% (SUN), 69.4% vs 65.0% (AWA1), 69.3% vs 64.3% (AWA2). On the GZSL task: 54.3% vs 52.5% (CUB), 40.9% vs 39.9% (SUN), 66.9% vs 63.6% (AWA1), 67.4% vs 63.9% (AWA2).

In summary, this thesis proposes a deep embedding model, DeSAE, and a deep generative model, DE-VAE, based on deep learning and generative model theory. Together they effectively alleviate the domain shift and hubness problems in zero-shot learning and significantly improve the accuracy of zero-shot recognition.
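
The three DE-VAE stages described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the toy class embeddings, the hand-set weights (`W1`, `W2`, `W_enc`), the fixed noise level, and every function name are assumptions chosen so that the example runs without actually training the embedding network or the VAE. It shows only the data flow: class embeddings are mapped to the visual feature space, both modalities are encoded into a shared latent space, and a softmax classifier is trained on latent features that include generated ones for the unseen classes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: every size and weight below is an illustrative assumption.
class_emb = np.eye(4)            # 4 classes, one toy class embedding each
seen, unseen = [0, 1], [2, 3]    # classes 2 and 3 have no training images

# Hand-set stand-ins for the *trained* networks (no training is done here).
W1 = np.eye(4)                                    # embedding -> hidden (4)
W2 = np.hstack([np.eye(4), np.eye(4)])            # hidden -> visual feature (8)
W_enc = np.vstack([np.eye(4), np.eye(4)]) / 2.0   # visual feature -> latent mean (4)

def embed(a):
    """Stage 1 stand-in: class embedding -> visual feature space (one ReLU layer)."""
    return np.maximum(a @ W1, 0.0) @ W2

def encode_mean(x):
    """Stage 2 stand-in: visual feature -> latent mean of the VAE encoder."""
    return x @ W_enc

def toy_images(c, n):
    """Fake image features of class c: mapped class embedding plus noise."""
    return embed(class_emb[c]) + 0.05 * rng.normal(size=(n, 8))

def generated_latents(c, n):
    """Latent samples for class c produced from its class embedding only."""
    mu = encode_mean(embed(class_emb[c]))
    return mu + 0.05 * rng.normal(size=(n, 4))    # fixed sigma is an assumption

def train_softmax(Z, y, n_classes, lr=0.5, epochs=200):
    """Stage 3: plain softmax regression trained by gradient descent."""
    W = np.zeros((Z.shape[1], n_classes))
    Y = np.eye(n_classes)[y]                      # one-hot targets
    for _ in range(epochs):
        logits = Z @ W
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)
        W -= lr * Z.T @ (P - Y) / len(Z)          # cross-entropy gradient step
    return W

# Training set: encoded images for seen classes, generated latents for unseen ones.
Z_train = np.vstack(
    [encode_mean(toy_images(c, 20)) for c in seen]
    + [generated_latents(c, 20) for c in unseen]
)
y_train = np.repeat(seen + unseen, 20)
W_clf = train_softmax(Z_train, y_train, n_classes=4)

# Test on images of the unseen classes, predicting over all four classes.
Z_test = np.vstack([encode_mean(toy_images(c, 20)) for c in unseen])
y_test = np.repeat(unseen, 20)
pred = (Z_test @ W_clf).argmax(axis=1)
accuracy = float((pred == y_test).mean())
print(f"accuracy on unseen-class images: {accuracy:.2f}")
```

In this toy setting the classifier never sees an image of classes 2 or 3 during training; it can still assign test images to them because their latent training features were generated from the class embeddings, which is the sense in which the abstract describes zero-shot learning becoming a conventional classification task.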
Keywords/Search Tags: Zero-Shot Learning, Deep Neural Network, Embedding Model, Generation Model, Class Embedding