Research On Autoencoder And Generative Adversarial Network Based Image Recognition

Posted on: 2020-07-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Hu
Full Text: PDF
GTID: 1368330602453785
Subject: Light Industry Information Technology
Abstract/Summary:
The core of image recognition is learning discriminative and robust image representations. Good features contribute to image recognition and analysis. However, image content is complex and susceptible to occlusion, illumination, size, deformation and other factors, so features with strong discriminative ability are difficult to obtain. Extracting image features effectively remains a key topic in artificial intelligence, pattern recognition and computer vision. Based on the autoencoder and the generative adversarial network, this dissertation studies image feature extraction and its application to image recognition. Deep learning is capable of hierarchical representation, which has attracted considerable research attention and funding. Although deep learning has achieved good results in fields such as image recognition, speech recognition and natural language processing, open problems remain: how to embed discriminative information into feature learning effectively, how to fuse large amounts of unlabeled data with labeled data for unified learning, and how to extract interpretable representations of data. This dissertation focuses on these problems and proposes several deep learning based methods to improve the generalization ability of the model. The main work includes the following aspects:

(1) A sparse autoencoder with label consistency constraints is proposed. An autoencoder is a neural network whose structure allows fast inference. A traditional autoencoder does not easily learn some important information from samples, so many improved variants have been proposed, such as sparse autoencoders and non-negative constrained autoencoders. These autoencoders neglect the relationships among data samples, which makes it hard to learn strongly discriminative features. To this end, this dissertation proposes a sparse autoencoder with label consistency constraints. During feature learning, this autoencoder penalizes the distances between the features and their corresponding class centers and adds this penalty to the loss function as a center loss. Trained with reconstruction and this loss, the autoencoder can learn the structural information of the data. In addition, shallow autoencoders can be stacked into a deep model, and the hierarchical representation can be further improved by pre-training and fine-tuning the deep model. Experiments on different datasets verify that the label consistency constraints help extract discriminative features and that the method is also an effective initialization strategy for a deep model.

(2) A ladder network with graph Laplacian constraints is proposed. The ladder network is a deep network based on a deep autoencoder. It integrates supervised learning and unsupervised learning into a unified framework. This semi-supervised learning strategy helps improve the efficiency of label use in supervised learning and the discriminability of features in unsupervised learning. However, this kind of network ignores the structural information among data samples. In this dissertation, a Laplacian matrix is introduced into the model, and a ladder network based on graph Laplacian constraints is proposed. This network fuses all samples, labeled or not, into a unified graph for learning. The graph acts as a local constraint during data reconstruction and feature learning and further improves the semi-supervised learning ability of the ladder network. Laplacian ladder networks are built with both a fully connected network and a convolutional network. Experimental results on handwritten digit datasets and object recognition datasets demonstrate the effectiveness of the proposed method for image recognition.

(3) A generative adversarial network with mean and variance feature matching is proposed. The generative adversarial network is a deep generative model that assumes all samples, whether labeled or not and whether real or generated, are produced by an underlying latent model. A generative adversarial network can therefore be used for semi-supervised learning. The Improved GAN is an advanced variant of the generative adversarial network. It introduces feature matching as a training method, which effectively improves the stability of the model. However, this method uses only the first-order moment (the mean) of the features as the statistic for feature matching. The mean alone does not describe the feature distribution well, so the feature distribution of the generated data is not matched closely to that of the original data. To this end, the second-order moment (the variance) of the features is added to the feature matching statistic, and a generative adversarial network based on feature matching of mean and variance is proposed. This network matches generated samples to original samples more closely and captures the data manifold more effectively. The experiments verify that feature matching with variance further improves semi-supervised classification performance, especially when few labeled samples are available, and that the network can also generate realistic images.

(4) A dual encoder-decoder structured generative adversarial network is proposed. To obtain an interpretable representation, disentangled representation learning is often used to analyze deep learning networks. This dissertation experiments on face images to demonstrate the disentangled representation ability of the generative adversarial network, and analyzes whether the disentangled representation helps improve the robustness of the model. The disentangled representation generative adversarial network (DR-GAN) can disentangle face identity information from the pose attribute and then use the disentangled face representation for face recognition, which improves pose-invariant face recognition. However, this method has shortcomings. First, DR-GAN uses the traditional adversarial loss as its loss function, which is not conducive to the training stability and convergence speed of the model. Second, it uses a one-hot vector to represent pose. This discrete representation loses much pose information, including the underlying continuity of pose change. To this end, this dissertation proposes a dual encoder-decoder structured generative adversarial network. This network uses an autoencoder as one part of its discriminator and introduces a pixel-wise loss function that helps stabilize GAN training. Moreover, the pose is treated as a continuous variable, which is added to model training as a prior. It is evaluated by pose regression instead of pose classification, which helps learn the disentangled facial representation. The experimental results show that the proposed method performs well on pose-invariant face recognition and on generating faces with different poses.
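The label consistency term in contribution (1) can be illustrated with a minimal NumPy sketch. It is not the dissertation's implementation: the function names, the use of per-batch class means as centers, the L1 sparsity stand-in, and the weights `lam` and `beta` are all assumptions made for illustration.

```python
import numpy as np

def label_consistency_loss(features, labels, num_classes):
    """Mean squared distance between each feature and its class center.

    Assumption: class centers are the per-class means of the given batch.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    loss = 0.0
    for c in range(num_classes):
        members = features[labels == c]
        if len(members) == 0:
            continue  # class absent from this batch
        center = members.mean(axis=0)
        loss += np.sum((members - center) ** 2)
    return loss / len(features)

def total_loss(x, x_rec, hidden, labels, num_classes, lam=0.1, beta=0.01):
    """Reconstruction + sparsity + center loss (hypothetical weighting)."""
    x = np.asarray(x, dtype=float)
    x_rec = np.asarray(x_rec, dtype=float)
    hidden = np.asarray(hidden, dtype=float)
    recon = np.mean(np.sum((x - x_rec) ** 2, axis=1))
    sparsity = np.mean(np.abs(hidden))  # L1 penalty as a stand-in sparsity term
    center = label_consistency_loss(hidden, labels, num_classes)
    return recon + beta * sparsity + lam * center
```

Pulling hidden features toward their class centers is what lets label information shape an otherwise unsupervised reconstruction objective.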
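The graph Laplacian constraint in contribution (2) can be sketched as a smoothness penalty on the learned features. The abstract does not say how the graph is built, so the k-nearest-neighbor affinity with binary weights below is an assumption, as are the function names.

```python
import numpy as np

def knn_affinity(x, k=2):
    """Symmetric binary k-nearest-neighbor affinity over all samples.

    Labeled and unlabeled samples are treated alike, so they share one graph.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    w = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(d2[i])
        neighbors = [j for j in order if j != i][:k]
        w[i, neighbors] = 1.0
    return np.maximum(w, w.T)  # symmetrize the neighbor relation

def laplacian_penalty(h, w):
    """tr(H^T L H) with L = D - W, equal to 0.5 * sum_ij W_ij ||h_i - h_j||^2."""
    h = np.asarray(h, dtype=float)
    lap = np.diag(w.sum(axis=1)) - w
    return float(np.trace(h.T @ lap @ h))
```

The penalty is small when neighboring samples in the graph receive similar features, which is the local constraint described above.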
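The mean and variance feature matching in contribution (3) can be sketched as follows. The features are assumed to be activations from an intermediate discriminator layer; the squared L2 form and the equal weighting of the two terms are also assumptions, not details given in the abstract.

```python
import numpy as np

def mean_variance_feature_matching(real_feats, fake_feats):
    """Match first- and second-order feature statistics of two batches.

    Each input has shape (batch, feature_dim).
    """
    real = np.asarray(real_feats, dtype=float)
    fake = np.asarray(fake_feats, dtype=float)
    mean_gap = real.mean(axis=0) - fake.mean(axis=0)  # first-order moment
    var_gap = real.var(axis=0) - fake.var(axis=0)     # second-order moment
    return float(np.sum(mean_gap ** 2) + np.sum(var_gap ** 2))
```

A generator that collapses every sample onto the mean of the real features would drive a mean-only loss to zero; the variance term still penalizes it.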
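The difference between one-hot and continuous pose representations in contribution (4) can be made concrete with a small sketch. The bin centers, the angle units, and the function names are hypothetical; the abstract does not specify how pose is parameterized.

```python
import numpy as np

def one_hot_pose(angle_deg, bin_centers):
    """Discrete pose code: 1 at the nearest bin, 0 elsewhere."""
    centers = np.asarray(bin_centers, dtype=float)
    idx = int(np.argmin(np.abs(centers - angle_deg)))
    code = np.zeros(len(centers))
    code[idx] = 1.0
    return code

def pose_regression_loss(pred_deg, target_deg):
    """Mean squared error on the continuous pose value."""
    pred = np.asarray(pred_deg, dtype=float)
    target = np.asarray(target_deg, dtype=float)
    return float(np.mean((pred - target) ** 2))
```

With bins at -90, -45, 0, 45 and 90 degrees, poses of 10 and -10 degrees receive the same one-hot code, while the regression loss still separates them.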
Keywords/Search Tags: image recognition, deep learning, autoencoder, generative adversarial network, disentangled representation learning