Research On Unsupervised Disentanglement Learning Based Regularization Method

Posted on: 2024-07-14 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: M F Hu | Full Text: PDF
GTID: 1528307307954019 | Subject: Control theory and control engineering
Abstract/Summary:
Representation learning is one of the core problems in artificial intelligence. A good representation automatically transforms raw data into a form that machine learning models can use effectively, which avoids the complexity of manual feature extraction and lets the model learn good feature representations on its own. Interpretability is an important criterion for evaluating a representation, and good interpretability can effectively improve the robustness and generalization ability of a model. However, real-world data is usually complex and redundant, and the useful features hidden in the samples are often highly entangled. Disentangling these features and learning interpretable representations can therefore help computers understand the world in a way closer to human thinking and logic, learn the regularities of data more accurately, and support downstream tasks and applications. In unsupervised learning, automatically discovering interpretable factors and disentangling them is a challenging problem. Current disentanglement algorithms are mainly based on autoencoder models and use various regularization methods to constrain the representation and improve disentanglement performance. This dissertation focuses on regularization methods for unsupervised disentangled representation learning, with the aim of improving both disentanglement performance and the visualization of interpretable factors. The research covers the following three aspects.

(1) CapsNet based on the information bottleneck. CapsNet is a special autoencoder whose basic unit is a capsule vector. It captures spatial features better than a traditional deep neural network, so its representation contains rich interpretable factors. However, these factors are always entangled with each other, and common disentanglement methods are difficult to apply directly to CapsNet, which hinders the application of the model to downstream tasks. To solve this problem, we propose to use information bottleneck theory to extract from the input data the minimal sufficient statistics of the output, which forces the capsule vectors to learn more general and interpretable factors in a limited representation space and disentangles the factors. Because the information bottleneck constraint is difficult to compute, the objective function is simplified from an information-theoretic perspective and then further reduced to a constraint on the capsule mean, which effectively improves the applicability of the algorithm. In addition, we propose class-independent mask vectors that help CapsNet represent the interpretable factors of samples from different categories in the same parameter space. The structure of an unsupervised capsule network and the corresponding dynamic routing algorithm are designed to help CapsNet learn well-disentangled capsules without auxiliary information.

(2) Variational autoencoder with a mean constraint. The variational autoencoder (VAE) is the most widely used autoencoder and the most common model structure in disentangled representation learning. However, improvements in disentanglement performance are always accompanied by a decline in generation ability. The disentanglement experiments on CapsNet show that the mean and the variance of the capsule vector affect the model differently. Inspired by this observation, this dissertation proposes a new method that decomposes the KL divergence into independent sub-terms and verifies the influence of each decomposed constraint on model performance, in order to analyze and explain why disentanglement algorithms reduce reconstruction quality. Furthermore, we find that the mean constraint can improve the disentanglement performance of the VAE, and we design the mcVAE algorithm, which uses a weighted mean constraint term to restrict the posterior distribution of the representation, so that the model significantly improves disentanglement performance while preserving generation ability. To measure disentanglement performance quantitatively, a new disentanglement metric is proposed. It expresses the sensitivity of the representation to the true factor in perturbed data by identifying that factor, which avoids the numerical instability and other shortcomings of the original metric. In addition, common regularization methods are analyzed comprehensively: based on the mechanisms of the various disentanglement algorithms, the nature of disentangled representations and the process by which factors are disentangled during learning are analyzed.

(3) A hybrid-model method for generating high-quality disentangled samples. The two regularization algorithms above not only improve the disentanglement performance of autoencoder models but also improve, to a certain extent, the visualization of the interpretable factors. However, limited by model structure and training method, autoencoder models have difficulty generating high-quality samples from complex datasets, which seriously limits the range of applications of the algorithms. We propose a simple and general hybrid model framework that can easily combine state-of-the-art disentanglement models and generative models, and that renders the interpretable factors in the disentangled representation as high-quality, high-resolution image samples with high fidelity. The two-stage training algorithm designed in this dissertation trains the two models separately and uses the disentangled representation as the connecting vector between them, so that both models can reach a local optimum. When a variational autoencoder is used to learn the disentangled representation, we design a constraint that improves the fidelity of the factors by maximizing the mutual information between the disentangled representation and the samples produced by the generator, which ensures that the samples generated by the two generative models align with the same interpretable factors. Finally, the performance of the hybrid model is verified on complex datasets, and the variation of interpretable factors in high-resolution synthetic images is shown.

In summary, this dissertation designs several disentanglement algorithms starting from CapsNet, then generalizes the mean constraint to the VAE to trade off disentanglement performance against reconstruction quality. Finally, the proposed hybrid model can learn the interpretable factors of complex datasets and generate clear disentangled samples.
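The abstract does not state how the KL divergence is decomposed, so the following is only a minimal sketch of one plausible reading. It assumes a diagonal Gaussian posterior and a standard normal prior, for which the closed-form KL divergence splits exactly into a term that depends only on the mean and a term that depends only on the variance. The function names `gaussian_kl_terms` and `weighted_objective` and the parameter `mean_weight` are hypothetical and are not taken from the dissertation.

```python
import math


def gaussian_kl_terms(mu, log_var):
    """Split KL(N(mu, diag(exp(log_var))) || N(0, I)) into two parts.

    Returns (mean_term, var_term); their sum is the full closed-form KL.
    mu and log_var are equal-length sequences of floats for one sample.
    """
    # Part that depends only on the posterior mean: 0.5 * sum(mu^2)
    mean_term = 0.5 * sum(m * m for m in mu)
    # Part that depends only on the posterior variance:
    # 0.5 * sum(sigma^2 - log(sigma^2) - 1)
    var_term = 0.5 * sum(math.exp(lv) - lv - 1.0 for lv in log_var)
    return mean_term, var_term


def weighted_objective(recon_error, mu, log_var, mean_weight=4.0):
    """Reconstruction error plus a KL penalty whose mean part is re-weighted.

    ASSUMPTION: weighting only the mean term is an illustrative guess at
    what a "weighted mean constraint" could look like, not the author's
    exact objective. mean_weight=1.0 recovers the usual VAE penalty.
    """
    mean_term, var_term = gaussian_kl_terms(mu, log_var)
    return recon_error + mean_weight * mean_term + var_term


# Example with a two-dimensional latent code.
mu = [0.5, -1.0]
log_var = [0.0, math.log(0.25)]
mean_term, var_term = gaussian_kl_terms(mu, log_var)
loss = weighted_objective(recon_error=2.0, mu=mu, log_var=log_var)
```

Keeping the two parts separate makes it possible to vary the weight on one of them and observe its effect on disentanglement and reconstruction independently, which is the kind of analysis the abstract describes.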
Keywords/Search Tags: Disentangled Representation, Unsupervised Learning, CapsNet, Variational Autoencoder, Generative Adversarial Network