
Research On Learning And Inference Methods In Deep Generative Models

Posted on: 2024-08-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Z Ai
Full Text: PDF
GTID: 1528307079451474
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of machine learning technology, deep learning has become one of the most important methods in artificial intelligence, owing to its powerful representation learning and generalization capabilities, and is widely used in computer vision, natural language processing, speech recognition, recommendation systems, and other fields. As an important branch of unsupervised learning, deep generative models offer an efficient way to analyze and learn the structural distribution of unlabeled data. By modeling the joint distribution of latent variables and observed data, they capture the underlying data distribution from a probabilistic perspective, learn the rules by which the data are generated, and can create new samples similar to the original data. In deep generative models, computing the posterior distribution of the latent variables given the observed data is called inference; it plays a critical role in understanding the low-dimensional manifold structure of the data and in optimizing the probabilistic model.

Starting from the relevant deep learning algorithms, this dissertation takes the Variational Auto-encoder (VAE) as its point of departure and aims to develop new high-performance deep generative models together with efficient inference algorithms for them. It addresses three scientific issues in existing deep generative models: the difficulty of choosing the prior distribution, the demand for large amounts of high-quality training data, and poor scalability to complex data. The main contributions are summarized as follows.

Firstly, to address the difficulty of choosing the prior distribution, a variational autoencoder with a Bayesian pseudocoreset example prior is proposed, resolving the polarization in existing models between priors that are too simple and priors that are too complex. The model builds its prior on a small-scale Bayesian pseudocoreset, which avoids overfitting while improving optimization efficiency. To obtain the optimal pseudocoreset and make the prior explicitly interpretable, the model uses stochastic optimization to minimize the Kullback-Leibler (KL) divergence between the pseudocoreset-based prior and the prior based on the entire dataset, optimizing the network parameters and the pseudocoreset jointly. Results on several benchmark datasets show that, compared with previous VAE variants, the proposed model achieves competitive performance in density estimation, representation learning, and generative data augmentation, while significantly improving efficiency.
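To make the example-prior idea concrete, the following is a minimal sketch of a VAE whose prior is a uniform mixture over the encodings of a small set of learnable pseudo-examples. All names and sizes (PseudocoresetVAE, n_pseudo, the 784-dimensional input) are illustrative, and the mixture construction is only a VampPrior-style stand-in: the dissertation's Bayesian pseudocoreset and its stochastic KL minimization against the full-data prior are not reproduced here.

```python
# Minimal sketch of a VAE whose prior is a uniform mixture over the encodings
# of a small set of learnable pseudo-examples. Sizes and names are illustrative.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class PseudocoresetVAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=32, n_pseudo=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 2 * z_dim))
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, x_dim))
        # The small set of learnable pseudo-examples that defines the prior.
        self.pseudo_x = nn.Parameter(torch.rand(n_pseudo, x_dim))

    def gaussian_params(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        return mu, logvar

    @staticmethod
    def log_normal(z, mu, logvar):
        # Diagonal-Gaussian log density, summed over latent dimensions.
        return (-0.5 * (logvar + (z - mu) ** 2 / logvar.exp()
                        + math.log(2 * math.pi))).sum(-1)

    def prior_log_prob(self, z):
        # Uniform mixture of the posteriors at the pseudo-examples.
        mu, logvar = self.gaussian_params(self.pseudo_x)         # (M, z_dim)
        log_comp = self.log_normal(z.unsqueeze(1), mu, logvar)   # (B, M)
        return torch.logsumexp(log_comp, dim=1) - math.log(mu.shape[0])

    def forward(self, x):
        mu, logvar = self.gaussian_params(x)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()     # reparameterize
        rec = -F.binary_cross_entropy_with_logits(
            self.dec(z), x, reduction='none').sum(-1)
        # ELBO = E_q[log p(x|z)] + E_q[log p(z)] - E_q[log q(z|x)]
        elbo = rec + self.prior_log_prob(z) - self.log_normal(z, mu, logvar)
        return -elbo.mean()  # negative ELBO as the training loss
```

Training then reduces to minimizing the returned negative ELBO with any stochastic optimizer; because pseudo_x is a parameter, the pseudo-examples that define the prior are updated jointly with the encoder and decoder, mirroring the joint optimization described above.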
Secondly, to address the demand for high-quality training data, the study focuses on the poor quality of generated minority-class data under class imbalance. A novel variational autoencoder, together with a corresponding resampling model guided by the majority class, is proposed. The model generates new minority-class samples under the guidance of a majority-class prior, achieving generative data augmentation for the minority class and balancing the training data for downstream machine learning tasks. The newly generated minority-class samples inherit the diversity and richness of the majority-class samples, thereby alleviating overfitting in downstream tasks. In addition, to prevent model collapse under limited training data, the model adopts a two-stage scheme of pre-training and fine-tuning together with Elastic Weight Consolidation (EWC) regularization; a sketch of the EWC penalty is given after the third contribution below. Experimental results on benchmark image datasets and real-world tabular data demonstrate that the majority-class-guided variational autoencoder yields significant improvements in downstream classification tasks, confirming the effectiveness of the proposed model.

Thirdly, to address the poor scalability of deep generative models to complex data, the research focuses on efficient multimodal deep generative models, taking multimodal data as its entry point. A multimodal variational autoencoder based on contrastive normalizing flows is proposed. The model first uses invertible normalizing flows to introduce a high-level semantic space, called the "meta" latent space, which is separated from the modality-specific latent spaces. Contrastive learning then aligns the modalities at the instance level in this "meta" latent space, maximizing consistency across the multimodal data while minimizing the negative impact on generation quality; the alignment step is sketched in the second code example below. Finally, experimental results on several benchmark multimodal datasets demonstrate the superiority of the proposed multimodal variational autoencoder based on contrastive normalizing flows.
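Returning to the second contribution, the following is a minimal sketch of an Elastic Weight Consolidation penalty for the pre-train/fine-tune scheme. The function names (ewc_penalty, estimate_fisher), the diagonal Fisher estimator, and the weighting lam are illustrative assumptions rather than the dissertation's exact formulation.

```python
# Illustrative EWC penalty: anchor fine-tuned weights to their pre-trained
# values, weighted by a diagonal Fisher information estimate. All names and
# the estimator below are assumptions for illustration.
import torch

def estimate_fisher(model, loss_fn, data_loader, n_batches=10):
    """Diagonal Fisher estimate from squared gradients of the loss."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for i, batch in enumerate(data_loader):
        if i >= n_batches:
            break
        model.zero_grad()
        loss_fn(model, batch).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2 / n_batches
    return fisher

def ewc_penalty(model, theta_star, fisher, lam=100.0):
    """Quadratic penalty keeping parameters close to the pre-trained
    snapshot theta_star, in proportion to their Fisher importance."""
    loss = 0.0
    for n, p in model.named_parameters():
        loss = loss + (fisher[n] * (p - theta_star[n]) ** 2).sum()
    return lam * loss
```

During fine-tuning on the scarce minority-class data, the total loss would be the VAE loss plus this penalty, so that weights important to the pre-training stage move little, which is what guards against collapse under limited data.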
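For the third contribution, the sketch below shows instance-level contrastive alignment of two modality-specific latents mapped into a shared "meta" space with a symmetric InfoNCE loss. The invertible linear maps stand in for the dissertation's normalizing flows, and the batch size, dimensions, and temperature are illustrative.

```python
# Sketch of instance-level contrastive alignment in a shared "meta" latent
# space. Invertible linear maps stand in for the normalizing flows.
import torch
import torch.nn.functional as F

def info_nce(z_a, z_b, temperature=0.1):
    """Symmetric InfoNCE: paired rows of z_a and z_b are positives,
    all other rows in the batch act as negatives."""
    z_a = F.normalize(z_a, dim=-1)
    z_b = F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.t() / temperature      # (B, B) similarity matrix
    labels = torch.arange(z_a.shape[0])       # positives lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, labels)
                  + F.cross_entropy(logits.t(), labels))

B, z_dim = 64, 32
# Per-modality maps into the shared space (invertible when full rank).
to_meta_img = torch.nn.Linear(z_dim, z_dim)
to_meta_txt = torch.nn.Linear(z_dim, z_dim)
z_img, z_txt = torch.randn(B, z_dim), torch.randn(B, z_dim)  # dummy latents
align_loss = info_nce(to_meta_img(z_img), to_meta_txt(z_txt))
```

In the full model this alignment term would be added to the multimodal ELBO, trading off cross-modal consistency against per-modality reconstruction quality.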
Keywords/Search Tags: Deep learning, Deep generative models, Variational Auto-encoder (VAE), Efficient inference, Multimodal learning