
Study of Feature Learning Methods and Applications Based on Probabilistic Models

Posted on: 2020-08-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 1360330602950277
Subject: Signal and Information Processing
Abstract/Summary:
With the continuous development of artificial intelligence and machine learning, feature extraction has become significant in a wide range of intelligent systems. Given large amounts of labeled data, supervised feature extraction via deep neural networks (DNNs) has been widely studied over the past few decades, because it can figure out which type of representation is relevant to the task at hand. Although DNNs have recently enjoyed great success in quite a few applications, such as natural image processing and natural language processing, they are formulated as single-point-estimate problems, and their "black-box" nature makes it hard to interpret what they learn at each layer. In contrast, beyond optimization-based methods, probabilistic models are more powerful at describing data, incorporating domain knowledge, and interpreting the learned features. Focusing on feature learning for images and texts in a probabilistic framework, this dissertation presents research on designing interpretable Bayesian models from both supervised and unsupervised views. In addition, we develop scalable inference for these models in big-data applications and extend them to joint image-text representation learning. The main contents of the dissertation are summarized as follows:

1. In supervised feature extraction, considering that deep neural networks (DNNs) can only learn point estimates of parameters, which are not robust to noise and lack probabilistic interpretability, we introduce a new max-margin discriminant projection method that takes advantage of the latent variable representation of the support vector machine (SVM) as the classification criterion. Specifically, the proposed model jointly learns the discriminative subspace and the classifier in a Bayesian framework by conditioning on augmented variables. Moreover, an extended nonlinear model is developed based on the kernel trick. To exploit sparsity in the kernel expansion, we use a spike-and-slab prior to seek basis vectors (BVs) from the corresponding candidates. Unlike existing methods, which employ BVs to approximate the original feature space, our method seeks BVs associated with the final classification task. Thanks to conditional conjugacy, the parameters of our models can be inferred via a simple and efficient Gibbs sampler. The linear model is named max-margin linear discriminant projection (MMLDP) and the kernel model kernel max-margin discriminant projection (KMMDP); both are tested on synthetic and real-world data to demonstrate their efficiency and effectiveness.
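To make the "conditioning on augmented variables" step concrete: a classical route to conditional conjugacy for max-margin models is the location-scale mixture representation of the hinge loss (Polson and Scott, 2011). This is shown as a hedged illustration, not necessarily the exact augmentation scheme used in the dissertation. For a label y ∈ {-1, +1} and margin score u = y·wᵀx,

```latex
% Hinge-loss data augmentation: a classical identity, shown as an
% illustration; the dissertation's exact scheme may differ.
e^{-2\max(1-u,\,0)}
  \;=\; \int_{0}^{\infty} \frac{1}{\sqrt{2\pi\lambda}}
        \exp\!\Big(\!-\frac{(1+\lambda-u)^{2}}{2\lambda}\Big)\, d\lambda .
```

Conditioned on the augmented variable λ, the max-margin pseudo-likelihood becomes Gaussian in w, so the projection and classifier parameters admit closed-form conditional posteriors, which is the kind of structure that enables the simple and efficient Gibbs sampler mentioned above.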
2. Some types of data, such as natural images, are high-dimensional and multi-way, so neither MMLDP nor KMMDP can model them directly. Drawing on ideas from deep learning, a unified Bayesian max-margin discriminant projection framework (MMDP) is introduced that jointly learns the discriminant feature space and the max-margin classifier under different linear or nonlinear relationships between the latent representations and the observations. We assume that the latent representation follows a normal distribution whose sufficient statistics are functions of the observations. The function can be flexibly realized through either shallow or deep structures. The shallow structures include linear and nonlinear kernel-based functions, and even a convolutional projection, which can be trained layer-wise to build a multilayer convolutional feature learning model. To take advantage of deep neural networks, especially their high expressiveness and efficient parameter learning, we integrate Bayesian modeling with popular neural networks, e.g., MLPs and CNNs, to build an end-to-end Bayesian deep discriminant projection under the proposed framework, which degenerates to the existing shallow linear or convolutional projection when a single-layer structure is used. Aiming at the big-data era, an efficient and scalable inference algorithm for MMDP is also developed based on stochastic gradient Markov chain Monte Carlo (SG-MCMC). Finally, we demonstrate the effectiveness and efficiency of the proposed models through experiments on real-world data, with detailed analyses of the parameters and the computational complexity.
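As a minimal sketch of the SG-MCMC machinery referred to above, assuming stochastic gradient Langevin dynamics (SGLD), the simplest member of the family (the dissertation's sampler may use a different variant), one update computed on a size-n minibatch from N data points looks like this:

```python
import numpy as np

def sgld_step(theta, grad_log_prior, grad_log_lik_batch, N, n, eps, rng):
    """One stochastic gradient Langevin dynamics (SGLD) update.

    grad_log_prior(theta): gradient of the log-prior at theta
    grad_log_lik_batch(theta): summed log-likelihood gradient on the minibatch
    N, n: full-data and minibatch sizes (rescale the stochastic gradient
          so it is an unbiased estimate of the full-data gradient)
    eps: step size, which also sets the injected-noise variance
    """
    drift = grad_log_prior(theta) + (N / n) * grad_log_lik_batch(theta)
    noise = rng.normal(0.0, np.sqrt(eps), size=theta.shape)
    return theta + 0.5 * eps * drift + noise
```

Iterating this update with a decreasing step size yields approximate posterior samples at minibatch cost, which is what makes such inference scalable to large datasets.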
3. Considering that in practice unlabeled data far outnumber labeled data, probabilistic generative models (PGMs) are more flexible in exploring the structure of unlabeled data. Faced with the large volumes of text in various corpora, most existing models are shallow ones with limited feature-learning capacity, and they are often characterized by a purely top-down generative structure that is inconvenient for real-time processing. To build a flexible and interpretable multilayer PGM for document analysis, we develop the deep autoencoding topic model (DATM), which uses a hierarchy of gamma distributions to construct its multi-stochastic-layer generative network. To provide scalable posterior inference for the parameters of the generative network, we also develop a topic-layer-adaptive stochastic gradient Riemannian MCMC that jointly learns simplex-constrained global parameters across all layers and topics, with topic- and layer-specific learning rates. Given a posterior sample of the global parameters, in order to efficiently infer the local latent representations of a document under DATM across all stochastic layers, we propose a Weibull upward-downward variational encoder that deterministically propagates information upward via a deep neural network, followed by a Weibull-distribution-based stochastic downward generative model. To jointly model documents and their associated labels, we further propose supervised DATM (sDATM), which enhances the discriminative power of the latent representations. The efficacy and scalability of our models are demonstrated on both unsupervised and supervised learning tasks on big corpora.

4. The models discussed above focus on a single modality. In the real world, however, people often use multiple modalities to understand one object or target, and joint image-text learning is one of the important research directions in this area. Although joint image-text learning models have achieved success, few are able to exploit interpretable visual and semantic relationships hierarchically and to build a bidirectional end-to-end transformation. To this end, we first propose the aggregated posterior randomized GAN (APGAN), in which the VAE aggregated posterior, in lieu of noise, is fed as the source of randomness into the GAN generator; APGAN maximally preserves the respective generators and can model one or two modalities. Building on the single-modality analyses, we develop it into AP-StackGAN++ to perform joint image-text learning by combining a deep-topic-model-based variational heteroencoder (VHE) with StackGAN++. To better capture the connection between the two modalities, we further develop "raster-scan" GANs that improve StackGAN++ to generate photo-realistic images not only in a multi-scale low-to-high-resolution manner but also in a hierarchical-semantic coarse-to-fine fashion, referred to as AP-raster-scan-GAN. With joint training, state-of-the-art performance is achieved on a rich set of image-text multimodal learning and generation tasks.
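To illustrate the core APGAN idea of drawing the generator's randomness from the VAE aggregated posterior rather than from N(0, I) noise, here is a minimal sketch; the encoder and generator modules are hypothetical placeholders, not the dissertation's actual architecture:

```python
import torch

def apgan_fake_batch(encoder, generator, real_batch):
    """Generate fakes by feeding aggregated-posterior samples into the GAN.

    encoder: maps a real batch x to (mu, logvar) of q(z|x); sampling z
             across the data distribution approximates the aggregated
             posterior q(z) = E_x[q(z|x)].
    generator: maps latent codes z to outputs, replacing the usual
               z ~ N(0, I) noise source.
    Both modules are hypothetical stand-ins for illustration.
    """
    mu, logvar = encoder(real_batch)                      # q(z|x) parameters
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterized draw
    return generator(z)                                   # fakes from posterior codes
```

Because the generator receives latent codes that already encode data structure, the VAE and GAN halves can each keep their own generator largely intact, which is the "maximally preserves the respective generators" property mentioned above.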
Keywords/Search Tags:feature extraction, max-margin, SG-MCMC, hierarchical probabilistic model, deep topic model, multi-modal learning