Hierarchical Feature Learning Based On Deep Bayesian Generative Networks

Posted on: 2018-05-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y L Cong
Full Text: PDF
GTID: 1368330542992878
Subject: Signal and Information Processing
Abstract/Summary:
With the continuing advance of informatization and intelligent technology, vast amounts of informative data are generated constantly in every part of society. If we can extract (learn) the information (features) these data contain, we can make substantial contributions to both national defense and social welfare. In practice, however, the data take diverse forms, and their most important characteristic is that they are “big”: high-dimensional, large-scale, or both. This makes it very challenging to extract information from big data efficiently. At the same time, the high complexity of big data places greater demands on the descriptive capability of models. Bayesian models, built on solid statistical theory, have several notable advantages for big data processing, among them: their flexibility enables us to construct rich hierarchical models with powerful descriptive ability; Bayesian nonparametrics lets a model adapt its dimensionality to the data; Bayesian inference quantifies the uncertainty of the estimated parameters; Bayesian models alleviate overfitting and so yield robust estimates; and the sequential learning procedure reflected in Bayes' theorem fits practical big data applications naturally. Focusing mainly on feature learning for big data, this dissertation presents research on the design of hierarchical Bayesian models, their inference algorithms, and their extension to big data applications. The main contents of the dissertation are summarized as follows:

1. Radar target recognition plays an irreplaceable role in national defense, and one of its important research directions is target feature extraction from SAR images. If the key features of a target can be extracted from its SAR images, they will greatly help in identifying important targets and may even have a decisive impact on the course of a war. To address this problem, we present a novel attributed scattering center (ASC) feature extraction algorithm for SAR targets based on Lévy random fields in a nonparametric Bayesian framework. The proposed method adaptively extracts concise and physically relevant ASC features, namely the number of ASCs and their associated parameters, from SAR images. The extracted features can provide highly valuable information for radar target recognition.

2. In everyday life, a large amount of information appears as text in various corpora. A reasonable assumption is that the documents in a corpus are built from a set of shared topics. For example, a document on “sport” may be composed of several related topics such as “basketball”, “football”, or “volleyball”. The same kind of topic structure is common in other data forms, such as images and language, so studying how to learn these topics from datasets is valuable. Starting from this basic problem, we further consider hierarchical topic modeling and propose the novel gamma belief network (GBN). Trained with an upward-downward Gibbs sampler that jointly learns its multiple layers, the GBN can infer multilayer deep representations of high-dimensional discrete and nonnegative real vectors. Given a fixed budget on the width of the first layer, the Gibbs sampler combined with the gamma-negative binomial process makes it possible to infer the width of each layer, layer by layer. Experimental results illustrate interesting relationships between the width of the first layer and the inferred network structure, and demonstrate that the GBN can add more layers to improve its performance in both unsupervised feature extraction for classification and prediction of held-out data. For exploratory data analysis, we extract trees and subnetworks from the learned deep networks to visualize how the very specific factors discovered at the first hidden layer relate to the increasingly general factors discovered at deeper hidden layers. We also generate synthetic data by propagating random variables through a learned deep network from the top hidden layer back to the bottom data layer.

3. In the big data era, most practical datasets have enormous numbers of feature dimensions and samples. Traditional batch learning methods, however, must pass through the entire dataset in each iteration, which requires a large amount of memory and computation, and are therefore unsuitable for practical applications. At the same time, big data contain more information and so call for deep models with strong descriptive ability and modeling capacity; the increased model complexity makes extracting information from big data even more challenging. We start with the Poisson gamma belief network (PGBN) and study a highly scalable method for learning its multiple layers jointly and efficiently. Using data augmentation and marginalization techniques, the proposed method turns the PGBN into an alternative representation named deep latent Dirichlet allocation (DLDA), based on which we derive its block-diagonal Fisher information matrix for the first time. Combining the Fisher information matrix with a stochastic gradient MCMC framework, we present topic-layer-adaptive stochastic gradient Riemannian (TLASGR) MCMC, which has topic-specific and layer-specific learning rates and enables multilayer joint learning of the deep PGBN (DLDA) on big data.

4. While developing TLASGR MCMC, we encountered the problem of simulating a multivariate normal (MVN) distribution truncated on the intersection of a set of hyperplanes, a problem that arises widely across research fields. The naive solution, based on Cholesky decomposition with O(V^3) computational complexity, is so expensive that it may not be suitable for practical problems. To solve this basic problem, we introduce a fast and easy-to-implement simulation algorithm for hyperplane-truncated MVN distributions. The proposed method exploits the structure hidden in the problem and can reduce the complexity to O(V) in some common cases, which greatly accelerates simulation. We further generalize it to efficiently simulate random variables from a multivariate normal distribution with a structured covariance/precision matrix, and verify its efficiency experimentally.
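
The hyperplane-truncated simulation problem in item 4 can be illustrated with a short sketch. It uses the standard sample-then-project construction for drawing x ~ N(mu, Sigma) subject to G x = r: draw an unconstrained y ~ N(mu, Sigma), then return y + Sigma G^T (G Sigma G^T)^{-1} (r - G y). Whether this matches the dissertation's algorithm in every detail is an assumption; the function names, the symbols G and r, and the choice of NumPy are illustrative and do not come from the abstract. The second function shows one case where the cost drops to O(V): a diagonal covariance with a single sum constraint.

```python
import numpy as np

def sample_hyperplane_truncated_mvn(mu, Sigma, G, r, rng):
    """Draw x ~ N(mu, Sigma) conditioned on G @ x = r (illustrative sketch)."""
    # Step 1: unconstrained draw y ~ N(mu, Sigma).
    y = rng.multivariate_normal(mu, Sigma)
    # Step 2: shift y onto the hyperplanes {x : G x = r}.
    SGt = Sigma @ G.T
    return y + SGt @ np.linalg.solve(G @ SGt, r - G @ y)

def sample_sum_constrained_diag(mu, sigma2, total, rng):
    """O(V) special case: Sigma = diag(sigma2), one constraint sum(x) = total."""
    # Independent coordinates, so the unconstrained draw costs O(V).
    y = mu + np.sqrt(sigma2) * rng.standard_normal(mu.shape)
    # With G = 1^T, Sigma G^T is sigma2 and G Sigma G^T is sigma2.sum().
    return y + sigma2 * (total - y.sum()) / sigma2.sum()

# Usage with made-up numbers (hypothetical, for illustration only).
rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 2.0])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 1.5]])
G = np.array([[1.0, 1.0, 1.0]])   # single hyperplane: x1 + x2 + x3 = 1
r = np.array([1.0])
x_general = sample_hyperplane_truncated_mvn(mu, Sigma, G, r, rng)
x_diag = sample_sum_constrained_diag(mu, np.array([1.0, 2.0, 0.5]), 1.0, rng)
```

In this sketch the general function still draws the unconstrained sample through a dense factorization, so it does not itself achieve the O(V) cost; only the diagonal special case does, because both the draw and the correction reduce to elementwise operations.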
Keywords/Search Tags: Bayesian nonparametrics, Attributed scattering center (ASC), Topic modeling, Multilayer representation, Fisher information matrix (FIM), Stochastic gradient Markov chain Monte Carlo (SG-MCMC), Equality constraints, Structured covariance/precision matrix