Font Size: a A A

Structure Learning Of Graphical Models With Latent Variables Via Sparse And Low-Rank Decomposition

Posted on:2024-09-15Degree:DoctorType:Dissertation
Country:ChinaCandidate:Q Z ZhengFull Text:PDF
GTID:1520307295987719Subject:Statistics
Abstract/Summary:
As a combination of graph theory and probability theory,graphical models provide a natural tool for analyzing and modelling the high-dimensional and complex data,and they are used in a wide range of applications such as genomics,computer vision and natural language processing.A major task involving graphical models is structure learning,which intends to estimate a graph that describes the conditional independence relationships among random variables from a given data set.The graphs are generally assumed to be sparse.However,in many real scenarios in fields such as psychology and economics,the data may come from highly interconnected systems where most interdependencies among random variables can be explained by a few unobserved latent variables,and thus cannot be fit well with sparse graphical models.By incorporating the essential latent variables,graphical models can provide a clear representation of the underlying background knowledge of the data,capture the data’s generation mechanism and reduce the model complexity.Nevertheless,as the number of latent variables and the structural relationships between latent and observed variables are generally unknown,graphical model structure learning with latent variables is challenging.A breakthrough approach is the penalized likelihood method based on sparse and low-rank decomposition[1].The penalties are usually convex including the l1 norm for the sparse matrix and the nuclear norm for the low-rank matrix,where the sparse matrix represents the conditional independence among observed variables given latent ones and the low-rank matrix encodes the effect of latent variables on observed ones.This approach is able to simultaneously estimate the graphical structure among observed variables and the number of latent variables.In addition to the structure learning of graphical models with latent variables,sparse and low-rank decomposition is also widely used in video processing,image processing and other fields.For the common problem of sparse and lowrank decomposition,a generalized nonconvex nonsmooth model is considered in this thesis and a unified framework is proposed to solve it.On the basis of sparse and low-rank decomposition,further studies are conducted on structure learning of undirected graphical models and directed acyclic graphical(DAG)models in the presence of latent variables.The main contributions are summarized as follows.1.The optimization problem of generalized nonconvex nonsmooth sparse and low-rank decomposition model is considered.Based on the majorization-minimization(MM)algorithm,we propose a unified MM-ADMM algorithm framework and prove its convergence.In the sparse and low-rank decomposition problem,the convex model employing the l1 norm and the nuclear norm is commonly used and computationally tractable,but there are certain limitations.Compared with convex penalties,nonconvex penalties approximate the l0 norm and rank function better.However,different nonconvex penalties may lead to different solution forms,making it less practical to apply.In our generalized nonconvex nonsmooth model,the penalties on sparse and low-rank matrices cover a series of nonconvex surrogate functions of the l0 norm and rank function,which are required to be continuous,concave and monotonically increasing on[0,+∞).such as lp(0<p<1),SCAD,Logarithm and MCP.The MM-ADMM algorithm is applicable to all eligible surrogates equipped with supergradients.The experiments on simulation data and computer low level vision tasks(including image denoising and video foreground and background separation)demonstrate the effectiveness of the MM-ADMM algorithm.2.Considering the structure learning problem of undirected graphical models with latent variables,we improve the penalized likelihood method in Chandrasekaran et al.[1]and propose an adaptive penalized likelihood method,which efficiently reduces the parameter estimation bias and promotes the estimation accuracy of graphical structure.Chandrasekaran et al.[1]uses non-adaptive convex penalties including the l1 norm and the nuclear norm.Although convex optimization problem is computationally tractable,previous studies have shown that these two non-adaptive penalties introduce biases in the estimation of sparse and low-rank matrices,respectively.To address this issue,we impose the adaptive lasso penalty on the sparse matrix and the adaptive nuclear norm penalty on the low-rank matrix.The minimization of the negative adaptive penalized likelihood is nonconvex and nonsmooth,nevertheless the estimates of sparse and low-rank matrices with closed form can be obtained by using the alternating direction method of multipliers(ADMM).Simulation studies show that the adaptive penalties perform better than the non-adaptive penalties in terms of both structure learning and parameter estimation.Considering the gene networks with latent variables,we apply the adaptive penalized likelihood method to the data set about riboflavin production with Bacillus subtilis.3.Considering the structure learning problem of DAG models with latent variables,we propose the SLRO-BCD method based on sparse and low-rank decomposition,and introduce a modified EBIC(mEBIC)criterion based on the EBIC in order to select the optimal penalty parameters in SLRO-BCD.Existing methods are either inapplicable to the random variables with unknown orders or unable to simultaneously estimate the graphical structure and the number of latent variables,but our SLROBCD method can simultaneously deal with these two limitations.The penalized likelihood function in SLRO-BCD includes the l1 norm imposed on the sparse matrix and the nuclear norm on the low-rank matrix,whose minimization problem is solved by the block coordinate descent(BCD)algorithm.Compared with EBIC,mEBIC considers the degrees of freedom for both sparse and low-rank matrices,and greatly improves the recovery performance of SLRO-BCD in terms of graphical structure and number of latent variables.Simulation studies show that the method of SLRO-BCD combined with mEBIC is able to estimate almost all true edges in the skeleton of graph,and compared to LRp S+GES,it can accurately estimate the number of latent variables.We apply the SLRO-BCD to analyze the gene expression data set related to Arabidopsis thaliana.
Keywords/Search Tags:Sparse and low-rank decomposition, Latent variables, Undirected graphical models, Directed acyclic graphical models, Structure learning
Related items