
Bayesian Methods For Variable Selection Problems

Posted on: 2014-09-07 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: J Yuan | Full Text: PDF
GTID: 1260330425962088 | Subject: Probability theory and mathematical statistics
Abstract/Summary:
Because of the flexibility of its statistical inference, the Bayesian approach has attracted a growing body of research. In recent years especially, various new sampling techniques have been developed and the speed of high-performance computers has continued to increase, which greatly facilitates Bayesian computation in a variety of applications. This dissertation focuses on several popular and practical topics, namely the Lasso variable selection method and variable selection problems in linear mixed models, treated via Bayesian methods.

The Lasso is a popular technique for simultaneous estimation and variable selection in many research areas. When the regression coefficients have independent Laplace priors, the marginal posterior mode of the regression coefficients is equivalent to the estimate given by the non-Bayesian Lasso. Current approaches either carry out a fully Bayesian analysis using a Markov chain Monte Carlo (MCMC) algorithm or use Monte Carlo EM (MCEM) methods with an MCMC algorithm in each E-step. However, MCMC-based Bayesian methods suffer from a heavy computational burden and slow convergence. To deal with these problems, we propose two new algorithms, based on a non-iterative sampling technique via the inverse Bayes formulae, that effectively solve the Bayesian Lasso problem.

Mixed effects models are generally used to characterize repeated-measurement and/or longitudinal data, which are common in biomedical applications and econometrics. In practice, longitudinal data are often unbalanced or incomplete; that is, not all individuals are observed at the same number of time points or with the same design matrix X, and the number of observations per individual may vary. Models that accommodate the unbalanced nature of longitudinal data and have more parsimonious covariance structures therefore need to be considered. To address these problems, we consider models for such longitudinal data that contain both individual random-effects components and within-individual errors that follow an autoregressive AR(1) time series process.

This dissertation consists of four chapters. Its main innovations are organized as follows.

Chapter 1 describes the background of our research, discusses why we chose these topics, and gives brief introductions to several important techniques applied in our methods.

Tan (2007) [70] proposed a non-iterative sampling approach, the inverse Bayes formulae (IBF) sampler, for computing posteriors of a hierarchical model within the structure of Monte Carlo EM (MCEM). Motivated by that paper, in Chapter 2 we develop the IBF sampler within the MCEM structure to obtain the marginal posterior modes of the regression coefficients for the Bayesian Lasso, adjusting the importance-sampling weights when the full conditional distribution is not explicit. Simulation experiments show that our EM-based method greatly reduces the computational time, and that our methods behave comparably with other Bayesian Lasso methods in both prediction accuracy and variable selection accuracy, and perform even better when the sample size is relatively large.
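To make the equivalence noted at the start of this abstract concrete (a standard identity, written here in generic notation rather than the dissertation's), assume a Gaussian likelihood and independent Laplace priors; the negative log-posterior is then, up to an additive constant, a rescaled Lasso objective, so the posterior mode is a Lasso estimate:

```latex
% Likelihood: y | \beta \sim N(X\beta, \sigma^2 I);
% prior:      p(\beta_j) = (\lambda/2)\, e^{-\lambda |\beta_j|}, independently.
\[
  -\log p(\beta \mid y)
  = \frac{1}{2\sigma^2}\,\lVert y - X\beta \rVert_2^2
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
    + \text{const},
\]
% so the posterior mode solves the Lasso problem
\[
  \hat{\beta}
  = \arg\min_{\beta}\;
    \lVert y - X\beta \rVert_2^2 + 2\sigma^2\lambda\,\lVert \beta \rVert_1
\]
% with penalty parameter 2\sigma^2\lambda.
```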
In Chapter 3 we again consider the Bayesian Lasso, but unlike in Chapter 2, the approach we propose there is a fully Bayesian analysis based on a non-iterative sampling algorithm. The Bayesian Lasso has two major advantages. First, as a Bayesian method, distributional results for the estimates are straightforward, making statistical inference easier. Second, it solves the penalty-parameter selection problem simultaneously. We first give an expectation-maximization (EM) algorithm to estimate the posterior mode of each regression coefficient. Tan (2003) [68] proposed a non-iterative sampling approach, the inverse Bayes formulae (IBF) sampler, for computing posteriors within the structure of the EM algorithm, and Tan (2006) [69] developed the IBF sampler within the MCEM structure for a hierarchical model with repeated binary data. Inspired by these two papers, we develop a non-iterative sampling method that obtains an approximately i.i.d. sample from the posterior by combining the inverse Bayes formulae, sampling/importance resampling, and the posterior mode estimates from the EM algorithm to solve the Bayesian Lasso problem. This solution not only provides interval estimates (Bayesian credible intervals) that can guide variable selection, as MCMC-based methods such as the Gibbs sampler do, but also eliminates the convergence problems of Markov chain Monte Carlo methods. The performance of this algorithm, evaluated through a simulation study, shows that our method performs competitively with, and often better than, existing Bayesian Lasso methods.

In Chapter 4 we address the important problem of how to select the random-effects components in a linear mixed model with AR(1) errors. A hierarchical Bayesian model is used to identify any random effect with zero variance. With the help of a modified Cholesky decomposition of the covariance matrix, we reparameterize the mixed model so that functions of the covariance parameters of the random-effects distribution are incorporated as regression coefficients on standard normal latent variables. This facilitates the use of normal conjugate priors. We implement the posterior computation via a Markov chain Monte Carlo algorithm.
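To indicate the shape of the reparameterization used in Chapter 4, the sketch below gives one standard modified Cholesky form for a random-effects covariance; the dissertation's exact parameterization may differ in detail:

```latex
% Random effects: b_i \sim N(0, \Sigma).  Decompose the covariance as
\[
  \Sigma = \Lambda\,\Gamma\,\Gamma^{\top}\Lambda, \qquad
  \Lambda = \operatorname{diag}(\lambda_1,\dots,\lambda_q),\ \lambda_l \ge 0,
\]
% with \Gamma lower triangular and ones on its diagonal.  Equivalently,
\[
  b_i = \Lambda\,\Gamma\,z_i, \qquad z_i \sim N(0, I_q),
\]
% so the free elements of \Lambda and \Gamma enter the model as regression
% coefficients on the standard normal latent variables z_i (admitting normal
% conjugate priors), and \lambda_l = 0 gives the l-th random effect zero
% variance, i.e. removes it from the model.
```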
Keywords/Search Tags: Bayesian Analysis, Monte Carlo, Inverse Bayes Formulae, Linear Mixed Model