
Bayesian Analysis Of Semi-parametric Latent Variable Model And The Model With Missing Data

Posted on: 2020-07-25
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z H Ma
Full Text: PDF
GTID: 1480306455467504
Subject: Statistics
Abstract/Summary:
Multiple responses of mixed data types are common in scientific research and practice. If responses of different types are analyzed separately, the resulting estimates may be less efficient. Joint analysis is therefore preferred, because it accounts for the dependence among responses of mixed types. In this dissertation, a latent variable model is used to build the joint distribution of the mixed responses: the responses are assumed to be conditionally independent given a shared latent variable, which yields a joint model for the mixed responses. The shared latent variable is usually assumed to be normally distributed, which may not be suitable in some circumstances. The first main contribution of this dissertation is to relax the normality assumption by assigning a Dirichlet process (DP) prior to the latent variable, which gives a more flexible semi-parametric latent variable model. The effectiveness of this DP-based semi-parametric latent variable model is verified through extensive simulation studies.

Building on the DP-based semi-parametric latent variable model, we additionally consider missing data. Missing data arise in many settings, including non-response in survey sampling and dropout in clinical trials. For mixed responses with missing data, we build a DP-based semi-parametric latent variable model under the Selection Model framework. Under this framework, in addition to the semi-parametric latent variable model that serves as the response model, a distribution for the missing covariates and a model for the missing data mechanism are specified. Simulation studies are again used to verify the precision and efficiency of the proposed model.

Next, for mixed responses with missing data, we replace the DP distribution of the latent variable with a DP mixtures distribution. Because a DP is discrete, it is inappropriate in some situations, for example when the distribution of the latent variable is known to be continuous. To overcome this limitation, we use a DP mixtures distribution for the latent variable, which suits a wide range of continuous distributions, and propose a semi-parametric latent variable model based on DP mixtures with missing data. Missing data are again handled under the Selection Model framework, and simulation studies are used to verify the efficiency of the proposed models and methods.

Parameter estimation and model analysis are carried out under a Bayesian framework. The reasons for using a Bayesian approach are: i) under a Bayesian framework, complex models can be expressed as hierarchical models, and the semi-parametric latent variable models in this dissertation are built by changing the prior distribution of the latent variable, that is, by replacing the normal prior with a DP prior or a DP mixtures prior; ii) a Bayesian approach handles missing data naturally, and the Selection Model framework used here is easily implemented with it; iii) Markov chain Monte Carlo (MCMC) algorithms are used for parameter estimation, and developments in software and computational techniques have made posterior estimation easier. DIC, LPML and other model comparison criteria are used for model selection. We also introduce sensitivity analysis under the Bayesian framework to ensure the completeness and reliability of the analysis.

Finally, survey data from the Chinese General Social Survey (CGSS) and the China Health and Nutrition Survey (CHNS) are used to illustrate the practicability of the proposed models and methods and to serve as a reference. For the CGSS data, a semi-parametric latent variable model based on a DP prior is built to jointly analyze mixed continuous and ordinal responses. DIC and LPML are used for model comparison, and Bayesian approaches are used for parameter estimation. For the CHNS data, missing data are considered as well, and semi-parametric latent variable models with missing data, based on a DP prior or a DP mixtures prior, are fitted. LPML is used to select the prior of the latent variable, while the DIC based on the missing data mechanism model is used to select the missing data mechanism model. Parameter estimation and sensitivity analysis are conducted with a Bayesian approach. These applications show that the proposed models and methods are usable and perform well on real datasets.
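The contrast between a discrete DP prior and a continuous DP mixtures prior for the shared latent variable can be illustrated with a small simulation. The following is only a minimal sketch, not the dissertation's model or code: it assumes a truncated stick-breaking representation of the DP, a standard normal base measure, a normal kernel for the mixture, and hypothetical regression coefficients and cutpoints for one continuous and one ordinal response. All function names and numeric values are illustrative assumptions.

```python
import numpy as np


def stick_breaking_weights(alpha, n_atoms, rng):
    """Truncated stick-breaking weights for a DP with concentration alpha."""
    betas = rng.beta(1.0, alpha, size=n_atoms)
    betas[-1] = 1.0  # close the stick so the truncated weights sum to 1
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    weights = betas * remaining
    return weights / weights.sum()  # guard against floating-point drift


def draw_latent(n, alpha, rng, n_atoms=50, kernel_sd=0.0):
    """Draw n latent values from a truncated DP (kernel_sd == 0)
    or from a DP mixture of normals (kernel_sd > 0)."""
    weights = stick_breaking_weights(alpha, n_atoms, rng)
    atoms = rng.normal(0.0, 1.0, size=n_atoms)  # base measure N(0, 1): an assumption
    labels = rng.choice(n_atoms, size=n, p=weights)
    z = atoms[labels]
    if kernel_sd > 0:
        # Smoothing each atom with a normal kernel makes the draws continuous.
        z = z + rng.normal(0.0, kernel_sd, size=n)
    return z


def simulate_mixed_responses(z, rng):
    """One continuous and one ordinal response that are conditionally
    independent given the shared latent variable z (hypothetical parameters)."""
    y_cont = 1.0 + 0.8 * z + rng.normal(0.0, 0.5, size=z.shape)
    underlying = 0.6 * z + rng.normal(0.0, 1.0, size=z.shape)
    cutpoints = np.array([-0.5, 0.5])  # hypothetical thresholds, 3 categories
    y_ord = np.digitize(underlying, cutpoints)  # categories 0, 1, 2
    return y_cont, y_ord


rng = np.random.default_rng(0)
weights = stick_breaking_weights(1.0, 50, rng)
z_dp = draw_latent(500, alpha=1.0, rng=rng)                  # discrete: many ties
z_dpm = draw_latent(500, alpha=1.0, rng=rng, kernel_sd=0.3)  # continuous: no ties
y_cont, y_ord = simulate_mixed_responses(z_dp, rng)

print("distinct latent values under DP:        ", len(np.unique(z_dp)))
print("distinct latent values under DP mixture:", len(np.unique(z_dpm)))
```

Under the DP draw, many subjects share exactly the same latent value, which is the discreteness the abstract refers to. Adding a kernel removes the ties while keeping the clustering structure. Fitting such models to data would additionally require an MCMC sampler, which this sketch does not attempt.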
Keywords/Search Tags: Semi-parametric latent variable model, Missing data, Bayesian analysis, Mixed responses data