A Study Of Instrument Approach For Nonignorable Missing Data

Posted on: 2021-04-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Chen
GTID: 1360330629980824
Subject: Statistics
Abstract/Summary:
Nonresponse, or missing data, is a common phenomenon in many areas. Nonignorable nonresponse, a response mechanism that depends on the values of the variable subject to nonresponse, is the most difficult type of nonresponse to handle. When the response mechanism is nonignorable, Robins and Ritov (1997) showed that, in order to identify all unknown parameters, either the propensity function or the original data distribution must have a parametric component. Even when a parametric component exists, the parametric model is identifiable only under some assumptions, and the identifiability issue remains difficult to handle. In recent years, a statistical tool called a "nonresponse instrument" or "shadow variable" (Wang et al., 2014; Zhao and Shao, 2015; Miao and Tchetgen Tchetgen, 2016) has been developed to address the model identification issue. Specifically, a nonresponse instrument is a covariate vector that can be excluded from the nonresponse propensity but remains a useful covariate even after conditioning on the other covariates; it can help identify unknown parameters under some conditions (Wang et al., 2014; Zhao and Shao, 2015). The purpose of this PhD thesis is to develop instrument-based statistical methods for handling nonignorable missing data under some specific settings, and to study their theoretical properties. The main contributions are as follows.

Existing work on the instrumental approach assumes that an instrument is given, which is frequently not the case in applications. Therefore, we first investigate how to search for an instrument within a given set of covariates. The estimation method we apply is the pseudo likelihood proposed by Tang et al. (2003) and Zhao and Shao (2015), which assumes that an instrument is given, that the distribution of the response given the covariates is parametric, and that the propensity is nonparametric. Thus, in addition to the challenge of searching for an instrument, we also need to perform variable selection and model selection simultaneously. We propose a method for instrument, variable, and model selection and show that, under some regularity conditions, it produces consistent instrument and model selection as the sample size tends to infinity.

Next, we consider the nonignorable nonresponse problem in the presence of high-dimensional covariates. When the original data distribution has a parametric component but the propensity function is nonparametric, Tang et al. (2003) and Zhao and Shao (2015) proposed a pseudo likelihood approach based on a nonresponse instrument to estimate the unknown parameters in a parametric density. The pseudo likelihood involves estimating the joint density of the covariates. To avoid model misspecification or instability of kernel estimation, we estimate the density nonparametrically and apply sufficient dimension reduction to reduce the dimension of the covariates for efficient estimation. Consistency and asymptotic normality of the proposed estimators are established.

Then, we consider statistical inference for unknown parameters in estimating equations when some covariates have nonignorably missing values. When an instrument is available, the conditional distribution of the missing covariates given the other covariates can be estimated by the pseudo likelihood method of Zhao and Shao (2015) and used to construct unbiased estimating equations. These modified estimating equations then form a basis for valid inference by empirical likelihood. Our method is applicable to a wide range of estimating equations used in practice. It is semiparametric, since no parametric model is assumed for the propensity of the missing covariate data. Asymptotic properties of the proposed estimator and of the empirical likelihood ratio statistic are derived.

Finally, we again consider statistical inference for unknown parameters in estimating equations, this time when the response variable has nonignorably missing values. Using the cutting-edge techniques of nonresponse instruments, a parametric response propensity function can be identified and estimated. A semiparametric likelihood is then constructed, with the propensity function, the estimating equations, and auxiliary information incorporated into the constraints to make the inference valid and improve estimation efficiency. Asymptotic distributions of the resulting parameter estimates are derived.

Simulation results and real-data examples are presented to show that the proposed methods give promising results.
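
The role of the instrument in the pseudo likelihood approach mentioned above can be sketched as follows. This is only an illustration in notation assumed here, not taken from the thesis: y is the response, r indicates whether y is observed, and the fully observed covariates are x = (u, z), where z is the nonresponse instrument. The key assumption is that the propensity does not depend on z once y and u are given.

```latex
% Assumed notation (not from the thesis): y response, r = 1 if y is observed,
% x = (u, z) fully observed covariates, z the nonresponse instrument.
% Instrument condition: P(r = 1 | y, u, z) = P(r = 1 | y, u).
\begin{align*}
p(z \mid y, u, r = 1)
  &= \frac{P(r = 1 \mid y, u, z)\, p(z \mid y, u)}{P(r = 1 \mid y, u)}
   = p(z \mid y, u) \\
  &= \frac{f(y \mid u, z; \theta)\, p(z \mid u)}
          {\int f(y \mid u, z'; \theta)\, p(z' \mid u)\, dz'} .
\end{align*}
% The unspecified propensity cancels. Dropping the factor p(z_i | u_i),
% which does not involve theta, gives a pseudo likelihood over respondents:
\[
\prod_{i:\, r_i = 1}
  \frac{f(y_i \mid u_i, z_i; \theta)}
       {\int f(y_i \mid u_i, z; \theta)\, d\hat{G}(z \mid u_i)} ,
\]
% where \hat{G}(z | u) is an estimate of the distribution of z given u,
% computed from all sampled units because covariates are always observed.
```

Under this sketch, the need to estimate the distribution of the covariates is what motivates the dimension reduction step when the covariates are high-dimensional, and θ is identifiable only if z remains associated with y given u.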
Keywords/Search Tags: Missing data, Nonignorable nonresponse, Nonresponse instrument, Variable selection, Model selection, Pseudo likelihood, Estimating equation