Semiparametric maximum likelihood estimation in parametric regression with missing covariates

Posted on: 2004-11-15    Degree: Ph.D    Type: Dissertation
University: University of Pittsburgh    Candidate: Zhang, Zhiwei    Full Text: PDF
GTID: 1460390011477354    Subject: Biology
Abstract/Summary:
Parametric regression models are widely used in public health sciences. This dissertation is concerned with statistical inference under such models when some covariates are missing at random. Under natural conditions, parameters remain identifiable from the observed (reduced) data. If the always-observed covariates are discrete or can be discretized, we propose a semiparametric maximum likelihood method that requires no parametric specification of the selection mechanism or the covariate distribution. Simple conditions are given under which the semiparametric maximum likelihood estimator (MLE) exists. For ease of computation, we also consider a restricted MLE, which maximizes the likelihood over covariate distributions supported by the observed values. The two MLEs are asymptotically equivalent and strongly consistent for a class of topologies on the parameter set. Upon normalization, they converge weakly to a zero-mean Gaussian process in a suitable space. The MLE of the regression parameter, in particular, achieves the semiparametric information bound, which can be consistently estimated by perturbing the profile log-likelihood. Furthermore, the profile likelihood ratio statistic is asymptotically chi-squared. An EM algorithm is proposed for computing the restricted MLE and for variance estimation. Simulation results suggest that the proposed method performs reasonably well in moderate-sized samples. In contrast, the analogous parametric maximum likelihood method is subject to severe bias under model misspecification, even in large samples. The proposed method can be applied to related statistical problems.
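To make the setting concrete, the following is a rough sketch of the kind of observed-data likelihood the abstract describes. The notation is an assumption introduced here for illustration and is not taken from the dissertation: $Y$ is the outcome, $X$ the covariate that may be missing, $Z$ the always-observed discrete covariate, $R$ the indicator that $X$ is observed, $f_\theta(y \mid x, z)$ the parametric regression model, and $g_z$ the unspecified distribution of $X$ given $Z = z$. Under missingness at random (selection may depend on $(Y, Z)$ but not on $X$), the selection mechanism factors out of the likelihood. For the restricted MLE, $g_z$ is taken to be a probability mass function on $S_z$, the set of values of $X$ observed among complete cases with $Z = z$.

```latex
% Observed-data log-likelihood (illustrative notation, not the dissertation's own)
\ell_n(\theta, g)
  = \sum_{i=1}^{n} \Big[
      R_i \big\{ \log f_\theta(Y_i \mid X_i, Z_i) + \log g_{Z_i}(X_i) \big\}
      + (1 - R_i) \log \sum_{x \in S_{Z_i}} f_\theta(Y_i \mid x, Z_i)\, g_{Z_i}(x)
    \Big]

% A generic E-step weight for an incomplete case (R_i = 0) and x in S_{Z_i}:
% the conditional probability that X_i = x given (Y_i, Z_i) at the current (\theta, g)
w_i(x)
  = \frac{ f_\theta(Y_i \mid x, Z_i)\, g_{Z_i}(x) }
         { \sum_{x' \in S_{Z_i}} f_\theta(Y_i \mid x', Z_i)\, g_{Z_i}(x') }
```

In an EM iteration of this generic form, the M-step would update $\theta$ by a weighted fit of the regression model and update each $g_z$ from the complete-case counts plus the accumulated weights. The dissertation's actual algorithm may differ in its details.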
Keywords/Search Tags:Semiparametric maximum likelihood, Regression, Method, MLE