Statistical Inference For Two Classes Of Statistical Models With Missing Data

Posted on: 2011-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: T Qiu
Full Text: PDF
GTID: 2120360305477935
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Item non-response occurs frequently in practice: in opinion polls, market research surveys, medical studies and other scientific experiments. In such circumstances, the usual inferential procedures for complete data sets cannot be applied directly, and the data must be treated before standard statistical approaches can be used. A common method is to impute a value for each missing response in order to obtain a 'complete' data set and then apply standard statistical methods. Statistical inference for missing data is an important research field (e.g. Little & Rubin, Statistical analysis with missing data [M], New York: John Wiley & Sons, 2002). In the study of nonparametric regression models with missing data, commonly used imputation approaches include linear regression imputation, semiparametric regression imputation and nonparametric regression imputation. Cheng (Nonparametric estimation of mean functionals with data missing at random, J. Amer. Statist. Assoc., 89, 81-87) applied kernel regression imputation to obtain 'complete' data and discussed asymptotic normality of the estimator of the mean of the response variable in a nonparametric regression model with random designs. Wang and Rao (Empirical likelihood for linear regression models under imputation for missing responses [J], The Canadian Journal of Statistics, 2001, 29(4): 597-608) applied deterministic imputation to obtain 'complete' data and constructed the empirical likelihood confidence interval for the regression coefficient in a linear regression model with random designs. Wang and Rao (Empirical likelihood-based inference under imputation for missing response data, Ann. Statist., 30, 896-924) applied kernel regression imputation to obtain 'complete' data and established the empirical likelihood confidence interval for the mean of the response variable in a nonparametric regression model with random designs.
They use a regression imputation method to fill in missing data, construct an EL statistic based on the 'complete' data after imputation, and show that the EL statistic has a limiting distribution that is a weighted sum of chi-squared variables with unknown weights. An adjusted EL is therefore needed to obtain a confidence region for the regression coefficient, and the adjustment coefficient has to be estimated, which leads to a loss of accuracy of the confidence region.
In Chapter 2 of this thesis, inference in a nonparametric regression model with random design points and missing data is studied. We develop nonparametric regression imputation and inverse probability weighted approaches to estimate the nonparametric regression function $m(x_0)$ for fixed $x_0 \in \mathbb{R}^p$ in a nonparametric regression model with random design points. Asymptotic normality of the estimators is established, which is used to construct normal-approximation-based confidence intervals on the parametric and nonparametric parts.
In Chapter 3 of this thesis, empirical likelihood (EL) ratio statistics for the nonparametric regression function $m(x_0)$ for fixed $x_0 \in \mathbb{R}^p$ in a nonparametric regression model with random design points are constructed based on the inverse probability weighted imputation approach; these statistics asymptotically have chi-squared distributions. These results are used to obtain EL-based confidence intervals (regions) on the parametric and nonparametric parts without adjustment, which can improve the accuracy of the confidence intervals (regions).
Comparing differences between populations is an important research topic in medicine, economics and education. Qin, Y.S.
& Zhao, L.C. (Semi-parametric likelihood confidence intervals for various differences of two populations [J], Statistics and Probability Letters, 1997, 33(2): 135-143; Empirical likelihood confidence intervals for quantile differences of two populations [J], Chinese Ann Math (Ser A), 1997, 18(6): 687-694; Semi-empirical likelihood confidence intervals for quantile differences of two samples [J], Acta Mathematicae Applicatae Sinica, 1998, 21(1): 103-112; Empirical likelihood ratio confidence intervals for various differences of two populations [J], System Science and Mathematical Sciences, 2000, 13: 23-30) developed a systematic theory for constructing EL confidence intervals for various differences of two populations under complete data. Qin, Y.S. & Zhang, S.C. (Empirical likelihood confidence intervals for differences between two datasets with missing data [J], Pattern Recognition Letters, 2008, 29(6): 803-812) constructed adjusted EL confidence intervals for differences of two nonparametric populations under the MCAR missing mechanism, using a (single) random imputation method to fill in missing data. In Chapter 4 of this thesis, we assume that X, Y are missing at random (MAR) and propose constructing EL confidence intervals for various differences of two populations using the 'complete' data obtained after inverse probability weighted imputation. The EL ratio statistic based on this imputation has a limiting $\chi^2_1$ distribution, so no adjustment is needed in constructing confidence intervals, which would improve the accuracy of the EL confidence intervals.
We summarize the new findings of this work:
1. Inference in a nonparametric regression model with random design points and missing data is studied for the first time. We develop nonparametric regression imputation and inverse probability weighted approaches to estimate the nonparametric regression function $m(x_0)$ for fixed $x_0 \in \mathbb{R}^p$ in a nonparametric regression model.
Asymptotic normality of the estimators is established, which is used to construct normal-approximation-based confidence intervals (regions) for $m(x_0)$ for fixed $x_0 \in \mathbb{R}^p$.
2. In studying the construction of confidence intervals (regions) for the nonparametric regression function $m(x_0)$ for fixed $x_0 \in \mathbb{R}^p$ in a nonparametric regression model with random design points and missing data, and for the differences of two nonparametric populations in two linear regression models with random design points and missing data, we use the inverse probability weighted imputation approach. Based on this imputation approach, EL ratio statistics for $m(x_0)$ in a nonparametric regression model and for various differences of two populations in two linear regression models with random design points and missing data are constructed, which asymptotically have chi-squared distributions. These results are used to obtain EL-based confidence intervals (regions) on the parametric and nonparametric parts without adjustment, which can improve the accuracy of the confidence intervals (regions).
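The inverse probability weighted imputation idea described above can be illustrated with a minimal Python sketch. It is not the thesis's estimator: it assumes a scalar covariate (p = 1), a Gaussian kernel, a single bandwidth `h` for every smoothing step, and a commonly used form of the imputed response, Y~_i = d_i Y_i / pi(X_i) + (1 - d_i / pi(X_i)) m_c(X_i), where d_i indicates whether Y_i is observed, pi is a kernel estimate of the response probability and m_c is the complete-case kernel regression estimate. The function names and the clipping of the estimated probabilities are illustrative choices.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard Gaussian kernel."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def nw_smooth(x_eval, x, values, weights, h):
    """Nadaraya-Watson smoother: kernel-weighted average of `values` at x_eval.

    `weights` multiplies the kernel weights (e.g. response indicators, so that
    only complete cases contribute).
    """
    x_eval = np.atleast_1d(x_eval)
    k = gaussian_kernel((x_eval[:, None] - x[None, :]) / h) * weights[None, :]
    return (k @ values) / k.sum(axis=1)


def ipw_imputed_estimate(x0, x, y, delta, h):
    """Estimate m(x0) from data with responses missing at random.

    x     : 1-D array of covariates (always observed)
    y     : 1-D array of responses; entries with delta == 0 are ignored
    delta : 1-D array, 1 if the response is observed and 0 otherwise
    h     : bandwidth (assumed to be chosen by the user)
    """
    ones = np.ones_like(x, dtype=float)
    y_filled = np.where(delta == 1, y, 0.0)  # drop NaNs at missing responses

    # Kernel estimate of the response probability pi(X_i) = P(delta = 1 | X_i).
    pi_hat = nw_smooth(x, x, delta.astype(float), ones, h)
    pi_hat = np.clip(pi_hat, 1e-3, 1.0)  # numerical safeguard (an assumption)

    # Complete-case kernel regression estimate m_c(X_i).
    m_cc = nw_smooth(x, x, y_filled, delta.astype(float), h)

    # Inverse probability weighted imputed responses.
    y_tilde = delta * y_filled / pi_hat + (1.0 - delta / pi_hat) * m_cc

    # Smooth the imputed 'complete' data at the target point x0.
    return nw_smooth(x0, x, y_tilde, ones, h)[0]
```

A call such as `ipw_imputed_estimate(0.5, x, y, delta, h=0.1)` returns a point estimate of m(0.5). A confidence interval would additionally need a variance estimate for the normal approximation or an EL ratio statistic, neither of which is sketched here.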
Keywords/Search Tags: nonparametric regression model, missing data, random design point, MAR missing mechanism, confidence interval