
Statistical Inference For Probability Density Function With Missing Data

Posted on: 2011-11-17
Degree: Master
Type: Thesis
Country: China
Candidate: Q M Li
GTID: 2120360305478006
Subject: Probability theory and mathematical statistics
Abstract/Summary:
In practice, some data may be missing for various reasons, such as the unwillingness of some sampled units to supply the desired information, loss of information caused by uncontrollable factors, or failure on the part of the investigator to gather correct information. This happens in opinion polls, market research surveys, medical studies and other scientific experiments. In such circumstances, the usual inferential procedures for complete data sets cannot be applied directly, and the data must be treated before standard statistical approaches can be used. One way to handle the missing data problem is the "complete-case" method, which deletes all cases with missing values and then applies standard statistical methods. Imputation is another common method: a value is imputed for each missing response to obtain a "complete sample", to which standard statistical methods are then applied. Statistical inference with missing data is an important research field (e.g. Little and Rubin, Statistical Analysis with Missing Data [M], New York: John Wiley and Sons, 2002).

Because of the complexity of objective reality, the assumption that a population belongs to a particular parametric family may not hold. To obtain good estimates of population characteristics, we need to estimate the probability density directly from the sample data.

There is a large body of results on probability density estimation. For complete data, Rosenblatt, Parzen, Loftsgaarden & Quesenberry, Wahba, Silverman, Devroye, and Devroye & Györfi, among others, have studied probability density estimation extensively. For probability density estimation with missing covariate data, Robins et al. (Semiparametric efficient estimation of a density with missing or mismeasured covariates [J]. J. Roy. Statist. Soc. Ser. B, 1995, 57: 409-424) proposed estimators in a parametric model for the density of a response variable conditional on a vector of covariates. For probability density estimation with missing response data, Wang (Probability density estimation with data missing at random when covariables are present [J]. Journal of Statistical Planning and Inference, 2008, 138: 568-587) developed semiparametric regression imputation and inverse probability weighted approaches to estimate the probability density.

In Chapter 2 of this thesis, we study estimation of the probability density and its asymptotic properties under incomplete data and the missing at random (MAR) mechanism. We obtain two results:

(1) The conditions in Wang (2008) are weakened, which is mainly reflected in a wider scope for kernel estimation, and we develop semiparametric regression imputation approaches to estimate the probability density. Asymptotic normality of the estimators is established and used to construct normal approximation based confidence intervals on the parametric and nonparametric parts.

(2) We develop inverse probability weighted approaches to estimate the probability density. Asymptotic normality of the estimators is established and used to construct normal approximation based confidence intervals on the probability density.

In Chapter 3 of this thesis, the construction of empirical likelihood (EL) confidence intervals for the probability density is studied for the first time under incomplete data and the MAR mechanism. We obtain two results:

(1) We use the semiparametric regression imputation approach to fill in the incomplete data. Based on this imputation approach, an EL ratio statistic for the probability density is constructed, which is asymptotically distributed as a weighted sum of chi-squared variables. This result is used to obtain EL based confidence intervals on the probability density, in which an adjustment coefficient needs to be estimated. Estimating this coefficient leads to a loss of accuracy in the confidence intervals.

(2) We use the inverse probability weighted imputation approach to fill in the incomplete data. Based on this imputation approach, EL ratio statistics for the probability density are constructed, which asymptotically have chi-squared distributions. These results are used to obtain EL based confidence intervals on the parametric and nonparametric parts without adjustment, which can improve the accuracy of the confidence intervals.

The main innovations of this thesis are summarized as follows.

1. The conditions of Wang (Probability density estimation with data missing at random when covariables are present [J]. Journal of Statistical Planning and Inference, 2008, 138: 568-587) are weakened, which is mainly reflected in a wider scope for kernel estimation.

2. We develop inverse probability weighted approaches to estimate the probability density. Asymptotic normality of the estimators is established and used to construct normal approximation based confidence intervals on the probability density.

3. Under incomplete data and the MAR mechanism, the construction of empirical likelihood (EL) confidence intervals for the probability density is studied for the first time. We use the inverse probability weighted imputation approach. Based on this imputation approach, EL ratio statistics for the probability density are constructed, which asymptotically have chi-squared distributions. These results are used to obtain EL based confidence intervals on the parametric and nonparametric parts without adjustment, which can improve the accuracy of the confidence intervals.
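
The abstract does not state the estimators' formulas, so the following is only a minimal sketch of the general inverse probability weighting idea under MAR, not the thesis's own estimator. It assumes a scalar covariate X that is always observed, a response Y that is observed only when delta = 1, a Gaussian kernel, and a selection probability pi(x) = P(delta = 1 | X = x) estimated by Nadaraya-Watson smoothing. The density estimate at y is a weighted kernel sum over the observed responses, with weights proportional to delta_i / pi_hat(X_i) and normalised to sum to one. All function names, the simulation design and the bandwidths are hypothetical choices made for illustration.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel K."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)


def estimate_propensity(x, delta, bandwidth):
    """Nadaraya-Watson estimate of pi(x) = P(delta = 1 | X = x) at each x_i.

    Assumption: scalar covariate and Gaussian kernel (illustrative choice).
    """
    x = np.asarray(x, dtype=float)
    delta = np.asarray(delta, dtype=float)
    w = gaussian_kernel((x[:, None] - x[None, :]) / bandwidth)
    return (w @ delta) / w.sum(axis=1)


def ipw_density(y_grid, y, delta, propensity, bandwidth):
    """Normalised inverse probability weighted kernel density estimate.

    Observed responses get weight 1 / pi_hat(X_i); missing responses get
    weight 0. The weights are rescaled to sum to one.
    """
    y_grid = np.asarray(y_grid, dtype=float)
    observed = np.asarray(delta).astype(bool)
    weights = np.zeros(observed.shape[0])
    weights[observed] = 1.0 / np.asarray(propensity, dtype=float)[observed]
    weights /= weights.sum()
    # Missing responses may be NaN; replace them by 0.0 (their weight is 0).
    y_obs = np.where(observed, y, 0.0)
    k = gaussian_kernel((y_grid[:, None] - y_obs[None, :]) / bandwidth)
    return (k @ weights) / bandwidth


def simulate_mar_sample(n, seed=0):
    """Hypothetical MAR design: missingness of Y depends on X only."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = x + rng.normal(size=n)
    pi = 1.0 / (1.0 + np.exp(-(0.5 + x)))  # true selection probability
    delta = (rng.uniform(size=n) < pi).astype(int)
    y = np.where(delta == 1, y, np.nan)  # hide the unobserved responses
    return x, y, delta
```

A typical call would be `x, y, delta = simulate_mar_sample(500)`, then `p = estimate_propensity(x, delta, bandwidth=0.5)` and `ipw_density(np.linspace(-6, 6, 121), y, delta, p, bandwidth=0.4)`. With complete data and all selection probabilities equal to one, the sketch reduces to the ordinary kernel density estimator. The complete-case estimator corresponds to giving all observed responses equal weight, which ignores the dependence of missingness on X.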
Keywords/Search Tags: probability density function, MAR missing mechanism, asymptotic normality, empirical likelihood