
Statistical Diagnostics Of Partially Linear Models With Missing Data

Posted on: 2018-01-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: A X Fan
GTID: 1360330518454899
Subject: Statistics
Abstract/Summary:
Sensitivity analysis is a general statistical tool for investigating how stable estimation results are with respect to the data and the model assumptions, and for proposing appropriate countermeasures; it is also an important step in data analysis. It is widely used in many statistical models and applied in many fields. Missing data are often encountered in various settings, including economics, sociology and biomedicine, so many statisticians have focused on sensitivity analysis for missing data. Unfortunately, most past studies are based on the observed-data log-likelihood or the EM algorithm. Because it is rather difficult to obtain a closed-form expression for the complete-data likelihood function when an unknown smoothing function is involved, those existing methods cannot be used directly for sensitivity analysis of the partially linear models (PLMs) considered here. To address this issue, this dissertation develops a general sensitivity analysis method, including residuals, generalized leverage, case-deletion influence analysis and local influence analysis, for a PLM with missing responses. The method is based on semiparametric estimating equations (SEEs) and penalized inverse probability weighted least squares, which are constructed using the inverse probability weighted approach rather than the likelihood function. We also develop a Bayesian local influence analysis method for the generalized partial linear model with proportional data missing not at random, taking into account the difference of divergence parameters. Our methods not only solve the problem of general statistical diagnostics for partially linear models with missing data, but also provide a useful reference for the statistical diagnostics of other semiparametric models with missing data. The contents of this thesis are as follows:

(1) We investigate case-deletion diagnostics and local influence analysis for partially linear models with responses missing at random and an unknown distribution of the measurement errors, based on semiparametric estimating equations rather than the likelihood function. The hat matrix, residuals and Cook's distance for linear regression models are extended to the considered partially linear models with responses missing at random by constructing a new linear regression model via the imputation and inverse probability weighted methods. The generalized leverage defined by Wei et al. (1998) for a complicated parametric model without missing data is extended to the considered partially linear models with responses missing at random. Our idea for deriving generalized leverage for missing data can be extended to more complicated semiparametric models, including semiparametric estimating equations, with responses missing not at random. Simulation studies and a real example are used to investigate the performance of the proposed methods in identifying potentially influential observations.

(2) We investigate local influence analysis for partially linear models with responses missing not at random. We establish the penalized inverse probability weighted least squares objective function for PLMs with responses missing not at random by combining the inverse probability weighted estimation method and the spline smoothing technique, based on the generalized moment equations (GMMs) of instrumental variables and the kernel density estimation method. On this basis, the local influence analysis methods for penalized Gaussian likelihood estimators in PLMs without missing data are extended to PLMs with responses missing not at random. Simulation studies and a real data analysis illustrate the effectiveness of the proposed local influence method for the PLM with responses missing not at random. In addition, we find that the diagonal element diagnostic statistics bjj are more sensitive to outliers or influential points than the largest eigenvector diagnostic statistics hmaxjj.

(3) We investigate Bayesian estimation and Bayesian local influence analysis for the generalized partial linear model with proportional data missing not at random, taking into account the difference of divergence parameters. We construct different perturbation models by perturbing the data, the prior distribution of the parameters and the parameters of the missing mechanism model, and we construct first-order and second-order Bayesian local influence measures under different objective functions to identify outliers, incorrect prior information and the influence of the missing mechanism. Four simulation experiments illustrate the effectiveness and rationality of the proposed method.
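
As a rough illustration of the idea in part (1), the following is a minimal sketch of inverse probability weighted least squares with leverage and Cook's distance for responses missing at random. It is not the estimator developed in the dissertation: it covers only a purely linear model (no nonparametric component, no imputation step), and it assumes the response probabilities are known. The function name `ipw_wls_diagnostics` and the toy data are hypothetical and chosen only for this example.

```python
import numpy as np

def ipw_wls_diagnostics(X, y, delta, pi):
    """IPW weighted least squares with leverages and Cook's distances.

    X: (n, p) design matrix; y: (n,) responses (NaN allowed where missing);
    delta: (n,) 1 if the response is observed, 0 otherwise;
    pi: (n,) assumed response probabilities P(delta = 1 | X).
    """
    X = np.asarray(X, dtype=float)
    delta = np.asarray(delta, dtype=float)
    w = delta / np.asarray(pi, dtype=float)      # IPW weights, zero for missing cases
    y = np.where(delta > 0, np.asarray(y, dtype=float), 0.0)  # missing y never enters the fit
    sw = np.sqrt(w)
    Xw, yw = X * sw[:, None], y * sw             # rescale so ordinary LS formulas apply
    XtX_inv = np.linalg.pinv(Xw.T @ Xw)
    beta = XtX_inv @ Xw.T @ yw
    lev = np.einsum("ij,jk,ik->i", Xw, XtX_inv, Xw)  # diagonal of the weighted hat matrix
    resid = yw - Xw @ beta
    p = X.shape[1]
    s2 = resid @ resid / (delta.sum() - p)       # residual variance from observed cases
    cook = resid**2 * lev / (p * s2 * (1.0 - lev) ** 2)
    return beta, lev, cook

# Toy data (hypothetical): a straight line with a small deterministic wiggle.
x = np.linspace(-2.0, 2.0, 50)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + 0.1 * np.sin(7.0 * x)
delta = np.ones(50)
delta[::5] = 0                  # every fifth response is missing
y[delta == 0] = np.nan
y[12] += 10.0                   # planted outlier among the observed cases
pi = np.full(50, 0.8)           # response probability assumed known here

beta, lev, cook = ipw_wls_diagnostics(X, y, delta, pi)
print("most influential case:", int(np.argmax(cook)))
```

Cases with a missing response receive weight zero, so their leverage and Cook's distance are zero in this sketch. In practice the response probabilities would have to be estimated from a missingness model.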
Keywords/Search Tags:Partially linear models, Missing data, Sensitivity analysis, Local influence analysis, Outlier