
Non- and semi-parametric techniques for handling missing data

Posted on: 2006-09-26
Degree: Ph.D
Type: Thesis
University: Limburgs Universitair Centrum (Belgium)
Candidate: Hens, Niel
Full Text: PDF
GTID: 2450390008975970
Subject: Mathematics
Abstract/Summary:
In this work, a variety of non- and semi-parametric techniques are used to handle missing data problems. The material presented clearly shows the benefits of relaxing assumptions.

After a basic introduction to the field of missing data and to non- and semi-parametric techniques, the successive parts of this work focus on different topics. A first part describes a kernel-based imputation procedure which makes use of a non-parametric regression relationship between a partially observed response and a fully observed covariate. The approach is related to the approximate Bayesian bootstrap method and can be seen as an extension of the local single imputation of Cheng (1994) to a proper local multiple imputation approach. An essential ingredient of the algorithm is the local generation of responses.

In a regression analysis, selecting an appropriate model from a candidate set of models is based on, e.g., the Akaike Information Criterion (AIC, Akaike, 1973). If, however, observations are incomplete, the use of complete cases can lead to wrong model choices. In a second part, two modifications of the AIC are proposed. The use of these AIC versions is illustrated on a case study and contrasted with tree-based methods that deal with both missing values and design.

From the existing material on dropout in longitudinal studies, it is clear that a sensitivity analysis should be part of any statistical analysis. Next to providing an overview of existing sensitivity tools, the third part of the thesis describes a non-parametric sensitivity tool called 'kernel weighted influence'. It uses a 'kernel-based neighbourhood' concept to explore the global and local influence towards non-random missingness for types of observations, instead of the observations themselves, in a selection model framework (Diggle and Kenward, 1994).

In a last part, generalized estimating equations are used to determine the force of infection for binary clustered data. The impact of missing data on the analysis is illustrated and inverse probability weighted estimating equations are proposed. The weights are estimated non-parametrically by a generalized additive model with penalized regression splines. Several other complications in the dataset are dealt with, including the constraint for the age-specific seroprevalence to be monotone increasing. (Abstract shortened by UMI.)
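
The idea behind inverse probability weighting in the last part can be illustrated with a minimal sketch. It is not the thesis's method: it estimates a simple overall seroprevalence rather than solving generalized estimating equations, and it replaces the generalized additive model for the weights with a cruder non-parametric estimate, namely the fraction of observed responses at each integer age. The simulated data, the prevalence curve `1 - exp(-0.08 * age)`, the response probability `0.9 - 0.015 * age`, and the function names are all assumptions made for illustration only.

```python
import math
import random
from collections import Counter


def simulate(n, seed=0):
    """Simulate (age, serostatus, observed) triples; all settings are hypothetical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        age = rng.randint(1, 40)  # fully observed covariate
        # Assumed monotone increasing age-specific seroprevalence.
        y = 1 if rng.random() < 1.0 - math.exp(-0.08 * age) else 0
        # Assumed missingness mechanism: older subjects respond less often.
        observed = rng.random() < 0.9 - 0.015 * age
        rows.append((age, y, observed))
    return rows


def ipw_prevalence(rows):
    """Weighted mean of observed responses, weights = 1 / estimated P(observed | age)."""
    n_all = Counter(age for age, _, _ in rows)
    n_obs = Counter(age for age, _, obs in rows if obs)
    num = den = 0.0
    for age, y, obs in rows:
        if obs:
            # Inverse of the observed fraction at this age (at least one
            # observed unit exists here, so the division is safe).
            w = n_all[age] / n_obs[age]
            num += w * y
            den += w
    return num / den


rows = simulate(20000)
# Benchmark that is only available in a simulation: mean over all subjects.
full = sum(y for _, y, _ in rows) / len(rows)
# Complete-case estimate: ignores the missingness mechanism.
observed_y = [y for _, y, obs in rows if obs]
cc = sum(observed_y) / len(observed_y)
ipw = ipw_prevalence(rows)
print(f"full data: {full:.3f}  complete cases: {cc:.3f}  IPW: {ipw:.3f}")
```

In this setup the complete-case estimate over-represents younger subjects, who are more likely to respond and less likely to be seropositive, so it understates the prevalence; reweighting each observed response by the inverse of its estimated observation probability corrects for this.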
Keywords/Search Tags: Data, Non- and semi-parametric techniques