Font Size: a A A

Using principal component analysis (PCA) to obtain auxiliary variables for missing data in large data sets

Posted on:2013-02-04Degree:Ph.DType:Dissertation
University:University of KansasCandidate:Howard, Waylon JFull Text:PDF
GTID:1458390008467769Subject:Psychology
Abstract/Summary:
The purpose of this dissertation is to address an important issue in the imputation of missing data in large data sets. The issue can arise in any analysis in which auxiliary variables are used to inform a modern missing data handling procedure (e.g., FIML, MI) to support the missing at random assumption, reduce bias and decrease standard errors. The problem is that researchers suggest an "inclusive strategy" where as many auxiliary variables are included as possible. However, the model becomes more complex with the addition of each additional auxiliary variable, so there is a practical limit to the number of auxiliary variables that can be successfully included. Beyond this limit, the model will fail to converge. Large data projects can present a challenge because it is possible to have hundreds of potential auxiliary variables to inform the missing data handling procedure, especially when non-linear information is included. The dissertation is divided into the following sections: 1) a brief discussion of the issue of missing data; 2) a review of the history of missing data including theory and existing solutions regarding handling missingness; 3) an assessment of the use of auxiliary variables in missing data handling; 4) a discussion of convergence failure with modern missing data methods; 5) a basic introduction to principal component analysis; 6) the introduction of an alternative strategy to address the large number of auxiliary variables issue; and finally, 7) a demonstration of the potential of the principal component scores as auxiliary variables approach by applying it to the analysis of simulated and empirical data.
Keywords/Search Tags:Data, Auxiliary variables, Principal component, Issue
Related items