Comparing six missing data methods within the discriminant analysis context: A Monte Carlo study

Posted on: 2001-10-14
Degree: Ph.D.
Type: Dissertation
University: The Ohio State University
Candidate: Viragoontavan, Sunanta
Full Text: PDF
GTID: 1460390014454995
Subject: Curriculum development
Abstract/Summary:
The purpose of this study was to compare the relative effectiveness of six missing data methods for discriminant analysis. The six treatments were listwise deletion, group mean substitution, regression-based imputation, hot-deck imputation, multiple imputation using SOLAS(TM) (commercial computer software for missing data), and multiple imputation using NORM, developed by Schafer.

The methods were compared under three simulated conditions: correlation structure (low/moderate and high), sample size (100, 200, and 500), and proportion of missing data (.05, .10, and .20). Values were randomly deleted from a complete data matrix at each of the three proportions, and the resulting incomplete data matrices were then treated by the six missing data methods. The treated data and the complete data were both subjected to linear discriminant analysis. The relative effectiveness of the six techniques was assessed by deviations of the hit rate and of the discriminating power of the first discriminant function.

The results revealed that the two multiple imputation procedures, SOLAS(TM) and Schafer's NORM, were uniformly the most effective. In general, the third most effective method was the hot-deck procedure. The group mean and regression-based procedures performed reasonably well in estimating the discriminating power of the first discriminant function, but they were less effective than the multiple imputation and hot-deck methods in estimating the hit rate. Listwise deletion was found to be the least effective approach.

Finally, all methods provided more accurate estimates with slightly/moderately correlated data than with highly correlated data. The accuracy in estimating the hit rate and the discriminating power increased with sample size and decreased as the proportion of missing data increased.
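
The simulation loop described in the abstract (delete values at random, treat the incomplete data, run linear discriminant analysis, compare against the complete data) can be illustrated with a minimal sketch. It is not the dissertation's code. It covers only the two simplest treatments (listwise deletion and group mean substitution), and the design details are assumptions made for illustration: two groups, four equicorrelated normal predictors, a mean shift of 0.8, a resubstitution hit rate, and deviation measured as the absolute difference from the complete-data hit rate. All function names are hypothetical.

```python
import numpy as np

def simulate_groups(n_per_group, rho, rng):
    """Two groups of 4 equicorrelated normal predictors (assumed design)."""
    p = 4
    cov = np.full((p, p), rho) + (1.0 - rho) * np.eye(p)
    x0 = rng.multivariate_normal(np.zeros(p), cov, size=n_per_group)
    x1 = rng.multivariate_normal(np.full(p, 0.8), cov, size=n_per_group)
    return np.vstack([x0, x1]), np.repeat([0, 1], n_per_group)

def delete_at_random(X, proportion, rng):
    """Set exactly round(proportion * cells) cells to NaN, chosen at random."""
    n_missing = int(round(proportion * X.size))
    idx = rng.choice(X.size, size=n_missing, replace=False)
    rows, cols = np.unravel_index(idx, X.shape)
    X_miss = X.copy()
    X_miss[rows, cols] = np.nan
    return X_miss

def listwise_deletion(X, y):
    """Drop every case that has at least one missing value."""
    keep = ~np.isnan(X).any(axis=1)
    return X[keep], y[keep]

def group_mean_substitution(X, y):
    """Replace each missing value with the mean of its variable within its group."""
    X_imp = X.copy()
    for g in np.unique(y):
        rows = np.where(y == g)[0]
        block = X_imp[rows]
        means = np.nanmean(block, axis=0)
        nan_r, nan_c = np.where(np.isnan(block))
        block[nan_r, nan_c] = means[nan_c]
        X_imp[rows] = block
    return X_imp, y

def lda_hit_rate(X, y):
    """Resubstitution hit rate of a two-group linear discriminant function."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    pooled = ((X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)) / (len(X) - 2)
    w = np.linalg.solve(pooled, m1 - m0)   # discriminant weights
    cutoff = w @ (m0 + m1) / 2.0           # midpoint rule, equal priors assumed
    pred = (X @ w > cutoff).astype(int)
    return float(np.mean(pred == y))

def run_study(n_per_group=100, rho=0.3, proportion=0.10, replications=200, seed=0):
    """Mean absolute deviation of each method's hit rate from the complete-data hit rate."""
    rng = np.random.default_rng(seed)
    methods = {"listwise": listwise_deletion, "group_mean": group_mean_substitution}
    deviations = {name: [] for name in methods}
    for _ in range(replications):
        X, y = simulate_groups(n_per_group, rho, rng)
        complete_rate = lda_hit_rate(X, y)
        X_miss = delete_at_random(X, proportion, rng)
        for name, treat in methods.items():
            X_t, y_t = treat(X_miss, y)
            deviations[name].append(abs(lda_hit_rate(X_t, y_t) - complete_rate))
    return {name: float(np.mean(d)) for name, d in deviations.items()}

if __name__ == "__main__":
    for proportion in (0.05, 0.10, 0.20):
        print(proportion, run_study(proportion=proportion))
```

Varying `rho`, `n_per_group`, and `proportion` in `run_study` mirrors the three factors named in the abstract. The regression-based, hot-deck, and multiple imputation treatments would be added as further entries in `methods`.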
Keywords/Search Tags: Missing data, Discriminant analysis, Multiple imputation, Effective