
Simulation Of The Missing Data Imputation Methods For The Regression Model

Posted on: 2015-05-29
Degree: Master
Type: Thesis
Country: China
Candidate: W Zhao
GTID: 2180330461499178
Subject: Statistics
Abstract/Summary:
Missing data are widespread in both experimental and survey research. They complicate data analysis, can introduce large bias into statistical results, and reduce the credibility of statistical conclusions. Missing data arise for many reasons, and scholars in China and abroad have studied the problem extensively; it remains an active topic in statistics. Many statistical methods assume a complete data set. When data are missing, these methods can be applied after imputation, which reduces the influence of the missing values. This dissertation describes the commonly used imputation methods and compares their strengths and weaknesses. Then, for missing values of the explained variable in the multiple regression model, we examine four methods: mean imputation, regression imputation, EM algorithm imputation, and MCMC-DA imputation. Under different missing rates, we carry out a simulation analysis from two aspects: the imputed values and the regression coefficient estimates. The results show that, under the same missing rate, the ME, MSE, SD (the square error of the regression coefficients), and y (the angle between the regression estimator from the complete data set and the regression estimator after imputation) of regression imputation and EM algorithm imputation are small and stable, which indicates that these two methods give better estimates for this model. By these evaluation indices, regression imputation performs best when the missing rate is large, and EM imputation performs best when the missing rate is small. For MCMC-DA imputation, under the same missing rate, the MSE, SD, and y are somewhat better; although it does not achieve the ideal effect, it still gives reasonably good results. These indices are relatively large for mean imputation, which performs worse than the other three methods.
In terms of the overall trend, sample size has a clear influence on the imputed values and the regression coefficients, but this influence decreases as the sample size increases. Based on the discussion above, we construct two new imputation methods, MRED imputation and MRE imputation, and compare them by simulation. In terms of the overall trend, MRED imputation is better than MCMC-DA imputation but inferior to regression imputation and EM imputation; MRE imputation is better than MRED imputation and MCMC-DA imputation, and falls between regression imputation and EM imputation. However, when regression imputation, EM imputation, and MCMC-DA imputation give results close to one another, MRED imputation is best and can be used as an option. When regression imputation and EM imputation are relatively close to each other but differ from MCMC-DA imputation, MRE imputation is best and can be used as an option.
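The kind of simulation comparison described in the abstract can be illustrated with a minimal sketch covering only mean imputation and regression imputation. This is not the thesis's actual design: the sample size, the true coefficients, the normal error model, and the completely-at-random missingness mechanism are all assumptions made for illustration, and the function name `simulate` is hypothetical. ME and MSE are computed on the imputed values, SD is interpreted here as the summed squared difference between the coefficient estimates, and the angle is taken between the complete-data and post-imputation coefficient vectors.

```python
import numpy as np


def simulate(n=200, missing_rate=0.2, seed=0):
    """Compare mean and regression imputation of a missing response variable."""
    rng = np.random.default_rng(seed)

    # Assumed data-generating model (illustrative, not from the thesis).
    beta = np.array([1.0, 2.0, -1.0])  # intercept and two slopes
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    y = X @ beta + rng.normal(scale=1.0, size=n)

    # Assumed mechanism: responses go missing completely at random.
    miss = np.zeros(n, dtype=bool)
    miss[rng.choice(n, size=int(missing_rate * n), replace=False)] = True

    # Reference estimator from the complete data set.
    beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)

    # Mean imputation: fill missing responses with the observed mean.
    y_mean = y.copy()
    y_mean[miss] = y[~miss].mean()

    # Regression imputation: fit on observed cases, predict the missing ones.
    beta_obs, *_ = np.linalg.lstsq(X[~miss], y[~miss], rcond=None)
    y_reg = y.copy()
    y_reg[miss] = X[miss] @ beta_obs

    results = {}
    for name, y_imp in (("mean", y_mean), ("regression", y_reg)):
        err = y_imp[miss] - y[miss]  # error of the imputed values
        beta_imp, *_ = np.linalg.lstsq(X, y_imp, rcond=None)
        cos = beta_imp @ beta_full / (
            np.linalg.norm(beta_imp) * np.linalg.norm(beta_full)
        )
        results[name] = {
            "ME": float(err.mean()),
            "MSE": float((err ** 2).mean()),
            "SD": float(((beta_imp - beta_full) ** 2).sum()),
            "angle": float(np.arccos(np.clip(cos, -1.0, 1.0))),
        }
    return results


if __name__ == "__main__":
    for rate in (0.1, 0.3, 0.5):
        print(rate, simulate(missing_rate=rate))
```

Under these assumptions, mean imputation ignores the covariates, so its imputed values carry the full variance of the response and the refitted slopes shrink toward zero, whereas regression imputation places the imputed points on the plane fitted to the observed cases. EM and MCMC-DA imputation are omitted here because the abstract does not specify their implementation.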
Keywords/Search Tags: regression model, mean imputation, regression imputation, EM algorithm imputation, MCMC-DA imputation