Simulated Comparison Of Different Filling Methods In Missing Values

Posted on:2013-07-25

Degree:Master

Type:Thesis

Country:China

Candidate:L L Hua

Full Text:PDF

GTID:2234330371475841

Subject:Epidemiology and Health Statistics

Abstract/Summary:


Objective: Missing values are a common problem in research on traditional Chinese medicine for HIV/AIDS. They complicate the analysis and can bias the results, so they need to be handled before statistical analysis. This study compares the performance of different methods on simulated data with missing values, confirms the most appropriate number of imputations for multiple imputation (MI), and explores which methods are most accurate, effective and convenient under different missingness mechanisms and missing patterns.

Methods: SAS 9.1 was used to simulate data and to impute missing values by different methods. For continuous variables with missing values, the expectation-maximization (EM) method, regression imputation, mean imputation, deletion in groups and multiple imputation (MI) were applied, and the results were compared in terms of accuracy, precision and mean. For binary variables, deletion in groups and the logistic regression method in MI were applied, and the results were compared by regression coefficient and standard error.

Results:
1. Continuous variable data: The continuous data had an arbitrary missing pattern. The more imputations were performed, the better the imputation effect. With 10 imputations the effect reached 0.95 and the precision was best. When the missing rate was not more than 20%, accuracy was better with 3 or 5 imputations; when the missing rate was between 30% and 40%, 10 imputations were needed to obtain better accuracy. If the missing rate was above 50%, accuracy was poor.
2. Missing Completely at Random: When the missing rate was not more than 10%, the five methods performed similarly, although MI had better precision and accuracy. When the missing rate was above 20%, deletion in groups and MI were better than the other methods: MI had the best precision, while deletion in groups had the best accuracy.
3. Missing at Random: When the missing rate was between 10% and 20%, MI had the best accuracy and precision. When the missing rate was 30%, deletion in groups had the best accuracy. If the missing rate was above 40%, all methods performed poorly.
4. Binary variable data: When the missing rate was not more than 40%, deletion in groups gave regression coefficients and standard errors closer to those of the complete data. When the missing rate was between 40% and 50%, the logistic regression method in MI was better, and the most appropriate number of imputations was 2 for this study's dataset. If the missing rate was above 60%, both methods performed poorly.

Conclusions: A large sample of continuous data can be treated as normally distributed, and the allowable missing rate is below 30%. Some traditional methods, such as mean imputation and deletion in groups, have advantages in handling missing values because they are simpler and more convenient. Compared with traditional methods, MI can solve most problems in data sets with missing values, and it is more convenient and powerful than the other methods.
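The kind of comparison described in the abstract can be illustrated with a small simulation. The sketch below is not the thesis's SAS 9.1 code; it is a minimal Python illustration under stated assumptions: the data are drawn from a normal distribution with mean 50 and standard deviation 10 (arbitrary values), missingness is Missing Completely at Random, "deletion in groups" is taken to mean analysing only the observed cases, and MI is approximated by a simplified bootstrap-based imputation pooled with Rubin's rules. All function names are hypothetical.

```python
import random
import statistics


def simulate_complete(n, seed=0):
    """Draw n values from N(50, 10^2). The distribution is an assumption."""
    rng = random.Random(seed)
    return [rng.gauss(50.0, 10.0) for _ in range(n)]


def make_mcar(values, rate, seed=1):
    """Blank each value with probability `rate`, independently of the data (MCAR)."""
    rng = random.Random(seed)
    return [None if rng.random() < rate else v for v in values]


def complete_case(values):
    """Deletion approach: keep only the observed values."""
    return [v for v in values if v is not None]


def mean_impute(values):
    """Replace every missing value with the mean of the observed values."""
    fill = statistics.mean(complete_case(values))
    return [fill if v is None else v for v in values]


def multiple_impute(values, m=5, seed=2):
    """Simplified MI for the mean (requires m >= 2).

    Each of the m rounds bootstraps the observed values, fits a normal
    distribution to the bootstrap sample, and draws the missing values from it.
    Returns the pooled estimate and its total variance by Rubin's rules.
    """
    rng = random.Random(seed)
    observed = complete_case(values)
    n = len(values)
    estimates, within = [], []
    for _ in range(m):
        boot = rng.choices(observed, k=len(observed))
        mu, sd = statistics.mean(boot), statistics.stdev(boot)
        filled = [rng.gauss(mu, sd) if v is None else v for v in values]
        estimates.append(statistics.mean(filled))
        within.append(statistics.variance(filled) / n)  # variance of the mean
    q_bar = statistics.mean(estimates)   # pooled point estimate
    w = statistics.mean(within)          # within-imputation variance
    b = statistics.variance(estimates)   # between-imputation variance
    total = w + (1 + 1 / m) * b          # Rubin's total variance
    return q_bar, total


if __name__ == "__main__":
    data = simulate_complete(1000)
    for rate in (0.1, 0.2, 0.3, 0.5):
        missing = make_mcar(data, rate)
        pooled, total_var = multiple_impute(missing, m=10)
        print(
            f"rate={rate:.0%}",
            f"deletion={statistics.mean(complete_case(missing)):.3f}",
            f"mean_impute_sd={statistics.stdev(mean_impute(missing)):.3f}",
            f"MI={pooled:.3f} (se={total_var ** 0.5:.3f})",
        )
```

Running the script prints, for each missing rate, the mean after deletion, the standard deviation after mean imputation, and the pooled MI estimate with its standard error. Mean imputation leaves the mean unchanged but shrinks the spread, which is one reason the choice of method matters more as the missing rate rises.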

Keywords/Search Tags:

Missing values, Simulation, Imputation methods, Missing Completely at Random, Missing at Random


Related items

1	Research On Multiple Imputation In Propensity Score With Partially Observed Covariates And Its Application In Real-World Studies Of Adverse Drug Reactions
2	A Simulated Comparative Study And Application Of Statistical Methods In Datasets With Missing Values
3	Cardiovascular Disease Epidemiological Survey Data To Fill In The Missing Comparison And Simulation Research Methods
4	Comparative Simulation Study On Missing Data Handling Using Pattern Mixture Models
5	A Statistical Simulation Study Of Bias Correction When The Different Missing Mechanism Coexist
6	The Simulation Studies Of Imputation Methods Of Missing Data In The Scale
7	A Pattern Mixture Model Based On MNAR Missing Mechanism And Its Application In Medicine
8	Research On Application Of Missing Data Imputation In Medical Field
9	Computer Simulation Of Multiple Imputation For Analyzing Parallel Design And Crossover Design With Missing Data In Clinical Trial
10	Multiple Imputation For The Non-monotone Missing Data And The Application Of Cardiac Rehabilitation Comprehensive Intervention Effect Evaluation