Comparison And Application Of Semi-parametric Method With Missing Data And Other Methods

Posted on: 2021-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: F H Zhai
GTID: 2428330611996385
Subject: Applied statistics
Abstract/Summary:
Missing data are a common problem in experimental research, and often several covariates have missing values. Simply discarding the samples with incomplete information can lose a great deal of information and may even introduce bias. Data with a non-monotone missing pattern under the missing at random (MAR) mechanism are the most common type of missing data. This thesis focuses on such data and mainly uses a semi-parametric method to handle the missing values. The semi-parametric method is compared with common missing-data methods, namely complete case analysis, mean imputation, multiple imputation, the EM algorithm, and the BP neural network method. The comparison covers bias, standard error, and coverage, and back-classification is performed on the processed data to obtain the accuracy of each method at each missing rate.

The main work and results are as follows. First, two complete data sets are selected, the iris data and a breast cancer clinical data set, with the number of covariates containing missing data set to 2 and 3, respectively. Non-monotone missingness under the MAR mechanism is simulated at missing rates of 3%, 5%, 10%, 20%, 30%, and 40%, and at each rate the missing data are handled by complete case analysis, mean imputation, multiple imputation, the EM algorithm, the BP neural network method, and the semi-parametric method. Then, an incomplete clinical data set on fatty liver is selected. Data similar to this incomplete data set are simulated and processed with the same six methods, and the performance and back-classification accuracy of the methods are compared. The semi-parametric method and the BP neural network method, which show better performance and higher accuracy, together with complete case analysis, the method most commonly used in medical research, are then applied to the fatty liver clinical data for comparison.

The simulation experiments show that when the missing rate is low, complete case analysis is the better way to handle missing data. As the missing rate increases, the accuracy of complete case analysis, mean imputation, and multiple imputation decreases rapidly; the stability of the EM algorithm is slightly better than that of these three methods; the BP neural network method outperforms the other methods when the missing rate is 20%-30%; and the semi-parametric method is the least affected by the increase in the missing rate. On the fatty liver clinical data, complete case analysis has the lowest accuracy, 69.7905%, while the semi-parametric method and the BP neural network method achieve higher accuracies of 75.9283% and 79.9436%, respectively.
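The simulation design in the abstract (masking several covariates at a chosen missing rate under a MAR mechanism with a non-monotone pattern, then comparing handling methods) can be illustrated with a minimal sketch. This is not the thesis's code. It assumes synthetic Gaussian data in place of the iris and breast cancer data, covers only complete case analysis and mean imputation, and compares them by the bias of one covariate mean rather than by classification accuracy. The names `make_data`, `impose_mar`, `complete_cases`, and `mean_impute` are hypothetical.

```python
import random
from statistics import mean

# Missing rates named in the abstract.
RATES = [0.03, 0.05, 0.10, 0.20, 0.30, 0.40]


def make_data(n, seed=0):
    """Synthetic stand-in data (assumption): z is always observed, x1 and x2 depend on z."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)
        x1 = z + rng.gauss(0.0, 1.0)
        x2 = 0.5 * z + rng.gauss(0.0, 1.0)
        rows.append([z, x1, x2])
    return rows


def impose_mar(rows, rate, seed=1):
    """Mask x1 and x2 independently, with a probability that depends only on observed z.

    Depending only on z makes the mechanism MAR; masking the two covariates
    independently gives a non-monotone pattern (either one can be missing alone).
    The two probabilities average to roughly `rate` because P(z > 0) = 0.5.
    """
    rng = random.Random(seed)
    masked = []
    for z, x1, x2 in rows:
        p = rate * (1.5 if z > 0 else 0.5)
        miss1 = rng.random() < p
        miss2 = rng.random() < p
        masked.append([z, None if miss1 else x1, None if miss2 else x2])
    return masked


def complete_cases(rows):
    """Complete case analysis: keep only rows with no missing value."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(rows):
    """Mean imputation: replace each missing value by its column's observed mean."""
    col_means = [mean(v for v in col if v is not None) for col in zip(*rows)]
    return [[col_means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


if __name__ == "__main__":
    full = make_data(2000)
    true_mean = mean(r[1] for r in full)
    for rate in RATES:
        masked = impose_mar(full, rate)
        cc = complete_cases(masked)
        mi = mean_impute(masked)
        print(f"rate={rate:.2f} kept={len(cc) / len(full):.3f} "
              f"bias_cc={mean(r[1] for r in cc) - true_mean:+.3f} "
              f"bias_mean={mean(r[1] for r in mi) - true_mean:+.3f}")
```

In this toy setup, values are more likely to be missing when z is large, and x1 rises with z, so one would expect both simple methods to underestimate the mean of x1 more strongly as the missing rate grows. The other methods in the comparison would slot in as further functions with the same input and output shape.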
Keywords/Search Tags: Missing data, Non-Monotonic Missing Pattern, Missing at Random, Semi-Parametric Method