
Research On Interpolation Method Of Missing Data

Posted on: 2017-01-29  Degree: Master  Type: Thesis
Country: China  Candidate: X Q Yang  Full Text: PDF
GTID: 2310330488990438  Subject: Statistics
Abstract/Summary:
Statistics is a practical science that studies the laws and essence of phenomena, and that study depends on its basic object: data. Missing values are an unavoidable fact of data collection. Handling missing data correctly improves data quality and the precision of estimators in data analysis. High-quality sample data not only reflect the features of the population but also fully express the individual information in the sample itself. Provided the research process is otherwise sound, it is very important for data analysis that interpolated values come close to the true values. This thesis summarizes previous theory and results and studies how to select an interpolation method. Four aspects need attention when choosing a method: the type of data, the missing data pattern, the missing data mechanism, and the data characteristics. In essence, examining these four aspects amounts to mining the sample data deeply. Sample data not only represent population indicators but also contain a great deal of information of their own, and making full use of this existing information when choosing the interpolation method helps restore the data.

The second part of the thesis classifies and compares interpolation methods and argues that, on the basis of interpolation criteria, the method should be selected according to the data set. The third part analyzes missing data in depth from the four aspects (type of data, missing data pattern, missing data mechanism, and data characteristics) to discuss the essence of the data and to establish that the choice of interpolation method should follow from an analysis of the data. The fourth and fifth parts use simulated data and empirical data, respectively, to test the effect of interpolation on data with different characteristics.

The results show that the interpolation methods differ markedly in their influence. First, compared with multiple-value interpolation, single-value interpolation is more likely to affect the interpolation result and distort the data distribution, although it takes less time and effort. Second, when auxiliary information exists, auxiliary variables with larger correlation coefficients give better interpolation, and interpolation with auxiliary information at a high missing rate performs better than single-value interpolation at a low missing rate. Finally, the impact of interpolation on the results grows gradually as the missing rate increases; under the missing-at-random mechanism, however, there is no significant turning point as the missing rate rises, because the loss of information is uniform.
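The abstract does not say which specific methods or simulation design the thesis used, so the following is only a minimal sketch of the kind of comparison it describes. It assumes mean filling as a stand-in for a single-value method without auxiliary information, linear regression on one auxiliary variable as a stand-in for interpolation with auxiliary information, and values deleted completely at random. The function names, sample size, correlation, and missing rates are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate(n=2000, rho=0.8, missing_rate=0.3, seed=0):
    """Draw (x, y) with correlation of about rho, then mark y values
    as missing completely at random (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n)
    y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)
    missing = rng.random(n) < missing_rate
    return x, y, missing

def mean_fill(y, missing):
    """Fill every missing y with the mean of the observed y values."""
    filled = y.copy()
    filled[missing] = y[~missing].mean()
    return filled

def regression_fill(x, y, missing):
    """Fill missing y from a straight line fitted on the observed pairs,
    using x as the auxiliary variable."""
    slope, intercept = np.polyfit(x[~missing], y[~missing], 1)
    filled = y.copy()
    filled[missing] = slope * x[missing] + intercept
    return filled

def rmse(y_true, y_filled, missing):
    """Root mean squared error over the filled positions only."""
    diff = y_true[missing] - y_filled[missing]
    return float(np.sqrt(np.mean(diff**2)))

# Same data set, two filling strategies.
x, y, missing = simulate()
y_mean = mean_fill(y, missing)
y_reg = regression_fill(x, y, missing)
rmse_mean = rmse(y, y_mean, missing)
rmse_reg = rmse(y, y_reg, missing)

# Auxiliary information at a high missing rate versus
# mean filling at a low missing rate.
x_hi, y_hi, m_hi = simulate(missing_rate=0.5, seed=1)
x_lo, y_lo, m_lo = simulate(missing_rate=0.1, seed=2)
rmse_reg_high = rmse(y_hi, regression_fill(x_hi, y_hi, m_hi), m_hi)
rmse_mean_low = rmse(y_lo, mean_fill(y_lo, m_lo), m_lo)

print("RMSE, mean fill:", rmse_mean)
print("RMSE, regression fill:", rmse_reg)
print("variance, original vs mean-filled:", np.var(y), np.var(y_mean))
```

In this setup, mean filling replaces every gap with the same constant, which shrinks the variance of the filled variable; that is one simple way in which a single-value method can distort the data distribution. Changing `rho` shows how the strength of the correlation with the auxiliary variable affects the regression-based fill.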
Keywords/Search Tags:Missing data, Data Interpolation, The interpolation method, Method selection, Basis of selection, Data Characteristics