Research On Statistical Inference Of Non-probability Samples Based On Model Calibration Estimation

Posted on: 2021-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: H F Duan
Full Text: PDF
GTID: 2370330623472756
Subject: Statistics
Abstract/Summary:
Probability-based sampling has long dominated survey research. When the inclusion probabilities are known, randomization theory can be used to generate a representative sample of the target population and eliminate selection bias. However, as response rates decline and the cost of probability-based sampling rises rapidly, researchers are turning to cheaper and more convenient non-probability sampling methods to reach the required sample size. With the development of the Internet, online surveys have become popular because of their short data collection cycle and low cost, and they now dominate data collection for market research. In this context, statistical inference from non-probability samples has become an urgent problem for online surveys. Solving it would promote the wider use of non-probability sampling and the development of online surveys, and therefore has practical significance.

The model calibration estimation method uses auxiliary information to adjust the weights of sample units so as to reduce the difference between the sample structure and the population structure; it is used to address the low estimation accuracy caused by the randomness of the sample. In this thesis, five machine learning methods (decision tree, neural network, support vector machine, random forest, and Lasso) are introduced into model calibration estimation and used for statistical inference from non-probability samples. The target population is estimated by constructing non-probability samples and combining them with auxiliary information.

An empirical study was conducted on two different data sets. First, the group with Internet access was selected through the variable "Can the Internet be used" and was assumed to be an online volunteer group. Samples were then randomly drawn from this online volunteer group to construct non-probability samples for statistical inference. In both data sets, the sample mean of the target variable differs considerably from the population mean of the target variable.

The results show that, across different sample sizes, the estimates obtained by the model calibration estimation method are very close to the true values for the target population. That is, even when the non-probability sample mean differs significantly from the actual population mean, model calibration estimation handles the problem well and yields a more accurate estimate of the target population. A comparative analysis found that the model calibration estimates based on random forests performed better in terms of bias, while those based on Lasso performed better in terms of variance and root mean square error.
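The workflow described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the data are simulated, all names (`model_calibration_mean`, `online`, `m_pop`) are hypothetical, a plain least-squares fit stands in for the machine learning learners (decision tree, random forest, Lasso, and so on), and the uniform starting weights N/n and the chi-square-distance form of calibration are assumptions made for illustration.

```python
import numpy as np

def model_calibration_mean(y_s, m_s, m_pop):
    """Model calibration estimate of a population mean (illustrative sketch).

    y_s   : target variable observed in the sample
    m_s   : working-model predictions for the sample units
    m_pop : working-model predictions for every population unit
    """
    N, n = len(m_pop), len(y_s)
    d = np.full(n, N / n)                    # assumed uniform starting weights
    X = np.column_stack([np.ones(n), m_s])   # calibrate on size and on predictions
    totals = np.array([N, m_pop.sum()])      # known population totals of X
    A = X.T @ (d[:, None] * X)
    lam = np.linalg.solve(A, totals - X.T @ d)
    w = d * (1.0 + X @ lam)                  # chi-square-distance calibrated weights
    return float(w @ y_s / N), w

# Simulated population (hypothetical): auxiliary variable x is known for all units.
rng = np.random.default_rng(0)
N = 20000
x = rng.normal(size=N)
y = 2.0 + 3.0 * x + rng.normal(size=N)

# Internet access depends on x, so the online volunteer group is not representative.
online = rng.random(N) < 1.0 / (1.0 + np.exp(-1.5 * x))
sample = rng.choice(np.flatnonzero(online), size=500, replace=False)

# Working model fitted on the sample; least squares is a stand-in for any learner.
Xs = np.column_stack([np.ones(sample.size), x[sample]])
beta, *_ = np.linalg.lstsq(Xs, y[sample], rcond=None)
m_pop = beta[0] + beta[1] * x

est, w = model_calibration_mean(y[sample], m_pop[sample], m_pop)
print("population mean:", y.mean())
print("naive sample mean:", y[sample].mean())
print("model calibration estimate:", est)
```

Any learner that produces predictions for every population unit could replace the least-squares step; only `m_pop` changes, and the calibration step stays the same.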
Keywords/Search Tags:Non-probability sample, Model calibration estimation, Machine learning, Statistical inference