Research On Design Effect Based On Superpopulation Models

Posted on: 2020-03-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y F Yang
Full Text: PDF
GTID: 1360330596981233
Subject: Applied Statistics
Abstract/Summary:
The design effect is an important index in survey sampling and is key to determining the sample size under complex sampling designs. There are two approaches to studying the design effect: design-based and model-based. In the design-based approach, the population is fixed, the randomness of the sample comes from random selection from the population, and the design effect is obtained from design-based variance estimation. Under a complex sampling design, however, a simple expression for the variance is often unavailable. A closed-form formula for the design effect therefore cannot be obtained, and its value can only be computed by a resampling method (jackknife or bootstrap) or by Taylor linearization. In the model-based approach, the population is regarded as a sample from a superpopulation, and the design effect is estimated from model-based variance estimation. For various sampling designs, including complex ones, once the corresponding superpopulation model is available, a simple expression for the variance can be obtained, a formula for the design effect follows, and the correctness of the formula can be verified by simulation. Research on the design effect based on superpopulation models has received increasing attention from researchers outside China, but there are few studies within China.

This dissertation first does some basic groundwork of review and clarification. Firstly, we systematically sort out the factors that influence the design effect. Through mathematical derivation and simulation, we find that survey variables, estimators, sampling methods and sample size can all have a significant impact on the design effect. Secondly, we study the design-based measurement of the design effect. On the one hand, existing design-based measurement methods are summarized. On the other hand, we study the problem of estimating the variance under simple random sampling from a sample drawn by a complex sampling design. We find that the unbiased estimators given in the literature have no obvious advantage over simple estimators, yet they are harder to derive and more costly to compute. Thirdly, we propose a design-based measurement framework for the design effect. Specifically, when a simple estimator of the variance under simple random sampling does not exist, the original jackknife method can be used to estimate the variance. The framework is presented and a simulation is carried out.

The core work of this dissertation is to enrich and develop, on the basis of previous studies, the superpopulation models corresponding to sampling methods. Based on the random effect model, we first present the superpopulation models corresponding to simple random sampling (SRS), two-stage sampling, unequal probability sampling and stratified sampling. Based on these superpopulation models, we discuss the formulas for the design effects of stratification, clustering and weighting separately, and derive formulas for the combined design effects of two factors and of three factors. The formulas show that the combined design effect of multiple factors equals the product of the design effects of the individual factors. The validity of the formulas is verified by Monte Carlo simulation. The derived formulas are consistent with the traditional formulas given by Kish et al., and they are richer in content than the traditional formulas. Through these formulas, the dissertation decomposes the design effect and deepens the understanding of it.

Based on the superpopulation model, the dissertation also studies the calculation of design effects in several specific cases. When extreme stratification and weighting coexist, the design effect formula underestimates the true design effect; for this problem, an expression for a correction factor is given, and a concise approximate expression for the correction factor is obtained through a large number of simulations. When the weights are correlated with the survey variable, the design effect formula fails; for this problem, the Taylor expansion of a non-linear function is used, and the corresponding formula is derived from the superpopulation model and verified by simulation. The existing formulas for the clustering effect cannot be used under three-stage sampling; for this problem, the corresponding formula is derived from the superpopulation model, and it is consistent with the traditional formulas but more general. In addition, the relationship between the design effects of different domains and the design effect of the whole population is studied based on the superpopulation model, and the corresponding formula is derived and verified by simulation.

Finally, the dissertation discusses the application of these formulas. On the one hand, it discusses how to estimate the parameters in the formulas when only the sampling design and the sample are available, sorts out and gives the estimation method for each formula, and verifies the estimators through simulation. On the other hand, based on the database of women of childbearing age in the Wuhan "1+8" urban circle of Hubei province, and taking women of childbearing age in Qianjiang city as the population, various sampling designs are carried out and the corresponding design effects are calculated, giving an intuitive picture of the design effects of the estimators in an actual population. Some suggestions for reducing the design effect in actual sampling are then presented.

The innovations of this dissertation are as follows. Firstly, the superpopulation models corresponding to the sampling methods are developed. Based on the superpopulation models of simple random sampling and two-stage sampling given in the literature, a general idea for designing superpopulation models is proposed. Following this idea, a series of superpopulation models corresponding to stratified, multi-stage and unequal probability sampling are designed. Secondly, a calculation system for the design effect based on superpopulation models is preliminarily established. Based on superpopulation models, we derive formulas for calculating and estimating the design effect when stratification, clustering and weighting adjustments exist separately or simultaneously, formulas for calculating and estimating the weighting effect when the weights are correlated with the variable, and the formula for the clustering effect under three-stage sampling. Formulas for the relationship between the design effect of each stratum or domain and the total design effect under multiple sampling designs are also given. Together these formulas constitute a preliminary calculation system for the design effect based on superpopulation models. Thirdly, some problems related to design effects are identified and preliminarily solved for the first time. For example, the underestimation of the design effect when extreme stratification and weighting coexist is identified and initially solved by introducing correction factors; the formula for the clustering effect under three-stage sampling is given for the first time on the basis of the superpopulation model; and formulas for the relationship between the design effect of each stratum and the total design effect in stratified sampling are given for the first time.
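As an illustration of the multiplicative decomposition described in the abstract, the following minimal Python sketch computes the classical Kish-style weighting and clustering design effects and multiplies them. It is not the dissertation's derived formulas, which the abstract does not reproduce. The function names, the example weights, the mean cluster size and the intraclass correlation are all hypothetical, and the standard textbook forms are assumed: n * sum(w^2) / (sum(w))^2 for weighting and 1 + (m - 1) * rho for clustering.

```python
from typing import Sequence


def kish_weighting_deff(weights: Sequence[float]) -> float:
    """Kish's approximate design effect due to unequal weights:
    n * sum(w_i^2) / (sum(w_i))^2, which equals 1 + CV^2 of the weights."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def kish_clustering_deff(mean_cluster_size: float, icc: float) -> float:
    """Classical clustering design effect: 1 + (m - 1) * rho, where m is the
    mean number of sampled units per cluster and rho the intraclass correlation."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def combined_deff(weights: Sequence[float], mean_cluster_size: float, icc: float) -> float:
    """Combined design effect as the product of the component effects
    (assumption: the multiplicative form stated in the abstract)."""
    return kish_weighting_deff(weights) * kish_clustering_deff(mean_cluster_size, icc)


def effective_sample_size(n: int, deff: float) -> float:
    """Sample size under simple random sampling with the same precision."""
    return n / deff


if __name__ == "__main__":
    example_weights = [1.0, 1.0, 2.0, 2.0]  # hypothetical weights
    deff = combined_deff(example_weights, mean_cluster_size=10, icc=0.05)
    print(f"combined design effect: {deff:.4f}")
    print(f"effective sample size for n=400: {effective_sample_size(400, deff):.1f}")
```

With equal weights the weighting component reduces to 1, so the combined effect collapses to the clustering effect alone, which is one quick sanity check on the product form.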
Keywords/Search Tags:Design Effects, Superpopulation model, Random Effect Model, Variance estimation