
Estimations Of Disease Prevalence And The Diagnostic Accuracy In The Absence Of A Gold Standard Under The Double-sampling Design

Posted on: 2022-12-21; Degree: Master; Type: Thesis
Country: China; Candidate: Y Q Qu; Full Text: PDF
GTID: 2504306755499604; Subject: Statistics
Abstract/Summary:
In medical research, it is important to estimate disease prevalence and the accuracy parameters of a diagnostic test. If a fallible diagnostic test is used, subjects may be misclassified, which leads to biased estimates, whereas a gold standard prevents misclassification. Gold standards, while free of misclassification, are often expensive and time-consuming; fallible diagnostic tests are cheap and convenient. Double sampling is therefore an effective way to overcome the disadvantages of both kinds of test. Under double sampling, a random sample of individuals from the population of interest is subjected to the fallible test, and then a random sample of individuals is subjected to the gold standard. Because the disease status of individuals who received the gold standard is known accurately, such data are also called partially validated data. Since a true gold standard rarely exists in practice and fallible classifiers are often used in its place, the data obtained through double sampling are then called partially validated data with an imperfect gold standard.

Part of this thesis is based on partially validated data with an imperfect gold standard. Assuming that both classifiers are subject to false negatives and false positives, it studies the estimation of disease prevalence and the evaluation of diagnostic accuracy. From the perspective of confidence intervals, it also develops approximate formulas or efficient algorithms for the sample size that keeps the interval width within a specified range at a given confidence level.

First, a conditional independence model (model I) and a dependence model (model II) are considered. Under both models, two point estimation methods, maximum likelihood estimation and Bayesian estimation, are proposed for the disease prevalence and for the sensitivity and specificity of the tests. Confidence intervals (CIs) are constructed based on the Wald method, the log transformation, the logit transformation, the inverse hyperbolic tangent transformation, and bootstrap resampling; under the conditional independence model, a Bayesian credible interval is also developed. According to the simulation results: (1) for model I, the Bayesian credible interval under the Jeffreys noninformative prior is slightly anti-conservative, while the other intervals show good properties and can be recommended for practical use; (2) for model II, all of the confidence intervals have good properties and can be recommended for practical use.

Second, the thesis investigates sample size determination for estimating disease prevalence from the perspective of CIs, so that the CI width is controlled at a predetermined confidence level. Sample size determination methods are provided based on the Wald CI and the inverse hyperbolic tangent transformation CI of the disease prevalence under both the conditional independence model and the dependence model; under the conditional independence model, a sample size estimator based on the Bayesian credible interval is also developed. According to the simulation results: (1) under the same parameter settings, whichever method is used, the sample size obtained under model I is larger than that obtained under model II; (2) for model I, except for the Bayesian method under the Jeffreys noninformative prior, the empirical coverage probabilities of the methods are close to the prespecified confidence level. For model II, all sample size determination methods have empirical coverage probabilities close to the prespecified confidence level, so these methods are recommended for practical use.

Finally, malaria data are used as an example to illustrate the feasibility and effectiveness of the proposed methods.
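
To illustrate the kinds of interval the abstract lists, the following is a minimal Python sketch for a much simpler setting than the one studied in the thesis: a single binomial proportion with no misclassification. It is not the thesis's model I or model II, and it does not reproduce the thesis's estimators. The function names `proportion_cis` and `wald_sample_size` are hypothetical, and applying the inverse hyperbolic tangent directly to the proportion is an assumption made here for illustration. Each transformed interval is built by the delta method on the transformed scale and then mapped back.

```python
import math
from statistics import NormalDist


def proportion_cis(p_hat, n, level=0.95):
    """Wald-type CIs for a binomial proportion on four scales.

    Simplified illustration only: assumes a fully validated sample
    (no misclassification), unlike the double-sampling models in the text.
    """
    if not 0.0 < p_hat < 1.0:
        raise ValueError("p_hat must lie strictly between 0 and 1")
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    se = math.sqrt(p_hat * (1.0 - p_hat) / n)  # standard error of p_hat

    # Plain Wald interval on the original scale.
    wald = (p_hat - z * se, p_hat + z * se)

    # Log scale: d/dp log(p) = 1/p, so se(log p) is about se / p.
    se_log = se / p_hat
    log_ci = (p_hat * math.exp(-z * se_log), p_hat * math.exp(z * se_log))

    # Logit scale: d/dp logit(p) = 1 / (p (1 - p)).
    centre = math.log(p_hat / (1.0 - p_hat))
    se_logit = se / (p_hat * (1.0 - p_hat))
    expit = lambda x: 1.0 / (1.0 + math.exp(-x))
    logit_ci = (expit(centre - z * se_logit), expit(centre + z * se_logit))

    # Inverse hyperbolic tangent scale: d/dp atanh(p) = 1 / (1 - p^2).
    t = math.atanh(p_hat)
    se_t = se / (1.0 - p_hat ** 2)
    atanh_ci = (math.tanh(t - z * se_t), math.tanh(t + z * se_t))

    return {"wald": wald, "log": log_ci, "logit": logit_ci, "atanh": atanh_ci}


def wald_sample_size(p_guess, width, level=0.95):
    """Smallest n whose Wald CI has total width at most `width`.

    Solves 2 * z * sqrt(p (1 - p) / n) <= width for n, given a planning
    value p_guess for the prevalence (an assumption supplied by the user).
    """
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return math.ceil(4.0 * z ** 2 * p_guess * (1.0 - p_guess) / width ** 2)


if __name__ == "__main__":
    for name, (lower, upper) in proportion_cis(0.2, 100).items():
        print(f"{name:>5}: ({lower:.4f}, {upper:.4f})")
    print("n for total width 0.1:", wald_sample_size(0.2, 0.1))
```

In this simplified case the logit interval always stays inside (0, 1), whereas the Wald, log, and inverse hyperbolic tangent intervals can cross 0 or 1 when the estimate is near a boundary. This is one common reason for comparing several constructions by simulation.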
Keywords/Search Tags:Disease prevalence, Sensitivity, Specificity, Partially validated data with imperfect gold standard, Confidence interval, Sample size