Semiparametric Regression Analysis Of Current Status Data With Misclassification

Posted on:2024-07-31

Degree:Master

Type:Thesis

Country:China

Candidate:L J Fang


GTID:2557307067478134

Subject:Statistics

Abstract/Summary:

Current status data are commonly encountered in various scientific fields, including epidemiological studies and clinical trials. Such data arise when each subject under study is examined only once, so that one knows only the failure status of the event of interest at the examination time rather than the exact failure time. When the disease is of low prevalence, group testing is a commonly used strategy to reduce screening cost and time. This strategy works by first amalgamating specimens (e.g., blood, urine) from several individuals into a pool and then testing the pooled specimen to examine the infection statuses of the individuals. When the endpoint of interest is a time-to-event outcome (e.g., the time to infection) and the infection status of the pooled individuals is examined once through a test, we obtain group-tested current status data. If the diagnostic procedure used to detect the disease is imperfect and subject to testing error, one further obtains misclassified current status data or group-tested current status data with outcome misclassification. These two types of data are frequently encountered in survival analysis, have complex structures, and contain limited information. Therefore, it is challenging to conduct regression analysis to investigate covariate effects on the failure time of interest.

This thesis first conducts regression analysis of misclassified current status data with the semiparametric probit model, which is one of the commonly used semiparametric models in survival analysis and has recently received considerable attention. We consider nonparametric maximum likelihood estimation and develop an expectation-maximization (EM) algorithm that incorporates the generalized pool-adjacent-violators (PAV) algorithm to maximize the intractable likelihood function. The resulting estimators of the regression parameters are shown to be consistent, asymptotically normal, and semiparametrically efficient. Furthermore, the numerical results in simulation studies indicate that the proposed method performs satisfactorily in finite samples and outperforms the naive method that ignores misclassification. The proposed method is applied to a real dataset on chlamydia infection collected by the Nebraska Public Health Laboratory.

The second part of this thesis discusses regression analysis of misclassified group-tested current status data with the semiparametric probit model. For estimation, a sieve maximum likelihood approach is developed that approximates the nonparametric nuisance function with logarithmic monotone splines. To facilitate the sieve estimation, we develop an efficient EM algorithm by using three-stage data augmentation. The asymptotic properties of the resulting estimators are investigated via empirical process techniques and sieve estimation theory. Results from extensive numerical studies suggest that the proposed method performs well and has some desirable advantages over the estimation method based on individual testing results. An application to a set of chlamydia data provided by the State Hygienic Laboratory at the University of Iowa also demonstrates the practical usefulness of the proposed method.
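To make the data structure described above concrete, the following is a minimal sketch in Python, not the thesis's estimation procedure. It assumes a probit parameterization P(T <= c | x) = Phi(alpha(c) + x'beta) with alpha nondecreasing, and a test whose sensitivity `se` and specificity `sp` are known and do not depend on pool size; these are illustrative assumptions, and all function names are hypothetical. The `pava` function is the ordinary weighted pool-adjacent-violators algorithm, shown only to illustrate how a monotonicity constraint can be enforced; the generalized PAV step inside the thesis's EM algorithm is not reproduced here.

```python
import math


def probit_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def failure_prob(alpha_c, x, beta):
    """P(T <= c | x) under the assumed probit form Phi(alpha(c) + x'beta)."""
    return probit_cdf(alpha_c + sum(xj * bj for xj, bj in zip(x, beta)))


def positive_test_prob(fail_probs, se, sp):
    """P(test reads positive) for one pool; a pool of size 1 is individual testing.

    The pool is truly positive if at least one member has failed by its
    examination time. se and sp are assumed known and free of pool size.
    """
    all_negative = math.prod(1.0 - p for p in fail_probs)
    return se * (1.0 - all_negative) + (1.0 - sp) * all_negative


def log_likelihood(pools, beta, se, sp):
    """Observed-data log-likelihood.

    pools: list of (y, members), where y is the 0/1 test result and members
    is a list of (alpha_at_exam_time, covariates) pairs (hypothetical layout).
    """
    ll = 0.0
    for y, members in pools:
        probs = [failure_prob(a, x, beta) for a, x in members]
        p = positive_test_prob(probs, se, sp)
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return ll


def pava(values, weights):
    """Ordinary weighted PAV: nondecreasing weighted least-squares fit."""
    blocks = []  # each block holds [weighted mean, total weight, size]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Merge adjacent blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, n2 = blocks.pop()
            m1, w1, n1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, n1 + n2])
    fit = []
    for mean, _, size in blocks:
        fit.extend([mean] * size)
    return fit


# Hypothetical toy input: one positive pool of two subjects and one
# negative individual test, each with a single covariate.
toy_pools = [
    (1, [(-0.5, [1.0]), (0.2, [0.0])]),
    (0, [(-1.0, [0.0])]),
]
toy_ll = log_likelihood(toy_pools, beta=[0.3], se=0.95, sp=0.98)
```

Setting `se = sp = 1` recovers the likelihood without misclassification, which corresponds to what a naive analysis that ignores testing error would maximize.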

Keywords/Search Tags:

Current status data, EM algorithm, Group testing, Maximum likelihood estimation, Misclassification

Related items

1	Regression Analysis Of Misclassified Current Status Data
2	EM Algorithm Parameter Estimation Problem Based On Group Testing
3	Statistical Analyses And Applications For Missing Data Based On EM Algorithm
4	Order Restricted Weighted Estimation Method In Mixed Tests
5	Maximum Likelihood Estimation And Application Of Multi-parameter Odd Log-Logistic Generalized Gompertz Model For Complex Censored Data
6	Research On Parameter Regression Model Of Interval Censored Data In Proportional Hazard Model
7	Research On Semantic Matching Algorithm Of Table Tennis Question Answering System Based On Knowledge Graph
8	Statistical Inference Of Competitive Risk Models Under Complex Censored Data
9	A Chen-Lindley Distribution: Statistical Inference And Its Applications
10	Theory And Application Of The AR(p) Model Of Matrix Cross-Section Data Time Series