| In recent years, failure time data have been frequently encountered in various scientific fields such as reliability engineering, public health, biology, and demography. In the analysis of failure time data, one has to deal with various types of censoring, with right censoring being the most widely studied. Other common types include interval censoring and partly interval censoring. In addition to censoring, left truncation is also frequently encountered in failure time studies, meaning that individuals are recruited into the study only after experiencing a certain initial event, so that the collected data are not a random sample from the target population. Further, when the left truncation time follows a uniform distribution, the collected data are usually referred to as length-biased data, a special case of left-truncated data. In regression analysis, some continuous covariates may be subject to measurement error. Clearly, different types of data require different statistical inference methods. In practical problems, the data encountered may not be limited to a single type, potentially involving both censoring and left truncation, which makes the data structure more complex and the analysis more difficult. Addressing such data poses not only significant theoretical challenges but also substantial computational difficulties. In this thesis, we conduct an in-depth analysis of complex interval-censored data and provide corresponding statistical modeling methods. The thesis includes four chapters. Chapter 1 serves as an introduction, providing an overview of the research background of complex interval-censored data and outlining the structure of the thesis. Chapter 2 investigates the variable selection problem for interval-censored data with length-biased sampling under the proportional hazards model. We consider different penalty functions and propose a universal variable selection method to eliminate irrelevant
variables. Under some regularity conditions, we prove the consistency and oracle properties of the proposed variable selection method. Simulation results show that the proposed method performs well in selecting important covariates and estimating the regression coefficients. The method is then applied to a prostate cancer study, illustrating its practical relevance. Chapter 3 discusses the semi-parametric regression analysis of partly interval-censored data under length-biased sampling. Under the semi-parametric proportional hazards model, this chapter uses non-parametric maximum likelihood estimation, treating the baseline cumulative hazard function as a step function. It employs an EM algorithm based on two-stage data augmentation to simplify computation and obtain the parameter estimates. Under some regularity conditions, we establish the consistency, asymptotic normality, and semi-parametric efficiency of the resulting estimators. Numerical simulation results show that the proposed method performs well and is more efficient than the conditional likelihood method. This chapter also applies the proposed method to an AIDS cohort, showcasing its practical utility. Chapter 4 investigates regression analysis of interval-censored data with covariate measurement errors. The chapter assumes an additive measurement error model for the observed covariates and a semi-parametric transformation model for the underlying failure time. Under these assumptions, the chapter proposes an estimation method based on the simulation-extrapolation approach to correct for measurement errors. It proves that the resulting estimators are consistent and asymptotically normal. Simulation studies confirm that the proposed method performs well in practice. Specifically, compared with the naive method that ignores covariate measurement errors, the proposed method can effectively reduce estimation bias. Finally, the chapter applies the proposed method to a set of real data from a
study of decompression sickness. |