
Statistical Modeling Analysis Of Censored Data Based On Model Averaging

Posted on: 2024-10-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C Wang
Full Text: PDF
GTID: 1520307340977009
Subject: Statistics
Abstract/Summary:
Censored data arise extensively in applied fields such as medicine, biology, reliability engineering, actuarial science, and demography. Censoring means that the observation of an individual's survival time is incomplete: the survival time is known only to fall within a certain range rather than being observed exactly. Possible censoring schemes include right censoring, where it is known only that the subject is still alive at a given time, and interval censoring, where the only information is that the event occurs within an interval. One type of interval censoring is case I interval censoring, also known as current status data, which typically occurs when each study subject is observed only once and the only information about the event of interest is whether it occurred before the observation time. In this dissertation, we propose model averaging methods to improve the accuracy of predictions with censored data. The research covers three main areas: model averaging in semi-parametric transformation models for right-censored data, model averaging in survival trees for right-censored data, and model averaging in additive hazards models for current status data.

First, we consider model averaging methods for right-censored data under semi-parametric transformation models. Variable selection in these models has received significant attention in the literature. However, traditional variable selection methods do not account for the randomness of the model selection stage, which may lead to an underestimation of variance and pose risks to statistical inference and prediction. To improve the accuracy of survival probability predictions, we study model averaging techniques within this model framework. Specifically, we employ a K-fold cross-validation method and develop a new cross-validation criterion to determine the model weights. Simulation studies and a case analysis of lung cancer data demonstrate the feasibility and effectiveness of our method.

Second, extensive research has been devoted to the random survival forest, a statistical tool that uses survival trees as base learners to analyze right-censored data. However, the predictive performance of these models is not always optimal and leaves room for improvement. To address this, we propose two model averaging methods based on martingale residuals. Specifically, we define in-bag and out-of-bag data processes and establish two weight selection criteria. We then implement a greedy algorithm to compute the weights efficiently. Numerical simulations demonstrate that the proposed methods offer higher predictive accuracy and robustness than model-based regularization methods and random forests that assign equal weights to all trees. Finally, we apply the methods to a study of mantle cell lymphoma.

Last, we explore model averaging methods for current status data. Despite extensive discussion of variable selection for failure time data, these methods often overlook the model uncertainty inherent in the selection process, which can lead to inaccurate statistical predictions. To improve the accuracy of predictions of risk quantities such as the survival probability, we propose two optimal model averaging methods within the semi-parametric additive hazards model framework for current status data. Using the martingale-residuals process, we define a leave-one-out cross-validation procedure and derive two cross-validation criteria for model weight selection. The effectiveness and superiority of our methods are validated through a series of simulation studies, and an application to Alzheimer’s disease data further illustrates their practical utility.
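The general idea of choosing model averaging weights by minimizing a cross-validation criterion with a greedy algorithm can be illustrated with a minimal Python sketch. It is not the dissertation's procedure: the abstract does not give the martingale-residual criteria, so a plain squared-error loss on held-out predictions stands in for them. The function names `cv_loss` and `greedy_weights`, the input layout (one row of held-out predictions per observation, one column per candidate model), and the step size `1/step` are all assumptions made for illustration.

```python
def cv_loss(weights, cv_preds, targets):
    """Mean squared error of the weighted combination of held-out predictions.

    Assumption: squared error replaces the dissertation's martingale-residual
    criteria, which the abstract does not spell out.
    """
    total = 0.0
    for row, y in zip(cv_preds, targets):
        combined = sum(w * p for w, p in zip(weights, row))
        total += (y - combined) ** 2
    return total / len(targets)


def greedy_weights(cv_preds, targets, n_steps=50):
    """Greedy search for weights on the simplex (non-negative, summing to 1).

    At each step, one unit of weight is given to the candidate model whose
    inclusion yields the smallest cross-validation loss; the final weights
    are the selection frequencies. Ties go to the lowest model index.
    """
    n_models = len(cv_preds[0])
    counts = [0] * n_models  # how many times each model has been selected
    for step in range(1, n_steps + 1):
        best_j, best_loss = None, float("inf")
        for j in range(n_models):
            # Trial weights if model j received the next unit of weight.
            trial = [(c + (1 if k == j else 0)) / step
                     for k, c in enumerate(counts)]
            loss = cv_loss(trial, cv_preds, targets)
            if loss < best_loss:
                best_j, best_loss = j, loss
        counts[best_j] += 1
    return [c / n_steps for c in counts]


if __name__ == "__main__":
    # Toy held-out predictions: model 0 overshoots by 1, model 1 undershoots by 1.
    targets = [1.0, 2.0, 3.0, 4.0]
    cv_preds = [[y + 1.0, y - 1.0] for y in targets]
    print(greedy_weights(cv_preds, targets))
```

In this sketch the held-out predictions would come from a K-fold or leave-one-out split, with each candidate model refitted on the training folds. Because the weights are selection frequencies, they stay non-negative and sum to one without an explicit constrained optimizer.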
Keywords/Search Tags:Censored data, High-dimensional data, Model averaging, Cross validation, Martingale-residuals process, Greedy algorithm