
Flexible Statistical Learning Methods for Survival Data: Risk Prediction and Optimal Treatment Decision

Posted on: 2014-09-23
Degree: Ph.D.
Type: Dissertation
University: North Carolina State University
Candidate: Geng, Yuan
Full Text: PDF
GTID: 1454390005984797
Subject: Statistics
Abstract/Summary:
In survival analysis, the primary endpoint of interest is a time-to-event outcome, which is usually subject to censoring. Among the various problems in this area, this dissertation focuses on two: survival risk prediction and optimal treatment decision.

In Chapter 2, we propose a new model-free machine learning method for risk classification and survival probability prediction, which play an important role in patients' risk stratification, long-term diagnosis, and treatment selection. The proposed method is based on weighted support vector machines (wSVMs) equipped with the inverse probability of censoring weighting (IPCW) technique. The new approach does not require any specific parametric or semiparametric model assumption and is therefore robust to model misspecification. In addition, it can capture nonlinear covariate effects when a flexible kernel function is used. We present numerous simulation examples to demonstrate the finite-sample performance of the proposed method under different settings. Applications to a glioma tumor data set and a breast cancer gene expression data set further illustrate the methodology in real data analysis.

In Chapter 3, we are interested in finding the best treatment rules, namely those that maximize patients' mean survival time. Because patients respond heterogeneously to treatments, great effort has been devoted to developing optimal treatment regimes that integrate individuals' clinical and genetic information. A main challenge is selecting the important variables needed to build reliable and interpretable optimal treatment regimes, since the dimension of the predictors may be high. We propose a robust loss-based estimation framework that can be easily coupled with shrinkage penalties for both estimation and variable selection. We study the asymptotic properties of the proposed estimators of the regression coefficients and of the associated estimate of the restricted mean log survival time under the derived optimal treatment regimes. Simulations are conducted to assess the empirical performance of the proposed method for parameter estimation, variable selection, and optimal treatment decision. An application to survival data from an AIDS clinical trial is also given to illustrate the method.
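
As an illustration of the IPCW idea behind the Chapter 2 method, the following is a minimal sketch of how censoring weights for a risk-classification problem at a fixed time point might be computed. It is not taken from the dissertation: the function names (`censoring_survival`, `g_at`, `ipcw_labels_weights`), the time horizon `t0`, and the specific weighting scheme (a commonly used form in which a subject with known status at `t0` is weighted by the inverse of the Kaplan-Meier estimate of the censoring survival function) are all assumptions made for this example. Ties between event and censoring times are not treated specially.

```python
from typing import List, Tuple


def censoring_survival(times: List[float], events: List[int]) -> List[Tuple[float, float]]:
    """Kaplan-Meier estimate of G(t) = P(C > t), the censoring survival function.

    Censored observations (events[i] == 0) play the role of "events" here.
    Returns (time, G(time)) steps sorted by time.
    """
    at_risk = len(times)
    g = 1.0
    steps = []
    for t in sorted(set(times)):
        n_at_t = sum(1 for ti in times if ti == t)
        n_cens = sum(1 for ti, di in zip(times, events) if ti == t and di == 0)
        if n_cens > 0:
            g *= 1.0 - n_cens / at_risk
        steps.append((t, g))
        at_risk -= n_at_t
    return steps


def g_at(steps: List[Tuple[float, float]], t: float, left: bool = False) -> float:
    """Evaluate G(t), or the left limit G(t-) when left=True."""
    g = 1.0
    for s, val in steps:
        if s < t or (s == t and not left):
            g = val
        else:
            break
    return g


def ipcw_labels_weights(times: List[float], events: List[int], t0: float):
    """Assumed scheme: label +1 if still event-free at t0, -1 if the event
    occurred by t0, and 0 (with weight 0) if censored before t0, because
    the status at t0 is then unknown."""
    steps = censoring_survival(times, events)
    labels, weights = [], []
    for ti, di in zip(times, events):
        if ti > t0:
            labels.append(1)
            weights.append(1.0 / g_at(steps, t0))
        elif di == 1:
            labels.append(-1)
            weights.append(1.0 / g_at(steps, ti, left=True))
        else:
            labels.append(0)
            weights.append(0.0)
    return labels, weights


# Toy data (hypothetical): observed times and event indicators (1 = event, 0 = censored).
times = [2.0, 3.0, 5.0, 7.0, 8.0]
events = [1, 0, 1, 0, 1]
labels, weights = ipcw_labels_weights(times, events, t0=6.0)
print(labels)
print(weights)
```

Under these assumptions, the subjects with nonzero weight and their labels could then be passed to any classifier that accepts per-observation weights, for example through the `sample_weight` argument of scikit-learn's `SVC.fit`, which is one way to obtain a weighted SVM with a flexible kernel.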
Keywords/Search Tags: Survival, Data, Optimal treatment, Method, Risk, Prediction