| We consider two regularization approaches, the least absolute shrinkage and selection operator (LASSO) and threshold gradient directed regularization (TGDR), for variable selection and estimation in the partially linear accelerated failure time (PL-AFT) model. The PL-AFT model has two regression components: a linear component for high-dimensional covariates, such as gene expressions, and a nonparametric component for other low-dimensional covariates. Our work is motivated by studies that investigate the relationship between survival and genomic measurements together with other variables, such as clinical or environmental covariates. To obtain unbiased estimates of genomic effects, it is necessary to account for these covariates, whose effects on survival can be highly nonlinear and are often best modeled nonparametrically. We use Stute's weighted least squares method to construct the loss function, which uses Kaplan-Meier weights to account for censoring. V-fold cross-validation is used for tuning parameter selection. Our simulations show that LASSO and TGDR perform well when there are no censored observations, even when the number of covariates is larger than the sample size. As the censoring rate increases, the percentage of truly non-zero regression coefficients identified as non-zero by LASSO and TGDR decreases, and the bias of the LASSO and TGDR estimates increases. When the censoring rate reaches 50%, the bias of the LASSO estimates increases dramatically, and the TGDR estimates are worse still, in that most of the estimates of non-zero coefficients are close to 0. Overall, LASSO performs better than TGDR when the outcome variable is subject to censoring. We propose a refit approach to correct the bias in the LASSO estimates. Our simulations show that this refit approach can correct most of the bias, especially for large coefficients. We also propose an approximation for the variance of the estimator from the refit approach. 
We apply the PL-AFT model to two real examples: mantle cell lymphoma (MCL) data and lung adenocarcinoma data. The results show that the genes identified are likely relevant to MCL and lung adenocarcinoma, respectively. |
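
As a rough illustration of the Kaplan-Meier weights that enter Stute's weighted least squares loss, the sketch below computes them as the jumps of the Kaplan-Meier estimator at the ordered observed times. The function name `kaplan_meier_weights`, its arguments, and the tie-breaking rule (events placed before censored observations at equal times) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def kaplan_meier_weights(time, event):
    """Kaplan-Meier (Stute) weights for right-censored data.

    time  : observed times (event or censoring time)
    event : 1 if the event was observed, 0 if censored
    Returns one weight per observation, in the original order.
    Hypothetical helper for illustration only.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=float)
    n = len(time)

    # Sort by time; at tied times, put events before censored observations
    # (an assumed convention).
    order = np.lexsort((1.0 - event, time))
    d = event[order]

    w_sorted = np.zeros(n)
    surv = 1.0  # running product over earlier ranks
    for i in range(n):  # i is 0-based, so the rank is i + 1
        at_risk = n - i
        # Jump of the Kaplan-Meier estimator at this ordered observation.
        w_sorted[i] = surv * d[i] / at_risk
        # Update the product with ((n - rank) / (n - rank + 1)) ** delta.
        surv *= ((at_risk - 1) / at_risk) ** d[i]

    # Map the weights back to the original observation order.
    w = np.zeros(n)
    w[order] = w_sorted
    return w

# Small usage example: the second observation is censored.
weights = kaplan_meier_weights([1.0, 2.0, 3.0, 4.0], [1, 0, 1, 1])
```

Censored observations receive weight zero, and their mass is redistributed to later uncensored observations. With no censoring, every weight equals 1/n, and the weighted loss reduces to ordinary least squares on the log survival times.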