
Variable Selection for General Transformation Models

Posted on: 2012-10-25
Degree: Ph.D.
Type: Thesis
University: The Chinese University of Hong Kong (Hong Kong)
Candidate: Li, Jianbo
Full Text: PDF
GTID: 2450390011956510
Subject: Statistics
Abstract/Summary:
General transformation models are a class of semiparametric survival models. They generalize simple transformation models and offer more flexibility in modeling data that arise in statistical practice. They include many popular survival models as special cases, e.g., the Cox proportional hazards regression model, proportional odds models, generalized probit models, frailty survival models and heteroscedastic hazard regression models. Although the maximum marginal likelihood estimate of the parameters in general transformation models with interval censored data is very satisfactory, its large sample properties remain open. In this thesis, we consider this problem and use a discretization technique to establish the large sample properties of the maximum marginal likelihood estimates with interval censored data.

In general, to reduce possible model bias, many covariates are collected into a model, so a high-dimensional regression model is built. At the same time, some non-significant variables may also be included. One of the tasks in building an efficient survival model is therefore to select the significant variables. In this thesis, we focus on variable selection for general transformation models with ranking data, right censored data and interval censored data. Ranking data are widely seen in epidemiological studies, population pharmacokinetics and economics. Right censored data are the most common type of data in clinical trials. Interval censored data are another common type of data in medical, financial, epidemiological, demographic and sociological studies. For example, suppose a patient visits a doctor on a prespecified schedule. At the previous visit, the doctor did not find that the event of interest had occurred, but at the current visit the doctor finds that it has. The exact occurrence time of the event is then censored in the interval bracketed by the two consecutive visit dates.
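The doctor-visit example can be made concrete with a small sketch. The function name `censoring_interval` and the conventions used here (0 as the lower bound before the first visit, infinity as the upper bound after the last visit) are assumptions for illustration and are not taken from the thesis.

```python
import math
from bisect import bisect_left


def censoring_interval(visit_times, event_time):
    """Return the (left, right] interval that brackets an unobserved event time.

    left  = last visit at which the event had not yet occurred
    right = first visit at which the event is found to have occurred
    """
    visits = sorted(visit_times)
    # Index of the first visit that takes place at or after the event.
    idx = bisect_left(visits, event_time)
    # Assumption: time 0 is the study start if the event precedes every visit.
    left = visits[idx - 1] if idx > 0 else 0.0
    # Assumption: no visit detects the event -> right censored at the last visit.
    right = visits[idx] if idx < len(visits) else math.inf
    return left, right


# Visits at months 3, 6, 9 and 12; the event truly occurs at month 7.4.
print(censoring_interval([3, 6, 9, 12], 7.4))
```

Only the bracketing pair of visit dates would be recorded in the data; the true event time of 7.4 is never observed. An event after the last visit yields an infinite right endpoint, which is how right censoring appears as a special case of interval censoring.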
Based on a rank-based penalized log-marginal likelihood approach, we propose a unified variable selection procedure for all three types of data mentioned above. In the penalized marginal likelihood function, we consider non-concave and adaptive LASSO (ALASSO) penalties. For the non-concave penalties, we adopt the HARD thresholding, SCAD and LASSO penalties. ALASSO is an extension of LASSO. Its key feature is that it assigns weights to effects adaptively according to the importance of the corresponding covariates, and it has therefore received more attention recently. By incorporating a Markov chain Monte Carlo stochastic approximation (MCMC-SA) algorithm, we also propose a unified algorithm to find the rank-based penalized maximum marginal likelihood estimates. Based on a numerical approximation of the marginal likelihood function, we propose two evaluation criteria, approximated GCV and BIC, to select proper tuning parameters. Using the procedure, we can not only select important variables but also estimate the corresponding effects simultaneously. An advantage of the proposed procedure is that it is baseline-free and censoring-distribution-free. Under some regularity conditions and with proper penalties, we establish the √n-consistency and oracle properties of the penalized maximum marginal likelihood estimates. We illustrate the proposed procedure with simulation studies and real data examples. Finally, we extend the procedure to analyze stratified survival data.

Keywords: General transformation models; Marginal likelihood; Ranking data; Right censored data; Interval censored data; Variable selection; HARD; SCAD; LASSO; ALASSO; Consistency; Oracle.
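The abstract names four penalties without giving their formulas. As a rough sketch, the code below uses the forms commonly given in the variable selection literature (hard thresholding and SCAD as in Fan and Li, adaptive LASSO as in Zou). These forms, the function names, the default `a = 3.7` and the weight exponent `gamma` are assumptions and may differ from the definitions used in the thesis.

```python
def lasso_penalty(theta, lam):
    """LASSO: lam * |theta|."""
    return lam * abs(theta)


def alasso_penalty(theta, lam, initial_estimate, gamma=1.0):
    """Adaptive LASSO: lam * w * |theta| with weight w = 1 / |initial_estimate|**gamma.

    A large initial estimate (an apparently important covariate) gets a small
    weight and is penalized less. initial_estimate must be nonzero.
    """
    weight = 1.0 / abs(initial_estimate) ** gamma
    return lam * weight * abs(theta)


def hard_penalty(theta, lam):
    """Hard thresholding: lam^2 - (|theta| - lam)^2 when |theta| < lam, else lam^2."""
    t = abs(theta)
    if t < lam:
        return lam ** 2 - (t - lam) ** 2
    return lam ** 2


def scad_penalty(theta, lam, a=3.7):
    """SCAD: linear near zero, quadratic in the middle, constant for large |theta|."""
    t = abs(theta)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2 * a * lam * t - t ** 2 - lam ** 2) / (2 * (a - 1))
    return lam ** 2 * (a + 1) / 2


# Compare how each penalty grows with the size of the coefficient.
for theta in (0.5, 2.0, 5.0):
    print(theta,
          lasso_penalty(theta, 1.0),
          hard_penalty(theta, 1.0),
          scad_penalty(theta, 1.0))
```

The LASSO penalty keeps growing with the coefficient, whereas the hard thresholding and SCAD penalties level off, so large effects are not shrunk further. In a penalized fit, the sum of these penalties over all coefficients would be subtracted from the log-marginal likelihood (up to a scaling by the sample size) before maximization; that step is not sketched here.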
Keywords/Search Tags:Models, Variable selection, Interval censored data, Marginal likelihood, ALASSO