| As a commonly used survival model that takes survival time as the response, the accelerated failure time model has been widely used for survival time prediction because of its intuitive interpretability. With the continuing development of genomics, combining clinical data with high-dimensional gene expression data to predict patients' survival prognosis has become a prominent topic in survival analysis. However, when the model contains many redundant predictors and the relationship between the predictors and the response is very complex, constructing an accurate accelerated failure time predictive model in high-dimensional settings is very challenging. For survival data, studying accurate prediction with semiparametric accelerated failure time models can enrich and develop semiparametric modeling theory and methods, and it has great practical value for accurately predicting disease risk and survival time and for formulating precise treatment plans for high-risk, high-incidence diseases. In this thesis, we use a kernel machine to capture the complex relationship between the predictors and the response, combined with the LASSO penalty to eliminate redundant predictors, and we construct kernel-based accelerated failure time models for two cases: one in which the random error follows a specified distribution, and one in which the error distribution is unknown. For each case we propose a new Regularized Garrotized Kernel Machine estimation method and develop a "one-group-at-a-time coordinate descent" iterative algorithm. The main advantages of the proposed methods are that they better describe potential nonlinear relationships between the predictors and the response and automatically model interaction effects between predictors while eliminating redundant predictors, thereby improving the predictive accuracy of the model. In this thesis, the finite-sample performance of the proposed methods is investigated through 
numerical simulations. The results show that, compared with existing representative methods, the Regularized Garrotized Kernel Machine estimation methods achieve higher prediction accuracy under both known and unknown error distributions, especially when the relationship between the predictors and the response is very complex. Using the clinical and gene expression information in gastric cancer data and Mantle Cell Lymphoma data, the proposed methods are applied to predict patients' survival and risk scores. The empirical results show that the two proposed methods can provide a useful reference for the risk classification of patients and the design of personalized treatment plans, which demonstrates the practicality of the proposed methods. |