| Doubly interval-censored data are common in medicine, epidemiology, economics, and other fields, and are an important data type in survival analysis. For example, in an AIDS cohort study, hemophiliac patients who had been transfused with contaminated blood were observed intermittently. The purpose of the study was to determine how the amount of contaminated blood a patient received affected the AIDS incubation period, which is the time between HIV-1 infection and the onset of AIDS. However, neither of these two events can be observed directly in the study; only the interval in which each occurred can be determined, which gives rise to doubly interval-censored data. At present, most research on doubly interval-censored data focuses on parameter estimation, and there are relatively few studies on variable selection. Therefore, this thesis studies variable selection for the Cox proportional hazards model under doubly interval-censored data. The research contents are as follows. First, for the Cox proportional hazards model, Bernstein polynomials are used to approximate and parameterize the baseline hazard function. Based on this, the likelihood function under doubly interval-censored data is derived, and the BAR (broken adaptive ridge) method is then used to select variables. To reduce the computational burden, the thesis combines the Newton-Raphson method with iterative least squares to recast the penalized likelihood problem as a least squares estimation problem, thereby realizing variable selection. The effectiveness and feasibility of the proposed method are then verified by numerical simulation and a real-data example. The simulation results show that, compared with the other variable selection methods considered in this thesis, the BAR method is more stable in selecting variables, with the fewest incorrectly selected variables and the smallest MMSE. Second, for the varying-coefficient Cox proportional hazards model, B-splines are used to approximate the time-varying coefficients, and Bernstein polynomials are used to approximate the baseline hazard function. Based on this, the likelihood function of the varying-coefficient Cox proportional hazards model under doubly interval-censored data is derived. The AGLasso (adaptive group lasso) penalty is used to construct the penalized likelihood function, which is then optimized with the iterative group shooting algorithm to realize variable selection for the varying-coefficient Cox proportional hazards model and to identify the important variables. Numerical simulation and a real-data example are used to verify the validity and feasibility of the variable selection method. The results show that the proposed method correctly selects the important variables with high probability under different sample sizes, numbers of parameters, and censoring rates. Compared with the BAR method, the AGLasso method is more accurate in selecting variables. Finally, the thesis concludes with a summary and directions for future research. |
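
As a rough illustration of the Bernstein polynomial approximation mentioned above, the following minimal sketch evaluates a Bernstein basis on a time interval and uses it to build a positive baseline hazard. The function names `bernstein_basis` and `baseline_hazard`, the interval bounds `lower` and `upper`, and the choice of `exp(phi_k)` as coefficients (used here only to keep the hazard positive) are assumptions made for this example and are not necessarily the parameterization used in the thesis.

```python
from math import comb, exp


def bernstein_basis(t, degree, lower, upper):
    """Return the degree + 1 Bernstein basis values at time t on [lower, upper]."""
    if not lower <= t <= upper:
        raise ValueError("t must lie in [lower, upper]")
    # Rescale t to the unit interval, where the basis is defined.
    x = (t - lower) / (upper - lower)
    return [
        comb(degree, k) * x**k * (1.0 - x) ** (degree - k)
        for k in range(degree + 1)
    ]


def baseline_hazard(t, phi, lower, upper):
    """Approximate a baseline hazard as sum_k exp(phi_k) * B_k(t).

    Assumption for this sketch: exponentiating the unconstrained
    parameters phi_k keeps every coefficient, and hence the hazard,
    strictly positive.
    """
    degree = len(phi) - 1
    basis = bernstein_basis(t, degree, lower, upper)
    return sum(exp(p) * b for p, b in zip(phi, basis))
```

Because the Bernstein basis functions sum to one at every point of the interval, setting all `phi_k` to zero gives a constant hazard of 1; other values of `phi` bend the curve while keeping it positive, so the unknown function is reduced to a finite set of unconstrained parameters that can be estimated together with the regression coefficients.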