Survival analysis studies the relationship between survival time or outcome and the factors that influence it. Commonly used tools include the survival function, the lifetime distribution function, and the proportional hazards model. These tools remain within traditional statistics, and their prediction accuracy is unsatisfactory on some real-world data. For short-term prediction of survival time, the semi-parametric Cox model can be used. However, the Cox model requires the proportional hazards assumption, which is very strong and is unlikely to hold for noisy real-world data. Machine learning models place far fewer restrictions on the data distribution and offer a richer hypothesis space and stronger fitting ability. Researchers at home and abroad have studied survival time prediction with random forests, boosting tree algorithms, and BP neural networks. However, machine learning has developed rapidly in recent years, and these studies are now dated. Newer optimization algorithms and the newer XGBoost model have not yet been applied to survival analysis. XGBoost is a recent boosting tree tool that predicts very well across a large number of applications compared with other general machine learning frameworks.

This thesis mainly attempts to introduce the XGBoost model into the research and application of survival analysis. In addition, we train a BP network with the Adam algorithm and compare it with the SGD algorithm. We then carry out numerical experiments to verify the performance of the new model and algorithm. In Chapter 4, we improve the XGBoost algorithm: a norm penalty of the kind used to counter overfitting in machine learning is introduced into the objective function of the original XGBoost model, and solving the new objective yields a new boosting tree algorithm. The thesis also adds the hazard function from survival analysis to the training data fed into the XGBoost model. This gives a composite model that combines a statistical method with a machine learning algorithm and has better accuracy and stability than XGBoost alone. Finally, the results of the various methods on different data sets are presented to demonstrate the effectiveness of the new method.