Heart failure, also known as "myocardial failure", is the inability of the heart to pump a blood supply commensurate with venous return and the metabolic needs of the body's tissues. It is characterized by a series of signs and symptoms and is associated with high mortality and frequent relapse. In clinical practice, estimating the probability of readmission and the post-discharge survival status of heart failure patients, so that patients can seek care earlier and clinicians can formulate a more evidence-based treatment plan from the predictions, is an important means of preventing rapid deterioration and avoiding a missed treatment window, thereby reducing medical expenses, morbidity, and mortality. In this paper, we model and analyze a data set of elderly heart failure inpatients from the perspectives of machine learning and survival analysis. The main work of this paper is as follows.

1. Data pre-processing. This paper uses a heart failure database based on the electronic medical records of the Fourth People's Hospital in Zigong, Sichuan Province, which includes demographic data, baseline clinical characteristics, comorbidities, laboratory results, and outcomes of patients with all types of heart failure. Missing values in the data set were handled, and variables were screened by analysis of variance (ANOVA), recursive feature elimination, and correlation analysis to extract the 36 independent variables most strongly correlated with the outcome variables. Finally, the processed data were divided into training and test sets.

2. A machine-learning prediction model for readmission of elderly patients with heart failure. Machine learning models were developed for readmission within 28 days, within 3 months, and within 6 months, and model performance was evaluated using four metrics: AUC, accuracy, specificity, and sensitivity. The results showed that the random forest stood out among the models compared, with AUCs of 0.963, 0.909, and 0.809, respectively. However, the performance of the same model declined steadily across the three dependent variables as the prediction window lengthened, which may be because patients' clinical information is only informative for a limited period of time.

3. Survival analysis of elderly heart failure patients based on a random survival forest, with model interpretability analysis under the SHAP framework. A Cox proportional hazards regression model and a random survival forest were each established, with death within 28 days, within 3 months, and within 6 months integrated as the outcome variable and time to death as the time variable, and with the concordance index (C-index) as the evaluation metric. The better-performing random survival forest model (C-index: 0.721) was selected to predict the survival status of patients. The SHAP framework was then used to interpret the results both globally and locally, extracting the 20 most important variables affecting the survival of heart failure patients and analyzing the magnitude and direction (positive or negative) of each variable's influence.
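As an illustration of the evaluation metric used for the survival models, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the implementation used in this paper: the function name `concordance_index`, its arguments, and the toy data are hypothetical, and pairs with tied observed times are simply skipped, which is a simplification.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable patient pairs that the model orders correctly.

    times       -- observed follow-up time per patient
    events      -- 1 if death was observed, 0 if the patient was censored
    risk_scores -- model output, where a higher score means higher predicted risk
    (All three names are illustrative assumptions, not taken from the paper.)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        # a is the patient with the shorter observed time
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier death received the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration): four patients, the third is censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.6, 0.7, 0.1]
print(concordance_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the reported C-index of 0.721.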