Objective: This study aimed to establish a predictive model for heart failure and all-cause mortality in peritoneal dialysis (PD) patients using machine learning algorithms.

Methods: This was a single-center, retrospective cohort study. A total of 606 patients who underwent peritoneal dialysis catheterization and continuous ambulatory peritoneal dialysis (CAPD) treatment in our hospital from January 2010 to December 2016 were included as the study population. Subjects were selected according to inclusion and exclusion criteria. All patients were followed until February 2021, or until transfer to hemodialysis, kidney transplantation, or death. Baseline clinical data, laboratory tests, and examination indicators of PD patients who met the study criteria were collected. Independent t-tests or Mann-Whitney U tests were used to compare factors between patients who reached the study endpoints and those who did not. Recursive feature elimination combined with a classifier algorithm was used for feature selection and to evaluate the performance of these variables, with irrelevant variables eliminated in each iteration of the classifier. Three machine learning models (XGBoost, Random Forest, and AdaBoost) were used to construct predictive models for the endpoints, and the method with the highest AUC was selected to model the screened variables and rank their contributions. The risk factors affecting the study endpoints (hospitalization for heart failure, all-cause mortality, and the composite endpoint of all-cause mortality or hospitalization for heart failure) were obtained from the respective models. Outcomes at 1-year and 5-year follow-up were also studied. In addition, the predictive performance of the machine learning methods was compared with that of Cox regression.

Results: A total of 606 PD patients were included in this study, 63.7% of whom were male. The mean age of the patients was 52.6±16.1 years. The median follow-up period was 49 months. 298
patients developed heart failure requiring hospitalization during follow-up. According to the Random Forest model (AUC=0.793), the risk factors were CCI2 (congestive heart failure), systolic blood pressure, BMI, etc. 79 patients developed heart failure during the first year of follow-up. According to the Random Forest model (AUC=0.74), the risk factors were BMI, age, systolic blood pressure, 24 h urine volume, etc. 246 patients developed heart failure during the 5-year follow-up. According to the XGBoost model (AUC=0.852), the risk factors were CCI, BMI, systolic blood pressure, and age. The machine learning model (AUC=0.885) showed better predictive performance than Cox regression (C-index=0.77) for heart failure. The 1-year and 5-year survival rates decreased as the risk score increased. Regarding all-cause mortality, 199 patients died during follow-up. According to the Random Forest model (AUC=0.83), the risk factors predicting death were age, CCI, creatinine, eGFR, etc. 64 patients died during the first year of follow-up. According to the XGBoost model (AUC=0.74), the predictive risk factors for 1-year all-cause mortality were age, HDL-C, CHO, LDL-C, etc. A total of 161 patients died during the 5-year follow-up. According to the Random Forest model (AUC=0.852), the predictive risk factors were age, CCI, eGFR, creatinine, etc. The machine learning model (AUC=0.83) showed better predictive performance than Cox regression (C-index=0.79) for all-cause mortality. Regarding the composite endpoint of all-cause death or heart failure, 409 patients reached the composite endpoint during follow-up. According to the Random Forest model (AUC=0.807), the risk factors were iPTH, CCI2 (congestive heart failure), age, HDL-C, etc. 114 patients reached the composite endpoint within the first year. According to the XGBoost model (AUC=0.818), the risk factors were HDL-C, age, corrected calcium, BMI, etc. A total of 331 patients reached the 5-year composite endpoint. According to the Random Forest model (AUC=0.729), the risk factors were age, creatinine, eGFR, iPTH, etc. The machine learning
model (AUC=0.834) showed better predictive performance than Cox regression (C-index=0.71) for the composite endpoint.

Conclusion: In this study, we used machine learning methods to predict 1-year and 5-year heart failure, all-cause mortality, and the composite endpoint (heart failure or all-cause mortality). Age, CCI, creatinine, and eGFR were independent risk factors for all-cause mortality in PD patients. A history of heart failure, systolic blood pressure, BMI, and age were independent risk factors for heart failure in PD patients. iPTH, age, HDL-C, and corrected calcium were independent risk factors for the composite endpoint in PD patients. Further long-term, large-scale studies are needed to facilitate timely, individualized treatment for PD patients and to improve clinical outcomes using machine learning predictive models.
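
The results above compare an AUC for the machine learning models with a C-index for Cox regression. The abstract does not say how either metric was computed, so the following is only a minimal sketch of the standard textbook definitions, written in plain Python. The function names `auc` and `harrell_c_index` and the six-patient data set are hypothetical and invented for illustration; they are not taken from the study.

```python
from itertools import combinations


def auc(scores, labels):
    """Probability that a random positive case outscores a random negative one.

    This equals the Mann-Whitney U statistic divided by n_pos * n_neg.
    Tied scores count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def harrell_c_index(risk, time, event):
    """Harrell's C-index for right-censored follow-up data.

    A pair is comparable when the patient with the shorter follow-up
    actually had the event. It is concordant when that patient also has
    the higher predicted risk. Pairs with tied times are skipped here
    for simplicity (an assumption of this sketch).
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(time)), 2):
        if time[i] == time[j]:
            continue
        first, second = (i, j) if time[i] < time[j] else (j, i)
        if not event[first]:
            continue  # earlier patient was censored, so the ordering is unknown
        comparable += 1
        if risk[first] > risk[second]:
            concordant += 1.0
        elif risk[first] == risk[second]:
            concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up data for six hypothetical patients (not from the study).
risk_scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.1]  # model-predicted risk
died = [1, 1, 0, 1, 0, 0]                      # 1 = reached the endpoint
months = [6, 14, 60, 30, 48, 60]               # follow-up time in months

toy_auc = auc(risk_scores, died)
toy_c = harrell_c_index(risk_scores, months, died)
print(f"AUC = {toy_auc:.3f}, C-index = {toy_c:.3f}")
```

Under these definitions, the AUC treats the endpoint as a binary label at a fixed horizon, whereas the C-index also uses follow-up time and censoring. The two metrics are closely related but not identical, which may be worth keeping in mind when reading the AUC versus C-index comparisons reported above.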