| Purpose: To explore the feasibility of using machine learning to analyze and predict the risk of catheter-related venous thrombosis in elderly patients with a peripherally inserted central catheter (PICC), so as to reduce the incidence of catheter-related thrombosis in these patients and provide a theoretical reference for clinical practice. Methods: Samples were collected at the PICC center of a general hospital. Fifty-seven characteristics were identified through literature review, covering patient-related, operation-related, treatment-related and catheter-related factors. Retrospective data served as the training set used to build the models, and prospective data served as the test set used to evaluate them. Retrospective data were collected from elderly patients who received intravenous therapy at the PICC center after catheterization between October 1, 2018 and January 30, 2020. To explore the risk factors for PICC-related venous thrombosis in elderly patients, seven machine learning algorithms were used to construct association rules between each factor and the thrombosis outcome: Decision Tree, Random Forest, SVM, Bayes, GBDT, eXtreme Gradient Boosting (XGBoost) and DNN. Data on elderly patients catheterized at the PICC center from September 1, 2020 to May 31, 2021 were collected prospectively and fed into the seven models. The control variable method was used for 70 rounds of model validation, yielding the AUC, accuracy and other performance indicators of the seven classification models. Finally, the optimal model was selected by comprehensive score. Excel was used for data entry and preliminary processing. Python 3.6.2 was used to build the association rules between each factor and the outcome and to validate and optimize the models. Results: A total of 522 elderly patients were included. Of these, 382 were collected retrospectively, comprising 76 positive cases and 306 negative cases, a positive-to-negative sample
ratio of 1:4. The 140 prospective cases comprised 30 positive and 110 negative cases, a sample ratio of approximately 1:3.7. Cancer patients accounted for 82.4% of all patients; there were 258 males and 264 females; the mean age was 69.62 years. Median indwelling time was 85.07 (34-268) days. The data were divided into two groups according to the outcome, catheter-related venous thrombosis. Univariate analysis in SPSS showed statistically significant differences between the two groups for the following factors: surgical history (χ² = 23.73, P < 0.01), malignant tumor (χ² = 73.71, P < 0.01), skin infection after catheterization (χ² = 14.31, P < 0.01), anti-tumor therapy with traditional Chinese medicine (χ² = 7.25, P = 0.01), PT (Z = 2.02, P = 0.04), and PT and INR (Z = 2.13, P = 0.03). A random forest (RF) model was used to rank feature importance with respect to the outcome, and the top 30 features were: catheter indwelling time, PT and INR, serum albumin, length of catheter placement, fasting plasma glucose, PT, high-density lipoprotein cholesterol (HDL-C), BMI, platelets, blood type, white blood cells, D-dimer, operation history, APTT, marital status, age, catheter position, NRS nutrition risk assessment score, FIB, malignant tumor, arm circumference, anticoagulant/antiplatelet treatment, catheterization on the dominant-hand side, low-density lipoprotein, catheterization operator, smoking history, gender, education level, catheterization vein, and anti-tumor treatment with Chinese medicine. The models were trained on the retrospective data using the top 30 features to obtain the association rules between each factor and the outcome, and the AUC, specificity, sensitivity and accuracy of the seven classification models were then obtained on the test set. The top three AUC scores were GBDT (0.85), RF (0.84) and XGBoost (0.83), but the accuracies of GBDT and RF were 0.66 and 0.74 respectively, both lower than XGBoost's 0.81. The specificity of GBDT was very low, only
0.56. Therefore, on comprehensive score, the model built with XGBoost gave the most stable predictions and performed best overall. Conclusions: For elderly patients, machine learning can thoroughly explore the factors related to the risk of PICC-related venous thrombosis and accurately predict that risk, which can provide technical support for screening high-risk patients in future clinical work. |
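
The abstract compares models on AUC, sensitivity, specificity and accuracy, and reports that the model with the highest AUC did not have the best accuracy or specificity. The following is a minimal sketch, not the authors' code, of how these four indicators can be computed. The labels, scores, thresholds and function names are hypothetical and chosen only for illustration. It shows one general reason such a gap can occur: AUC is computed from the ranking of predicted scores and does not depend on a decision threshold, whereas sensitivity, specificity and accuracy all depend on the threshold chosen.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_score: Sequence[float],
                     threshold: float) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) when scores >= threshold are called positive."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def threshold_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # share of thrombosis cases detected
        "specificity": tn / (tn + fp),   # share of non-cases correctly cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = thrombosis, 0 = no thrombosis.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]

print("AUC:", round(auc(y_true, y_score), 3))
for threshold in (0.5, 0.25):  # illustrative thresholds, not from the study
    counts = confusion_counts(y_true, y_score, threshold)
    print(threshold, threshold_metrics(*counts))
```

In this toy example the AUC is the same at both thresholds, while lowering the threshold raises sensitivity and lowers specificity. A comparison across models is therefore easier to interpret when the threshold used for each model is reported alongside its AUC.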