| Background and Objective: Atrial fibrillation can reduce patients' quality of life and increase the risk of stroke and death, so early intervention can help improve prognosis. Because many factors affect the prognosis of patients with atrial fibrillation and their roles remain unclear, no model accurately predicts the risk of death in these patients. This study compared four machine learning algorithms (Lasso-Cox, random forest, support vector machine, and neural network) to identify the best death risk prediction model for atrial fibrillation among them, to explore the factors related to death risk, and to provide a basis for the clinical management of patients with atrial fibrillation. Methods: 1. Patient data collection: Inpatients with atrial fibrillation at the First Affiliated Hospital of Shantou University Medical College from November 2018 to December 2019 were the study subjects. General information, past history, treatment history, and laboratory, electrocardiogram and ultrasound examination data were collected. Follow-ups were conducted at 1 month, 6 months and 1 year and covered survival, symptoms, medication, etc. 2. Statistical analysis: Continuous and categorical variables with more than 30% missing data were excluded. For variables with less than 30% missing data, continuous variables were handled with multiple imputation and categorical variables with mode imputation. Patients were randomly divided in a 7:3 ratio into a modeling group and a verification group; models were fitted on the modeling group and internally validated on the verification group. The Lasso algorithm was used to reduce the dimensionality of the variables in the modeling group, and then four machine learning algorithms (Lasso-Cox, random forest, support vector machine and neural network) were used to establish four prognosis prediction models for patients with atrial fibrillation. The models were compared using ROC curves to select the best one. A nomogram was established from the predictors of the best model, and its accuracy was assessed with the C-index, calibration and ROC curves. Time-dependent ROC curves and decision curves were used to compare the nomogram with the CHA2DS2-VASc and HAS-BLED scores. X-tile was used to determine the nomogram risk score threshold, and Kaplan-Meier survival curves were then drawn. Finally, the nomogram was implemented as a web-based tool. Results: 1. A total of 1,068 patients with atrial fibrillation were selected for this study. After imputation and exclusion, 978 cases were included in the analysis: 684 in the modeling group and 294 in the verification group. There were 70 deaths in the modeling group and 39 deaths in the verification group. The median follow-up time was 170 days. 2. The Lasso algorithm screened out 12 clinical factors. These factors were entered into the Lasso-Cox, random forest, support vector machine and neural network algorithms to build prediction models, and ROC curves were used to evaluate the accuracy of each model. The AUCs in the modeling and verification groups were 0.903 (95% CI: 0.856-0.950) and 0.851 (95% CI: 0.792-0.909) for Lasso-Cox, 0.827 (95% CI: 0.784-0.870) and 0.815 (95% CI: 0.749-0.880) for random forest, 0.956 (95% CI: 0.933-0.978) and 0.717 (95% CI: 0.917) for support vector machine, and 0.519 (95% CI: 0.490-0.548) and 0.497 (95% CI: 0.445-0.549) for the neural network, respectively. Taken together, the Lasso-Cox regression model was the best model. 3. The Lasso-Cox algorithm retained eight clinical factors, including stroke history, tumor history, neutrophil ratio, red blood cell distribution width, monoamine oxidase, uric acid and age, and a nomogram was established based on these eight clinical factors. The
C-index in the modeling group was 0.872 (95% CI: 0.835-0.909), and the C-index in the verification group was 0.811 (95% CI: 0.742-0.880). The 30-day, 180-day, and 365-day AUCs in the modeling group were 0.885 (95% CI: 0.841-0.930), 0.894 (95% CI: 0.845-0.943) and 0.856 (95% CI: 0.786-0.932), respectively. The 30-day, 180-day, and 365-day AUCs in the verification group were 0.852 (95% CI: 0.789-0.914), 0.903 (95% CI: 0.856-0.950), and 0.817 (95% CI: 0.671-0.962), respectively. 4. The time-dependent ROC curves and decision curves showed that the nomogram was better than the CHA2DS2-VASc and HAS-BLED scores at predicting the death risk of inpatients with atrial fibrillation. X-tile software identified a nomogram risk score cutoff of 10.3: patients scoring below 10.3 were classified as low risk, and those scoring ≥ 10.3 as high risk. Kaplan-Meier survival curves showed that the 30-day, 180-day, and 365-day survival rates in the modeling group were 97.3%, 94.9%, and 89.0% for the low-risk group and 69.4%, 44.6%, and 38.2% for the high-risk group. In the verification group, the corresponding survival rates were 95%, 90.2%, and 84.0% for the low-risk group and 77.8%, 61.2%, and 52.1% for the high-risk group. Conclusion: Stroke history, tumor history, neutrophil ratio, red blood cell distribution width, monoamine oxidase, uric acid and age are all risk factors for death in patients with atrial fibrillation. |
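
The preprocessing and grouping rules stated in the Methods and Results (exclude variables with more than 30% missing data, mode-impute categorical variables, split 7:3, classify by the 10.3 cutoff) can be illustrated with a minimal Python sketch. It is not the study's code: the function names, the toy records and the variable names (`rare_lab`, `stroke_history`) are hypothetical, the split uses a simple seeded shuffle chosen for illustration, and the multiple imputation applied to continuous variables is not shown.

```python
import random
from collections import Counter

MISSING_LIMIT = 0.30   # variables missing in more than 30% of records are excluded
RISK_CUTOFF = 10.3     # nomogram score threshold reported from X-tile

def drop_sparse_variables(records, limit=MISSING_LIMIT):
    """Remove variables whose share of missing (None) values exceeds `limit`."""
    n = len(records)
    keep = [k for k in records[0]
            if sum(r[k] is None for r in records) / n <= limit]
    return [{k: r[k] for k in keep} for r in records]

def impute_mode(records, column):
    """Fill missing values of a categorical variable with its most frequent value."""
    observed = [r[column] for r in records if r[column] is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [{**r, column: mode if r[column] is None else r[column]}
            for r in records]

def split_7_3(items, seed=0):
    """Randomly split items into a modeling (70%) and a verification (30%) group."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.7)  # truncation is an assumption of this sketch
    return shuffled[:cut], shuffled[cut:]

def risk_group(score, cutoff=RISK_CUTOFF):
    """Scores at or above the cutoff are high risk, scores below it low risk."""
    return "high" if score >= cutoff else "low"

# Hypothetical toy records, for illustration only.
records = [
    {"age": 71, "stroke_history": "no",  "rare_lab": None},
    {"age": 64, "stroke_history": None,  "rare_lab": None},
    {"age": 80, "stroke_history": "yes", "rare_lab": 1.2},
    {"age": 58, "stroke_history": "no",  "rare_lab": None},
]
cleaned = drop_sparse_variables(records)          # rare_lab is 75% missing
filled = impute_mode(cleaned, "stroke_history")   # 25% missing, so it is kept
modeling, verification = split_7_3(range(978))
```

With 978 cases, truncating at 70% gives group sizes of 684 and 294, which matches the sizes reported in the Results; how the original analysis rounded the split is not stated in the abstract.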