
Prediction Of 3-month Prognosis Of Acute Ischemic Stroke Based On Machine Learning

Posted on: 2022-10-20  Degree: Master  Type: Thesis
Country: China  Candidate: W Qin  Full Text: PDF
GTID: 2504306575956609  Subject: Social Medicine and Health Management
Abstract/Summary:
[Objective] To explore, using logistic regression, the factors influencing the 3-month prognosis of acute ischemic stroke (AIS) in two application scenarios, at admission and at discharge, and to establish a prediction model; to build 3-month prognosis prediction models with two machine-learning methods, random forest and Extreme Gradient Boosting (XGBoost), and evaluate their predictive performance against logistic regression; and to select suitable prediction models for the two scenarios, providing a reference for effectively improving the prognosis of patients with AIS.

[Methods] In this study, 15087 AIS patients from the China National Stroke Registry phase II (CNSR-2) were included as study subjects in the admission scenario. Because 107 patients died in hospital, 14980 AIS patients were included in the discharge scenario. The dependent variable was whether the patient's 3-month modified Rankin Scale (mRS) score was greater than 2. In the admission scenario, the predictors comprised 35 variables in four dimensions: general demographic characteristics, personal/past medical/medication history, admission examination indicators, and general condition. In the discharge scenario, a further 20 variables in five dimensions (in-hospital treatment and examination measures, complications, acute recovery, discharge status, and secondary prevention strategy) were added, for a total of 55 variables. The data were randomly divided into a training set and a test set at a ratio of 7:3. The models were trained on the training set; for the machine-learning models, isotonic regression (IR) was used to calibrate the predicted probabilities. On the test set, the area under the receiver operating characteristic curve (AUC) was used to evaluate model discrimination. The Hosmer-Lemeshow (H-L) test, the calibration plot, and the Brier score were used to evaluate model calibration. The AUC, net reclassification improvement (NRI), and integrated discrimination improvement (IDI) were used to compare predictive performance between models.

[Results] In the admission scenario, the AUC of logistic regression was 0.8028 (95% CI: 0.7854 to 0.8202), the H-L test showed no significant lack of fit (P>0.05), the Brier score was 0.110, and the calibration plot followed the 45-degree diagonal. The AUC of random forest without probability calibration was 0.8017 (95% CI: 0.7840 to 0.8194), the H-L test showed significant lack of fit (P<0.001), the Brier score was 0.118, and the calibration plot deviated from the 45-degree diagonal. After probability calibration, the AUC of random forest was 0.8027 (95% CI: 0.7851 to 0.8204), the H-L test showed no significant lack of fit (P>0.05), the Brier score was 0.109, and the calibration plot followed the 45-degree diagonal. Both before and after probability calibration, the AUC of XGBoost was 0.8105 (95% CI: 0.7934 to 0.8276), the H-L test showed no significant lack of fit (P>0.05), the Brier score was 0.107, and the calibration plot followed the 45-degree diagonal. In the comparison between models, random forest and logistic regression did not differ in AUC (P=0.984), and the NRI and IDI were not significantly different from 0 (P=0.458 and P=0.585, respectively); the AUC of XGBoost was higher than that of logistic regression (P=0.015), and the NRI and IDI were both significantly greater than 0 (both P<0.001).

In the discharge scenario, the AUC of logistic regression was 0.8599 (95% CI: 0.8499 to 0.8749), the H-L test showed no significant lack of fit (P>0.05), the Brier score was 0.092, and the calibration plot followed the 45-degree diagonal. The AUC of random forest without probability calibration was 0.8634 (95% CI: 0.8485 to 0.8782), the H-L test showed significant lack of fit (P<0.001), the Brier score was 0.100, and the calibration plot deviated from the 45-degree diagonal. After probability calibration, the AUC of random forest was 0.8630 (95% CI: 0.8482 to 0.8778), the H-L test showed no significant lack of fit (P>0.05), the Brier score was 0.091, and the calibration plot followed the 45-degree diagonal. Both before and after probability calibration, the AUC of XGBoost was 0.8668 (95% CI: 0.8522 to 0.8815), the H-L test showed no significant lack of fit (P>0.05), the Brier score was 0.089, and the calibration plot followed the 45-degree diagonal. In the comparison between models, random forest and logistic regression did not differ in AUC (P=0.437), and the NRI and IDI were not significantly different from 0 (P=0.366 and P=0.512, respectively); the AUC of XGBoost was higher than that of logistic regression (P=0.026), and the NRI and IDI were both significantly greater than 0 (both P<0.001).

[Conclusion]
(1) In both the admission and discharge scenarios, a random forest used to predict the prognosis of patients with AIS needs probability calibration with the IR method to be well calibrated. XGBoost is well calibrated even without IR calibration.
(2) In both scenarios, logistic regression, calibrated random forest, and XGBoost all show good discrimination and calibration, and can be used effectively to predict the prognosis of patients with AIS.
(3) In both scenarios, the predictive performance of XGBoost is slightly better than that of logistic regression, while random forest is not statistically different from logistic regression.
(4) In the admission scenario, we recommend using the logistic regression model to predict the prognosis of patients with AIS. In the discharge scenario, we recommend using XGBoost or a calibrated random forest model.
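The evaluation workflow described in the Methods (7:3 split, isotonic-regression calibration of a random forest, then AUC and Brier score on the test set) can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the CNSR-2 registry; the sample size, hyperparameters, and the cross-validated way the calibrator is fitted are illustrative assumptions, not the thesis's actual settings.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the registry data (assumption): 35 predictors and a
# binary outcome playing the role of "3-month mRS > 2".
X, y = make_classification(
    n_samples=2000, n_features=35, n_informative=8,
    weights=[0.8, 0.2], random_state=0,
)

# Random 7:3 split into a training set and a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)


def evaluate(model):
    """Return (AUC, Brier score) computed on the held-out test set."""
    prob = model.predict_proba(X_test)[:, 1]
    return roc_auc_score(y_test, prob), brier_score_loss(y_test, prob)


# Uncalibrated random forest (hyperparameters are illustrative).
raw_rf = RandomForestClassifier(n_estimators=200, random_state=0)
raw_rf.fit(X_train, y_train)

# The same forest wrapped in isotonic-regression calibration; the calibrator
# is fitted on cross-validation folds of the training set only.
calibrated_rf = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=200, random_state=0),
    method="isotonic", cv=5,
)
calibrated_rf.fit(X_train, y_train)

raw_auc, raw_brier = evaluate(raw_rf)
cal_auc, cal_brier = evaluate(calibrated_rf)
print(f"raw RF:        AUC={raw_auc:.4f}  Brier={raw_brier:.4f}")
print(f"calibrated RF: AUC={cal_auc:.4f}  Brier={cal_brier:.4f}")
```

A lower Brier score indicates better-calibrated probabilities. Whether calibration helps on this synthetic data is not guaranteed, so the sketch only prints both sets of metrics for comparison.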
Keywords/Search Tags:Acute ischemic stroke, Influencing factor, Prognosis prediction, Machine learning, Probability calibration