Early Risk Prediction Of Patients With ARDS Based On Machine Learning

Posted on: 2024-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: H M Zhang
GTID: 2544307103995559
Subject: Computer Science and Technology
Abstract/Summary:
Using clinical electronic health records to identify the severity of a patient’s condition as early as possible, and thereby reduce mortality risk, is a significant challenge currently facing intensive care units (ICUs). Acute respiratory distress syndrome (ARDS) is often accompanied by complications. For this single-disease patient group, the disease scoring systems and traditional machine learning models commonly used in the ICU suffer from low accuracy, low efficiency, and poor interpretability when identifying severity and predicting mortality risk, which has attracted widespread attention from medical staff at home and abroad. Based on data from the Medical Information Mart for Intensive Care III (MIMIC-III), this thesis addresses three problems: recognizing ARDS severity from non-invasive physiological parameters, predicting mortality risk interpretably from small medical samples, and identifying the critical risk factors affecting ARDS patients under the guidance of transpulmonary pressure. The specific research work is as follows.

(1) Given the subjectivity and lack of timeliness of disease scoring systems and invasive parameters in evaluating the development of ARDS, this thesis proposes an ARDS severity recognition model (XGB-SRM) that uses non-invasive parameters and is based on extreme gradient boosting (XGBoost). First, the physiological parameters of patients were extracted from the database for statistical analysis; outliers were handled with the interquartile range, and class imbalance was handled with synthetic minority oversampling. Second, the Pearson correlation coefficient and random forest were combined as a hybrid feature selection method to score the non-invasive parameters comprehensively. Finally, to classify disease severity accurately, XGBoost was combined with grid search cross-validation to determine the best hyper-parameters of the model. The experimental results show that the model distinguishes disease severity with an area under the curve (AUC) of 0.980 and an accuracy of 0.901, demonstrating both its accuracy and its recognition efficiency.

(2) Given that ARDS medical data are imbalanced, have small samples and a large feature space, and that existing models lack interpretability, this thesis proposes an interpretable method (WM-IMRP) for mortality risk prediction based on weighted balanced distribution adaptation (W-BDA) and a multilayer perceptron (MLP). First, missing values were filled by k-nearest-neighbor imputation, and the extracted data were preprocessed and divided into source and target domains. Second, feature selection based on XGBoost was performed in both domains to eliminate redundant features and reduce dimensionality. Third, the reconstructed domains were mapped to the same reproducing kernel Hilbert space (RKHS) through W-BDA, and a balance factor was introduced to achieve weighted balanced adaptation of the conditional and marginal distributions. Finally, the MLP network was trained on the new source domain, and ARDS mortality risk was predicted on the new target domain. The experimental results show that the proposed method predicts mortality risk with an AUC of 0.905, an accuracy of 0.878, and an F1 score of 0.921, giving good prediction accuracy and reliable explanatory power.

(3) Given the limited data and the difficulty of analyzing ARDS mortality risk factors, this thesis proposes a method for exploring the key factors affecting ARDS mortality risk under the guidance of transpulmonary pressure monitoring. First, taking the patients with transpulmonary pressure monitoring as the study group, the extracted ARDS patients were matched 1:1 with a control group by propensity score. Second, the optimal feature combination was selected through total optimal subset regression using rank sum test indicators, and the critical risk factors affecting ARDS were obtained through linear and nonlinear feature analysis. Finally, the machine learning model with the best predictive ability for mortality risk was identified. The experimental results show that HCO3 and respiratory rate are the key risk factors affecting ARDS, and that the logistic regression model predicts 28-day mortality risk with an accuracy of 0.921, which can provide medical staff with diagnosis and treatment suggestions.
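
The 1:1 propensity score matching mentioned in part (3) can be illustrated with a minimal sketch. The abstract does not state the matching algorithm, so everything here is an assumption made for illustration: greedy nearest-neighbor matching without replacement, a caliper of 0.05, propensity scores that are already computed, and the hypothetical function name `match_one_to_one` and patient identifiers.

```python
from typing import Dict, List, Tuple


def match_one_to_one(
    treated: Dict[str, float],
    control: Dict[str, float],
    caliper: float = 0.05,
) -> List[Tuple[str, str]]:
    """Greedy 1:1 nearest-neighbor matching on precomputed propensity scores.

    `treated` and `control` map a patient id to a propensity score.
    The caliper value and the greedy strategy are illustrative assumptions.
    """
    available = dict(control)  # controls that have not been matched yet
    pairs: List[Tuple[str, str]] = []
    # Visit treated patients in sorted id order so the result is deterministic.
    for t_id in sorted(treated):
        if not available:
            break
        t_score = treated[t_id]
        # Closest remaining control; ties are broken by control id.
        c_id = min(available, key=lambda c: (abs(available[c] - t_score), c))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # matching without replacement
    return pairs


# Hypothetical scores: "t*" had transpulmonary pressure monitoring, "c*" did not.
treated = {"t1": 0.62, "t2": 0.35, "t3": 0.90}
control = {"c1": 0.60, "c2": 0.33, "c3": 0.10, "c4": 0.70}
pairs = match_one_to_one(treated, control)
print(pairs)
```

In this toy input, the third monitored patient has no control within the caliper and is left unmatched, which is the usual trade-off of caliper matching: better balance at the cost of a smaller matched cohort.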
Keywords/Search Tags: Acute respiratory distress syndrome, Hybrid feature selection, Weighted balanced distribution adaptation, Mortality risk prediction, Total optimal subset regression
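
The hybrid feature selection named in the keywords scores each non-invasive parameter using both the Pearson correlation coefficient and random forest. The abstract does not say how the two scores are combined, so the sketch below is only one plausible reading: it assumes an equal-weight average (`alpha=0.5`) of the absolute Pearson correlation with the label and a normalized importance value. The importance values are passed in as hypothetical, precomputed numbers standing in for random forest importances, and the feature names and data are invented for illustration.

```python
import math
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def hybrid_rank(
    features: Dict[str, Sequence[float]],
    label: Sequence[float],
    importances: Dict[str, float],
    alpha: float = 0.5,
) -> List[str]:
    """Rank features by alpha * |r| + (1 - alpha) * normalized importance.

    The weighting scheme is an assumption, not the thesis's stated method.
    """
    total = sum(importances.values()) or 1.0
    scores = {
        name: alpha * abs(pearson(values, label))
        + (1 - alpha) * importances[name] / total
        for name, values in features.items()
    }
    # Highest combined score first; ties are broken by feature name.
    return sorted(scores, key=lambda name: (-scores[name], name))


# Invented toy data: six patients, binary severity label.
label = [0, 0, 1, 1, 1, 0]
features = {
    "resp_rate": [14, 16, 28, 30, 26, 15],
    "heart_rate": [80, 95, 90, 85, 100, 88],
    "temperature": [37.0, 36.8, 37.1, 36.9, 37.0, 37.2],
}
# Hypothetical stand-ins for random forest feature importances.
importances = {"resp_rate": 0.6, "heart_rate": 0.3, "temperature": 0.1}
ranking = hybrid_rank(features, label, importances)
print(ranking)
```

Combining a linear measure with a tree-based one lets a parameter rank highly if either view finds it informative; in practice the importance values would come from a fitted random forest rather than being supplied by hand.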