
Establishment Of A Prediction Model For Acute Appendicitis Based On Machine Learning And Comparison With Nomogram Model

Posted on: 2024-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Sun
Full Text: PDF
GTID: 2544307064487764
Subject: Clinical Laboratory Science
Abstract/Summary:
Objective: Acute appendicitis (AA) is an infection of the appendix that typically develops rapidly, within a few hours. The accumulation of food residues and bacteria can accelerate the inflammatory process, making it one of the most common intestinal infections. It is also the main cause of acute abdominal pain requiring surgical removal of the appendix. Delayed diagnosis and treatment can lead to serious complications and life-threatening consequences, and accurate diagnosis of AA is the most important measure for avoiding unnecessary surgery. Generally, AA has no specific direct cause and often begins with discomfort and diffuse abdominal pain. The pain quickly shifts to the right lower abdomen and becomes more severe. Coughing and abdominal wall tightness can exacerbate the pain. Nausea, constipation, or fever may also occur. Because its early clinical manifestations are diverse and atypical, the diagnosis of AA remains challenging in some cases, and cost-effective methods that achieve an accurate diagnosis in a short time are needed. In recent years, artificial intelligence (AI) has become increasingly popular in medical research, detecting diseases rapidly and accurately and making health systems around the world more effective and secure. Machine learning (ML) is one of its fastest-growing fields. The purpose of this study is to develop an ML-based early risk prediction model for AA by mining real-world data with simple, rapid, and accurate evaluation methods. Such a model can identify risk factors for AA patients during initial evaluation, reduce medical consumption, target specific medical assistance more accurately, and provide decision support for clinical diagnosis and treatment.

Methods: This study is based on the medical data platform of the First Hospital of Jilin University. Basic information, medical history, and clinical diagnoses of the study subjects were collected from the medical record system, and laboratory test data were collected from the laboratory information system. The subjects comprised three cohorts. Cohort 1 included AA patients and healthy controls from 2019. Five algorithms, namely Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Random Forest (RF), Bernoulli Naive Bayes (Bernoulli NB), and Gaussian Naive Bayes (Gaussian NB), were evaluated by 5-fold cross-validation: the entire dataset was divided into 5 folds, 4 of which were used in turn as the training set while the remaining fold served as the validation set to score the model; this was repeated 5 times and the scores were averaged. The best-performing ML algorithm was then used to select the features of the AA model. Cohorts 2 and 3 were each divided into training and validation sets at a ratio of 7:3. Cohort 2 consisted of AA patients and patients with acute abdomen from 2020; it was used to develop an early prediction model of AA and to conduct an age subgroup analysis with the median age of AA patients as the cut-off point. Cohort 3 included AA patients and patients with acute abdomen from 2021. A nomogram model was established based on traditional logistic regression analysis. Univariate regression analysis was used to screen predictors in the training set. To avoid multicollinearity bias in the multivariate analysis, Spearman correlation analysis was used to remove indicators that were significantly correlated with most other parameters and to further screen the modeling variables. A multivariate prediction model was constructed by multivariate logistic regression, and a nomogram was built from the parameters of that model. Its diagnostic efficacy was then compared with that of the ML model.

Results: On the medical data platform of the First Hospital of Jilin University, a total of 34 clinical indicators were collected from patients’ blood and biochemistry tests. Because age was closely correlated with the occurrence of AA and the difference was statistically significant, age was included among the selected features. A total of 2269 subjects (AA patients and healthy controls) in cohort 1 were included in the study. XGBoost was selected as the optimal algorithm among the five ML algorithms and was used to screen the features of the AA model. The resulting risk prediction model included 11 indicators: C-reactive protein (CRP), neutrophil (NE), albumin (ALB), total protein (TP), mean corpuscular volume (MCV), hematocrit (HCT), creatinine (Cr), lymphocyte (LY), calcium (Ca), age, and mean corpuscular hemoglobin concentration (MCHC). In cohort 2, the model achieved an AUC of 0.981, with good goodness of fit and a degree of clinical applicability. In the age subgroup analysis, its diagnostic accuracy was higher for patients below the median age of AA patients (AUC=0.992, accuracy=0.953, specificity=0.947, sensitivity=0.958). By comparison, its predictive ability was slightly lower for patients at or above the median age (AUC=0.980, accuracy=0.941, specificity=0.933, sensitivity=0.949). A total of 2730 subjects in cohort 3 were randomly divided into a training set and a validation set at a ratio of 7:3. The AA-Lab11 model was established using the XGBoost algorithm and showed good discrimination in the validation set (accuracy: 0.978, accuracy: 0.976, sensitivity=0.973, specificity=0.981). Univariate logistic regression analysis was used to examine the effect of each clinical indicator, as an independent variable, on the clinical outcome. Except for HCT, EO, BA, MO, PCT, AST, Urea, and GLU, which were not statistically significant predictors of AA (P>0.05), all other indicators were statistically significant. Spearman correlation analysis was then performed on the indicators retained by the univariate logistic regression. RBC, PLT, MCH, HB, NE, LY, PDW, ALT, γ-GGT, ALP, DBIL, CRP, Ca, Cl, Na, and K were significantly correlated with many of the analysis parameters (about half of them) and were removed to avoid multicollinearity effects. WBC, MCV, MCHC, RDW, MPV, TP, ALB, GLO, TBIL, Cr, and age were selected as the parameter group for model construction. Multivariate logistic regression analysis showed that WBC, MCV, MCHC, RDW, MPV, TP, GLO, TBIL, Cr, and age combined can predict AA (accuracy=0.682, accuracy=0.631, sensitivity=0.676, specificity=0.686). The results show that the diagnostic efficiency of the AA-Lab11 ML model in predicting AA is significantly better than that of the nomogram model.

Conclusions: Among the five ML algorithms evaluated in this study, XGBoost performed best, and the AA risk prediction model developed with it has good diagnostic efficiency. Age subgroup analysis shows that it has high diagnostic accuracy for AA patients below the median age. Moreover, the diagnostic efficiency and clinical applicability of the ML model are significantly superior to those of the nomogram model, making it easier to identify high-risk patients during initial evaluation and supporting the accurate diagnosis of AA in clinical practice.
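
The 5-fold cross-validation procedure described in the Methods can be sketched roughly as follows. This is a minimal illustration in plain Python, not the implementation used in the thesis. The single-marker threshold classifier, the synthetic "CRP-like" values, and all function names (`k_fold_splits`, `confusion_metrics`, `fit_threshold`, `cross_validate`, `make_toy_data`) are assumptions introduced here for illustration only. The study itself compared XGBoost, SVM, RF, and two naive Bayes variants on real laboratory data.

```python
import random
from statistics import mean


def k_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, valid_idx); each fold is the validation set once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k folds of near-equal size
    for i, valid in enumerate(folds):
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, valid


def confusion_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = AA)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def fit_threshold(values, labels):
    """Toy classifier (assumption): cut-off halfway between the class means."""
    pos = mean(v for v, y in zip(values, labels) if y == 1)
    neg = mean(v for v, y in zip(values, labels) if y == 0)
    return (pos + neg) / 2


def cross_validate(values, labels, k=5, seed=0):
    """Train on k-1 folds, score on the held-out fold, average the k scores."""
    scores = []
    for train, valid in k_fold_splits(len(values), k, seed):
        cutoff = fit_threshold([values[i] for i in train],
                               [labels[i] for i in train])
        preds = [1 if values[i] > cutoff else 0 for i in valid]
        truth = [labels[i] for i in valid]
        scores.append(confusion_metrics(truth, preds)["accuracy"])
    return mean(scores)


def make_toy_data(n_per_class=100, seed=42):
    """Synthetic single-marker data (assumption, not study data)."""
    rng = random.Random(seed)
    values = [rng.gauss(60, 15) for _ in range(n_per_class)]   # label 1
    values += [rng.gauss(5, 3) for _ in range(n_per_class)]    # label 0
    labels = [1] * n_per_class + [0] * n_per_class
    return values, labels


if __name__ == "__main__":
    values, labels = make_toy_data()
    print("mean 5-fold accuracy:", round(cross_validate(values, labels), 3))
```

In a comparison of several algorithms, the same folds would be reused for each candidate so that the averaged scores are directly comparable. `confusion_metrics` also returns the sensitivity and specificity figures of the kind reported in the Results.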
Keywords/Search Tags: acute appendicitis, machine learning, C-reactive protein, risk prediction, nomogram