Optimizing The Use Of Gastroscope Based On Machine Learning Model

Posted on: 2019-07-14 | Degree: Master | Type: Thesis
Country: China | Candidate: Y J Liu | Full Text: PDF
GTID: 2394330569999156 | Subject: Internal Medicine
Abstract/Summary:
Objective: We aimed to establish an objective and feasible pre-gastroscopy screening standard to reduce the overuse of gastroscopy.

Methods: We collected each patient's demographic information, diet, lifestyle, medical history, symptoms, PG I, PG II, G-17 and Hp antibody. All patients underwent gastroscopy and pathological examination, which together served as the diagnostic standard. Decision tree, logistic regression, random forest and support vector machine (SVM) models were trained on the collected information. The accuracy and validity of the models in predicting positive gastroscopic results were evaluated by comparing the efficiency of different pre-gastroscopy screening approaches.

Results: Of the 620 cases enrolled in this study, 433 were gastroscopy-positive, including 136 cases of gastric mucosal erosion, 118 cases of gastric polyps, 104 cases of gastric mucosal atrophy, 85 cases of intestinal metaplasia, 46 cases of gastric ulcer, 46 cases of gastric intraepithelial neoplasia, 36 cases of duodenal polyps, 33 cases of duodenal ulcer, 23 cases of reflux esophagitis, 7 cases of Barrett's esophagus, 3 cases of esophageal varices, 2 cases of gastric carcinoma, 2 cases of esophageal hiatal hernia and 2 cases of stromal tumors.

Low education, heavier physical labor, Cantonese as native language, rural residence, drinking tap water, alcohol drinking, hypertension, hyperlipidemia, a history of upper gastrointestinal polyps, age over 52, smoking for more than 18 years, PG ? > 9.36 and PGR < 10.89 were associated with positive gastroscopic results. In the single-factor analysis, a history of upper gastrointestinal polyps had the highest positive predictive value (100.00%), the highest specificity (100.00%) and the highest Youden index (0.32); smoking for more than 18 years had the highest negative predictive value (80.00%); and low education had the highest sensitivity (82.68%).

The top ten variables by importance in the decision tree model were history of upper gastrointestinal polyps (0.277), occupation (0.138), drinking water source (0.107), history of hyperlipidemia (0.079), G-17 (0.070), PGR (0.045), alcohol drinking (0.043), age (0.032), smoking (0.032) and discomfort after meals (0.031). The top nine variables in the logistic regression model were history of upper gastrointestinal polyps (0.397), native language (0.192), PGR (0.104), history of hyperlipidemia (0.078), nausea and vomiting (0.061), place of residence and high-salt diet (0.049 and 0.057, order unclear), smoking duration (0.030) and smoked and fried food (0.030). The top ten variables in the random forest model were G-17, age, low education, PG I, PG II, residence, history of upper gastrointestinal polyps, PGR, Hp antibody and smoking. The top ten variables in the SVM model were history of upper gastrointestinal polyps (0.065), occupation (0.062), gender (0.049), level of education (0.047), smoking (0.046), diabetes (0.047), eating speed (0.040), gas (0.038), early satiety (0.032) and alcohol consumption (0.030).

In the training set, the SVM model fitted best (AUC = 1.000), followed by the random forest model (AUC = 0.891) and the decision tree model (AUC = 0.865); the logistic regression model fitted worst (AUC = 0.799). In the test set, all four models predicted reasonably well; ranked by AUC from high to low, they were the random forest model (0.749), the logistic regression model (0.742), the decision tree model (0.727) and the SVM model (0.726). With a risk cut-off value of 0.66, the model had a sensitivity of 91.69% and a specificity of 12.30%, recommended gastroscopy for only 88% of patients, and needed an average of 1.40 gastroscopies to find one positive case. Compared with direct gastroscopy, the efficiency of gastroscopy increased by 2.48 times after using the screening model.

Conclusion: Comparing the variables in the models with the single-factor analysis results showed that history of upper gastrointestinal polyps, PG I, PG II, Hp antibody, smoking and alcohol drinking were important predictors of positive gastroscopic results, and that a single alarm symptom alone can hardly predict the gastroscopic result accurately. This study established a machine learning model for screening before gastroscopy. The model can effectively predict the risk of a positive gastroscopy and provides objective criteria for optimizing the use of gastroscopy, which may offer a new way to reduce its overuse. However, the models need external validation before being applied in clinical practice.
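The screening measures quoted in the abstract (sensitivity, specificity, positive and negative predictive value, Youden index, the share of patients referred to gastroscopy, and the number of gastroscopies per positive case found) all follow from a 2x2 table of predicted versus actual results. The Python sketch below shows that arithmetic. The names `ScreeningMetrics`, `screening_metrics` and `expected_counts` are hypothetical and are not taken from the thesis. The worked example assumes that the reported sensitivity and specificity apply to the whole cohort of 433 positive and 187 negative cases; the thesis may have computed its figures on a different subset, so the output need not reproduce the reported 88% and 1.40.

```python
from dataclasses import dataclass


@dataclass
class ScreeningMetrics:
    sensitivity: float
    specificity: float
    ppv: float                  # positive predictive value
    npv: float                  # negative predictive value
    youden: float               # sensitivity + specificity - 1
    referral_rate: float        # share of patients sent to gastroscopy
    scopes_per_positive: float  # gastroscopies performed per positive case found


def screening_metrics(tp: float, fp: float, fn: float, tn: float) -> ScreeningMetrics:
    """Derive screening measures from the four cells of a 2x2 table.

    Assumes every denominator is non-zero (no guard for empty rows or columns).
    """
    total = tp + fp + fn + tn
    referred = tp + fp  # patients the screen flags as high risk
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return ScreeningMetrics(
        sensitivity=sensitivity,
        specificity=specificity,
        ppv=tp / referred,
        npv=tn / (tn + fn),
        youden=sensitivity + specificity - 1.0,
        referral_rate=referred / total,
        scopes_per_positive=referred / tp,
    )


def expected_counts(n_pos: float, n_neg: float, sensitivity: float, specificity: float):
    """Expected (tp, fp, fn, tn) when given rates are applied to given class sizes."""
    tp = sensitivity * n_pos
    tn = specificity * n_neg
    return tp, n_neg - tn, n_pos - tp, tn


# Assumption: the reported rates at cut-off 0.66 are applied to the full cohort.
N_POS, N_NEG = 433, 620 - 433
m = screening_metrics(*expected_counts(N_POS, N_NEG, 0.9169, 0.1230))
baseline = (N_POS + N_NEG) / N_POS  # gastroscopies per positive with no screening

print(f"referral rate:           {m.referral_rate:.3f}")
print(f"gastroscopies/positive:  {m.scopes_per_positive:.3f} (unscreened: {baseline:.3f})")
print(f"PPV: {m.ppv:.3f}  NPV: {m.npv:.3f}  Youden index: {m.youden:.3f}")
```

The same function can be applied to any single risk factor: with a specificity of 100.00% and a Youden index of 0.32, as reported for a history of upper gastrointestinal polyps, the definition of the index implies a sensitivity of 0.32.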
Keywords/Search Tags:gastroscopy, machine learning, screening