Prediction Of EGFR Gene Mutation Status And Subtypes In Lung Adenocarcinoma Based On PET/CT Machine Learning

Posted on: 2022-11-09
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L L Huang
GTID: 1484306782476734
Subject: Special Medicine

Abstract/Summary:
Objective: Based on PET/CT images, this study built three types of model (clinical nomograms, a radiomics model, and a deep learning model) to predict EGFR mutation status and the status of the two sensitive mutation subtypes (exon 19 deletion and exon 21 L858R point mutation), in order to provide a basis for guiding clinical treatment.

Materials and methods:
1. This retrospective study enrolled 124 patients with pathologically confirmed lung adenocarcinoma who underwent EGFR mutation testing and a whole-body 18F-FDG PET/CT scan. Clinical characteristics (age, gender, smoking history, etc.), sixteen CT features (size, location, margins, etc.), and four metabolic PET parameters (SUVmax, SUVmean, MTV, TLG) were recorded. Four tasks were defined: Task 1 was to distinguish EGFR-mutant from wild-type tumors; Task 2 was to distinguish exon 19 deletion from wild-type; Task 3 was to distinguish exon 21 L858R point mutation from wild-type; Task 4 was to distinguish the two sensitive mutation subtypes from each other. Logistic regression analyses were performed to screen for significant predictors of EGFR mutation status and subtypes, and these predictors were presented as easy-to-use nomograms. The receiver operating characteristic (ROC) curve, area under the curve (AUC), decision curve, and calibration curve were used to evaluate the clinical usefulness of the nomograms.
2. This retrospective study collected 117 patients with lung adenocarcinoma, who were divided into a training set and a test set at a 9:1 ratio by stratified random sampling. The four tasks were the same as in part 1. The tumor volume of interest (VOI), segmented with ITK-SNAP, was imported into PyRadiomics to extract radiomics features. Features selected by random forest were used as the radiomics signature, and a support vector machine (SVM) was used to build the binary classification models. ROC, AUC, accuracy, precision, recall, and F1 score were used to evaluate model performance.
3. A total of 117 patients with lung adenocarcinoma were collected, and the data set was divided as in the previous part. Deep learning was used to build models for the same four tasks as in part 1. Three models were built for Task 1: model 1 used PET/CT images only; model 2 incorporated all clinical variables into the deep learning model (PET/CT+Clinical); model 3 incorporated both clinical variables and radiomics features into the deep learning model to form a combined model (PET/CT+Radiomics+Clinical). Based on the results of Task 1, only the combined model was built for Tasks 2-4, giving six models in total. ROC, AUC, accuracy, precision, recall, and F1 score were used to evaluate model performance.

Results:
1. EGFR mutations were more frequent in women and were more often accompanied by air bronchogram and pleural retraction than EGFR wild-type tumors. Patients with exon 19 deletion were more likely than wild-type patients to be female and to show air bronchogram and calcification. There was no difference in sex between exon 21 L858R and wild-type, but air bronchogram, pleural retraction, and non-solid texture were more likely with exon 21 L858R. Calcification was more likely with exon 19 deletion than with exon 21 L858R; no other variable differed significantly between the two subtypes. Compared with wild-type, EGFR-mutant tumors had lower SUVmax, SUVmean, and MTV; exon 19 deletion had lower SUVmean; and exon 21 L858R had lower SUVmean, MTV, and TLG. None of the four metabolic parameters differed significantly between the two mutation subtypes. Based on the independent predictors identified, three nomograms for individualized prediction of EGFR mutation status and subtypes were constructed. Their AUC values were 0.852 (95% CI: 0.783, 0.920) for EGFR mutation vs wild-type, 0.857 (95% CI: 0.778, 0.937) for exon 19 deletion vs wild-type, and 0.893 (95% CI: 0.819, 0.968) for exon 21 L858R vs wild-type. Decision curve analysis (DCA) showed that the nomograms had outstanding clinical utility.
2. Task 1 retained 14 CT features and 17 PET features, and the model showed good predictive performance in the training and test sets, with AUC of 0.797 and 0.786, accuracy of 0.685 and 0.667, and F1 score of 0.734 and 0.750, respectively. Task 2 retained 9 CT features and 10 PET features, and the model showed moderate predictive performance in the training and test sets, with AUC of 0.798 and 0.600, accuracy of 0.831 and 0.667, and F1 score of 0.714 and 0.769, respectively. Task 3 retained 9 CT features and 10 PET features, and the model showed good predictive performance in the training and test sets, with AUC of 0.855 and 0.786, accuracy of 0.754 and 0.667, and F1 score of 0.653 and 0.571, respectively. Task 4 retained 9 CT features and 8 PET features, and the model showed good predictive performance in the training and test sets, with AUC of 0.822 and 0.800, accuracy of 0.800 and 0.833, and F1 score of 0.824 and 0.667, respectively.
3. In Task 1, the combined deep learning model performed best in the training and test sets, with AUC of 0.928 and 0.832 and accuracy of 0.805 and 0.675, respectively. The deep learning model using PET/CT images alone had AUC of 0.916 and 0.676 and accuracy of 0.802 and 0.566 in the training and test sets. The model with clinical features had AUC of 0.895 and 0.733 and accuracy of 0.801 and 0.670 in the training and test sets. In Task 2, the AUC in the training and test sets was 0.906 and 0.828 and the accuracy was 0.868 and 0.767, indicating that the model had high predictive efficiency and robustness. In Task 3, the AUC in the training and test sets was 0.909 and 0.785 and the accuracy was 0.850 and 0.700, which also indicated high predictive efficiency and robustness. In Task 4, the AUC in the training and test sets was 0.951 and 0.800 and the accuracy was 0.732 and 0.533.

Conclusions:
1. A clinical logistic regression model was constructed from clinicopathological variables, radiological CT features, and PET metabolic parameters and was presented as nomograms. It has high diagnostic performance and is well suited as an everyday tool for clinicians to determine EGFR mutation status and subtypes on an individual basis. However, prediction of the mutant subtypes requires more sophisticated methods.
2. The radiomics model has high diagnostic value for the two sensitive mutation subtypes, with comparable discriminative performance in the training and test sets, and can guide the clinical selection of suitable targeted therapy drugs.
3. The combined deep learning model showed the best performance in predicting EGFR mutation status and the different mutation subtypes, and remained stable in the test set. These three methods can complement each other to provide alternative non-invasive, safe, and accurate imaging markers for precision treatment of lung cancer patients.
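The abstract reports AUC, accuracy, precision, recall, and F1 for each binary task without defining them. As a reader's aid, the following is a minimal sketch of how these metrics are conventionally computed for a two-class problem. It is not the author's code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and the numbers are unrelated to the study's data.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; tied scores count as half a pair."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy data: 1 = EGFR mutant, 0 = wild-type.
    y_true = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]     # hypothetical model outputs
    y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
    print(classification_metrics(y_true, y_pred))
    print(roc_auc(y_true, scores))
```

AUC is computed from the raw scores and does not depend on the threshold, whereas accuracy, precision, recall, and F1 all change with it. This is one reason a model can show a reasonable test-set AUC alongside a much lower test-set accuracy.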
Keywords/Search Tags: Lung adenocarcinoma, EGFR, Gene subtypes, PET/CT, Machine learning