A Deep Learning Model for Intracranial Aneurysm Detection and Segmentation in Computed Tomography Angiography Images: Development, Validation and Clinical Application

Posted on: 2021-04-09
Degree: Master
Type: Thesis
Country: China
Candidate: Z Shi
GTID: 2504306500475364
Subject: Clinical Medicine
Abstract/Summary:
Part Ⅰ Development of a Deep Learning Model for Detection and Segmentation of Intracranial Aneurysms in Computed Tomography Angiography Images
Purpose: To develop a deep-learning-based model for detection and segmentation of intracranial aneurysms on bone-removal computed tomography angiography (CTA) images and to validate the model in independent internal and external datasets.
Methods: We retrospectively collected patients who underwent both CTA and digital subtraction angiography (DSA) within 30 days at Jinling Hospital, Nanjing, China, between Jun. 1, 2009 and Mar. 31, 2017 (Cohort 1). The data were then shuffled and split into training, tuning, and testing sets. The model was developed on the training set using a convolutional neural network. The tuning set was used to evaluate model performance during training and for hyper-parameter optimization, and the testing set was a held-out set of images used to evaluate the trained model. Patient-based performance metrics included sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV); lesion-based performance metrics included recall rate, false positives per case (FPs), and Dice index. For internal validation, we selected consecutive patients who underwent DSA-verified CTA examinations from Apr. 1, 2017 to Dec. 31, 2017 at Jinling Hospital (Cohort 2); for external validation, we enrolled consecutive eligible DSA-verified CTA cases from Jan. 1, 2019 to Jul. 31, 2019 at Nanjing Brain Hospital (Cohort 3). The model’s metrics were evaluated separately in each cohort, and 95% Wilson score confidence intervals (CIs) were used to assess variability. A two-sided Pearson’s chi-squared test or Fisher’s exact test, as appropriate, was used to test for significant differences in sensitivity, specificity, accuracy, PPV, and NPV between the internal and external datasets.
Results: Cohort 1 comprised 1177 cases: 869 patients with 1099 aneurysms and 308 non-aneurysm controls (257
patients without abnormal findings and 51 patients with intracranial artery stenosis). The training set included 927 cases (744 cases with aneurysms and 183 non-aneurysm controls); the tuning set consisted of 100 cases (50 cases with aneurysms and 50 controls); and the testing set had 150 cases, half with aneurysms and half controls. On the testing set, the model reached its highest sensitivity of 97.3% (95% CI: 90.8%-99.3%), with a moderate specificity of 74.7% (95% CI: 63.8%-83.1%), accuracy of 86.0% (95% CI: 79.5%-90.7%), PPV of 79.4% (95% CI: 70.0%-86.4%), and NPV of 96.6% (95% CI: 88.3%-99.1%), when FPs were set at 0.29/case (95% CI: 0.23-0.37). In the lesion-based analysis, the recall rate was 95.6% (95% CI: 89.1%-98.3%) with a Dice index of 0.752 (95% CI: 0.708-0.796). Cohort 2 contained 245 cases (111 cases with aneurysms and 134 controls). The model reached accuracy, sensitivity, and specificity of 86.1% (95% CI: 81.2%-89.9%), 88.3% (95% CI: 81.0%-93.0%), and 84.3% (95% CI: 77.2%-89.5%), respectively, with a recall rate of 79.7% (95% CI: 72.5%-85.4%) and FPs of 0.26/case (95% CI: 0.21-0.32). Cohort 3 contained 211 cases, of which 39 patients had aneurysms. The framework reached accuracy, sensitivity, and specificity of 80.1% (95% CI: 74.2%-84.9%), 82.1% (95% CI: 67.3%-91.0%), and 79.7% (95% CI: 73.0%-85.0%), respectively, with a recall rate of 72.3% (95% CI: 58.2%-83.1%) and FPs of 0.27/case (95% CI: 0.22-0.34).
Conclusion: We developed a deep learning model for automatic detection and segmentation of intracranial aneurysms on CTA images; it showed excellent sensitivity and segmentation performance in both internal and external datasets.

Part Ⅱ The Comprehensive Analysis of a Deep Learning Model for Detection and Segmentation of Intracranial Aneurysms in CTA Images
Purpose: To evaluate the influence of occult cases (namely, CTA-negative but DSA-positive aneurysms), image quality, and scanner manufacturer on the performance of the deep learning model for detection and segmentation of intracranial aneurysms on CTA images.
Methods: Patients with
occult cases who had undergone CTA examinations and digital subtraction angiography (DSA) at Jinling Hospital between Jun. 1, 2009 and Mar. 31, 2017 were collected (Cohort 4), and the model was applied to this cohort for validation. Consecutive patients who underwent CTA examinations and DSA during 2018 (Cohort 5) were collected to assess the effect of image quality. CTA image quality was rated on a four-point scale based on the degree of noise, vessel sharpness, and overall quality. To assess the effect of different manufacturers, consecutive eligible DSA-verified CTA cases acquired in 2013-2018 at Tianjin First Central Hospital were collected (Cohort 6); these covered scanners from 3 manufacturers: GE Revolution, Siemens SOMATOM Definition Flash, and Toshiba Aquilion ONE. Patient-based performance metrics included sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV); lesion-based performance metrics included recall rate, false positives per case (FPs), and Dice index. The model’s metrics were evaluated separately in each cohort, and 95% Wilson score confidence intervals (CIs) were used to assess variability. A two-sided Pearson’s chi-squared test or Fisher’s exact test, as appropriate, was used to test for significant differences in sensitivity, specificity, accuracy, PPV, and NPV between groups. The threshold for statistical significance was p<0.05 for two-sided tests, and Bonferroni-adjusted p values were used to account for multiple comparisons.
Results: Cohort 4 included 31 occult cases containing 43 aneurysms. Our model detected 5 occult aneurysms in 5 patients. Cohort 5 contained 151 patients, of whom 46 had 59 aneurysms. There were 10, 43, 65, and 33 cases with image quality scores of 1-4, respectively. The sensitivities were 66.7%, 100%, 73.9%, and 83.3% for scores 1-4, and the corresponding specificities were
85.7%, 89.7%, 85.7%, and 92.6%, respectively. There were no significant differences among the four groups (all p>0.05). Cohort 6 enrolled 59 patients (with 50 aneurysms in 39 patients), among whom CTA was acquired on GE Revolution in 13 patients (10 with aneurysms), on Siemens SOMATOM Definition Flash in 21 (18 with aneurysms), and on Toshiba Aquilion ONE in 25 (11 with aneurysms). The sensitivity and specificity were 70.0% and 66.7% for GE, 72.2% and 66.7% for Siemens, and 45.5% and 50% for Toshiba, with no significant differences (all p>0.05).
Conclusion: Our model had a high tolerance to differences in image quality and scanner manufacturer. Interestingly, our model showed potential to detect some occult aneurysms that are CTA-negative but DSA-positive, owing to the special properties of deep learning.

Part Ⅲ Clinical Applications of a Deep Learning Model for Detection and Segmentation of Intracranial Aneurysms in CTA Images
Purpose: To further investigate the clinical applications of the deep learning model for detection and segmentation of intracranial aneurysms on CTA images in the routine clinical setting and the acute ischemic stroke (AIS) setting, and to compare the performance of the model with that of radiologists.
Methods: Consecutive real-world cases undergoing head or head/neck CTA were collected from one internal dataset (Jun. 1, 2019 to Jul. 31, 2019, Jinling Hospital; Cohort 7) and one external dataset (Aug. 1, 2018 to Sep. 30, 2019, Lianyungang First People’s Hospital; Cohort 8) to validate the performance of the model. We compared the performance of the model with that of 6 board-certified radiologists (2 resident radiologists, 2 attending radiologists, and 2 assistant director radiologists), who were required to make their diagnoses independently. The average performance of the radiologists and of the model, and the corresponding reading times, were recorded in the whole group, the subarachnoid hemorrhage (SAH) group, and the non-SAH group. Patient-based performance metrics
included sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV); aneurysm-based performance metrics included recall rate, false positives per case (FPs), and Dice index. For the AIS setting, patients with suspected AIS from Jul. 1, 2018 to Jul. 31, 2019 (Cohort 9) were collected to test whether confident screening of aneurysm-negative cases can reduce radiologists’ workload. For normally distributed data, the independent-samples t test was used; otherwise, the Mann–Whitney U test was applied to compare differences in time to diagnosis. A two-sided Pearson’s chi-squared test was used to test for significant differences in performance between groups. For comparisons with radiologists, the choice of superiority or non-inferiority testing was based on what seemed attainable from simulations conducted in Cohorts 7-8. The confidence limits of the difference were based on Gart and Nam’s score method with skewness correction. For non-inferiority comparisons, a 5% absolute margin was pre-specified before the test set was inspected. The model’s metrics were evaluated separately in each cohort, and 95% Wilson score confidence intervals (CIs) were used to assess variability.
Results: Cohort 7 enrolled 374 patients, with 71 aneurysms in 53 patients. For the radiologists, the microaveraged sensitivity and specificity were 58.5% (95% CI: 53.0%-63.8%) and 95.3% (95% CI: 94.2%-96.1%) in the whole group; 66.7% (95% CI: 54.1%-77.3%) and 95.4% (95% CI: 89.6%-98.0%) in the SAH group; and 56.6% (95% CI: 50.5%-62.5%) and 95.3% (95% CI: 94.2%-96.2%) in the non-SAH group. The radiologists had higher PPV in the SAH group than in the non-SAH group [88.9% (95% CI: 76.5%-95.2%) vs 56.9% (95% CI: 42.2%-70.4%), p=0.001], and the reverse held for NPV [83.7% (95% CI: 76.2%-89.2%) vs 94.0% (95% CI: 90.7%-96.2%), p<0.001]. The recall rates were 50.3% (95% CI: 45.5%-55.0%), 54.8% (95% CI: 44.1%-65.0%), and 49.1% (95% CI: 43.9%-54.4%) for radiologists in
the three groups. The framework tended to have higher sensitivity [69.8% (95% CI: 56.5%-80.5%, p=0.119), 80.0% (95% CI: 49.0%-94.3%, p=0.636), and 67.4% (95% CI: 52.5%-79.6%, p=0.182)] and NPV [94.6% (95% CI: 91.4%-96.7%, p=0.390), 88.9% (95% CI: 67.2%-96.9%, p=0.830), and 95.0% (95% CI: 91.8%-97.0%, p=0.487)]. The framework also had a comparable recall rate of 59.2% (95% CI: 47.5%-69.8%, p=0.164), 64.3% (95% CI: 38.8%-83.7%, p=0.506), and 57.9% (95% CI: 45.0%-69.8%, p=0.220). The mean diagnosis time per examination, microaveraged across clinicians, was 30.1 seconds (95% CI: 29.2-31.0 seconds), whereas the framework took 18.2 seconds (95% CI: 17.9-18.4 seconds) per case and was significantly faster than the radiologists (p<0.001). Cohort 8 contained 316 patients (76 aneurysms in 60 patients) and showed similar results. For the radiologists, the microaveraged sensitivity and specificity were 70.8% (95% CI: 65.9%-75.3%) and 95.6% (95% CI: 94.4%-96.5%) in the whole group; 81.3% (95% CI: 74.3%-86.8%) and 96.2% (95% CI: 91.4%-98.4%) in the SAH subgroup; and 63.3% (95% CI: 56.6%-69.6%) and 95.5% (95% CI: 94.3%-96.5%) in the non-SAH subgroup. PPV was higher in the SAH group than in the non-SAH group [96.1% (95% CI: 91.1%-98.3%) vs 67.7% (95% CI: 50.5%-81.1%), p<0.001], and the reverse held for NPV [82.5% (95% CI: 63.9%-92.6%) vs 94.5% (95% CI: 90.8%-96.8%), p<0.001]. The microaveraged recall rates were 61.6% (95% CI: 57.1%-66.0%), 72.4% (95% CI: 65.7%-78.2%), and 53.8% (95% CI: 47.8%-59.7%), respectively. The framework tended to have higher sensitivity [81.7% (95% CI: 70.1%-89.4%, p=0.082), 92.0% (95% CI: 75.0%-97.8%, p=0.306), and 74.3% (95% CI: 57.9%-85.8%, p=0.209)], NPV [94.5% (95% CI: 90.5%-96.9%, p=0.516), 88.9% (95% CI: 67.2%-96.9%, p=0.683), and 95.1% (95% CI: 90.9%-97.4%, p=0.772)], and recall rate [75.0% (95% CI: 64.2%-83.4%, p=0.025), 84.8% (95% CI: 69.1%-93.3%, p=0.131), and 67.4% (95% CI: 52.5%-79.5%, p=0.095)]. The microaveraged diagnosis time was 27.1 seconds (95% CI: 26.3-28.0 seconds), and the framework took 19.6 seconds (95% CI: 19.3-20.0 seconds) per examination with a significant
difference (p=0.001). Overall, the model was non-inferior to the radiologists. In the AIS setting, our model had a specificity of 88.7% (95% CI: 83.7%-92.4%), whereas the sensitivity was relatively low (40.0%, 95% CI: 16.8%-68.7%); the NPV was 96.8% (95% CI: 93.2%-98.5%). With triage by the framework, 87.4% of patients were predicted as negative, of which 96.8% were true negatives, and the other 12.6% were predicted as a high-risk group with aneurysms.
Conclusion: Our model had comparable or even higher performance than the radiologists in the internal and external cohorts and can act as a complementary tool for the diagnosis of aneurysms. In the AIS setting, the model can exclude aneurysm-negative cases with high confidence, helping to prioritize the clinical workflow and improve accuracy.
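
As an illustration of the patient-based metrics and the 95% Wilson score confidence intervals reported throughout the abstract, the following Python sketch computes both from a 2x2 confusion matrix. It is not the thesis code. The function names are hypothetical, and the counts used (73 of 75 aneurysm cases flagged, 56 of 75 controls cleared) are not stated in the abstract; they are inferred from the Part Ⅰ testing-set percentages and the stated 150-case split. Under that assumption, the sketch gives intervals of about 90.8%-99.3% for sensitivity and 63.8%-83.1% for specificity, in line with the reported values.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Wilson score interval for a binomial proportion successes/total."""
    if total <= 0:
        raise ValueError("total must be positive")
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = successes / total
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


def patient_metrics(tp, fn, tn, fp):
    """Return {metric: (estimate, ci_low, ci_high)} from patient-level counts.

    Here fp counts control patients flagged as positive; it is not the
    lesion-level "false positives per case" figure quoted in the abstract.
    """
    counts = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "accuracy": (tp + tn, tp + fn + tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {
        name: (k / n, *wilson_interval(k, n)) for name, (k, n) in counts.items()
    }


if __name__ == "__main__":
    # Assumed counts, inferred from the reported testing-set percentages.
    for name, (est, lo, hi) in patient_metrics(tp=73, fn=2, tn=56, fp=19).items():
        print(f"{name}: {est:.1%} (95% CI: {lo:.1%}-{hi:.1%})")
```

The Wilson interval stays inside [0, 1] and behaves sensibly for proportions near 100% with small denominators, which is the situation for several of the subgroup sensitivities above.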
Keywords/Search Tags: Computed tomography angiography, Digital subtraction angiography, Intracranial aneurysms, Deep learning, Medical image segmentation, Object detection, Image quality, Head computed tomography angiography, Real world, Human-model comparison