The Study Of Predicting The Expression Level Of Cancer Stem Cell Marker Genes In Lower-grade Gliomas Based On MRI Radiomic Features

Posted on: 2023-03-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z H Wang
Full Text: PDF
GTID: 1524306791982709
Subject: Clinical medicine
Abstract/Summary:
Gliomas are the most common primary malignant tumors of the central nervous system (CNS). Among the glioma subtypes, glioblastomas (grade Ⅳ) are highly proliferative and aggressive and have the worst prognosis, with an average 5-year survival rate below 10%. In contrast, the survival prognosis of lower-grade gliomas (LGG; grade Ⅱ and grade Ⅲ) is complex and diverse. In 2016, the World Health Organization (WHO) incorporated molecular diagnostic criteria (e.g., isocitrate dehydrogenase (IDH) mutation status and 1p/19q co-deletion status) into glioma classification; together with histological subtyping, these constitute the current diagnostic criteria for glioma and have improved the accuracy of predicting treatment effect and prognosis. With the rapid development of molecular biology, an increasing number of genes have been shown to be significantly associated with glioma growth, proliferation, treatment resistance, and prognosis. Glioma stem cells (GSCs) have the characteristics of cancer stem cells (CSCs), which can affect tumor occurrence, development, metastasis, recurrence, and resistance to chemoradiotherapy. At present, detecting GSC surface markers by pathological examination is the main means of identifying GSCs; in addition, GSC surface markers are thought to have great potential value in tumor immunotherapy. Previous studies have shown that the proteins encoded by the CD44, CD133, oligodendrocyte transcription factor 2 (OLIG2), and cyclin D2 (CCND2) genes can be used as specific surface markers for GSCs. One aim of this study was therefore to evaluate the association between these four GSC surface markers and important clinicopathological features and prognosis in LGG patients.

Because pathological examination is invasive, time-consuming, and subject to temporal and spatial lag, and because local specimens may not reflect the overall nature of the tumor, a noninvasive, simple, and efficient evaluation method is needed. At present, magnetic resonance imaging (MRI) has been widely used in the
preoperative noninvasive diagnosis of glioma; however, it cannot adequately reflect the physiological and pathological characteristics of the tumor, and its diagnostic accuracy is unsatisfactory because observers are strongly influenced by subjective and empirical factors. Radiomics is an emerging noninvasive imaging diagnostic method that combines computer technology with medical image data, extracting a large number of quantitative features from medical images to characterize the shape, size, and texture of a lesion. Radiogenomics further combines radiomics with molecular biology techniques, opening a new field of molecular imaging research. Artificial intelligence (AI) has become an important part of radiomics research, and machine learning (ML) algorithms are the core technology of AI. Common ML algorithms include logistic regression (LR), support vector machine (SVM), and random forest (RF). As AI technology develops, the diagnostic performance of radiomics models continues to improve. Another aim of this study was therefore to evaluate how well MRI radiomics features predict the expression levels of GSC surface markers in LGG patients, and to compare the performance of three machine learning classifiers (LR, SVM, and RF), in order to provide a reliable reference for clinical decision-making and for researchers in the same field.

Part 1. Exploring the expression differences and prognostic value of the four cancer stem cell marker genes in lower-grade glioma

Background and objective: Previous studies have demonstrated that the proteins encoded by four genes, CD44, CD133, OLIG2, and CCND2, can all serve as surface markers for GSCs. Using bioinformatics analysis methods, this part explores the potential associations between the expression of CD44, CD133, OLIG2, and CCND2 and the clinicopathological characteristics and survival prognosis of LGG
patients.

Method: Gene expression data, clinicopathological information, and survival prognosis information were downloaded from the TCGA and cBioPortal databases. First, the expression of each of the four genes was compared between groups defined by age, sex, tumor grade, IDH mutation status, and 1p/19q co-deletion status. Second, univariate logistic regression was used to assess the potential association between each gene and tumor grade, IDH mutation status, and 1p/19q co-deletion status. Third, for each of the four genes, cases were divided into high- and low-expression groups using the median expression value as the cut-off; differences in overall survival (OS) and progression-free survival (PFS) rates between the two groups were assessed by Kaplan-Meier survival analysis, and survival curves were plotted. Finally, univariate and multivariate Cox proportional hazards regression analyses were used to assess the potential prognostic value of the four genes in LGG patients.

Result:

1. After screening, a total of 505 cases were included in the study. CD44 expression differed significantly by tumor grade, IDH mutation status, and 1p/19q co-deletion status (all P < 0.05), but not by age or sex. CD133 expression differed significantly by tumor grade, IDH mutation status, and 1p/19q co-deletion status (all P < 0.05), but not by age or sex. OLIG2 expression differed significantly by age, IDH mutation status, and 1p/19q co-deletion status (all P < 0.05), but not by sex or tumor grade. CCND2 expression differed significantly by age, tumor grade, and IDH mutation status (all P < 0.05), but not by sex or 1p/19q co-deletion status.

2. Univariate logistic regression analysis revealed significant associations of CD44
expression with tumor grade, IDH mutation status, and 1p/19q co-deletion status, with odds ratios (OR) and 95% confidence intervals (CI) of 1.016 (95% CI: 1.008~1.025), P < 0.01; 0.986 (95% CI: 0.978~0.993), P < 0.01; and 0.952 (95% CI: 0.936~0.967), P < 0.01, respectively. CD133 expression was also significantly associated with tumor grade, IDH mutation status, and 1p/19q co-deletion status, with OR (95% CI) values of 1.477 (95% CI: 1.248~1.747), P < 0.01; 0.597 (95% CI: 0.510~0.700), P < 0.01; and 0.560 (95% CI: 0.431~0.747), P < 0.01, respectively. OLIG2 expression was not associated with tumor grade (P > 0.05), but was significantly associated with IDH mutation status and 1p/19q co-deletion status, with OR (95% CI) values of 1.024 (95% CI: 1.019~1.029), P < 0.01, and 1.004 (95% CI: 1.001~1.007), P < 0.01, respectively. CCND2 expression was significantly associated with tumor grade, IDH mutation status, and 1p/19q co-deletion status, with OR (95% CI) values of 1.012 (95% CI: 1.005~1.019), P < 0.01; 0.993 (95% CI: 0.987~0.998), P < 0.05; and 0.991 (95% CI: 0.984~0.999), P < 0.05, respectively.

3. Kaplan-Meier survival analysis and the survival curves showed statistically significant differences in OS and PFS rates between the high- and low-expression groups of CD44, CD133, OLIG2, and CCND2 in patients with LGG (all P < 0.05, log-rank test).

4. In the univariate Cox proportional hazards regression analysis, CD44, CD133, OLIG2, and CCND2 were all significantly associated with overall survival in patients with LGG. Their hazard ratios (HR), 95% CIs, and P-values were: 1.012, 1.007~1.016, P < 0.01; 1.137, 1.089~1.187, P < 0.01; 0.993, 0.990~0.997, P < 0.01; and 1.005, 1.002~1.009, P < 0.01, respectively. However, in the multivariate Cox proportional hazards regression analysis, besides age, tumor grade, IDH mutation status, and 1p/19q co-deletion status, only CD44 was
an independent risk factor for overall survival in patients with LGG, with an HR (95% CI) of 1.007 (1.002~1.013) and P < 0.05.

Conclusion: The expression levels of the four CSC marker genes, CD44, CD133, OLIG2, and CCND2, can significantly affect the tumor characteristics of LGG and the survival prognosis of patients with LGG. Among them, CD44 is an independent risk factor for the overall survival of LGG patients.

Part 2. Applying multiple machine learning methods to evaluate the value of radiomics features based on T2 FLAIR images in predicting the expression level of CD44 in patients with lower-grade glioma

Background and objective: Previous studies have shown that radiomics can effectively predict the molecular typing of gliomas. This part therefore evaluates the value of radiomic features from T2-weighted fluid-attenuated inversion recovery (T2 FLAIR) images for predicting the CD44 expression level in patients with LGG, and compares the performance of three common machine learning classifiers: LR, SVM, and RF.

Method: A total of 108 cases that met the screening criteria were included in the study. Using the median CD44 expression value of the cases in Part 1 as the cut-off, these 108 cases were divided into high- and low-expression groups to establish the labels for radiomics prediction. After the region of interest (ROI) was delineated and radiomic features were extracted, feature reduction by the least absolute shrinkage and selection operator (LASSO) and multivariate logistic regression (MLR) left a total of nine features. Based on these selected features, predictive models of CD44 expression level were built with LR, SVM, and RF, and 5-fold cross-validation was performed to evaluate the performance of the three classifiers. Finally, using the logistic regression classifier, the predictive performance of the radiomics model was compared with that of the clinical-radiomics combined model. The models were evaluated
by plotting the receiver operating characteristic (ROC) curve and calculating the area under the curve (AUC), sensitivity, specificity, accuracy, positive predictive value, and negative predictive value.

Results: After feature reduction, a total of nine features were used to construct the CD44 prediction model: “original first order Minimum”, “wavelet-HLL first order 90Percentile”, “wavelet-LHL gldm Dependence Variance”, “wavelet-LHL first order 10Percentile”, “wavelet-HLH glrlm Long Run Low Gray Level Emphasis”, “wavelet-HLH glszm Large Area High Gray Level Emphasis”, “wavelet-HHH glszm Gray Level Non Uniformity Normalized”, “wavelet-HHL ngtdm Strength”, and “wavelet-LLL glcm Imc1”. In 5-fold cross-validation, the average AUC, sensitivity, specificity, accuracy, positive predictive value, and negative predictive value of the three classifier models were: LR: 0.877, 0.761, 0.835, 0.805, 0.845, 0.813; SVM: 0.852, 0.773, 0.774, 0.768, 0.751, 0.788; RF: 0.811, 0.674, 0.752, 0.722, 0.700, 0.724. In addition, a comparison of the radiomics model and the clinical-radiomics combined model using LR showed no statistically significant difference between them.

Conclusion: The radiomic features of T2 FLAIR images have good predictive value for the expression level of CD44 in lower-grade gliomas.
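
The labeling and evaluation steps described above (a median cut-off for high versus low expression, then AUC, sensitivity, specificity, accuracy, and predictive values) can be illustrated with a minimal Python sketch. It is not the study's code: the function names `median_split`, `binary_metrics`, and `roc_auc` are hypothetical, the toy numbers are invented, and two details are assumptions because the abstract does not state them, namely that values equal to the median fall in the low group and that a score of 0.5 or more counts as a "high" prediction.

```python
from statistics import median


def median_split(values):
    """Label a case 1 (high expression) if its value exceeds the median, else 0.

    Assumption: values equal to the median go to the low group.
    """
    cut = median(values)
    return [1 if v > cut else 0 for v in values]


def binary_metrics(y_true, y_pred):
    """Compute sensitivity, specificity, accuracy, PPV and NPV.

    Assumes both classes occur in y_true and in y_pred, so no
    denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def roc_auc(y_true, scores):
    """AUC as the fraction of (high, low) pairs in which the high case
    receives the larger score; tied scores count as half a pair."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration (not taken from the study).
expression = [2.0, 5.0, 3.0, 8.0, 7.0, 1.0]
scores = [0.2, 0.9, 0.6, 0.7, 0.4, 0.1]  # hypothetical classifier outputs

labels = median_split(expression)
predictions = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
metrics = binary_metrics(labels, predictions)
auc_value = roc_auc(labels, scores)
print(labels, metrics, auc_value)
```

In a 5-fold cross-validation setting, these measures would be computed on each held-out fold and then averaged, which is how the per-classifier averages reported above are presumably obtained.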
Keywords/Search Tags:radiomics, machine-learning, MRI, gliomas, cancer stem cell marker genes