Study On Screening Of Characteristic Genes And Diagnostic Prediction Model Of Thyroid Cancer

Posted on: 2024-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: C Zhang
GTID: 2544306932490594
Subject: Epidemiology and Health Statistics
Abstract/Summary:
Objective: To develop a reliable prediction model for the early diagnosis of thyroid cancer (TC) by screening the characteristic genes of TC in The Cancer Genome Atlas (TCGA) database and building a prediction model on them.

Methods: Gene expression profiles and clinical data of 541 patients with thyroid cancer were obtained from TCGA, and the samples were divided into a paracancerous group and a cancer group. Differentially expressed genes (DEGs) between TC samples and normal samples were screened with three packages: limma, edgeR, and DESeq2. The core modules related to TC were then identified by weighted gene co-expression network analysis (WGCNA), and the top 15 genes with the highest connectivity within the key module were selected as core genes. To improve model accuracy, the core genes were further screened by LASSO regression, and the retained genes were used as characteristic genes. Patients were randomly divided into a training set and a validation set at a ratio of 8:2, and prediction models were built with four traditional machine learning classification methods (decision tree, K-nearest neighbor, random forest, and support vector machine). Model performance was compared using accuracy, no information rate, Kappa, sensitivity, specificity, the receiver operating characteristic (ROC) curve, and the area under the curve (AUC).

Results: A total of 4059 DEGs were identified between the cancer group and the paracancerous group, and a co-expression network was built from these DEGs, in which the blue module was the key module related to TC. Gene-gene interaction analysis of the blue module was carried out with the cytoHubba plug-in in Cytoscape 3.9.0, and the top 15 genes were selected as candidate hub genes according to the MCC value of the nodes in the network. After further LASSO regression screening, eleven central genes (MPPED2, MRO, SORBS2, SLC4A4, ACACB, LRP1B, LRRC2, HGD, SAMD5, ASXL3, LRP2, ELMO1, ADIG, TRIM58, SPX) were preserved. The prediction models were built on a training set of 339 patients and evaluated on the validation set. The areas under the ROC curves for the models built with decision tree, K-nearest neighbor, support vector machine, and random forest were 0.911, 0.889, 0.905, and 0.911, respectively.

Conclusion: Eleven characteristic genes were identified as candidate biomarkers for TC and may provide a theoretical basis for its predictive diagnosis. The models built in this study discriminate thyroid cancer well, suggesting that the common genes have good predictive power and that each model is robust, which gives them potential value in clinical application.
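The evaluation measures named in the Methods (accuracy, no information rate, Kappa, sensitivity, specificity, AUC) can be illustrated with a minimal Python sketch. This is not the thesis code: the function names `binary_metrics` and `auc`, the label coding (1 = cancer, 0 = paracancerous), and the toy labels and scores are all assumptions made for illustration, and the AUC is computed by the pairwise-ranking definition rather than by whatever routine the author used.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics; assumes labels 1 = cancer, 0 = paracancerous
    and that both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    # No information rate: accuracy obtained by always predicting the majority class.
    nir = max(tp + fn, tn + fp) / n
    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance < 1 else 0.0
    return {
        "accuracy": accuracy,
        "no_information_rate": nir,
        "kappa": kappa,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    sample receives the higher score; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only; not taken from the thesis.
toy_true = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
toy_score = [0.9, 0.8, 0.7, 0.6, 0.55, 0.3, 0.1, 0.2, 0.4, 0.65]

for name, value in binary_metrics(toy_true, toy_pred).items():
    print(f"{name}: {value:.3f}")
print(f"auc: {auc(toy_true, toy_score):.3f}")
```

Comparing accuracy against the no information rate matters here because TCGA thyroid data contain far more tumor than paracancerous samples, so a model that always predicts "cancer" can still reach a high accuracy.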
Keywords/Search Tags: thyroid cancer, WGCNA, TCGA database, predictive model