
Study On Breast Cancer Prognosis And Stemness-related Gene Screening And Risk Model Establishment

Posted on: 2022-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: J Y Pei
GTID: 2504306491486784
Subject: Clinical Medicine
Abstract/Summary:
According to the 2020 global cancer report, there were 18.1 million new cancer cases in 2018, and this number is expected to keep rising over the next 20 years. Early detection and treatment of cancer are urgently needed. Breast cancer, the most common cancer in women, kills many young women. According to statistics released by the WHO in 2019, breast cancer ranks second worldwide in cancer incidence and fifth in cancer mortality, with 2,088,849 new cases (11.3%) and 626,679 deaths (6.6%), respectively. Among Chinese women with cancer, breast cancer accounted for 367,900 new cases (19.2%), ranking first.

With the rapid development of second-generation sequencing and the maturing of gene chip technology, large volumes of sequencing data and gene expression profile data have become available. At the same time, advances in artificial intelligence provide new algorithms and technical support for mining these data, making it possible to screen out target genes that can be used to build models predicting disease prognosis. Further analysis of these target genes yields the prediction model. The samples were then randomly divided into a training group and a test group, and the test group was used to validate the prediction model established in the training group.

This study is based on bioinformatics. Gene expression profile data and related omics data for breast cancer were mined from the TCGA and GEO databases. Perl was used for data preprocessing, and R was used for differential analysis, correlation analysis, construction of the key-gene interaction network, construction of the clinical prognostic risk model, and model validation. The key genes were also analyzed and verified using GEPIA, Oncomine, GEO, STRING, Kaplan-Meier Plotter and other websites. The original data and related clinical information for all patients in this study came from the TCGA and GEO databases. The R packages used include survival, affy, limma, WGCNA, pheatmap, survivalROC, beeswarm and other supporting packages. Protein-protein interaction (PPI) analysis of the key genes was carried out to construct the network of differentially expressed genes, and the results were visualized in Cytoscape. In addition, GO analysis and KEGG analysis of the key gene sets were carried out in R, and the genes used in modeling were analyzed for correlation with an immune infiltration database.

In the first part of this study, Perl scripts were used to organize the mRNA expression data from the TCGA database, and R was used to screen out differentially expressed genes. The immune-related differentially expressed genes were then extracted for prognostic correlation analysis, yielding the prognostic immune-related genes. Next, these immune prognostic genes were screened and modeled to obtain a combined model of 15 immune prognostic genes. The model was validated by prognostic survival analysis, receiver operating characteristic (ROC) curves, univariate and multivariate correlation analysis, and clinical stage correlation analysis.

In the second part, gene expression values from The Cancer Genome Atlas (TCGA) samples were combined with the mRNA expression-based stemness index (mRNAsi), and tumor purity was used to correct the mRNAsi. Analysis of the mRNAsi and the corrected mRNAsi showed that both were closely related to the clinical features of BRCA, including tumor depth, pathological stage and survival status. Key gene modules and key genes were identified by weighted gene co-expression network analysis (WGCNA). A series of functional analyses and expression verification of the key genes was carried out using several authoritative databases, such as Oncomine, Gene Expression Omnibus (GEO) and Gene Expression Profiling Interactive Analysis (GEPIA).
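The workflow described above (random training/test split, a multi-gene risk model, and purity correction of the mRNAsi) can be illustrated with a minimal sketch. The thesis used R; Python is used here only for illustration. Several details are assumptions not stated in the abstract: that the risk score is a linear sum of coefficient times expression, that patients are grouped by the median training-set score, and that the corrected mRNAsi is the mRNAsi divided by tumor purity. All function names, gene names and numbers are hypothetical.

```python
import random
from statistics import median


def split_samples(sample_ids, train_fraction=0.5, seed=0):
    """Randomly divide sample IDs into a training group and a test group."""
    ids = list(sample_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def risk_score(expression, coefficients):
    """Assumed linear risk score: sum of coefficient * expression per model gene."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def assign_risk_groups(scores, cutoff):
    """Label each sample 'high' if its score exceeds the cutoff, else 'low'."""
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}


def purity_corrected_mrnasi(mrnasi, purity):
    """Assumed correction: divide the stemness index by tumor purity (0 < purity <= 1)."""
    if not 0 < purity <= 1:
        raise ValueError("tumor purity must be in (0, 1]")
    return mrnasi / purity


if __name__ == "__main__":
    # Hypothetical toy data: two model genes, four samples.
    coefficients = {"GENE_A": 0.8, "GENE_B": -0.3}
    expression = {
        "s1": {"GENE_A": 2.0, "GENE_B": 1.0},
        "s2": {"GENE_A": 0.5, "GENE_B": 3.0},
        "s3": {"GENE_A": 1.5, "GENE_B": 0.2},
        "s4": {"GENE_A": 0.1, "GENE_B": 0.4},
    }
    train_ids, test_ids = split_samples(expression, train_fraction=0.5, seed=1)
    train_scores = {sid: risk_score(expression[sid], coefficients) for sid in train_ids}
    cutoff = median(train_scores.values())  # cutoff learned on the training group only
    test_scores = {sid: risk_score(expression[sid], coefficients) for sid in test_ids}
    print(assign_risk_groups(test_scores, cutoff))
    print(purity_corrected_mrnasi(0.6, 0.8))
```

The cutoff is derived from the training group and then applied unchanged to the test group, which mirrors the validation design described in the abstract. In practice the coefficients would come from a fitted survival model rather than being set by hand.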
Keywords/Search Tags: breast cancer, immune-related genes, prognosis risk model, stemness scores