| Background: Triple-negative breast cancer (TNBC), which comprises approximately 10-20% of all breast cancers, is defined by the lack of expression of the estrogen receptor (ER) and progesterone receptor (PR) and the absence of amplification or overexpression of HER2. Weighted gene co-expression network analysis (WGCNA) is a systems biology method that can identify modules of highly correlated genes and clarify the association between modules and clinical traits at the transcriptome (mRNA) level. Therefore, this study aims to identify prognosis-related gene expression in TNBC based on WGCNA by utilizing TNBC gene expression data in public databases, and to explore the molecular mechanisms involved in the prognosis of TNBC. Objective: 1. Establish an mRNA expression database and a clinical information database for TNBC based on the Gene Expression Omnibus (GEO) and TCGA; 2. Construct the gene co-expression network using weighted correlation network analysis, identify specific modules and hub genes, and explore the biological processes and pathways involved in the progression of TNBC; 3. Clarify the critical role of hub genes, which may help to assess the malignant potential and prognosis of TNBC, and develop specific molecular targets for more effective treatment of TNBC. Materials and Methods: 1. In this study, mRNA expression data and clinical trait information for breast cancer were downloaded from the GEO database in NCBI, using the keywords "breast cancer", and from the TCGA database. The search strategy was designed as follows: the study type was expression profiling by array, and the entry type was "datasets". The sample size of each selected dataset had to be greater than or equal to 100. The organism was Homo sapiens. Database searching was carried out independently by two researchers. 2. For multiple probes corresponding to one gene in the five datasets, the average mRNA expression value was taken as the gene expression value. The mRNA expression data for the five 
datasets was normalized and merged by gene name, and genes whose names were not found in all five datasets were deleted. Genes were then filtered by two criteria: genes with missing expression values in at least 10% of the samples were excluded, and genes in the top 50% by variance were retained. 3. In R 3.5.1, the gene co-expression networks were constructed using the WGCNA R package. For survival analysis, module eigengenes (MEs) and gene expression values were dichotomized into low- and high-expression groups using Cutoff Finder. The hazard ratio (HR) was determined with a Cox regression model, and survival curves were plotted from Kaplan-Meier estimates. Differences in gene expression between groups were analyzed using the Mann-Whitney U test. A P-value < 0.05 was considered significant in this study. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses for the identified modules were performed using Cytoscape software (version 3.5.1) with the ClueGO V2.5.0 plug-in. Results: 1. According to the inclusion criteria, a total of 5 microarray datasets (GSE16446, GSE25055, GSE25065, GSE58812, and TCGA) that contained both survival outcomes and clinical information were included. By preprocessing the five datasets, a combined dataset containing 459 TNBC patients and the mRNA expression of 5,782 genes was compiled. 2. In this study, 11 distinct co-expression modules were identified (ranging in size from 38 to 2251 genes). These co-expression modules are shown in different colors. In the multivariate survival analysis, the red module was associated with the prognosis of TNBC (HR=0.38, 95% CI: 0.18-0.79; P=0.010); the green-yellow module was associated with the prognosis of TNBC (HR=0.41, 95% CI: 0.25-0.69; P=0.001); and the tan module was associated with the prognosis of TNBC (HR=3.41, 95% CI: 1.46-7.96; P=0.005). The red module was correlated with clinical stage in TNBC (r=-0.12, P=0.030), and the green module was correlated with clinical stage (r=0.11, P=0.050). The Mann-Whitney U test showed that the mRNA expression 
values of the MEs in the red module differed significantly between the relapse and non-relapse groups of TNBC patients (Z=-2.39, P=0.017). Based on the above analysis, the red module was identified as a key gene module in the prognosis of TNBC. 3. In Cytoscape, the 276 genes in the red module were subjected to enrichment analysis using ClueGO. The top 10 significantly enriched GO terms were as follows: mRNA processing, regulation of mitotic nuclear division, cellular response to topologically incorrect protein, interaction with symbiont, Golgi vesicle transport, mitotic cytokinesis, regulation of TOR signaling, transcription elongation from RNA polymerase II promoter, organelle localization by membrane tethering, and histone lysine methylation. In the KEGG analysis, 3 pathways were significantly enriched: the Hedgehog signaling pathway (KEGG:04340), the GnRH signaling pathway (KEGG:04912), and the thyroid hormone signaling pathway (KEGG:04919). 4. In Cytoscape, the gene co-expression network of the red module was visualized. The correlation between the expression values of the 276 genes and the MEs of the red module was calculated. Combined with the connectivity between genes, 12 hub genes were identified in the red module: APC, ATRX, CHD1, CHD9, COL4A3BP, DCP2, DMXL1, KIAA1033, RAPGEF6, TRIM23, TTC37, and ZFYVE16. ROC curve analysis of the combination of these hub genes showed that it could distinguish recurrence (or metastasis) from non-recurrence (or non-metastasis) in TNBC patients (AUC=0.57; P=0.023). The Mann-Whitney U test showed that the mRNA expression of ATRX differed significantly between TNBC patients with and without recurrence (or metastasis) (Z=-2.25, P=0.024). The multivariate survival analysis showed that ATRX (HR=0.60, 95% CI: 0.38-0.96; P=0.033), CHD9 (HR=0.37, 95% CI: 0.15-0.93; P=0.033), and TRIM23 (HR=0.29, 95% CI: 0.09-0.93; P=0.038) were each associated with the prognosis of TNBC. Conclusion: 1. WGCNA was first used for 
TNBC and could find biologically meaningful modules. In this study, the red module was identified as a key gene module in the prognosis of TNBC, and high expression of the red module predicted a better prognosis for TNBC patients. 2. The genes in the red module were mainly involved in biological processes and pathways such as TOR signaling, histone lysine methylation, the Hedgehog signaling pathway, and the GnRH signaling pathway, suggesting that these may be key pathways and important mechanisms affecting the progression of TNBC through effects on cell proliferation, invasion, and migration. 3. In the present study, APC, ATRX, CHD1, CHD9, COL4A3BP, DCP2, DMXL1, KIAA1033, RAPGEF6, TRIM23, TTC37, and ZFYVE16 were considered hub genes that may affect the prognosis of TNBC at the mRNA level, which provides some clues for exploring the molecular mechanism of TNBC prognosis. However, owing to limited research conditions, additional research is still needed to support the results on the critical pathways and genes. |
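
The preprocessing described in Materials and Methods (averaging multiple probes per gene, keeping only genes shared by all datasets, and filtering by missing values and variance) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which was run in R. The function names, the data layout (dictionaries of per-sample value lists, with `None` marking a missing value), and the reading of the variance criterion as "retain the top 50% of genes by variance" are assumptions made for illustration, and the normalization step is omitted.

```python
from statistics import mean, pvariance


def collapse_probes(probe_rows):
    """Average all probes that map to the same gene, sample by sample.

    probe_rows: list of (gene_name, values_per_sample); None marks a missing value.
    Returns {gene_name: averaged_values_per_sample}.
    """
    grouped = {}
    for gene, values in probe_rows:
        grouped.setdefault(gene, []).append(values)
    collapsed = {}
    for gene, rows in grouped.items():
        averaged = []
        for column in zip(*rows):  # one column = one sample across probes
            present = [v for v in column if v is not None]
            averaged.append(mean(present) if present else None)
        collapsed[gene] = averaged
    return collapsed


def merge_datasets(datasets):
    """Keep genes present in every dataset and concatenate their samples."""
    shared = set(datasets[0])
    for dataset in datasets[1:]:
        shared &= set(dataset)
    return {g: [v for d in datasets for v in d[g]] for g in sorted(shared)}


def filter_genes(expr, max_missing=0.10, keep_fraction=0.50):
    """Drop genes missing in at least max_missing of the samples,
    then keep the keep_fraction of remaining genes with the highest variance.
    Both thresholds mirror the text; their exact application is an assumption.
    """
    kept = {
        gene: values
        for gene, values in expr.items()
        if sum(v is None for v in values) / len(values) < max_missing
    }

    def variance(gene):
        return pvariance([v for v in kept[gene] if v is not None])

    ranked = sorted(kept, key=variance, reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return {gene: kept[gene] for gene in ranked[:n_keep]}
```

In this sketch each dataset would be passed through `collapse_probes`, the results combined with `merge_datasets`, and the merged matrix reduced with `filter_genes` before network construction.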