| Objective: This study constructs an aging-related prognostic signature for breast cancer based on the TCGA database and combines it with clinicopathological factors to develop a clinically user-friendly nomogram, offering a new approach to identifying high-risk patients and assessing patient prognosis. Methods: mRNA data and corresponding clinical information for the TCGA-BRCA cohort were downloaded from the official website of The Cancer Genome Atlas (TCGA) and integrated into a gene expression matrix and a clinical information matrix using Perl. The "limma" and "survival" R packages were used to perform differential gene expression analysis and univariate Cox regression analysis, respectively, to identify differentially expressed aging-related genes closely associated with survival. Lasso Cox regression analysis was then used to further screen these genes and to construct an aging-related prognostic risk model. Patients were classified into high-risk and low-risk groups based on the median risk score; Kaplan-Meier survival analysis was used to compare survival between the two groups, and time-dependent receiver operating characteristic (ROC) curves were used to assess the model's ability to predict individual overall survival (OS) at 3, 5 and 7 years. Gene set enrichment analysis (GSEA) was used to identify the functions and pathways significantly enriched among genes differentially expressed between the high-risk and low-risk groups. Univariate and multivariate Cox regression were then used to identify independent prognostic factors in breast cancer, a nomogram based on these factors was developed using the "rms" R package, and its predictive power was assessed using calibration curves and the C-index. In addition, stratified analyses were performed in different clinicopathological subgroups. Finally, the prognostic risk model was validated in an independent
external dataset. Results: mRNA data for 1100 tumour samples and 112 normal tissue samples were downloaded from the TCGA database, together with clinical information for 1085 female breast cancer patients. First, differential gene expression analysis identified 162 differentially expressed genes (DEGs), of which 50 were up-regulated and 112 were down-regulated. In parallel, the curated senescence-related genes were subjected to univariate Cox regression analysis, which showed that 78 aging-related genes (ARGs), including TP63, NRG1, TBP and APOC3, were closely associated with the prognosis of breast cancer (BC) patients. The intersection of DEGs and ARGs yielded 12 senescence genes that were both differentially expressed and closely associated with prognosis, and these were subjected to Lasso Cox regression analysis. Based on the lambda.min value, three genes were excluded and nine key genes (NRG1, S100B, ALDH3A1, APOD, MMP7, CXCL14, IGFBP6, MAP2K6 and MMP1) were retained to construct the prognostic risk model. A risk score was calculated for each sample using the risk score formula, and the 1031 breast cancer patients were divided into a high-risk group (515 patients) and a low-risk group (516 patients) based on the median risk score. Kaplan-Meier (K-M) survival analysis showed that overall survival was significantly lower in the high-risk group (p < 0.001). Stratified analysis showed that, within the clinicopathological subgroups, OS was significantly longer in the low-risk group than in the high-risk group (p < 0.05). Next, univariate and multivariate Cox regression analyses showed that age at diagnosis, pathological stage and model risk score were independent prognostic factors for breast cancer. Based on the results of the multivariate regression analysis, a nomogram integrating clinicopathological factors and the model risk score was developed using the R
language. In the evaluation of the nomogram, the calibration curves at 3, 5 and 7 years were close to the diagonal and the calculated C-index was 0.769. Finally, the newly developed prognostic risk model was further validated in the GSE20685 dataset. Conclusion: 1. K-M survival analysis and ROC curves demonstrate that risk scores calculated from nine aging-related genes effectively identify high-risk breast cancer patients with poor prognosis. 2. A nomogram combining the aging-related gene risk score with clinicopathological factors was developed; calibration curves and the C-index showed that it accurately predicted patient survival at 3, 5 and 7 years and outperformed each of its constituent independent prognostic factors used alone. 3. Univariate and multivariate Cox regression analyses showed that the aging-related gene risk score was an independent prognostic factor in breast cancer patients. 4. Stratified analysis showed that the aging-related multigene prognostic model had good predictive value in different clinicopathological subgroups, including pathological stage I-II and III-IV, age >50 years, tumour length >5 cm and ≤5 cm, presence and absence of lymph node metastasis, and absence of distant metastasis. 5. The NRG1, S100B, APOD, MMP7, CXCL14, IGFBP6 and MMP1 genes used to construct the multigene model have been shown in previous studies to be closely related to breast cancer growth, invasion, metastasis and patient prognosis, and may become future therapeutic targets. |
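
The risk-score step described in the abstract (a weighted sum of gene expression values, a median split into high-risk and low-risk groups, and a C-index check) can be illustrated with a minimal Python sketch. This is not the authors' R pipeline. The abstract does not report the Lasso Cox coefficients, so the values in `COEFS`, the three-gene subset, the patient identifiers, and the toy expression and follow-up data are all hypothetical and serve only to show the arithmetic.

```python
from statistics import median

# HYPOTHETICAL coefficients for three of the nine signature genes.
# The abstract does not give the real Lasso Cox coefficients.
COEFS = {"NRG1": -0.10, "S100B": -0.05, "MMP1": 0.08}


def risk_score(expression, coefs=COEFS):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(beta * expression[gene] for gene, beta in coefs.items())


def split_by_median(scores):
    """Label a patient 'high' if the score is above the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


def concordance_index(times, events, scores):
    """Harrell's C: share of comparable pairs in which the higher score fails earlier."""
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # ties in score count as half
    return concordant / comparable


# Toy cohort (hypothetical expression values, follow-up in months, event flags).
patients = {
    "P1": {"NRG1": 2.0, "S100B": 1.0, "MMP1": 5.0},
    "P2": {"NRG1": 4.0, "S100B": 3.0, "MMP1": 1.0},
    "P3": {"NRG1": 1.0, "S100B": 0.5, "MMP1": 6.0},
    "P4": {"NRG1": 5.0, "S100B": 4.0, "MMP1": 0.5},
}
times = [30, 80, 20, 100]   # same order as the patients dict
events = [1, 0, 1, 0]       # 1 = death observed, 0 = censored

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
c_index = concordance_index(times, events, list(scores.values()))
print(groups, c_index)
```

In a real analysis the coefficients would come from the fitted Lasso Cox model, and the grouping would then feed the Kaplan-Meier comparison. The quadratic-time C-index loop is kept for readability and would normally be replaced by a survival-analysis library routine.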