| Objective: Non-small cell lung cancer (NSCLC) is the most common type of lung cancer, and its incidence increases with age. Age is an important risk factor for NSCLC, but the mechanisms of NSCLC in different age groups are not clear. The aim of this study was to identify differentially expressed genes between early-onset and late-onset NSCLC using the Scissor algorithm, to construct a prognostic risk model, and to investigate the relationship between the model and the prognosis and immunotherapy response of NSCLC. Methods: 1. Transcriptome expression data of lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) and the corresponding clinical data were downloaded from the UCSC Xena database, and age-related phenotypes of LUAD and LUSC were constructed after excluding patients with no information on overall survival. In this phenotype, patients aged ≤ 50 years were defined as early-onset and patients aged > 50 years as late-onset. The single-cell RNA sequencing (scRNA-seq) data of LUAD and LUSC were downloaded from the Gene Expression Omnibus (GEO) database. The age-related phenotypes, bulk RNA data, and scRNA-seq data from LUAD and LUSC were then used as input for the Scissor algorithm to identify age-related single-cell subgroups in the scRNA-seq data, and differential expression analysis was performed on the identified subgroups. Enrichment analysis of the differentially expressed genes was then performed to explore their biological functions. 2. The identified differentially expressed genes were further clustered using the non-negative matrix factorization (NMF) algorithm, and the differences in prognosis, immune microenvironment, and biological functions between the resulting subtypes were explored. 3. Univariate Cox regression analysis and least absolute shrinkage and selection operator (LASSO) analysis were used to further screen the differentially expressed genes and to construct an age-related prognostic risk score model, termed the age-related
score (ARscore), using the Gene Set Variation Analysis (GSVA) algorithm. The TCGA cohort and the three GEO cohorts were each divided into high- and low-risk groups based on the median value of the ARscore. The prognostic difference between the high- and low-ARscore groups was investigated using survival analysis. The association between ARscore and clinical characteristics was investigated using multivariate Cox regression analysis. 4. The ARscore was further computed in the NSCLC scRNA-seq dataset using the GSVA algorithm to investigate biological differences between the high- and low-ARscore groups. 5. The Estimation of Stromal and Immune Cells in Malignant Tumors Using Expression Data (ESTIMATE) and single-sample gene-set enrichment analysis (ssGSEA) algorithms were used to further assess the differences in the tumor immune microenvironment between the high- and low-ARscore groups. 6. The Wilcoxon test was used to evaluate the relationship between ARscore and immunotherapy response. Results: 1. Two age-related single-cell subgroups were identified in the scRNA-seq datasets of LUAD and LUSC by the Scissor algorithm. Differential analysis showed that a total of 85 genes were significantly differentially expressed between the two age-related single-cell subgroups. Enrichment analysis showed that these genes were mainly associated with extracellular processes, immunity, inflammation, apoptosis, and tumor progression. 2. Clustering with the NMF algorithm yielded two molecular subgroups (C1 and C2). Cluster 1 had a survival advantage (P = 0.042) and a greater abundance of immune infiltrates compared with cluster 2. Enrichment analysis revealed that cluster 1 was primarily associated with autoimmune disease, metabolic, inflammatory, and cancer pathways, whereas cluster 2 was primarily associated with cell cycle, cellular metabolism, and carcinogenic targets. 3. In the TCGA cohort and the three GEO cohorts, the high-ARscore group had a worse prognosis than the low-ARscore group (P < 0.05). In multivariate Cox regression
analysis, ARscore was found to be an independent prognostic factor (P < 0.05). 4. The genes differentially expressed between the high- and low-ARscore groups in the single-cell dataset were mostly enriched in pathways related to transcription, translation, and modification; extracellular transport; autoimmunity; neurodegeneration; and cell clearance. 5. The high-ARscore group had higher immune scores, stromal scores, and ESTIMATE scores in the tumor immune microenvironment than the low-ARscore group, and a large number of immune cell types were significantly enriched in the high-ARscore group. 6. The high-ARscore group had higher expression of the six common immune checkpoint genes and responded better than the low-ARscore group to anti-PD-1 therapy, anti-CTLA-4 therapy, or a combination of the two. Conclusions: Based on the Scissor algorithm, we constructed an age-related prognostic risk model, which can be used as a novel method to predict prognosis and immunotherapy response in NSCLC. |
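
The two grouping rules stated in the Methods (the 50-year age cutoff and the median split of the ARscore) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the toy patient IDs, and the toy values are hypothetical, and the handling of scores exactly equal to the median is an assumption, since the abstract does not specify it.

```python
from statistics import median

EARLY_ONSET_MAX_AGE = 50  # cutoff stated in the abstract: age <= 50 is early-onset


def age_phenotype(age_years):
    """Return the age-related phenotype label for one patient."""
    return "early-onset" if age_years <= EARLY_ONSET_MAX_AGE else "late-onset"


def split_by_median(scores):
    """Split samples into high/low groups at the cohort median.

    `scores` maps a sample ID to its score. Assumption: a score equal to
    the median is placed in the low group; the abstract does not say how
    ties are handled.
    """
    cutoff = median(scores.values())
    return {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}


# Hypothetical toy cohort, made up purely for illustration.
toy_ages = {"P1": 45, "P2": 50, "P3": 51, "P4": 72}
toy_scores = {"P1": 0.10, "P2": -0.30, "P3": 0.25, "P4": -0.05}

phenotypes = {pid: age_phenotype(a) for pid, a in toy_ages.items()}
groups = split_by_median(toy_scores)
```

Because the split is made at each cohort's own median, the same score value could fall in different groups in the TCGA cohort and in a GEO cohort.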