Objective: Esophageal squamous cell carcinoma (ESCC) is a major subtype of esophageal tumors in China. The incidence of lymph node metastasis is high in ESCC, and lymph node metastasis is one of the main causes of poor prognosis. We therefore sought to identify genetic markers associated with lymph node metastasis to assist in its diagnosis and in prognosis prediction in esophageal cancer.

Methods: We first used the SEER database to define lymph node metastasis risk groups, and then grouped the GSE53625 dataset obtained from the GEO database according to this criterion to identify genetic markers associated with greater susceptibility to lymph node metastasis. We studied the transcription factors of these genes and the regulatory relationships of lncRNAs through an interaction network. ESCC mutation data obtained from the TCGA database were also used to examine the mutation status of these genes. A risk score model for lymph node metastasis was then constructed from these genes using LASSO logistic regression, and the correlations of the lymph node metastasis risk score with immune cell infiltration and the immune-stromal score were explored. Subsequently, survival analysis was performed to identify survival-associated genes among those included in the logistic regression model, and a risk score model for predicting patient survival was constructed using Cox regression. In addition, we used gene set enrichment analysis (GSEA) to explore which gene functions and pathways were significantly associated with the risk of lymph node metastasis across the entire dataset. In the second part of the study, the Gene Expression Profiling Interactive Analysis (GEPIA) and Gene Expression Omnibus (GEO) databases were used to examine PPFIA1 mRNA expression in esophageal cancer. The associations of PPFIA1 expression with clinicopathological variables and prognosis were evaluated in the GSE53625 dataset and verified in quantitative real-time polymerase chain reaction (qRT-PCR)-based cDNA array and immunohistochemistry (IHC)-based tissue microarray (TMA) datasets. Interactions between PPFIA1 and other genes in the protein-protein interaction (PPI) network were analyzed via the STRING website.

Results: Analysis of the GSE53625 dataset identified 2252 mRNAs differentially expressed between normal and tumor tissues, of which 69 were associated with the risk of lymph node metastasis. LASSO logistic regression on these genes yielded a lymph node metastasis risk score model containing 34 genes, which showed good predictive performance (AUC 0.969). The lymph node metastasis risk score was correlated with the immune-stromal score, the level of eosinophil infiltration, and the level of naive CD4+ T-cell infiltration. We then constructed a validated survival prediction scoring system (1-year survival AUC 0.706) based on 4 genes (PPFIA1, RHOU, MAMDC2, and RPL3L), and multivariate Cox regression analysis showed that the survival risk score independently affected patient prognosis. We also constructed a nomogram to predict patient prognosis using the survival risk score. GSEA revealed that focal adhesion and ECM-receptor interaction were the pathways most strongly correlated with the risk of lymph node metastasis. In the study of PPFIA1, online database analyses showed that PPFIA1 expression was significantly upregulated in ESCC tissues compared with adjacent normal tissues (all P < 0.05). High PPFIA1 expression was significantly associated with several clinicopathological features, including tumor size, histological grade, tumor invasion depth, lymph node metastasis, and tumor-node-metastasis (TNM) stage. High PPFIA1 expression was related to worse outcomes and was identified as an independent prognostic indicator of overall survival (OS) in ESCC patients (GSE53625 dataset, P = 0.004; cDNA array dataset, P < 0.001; TMA dataset, P = 0.039). PPI analysis demonstrated that PPFIA1 was highly correlated with multiple genes, including UNC13B, RAB3A, PTPRD, and SYT1.

Conclusion: We constructed a lymph node metastasis risk prediction model and a prognostic model, both of which showed good predictive performance. The metastasis risk score correlated with the immune-stromal score, eosinophil infiltration, and naive CD4+ T-cell infiltration of the tumor, indicating that lymph node metastasis risk is related to the tumor microenvironment. GSEA identified focal adhesion and ECM-receptor interaction as the pathways most closely associated with lymph node metastasis risk. In addition, PPFIA1 may be associated with ESCC progression and could serve as a biomarker for prognostic evaluation of ESCC patients.
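
As an illustration of the risk score approach described in the Methods, the following minimal Python sketch shows how a score from a fitted logistic regression model is typically computed as a weighted sum of gene expression values, and how its discrimination can be summarized by the AUC. It is a sketch under stated assumptions, not the implementation used in this study: the gene names (GENE_A, GENE_B), coefficients, expression values, and labels are hypothetical placeholders, and the LASSO fitting step that selects genes and estimates coefficients is not shown.

```python
import math


def risk_score(expression, coefficients, intercept=0.0):
    """Linear predictor: intercept + sum of coefficient * expression per gene."""
    return intercept + sum(
        coef * expression[gene] for gene, coef in coefficients.items()
    )


def predicted_probability(score):
    """Logistic transform of the linear risk score."""
    return 1.0 / (1.0 + math.exp(-score))


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical coefficients, standing in for the non-zero LASSO coefficients.
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}

# Hypothetical expression profiles and high-risk (1) / low-risk (0) labels.
samples = [
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 0.5, "GENE_B": 2.0},
    {"GENE_A": 1.5, "GENE_B": 0.5},
    {"GENE_A": 1.0, "GENE_B": 1.0},
]
labels = [1, 0, 0, 1]

scores = [risk_score(s, coefficients) for s in samples]
print("risk scores:", [round(s, 2) for s in scores])
print("AUC:", auc(scores, labels))
```

In practice the coefficients would come from a penalized regression fit with the penalty strength chosen by cross-validation, and an AUC measured on the training data would be expected to overstate performance on new patients.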