| Objective: Oral squamous cell carcinoma (OSCC) is one of the most common malignant tumors worldwide and is characterized by high incidence and mortality. Although considerable progress has been made in multimodal treatment, most patients are already at an advanced stage when diagnosed, and the prognosis is poor in most cases. Advances in bioinformatics provide a way to screen for and identify reliable prognosis-related genes, supporting early detection, effective treatment, and improved prognosis for OSCC patients. In addition, it has gradually become a consensus that autophagy can regulate the occurrence and development of tumors. The purpose of this study is to identify autophagy-related genes (ARGs) that influence prognosis and to build a prognostic risk model. A nomogram combining clinicopathological factors and the risk score was constructed to predict the survival of OSCC patients and to provide a reference for clinical treatment and prognosis evaluation. Methods: 1. Data download (1) The transcriptome expression profiles of OSCC tumor samples and normal control samples were downloaded from The Cancer Genome Atlas (TCGA) database. (2) The GSE41613 dataset was downloaded from the Gene Expression Omnibus (GEO) database, and the data were normalized. (3) A total of 232 autophagy-related genes were downloaded from the Human Autophagy Database (HADB) (http://www.autophagy.lu/) and incorporated into the TCGA dataset for subsequent analysis. 2. Bioinformatics methods (1) Differential expression analysis of ARGs was performed in the TCGA dataset, and a volcano plot, heat map, and box plot comparing tumor tissue with normal control tissue were drawn. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed for the differentially expressed ARGs. The enriched molecular functions (MF), biological processes (BP), cellular components (CC), and signaling pathways help clarify the impact of the differentially expressed genes on tumor progression. (2) The
survival status and survival time were combined with the expression of the differentially expressed ARGs. Univariate Cox regression analysis was performed to screen for genes potentially affecting patient prognosis. (3) The screened genes were entered into a LASSO Cox regression model, in which regularization constrains the coefficients to prevent overfitting; the larger the value of λ, the greater the penalty on the multivariate model. The optimal log(λ) was selected to obtain the genes related to the prognosis of OSCC patients and their corresponding coefficients. (4) The risk score of each sample was calculated using the model's formula, and the samples were divided into a high-risk group and a low-risk group according to the median score. (5) The samples were sorted by risk score, and the risk score, survival status, survival time, and expression of the included ARGs were plotted in one-to-one correspondence. (6) The survival information of patients was merged with the risk scores, and the patients were grouped as high risk or low risk. Kaplan-Meier (K-M) survival analysis was performed to verify the correlation between the risk score and prognosis. To test the accuracy of the risk scoring model for the identification and diagnosis of OSCC, receiver operating characteristic (ROC) curve analysis, an important means of judging the value of prognostic factors, was performed and the area under the curve (AUC) was calculated. (7) The patients' risk scores were merged with the clinicopathological factors, and univariate and multivariate Cox regression analyses were performed to determine whether the risk score and each clinicopathological factor had independent prognostic value. ROC curve analysis was used to test the effectiveness of each factor for OSCC prognosis and diagnosis. (8) The clinicopathological factors were converted to dichotomous variables and combined with the risk scores, and the Wilcoxon rank-sum test was performed to analyze the
correlation between the risk scores and the clinicopathological factors. (9) Combining age, sex, grade, tumor stage, T stage, N stage, and risk score, a nomogram was constructed in the TCGA training set with the "rms" package to predict the probability of 1-year, 3-year, and 5-year survival of OSCC patients. (10) The immune cell infiltration characteristics of OSCC patients in the TCGA dataset were identified with the "CIBERSORT" algorithm, and Pearson correlation analysis was then used to obtain the correlations among immune cells in the dataset, so as to infer the interactions between key immune cells. Finally, a violin plot of the differences in immune cell infiltration between the high-risk and low-risk groups was drawn. 3. Immunohistochemistry (1) Immunohistochemical staining was performed on a tissue microarray composed of 55 tumor tissues (46 OSCC tissues met the inclusion criteria) and 5 normal control tissues to verify the differences in expression of the 6 ARGs included in this study between tumor samples and normal control samples. (2) The immunohistochemistry results were scored to obtain the total score of each marker in each sample; the Mann-Whitney test was used to evaluate the difference in total score between tumor tissues and normal control tissues, and the chi-square test was used to evaluate the correlation between the expression of each marker and each clinicopathological factor. Results: 1. Identification of differentially expressed ARGs in OSCC patients and enrichment analysis (1) A total of 232 ARGs were obtained from the Human Autophagy Database (HADB). Differential expression analysis identified 37 differentially expressed ARGs in the TCGA dataset, including 11 down-regulated and 26 up-regulated ARGs. (2) GO enrichment analysis of the 37 differentially expressed ARGs showed that their biological processes were mainly enriched in neuron death, regulation of apoptotic signaling pathways, and cellular response to unfolded protein. In the KEGG signaling pathway enrichment analysis, the ARGs were mainly enriched in
apoptosis, platinum drug resistance, EGFR tyrosine kinase inhibitor resistance, the ErbB signaling pathway, and PD-L1 expression and the PD-1 checkpoint pathway in cancer, suggesting that these ARGs may regulate tumor occurrence and progression through these biological processes. 2. Construction and evaluation of the ARG risk scoring model (1) To build the ARG risk scoring model, the 37 differentially expressed ARGs were entered into univariate Cox regression analysis, followed by LASSO Cox regression analysis to prevent model overfitting. The key modeling genes were thereby screened out, and an OSCC risk scoring model composed of 6 ARGs (BID, DDIT3, VEGFA, FADD, BIRC5, and NKX2-3) was constructed. The samples were divided into a high-risk group and a low-risk group according to the median risk score. (2) Survival analysis was conducted in the TCGA dataset and the GEO dataset (GSE41613), and Kaplan-Meier (K-M) curves were drawn. In both datasets, the survival rate of the high-risk group was lower than that of the low-risk group. ROC curves for 1-year, 3-year, and 5-year survival were drawn from the risk score in both datasets, and the AUC values were obtained. The risk score showed good predictive power. 3. Identification of independent prognostic factors and construction of the nomogram (1) Univariate and multivariate Cox regression analyses showed that the risk score and age were significantly associated with patient prognosis and had independent prognostic value. The ROC curve verified the prediction accuracy of the ARG risk score model; the AUC of the risk score was 0.627, higher than that of the other clinical features. (2) In the correlation analysis with clinicopathological factors, the ARGs included in the risk score were significantly related to the tumor stage, T stage, and N stage of OSCC patients. (3) Integrating the ARG risk scoring model with clinicopathological characteristics (age, sex, grade, tumor stage, T stage, and N stage), a nomogram was
constructed to predict the 1-year, 3-year, and 5-year survival probability. The calibration curve confirmed its prediction accuracy. 4. Exploration of the potential mechanism of the risk scoring model: The "CIBERSORT" algorithm was used to characterize immune cell infiltration in OSCC patients in the high-risk and low-risk groups. Immune cells with clear differences in infiltration were identified, in particular CD8+ T cells, which may help explore how the ARGs included in the risk scoring model affect patient prognosis. 5. Immunohistochemical verification: Finally, the differences in protein expression of the risk-model ARGs between tumor tissues and normal control tissues were verified in vitro by immunohistochemical staining of the tissue microarray. The expression of BID, DDIT3, VEGFA, FADD, and BIRC5 was higher in tumor tissues than in normal tissues, whereas the expression of NKX2-3 was lower in tumor tissues than in normal tissues. The staining results were scored to obtain the total score of each sample, which was included in the statistical analysis. Nuclear expression of DDIT3 protein was significantly positively correlated with patient age (P=0.015) and negatively correlated with N stage (P=0.03). Conclusion: In this study, differentially expressed ARGs in OSCC tissues were identified by bioinformatics analysis. A risk scoring model and nomogram were constructed based on these genes and verified by in vitro experiments. The risk score has independent prognostic value: the higher the score, the worse the prognosis. The results of this study can provide a new reference for the diagnosis, treatment, and individual prognosis prediction of OSCC patients. |
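
The risk-scoring and median-split step described in the Methods can be illustrated with a minimal sketch. It assumes the conventional linear form for a LASSO Cox risk score (the sum of each gene's coefficient multiplied by its expression); the abstract does not state the formula or the fitted coefficients, so the coefficient values, the function names, and the toy expression matrix below are all hypothetical placeholders. Only the six gene names come from the text.

```python
import numpy as np

# Gene names are taken from the abstract.
GENES = ["BID", "DDIT3", "VEGFA", "FADD", "BIRC5", "NKX2-3"]

# HYPOTHETICAL coefficients for illustration only; the abstract does not
# report the fitted LASSO Cox coefficients.
HYPOTHETICAL_COEFS = np.array([0.10, 0.20, 0.15, 0.05, 0.12, -0.30])


def risk_scores(expression, coefs):
    """Assumed linear risk score: sum over genes of coefficient * expression.

    `expression` has shape (n_samples, n_genes), columns ordered as GENES.
    """
    expression = np.asarray(expression, dtype=float)
    return expression @ coefs


def split_by_median(scores):
    """Label a sample 'high' if its score exceeds the median, else 'low'."""
    scores = np.asarray(scores, dtype=float)
    cutoff = np.median(scores)
    groups = np.where(scores > cutoff, "high", "low")
    return groups, cutoff


# Toy expression matrix (8 samples x 6 genes); NOT real TCGA data.
rng = np.random.default_rng(0)
toy_expression = rng.normal(loc=5.0, scale=1.0, size=(8, len(GENES)))

scores = risk_scores(toy_expression, HYPOTHETICAL_COEFS)
groups, cutoff = split_by_median(scores)

for score, group in zip(scores, groups):
    print(f"risk score = {score:.3f} -> {group}-risk group")
```

In a real analysis the coefficients would come from the fitted LASSO Cox model and the expression matrix from the normalized TCGA or GEO data; the resulting high-risk and low-risk labels would then feed the Kaplan-Meier and ROC analyses.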