| BACKGROUND: With advances in high-throughput sequencing and computing, bioinformatics has become increasingly widespread. As sequencing technology has matured, its cost has fallen further, and it is now widely used across the life sciences. Applications of bioinformatics in head and neck squamous cell carcinoma (HNSCC) are flourishing, especially multigene models. Further use of bioinformatics to predict the clinical outcome of HNSCC is therefore warranted. METHODS: The 656 HNSCC cases enrolled in this study were obtained from The Cancer Genome Atlas (TCGA), Gene Expression Omnibus (GEO), and European Bioinformatics Institute (EBI) databases. Gene expression data, exon sequencing data, and clinical data were obtained from the three databases. Focusing primarily on the TCGA-HNSCC cohort, we used the APOBEC mutation enrichment score (AMES) to reflect the mutation intensity from TCW to TTW and from TCW to TGW, and deciphered the somatic mutation characteristics of HNSCC using non-negative matrix factorization. A prognostic risk model associated with human papillomavirus (HPV) was constructed using machine learning. The stability of the model was tested with receiver operating characteristic (ROC) curves and multiple validation sets. Applications of the prognostic model included survival analysis, univariate Cox analysis, multivariate Cox regression, clinical subgroup analysis, and immunotherapy prediction. RESULTS: In the TCGA-HNSCC cohort, differential analysis revealed differences in APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3G, APOBEC3H, APOBEC3F, APOBEC2, APOBEC4, and AICDA between tumor and paracancerous tissues. Correlation analysis revealed that AMES correlated positively not only with APOBEC3 family members but also with tumor mutational load (R = 0.31, P = 1.4e-12) and the proportion of APOBEC mutations (R = 0.96, P < 
2.2e-16). However, no association was found between AMES and HNSCC prognosis (P = 0.94). Compared with the HPV- group, the HPV+ group showed not only higher expression of most APOBEC family members but also a higher proportion of APOBEC mutations and a higher AMES. APOBEC-associated mutations were present in 44.578% of samples in the HPV- group and 55.556% of samples in the HPV+ group. Survival analysis revealed better overall survival in the HPV+ group than in the HPV- group (P = 0.0026). Genes differentially expressed between the HPV+ and HPV- groups were entered into univariate Cox analysis, which yielded 29 prognostic genes (P < 0.05). These prognostic genes were entered into the machine learning workflow, in which CoxBoost + StepCox[both] was found to be the best combination; stepwise regression identified 7 candidate genes (BEST2, CD19, CDKN2A, DMRTA2, GAST, SLURP1, and TFPI2) and calculated their respective coefficients. The ROC curves showed AUC values of 0.650, 0.678, and 0.691 at 1, 2, and 3 years, respectively. In the external dataset GSE3292, the AUC values were 0.792, 0.929, and 1.000 at 1, 2, and 3 years, respectively. In the external dataset E-MTAB-8588, the AUC values were 0.596, 0.654, and 0.679 at 1, 2, and 3 years, respectively. The risk score outperformed the clinical factors, with a higher AUC value than every clinical factor examined. Although univariate Cox regression found that HPV status predicted overall survival in patients with HNSCC (P = 0.023; hazard ratio: 0.413, 95% CI: 0.193-0.887), after adjusting for confounders, HPV status was not an independent prognostic factor (P = 0.344; hazard ratio: 0.680, 95% CI: 0.307-1.510). Multivariate Cox regression found the risk score to be an independent prognostic factor (P < 0.001; hazard ratio: 2.390, 95% CI: 1.695-3.371). The risk score-based prediction model also showed good predictive power in the radiotherapy population, with a better prognosis in the low-risk group than in the high-risk group (P = 0.004). In 
addition, further analysis revealed enrichment of the T-cell receptor and B-cell receptor pathways in the low-risk group, along with abundant infiltration of CD8+ T cells and naive B cells. Expression of the immune checkpoints PD1 and CTLA4 was higher in the low-risk group than in the high-risk group (P < 0.001 for both). Finally, better sensitivity to immunotherapy was predicted in the low-risk group for the CTLA4+/PD1+ (P = 0.0017), CTLA4-/PD1+ (P = 0.043), and CTLA4+/PD1- (P = 0.014) subgroups. CONCLUSIONS: In head and neck squamous cell carcinoma, this study reveals the association between HPV and APOBEC signature mutations. Risk scores calculated from genes differentially expressed between HPV-positive and HPV-negative samples reflect the prognosis of HNSCC patients better than HPV status does, and also predict sensitivity to radiotherapy and immunotherapy. |
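
A risk score of the kind described above is commonly computed as a weighted sum of the expression values of the selected genes, after which patients are split into high- and low-risk groups. The minimal Python sketch below illustrates that idea only. The abstract does not report the coefficients or the cutoff, so the coefficient values and signs are placeholders, the median cutoff is an assumption, and the function names are hypothetical.

```python
from statistics import median

# Placeholder coefficients for the seven genes named in the abstract.
# The real values (and their signs) are NOT reported there; these are
# illustrative assumptions only.
HYPOTHETICAL_COEFS = {
    "BEST2": 0.2,
    "CD19": -0.1,
    "CDKN2A": -0.3,
    "DMRTA2": 0.1,
    "GAST": 0.2,
    "SLURP1": -0.1,
    "TFPI2": 0.1,
}


def risk_score(expression, coefs=HYPOTHETICAL_COEFS):
    """Weighted sum of gene expression values: sum(coef_g * expr_g)."""
    return sum(coef * expression[gene] for gene, coef in coefs.items())


def stratify(scores):
    """Label each patient 'high' or 'low' risk.

    Assumes a median cutoff: scores strictly above the cohort median
    are 'high', all others are 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


if __name__ == "__main__":
    # Toy cohort with made-up expression values, for illustration only.
    cohort = {
        "patient_a": {g: 1.0 for g in HYPOTHETICAL_COEFS},
        "patient_b": {**{g: 1.0 for g in HYPOTHETICAL_COEFS}, "GAST": 3.0},
        "patient_c": {**{g: 1.0 for g in HYPOTHETICAL_COEFS}, "CDKN2A": 2.0},
    }
    scores = {pid: risk_score(expr) for pid, expr in cohort.items()}
    print(stratify(scores))
```

With real data, the coefficients would come from the fitted Cox model and the cutoff from whatever rule the study applied; both would replace the placeholders here.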