| Objective: To screen genes related to nasopharyngeal carcinoma (NPC) using machine learning methods; to analyze the expression, immune infiltration, molecular functions, associated signaling pathways, and prognostic value of the selected genes in NPC using bioinformatics methods; to verify the expression of these genes in NPC experimentally and analyze their correlation with clinicopathological factors; and to explore their role in the occurrence and development of NPC. The aim is to provide new methods and ideas for the routine pathological diagnosis of NPC and to identify new therapeutic targets, thereby providing a new molecular basis for NPC treatment. Methods: 1. NPC gene expression data were downloaded from the GEO database. R was used to merge the data and handle missing values, and the sva package was used for data processing and batch-effect removal. The limma package was used to screen differentially expressed genes, and the DAVID database was used for GO and KEGG analysis of these genes. 2. R and Python were used to construct LASSO, SVM-RFE, mRMR, and XGBoost machine learning models to screen NPC-related genes, and the genes selected by all four methods (their intersection) were retained. A neural network model was constructed to confirm the diagnostic efficacy of the related genes in NPC, and an independent dataset was used for verification. 3. R was used to analyze the expression of the related genes in NPC in the GEO database. The molecular structure and physicochemical properties of the related genes were analyzed using the GEPIA database. The protein-protein interaction network of the related genes was analyzed using the STRING database, and the network was visualized and scored using Cytoscape. GO and KEGG analyses of the related genes were performed in the DAVID database. The TIMER2.0 database was used to estimate immune infiltration from the NPC gene expression data and to compare the levels of 22 immune cell types between NPC and benign nasopharyngeal tissues. The relationship between the related genes and 
NPC immune infiltration was analyzed, as was the relationship between the related genes and six key immune cell types. The relationship between expression of the related genes and the prognosis of NPC patients was analyzed in GEO dataset GSE120349. 4. Immunohistochemistry was used to detect protein expression in paraffin-embedded tissues from 68 NPC patients and 13 patients with benign nasopharyngeal lesions, confirmed by pathology at the First Affiliated Hospital of Dali University from 2013 to 2021. The relationship between protein expression and clinicopathological features was analyzed using the patients' clinical data and follow-up information. Results: 1. Gene expression data from five datasets were combined, batch effects were removed, and the data were standardized. A total of 184 samples were obtained, and 181 differentially expressed genes (DEGs) were identified with the limma package, comprising 87 up-regulated and 94 down-regulated genes. 2. GO and KEGG analysis showed that the DEGs were enriched in a variety of biological functions and signaling pathways. 3. The four machine learning methods (LASSO, SVM-RFE, mRMR, and XGBoost) identified BLK and OSBPL10 as NPC-related genes, and the AUC of the neural network model on the test set reached 0.9. Expression dot plots of BLK and OSBPL10 were drawn for the independent datasets and P values were calculated; the differences were statistically significant (P<0.01 for BLK and P<0.001 for OSBPL10). The AUC values of the ROC curves were 0.95 and 0.967, respectively. 4. Bioinformatics analysis showed that BLK and OSBPL10 had low expression in NPC; GO and KEGG analysis showed that both genes were closely associated with a variety of biological functions and signaling pathways; and immune infiltration analysis showed that both genes were closely associated with a variety of immune cells. BLK expression was negatively correlated with tumor purity and positively correlated with B 
cells, CD8+ T cells, CD4+ T cells, macrophages, neutrophils, and dendritic cells. OSBPL10 expression was negatively correlated with tumor purity, showed no correlation with B cells or CD8+ T cells, and was positively correlated with CD4+ T cells, macrophages, neutrophils, and dendritic cells. In the GSE120349 prognostic analysis, low BLK expression was associated with poor prognosis in NPC, whereas OSBPL10 expression was not. 5. Immunohistochemistry showed that BLK and OSBPL10 expression was lower in NPC than in benign nasopharyngeal lesions (P<0.05). Analysis of BLK and OSBPL10 expression against clinicopathological data showed that BLK expression was correlated with pathological type and ethnicity in NPC patients, while OSBPL10 expression was correlated with EBV infection, AJCC clinical stage, and Ki-67 expression. 6. In the analysis of prognostic factors, NPC survival status was negatively correlated with EBV infection status and N stage and was correlated with ethnic group. Moreover, BLK and OSBPL10 expression was related to patient prognosis, with low expression indicating poor prognosis. Conclusion: In this study, the NPC-related genes BLK and OSBPL10 were identified by machine learning methods. Bioinformatics analysis showed that both genes are closely related to immune cells and the tumor immune microenvironment, are associated with a variety of other genes, and are involved in multiple biological processes and signaling pathways. Experiments verified that BLK and OSBPL10 have diverse functions and are clearly differentially expressed in NPC, and can be used as diagnostic markers for NPC. |
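
The two computational steps that the abstract relies on most, keeping only the genes selected by all four methods and scoring a single gene by ROC AUC, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `intersect_selected` and `auc`, the placeholder genes `GENE_A` to `GENE_C`, and every expression value below are hypothetical and chosen only for illustration. Because BLK and OSBPL10 are reported as low-expressed in NPC, the sketch assumes that negated expression is used as the tumor score.

```python
from typing import Dict, Iterable, List, Sequence


def intersect_selected(selections: Dict[str, Iterable[str]]) -> List[str]:
    """Return the genes chosen by every selection method, sorted by name."""
    gene_sets = [set(genes) for genes in selections.values()]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive sample scores higher; ties count as half a win."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical per-method selections (toy data, not taken from the study).
selections = {
    "LASSO": ["BLK", "OSBPL10", "GENE_A"],
    "SVM-RFE": ["BLK", "OSBPL10", "GENE_B"],
    "mRMR": ["OSBPL10", "BLK", "GENE_A", "GENE_C"],
    "XGBoost": ["BLK", "GENE_B", "OSBPL10"],
}
shared = intersect_selected(selections)

# Hypothetical expression values for one gene (toy data).
tumor_expr = [1.0, 2.0, 3.0]
benign_expr = [2.5, 4.0, 5.0]

# Lower expression is assumed to indicate tumor, so negate before scoring.
gene_auc = auc([-x for x in tumor_expr], [-x for x in benign_expr])

print(shared)
print(round(gene_auc, 3))
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve and avoids any dependency beyond the standard library; for real expression matrices a library routine would normally be preferred.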