| Purpose: Gastric cancer is the fifth most common cancer in the world and the most common gastrointestinal tumour in China. Its gene regulation mechanisms are complex and are still being explored. Identifying the core genes of gastric cancer and exploring the related molecular markers and potential molecular mechanisms will support further research on the disease. Identifying prognosis-related genes and constructing prognostic models in combination with clinical information will aid prognostic assessment and help provide individualised treatment for gastric cancer patients. Methods: In the first part of the study, we downloaded the GSE54129, GSE79973 and GSE118916 datasets from the GEO database together with TCGA-STAD data, preprocessed them with R software, and screened for genes differentially expressed between gastric cancer tissues and normal tissues. The differentially expressed genes were subjected to GO, KEGG and GSEA analyses using Cytoscape software and the R package clusterProfiler to identify their enriched pathways. A PPI network was constructed using the STRING database, and the network was imported into Cytoscape to screen for gastric cancer core genes, on which Kaplan-Meier (KM) survival analysis was performed. Significant genes were further analysed using LinkedOmics, TIMER, HPA and other databases. In the second part of the study, we downloaded the GSE62254 and GSE84437 datasets from the GEO database and TCGA-STAD genomic and clinical data from the UCSC Xena database. We used univariate, random forest, LASSO and multivariate analyses to screen for genes associated with the survival prognosis of gastric cancer patients, established a risk score formula, and combined it with clinical data to construct prognostic models. We then used KM survival analysis, decision curves, calibration
curves, and receiver operating characteristic (ROC) curves to assess model efficacy, and performed internal and external validation. Results: In the first part of the study, differential genes were obtained from each of the three GEO datasets and the TCGA gastric cancer dataset by bioinformatics screening, and their intersection yielded 294 differentially expressed genes, comprising 97 up-regulated and 197 down-regulated genes. Enrichment analysis showed that the differentially expressed genes were mainly enriched in terms and pathways such as extracellular matrix, collagen trimer, oxidoreductase activity, protein digestion and absorption, chemical carcinogenesis, channel activity, and neuroactive ligand-receptor interaction. A protein-protein interaction network was constructed to select 10 core genes, which were then evaluated by KM survival analysis; COL1A1, COL1A2, COL2A1, COL4A1, COL5A1, COL11A1, COL18A1 and SERPINH1 were significantly associated with the survival of gastric cancer patients. In the second part of the study, we downloaded TCGA-STAD genomic data from the UCSC Xena database and divided them into a training set and an internal validation set in a 6:4 ratio. Univariate analysis of the training set yielded 92 genes, which were further screened by random forest, LASSO and multivariate analysis to finally obtain 2 prognosis-related genes. Model efficacy was assessed with KM survival analysis, decision curves, calibration curves and ROC curves; the ROC curves showed that the AUC values of the model were 0.806, 0.717 and 0.769 at 1, 3 and 5 years, respectively. The prognostic model was validated using the TCGA-STAD internal validation set and using GSE62254 and GSE84437 as external validation sets. Conclusion: Compared with normal tissues, COL1A1, COL1A2, COL2A1, COL4A1, COL5A1, COL11A1, COL18A1 and SERPINH1 were up-regulated in gastric cancer tissues, and their expression levels were negatively correlated with patient
survival; these genes may serve as gastric cancer biomarkers. The clinical prognostic model, built on two prognostic genes (PRDM6 and SLITRK4) screened from the TCGA database and combined with clinical data, can assess the prognosis of gastric cancer patients to a certain extent and may help provide individualised treatment for these patients. |
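
The 6:4 split and the two-gene risk score described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' code. The abstract does not report the form of the risk score formula, the coefficients or the grouping cutoff, so the linear form, the coefficient values, the median cutoff, the random seed and all function names below are assumptions.

```python
import random
from statistics import median


def split_train_validation(sample_ids, train_fraction=0.6, seed=0):
    """Shuffle sample IDs and split them into training and validation sets.

    train_fraction=0.6 mirrors the 6:4 ratio in the abstract; the seed is arbitrary.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over the model genes.

    The linear form is an assumption; the abstract only mentions a risk score formula.
    """
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def assign_risk_groups(scores):
    """Label each sample 'high' or 'low' relative to the median score.

    The median cutoff is an assumption, not taken from the abstract.
    """
    cutoff = median(scores.values())
    return {sid: ("high" if score > cutoff else "low") for sid, score in scores.items()}


# Placeholder coefficients for illustration only; they are NOT values from the study.
HYPOTHETICAL_COEFFICIENTS = {"PRDM6": 0.5, "SLITRK4": 0.25}

if __name__ == "__main__":
    # Made-up expression values for four hypothetical samples.
    samples = {
        "s1": {"PRDM6": 2.0, "SLITRK4": 1.0},
        "s2": {"PRDM6": 0.5, "SLITRK4": 0.5},
        "s3": {"PRDM6": 3.0, "SLITRK4": 2.0},
        "s4": {"PRDM6": 1.0, "SLITRK4": 0.0},
    }
    scores = {sid: risk_score(expr, HYPOTHETICAL_COEFFICIENTS) for sid, expr in samples.items()}
    print(assign_risk_groups(scores))
```

In practice the coefficients would come from the fitted multivariate model on the training set, and the same coefficients and cutoff would be applied unchanged to the internal and external validation sets.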