| Objective: Using bioinformatic methods, we aimed to screen for hub genes involved in the tumorigenesis and progression of prostate cancer and to establish a risk prediction model for prostate cancer based on the NCBI Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA), the Human Protein Atlas (HPA) and the UCSC Xena database. By evaluating the risk prediction model and comprehensively analyzing the hub genes, we examined the biological function of the hub genes and the value of the model, so as to provide theoretical support for the clinical diagnosis of prostate cancer. Methods: (1) Gene chip data were downloaded from GEO (datasets GSE46602, GSE3325 and GSE104749). Differentially expressed genes (DEGs) were identified in R using the limma package. (2) The intersection of the DEGs across the three datasets was obtained with a Venn diagram, and the shared genes were further screened by LASSO regression. (3) Expression profiles and clinical prognosis information for prostate cancer were obtained from the TCGA database. The risk prediction model was constructed from the screened hub genes and evaluated in a training set and a test set. (4) GO analysis and KEGG pathway enrichment analysis of the DEGs were performed using the DAVID database, and a protein-protein interaction (PPI) network was constructed based on STRING. (5) Using the TCGA database, the Human Protein Atlas, the UCSC Xena database and the TIMER database, the hub genes in the risk prediction model were analyzed in terms of their expression levels in prostate cancer versus normal tissues and their relationship with clinical prognosis and immune infiltration. Results: A total of 227 DEGs were identified in GSE46602, 444 in GSE3325 and 228 in GSE104749; 16 DEGs were shared among the three datasets. The shared DEGs were analyzed by LASSO regression and by univariate and multivariate Cox regression. A six-gene signature (AOX1, PCA3, HOXC6, TOP2A, ERG and ANGPT1) was used to establish the risk model. The risk model showed good predictive value when verified and evaluated in the training and test sets. The PPI network of the DEGs comprised 281 nodes and 378 edges. Among the six signature genes, PCA3, HOXC6, ERG and TOP2A were highly expressed in prostate cancer, whereas AOX1 and ANGPT1 were expressed at low levels, which was broadly consistent with the results of the risk prediction model. The expression of PCA3 and TOP2A was associated with TNM stage, the progression-free interval (PFI) and macrophage immune infiltration, making these genes highly valuable for predicting prognostic outcome in prostate cancer (PCa). Conclusion: A risk prediction model with a six-gene signature was constructed that could effectively predict the prognostic risk of prostate cancer. In the model, TOP2A, HOXC6, PCA3 and ERG were risk factors for prostate cancer prognosis, whereas AOX1 and ANGPT1 were protective factors. PCA3 and TOP2A may help in assessing prognosis and may serve as diagnostic markers of prostate cancer. |
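
The two computational ideas in the Methods, intersecting DEG lists across datasets and scoring samples with a gene signature, can be illustrated with a minimal Python sketch. This is not the authors' pipeline (which was run in R): the function names, the toy gene lists, the expression values and the coefficient magnitudes below are all hypothetical, and the median cutoff for high versus low risk is an assumption, since the abstract does not state how groups were split. Only the signs of the coefficients follow the abstract (positive for the reported risk factors, negative for the reported protective factors).

```python
from statistics import median


def shared_degs(*deg_sets):
    """Return the genes present in every DEG set (the centre of the Venn diagram)."""
    return set.intersection(*(set(s) for s in deg_sets))


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over the signature genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify(scores):
    """Label each sample 'high' if its score exceeds the median score, else 'low'.

    The median cutoff is an assumption made for this sketch.
    """
    cutoff = median(scores.values())
    return {s: ("high" if v > cutoff else "low") for s, v in scores.items()}


# Toy DEG lists standing in for the three GEO datasets (hypothetical content).
degs_a = {"AOX1", "PCA3", "TOP2A", "GENE_X"}
degs_b = {"AOX1", "PCA3", "TOP2A", "GENE_Y"}
degs_c = {"AOX1", "PCA3", "TOP2A", "GENE_Z"}
common = shared_degs(degs_a, degs_b, degs_c)

# Hypothetical coefficients: magnitudes are invented; only the signs mirror
# the abstract (risk factors positive, protective factors negative).
COEF = {
    "TOP2A": 0.5, "HOXC6": 0.25, "PCA3": 0.25, "ERG": 0.25,
    "AOX1": -0.5, "ANGPT1": -0.25,
}

# Hypothetical expression values for three samples.
samples = {
    "s1": {"TOP2A": 4, "HOXC6": 4, "PCA3": 4, "ERG": 4, "AOX1": 2, "ANGPT1": 2},
    "s2": {"TOP2A": 2, "HOXC6": 2, "PCA3": 2, "ERG": 2, "AOX1": 4, "ANGPT1": 4},
    "s3": {"TOP2A": 2, "HOXC6": 2, "PCA3": 2, "ERG": 2, "AOX1": 2, "ANGPT1": 2},
}

scores = {name: risk_score(expr, COEF) for name, expr in samples.items()}
groups = stratify(scores)
```

In a real analysis the coefficients would come from the fitted multivariate Cox model, and the cutoff would typically be fixed on the training set before being applied to the test set.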