Cancer is a major disease that seriously endangers human life and health, and cancer and its related fields remain an active area of bioinformatics research. Identifying the genes that play a causal role in cancer evolution remains a challenge in cancer biology, and the identification of cancer genes is crucial for precision oncology and cancer therapy. It is widely believed that cancer progression is due to the accumulation of driver gene mutations that confer a selective growth advantage on cells. However, some cancer driver genes have been found to be unaltered at the DNA sequence level but abnormally regulated through various cellular mechanisms, which suggests that different types of omics data reflect different degrees of influence on cancer development; effectively integrating this information would benefit cancer gene prediction. Because both genetic and non-genetic causes contribute to tumorigenesis, predictive models are needed that effectively integrate different omics data, exploit the complementary information contained in multi-omics datasets, and make full use of these comprehensive high-throughput data. At the same time, genes act together in signaling and regulatory pathways as well as in protein complexes, so the information contained in protein-protein interaction (PPI) networks is important when predicting cancer genes. This paper therefore combines multi-omics data with a protein interaction network to predict cancer genes. It proposes MONET, a cancer driver gene identification algorithm built on an integrated graph neural network framework for multi-omics data. The algorithm integrates a graph convolutional network and a graph attention network, and predicts cancer driver genes by combining four types of pan-cancer omics data (gene mutation, DNA methylation, gene expression, and copy number variation) with a
PPI network. First, using the graph constructed from the PPI network, in which nodes represent genes, a feature vector representation of each gene is learned separately with a graph convolutional network and with a graph attention network. Then, following the idea of integrative analysis, the gene feature vectors obtained from the two graph neural networks are concatenated to enrich the gene features. Finally, the integrated gene feature vectors are fed into a multilayer perceptron to perform semi-supervised cancer driver gene identification, and MONET outputs a predicted value for each gene indicating the probability that the gene is a cancer driver gene. Experimental results show that MONET outperforms the baseline model in terms of both the area under the receiver operating characteristic (ROC) curve and the area under the precision-recall (PR) curve.
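
The three steps described above (two graph encoders, concatenation, and a multilayer perceptron trained on labeled genes only) can be illustrated with a minimal NumPy forward pass. This is a sketch under stated assumptions, not the MONET implementation: the toy six-gene graph, the random features and weights, the layer sizes, the single attention head, the ReLU activations, and every function name below are hypothetical choices made for illustration, and no training is performed.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 with self-loops."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, x, w):
    """One graph convolution: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_norm @ x @ w, 0.0)

def gat_layer(adj, x, w, a_src, a_dst):
    """One single-head graph attention layer (ReLU output, an assumption)."""
    h = x @ w
    # Attention logit for edge (i, j): LeakyReLU(a_src . h_i + a_dst . h_j)
    scores = (h @ a_src)[:, None] + (h @ a_dst)[None, :]
    scores = np.where(scores > 0, scores, 0.2 * scores)
    # Restrict attention to neighbours and the node itself
    mask = (adj + np.eye(adj.shape[0])) > 0
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)
    alpha = np.exp(scores)
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return np.maximum(alpha @ h, 0.0)

def mlp_head(z, w1, w2):
    """Two-layer perceptron that maps each gene to a probability."""
    hidden = np.maximum(z @ w1, 0.0)
    logits = (hidden @ w2)[:, 0]
    return 1.0 / (1.0 + np.exp(-logits))

def masked_bce(prob, labels, labeled_mask):
    """Binary cross-entropy over labeled genes only (semi-supervised)."""
    p = np.clip(prob[labeled_mask], 1e-7, 1.0 - 1e-7)
    y = labels[labeled_mask]
    return float(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)).mean())

# Hypothetical toy data: 6 genes on a ring-shaped interaction graph,
# 4 features per gene standing in for the four omics types.
rng = np.random.default_rng(0)
n_genes, n_feats, hidden = 6, 4, 8
adj = np.zeros((n_genes, n_genes))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (0, 5)]:
    adj[i, j] = adj[j, i] = 1.0
x = rng.normal(size=(n_genes, n_feats))

# Random, untrained weights (illustration only)
w_gcn = rng.normal(scale=0.5, size=(n_feats, hidden))
w_gat = rng.normal(scale=0.5, size=(n_feats, hidden))
a_src = rng.normal(scale=0.5, size=hidden)
a_dst = rng.normal(scale=0.5, size=hidden)
w1 = rng.normal(scale=0.5, size=(2 * hidden, hidden))
w2 = rng.normal(scale=0.5, size=(hidden, 1))

h_gcn = gcn_layer(normalize_adjacency(adj), x, w_gcn)
h_gat = gat_layer(adj, x, w_gat, a_src, a_dst)
z = np.concatenate([h_gcn, h_gat], axis=1)  # concatenated gene features
prob = mlp_head(z, w1, w2)                  # one probability per gene

# Only three genes carry a label; the rest are unlabeled.
labels = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
labeled_mask = np.array([True, True, True, False, False, False])
loss = masked_bce(prob, labels, labeled_mask)
```

In this sketch every gene, labeled or not, takes part in message passing over the graph, while the loss is computed only on the labeled subset; that is one common way to realize a semi-supervised node classification setup.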