
Research On The Identification Method Of Human Key Genes Based On Network Representation Learning

Posted on: 2021-12-29
Degree: Master
Type: Thesis
Country: China
Candidate: Q Chang
Full Text: PDF
GTID: 2510306095990579
Subject: Computer application technology
Abstract/Summary:
Essential genes are a group of genes that are indispensable for cell survival and fertility. Studying human essential genes can guide the treatment of diseases and help scientists reveal the underlying biological mechanisms of human cells. The recent publication of human essential gene data makes it possible to train a machine-learning classifier on features of known human essential genes and then use that classifier to predict new ones. Previous studies have found that the essentiality of a gene is closely related to its properties in the protein–protein interaction (PPI) network. We therefore studied human essential genes on the PPI network. The main work of this thesis is as follows:

(1) We propose a novel supervised method that predicts human essential genes by embedding the PPI network. The approach runs a biased random walk on the network to obtain the network context of each node. The resulting node pairs are then fed into an artificial neural network to learn representation vectors that preserve, as far as possible, the network structure and the attributes of the nodes. Finally, these features are passed to an SVM classifier to predict human essential genes. Prediction results on two human PPI networks show that the method outperforms methods that use either the genes' sequence information or their centrality properties in the network as input features. It also outperforms methods that represent the PPI network with other, earlier embedding approaches.

(2) We applied several types of cluster analysis to the network representations of human essential genes. Cluster visualization and enrichment analysis of the essential genes' network representations on two human PPI networks further elucidate the modularity of human PPI networks and indicate that our method extracts the network characteristics of human essential genes more effectively.

(3) We also incorporated the DNA sequence information of human genes into the process of network representation learning. Integrating multiple network representation learning models then improved the prediction accuracy for human essential genes. Experimental results on two human PPI networks show that the ensemble model performs more robustly than a classifier that uses only network information as input features.
Keywords/Search Tags: human essential genes, protein-protein interaction network, network representation learning, ensemble
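
The abstract describes a biased random walk on the PPI network whose output is turned into node pairs for a neural network. The sketch below illustrates that first step only, under stated assumptions: the abstract does not say how the walk is biased, so a node2vec-style second-order walk with a return parameter p and an in-out parameter q is assumed here, and the function names, the window size, and the toy gene labels are all hypothetical. It is not the thesis implementation.

```python
import random


def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Second-order biased walk (node2vec-style; an assumption, not the thesis code).

    adj    : dict mapping each node to the set of its neighbours
    p      : return parameter (small p makes stepping back more likely)
    q      : in-out parameter (small q makes moving outward more likely)
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted for reproducibility with a seeded rng
        if not nbrs:
            break  # isolated node: the walk cannot continue
        if len(walk) == 1:
            # The first step has no previous node, so choose uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for n in nbrs:
            if n == prev:            # step back to the previous node
                weights.append(1.0 / p)
            elif n in adj[prev]:     # stay at distance 1 from the previous node
                weights.append(1.0)
            else:                    # move away from the previous node
                weights.append(1.0 / q)
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk


def context_pairs(walk, window=2):
    """Turn one walk into (center, context) node pairs, skip-gram style."""
    pairs = []
    for i, center in enumerate(walk):
        lo = max(0, i - window)
        hi = min(len(walk), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, walk[j]))
    return pairs


# Toy undirected PPI network; the gene labels are placeholders.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

walk = biased_walk(adj, "A", length=6, p=0.5, q=2.0, rng=random.Random(0))
pairs = context_pairs(walk, window=2)
```

In a pipeline like the one summarized above, pairs of this kind would be the training input for the embedding network, and the learned vectors would then serve as features for the SVM classifier.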