Research On Disease Gene Prediction Method Based On Annotated Gene Set

Posted on: 2023-05-09
Degree: Master
Type: Thesis
Country: China
Candidate: C Deng
Full Text: PDF
GTID: 2544307070484154
Subject: Engineering
Abstract/Summary:
Identifying disease-associated genes in complex diseases is key to understanding pathogenesis and developing targeted drugs. Most current disease gene prediction methods are based on gene network data. Because a gene network can only describe pairwise associations between genes, and not the polygenic association relationships corresponding to a set of genes, these methods may fail to accurately describe the complex association patterns of disease-associated genes. This thesis addresses the problem by studying disease gene prediction methods based on annotated gene sets. The main research work and innovations of this thesis are as follows.

(1) The Molecular Signatures Database (MSigDB) collects many annotated gene sets covering location, function, metabolic pathways, target binding, and other aspects. Each annotated gene set represents the polygenic association relationships among a set of genes, and this information can help describe the complex associations of genes in biological processes more accurately. However, its special organization makes it difficult to use in computational biology problems. This thesis proposes a disease gene prediction method based on annotated gene set integration (GSI). GSI represents and integrates different types of annotated gene set data from MSigDB in the form of a signal matrix, which is then used to train a disease gene prediction model. Experimental results show that GSI achieves the best performance on multiple disease datasets.

(2) GSI can only use the inclusion relationship between genes and annotated gene sets, so the model has limited expressive power and cannot deeply mine the multi-gene association information in annotated gene sets. To solve this problem, this thesis proposes a disease-specific hypergraph residual neural network (DiSHyper) for disease gene prediction. A hypergraph is an extension of an ordinary graph in which a single hyperedge can connect multiple nodes, so it is well suited to describing the complex association relationships among multiple genes in annotated gene sets. DiSHyper represents and integrates different types of annotated gene set data from MSigDB in a hypergraph structure, and then trains disease gene prediction models using disease-specific hypergraph residual neural networks. DiSHyper can deeply explore the complex associations of genes in annotated gene sets and incorporate disease-specific association information into the models. Experimental results show that DiSHyper achieves the best performance on many different types of disease datasets.
Keywords/Search Tags: Disease gene prediction, Annotated gene set, Hypergraph, Hypergraph neural network
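
To make the two representations named in the abstract more concrete, the sketch below builds a gene-by-gene-set membership (incidence) matrix from a few toy gene sets, then applies one residual propagation step over the corresponding hypergraph. It is only an illustration under stated assumptions: the gene and set names are hypothetical, the mean-based aggregation is a simplified stand-in with no learned weights, and nothing here reproduces the GSI or DiSHyper models from the thesis, whose details the abstract does not give.

```python
# Toy annotated gene sets (hypothetical names and members, illustration only).
gene_sets = {
    "SET_PATHWAY_A": ["G1", "G2", "G3"],
    "SET_LOCATION_B": ["G2", "G4"],
    "SET_TARGET_C": ["G3", "G4", "G5"],
}


def build_incidence(gene_sets):
    """Return (genes, set_names, H) with H[i][j] = 1 if gene i belongs to set j.

    Each column of H is one hyperedge: it can contain any number of genes,
    unlike an ordinary graph edge, which joins exactly two.
    """
    genes = sorted({g for members in gene_sets.values() for g in members})
    set_names = sorted(gene_sets)
    index = {g: i for i, g in enumerate(genes)}
    H = [[0] * len(set_names) for _ in genes]
    for j, name in enumerate(set_names):
        for g in gene_sets[name]:
            H[index[g]][j] = 1
    return genes, set_names, H


def residual_propagate(H, x):
    """One step of x + D_v^{-1} H D_e^{-1} H^T x (assumed simplified form).

    First every hyperedge averages the values of its member genes, then every
    gene averages over the hyperedges that contain it. Adding x back is the
    residual (skip) connection.
    """
    n, m = len(H), len(H[0])
    edge_deg = [sum(H[i][j] for i in range(n)) for j in range(m)]
    node_deg = [sum(H[i]) for i in range(n)]
    edge_mean = [
        sum(H[i][j] * x[i] for i in range(n)) / edge_deg[j] if edge_deg[j] else 0.0
        for j in range(m)
    ]
    out = []
    for i in range(n):
        total = sum(H[i][j] * edge_mean[j] for j in range(m))
        agg = total / node_deg[i] if node_deg[i] else 0.0
        out.append(x[i] + agg)
    return out


genes, set_names, H = build_incidence(gene_sets)
# Mark G1 as a known disease gene and spread that signal through shared sets.
seed = [1.0 if g == "G1" else 0.0 for g in genes]
scores = residual_propagate(H, seed)
for g, s in zip(genes, scores):
    print(g, round(s, 3))
```

In this toy example, G2 and G3 receive a nonzero score because they share a gene set with the seed gene G1, while G4 and G5 stay at zero after a single step. A trained model would replace the fixed averaging with learned transformations and stack several such layers.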