Prioritizing Disease Candidate Genes Based On PPI Network

Posted on:2015-06-21

Degree:Master

Type:Thesis

Country:China

Candidate:Q Li

GTID:2180330434954083

Subject:Computer Science and Technology

Abstract/Summary:

Identifying disease-causing genes is a key problem in human genetics research. Traditional positional cloning strategies can restrict the location of a disease gene to a region that may contain tens to hundreds of candidate genes, and validating these candidates one by one through biological experiments is time-consuming and laborious. Bioinformatics methods can reduce costs and identify disease candidate genes quickly and effectively. Among them, prioritization methods based on protein-protein interaction (PPI) networks show better performance. This thesis focuses on prioritizing disease candidate genes based on PPI networks. Its main results are as follows.

First, we developed a shortest-path-based algorithm, named SPranker, to prioritize disease candidate genes in a protein interaction network. Because diseases with similar phenotypes are generally caused by functionally related genes, we further proposed a new algorithm, SPGOranker, which integrates the semantic similarity of GO annotations. SPGOranker considers not only the shortest path between protein pairs in the protein interaction network but also their GO semantic similarity. SPranker and SPGOranker were applied to 1598 known orphan disease-causing genes (ODGs) from 172 orphan diseases (ODs) and compared with three state-of-the-art approaches: ICN, VS and RWR. The experimental results show that SPranker and SPGOranker outperform ICN, VS and RWR in prioritizing orphan disease-causing genes. We further applied our methods to identify and rank potential novel candidate genes for several ODs.

Second, we proposed TrustRanker, an algorithm based on a search engine ranking method, to prioritize disease candidate genes. We constructed a bipartite graph, named the diseasome, consisting of two disjoint sets of nodes. Starting from the diseasome bipartite graph, we generated two biologically relevant network projections, the human disease network (HDN) and the disease gene network (DGN), and analyzed the topological characteristics of the two networks. We used two types of similarity between diseases, topological similarity and disease phenotype similarity, to select genes associated with specific diseases as seeds. These seed genes were used to initialize the start probability matrix of TrustRanker. We tested our method on gene-disease association data and evaluated the prioritization achieved. Using data on 2666 diseases from the OMIM knowledgebase, we performed large-scale cross-validation to rank the candidate genes and to evaluate and compare the performance of our approach. Our results show that our method outperforms Prince and PRP. Importantly, we also applied our method to study three multifactorial diseases for which some causal genes have already been found.
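
The projection of the diseasome bipartite graph into the HDN and the DGN can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes the usual projection rule (two diseases are linked when they share at least one gene, and two genes are linked when they are associated with at least one common disease) and uses edge weights that count the shared neighbours. The function name `project_bipartite` and the labels `D1`, `G1`, and so on are hypothetical placeholders, not data from the thesis.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(associations):
    """Project (disease, gene) pairs into a disease network and a gene network.

    Assumed rule: two diseases are linked when they share at least one gene,
    and two genes are linked when they share at least one disease. Each edge
    weight is the number of shared neighbours.
    """
    genes_of = defaultdict(set)      # disease -> associated genes
    diseases_of = defaultdict(set)   # gene -> associated diseases
    for disease, gene in associations:
        genes_of[disease].add(gene)
        diseases_of[gene].add(disease)

    def project(neighbours):
        # Every pair of nodes attached to the same node on the other side
        # of the bipartite graph gains one unit of edge weight.
        weights = defaultdict(int)
        for members in neighbours.values():
            for a, b in combinations(sorted(members), 2):
                weights[(a, b)] += 1
        return dict(weights)

    hdn = project(diseases_of)  # disease-disease edges, weighted by shared genes
    dgn = project(genes_of)     # gene-gene edges, weighted by shared diseases
    return hdn, dgn


# Hypothetical toy diseasome: three diseases, four genes.
associations = [
    ("D1", "G1"), ("D1", "G2"),
    ("D2", "G2"), ("D2", "G3"),
    ("D3", "G4"),
]
hdn, dgn = project_bipartite(associations)
print(hdn)  # {('D1', 'D2'): 1}
print(dgn)  # {('G1', 'G2'): 1, ('G2', 'G3'): 1}
```

In this toy example, D1 and D2 are connected in the HDN because both involve G2, while D3 stays isolated. Topological properties such as the degree distribution can then be computed on either projection.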

Keywords/Search Tags:

disease candidate genes, protein interaction network, prioritization, TrustRanker

Related items

1	The Research Of Prioritizing Disease Candidate Genes Based On Heterogeneous Network
2	Disease Gene Prioritization Based On A Tissue-specific Network Model
3	Identifying Protein Complexes And Predicting Disease Genes Based On Protein Domain
4	The Study Of Predicting Disease Genes Based On Interaction Network
5	Network Based Analytical Method For Prioritization Of Candidate Symptom Genes
6	Development Of A Rice Genomic Variation Database And Candidate Gene Prioritization Platform For Genome-wide Association Studies
7	Predicting Protein-protein Interactions And Studying The Related Contents Based On Network Topologies
8	Statistical Modeling For Analysis Of Biological High-throughput Data And Its Application
9	Research On Biological Network Construction Algorithm Based On Multi-gene Interaction Information
10	Numerical Researches On Protein-Protein Interactions Network