A module based approach for identifying driver genes and expanding pathways from integrated biological networks

Posted on: 2015-05-28
Degree: Ph.D.
Type: Dissertation
University: Boston University
Candidate: Huang, Chia-Ling
Full Text: PDF
GTID: 1478390020950996
Subject: Biology
Abstract/Summary:
Each gene or protein has its own function and, in combination with others, allows the group to perform more complex behaviors, e.g. carrying out a particular cellular task (a functional module) or affecting a particular disease phenotype (a disease module). One of the major challenges in systems biology is to reveal the roles of genes or proteins in functional modules or disease modules.

In the first part of the dissertation, I present a data-driven method, Correlation Set Analysis (CSA), for comprehensively detecting active regulators in disease populations by integrating co-expression analysis with specific types of literature-derived causal relationships. Instead of investigating the co-expression level between regulators and their targets, I focus on the coherence of the regulatees of a regulator, e.g. the downstream targets of a transcription factor. Using simulated datasets, I show that my method can reach high true positive and true negative rates (>80%) even when the regulatory relationships are weak (only 20% of regulatees are co-expressed). Using three separate real biological datasets, I was able to recover both well-known and as-yet undescribed active regulators for each disease population.

In the second part of the dissertation, I develop and apply a new computational algorithm for detecting modules of functionally related genes that are likely to drive malignant transformation. The algorithm takes as input the identity and locations of a small number of known oncogenes (a seed set) on a human genome functional linkage network (FLN). It then searches for a boundary surrounding a gene set encompassing the seed, such that the magnitude of the difference in linkage weights between interior-interior gene pairs and interior-exterior gene pairs is maximized.
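The boundary objective described above can be illustrated with a minimal sketch. The abstract does not say how the difference in linkage weights is aggregated or how the search proceeds, so the following is an assumption throughout: the score is the mean interior-interior weight minus the mean interior-exterior weight, missing edges count as weight 0, and the search is a simple greedy expansion from the seed. The names `contrast` and `grow_module` and the toy network are hypothetical, not the dissertation's implementation.

```python
from itertools import combinations


def weight(fln, a, b):
    """Linkage weight between genes a and b; missing edges count as 0.0 (assumption)."""
    return fln.get(frozenset((a, b)), 0.0)


def contrast(fln, genes, module):
    """Mean interior-interior weight minus mean interior-exterior weight (assumed score)."""
    inside = sorted(module)
    outside = sorted(set(genes) - set(module))
    in_pairs = [weight(fln, a, b) for a, b in combinations(inside, 2)]
    out_pairs = [weight(fln, a, b) for a in inside for b in outside]
    mean_in = sum(in_pairs) / len(in_pairs) if in_pairs else 0.0
    mean_out = sum(out_pairs) / len(out_pairs) if out_pairs else 0.0
    return mean_in - mean_out


def grow_module(fln, genes, seed):
    """Greedily add the gene that most improves the contrast; stop when none does."""
    module = set(seed)
    best = contrast(fln, genes, module)
    while True:
        candidates = sorted(set(genes) - module)
        scored = [(contrast(fln, genes, module | {g}), g) for g in candidates]
        if not scored:
            return module, best
        top_score, top_gene = max(scored)
        if top_score <= best:
            return module, best
        module.add(top_gene)
        best = top_score


# Toy functional linkage network (made-up weights, for illustration only).
genes = ["A", "B", "C", "D", "E"]
fln = {
    frozenset(("A", "B")): 0.9,
    frozenset(("A", "C")): 0.8,
    frozenset(("B", "C")): 0.85,
    frozenset(("C", "D")): 0.1,
    frozenset(("D", "E")): 0.7,
}
module, score = grow_module(fln, genes, seed={"A", "B"})
```

On this toy network the seed {A, B} absorbs C, which is tightly linked to both seed genes, and then stops, because adding D or E would lower the contrast. A greedy search can stall in a local optimum, so a real search over a genome-scale FLN would likely need a more careful strategy.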
Starting with small seed sets for breast and ovarian cancer, I successfully identify known and novel drivers in both cancer types.

In the third part of the dissertation, I propose a module-based approach for expanding manually curated functional modules. Using the KEGG pathway database as an example, I show that my approach can successfully suggest both validated pathway members (genes that are assigned to a particular pathway by other manually curated pathway databases) and novel candidate pathway genes.
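Returning to the first part, the idea of scoring a regulator by the coherence of its regulatees, rather than by its own co-expression with them, can be sketched as follows. The abstract does not give the CSA statistic, so this is an assumption: coherence is taken as the mean absolute pairwise Pearson correlation among a regulator's targets, and significance is estimated against random gene sets of the same size. The function names and the synthetic expression data are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; 0.0 if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def coherence(expr, gene_set):
    """Mean absolute pairwise correlation within a gene set of at least two genes."""
    pairs = list(combinations(sorted(gene_set), 2))
    return sum(abs(pearson(expr[a], expr[b])) for a, b in pairs) / len(pairs)


def empirical_p(expr, targets, n_perm=1000, seed=0):
    """Compare the targets' coherence with random gene sets of the same size."""
    rng = random.Random(seed)
    observed = coherence(expr, targets)
    all_genes = sorted(expr)
    hits = sum(
        coherence(expr, rng.sample(all_genes, len(targets))) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (hits + 1) / (n_perm + 1)


# Synthetic data: five targets share one signal, forty background genes are noise.
data_rng = random.Random(42)
n_samples = 20
signal = [data_rng.gauss(0, 1) for _ in range(n_samples)]
expr = {f"target{i}": [s + 0.2 * data_rng.gauss(0, 1) for s in signal] for i in range(5)}
expr.update({f"bg{i}": [data_rng.gauss(0, 1) for _ in range(n_samples)] for i in range(40)})
targets = [f"target{i}" for i in range(5)]
observed, p_value = empirical_p(expr, targets)
```

Note that the regulator's own expression never enters the score; only its targets do. That matches the emphasis on regulatee coherence in the text. Handling cases where only a fraction of regulatees are co-expressed, such as the 20% scenario in the simulations, would call for a statistic more robust than a plain mean.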
Keywords/Search Tags: Gene, Pathway, Module, Approach