Recovering Bayesian networks with applications to gene regulatory networks

Posted on: 2008-06-10 | Degree: Ph.D. | Type: Thesis
University: University of Colorado at Boulder | Candidate: Wang, Jian | Full Text: PDF
GTID: 2448390005969043 | Subject: Statistics
Abstract/Summary:
Bayesian networks are convenient graphical representations of high-dimensional probability distributions that describe complex relationships among a large number of random variables. A Bayesian network is a directed acyclic graph whose nodes represent random variables and whose arrows correspond to probabilistic dependencies between them. In recent years there has been a great deal of interest in the problem of learning the structure of Bayesian networks from data, much of it driven by the study of genetic regulatory networks in molecular biology. Gene expression is quantified by the amount of messenger RNA measured in cells, and these measurements are continuous-valued. Most work to date on the recovery of Bayesian networks focuses either on discrete data or on continuous data from a very specific type of network called a Gaussian network. Many leading genetics researchers apply this established research to their problems by either discretizing continuous data or assuming a Gaussian structure. No one would dispute that discretization causes a loss of information; in this thesis we show that the very conditional dependencies we want to recover are lost. Gaussian assumptions are ubiquitous in statistics and are often applied even when they are not valid, frequently still giving reasonable results. This is definitely not the case for many different continuous-valued Bayesian networks that we have tested. In this thesis, we explore the shortcomings of existing methods for Bayesian network recovery and give many alternatives based on direct tests of conditional correlation and independence. We also develop a collection of computational tools, such as cycle-checking algorithms, that are necessary to apply the recovery techniques of this thesis to very large Bayesian networks.
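To illustrate what a cycle check involves, the following is a minimal sketch, not the thesis's own algorithm, which the abstract does not describe. It assumes the candidate network is stored as a dictionary mapping each node to the set of its children; the names `creates_cycle` and `add_edge` are hypothetical. Adding an arrow parent -> child keeps the graph acyclic exactly when the parent is not already reachable from the child.

```python
def creates_cycle(children, parent, child):
    """Return True if adding the edge parent -> child would create a directed cycle.

    `children` maps each node to the set of nodes it points to (an assumed
    representation, chosen only for this illustration).
    """
    # A cycle appears exactly when `parent` is already reachable from `child`.
    stack = [child]
    seen = set()
    while stack:
        node = stack.pop()
        if node == parent:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(children.get(node, ()))
    return False


def add_edge(children, parent, child):
    """Add parent -> child only if the graph stays acyclic; report success."""
    if creates_cycle(children, parent, child):
        return False
    children.setdefault(parent, set()).add(child)
    return True


graph = {}
results = [
    add_edge(graph, "A", "B"),  # accepted
    add_edge(graph, "B", "C"),  # accepted
    add_edge(graph, "C", "A"),  # rejected: would close the loop A -> B -> C -> A
]
```

Each check here is a depth-first search costing time proportional to the size of the graph, so a structure-learning procedure that proposes many edges on a very large network would likely need something more incremental; this sketch only shows the condition being tested.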
Keywords/Search Tags: Bayesian networks