Using heterogeneous sources of biological knowledge to improve the identification of differentially expressed genes

Posted on: 2011-08-25
Degree: Ph.D.
Type: Thesis
University: Stanford University
Candidate: Daigle, Bernie Joseph, Jr
Full Text: PDF
GTID: 2444390002459322
Subject: Genetics
Abstract/Summary:
Gene expression is a fundamental biological process whereby genetic information is converted to cellular function. Experimentally, microarray-based quantitation of messenger RNA (mRNA) levels provides a genome-scale estimate of this process. A central task in microarray analysis is the accurate detection of differentially expressed (DE) genes between two or more experimental conditions, and many analytical methods have been developed for this purpose. Unfortunately, the noise level and experimental variability of microarrays can be limiting. While a number of existing methods partially overcome these limitations by incorporating biological knowledge, there is ample room for improvement. This thesis presents two novel computational methods designed to integrate biological knowledge with microarray data to better identify DE genes.

I first describe M-BISON (Microarray-Based Integration of data SOurces using Networks), a formal probabilistic model that accommodates diverse forms of qualitative biological knowledge. M-BISON improves DE gene prediction on a range of simulated data, particularly when the microarray data are very noisy. I applied the method to a heat shock microarray dataset in S. cerevisiae, using conserved yeast DNA sequence motifs as knowledge. M-BISON improves the analysis quality and makes predictions that are easy to interpret in concert with the incorporated knowledge. By analyzing M-BISON predictions in the context of the background knowledge, I identified YHR124W/NDT80 as a potentially novel player in the yeast heat shock response.

Next, I introduce SAGAT (SVD Augmented Gene expression Analysis Tool), a mathematically principled approach that utilizes pre-existing microarray datasets to better identify DE genes. I tested the method on three well-replicated human microarray datasets, and I demonstrate that use of SAGAT increased effective experimental sample sizes. I applied SAGAT to unpublished data from a microarray study investigating transcriptional responses to insulin resistance, resulting in a 50 percent increase in the number of significant genes detected. Slightly more than half of these genes were experimentally validated using qRT-PCR, confirming SAGAT's findings and furthering our molecular understanding of a risk factor for type 2 diabetes.

In addition to the results described here, I provide both M-BISON and SAGAT as freely available software packages that are applicable to any microarray study. Together, these methods constitute a solid foundation for the principled integration of imperfect biological knowledge with microarray data, and they present attractive opportunities for future work.
Keywords/Search Tags:Biological, Microarray, Using, M-BISON, SAGAT