
A new method for genetic network reconstruction in expression QTL data sets

Posted on: 2010-03-30
Degree: Ph.D
Type: Thesis
University: North Carolina State University
Candidate: Duarte, Christine W
Full Text: PDF
GTID: 2448390002477924
Subject: Biology
Abstract/Summary:
Expression QTL (or eQTL) studies involve the collection of microarray gene expression data and genetic marker data from segregating individuals in a population to search for genetic determinants of differential gene expression. Previous studies have found large numbers of trans-regulated genes that link to a single locus or eQTL "hotspot". It would be of great interest to discover the mechanism of co-regulation for these groups of genes. However, current network reconstruction algorithms suffer from difficulties such as low power and high computational cost. A common observation for biological networks is that they have a scale-free or power-law architecture. In such an architecture, a few highly influential nodes have many connections to other nodes, but most nodes in the network have very few connections. If we assume that this type of architecture applies to genetic networks, then we can simplify the problem of genetic network reconstruction by focusing on discovery of the key regulatory genes at the top of the network. We introduce the concept of "shielding", in which a gene is conditionally independent of the QTL given the shielder gene, and we iteratively build networks from the QTL down using tests of conditional independence. We evaluate the confidence level of shielders using a two-part strategy: requiring a threshold number of genes to be shielded and requiring a high level of bootstrap support for shielders. We have performed a set of simulations to test the sensitivity and specificity of our method as a function of method parameters. We have found that our method performs well using a significance level of 0.05 for testing the hypothesis that a gene is a shielder, with little gained by decreasing alpha further. The appropriate shielder bootstrap confidence level depends on the desired balance between false positives and false negatives, but our recommendation is to use 80% bootstrap support for high confidence in discovered network features.
With a small sample size (100) and a large number of network genes (as many as 622), our algorithm succeeds in finding a high percentage of the key network regulators (47% on average) with high confidence (95% specificity on average).

We have applied our network reconstruction algorithm to a yeast expression QTL data set in which microarray and marker data were collected from the progeny of a backcross of two strains of Saccharomyces cerevisiae [8]. Networks have been reconstructed for 6 of the 11 largest eQTL hotspots in this data set. The regulation of shielder gene expression has been found to be primarily in trans. Bioinformatic analysis of three networks generated different hypotheses for mechanisms of regulation of the shielded genes by the primary shielders. One common theme was that the shielders modulated the effect of transcription factors of which they were themselves targets. Overall, our method has created a list of potentially important regulatory genes in various yeast biological processes, and further bioinformatic analysis or laboratory experiments could lead to the generation and testing of many important hypotheses.
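The shielding idea and the bootstrap-support criterion described in the abstract can be illustrated with a minimal sketch. The abstract does not say which conditional independence test the thesis uses, so this example assumes a partial-correlation test with a Fisher z-transform as a stand-in. The function names (`independence_pvalue`, `is_shielded`, `bootstrap_support`), the simulated chain QTL -> shielder -> targets, and all parameter values other than alpha = 0.05 are hypothetical choices made for illustration only.

```python
import math
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def independence_pvalue(x, y, given=None):
    """Two-sided p-value for zero (partial) correlation via Fisher's z.

    Assumption: this test stands in for whatever conditional
    independence test the thesis actually uses.
    """
    n, k = len(x), 0
    r = pearson(x, y)
    if given is not None:
        rxz, ryz = pearson(x, given), pearson(y, given)
        r = (r - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
        k = 1
    r = max(-0.999999, min(0.999999, r))  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - k - 3)
    return math.erfc(abs(z) / math.sqrt(2))


def is_shielded(qtl, gene, shielder, alpha=0.05):
    """True if the gene links to the QTL marginally, but not given the shielder."""
    linked = independence_pvalue(qtl, gene) < alpha
    independent_given = independence_pvalue(qtl, gene, given=shielder) >= alpha
    return linked and independent_given


def bootstrap_support(qtl, shielder, targets, min_shielded,
                      n_boot=200, alpha=0.05, seed=0):
    """Fraction of bootstrap resamples of individuals in which the
    candidate shields at least `min_shielded` of the target genes."""
    rng = random.Random(seed)
    n = len(qtl)
    hits = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        q = [qtl[i] for i in idx]
        s = [shielder[i] for i in idx]
        count = sum(is_shielded(q, [g[i] for i in idx], s, alpha)
                    for g in targets)
        hits += count >= min_shielded
    return hits / n_boot


# Hypothetical simulated data: QTL -> shielder -> 10 target genes,
# plus one gene driven directly by the QTL (it should not be shielded).
rng = random.Random(1)
n = 100
qtl = [rng.randint(0, 1) for _ in range(n)]
shielder = [2 * q + rng.gauss(0, 0.5) for q in qtl]
targets = [[s + rng.gauss(0, 0.5) for s in shielder] for _ in range(10)]
direct = [2 * q + rng.gauss(0, 0.5) for q in qtl]

support = bootstrap_support(qtl, shielder, targets, min_shielded=3)
print("bootstrap support for candidate shielder:", support)
print("directly regulated gene shielded?", is_shielded(qtl, direct, shielder))
```

Under the recommendation in the abstract, a candidate would be retained only if its support reached 0.8. The threshold number of shielded genes (`min_shielded=3` here) is an arbitrary illustrative value, not one taken from the thesis.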
Keywords/Search Tags:Gene, QTL, Data, Expression, Network reconstruction, Method