Computational and statistical approaches to study gene regulation and gene function

Posted on: 2006-06-23
Degree: Ph.D.
Type: Thesis
University: Harvard University
Candidate: Zhong, Sheng
Full Text: PDF
GTID: 2454390008957718
Subject: Biology
Abstract/Summary:
Clarifying gene regulation mechanisms and inferring gene function are two important aspects of computational molecular biology. This thesis reports several statistical and computational developments in these directions.

I report a statistical methodology that substantially improves the accuracy of computational predictions of transcription factor binding sites in eukaryotic genomes. This method models the cross-species conservation of binding sites without relying on accurate sequence alignment. We applied this method to all published ChIP-chip data in S. cerevisiae and found that the accuracy was 20% higher than that of the best current methods.

I report the identification of a functional intergenic element in vertebrate Hox clusters between paralogous groups 6 and 7. This intergenic element is more conserved than most coding regions and other regulatory regions of Hox genes. Using reporter assays in transgenic mice and chick embryos, we have demonstrated that the copy of this element in the human HoxA cluster functions as an enhancer, strongly suggesting that it plays a functional role in Hox regulation in vivo.

I report a statistical method to identify the Gene Ontology (GO) terms that are strongly associated with a group of genes. Choosing GO terms by multiple association tests requires proper control for multiple hypothesis testing. We proposed a permutation strategy that generates, from the null distribution, the same number of test statistics with the same correlation structure as the observed test statistics. On top of this strategy, we proposed a method that provides a moderately conservative estimator of the q-value for every GO term. It then becomes legitimate to choose GO terms according to the estimated q-values.

I report a statistical treatment for analyzing TAG microarray data. A TAG microarray measures the relative abundance of a mutant yeast strain in a mixed population of many mutant strains. We devised a method that identifies mutants whose growth behaviors differ both statistically and scientifically between two growth conditions. With this method, we identified 53 mutants as sensitive to a growth condition; 52 of these mutants have been confirmed by biochemical re-tests.
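
The permutation idea behind the GO-term analysis can be illustrated with a minimal sketch. This is not the thesis's actual procedure, whose details the abstract does not give. It assumes a hypergeometric enrichment test as the per-term statistic, a null generated by redrawing random gene groups of the same size (so every permutation yields one statistic per GO term, computed on the same annotation sets and therefore sharing their correlation structure), and a conservative q-value estimate that treats all terms as null. All function and variable names are hypothetical.

```python
import bisect
import math
import random


def hypergeom_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N and K of the N carry the term."""
    total = math.comb(N, n)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return hits / total


def term_pvalues(group, annotations, n_genes):
    """One enrichment p-value per GO term for a given gene group."""
    group = set(group)
    return {term: hypergeom_tail(len(group & genes), len(genes),
                                 len(group), n_genes)
            for term, genes in annotations.items()}


def permutation_qvalues(group, annotations, all_genes, n_perm=200, seed=0):
    """Estimate a q-value per term from permuted gene groups (assumed scheme)."""
    rng = random.Random(seed)
    observed = term_pvalues(group, annotations, len(all_genes))

    # Each permutation scores ALL terms on one random group, so the null
    # statistics keep the dependence induced by overlapping annotations.
    null = []
    for _ in range(n_perm):
        fake_group = rng.sample(all_genes, len(group))
        null.extend(term_pvalues(fake_group, annotations,
                                 len(all_genes)).values())
    null.sort()
    obs_sorted = sorted(observed.values())

    # FDR at threshold p: expected null calls per permutation / observed calls.
    fdr = {}
    for term, p in observed.items():
        expected_false = bisect.bisect_right(null, p) / n_perm
        called = bisect.bisect_right(obs_sorted, p)
        fdr[term] = min(1.0, expected_false / called)

    # q-value: smallest FDR over all thresholds at least as lenient as p.
    qvalues, running = {}, 1.0
    for term in sorted(observed, key=observed.get, reverse=True):
        running = min(running, fdr[term])
        qvalues[term] = running
    return qvalues
```

Here `annotations` maps each GO term to a set of gene identifiers and `all_genes` is a list of every gene in the background. Treating the proportion of true nulls as 1 makes the estimate lean conservative, which is one simple way to read "moderately conservative" but may differ from the estimator developed in the thesis.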
Keywords/Search Tags:Gene, Computational, Regulation, Statistical, Report