
Functional annotation of genomes by combination of sequence, expression and regulatory data

Posted on: 2006-09-19
Degree: Ph.D
Type: Dissertation
University: Boston University
Candidate: Leyfer, Dmitriy
Full Text: PDF
GTID: 1450390008467398
Subject: Biology
Abstract/Summary:
The "OMICS" era in biology, which began with genome sequencing, has now moved into transcriptomics (gene expression) and regulomics (expression regulation) studies. While gene expression or regulation data alone provide valuable material for understanding biological systems, integration of diverse types of data holds promise for pinpointing the fine structure of biological interactions and unifying discrete processes into biological networks.
This dissertation presents work in functional genomics, including expression studies using EST databases, regulation studies using tools for identifying promoters and 3'-UTR regulatory elements, and integration of diverse biological data in Transcriptional Module (TM) research.
Transcriptional modules are groups of coexpressed genes regulated by groups of jointly acting transcription factors (TFs). Assuming that the expression pattern of a gene is largely determined by the TFs that bind to the gene's promoter, unsupervised clustering of genes based on the transcription factor binding sites (cis-elements) in their promoters can group genes much as clustering by expression pattern does. Thus, intersections between expression-based and cis-element-based gene clusters can reveal TMs. Statistical significance assigned to TMs of varying sizes allows identification of even the smallest, two-gene regulatory units. The method correctly identifies the number and size of TMs on artificial datasets. Comparison of yeast TMs with MIPS and GO categories demonstrated that the experimental modules are biologically relevant. The modules are in statistically significant agreement with TMs identified by other research groups.
This work suggests that there is no preferential division of biological processes into regulatory units; each degree of partitioning exhibits a slice of the biological network, revealing hierarchical modularity of transcriptional regulation.
These results have several implications for drug discovery: (1) I can pinpoint the set of TFs controlling a biological process, thereby reducing the effort for target identification; (2) I can evaluate whether an organism is a good model organism by determining whether the TFs controlling similar biological processes are the same in this organism and in humans; and (3) I can apply the method to any kind of disparate data to obtain a more comprehensive understanding of expression regulation.
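The module-finding idea described in the abstract (intersecting expression-based gene clusters with cis-element-based gene clusters and scoring each intersection) can be illustrated with a minimal Python sketch. The abstract does not say which significance test the dissertation uses, so the hypergeometric overlap test below is an assumption chosen for illustration, and the function names, gene identifiers, cluster labels, and the 0.05 threshold are all hypothetical.

```python
from itertools import product
from math import comb


def hypergeom_tail(overlap: int, size_a: int, size_b: int, total: int) -> float:
    """P(X >= overlap) when size_b genes are drawn at random from `total`
    genes, of which size_a belong to the other cluster.

    Assumption: a hypergeometric test stands in for whatever statistic the
    dissertation actually uses.
    """
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_a, k) * comb(total - size_a, size_b - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(total, size_b)


def candidate_modules(expr_clusters, cis_clusters, total_genes,
                      alpha=0.05, min_size=2):
    """Intersect every expression cluster with every cis-element cluster and
    keep intersections of at least `min_size` genes whose overlap is unlikely
    by chance. No multiple-testing correction is applied in this sketch.
    """
    modules = []
    for (e_name, e_genes), (c_name, c_genes) in product(
        expr_clusters.items(), cis_clusters.items()
    ):
        shared = e_genes & c_genes
        if len(shared) < min_size:
            continue
        p = hypergeom_tail(len(shared), len(e_genes), len(c_genes), total_genes)
        if p <= alpha:
            modules.append((e_name, c_name, sorted(shared), p))
    return sorted(modules, key=lambda m: m[3])


# Toy data with hypothetical gene identifiers (20 genes in total).
expr_clusters = {
    "E1": {"g1", "g2", "g3", "g4", "g5"},
    "E2": {"g6", "g7", "g8", "g9", "g10"},
}
cis_clusters = {
    "C1": {"g1", "g2", "g3", "g4", "g11"},
    "C2": {"g6", "g12", "g13", "g14", "g15"},
}

modules = candidate_modules(expr_clusters, cis_clusters, total_genes=20)
for e_name, c_name, genes, p in modules:
    print(f"{e_name} x {c_name}: {genes} (p = {p:.4g})")
```

In this toy example only the E1/C1 intersection passes both the size and significance filters. Setting `min_size=2` mirrors the abstract's point that even two-gene regulatory units can be reported when their overlap is statistically significant.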
Keywords/Search Tags: Expression, Data, Regulatory, Regulation, Biological, Gene