Formal methods for genomic data integration

Posted on:2006-04-07

Degree:Ph.D

Type:Thesis

University:The Pennsylvania State University

Candidate:Shah, Nigam

GTID:2458390008467350

Subject:Biology

Abstract/Summary:

The rapid growth of life sciences research and its associated literature over the past decade, the rapid expansion of biological databases, and the invention of high-throughput techniques that permit collection of data on many genes and proteins simultaneously have created an acute need for new computational tools to support the biologist in collecting, evaluating and integrating large amounts of information of many disparate kinds. This thesis presents methods for the representation, manipulation and conceptual integration of diverse biological data with prior biological knowledge to facilitate both the interpretation of data and the evaluation of hypotheses.

We have developed a tool called CLENCH that assists in the interpretation of gene lists resulting from microarray data analysis by integrating and visualizing Gene Ontology (GO) annotations and transcription factor binding site information with gene expression data. During the development of CLENCH, it became evident that a unified framework for representing prior knowledge and information can increase our ability to integrate new data with existing knowledge.

In subsequent work, we developed the HyBrow (Hypothesis Browser) system as a prototype tool for designing hypotheses and evaluating them for consistency with existing knowledge. HyBrow consists of a conceptual framework able to represent diverse biological information types, an ontology for describing biological processes at different levels of detail, a database for querying information in the ontology, and programs to design, evaluate and revise hypotheses. We demonstrate the HyBrow prototype using the galactose gene network in Saccharomyces cerevisiae as a test system.

Along with the increase in available information, knowledgebases, which provide structured descriptions of biological processes, are proliferating rapidly. To support computer-aided information integration tools like HyBrow, a knowledgebase should be trustworthy, and it should structure information expressively enough to represent biological systems at multiple scales. We extend and adapt the conceptual framework underlying HyBrow and use it to verify the trustworthiness and usefulness of the Reactome knowledgebase.

Keywords/Search Tags:

Data, Biological, HyBrow, Information

Related items

1	Comparative analysis of biological sequences through information visualization
2	Mining for significant information from unstructured and structured biological data and its applications
3	Bio-data Organizing And Processing Based On XML
4	Detection of low rank signals in noise and fast correlation mining with applications to large biological data
5	Systems analysis of complex biological data for bioprocess enhancement
6	New techniques for improving biological data quality through information integration
7	Integrative Analyses of Diverse Biological Data Sources
8	A statistical approach for information extraction of biological relationships
9	Biological information management with application to human genome data
10	Based On The Django Framework For The Establishment Of Biological Information Website