
Biological information management with application to human genome data

Posted on: 1999-01-09
Degree: Ph.D
Type: Dissertation
University: Georgia Institute of Technology
Candidate: Kogelnik, Andreas Matthias
Full Text: PDF
GTID: 1468390014969370
Subject: Computer Science
Abstract/Summary:
The goal of the Human Genome Project (HGP) is to obtain the complete DNA sequence of the estimated 3-4 billion nucleotide bases in the human genome. This project has generated enormous volumes of data of a wide variety of types. The ability to fully utilize this and other associated information is paramount to the advancement of biomedical science. Given the complex nature of this data overload, new database management systems are required to store and manage information and to allow users to make better use of the data. This work defines numerous characteristics of biological data that a system must accommodate in order to manage such data comprehensively. In addition, we designed and implemented a generalized prototype system for managing HGP data.
Biological information represents a broad spectrum of highly complex, highly variable, rapidly changing, and often poorly structured data. The most popular biological databases have been designed to handle a restricted type (or types) of data from a particular data set, thereby limiting the ways in which this data could be used. Until recently, no portion of the human genome yet sequenced had been studied extensively enough to generate the variety and magnitude of data that would necessitate the development of an integrated genetic information system. Human mitochondrial DNA (mtDNA) offers an opportunity to model and integrate such varied data sets as clinical data, functional/organizational data, population data, and gene-gene interaction data. There is now an extensive array of mtDNA functional and comparative genetic information, population variation data, and disease mutations. In developing MITOMAP, a human mitochondrial genome database, we recognized the need for a more general solution to the problems of biological data management.
As a result, we developed GENOME, the Georgia Tech-Emory Networked Object Management Environment, a prototype database environment designed to manage complex biomedical data. GENOME is designed to allow the establishment of a network of searchable data sources for Human Genome Project sites that have a complete DNA sequence available for a given genetic locus and large amounts of information that they are trying to integrate.
Keywords/Search Tags:Human GENOME, Data, Information, DNA, Biological, Management