Data mining in DNA: Using the SUBDUE knowledge discovery system to find potential gene regulatory sequences

Posted on: 2000-01-29
Degree: M.S.
Type: Thesis
University: The University of Texas at Arlington
Candidate: Maglothin, Ronald Keith
Full Text: PDF
GTID: 2468390014463086
Subject: Computer Science
Abstract/Summary:
The international genome sequencing projects are generating large volumes of DNA, RNA, and protein sequence data. The sizes of these data sets are too large for effective analysis by humans; they require automated methods of sequence analysis and pattern discovery. Data mining algorithms that can find biologically important patterns in these large databases, and that can do so in polynomial running time, are in great demand. In this work, the SUBDUE knowledge discovery system has been applied to the DNA sequence of baker's yeast, Saccharomyces cerevisiae. Several enhancements and modifications have been made to the system, and the resulting software has found several sequence patterns known to participate in yeast gene regulation. It has also found patterns known to participate in the gene regulation of other organisms, but not yet known to do so in yeast.
Keywords/Search Tags: DNA, Gene, Data, Sequence, Discovery, System