
Graphical Representations Of Nucleic Acid Sequences And Its Application

Posted on: 2008-02-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: C X Yuan
Full Text: PDF
GTID: 1100360218955519
Subject: Computational Mathematics
Abstract/Summary:
With the implementation and completion of genome projects for several model organisms, especially the Human Genome Project (HGP), increasing attention has been paid to the study of biological molecular sequences. Many new genome projects have been proposed and have made rapid progress, and as a result huge databases of molecular sequences have been generated. The storage, management and analysis of these sequence data promote the integration of molecular biology, computer science and mathematics, and the resulting fields of computational molecular biology and bioinformatics have become an active research area. As a new and developing interdisciplinary field involving life science, computer science, mathematics, physics, chemistry and so on, bioinformatics covers a very wide range of research, including sequence comparison, gene recognition, molecular evolution and comparative genomics, RNA and protein structure prediction, and computer-aided drug design. This thesis focuses on graphical representations of biological sequences and their applications.

The main contents of this thesis are as follows:

1. In Chapter 2, a 3-D graphical representation of DNA sequences is proposed, which avoids some limitations of earlier graphical representation models of biological sequences. Similarity and dissimilarity analysis based on this 3-D graphical representation is carried out for the first exon of the β-globin gene of eleven species.
2. In Chapter 3, basic problems relating to similarity analysis based on graphical representations of biological sequences are studied, and a general method for assessing the sensitivity of the various graphical representation models is proposed.
3. In Chapter 4, based on the 3-D graphical representation of DNA sequences proposed in Chapter 2, recognition of protein-coding genes in the yeast genome is carried out. Cross-validation tests demonstrate that the accuracy of the algorithm is over 96%. The total number of protein-coding genes in the yeast S. cerevisiae genome is estimated to be about 5920, consistent with the widely accepted range of 5800~6000.
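
The abstract does not give the actual construction of the 3-D representation, so the following is only a minimal sketch of the general idea of a 3-D graphical representation and a curve-based dissimilarity measure. The base-to-direction assignment in `STEP`, the centroid descriptor, and the function names are all illustrative assumptions, not the thesis's method.

```python
# Hypothetical assignment of each base to a step direction (vertices of a
# tetrahedron). This mapping is an assumption for illustration only.
STEP = {
    "A": (1, 1, 1),
    "C": (1, -1, -1),
    "G": (-1, 1, -1),
    "T": (-1, -1, 1),
}


def curve_3d(seq):
    """Map a DNA string to a list of cumulative 3-D points, starting at the origin."""
    x = y = z = 0
    points = [(0, 0, 0)]
    for base in seq.upper():
        dx, dy, dz = STEP[base]  # raises KeyError on characters other than A, C, G, T
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def centroid(points):
    """Geometric center of the curve, used here as a simple numerical descriptor."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def dissimilarity(seq_a, seq_b):
    """Euclidean distance between the centroids of the two sequence curves."""
    ca = centroid(curve_3d(seq_a))
    cb = centroid(curve_3d(seq_b))
    return sum((a - b) ** 2 for a, b in zip(ca, cb)) ** 0.5


if __name__ == "__main__":
    # Toy sequences, not real gene data.
    print(curve_3d("ACGT"))
    print(dissimilarity("ACGTAC", "ACGTTT"))
```

A single centroid is a very coarse descriptor; it is used here only to keep the sketch short, and a real similarity analysis would typically compare richer invariants of the curve.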
Keywords/Search Tags: DNA sequence, graphical representation, gene recognition, invariant method