
The Research On The 3D Graphical Representation Of DNA Sequence And The Algorithm For Constructing Phylogenetic Tree

Posted on: 2008-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: X Z Zhang
GTID: 2178360215479827
Subject: Computer system architecture
Abstract/Summary:
With the progress of the Human Genome Project (HGP) and research on the gene sequences of many species, ever more molecular sequence data have been generated. The need to analyze and process these data has accelerated the development of bioinformatics. As the number of gene sequences grows, graphical representation is becoming an important tool for studying them. How to represent gene sequences graphically in an effective way, how to classify genes, and how to study phylogenetic relationships are therefore important problems in bioinformatics.

This dissertation studies the graphical representation of DNA sequences, the similarity analysis of biological sequences, and algorithms for constructing phylogenetic trees.

Building on a study of several existing 3D graphical representations of DNA sequences, we first propose a new 3D graphical representation, the N curve. A DNA sequence is easily translated into a curve in three-dimensional space, and a genome sequence is uniquely represented by its N curve. We prove that the N curve contains no circuits and that it exhibits symmetry, and we describe the biological characteristics that it captures.

In similarity analysis based on graphical representations of DNA sequences, the leading eigenvalue λ of a matrix derived from the representation is often used to characterize a sequence. As sequences become longer, however, computing λ becomes more expensive. We propose a new matrix invariant, Z_inv. An experiment on the first exon of the β-globin gene of eleven species shows that Z_inv is easy to compute and approximates λ; the minimum error between Z_inv and λ is 0.0008.

We also study the traditional algorithms for constructing phylogenetic trees and the PHYLIP programs. Based on the N curve, we propose a tree-construction algorithm that uses hierarchical clustering. The phylogenetic relationships among twelve HA (H5N1) sequences of avian influenza virus illustrate the algorithm.
Keywords/Search Tags:Biology data mining, DNA sequence, 3D graphical representation, Matrix invariant, Algorithm for constructing phylogenetic tree
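
The abstract does not define the N curve, Z_inv, or the clustering procedure, so the following Python sketch only illustrates the general pipeline it describes: map each sequence to a 3D curve, summarize the curve numerically, and build a tree by hierarchical clustering. The nucleotide-to-step table, the `descriptor` summary, and the choice of average linkage are all assumptions made for illustration; none of them is taken from the thesis.

```python
import math

# Hypothetical step table (an assumption, not the thesis's N curve):
# each base moves one unit in the x-y plane, while z is the base's
# 1-based position, so the curve never revisits a point.
STEPS = {"A": (1, 0), "T": (-1, 0), "G": (0, 1), "C": (0, -1)}


def curve_3d(seq):
    """Return one 3D point per nucleotide of seq."""
    x = y = 0
    points = []
    for position, base in enumerate(seq.upper(), start=1):
        dx, dy = STEPS[base]
        x, y = x + dx, y + dy
        points.append((x, y, position))
    return points


def descriptor(seq):
    """Mean coordinates of the curve: a crude stand-in for a
    matrix invariant such as a leading eigenvalue."""
    pts = curve_3d(seq)
    n = len(pts)
    return tuple(sum(p[i] for p in pts) / n for i in range(3))


def average_linkage(names, vectors):
    """Agglomerative clustering with average linkage.

    Returns the tree as nested tuples of names."""
    clusters = [(name, [i]) for i, name in enumerate(names)]

    def dist(a, b):
        # Average pairwise Euclidean distance between two clusters.
        total = sum(math.dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters and merge it.
        _, i, j = min(
            (dist(clusters[i][1], clusters[j][1]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        merged = (
            (clusters[i][0], clusters[j][0]),
            clusters[i][1] + clusters[j][1],
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


if __name__ == "__main__":
    seqs = {"s1": "AAAA", "s2": "AAAT", "s3": "CCCC"}
    names = list(seqs)
    tree = average_linkage(names, [descriptor(s) for s in seqs.values()])
    print(tree)
```

Because z increases with every base, this toy curve is trivially free of circuits and distinct sequences yield distinct curves, mirroring two of the properties the abstract claims for the N curve. A realistic study would replace `descriptor` with a richer invariant and use real sequences rather than the toy strings above.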