
The Research On Similarity Of DNA Sequences And Algorithm For Constructing Phylogenetic Tree Based On Graphical Representation

Posted on: 2011-04-14
Degree: Master
Type: Thesis
Country: China
Candidate: J C Guo
GTID: 2178360308468841
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the Human Genome Project (HGP) and of model-organism genome-sequencing projects, more and more molecular sequence data have been generated. The scientific analysis, processing and study of these data not only accelerate the development of bioinformatics, but also have broad applications in human disease prevention, diagnosis, treatment and new drug development. How to represent gene sequences graphically in an effective way, and how to analyze their genetic similarity and evolutionary relationships, have become hot topics in bioinformatics.

This dissertation mainly studies the graphical representation of DNA sequences, the similarity analysis of biological sequences, and the algorithm for constructing the phylogenetic tree. The main achievements are summarized below.

Firstly, the JZ-curve, a new graphical representation of the gene sequence, is introduced. By defining three mathematical mappings, a gene sequence can be transformed into three curves. We show that the JZ-curve not only avoids the limitations associated with crossing and overlapping, but also retains the biological information of gene sequences.

Secondly, we construct a new characteristic matrix, named the J/J matrix. When sequence similarity is studied on the basis of the graphical representation of DNA sequences, the J/J characteristic matrix based on the JZ-curve can describe the chemical characteristics and the biological significance of gene sequences. The examination of similarities/dissimilarities among the coding sequences of the first exon of the β-globin gene of different species illustrates the utility of the approach.

Thirdly, based on the JZ-curve, a fuzzy clustering algorithm grounded in spectral graph theory is proposed for constructing the phylogenetic tree. With this cluster analysis method, we build phylogenetic trees and determine the evolutionary relationships between the sequences. Moreover, the algorithm considers not only the divergence between classes but also the similarity between classes, which improves the accuracy of the results. The phylogenetic relationships obtained for the coding sequences of the first exon of the β-globin gene of 11 different species and for the NA(H1N1) sequences of the avian influenza virus indicate that the algorithm is credible.
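The abstract states only that three mathematical mappings turn a gene sequence into three curves; it does not give the JZ-curve mappings themselves. As a rough illustration of the general idea, the sketch below substitutes the well-known Z-curve sign conventions (purine/pyrimidine, amino/keto, weak/strong hydrogen bonding) and accumulates them along the sequence. The function name `three_curves` and the step tables are assumptions made for this example, not the thesis's definitions.

```python
from itertools import accumulate

# Classic Z-curve sign conventions, used here as a stand-in.
# These are NOT the thesis's JZ-curve mappings, which the abstract does not specify.
X_STEP = {"A": 1, "G": 1, "C": -1, "T": -1}  # purine (+1) vs pyrimidine (-1)
Y_STEP = {"A": 1, "C": 1, "G": -1, "T": -1}  # amino (+1) vs keto (-1)
Z_STEP = {"A": 1, "T": 1, "C": -1, "G": -1}  # weak (+1) vs strong (-1) hydrogen bonding


def three_curves(seq):
    """Return three cumulative-sum curves (one per mapping) for a DNA string.

    Each curve is a list whose n-th entry is the running sum of the
    mapping's +1/-1 steps over the first n bases. Characters other than
    A, C, G, T raise KeyError.
    """
    seq = seq.upper()
    return tuple(
        list(accumulate(step[base] for base in seq))
        for step in (X_STEP, Y_STEP, Z_STEP)
    )


if __name__ == "__main__":
    for name, curve in zip("xyz", three_curves("ATGC")):
        print(name, curve)
```

For instance, `three_curves("ATGC")` yields the curves `[1, 0, 1, 0]`, `[1, 0, -1, 0]` and `[1, 2, 1, 0]`. Because each curve is indexed by sequence position, plotting it against that index gives a single-valued graph.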
Keywords/Search Tags: DNA Sequence, Graphical Representation, Characteristic Matrix, Algorithm for Constructing Phylogenetic Tree