
Comparison Of Biological Sequences/structures And Construction Of Phylogenetic Trees

Posted on: 2008-07-17  Degree: Doctor  Type: Dissertation
Country: China  Candidate: N Liu
GTID: 1100360218455521  Subject: Computational Mathematics
Abstract/Summary:
Bioinformatics, or Computational Molecular Biology, studies biological macromolecules by applying theories, methodologies and techniques from mathematics, physics, information science and related fields, with the aid of computers and the internet. The field has produced a large number of specialized analytic algorithms and software tools, which provide powerful instruments for biologists. Computational Molecular Biology has become a highly active area of the life sciences, and comparative genomics and phylogenetic analysis are currently among its most important topics.

This dissertation aims to explore simple and efficient methods for analyzing biological data, so as to provide tools for biologists. Our work focuses on comparing biological sequences and structures and on constructing phylogenetic trees. The main achievements of this dissertation can be outlined as follows.

For the comparative study of sequences, we present two similarity measures for sequences: the relative similarity measure and the weighted similarity measure. A property of the first is that the diagonal entries of the matrix derived from the relative similarity measure are usually nonzero, which does not affect the similarity analysis. A property of the second is that it permits the analysis to be carried out from different aspects.

For the comparative study of structures, we build stochastic process models for RNA secondary structures and for protein secondary structures and, based on these models, develop approaches for structural similarity analysis and structural classification, respectively. We propose two methods for analyzing protein secondary structures, triangular graphics analysis and Fourier spectrum analysis, so that structural properties can be displayed visually and numerical characteristics can also be extracted. We construct RNA secondary configurations and RNA Catalan Skeletons, and achieve direct enumeration of RNA secondary structures. With RNA secondary configurations and RNA Catalan Skeletons, structural properties can be displayed in simple ways and numerical characteristics can also be obtained; they provide new media for studying RNA secondary structures. Furthermore, motivated by RNA secondary structures, we obtain the Catalan numbers with restrictions.

For the study of phylogenetic tree construction, we develop an RNA secondary structure-based approach and two protein-based approaches for constructing phylogenetic trees. These approaches do not involve evolutionary model hypotheses, and their time complexity is not high. In particular, the RNA secondary structure-based approach can deal with more complicated structures than RNAforester can (RNAforester is a popular software tool for comparing RNA secondary structures).
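As background for the enumeration results mentioned in the abstract, the following is a minimal sketch of the classical connection between Catalan numbers and counts of RNA secondary structures. It is not the dissertation's construction: RNA secondary configurations and RNA Catalan Skeletons are not defined in this abstract, so the code uses the standard textbook recurrence for counting noncrossing base-pairing patterns instead. The function names and the `min_loop` parameter (minimum number of unpaired bases enclosed by a base pair) are illustrative assumptions.

```python
from math import comb


def catalan(n: int) -> int:
    """n-th Catalan number: counts balanced strings of n bracket pairs."""
    return comb(2 * n, n) // (n + 1)


def count_secondary_structures(n: int, min_loop: int = 1) -> int:
    """Count noncrossing pairing patterns on n positions.

    Standard recurrence (an assumption here, not taken from the source):
    the last position is either unpaired, or paired with some position j
    such that at least `min_loop` positions lie strictly between them.
    Sequence content (which bases can pair) is ignored.
    """
    counts = [1]  # one (empty) structure on zero positions
    for length in range(1, n + 1):
        total = counts[length - 1]  # last position unpaired
        # last position paired with j, where 1 <= j <= length - min_loop - 1
        for j in range(1, length - min_loop):
            left = counts[j - 1]             # positions before j
            inside = counts[length - 1 - j]  # positions enclosed by the pair
            total += left * inside
        counts.append(total)
    return counts[n]


if __name__ == "__main__":
    print([catalan(i) for i in range(6)])
    print([count_secondary_structures(i) for i in range(8)])
```

With `min_loop=0` the recurrence counts all noncrossing partial matchings (the Motzkin numbers), and requiring every position to be paired would recover the Catalan numbers; restrictions such as a minimum loop size are one common way in which restricted Catalan-type counts arise from RNA structures.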
Keywords/Search Tags:Relative similarity measure, Weighted similarity measure, Characteristic sequences, RNA Catalan Skeletons, Lempel-Ziv complexity