In this paper we systematically discuss the procedure for constructing species phylogenies based on chloroplast genomes. The singular value decomposition (SVD) theorem for a real matrix, which is the basic theorem underlying our analysis, is stated in the preliminaries. Analysing protein sequences involves a large amount of data, so we use C, Matlab, or Phylip as needed. Chapter 3 presents the process of constructing a phylogenetic tree. First, all protein sequences in the whole genomes of multiple species are converted into a large, sparse, real-valued 4-gram frequency matrix. Converting them only into a 3-gram frequency matrix is generally not meaningful. The tree is then drawn with the Phylip software. In Chapter 4, we construct the phylogenetic tree based on chloroplast genomes. Before constructing the species phylogenies of chloroplasts, it is necessary to reduce "noise" in order to minimize conflict in the data. Several researchers, including Zuguo Yu, Liqian Zhou, and Gary W. Stuart, have successfully developed methods for reducing this noise.
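As an illustration of the conversion step and the SVD it feeds into, the following minimal Python sketch counts overlapping 4-grams in a few protein sequences, assembles them into a frequency matrix (one row per observed 4-gram, one column per protein), and computes a rank-k SVD. It is not the implementation used in this paper, which relies on C, Matlab, and Phylip. The function names `ngram_counts`, `frequency_matrix`, and `truncated_svd`, the 20-letter amino acid alphabet, and the row/column layout are all assumptions made for the example.

```python
import numpy as np

# Assumption: the standard 20-letter amino acid alphabet.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def ngram_counts(sequence, n=4):
    """Count overlapping n-grams in one protein sequence."""
    counts = {}
    for i in range(len(sequence) - n + 1):
        gram = sequence[i:i + n]
        # Skip n-grams containing ambiguous or unknown residues.
        if all(c in AMINO_ACIDS for c in gram):
            counts[gram] = counts.get(gram, 0) + 1
    return counts


def frequency_matrix(proteins, n=4):
    """Build a matrix with one row per observed n-gram and one column per protein."""
    per_protein = [ngram_counts(p, n) for p in proteins]
    grams = sorted(set().union(*per_protein))
    row_of = {g: i for i, g in enumerate(grams)}
    # Dense storage is used here only for brevity; real data would need a sparse format.
    matrix = np.zeros((len(grams), len(proteins)))
    for j, counts in enumerate(per_protein):
        for gram, count in counts.items():
            matrix[row_of[gram], j] = count
    return grams, matrix


def truncated_svd(matrix, k):
    """Return the k largest singular triplets of a real matrix."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k], s[:k], vt[:k, :]


if __name__ == "__main__":
    # Toy sequences, invented for the example.
    grams, m = frequency_matrix(["ACDEFACDE", "CDEFG"])
    u, s, vt = truncated_svd(m, k=2)
    print(len(grams), m.shape, s.shape)
```

With 20 amino acids there are 20^4 = 160,000 possible 4-grams, so for whole genomes the matrix is large and mostly zero. This is why the sketch keeps only the 4-grams that actually occur, and why a sparse representation would be preferable in practice.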