Inferring, testing and summarizing a posterior distribution of phylogenies

Posted on: 2008-04-27
Degree: Ph.D
Type: Thesis
University: University of Alberta (Canada)
Candidate: Cranston, Karen Ann
Full Text: PDF
GTID: 2444390005468608
Subject: Biology
Abstract/Summary:
Phylogenetics is the study of the evolutionary histories, or phylogenies, for groups of species. Inferring phylogenies is a difficult estimation problem, and Bayesian methods are a relatively new approach. Rather than returning a point estimate of the optimal tree, a Bayesian analysis integrates over the distribution of branch lengths and model parameters, producing a posterior distribution of phylogenies. Given the large sample space and inability to calculate the required integrals analytically, Bayesian methods use Markov chain Monte Carlo (MCMC) to sample from the posterior distributions of the parameters. A good MCMC algorithm finds the regions of high probability quickly and explores these regions efficiently. Creating MCMC algorithms is challenging, and the task is further complicated by the difficulty of testing the performance of the methods. We must ensure that the sampled states have reached a stationary distribution and that we have run the method for a sufficient number of iterations for accurate inference from the sampled states.

In this thesis, I develop a new algorithm, BranchSlide, for exploration of the tree space, and then test the algorithm against existing methods. I assess the performance of the algorithm using a variety of convergence diagnostics, including a novel statistic based on the partition probabilities of the tree topology.

Results indicate that the BranchSlide proposal algorithm, given an appropriate tuning parameter, works very well over a wide range of inference problems. Very informative data sets are robust to changes in the proposal method, while harder inference problems are very sensitive to proposal methods. The analyses also indicate that a very flat posterior distribution of tree topologies still contains a large amount of information, leading to the development of a method to extract a stronger topology signal from the posterior distribution. I implement these methods in BayesTrees, a novel software package for Bayesian phylogenetic inference.
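
As a rough illustration of what a convergence check based on partition probabilities can look like, the sketch below estimates how often each partition (a clade, written as a set of taxa) appears in the trees sampled by two independent MCMC runs and reports the largest disagreement between the runs. It is a generic diagnostic written under stated assumptions, not the novel statistic from the thesis, which the abstract does not specify. The function names, the tree representation (each sampled tree as a set of frozensets of taxon labels), and the toy data are all hypothetical.

```python
from collections import Counter


def partition_frequencies(trees):
    """Return the fraction of sampled trees that contain each partition.

    Assumption: each tree is an iterable of frozensets of taxon labels,
    one frozenset per partition (clade) present in that tree.
    """
    if not trees:
        raise ValueError("need at least one sampled tree")
    counts = Counter()
    for tree in trees:
        # set() guards against a partition being listed twice in one tree
        counts.update(set(tree))
    n = len(trees)
    return {partition: count / n for partition, count in counts.items()}


def max_frequency_difference(run_a, run_b):
    """Largest absolute difference in partition frequency between two runs.

    A partition missing from one run is treated as having frequency 0 there.
    """
    freq_a = partition_frequencies(run_a)
    freq_b = partition_frequencies(run_b)
    partitions = set(freq_a) | set(freq_b)
    return max(
        (abs(freq_a.get(p, 0.0) - freq_b.get(p, 0.0)) for p in partitions),
        default=0.0,
    )


# Toy samples (hypothetical): four trees per run, one partition per tree.
AB = frozenset({"A", "B"})
AC = frozenset({"A", "C"})
run_a = [{AB}, {AB}, {AB}, {AC}]
run_b = [{AB}, {AB}, {AC}, {AC}]

print(max_frequency_difference(run_a, run_b))
```

If two runs have both reached the stationary distribution and have been sampled long enough, their partition frequencies should agree closely, so a large maximum difference suggests that at least one run needs more iterations.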
Keywords/Search Tags: Posterior distribution, Phylogenies, Methods, Bayesian, Tree, Inference