
Analysis of probabilistic models of evolution

Posted on: 2007-05-02
Degree: Ph.D.
Type: Thesis
University: Harvard University
Candidate: Matsen, Frederick Albert, IV
Full Text: PDF
GTID: 2448390005462881
Subject: Biology
Abstract/Summary:
This thesis develops mathematical techniques to analyze evolutionary models and data. The first part develops improved techniques for analyzing the shape of phylogenetic trees. A phylogenetic tree is a graphical representation of the evolution of a group of taxa, usually with the tips representing extant taxa and the internal nodes representing hypothetical ancestors. The work here focuses on improving the utility of tree shape statistics, which are numerical summaries of the overall shape of a tree. First, we develop the "geometric" perspective on tree shape, which formalizes the intuitive notion that a good tree shape statistic should be similar for similar trees and different for different trees. This perspective has the distinct advantage of allowing the evaluation of multiple tree shape statistics describing different aspects of tree shape. Second, we analyze the possibility of using the spectra of various matrices associated with a tree to describe that tree. The main result is that, for any of several common choices of matrix, the fraction of binary trees with a unique spectrum goes to zero as the number of leaves goes to infinity. Third, we explore a natural recursive framework that allows for the enumeration of, and optimization over, tree shape statistics. This framework is then applied in an example application to find more powerful statistics than were previously known.

The second part consists of the analysis of two different evolutionary models. The first result concerns the coalescent, a stochastic process modeling the ancestry of a genetic sample from a population. We show that the coalescent process for a sample of size two on any graph of a certain class converges to the same process on a complete graph. The second is a result about random walks of populations on graphs, which is then applied to analyze a model of language co-development. Specifically, we identify a very simple strategy that leads to the population finding a common language with high probability.
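To make the notion of a tree shape statistic concrete, the sketch below computes the Colless imbalance index, a standard statistic that sums, over all internal nodes of a binary tree, the absolute difference between the leaf counts of the two subtrees. The abstract does not say which statistics the thesis studies, so the choice of Colless, the nested-tuple tree encoding, and the function name `colless` are all assumptions made for illustration.

```python
def colless(tree):
    """Return (leaf_count, colless_index) for a rooted binary tree.

    Assumed encoding (illustrative only): an internal node is a 2-tuple
    (left, right); anything that is not a tuple is treated as a leaf.
    """
    if not isinstance(tree, tuple):
        return 1, 0  # a single leaf contributes no imbalance
    left, right = tree
    n_left, c_left = colless(left)
    n_right, c_right = colless(right)
    # Imbalance at this node is the difference in subtree leaf counts.
    return n_left + n_right, c_left + c_right + abs(n_left - n_right)


# Two four-leaf trees with different shapes.
balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")

print(colless(balanced)[1])     # imbalance of the balanced tree
print(colless(caterpillar)[1])  # imbalance of the caterpillar tree
```

The balanced tree scores lower than the caterpillar tree, which matches the intuition stated above that a useful statistic should differ for trees of different shape.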
Keywords/Search Tags:Models, Tree shape