Font Size: a A A

Markov models on trees: Reconstruction and applications

Posted on:2008-11-07Degree:Ph.DType:Dissertation
University:University of California, BerkeleyCandidate:Roch, Sebastien JocelynFull Text:PDF
GTID:1440390005969046Subject:Statistics
Abstract/Summary:
In this dissertation, we consider a number of inference problems related to Markov models on trees---mostly motivated by applications in evolutionary biology.; We begin with the "ancestral reconstruction" problem---also known simply as the reconstruction problem: Given the state at the leaves of the tree, how accurately can one guess the state at the root? Our main result in this area is a new condition for non-reconstructibility. More precisely, we extend a known bound, the Kesten-Stigum bound, to a more general class of Markov transition matrices than previously known, more specifically, roughly symmetric binary channels.; The second problem we consider is the "phylogenetic reconstruction" problem: Given samples of the leaf states, when can one infer efficiently the tree structure from which the data was generated? This is a central problem in biology where it corresponds to reconstructing the evolutionary history of a group of species from molecular sequences. Here we prove two main results. First, we show that a standard inference technique used in biology, maximum likelihood, is NP-hard, that is, computationally intractable on worst-case instances. Then, using a connection to the ancestral reconstruction problem, we show that, when the data is generated from a Markov model with sufficiently low mutation rates, the sample requirement for an accurate reconstruction drops significantly. Moreover, we show that this transition, conjectured by Mike Steel, is sharp.; The third problem we consider is the "full reconstruction" problem: Given samples of the leaf states, when can one reconstruct efficiently the full Markov model, that is, the tree structure as well as the edge transition matrices? Using a standard framework from computational learning theory, we give conditions on the transition matrices for the feasibility of an efficient reconstruction of the full model.; Finally, we consider a related model in networking, multicast delay inference. We give an efficient algorithm for reconstructing the routing tree and its delay parameters.
Keywords/Search Tags:Tree, Model, Markov, Reconstruction, Problem, Inference
Related items