
Corrected Log Det evolutionary distance estimation

Posted on: 2010-01-03
Degree: M.Sc
Type: Thesis
University: Dalhousie University (Canada)
Candidate: Gao, He
Full Text: PDF
GTID: 2448390002477889
Subject: Applied Mathematics
Abstract/Summary:
In this thesis, we study the use of distance methods to reconstruct evolutionary trees, with a focus on LogDet distances. The LogDet estimator is a measure of divergence (evolutionary distance) between sequences of biological characters: DNA, amino acids, or gene content data. This transformation compares favorably with many existing distances, which tend to falsely group sequences on the basis of their similar nucleotide composition. A difficulty, however, is that the LogDet distance does not exist when the determinant is less than or equal to zero.

Examining the proportion of times the estimated topology matched the true topology, we found that the LogDet distance can accurately reconstruct the true evolutionary tree in many situations. The corrected distance performed better, however, because it deals with the problem of non-existence.

We introduce a corrected LogDet distance with a correction factor alpha. With appropriate values of alpha, we can substantially decrease the proportion of non-existent distances. There is a tradeoff between choosing alpha to minimize the probability of a non-positive distance and choosing it to minimize the mean squared error (MSE) of the distance. We find optimal alpha values that minimize the MSE of the distance estimator and analyze how well they reduce the probability of non-existence. We also briefly introduce methods that can be used to estimate the edge lengths and the topology. We use the estimated edge lengths calculated with LogDet and corrected LogDet distances to estimate trees in four-taxon simulations.
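The non-existence problem described above can be illustrated with a minimal sketch of an uncorrected LogDet distance between two aligned DNA sequences. The abstract does not give the thesis's exact formula, so the normalization used here (the commonly cited form that subtracts half the log-products of the marginal frequencies and divides by the number of states) is an assumption, as are the function names `determinant`, `divergence_matrix`, and `logdet_distance`. The corrected distance with the factor alpha is not reproduced, because its definition is not stated in the abstract.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest absolute entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def divergence_matrix(seq_x, seq_y, alphabet="ACGT"):
    """Joint frequencies F[i][j]: share of sites with state i in x and j in y."""
    index = {state: k for k, state in enumerate(alphabet)}
    size = len(alphabet)
    counts = [[0] * size for _ in range(size)]
    sites = 0
    for sx, sy in zip(seq_x, seq_y):
        # Skip gaps and ambiguous characters (an assumption of this sketch).
        if sx in index and sy in index:
            counts[index[sx]][index[sy]] += 1
            sites += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    return [[c / sites for c in row] for row in counts]


def logdet_distance(seq_x, seq_y, alphabet="ACGT"):
    """Uncorrected LogDet distance, or None when it does not exist."""
    f = divergence_matrix(seq_x, seq_y, alphabet)
    size = len(alphabet)
    det_f = determinant(f)
    freq_x = [sum(row) for row in f]                                   # row sums
    freq_y = [sum(f[i][j] for i in range(size)) for j in range(size)]  # column sums
    # The logarithm is undefined for a non-positive determinant or a zero
    # marginal frequency: this is the non-existence case.
    if det_f <= 0.0 or any(p <= 0.0 for p in freq_x + freq_y):
        return None
    log_marginals = sum(math.log(p) for p in freq_x + freq_y)
    return -(math.log(det_f) - 0.5 * log_marginals) / size
```

For example, `logdet_distance("AAAA", "AAAA")` returns `None` because the joint frequency matrix is singular, while two identical sequences that contain all four nucleotides give a distance of zero. Short sequences make a non-positive determinant more likely, which is the situation a correction factor is meant to address.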
Keywords/Search Tags: Distance, Corrected, Evolutionary, LogDet, Trees