Statistical inference in population genetics

Posted on: 1998-08-15
Degree: Ph.D.
Type: Dissertation
University: The University of Chicago
Candidate: Pluzhnikov, Anna
Full Text: PDF
GTID: 1460390014474223
Subject: Statistics
Abstract/Summary:
The properties of genealogical trees for samples of DNA sequences are studied using the methods of coalescent theory. The sample is assumed to be taken from a large population with no geographical structure, and the DNA region in question is assumed to evolve neutrally.

Theoretical results include expressions for the covariances between tree lengths of nested subsamples of various structure. An important practical application concerns the statistical properties of two estimators of the scaled mutation rate per site, one based on the number of segregating DNA sites and the other on the average number of pairwise differences between sequences in the sample. The main result concerns the asymptotic behavior of the variance of these estimators when the sample size, n, is held fixed and the sequence length, l, increases. In this setting, in contrast to the traditional one in which l remains fixed and n varies, both estimators are shown to be consistent, with the variance of each decreasing asymptotically at a rate of at least (log l)/l.

Expressions are derived for the variance of the two estimators under various models of recombination. The variance of the average number of pairwise differences is obtained explicitly. For the variance of the total number of segregating sites, a numerical procedure is developed. Based on these results, optimal strategies are derived for choosing the sample size n and the sequence length l so as to minimize the sampling variance, given a fixed total number of nucleotide base pairs to be sequenced. For most parameter values observed in practice, an optimal strategy typically involves sequencing fewer than 10 long copies of the region. The robustness of the procedure under various departures from the initial assumptions is discussed.
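
As an illustration of the two estimators discussed above, the following minimal Python sketch computes them from a set of aligned sequences. It assumes the standard textbook forms, commonly known as Watterson's estimator (segregating sites divided by the harmonic-type sum a_n = 1 + 1/2 + ... + 1/(n-1)) and the pairwise-difference estimator often attributed to Tajima; the abstract does not name them, and the function names and the toy sample are hypothetical and not taken from the dissertation.

```python
from itertools import combinations


def harmonic_sum(n):
    """a_n = sum of 1/i for i = 1 .. n-1 (normalizing constant for S)."""
    return sum(1.0 / i for i in range(1, n))


def segregating_sites(seqs):
    """Count alignment columns in which at least two sequences differ."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def theta_segregating(seqs):
    """Per-site estimate based on the number of segregating sites."""
    n, length = len(seqs), len(seqs[0])
    return segregating_sites(seqs) / (harmonic_sum(n) * length)


def theta_pairwise(seqs):
    """Per-site estimate based on the average number of pairwise differences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / (len(pairs) * length)


# Hypothetical toy sample: n = 4 aligned sequences of length l = 10.
sample = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]

if __name__ == "__main__":
    print("segregating sites:", segregating_sites(sample))
    print("theta (segregating sites):", theta_segregating(sample))
    print("theta (pairwise differences):", theta_pairwise(sample))
```

Both functions divide by the sequence length so that the results are per-site quantities, matching the "scaled mutation rate per site" of the abstract. The sketch says nothing about the variances or the recombination models treated in the dissertation.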
Keywords/Search Tags:DNA, Sample