Topics in statistical genetics and genomics

Posted on: 2003-06-24
Degree: Ph.D.
Type: Thesis
University: University of California, Riverside
Candidate: Luo, Lang
Full Text: PDF
GTID: 2460390011479070
Subject: Biology
Abstract/Summary:
This thesis presents three topics in statistical genetics and genomics: (1) expectation-maximization (EM) estimation of variance components for binary data without a Gibbs sampler, (2) estimation of genetic variances contributed by individual QTL, and (3) testing viability selection using molecular markers.

In Chapter 1, a new EM algorithm for computing maximum likelihood estimates (MLEs) of variance components for binary data is developed. These MLEs are usually calculated using an EM algorithm, and for complicated models the EM is often carried out by means of Markov chain Monte Carlo (MCMC). However, Monte Carlo EM is computationally intensive; even for a small data set, it may take hours to achieve the desired precision of the estimates. Under the latent variable model for binary data, I developed an EM algorithm that can handle arbitrarily complex models without resorting to a Gibbs sampler. This new EM algorithm embeds an internal EM process in each external EM step. The internal EM process calculates the conditional expectation and conditional covariance matrix of the random effects given the variance components and the data, and the external EM step updates the values of the variance components. The new EM algorithm is illustrated through an analysis of the well-known salamander data.

Estimation of genetic variances contributed by individual QTL is presented in Chapter 2. In addition to locating the chromosomal positions of quantitative trait loci (QTL), estimating the sizes of the identified QTL is an important component of QTL mapping. The size of a QTL is usually measured by the variance it contributes. However, this variance may be overestimated in a small line-crossing experiment. This chapter describes a simple method to quantify and correct the bias.
Let $a$ be the additive effect of a QTL in an F2 family, $\sigma_a^2 = a^2/2$ be the additive variance, and $\hat{\sigma}_a^2 = \hat{a}^2/2$ be the maximum likelihood estimate of $\sigma_a^2$. Let $\lambda_a$ denote the likelihood ratio test statistic for the null hypothesis $a = 0$. An asymptotically unbiased estimate of $\sigma_a^2$ is $\hat{\sigma}_a^{2*} = \hat{\sigma}_a^2 (1 - \lambda_a^{-1})$.

Finally, in Chapter 3, a QTL mapping method for viability loci is developed. In genetic mapping experiments, molecular markers often show distorted segregation ratios. We hypothesize that these markers are linked to viability loci (VL) that cause the observed segregation ratios to deviate from Mendelian expectations. Although statistical methods for mapping VL have been developed for line-crossing experiments, methods for VL mapping in outbred populations have not yet been developed. In this study, we develop a method for mapping VL in outbred populations, using a full-sib family as an example. We develop a maximum likelihood (ML) method that uses the observed marker genotypes as data and the proportions of the VL genotypes as parameters. The ML solutions are obtained via the expectation-maximization (EM) algorithm. The application and efficiency of the method are demonstrated and tested using a set of simulated data and data from a four-way cross experiment in the mouse.
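The bias correction stated for Chapter 2 can be illustrated with a minimal Python sketch. The function and argument names are hypothetical and not taken from the thesis; the code only evaluates the formula given in the abstract, taking an estimated additive effect and its likelihood ratio statistic as inputs.

```python
def corrected_additive_variance(a_hat: float, lrt: float) -> float:
    """Bias-corrected additive variance of a QTL in an F2 family.

    a_hat: maximum likelihood estimate of the additive effect a
    lrt:   likelihood ratio test statistic for H0: a = 0 (lambda_a)

    Names are illustrative assumptions; only the formula comes from the abstract.
    """
    if lrt <= 0:
        raise ValueError("likelihood ratio statistic must be positive")
    naive = a_hat ** 2 / 2.0  # uncorrected MLE: a_hat^2 / 2
    # Shrink by (1 - 1/lambda_a). The result is negative when lrt < 1;
    # the abstract does not say how that case is handled, so it is left as-is.
    return naive * (1.0 - 1.0 / lrt)


if __name__ == "__main__":
    print(corrected_additive_variance(2.0, 4.0))
```

With an estimated effect of 2.0 and a test statistic of 4.0, the uncorrected estimate is 2.0 and the corrected estimate is 1.5. The shrinkage factor approaches 1 as the test statistic grows, so strongly supported QTL are barely adjusted.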
Keywords/Search Tags:Variance components, EM algorithm, Data, Genetic, Statistical, QTL, New EM, Maximum likelihood