Probability models for design and analysis of genetic data

Posted on: 2004-08-25
Degree: Ph.D.
Type: Thesis
University: Iowa State University
Candidate: Zhang, Hongmei
Full Text: PDF
GTID: 2468390011477304
Subject: Statistics
Abstract/Summary:
This thesis develops and analyzes probability models for genetic data in two studies. The first study addresses inference and study design for applications in which objects of different types are observed. We apply a Bayesian hierarchical model to estimate the total number of categories in a region, and then use Monte Carlo simulation to design future sampling. Specifically, the simulation determines how large an additional sample is needed to ensure, with a specified confidence, that a given proportion of all categories is collected. We apply the method to DNA sequence data and investigate some important properties of the proposed model through simulations.

The second study uses genetic marker information to identify ancestors of a given individual in the presence of genotyping errors. We extend an existing probability model to calculate the probability that a particular inbred line is an ancestor of a given hybrid while accounting for genotyping errors. A simulation study indicates that ancestry probabilities can be overestimated if misclassification is ignored. The error rate is estimated by maximum likelihood (MLE). The methodology is then applied to simulated data and to a genetic data set containing simple sequence repeat (SSR) marker profiles for maize.
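
The sample-size question in the first study can be illustrated with a minimal sketch. It is not the thesis's procedure: it assumes the category probabilities are known and fixed (in the thesis they would come from the fitted Bayesian hierarchical model), and the names `coverage_after_extra`, `min_extra_sample`, `probs`, `seen`, `target`, and `confidence` are hypothetical. For each candidate extra sample size, it simulates further sampling many times and reports the smallest size at which the target proportion of categories is reached in at least the required fraction of simulations.

```python
import random


def coverage_after_extra(probs, seen, extra, rng):
    """Simulate `extra` further draws; return the fraction of all categories observed."""
    observed = set(seen)
    # Each draw picks a category index with probability given by `probs` (assumed known).
    observed.update(rng.choices(range(len(probs)), weights=probs, k=extra))
    return len(observed) / len(probs)


def min_extra_sample(probs, seen, target, confidence,
                     step=10, max_extra=10_000, n_sims=500, seed=0):
    """Smallest extra sample size (searched in multiples of `step`) for which
    coverage >= target in at least `confidence` of the simulations.
    Returns None if no size up to `max_extra` qualifies."""
    rng = random.Random(seed)
    for extra in range(0, max_extra + 1, step):
        hits = sum(
            coverage_after_extra(probs, seen, extra, rng) >= target
            for _ in range(n_sims)
        )
        if hits / n_sims >= confidence:
            return extra
    return None
```

For example, `min_extra_sample([0.25] * 4, {0}, target=0.5, confidence=0.9)` searches over extra sample sizes 0, 10, 20, and so on. In a fuller treatment the simulation would also propagate uncertainty about the number of categories and their probabilities rather than fixing them.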
Keywords/Search Tags:Genetic, Data, Probability, Model