
Statistical modeling of genomic data: Applications to genetic markers and gene expression

Posted on: 2011-10-07
Degree: Ph.D.
Type: Dissertation
University: The University of Wisconsin - Madison
Candidate: Vazquez, Ana I
Full Text: PDF
GTID: 1448390002958520
Subject: Biology
Abstract/Summary:
The field of genomics is becoming increasingly important across species and biological disciplines. It provides data and tools for investigating the genetic regulation of complex traits, as well as for predicting yet-to-be-observed outcomes, such as disease susceptibility in medicine and the genetic merit of animals and plants in agriculture. However, high-throughput technologies produce massive amounts of information on each individual assayed, giving rise to severe dimensionality issues in statistical learning. Deriving useful inferences and predictions from this "large-p, small-n" type of information constitutes a major challenge. This dissertation focused on the development and evaluation of statistical tools for addressing "large-p, small-n" problems for quantitative traits. Specifically, two problems were addressed: (a) Chapters 2 and 3 considered the prediction of genomic values using molecular marker genotypes. Chapter 2 studied the effect of marker density and marker selection on the accuracy of predictions of breeding values in US Holstein cattle. Chapter 3 offered an application of whole-genome prediction methods to cancer prediction in humans, a field where such techniques have not yet been used extensively. (b) Chapter 4 addressed the problem of drawing inferences from gene expression data and described the development of a strategy for incorporating prior biological knowledge into statistical models. All of these studies dealt with the "curse of dimensionality", using statistical techniques that "reduce" the effective number of model parameters by shrinkage (all chapters), filter parameters (such as the SNP selection strategies evaluated in Chapter 2), or "increase" the amount of information from each replicate by borrowing information across gene effects and incorporating prior knowledge (Chapter 4).
Keywords/Search Tags:Data, Gene, Statistical, Chapter, Information