Bayesian Hierarchical Modeling for Massive Sequence Datasets

Posted on: 2012-05-03
Degree: Ph.D.
Type: Dissertation
University: University of California, Los Angeles
Candidate: Tom, Jennifer Aileen
GTID: 1468390011969025
Subject: Biology
Abstract/Summary:
Rich infectious disease sequencing databases include viruses sampled from different geographical locations, times, viral subtypes, and subjects, inspiring novel biological hypotheses involving correlation, covariates, and their interactions through time. Formally testing these hypotheses, potentially on thousands of taxa, demands statistical ingenuity because the modeling space is too large to explore directly. Ideally, a researcher interested in reconstructing the evolutionary history of a virus identifies the isolates of interest, sequences them, and uses readily available Bayesian software to infer a phylogenetic tree from these sequences. Unfortunately, because of the computational complexity of inferring phylogenies, compounded by the large number of sequences, the researcher is often forced to partition the taxa by a covariate (e.g., sampling location) and run independent, stratified analyses. This stratification, while facilitating fast estimation, results in overparameterization and ignores the correlation between parameters across strata. Additionally, stratification fails to profit from the massive amounts of data available, because parameters are estimated from siloed strata, removed from the implicit context that motivated the initial data collection. Using the intermediate realizations from these stratified analyses, I fit hierarchical models based on importance sampling. This strategy yields improved estimators through shrinkage towards the mean and allows hypotheses to be compared using Bayes factors. I have successfully applied this methodology to unresolved biological hypotheses concerning influenza A using both (1) a mixture model with patterned covariance matrices and (2) a nonparametric wavelet-based model.
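The reweighting idea described in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: it assumes a single scalar parameter per stratum, normal priors, and fixed hierarchical hyperparameters, and every name (`reweight_stratum`, `prior_sd`, `hier_sd`, the simulated draws) is hypothetical. Each stratum's posterior draws, obtained under an independent diffuse prior, are reweighted by the ratio of a shared hierarchical prior to the original prior, which pulls the stratum estimates towards the common mean.

```python
import numpy as np


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution, evaluated elementwise."""
    return -0.5 * np.log(2 * np.pi * sd**2) - 0.5 * ((x - mean) / sd) ** 2


def reweight_stratum(draws, prior_mean, prior_sd, hier_mean, hier_sd):
    """Self-normalized importance-sampling estimate of a posterior mean.

    `draws` are posterior samples obtained under the independent prior
    N(prior_mean, prior_sd^2). The weights are the ratio of the hierarchical
    prior N(hier_mean, hier_sd^2) to that original prior.
    """
    log_w = (normal_logpdf(draws, hier_mean, hier_sd)
             - normal_logpdf(draws, prior_mean, prior_sd))
    log_w -= log_w.max()          # stabilize before exponentiating
    w = np.exp(log_w)
    w /= w.sum()                  # self-normalize the weights
    return float(np.sum(w * draws))


rng = np.random.default_rng(0)

# Simulated stand-ins for posterior draws from three stratified analyses
# (assumed data, for illustration only).
strata = [rng.normal(loc=m, scale=0.5, size=5000) for m in (-1.0, 0.0, 2.0)]

# Stratified estimates ignore the other strata entirely.
stratified = [float(d.mean()) for d in strata]

# Assumed hierarchical prior: centered on the grand mean, with unit spread.
grand_mean = float(np.mean(stratified))
pooled = [reweight_stratum(d, 0.0, 10.0, grand_mean, 1.0) for d in strata]

for s, p in zip(stratified, pooled):
    print(f"stratified {s:+.3f} -> reweighted {p:+.3f}")
```

In a full hierarchical analysis the hyperparameters would themselves be sampled rather than fixed; fixing them here only keeps the sketch short enough to show how shrinkage arises from the importance weights.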
Keywords/Search Tags:Massive