Analysis of high-throughput biological data: Some statistical problems in RNA-seq and mouse genotyping

Posted on: 2010-08-19
Degree: Ph.D
Type: Thesis
University: University of California, Berkeley
Candidate: Taub, Margaret Anne
Full Text: PDF
GTID: 2440390002475743
Subject: Statistics
Abstract/Summary:
The many research areas within high-throughput computational biology provide endless opportunities for methodological contributions by statisticians. In this thesis, we present results in two main areas, one just emerging and one well-established.

In Part I of this thesis, we present new results related to the analysis of high-throughput sequencing data. The last year or so has seen the emergence of many new technologies aimed at enabling the massively parallel sequencing of many DNA molecules. This technological leap forward has enabled scientists to conduct exciting experiments that were impossible with previous technologies, and statisticians are being flooded with new data to analyze. We focus on two analytical problems related to new short-read sequencing technologies, each addressing a different aspect of the goal of quantifying gene expression using sequencing. First, we present a new method for determining which gene a particular sequence fragment originated from, in order to obtain better, unbiased estimates of gene expression. Second, we develop a new empirical Bayes test statistic for measuring differential gene expression between two sequenced samples. Both problems combine fundamental statistical concepts with cutting-edge biology research.

Part II of this thesis focuses on genetic analysis of the mouse as a model organism, a more established area of both biological and statistical inquiry. We present an analysis of the performance of a high-throughput microarray in measuring genotype information in a pooled set of mice, for the purpose of detecting a disease-carrying mutation locus. This problem combines relatively new technological advances with classical theories of linkage analysis.
Keywords/Search Tags: High-throughput, New, Data, Statistical