Augmenting the bootstrap to analyze high-dimensional genomic data

Posted on: 2009-04-16
Degree: Ph.D.
Type: Thesis
University: The Pennsylvania State University
Candidate: Tyekucheva, Svitlana
GTID: 2440390005459348
Subject: Biology
Abstract/Summary:
The data produced by contemporary high-throughput genomic techniques are often high dimensional and undersampled. In these settings, several statistical analyses become problematic. Among these are techniques that require the inversion of variance-covariance matrices, such as those pursuing supervised dimension reduction or the assessment of interdependence structures, and classification and regression techniques prone to overfitting. In this thesis we show how the ideas of bagging and smoothed bootstrap can be used to overcome undersampling and improve the performance of a number of statistical procedures widely used in genomic applications. We investigate the conditions under which our method, which we call augmented bootstrap, improves estimation and demonstrate its performance on simulated data and on data derived from genomic DNA sequences and microarray experiments.
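To make the undersampling problem concrete, here is a minimal sketch of a generic smoothed bootstrap in Python. It is not the thesis's augmented bootstrap, whose details the abstract does not give. The function name `smoothed_bootstrap`, the Gaussian noise kernel, the bandwidth of 0.5, and the sample sizes are all assumptions chosen for illustration. The sketch shows that a sample covariance matrix computed from fewer observations than variables is rank deficient (and so cannot be inverted), while one computed from noise-perturbed resamples has full rank.

```python
import numpy as np


def smoothed_bootstrap(X, n_draws, bandwidth, rng):
    """Resample rows of X with replacement and perturb each draw.

    Hypothetical helper: Gaussian noise with standard deviation
    `bandwidth` is an assumed smoothing kernel, not the thesis's choice.
    """
    n, p = X.shape
    idx = rng.integers(0, n, size=n_draws)            # rows drawn with replacement
    noise = rng.normal(0.0, bandwidth, size=(n_draws, p))
    return X[idx] + noise


rng = np.random.default_rng(0)
n, p = 10, 50                                         # undersampled: n < p
X = rng.normal(size=(n, p))                           # simulated stand-in data

# With n < p the sample covariance has rank at most n - 1.
S_raw = np.cov(X, rowvar=False)
rank_raw = np.linalg.matrix_rank(S_raw)

# Augment the sample with smoothed bootstrap draws and re-estimate.
X_aug = smoothed_bootstrap(X, n_draws=500, bandwidth=0.5, rng=rng)
S_aug = np.cov(X_aug, rowvar=False)
rank_aug = np.linalg.matrix_rank(S_aug)

print(rank_raw, rank_aug)
```

In this sketch the added noise acts much like a ridge term on the covariance estimate, so the bandwidth trades bias for invertibility. When augmentation of this kind actually improves estimation is the question the thesis investigates.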
Keywords/Search Tags: Genomic, Data, Bootstrap