
Developing computational methods for studying nonmodel organism genetics and human disease with next-generation sequencing data

Posted on: 2013-04-21
Degree: Ph.D
Type: Thesis
University: The University of Utah
Candidate: Hu, Hao
GTID: 2454390008487608
Subject: Biology
Abstract/Summary:
The rapidly decreasing cost of sequencing is revolutionizing genetics. Two applications of next-generation sequencing data are of particular importance in this regard. First, high-throughput sequencing now offers a fast and inexpensive means to investigate the genomes and genetics of nonmodel organisms. Second, human personal-genomics data offer a unique opportunity for discovering the genetic basis of human traits and diseases.

My PhD research has focused on developing computational methods to study genetics using next-generation sequencing data. In the first chapter of my thesis, I present a series of genome-based studies of the venomous cone snail Conus bullatus, a source of pharmaceutically important small cysteine-rich peptides called conopeptides or conotoxins. Using high-coverage transcriptome sequence from its venom duct together with low-coverage genomic reads, I have developed new methods to characterize key genomic traits in the absence of a complete reference genome, including genome size, sequence diversity, repeat content and mobile element densities. I have also developed an in silico transcriptomics pipeline for conotoxin discovery, and have used it to identify novel conotoxins as well as candidate enzymes that are likely to be involved in the post-translational processing of conotoxins.

In the second and third chapters of my thesis, I describe VAAST (the Variant Annotation, Analysis and Search Tool), a probabilistic disease-gene search algorithm for finding damaged genes and their disease-causing variants; I also describe a powerful new extension to the original code-base called VAAST 2.0. In these chapters, I demonstrate that VAAST is both an accurate rare Mendelian disease-gene finder and a powerful means for identifying genes and alleles underlying common diseases. I have also carried out systematic population-genetic simulations in order to benchmark the performance of VAAST and VAAST 2.0 under different genetic scenarios, and these demonstrate that VAAST 2.0 is the most robust and broadly applicable method available today for identification of genes involved in common genetic diseases such as breast cancer, hypertriglyceridemia and Crohn disease.
Keywords/Search Tags: Next-generation sequencing, Genetic, Data, VAAST, Methods, Human