
Computational analysis on genomic variation: Detecting and characterizing structural variants in the human genome

Posted on: 2011-03-14
Degree: Ph.D
Type: Thesis
University: Yale University
Candidate: Lam, Hugo Y.K
Full Text: PDF
GTID: 2443390002455949
Subject: Biology
Abstract/Summary:
Genomic variation refers to differences in DNA sequence between two or more individuals. In the past, it was believed that most human sequence variation was attributable to single nucleotide polymorphisms (SNPs), which were estimated to occur every 300 to 1,000 bases on average when comparing two different chromosomes. Nowadays, with the advance of sequencing technology, we are able to reveal a large number of variants of a different kind, called structural variants (SVs). This kind of variation includes genomic rearrangements such as deletions, insertions and inversions, which are usually defined as being >1 kbp in size. SVs have a considerable impact on genomic variation, both by causing more nucleotide differences between individuals than SNPs do and by creating gene duplications or deletions. Even though many recent findings have pointed to the importance of SVs, for example through disease association, the understanding of their formation processes and the ability to identify them are still very limited, which has particularly hampered further large-scale studies. To this end, this thesis carries out a detailed, large-scale computational analysis of genomic variation. It demonstrates a loss-of-function variation analysis across different eukaryotic genomes using a database of pseudogene families and an ontology, which reveals the formational bias of pseudogenes and their relation to other genomic segments such as segmental duplications (SDs). It goes on to investigate the formation mechanisms of SVs by correlating SDs and copy number variants (CNVs) with genomic repeats such as Alu elements. Then, it extends the characterization of SVs by using an SV breakpoint library and reveals their formational biases. Finally, it introduces a novel computational approach for reliably and efficiently identifying SVs in a newly sequenced personal genome.
Keywords/Search Tags: Variation, Computational, SVs