Statistical Analysis of Haplotypes, Untyped SNPs, and CNVs in Genome-Wide Association Studies

Posted on:2012-07-25

Degree:Ph.D

Type:Dissertation

University:The University of North Carolina at Chapel Hill

Candidate:Hu, Yijuan

GTID:1453390008993036

Subject:Biology

Abstract/Summary:

Missing data arise in genetic association studies when one is interested in assessing the effects of haplotypes, untyped single nucleotide polymorphisms (SNPs), or copy number variants (CNVs). Haplotypes are combinations of nucleotides at multiple loci along individual homologous chromosomes, and the use of haplotypes tends to yield more efficient analysis of disease association than the use of individual SNPs. Untyped SNPs are SNPs that are not on the genotyping chips used in the study (i.e., missing on all study subjects), and the analysis of untyped SNPs can facilitate localization of disease-causing variants and permit meta-analysis of association studies with different genotyping platforms. A CNV refers to the duplication or deletion of a segment of DNA sequence compared to a reference genome assembly, and can play a causal role in genetic diseases.

In the first part of the proposal, we provide a general likelihood-based framework for making inference on the effects of haplotypes or untyped SNPs and their interactions with environmental variables. Unlike most of the existing methods, we allow genetic and environmental variables to be correlated. We show that the maximum likelihood estimators are consistent, asymptotically normal, and asymptotically efficient, and we develop EM algorithms to implement the corresponding inference procedures. We conduct extensive simulation studies and apply the methods to a genome-wide association study (GWAS) of lung cancer.

In the second part, we focus on comparing two approaches to the analysis of untyped SNPs. The maximum likelihood approach integrates prediction of untyped genotypes and estimation of association parameters into a single framework and yields consistent and efficient estimators of genetic effects and gene-environment interactions with proper variance estimators. The imputation approach is a two-stage strategy that first imputes the untyped genotypes by either the most likely genotypes or the expected genotype counts and then uses the imputed values in downstream association analysis. We conduct extensive simulation studies to compare the bias, type I error, power, and confidence interval coverage of the two methods under various situations. In addition, we provide an illustration with genome-wide data from the Wellcome Trust Case-Control Consortium (WTCCC).

In the third part, we present a general framework for the integrated analysis of CNVs and SNPs in association studies, including the analysis of total copy number as a special case. We use allele-specific copy numbers (ASCNs) to describe both the copy number and allelic variations of a locus, and we formulate the joint effects of CNVs and SNPs on the disease in terms of ASCNs. Our approach combines the ASCN calling and association analysis into a single step while allowing for differential errors. We construct likelihood functions that properly account for the case-control sampling and measurement errors. We establish the asymptotic properties of the maximum likelihood estimators and develop EM algorithms to implement the proposed inference procedures. The advantages of the proposed methods over the existing ones are demonstrated through realistic simulation studies and an application to a GWAS of schizophrenia.
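As a rough illustration of the first stage of the imputation approach described in the second part, the sketch below computes the two imputed summaries named there (the most likely genotype and the expected genotype count) from per-subject posterior genotype probabilities. It is a minimal sketch, not the dissertation's implementation: the function name `impute_genotypes` and the example probabilities are hypothetical, and the posterior probabilities are assumed to have been produced already by some reference-panel-based prediction step.

```python
from typing import List, Sequence, Tuple


def impute_genotypes(
    posteriors: Sequence[Sequence[float]],
) -> Tuple[List[int], List[float]]:
    """Summarize posterior genotype probabilities for one untyped SNP.

    Each row holds P(G = 0), P(G = 1), P(G = 2) for one subject, where G is
    the number of copies of the coded allele. Returns the most likely
    genotype and the expected genotype count (dosage) per subject.
    """
    best_guess: List[int] = []
    dosage: List[float] = []
    for probs in posteriors:
        if len(probs) != 3:
            raise ValueError("each row needs exactly three probabilities")
        if abs(sum(probs) - 1.0) > 1e-6:
            raise ValueError("genotype probabilities must sum to 1")
        # Most likely genotype: the count with the largest posterior probability.
        best_guess.append(max(range(3), key=lambda g: probs[g]))
        # Expected genotype count: sum over g of g * P(G = g).
        dosage.append(sum(g * p for g, p in enumerate(probs)))
    return best_guess, dosage


# Hypothetical posterior probabilities for three subjects.
posteriors = [
    (0.90, 0.09, 0.01),
    (0.40, 0.35, 0.25),
    (0.05, 0.15, 0.80),
]
best_guess, dosage = impute_genotypes(posteriors)
print(best_guess)
print([round(d, 2) for d in dosage])
```

For the second subject the most likely genotype is 0 even though the expected count is close to 1, which shows how the two imputed values can differ when the prediction is uncertain. Either value would then be used as a covariate in the second-stage association model, which treats it as if it were an observed genotype.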

Keywords/Search Tags:

Studies, Association, Untyped, Haplotypes, CNVs, Genome-wide, Genetic, Effects

Related items

1	Genome-wide Association Study, Copy Number Variations And Selection Signatures Detect For Sheep With Different Types Of Tails
2	Genome-wide Association Studies For Pig Meat Traits And Exploration Of Major Genes
3	SNP-set Tests for Sequencing and Genome-Wide Association Studies
4	Development Of Methods For Non-additive Genome-wide Association Studies And Its Application In Pigs, Rats And Mice
5	Genome-wide association study of conotruncal and related cardiac malformations
6	Phenotyping And Genome-wide Association Studies Of Important Agronomic Traits In Cassava (Manihot Esculenta Crantz)
7	Genome Wide Association Studies For Reproductive Traits And CNV Detection In Pigs
8	Evaluation Of Genetic Diversity And Genome-wide Association Studies Of Important Traits In Chinese Fir
9	Genome- and Transcriptome-wide Association Studies Provide Insights Into The Genetic Basis Of Natural Variation Of Thousand-seed Weight In Brassica Napus
10	A Restricted Two-stage Approach For Genome-wide Association Studies In Inbreeding Crops And Its Application And Software Development