Statistical Methods For Testing Gene-Environment Interactions With Rare Variants

Posted on: 2024-05-01
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Q Jin
GTID: 1520307340473824
Subject: Communication and Information System
Abstract/Summary:
Genome-wide association studies (GWASs) have successfully identified thousands of common variants associated with complex diseases. These variants, however, explain only a small part of the heritability of the diseases; the remaining unexplained heritability is called the "missing heritability". The development of sequencing technology and the reduction of sequencing costs have made it possible to study rare variants. Researchers have gradually shifted their focus from common variants, with minor allele frequencies (MAFs) greater than 1%, to rare variants, with MAFs less than 1%. Recent studies have shown that rare variants play an important role in explaining the "missing heritability". Human complex diseases are usually the result of genetic factors, environmental factors and the interactions between them. Studying gene-environment interactions (GEIs) can reveal the pathogenesis of complex diseases more accurately and is important for exploring their etiology.

To date, many statistical methods for testing GEI effects with rare variants have been proposed. However, their statistical power is usually limited, partly due to the following issues. First, because rare variants have low frequencies, most existing GEI tests are gene-based in order to enrich the association signals. In general, most rare variants in a gene have no effect on the disease of interest, and only a very small number are possibly pathogenic. Combining all rare variants in the gene therefore introduces noise, which reduces the statistical power. Second, compared with association tests for common variants, GEI tests with rare variants require much larger sample sizes. Most existing GEI tests with rare variants were proposed for analyzing a single study, which has a limited sample size, so the tests have low statistical power. Third, a GEI may simultaneously affect multiple phenotypes, which is known as GEI pleiotropy. Jointly analyzing multiple phenotypes can provide higher statistical power. However, existing GEI tests are still limited to a single phenotype, and so have low statistical power. To overcome these issues, this work focuses on developing efficient GEI tests with rare variants. The main contributions are as follows.

Firstly, the aggregated Cauchy association test (ACAT) demonstrates high statistical power when only a small number of rare variants in a gene are pathogenic, and ACAT does not need to account for the correlation between the P values of the variants. In this work, ACAT is extended to GEI analysis. Cauchy GEI tests with fixed and random main effects are proposed by using ACAT to combine variant-level P values from testing GEI; the two tests calculate the significance of a gene by selecting the smallest P value in the single-variant GEI analysis. By using ACAT to combine P values from multiple GEI tests, an omnibus Cauchy GEI test is also proposed, which combines the strengths of those tests. In simulation studies, the proposed Cauchy GEI tests show higher statistical power than existing GEI tests when the pathogenic rare variants in a gene are sparse. Even when the pathogenic rare variants in a gene are dense, the proposed Cauchy GEI tests still have statistical power similar to that of the existing GEI tests. In all considered simulation scenarios, the omnibus Cauchy GEI test has the highest statistical power. The proposed Cauchy GEI tests are applied in genome-wide analyses of blood pressure (BP) phenotypes to detect gene-body mass index (BMI) interactions, using the whole-exome sequencing data from the UK Biobank (UKB). At the suggestive significance level of 1.0×10⁻⁴, KCNC4, GAR1, FAM120AOS and NT5C3B were identified as showing interactions with BMI.

Secondly, single-study GEI tests have low statistical power because of their limited sample sizes. Meta-analysis, which combines results from multiple studies, can increase the effective sample size and enhance the statistical power. In this work, meta-analysis tests for GEI effects with rare variants are proposed by combining summary statistics from multiple studies, covering the scenarios in which main effects are fixed or random and GEI effects among studies are homogeneous or heterogeneous. In simulations, when homogeneous GEI effects are present across studies, the proposed homogeneous GEI meta-analysis tests have the same statistical power as the pooled analysis, which combines individual-level raw data from multiple studies. When heterogeneous GEI effects are present across studies, the proposed heterogeneous GEI meta-analysis tests handle the heterogeneity properly and show higher statistical power than the pooled analysis. The proposed GEI meta-analysis tests are applied to analyze gene-age interactions in BP phenotypes with the whole-exome sequencing data in UKB. At the nominal significance level of 0.05, CDC25A and C10orf107 were identified as showing interactions with age.

Finally, because GEI pleiotropy exists, simultaneously analyzing multiple correlated phenotypes can provide higher statistical power than single-phenotype analysis. Based on four phenotype kernels that model different correlation structures for GEI effects on multiple phenotypes, four multiphenotype GEI tests are proposed. In simulations, the proposed multiphenotype GEI tests have higher statistical power in the presence of pleiotropy than in its absence, and their power increases as the proportion of causal variants increases. In addition, as the correlation among phenotypes increases, all of the proposed tests except the one with the linear phenotype kernel show improved statistical power. Across all considered simulation scenarios, the multiphenotype GEI tests with the heterogeneous phenotype kernel and the projection phenotype kernel are robust and have higher statistical power than the other two tests. These two efficient multiphenotype GEI tests are applied in the genome-wide analysis of gene-hemoglobin (Hb) interactions for BP phenotypes with the whole-exome sequencing data in UKB. At the genome-wide significance level of 2.5×10⁻⁶, LEUTX was identified as being associated with BP phenotypes through its interaction with Hb. The two tests were also applied to analyze gene-Hb interactions for 22 BP-related genes. At the nominal significance level of 0.05, an interaction between MYO1C and Hb was identified.
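
The ACAT combination step that the Cauchy GEI tests build on can be sketched as follows. This is a minimal illustration of the ACAT statistic as it is generally published, not the dissertation's implementation: each P value is mapped to a Cauchy variable, the weighted mean is taken, and the mean is mapped back to a P value. The function name `acat_combine`, the equal default weights and the small-P cutoff `1e-15` are assumptions made for this example. Because the tangent transform grows very quickly as a P value approaches zero, the combined result is driven mainly by the smallest P values, which is consistent with the good power reported for sparse pathogenic variants.

```python
import math


def acat_combine(p_values, weights=None):
    """Combine P values with the aggregated Cauchy association test (ACAT).

    Hypothetical helper for illustration only.
    T = sum_i w_i * tan((0.5 - p_i) * pi) / sum_i w_i
    p = 0.5 - arctan(T) / pi
    """
    if not p_values:
        raise ValueError("p_values must not be empty")
    if weights is None:
        # Assumption: equal weights when none are supplied.
        weights = [1.0] * len(p_values)
    if len(weights) != len(p_values):
        raise ValueError("weights and p_values must have the same length")
    if any(not (0.0 < p < 1.0) for p in p_values):
        raise ValueError("every P value must lie strictly between 0 and 1")
    if any(w < 0.0 for w in weights) or sum(weights) <= 0.0:
        raise ValueError("weights must be non-negative with a positive sum")

    total_weight = sum(weights)
    statistic = 0.0
    for p, w in zip(p_values, weights):
        if p < 1e-15:
            # tan((0.5 - p) * pi) ~ 1 / (p * pi) for tiny p; avoids precision loss.
            statistic += w / (p * math.pi)
        else:
            statistic += w * math.tan((0.5 - p) * math.pi)
    statistic /= total_weight

    if statistic > 0.0:
        # Same value as 0.5 - atan(T) / pi, but accurate when T is very large.
        return math.atan(1.0 / statistic) / math.pi
    return 0.5 - math.atan(statistic) / math.pi


if __name__ == "__main__":
    # One strong variant-level signal among otherwise null P values.
    variant_p = [1e-6, 0.5, 0.5, 0.5, 0.5]
    print(acat_combine(variant_p))
```

With equal weights, one small P value among four null values of 0.5 yields a combined P value of roughly five times the smallest one, so a single strong variant is diluted far less than it would be by averaging the test statistics of all variants in the gene.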
Keywords/Search Tags: Gene-environment interaction, Rare variant, Aggregated Cauchy association test, Pleiotropy, Meta-analysis, Blood pressure, Whole-exome sequencing, UK Biobank