Multi-omics Data Based Studies Of Mapping And Parsing For Metabolic Syndrome And Metabolic Component Associated Genetic Variants

Posted on: 2018-09-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Zhou
GTID: 1314330515961088
Subject: Pathology and pathophysiology
Abstract/Summary:
Metabolic syndrome (MetS), a cluster of metabolic disturbances, mainly involves central obesity and disorders of glucose and lipid metabolism. Studies have shown that MetS increases the risk of cardiovascular disease and diabetes. According to the definition of the Chinese Diabetes Society, the prevalence of MetS in the Chinese population was 17.6% in 2010. Twin studies have indicated that genetic variation plays an important role in the development of metabolic disorders. Some loci associated with metabolic disorders have been identified by conventional genome-wide association studies (GWAS). However, two problems remain unresolved: (1) these loci explain only a small proportion of the estimated heritability, so a large proportion of the heritability is "missing"; (2) most GWAS report only the association between a tag SNP and the phenotype, leaving a huge "black box" behind the association. Subsequent studies have shown that most of the reported genes are not the true regulatory mediators between genotype and phenotype. Recently, IRX3 was shown to be the mediator gene for the obesity-associated locus at FTO. Limited parsing of GWAS signals and the difficulty of localizing key genes and causal variants obstruct functional studies.

Recent studies have shown that most trait-associated loci affect the phenotype through transcriptional regulation. This means that combining GWAS signals with expression quantitative trait locus (eQTL) signals can improve the ability to discover true associations. The ENCODE, GTEx, and Roadmap projects have recently released extensive omics data covering gene expression, DNA methylation, histone modification, and transcription factor binding sites. Using these annotations, reported genotype-phenotype association signals can be fine-mapped. The majority of GWAS have been performed in Europeans, and only a limited number of loci have been mapped in Han Chinese. Considering the difference in genetic background, we performed an annotation-based
GWAS using omics data to map and explain variants associated with metabolic disorders. The current study consists of three parts (Figure 1):

Part I: A conventional genome-wide association study for metabolic syndrome and metabolic components was conducted. Variants showing the top signals were replicated in multiple cohorts. SNP function prediction and analyses of gene-environment interactions were performed.

Part II: An annotation-based genome-wide association study for glucose and lipid metabolic disorders was performed after the conventional strategy. Multi-omics data were used to annotate the regulatory flow of the genotype-phenotype associations. Fine-mapping and functional studies were conducted to parse the underlying mechanism.

Part III: A systematic annotation was carried out for previously reported glucose- and lipid-associated loci. The multi-omics data based analysis unravels the "black box" behind the associations and helps to better understand these signals.

Part I: Genetic susceptibility study for metabolic syndrome

Purpose: To screen and validate loci associated with MetS and metabolic disorders in Han Chinese.

Materials and Methods: Using single nucleotide polymorphisms (SNPs) as markers for common variants, a GWAS was performed for MetS and metabolic disorders in 1742 subjects from Xiaoshan, Hangzhou. Loci with the top association signals from the GWAS were selected for replication in independent cohorts (10978 samples in total) from multiple regions (eastern, northern, and northeastern) of China. After replication, we performed SNP function prediction and gene-environment interaction analyses for loci that reached genome-wide significance.

Results: In combined analyses, the genotypes of rs651821 on APOA5 and of an East-Asian-specific common variant, rs671 on ALDH2, were found to be associated with MetS. Independent of the top known signal at rs651821 on APOA5, a novel secondary triglyceride-associated signal at rs180326 on BUD13 (Pcombined = 2.4E-08) was identified. Notably, by an integrated analysis of the
genotypes and the serum levels of APOA5, BUD13, and triglyceride, we observed that BUD13 was another potential mediator, besides APOA5, of the association between rs651821 and serum triglyceride. Interactions between rs671 and alcohol consumption were observed for MetS. The effects of rs671 on metabolic components were more prominent in drinkers than in non-drinkers.

Brief summary: (1) rs651821 (APOA5) and rs671 (ALDH2) were found to be associated with MetS in Han Chinese. (2) A novel secondary signal was observed at rs180326 (BUD13) after controlling for the top signal rs651821 (APOA5) at the APOA cluster. (3) Interactions between rs671 and alcohol consumption levels were uncovered for MetS and metabolic components.

Part II: Annotation-based genome-wide screening and fine-mapping for loci associated with glucose and lipid metabolism

Purpose: Based on the association study and multi-omics annotations, we aimed to further explore loci associated with glucose and lipid metabolism, perform fine-mapping analysis, and validate the regulation of gene expression for the novel loci.

Materials and Methods: First, a GWAS was performed for glucose and lipid metabolic phenotypes in 1742 samples (the same samples used in Part I). Second, we co-localized the GWAS signals with eQTL signals in adipose tissue, liver, pancreas, and skeletal muscle. Then, trait-associated eQTLs were replicated in independent cohorts. To infer the candidate causal gene, gene-based analyses were conducted for SNPs that reached genome-wide significance after replication. Fine-mapping was then conducted for the lead SNP and the indicated gene, using chromatin states, histone modifications, and transcription factor binding data from the ENCODE and Roadmap projects as annotations. Finally, luciferase reporter assays were performed for the potential causal variants.

Results: The co-localization of GWAS and eQTL signals indicated 22 loci associated with glucose and lipid metabolic traits in the discovery stage. After replication, a serum high-density lipoprotein cholesterol (HDL-C) associated
variant rs1880118 was identified (Pcombined = 1.4E-10). The SNP rs1880118 mainly captured two genes, DAGLB and RAC1. In the additive model, rs1880118 was associated with DAGLB (diacylglycerol lipase, beta) expression levels in adipose tissue (P = 5.9E-42) and explained 47.7% of the expression variance. Gene-based expression-trait association tests then revealed a significant association between DAGLB and serum HDL-C levels using the Mendelian-randomization-based approaches "TWAS" (P = 3.0E-8), "SMR" (P = 1.1E-4), and "Sherlock" (P = 1.6E-6). An active transcriptional region near the 5' end of DAGLB was revealed by analysis of transcription factor binding sites, DNase I hypersensitive sites (DHS), and the histone modification signals H3K27ac, H3K4me3, and H3K9ac. We then found that the segment containing the minor allele of rs4724806 (r2 = 0.77 with rs1880118) showed increased transcriptional activity compared with the segment containing the major allele, which was consistent with the eQTL analyses.

Brief summary: A novel HDL-C-associated variant, rs1880118, was replicated in Han Chinese. The causal variant rs4724806 (r2 = 0.77 with rs1880118) regulated the transcriptional activity of DAGLB. The current study indicates a role for DAGLB in lipid metabolism.

Part III: Systematic parsing of loci associated with glucose and lipid metabolism using multi-omics data

Purpose: To systematically parse previously reported loci associated with glucose and lipid metabolism using multi-omics data as annotation, to explore the regulatory flow from genotype to phenotype, and to narrow the gap between GWAS and functional studies.

Materials and Methods: First, loci associated with glucose and lipid metabolism in the NHGRI-EBI GWAS Catalog were retrieved and filtered for analysis. Second, all variants that could be tagged by the reported lead SNPs were imputed using 1000 Genomes Project data. Then, a systematic annotation was performed for each variant. Annotations included tissue-specific gene expression, chromatin states, DNA methylation, histone modification, transcription factor binding sites, etc. For variants in the
coding region, additional annotations, including cross-species conservation and post-translational modification, were used to parse the GWAS signals.

Results: After filtering, 592 loci associated with glucose or lipid metabolism were retained. In total, 17646 variants were included after imputation (LD r2 > 0.5). At the transcriptional level, we observed 104 eQTLs associated with glucose or lipid levels in adipose tissue, liver, pancreatic islets, or skeletal muscle. Context-dependent eQTLs were also observed: a significant association between the rs702485 genotype and the DAGLB expression level was revealed only after LPS stimulation (Pbefore > 0.05, Pafter = 2.52E-16). In addition, 133 methylation quantitative trait loci (mQTLs) were annotated in adipose tissue or pancreatic islets. Some of these mQTLs were associated with DNA methylation levels at multiple CpG sites. Annotation also showed that 49 variants could tag one or more loci located in histone modification signal peaks in relevant tissues. At the translation level, 122 lead SNPs (r2 > 0.5) or 43 lead SNPs (r2 > 0.8) could tag one or more non-synonymous mutations. Among these mutations, 16 were annotated as "damaging" by both SIFT and PolyPhen scores. Fine-mapping was performed for the glucose metabolism associated variant rs1535500 on KCNK16. The G-to-T change at rs1535500 results in the loss of seven CpG sites near the CpG island at the 5' end of KCNK17. Integrating all of the data, we observed a three-way association among rs1535500, DNA methylation, and KCNK17 expression in pancreatic islets.

Brief summary: (1) Using omics data, loci associated with glucose or lipid metabolism were re-parsed. A number of genes and regulatory elements were annotated between the lead SNP and the phenotype. As with other complex traits, only one-third of the genes we annotated were consistent with previously reported genes. (2) Only 7%-20% of lead SNPs could tag one or more non-synonymous variants; the other lead SNPs probably affect the phenotype via transcriptional regulation. (3) The glucose
metabolism associated variant rs1535500 was fine-mapped. A three-way association among rs1535500, DNA methylation, and KCNK17 expression in pancreatic islets was uncovered.

Conclusions: From the three parts above, the following conclusions can be drawn: (1) The SNPs rs651821 on APOA5 and rs671 on ALDH2 were found to be associated with genetic susceptibility to MetS in Han Chinese; a novel secondary signal was observed at rs180326 on BUD13 after controlling for the top signal rs651821; interactions between rs671 and alcohol consumption levels were uncovered. (2) Annotation for transcriptional regulation could improve the effectiveness of the discovery stage of GWAS; a novel HDL-C-associated variant, rs1880118, was validated. The causal variant rs4724806 (r2 = 0.77 with rs1880118) regulated the transcriptional activity of DAGLB. (3) More than 80% of the loci associated with glucose or lipid metabolism regulate the phenotype mainly at the transcriptional level; omics-data-based re-parsing indicated that only one-third of the genes were consistent with genes previously reported by GWAS. For example, the glucose metabolism associated variant rs1535500 on KCNK16 was annotated as being associated with KCNK17 expression via regulation of DNA methylation.
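The abstract relies on two r2-type quantities: the linkage disequilibrium between a lead SNP and a tagged variant (for example, r2 = 0.77, or the tagging thresholds r2 > 0.5 and r2 > 0.8), and the share of expression variance explained by a SNP under the additive model (47.7% for DAGLB). The following is a minimal sketch, not the pipeline used in the dissertation, of how both can be approximated as a squared Pearson correlation. It assumes genotypes coded as minor-allele dosages (0, 1, 2), an unadjusted single-SNP linear model, and dosage-based rather than haplotype-based LD; the function names and all data values are hypothetical.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return (sxy * sxy) / (sxx * syy)


def tags(lead_dosages, other_dosages, r2_threshold=0.5):
    """True if the lead SNP tags the other variant at the given r2 threshold.

    Dosage-based r2 is only an approximation of haplotype-based LD r2.
    """
    return squared_correlation(lead_dosages, other_dosages) > r2_threshold


if __name__ == "__main__":
    # Hypothetical toy data: minor-allele dosages for ten individuals.
    lead = [0, 0, 1, 1, 2, 2, 0, 1, 1, 2]
    proxy = [0, 0, 1, 1, 2, 2, 0, 1, 2, 2]
    # Hypothetical normalized expression values for the same individuals.
    expression = [4.9, 5.2, 6.1, 5.8, 7.2, 6.9, 5.0, 6.3, 5.7, 7.1]

    # LD between the two variants, and whether the lead SNP tags the proxy.
    print("LD r2:", squared_correlation(lead, proxy))
    print("tagged at r2 > 0.5:", tags(lead, proxy, 0.5))

    # In a single-predictor additive model, the fraction of expression
    # variance explained equals the squared correlation with dosage.
    print("variance explained:", squared_correlation(lead, expression))
```

With covariates such as age, sex, or ancestry components in the model, the variance explained would instead come from the fitted regression, so the last value is only the unadjusted analogue of the reported figure.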
Keywords/Search Tags: Metabolic syndrome, Genome-wide association study, Expression quantitative trait loci, Multi-omics data based annotation, Fine-mapping