
Study On Identifying Shared Risk Genes Of Complex Correlated Diseases Using Association Analysis Methods

Posted on: 2021-09-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H P Guo
GTID: 1484306521469714
Subject: Statistics
Abstract/Summary:
Genome-wide association study (GWAS) has become an essential technology for exploring the genetic mechanisms of complex traits. Over the past decade, GWAS methods have developed from the initial single-locus, single-trait analysis to multi-locus and multi-trait analysis. However, GWAS results still explain only a small part of heritability, so research on GWAS methodology remains of great significance. Clinical and epidemiological studies show that complex related diseases often occur in the same person or in different members of the same family, but the genetic mechanism underlying this comorbidity has not been well elucidated. With the rapid development and falling cost of high-throughput sequencing technology, large samples are now used in GWAS of complex diseases. Large-sample GWAS bring higher statistical power, and their results provide a reference for further study of complex diseases. Large-scale association analysis based on GWAS results can therefore help us understand the common genetic mechanism of complex related diseases.

Because the statistical power of multi-locus GWAS methods can be further improved, this thesis proposes a two-stage mutual-information-based Bayesian Lasso method (MBLASSO). To verify its performance, MBLASSO was compared with three other mainstream methods on three different simulation datasets. MBLASSO improved both the statistical power and the accuracy of effect estimation. In an analysis of four flowering-time-related traits of Arabidopsis thaliana, MBLASSO also gave the best model fit and the highest accuracy of detected associations, and 21 genes were detected only by MBLASSO. The method provides an effective tool for GWAS research.

Because little is known about the shared genetic mechanism of asthma, hay fever and eczema, this thesis proposes that integrating multi-trait and multi-omic association analyses can identify more shared risk genes. Large-scale GWAS results for asthma, hay fever and eczema were selected for the study. First, a multi-trait association study identified 66 pleiotropic genes. Then multi-omic analyses (i.e., genome-wide and transcriptome-wide gene-based tests) were used to detect genes associated with each of the three diseases. Finally, 150 shared risk genes were identified, 60 of which were novel. Functional enrichment analysis revealed that the shared risk genes are enriched in inflammatory bowel disease (hsa05321), Th17 cell differentiation (hsa04659), Th1 and Th2 cell differentiation (hsa04658) and other related biological pathways. These findings may help inform the clinical treatment of asthma, hay fever and eczema.

Because the genetic mechanism shared between nonalcoholic fatty liver disease (NAFLD) and metabolic traits is unknown, this thesis estimated the genetic correlation between NAFLD and metabolic traits and used large-scale association studies, such as cross-trait meta-analysis, to identify the shared risk genes. Specifically, GWAS results for NAFLD and nine metabolic traits were selected. Genetic correlation analysis showed that obesity and type II diabetes have significant genetic correlations with NAFLD. A multi-trait association study was performed on NAFLD, obesity and type II diabetes to increase the overall statistical power. Cross-trait meta-analysis identified 104 pleiotropic variants involving 122 genes, and genome-wide gene-based analysis identified 6 genes associated with the three traits. In total, 124 shared risk genes were enriched in 12 biological pathways, including Pathways in cancer (hsa05200), Maturity onset diabetes of the young (hsa04950), Colorectal cancer (hsa05210) and Insulin secretion (hsa04911). This study provides a basis for further understanding the relationship between NAFLD and metabolic traits.

In summary, this thesis focuses on the methodology of association studies and its application to identifying the shared genetic mechanism of complex related diseases. In terms of methodology, the proposed multi-locus GWAS method (MBLASSO) can improve statistical power and effect estimation accuracy. In terms of application, multi-trait and multi-omic analyses can identify more shared risk genes associated with asthma, hay fever and eczema, and large-scale cross-trait meta-analysis and gene-based analysis can be used to identify the shared risk genes between NAFLD and its related metabolic traits.

The innovations of this thesis are as follows:
· It was proposed that combining Pearson correlation and mutual information screening can improve the statistical power and effect estimation accuracy in GWAS.
· 150 shared risk genes of asthma, hay fever and eczema were detected by integrating multi-trait and multi-omic analyses, 60 of which were newly found.
· It was found that NAFLD has significant genetic correlations with obesity and type II diabetes. The overall statistical power was improved by multi-trait association analysis, and 124 shared risk genes were identified by large-scale cross-trait association analysis and gene-based analysis.
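
The screening idea named in the first innovation (Pearson correlation combined with mutual information) can be illustrated with a minimal sketch. The abstract does not give the algorithmic details of MBLASSO, so everything below is an assumption made for illustration: the function names, the equal-frequency binning of the phenotype, the rule of keeping the union of the top-k markers under each score, and the toy data. The second stage (the Bayesian Lasso fit on the retained markers) is not shown.

```python
import math
from collections import Counter

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant sequence carries no linear signal
    return sxy / math.sqrt(sxx * syy)

def mutual_information(x, y):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def discretize(values, bins=3):
    """Equal-frequency binning by rank (an assumed choice, not from the thesis)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, i in enumerate(order):
        labels[i] = rank * bins // len(values)
    return labels

def screen_markers(genotypes, phenotype, top_k):
    """Keep the union of the top_k markers by |Pearson r| and by mutual information."""
    y_bins = discretize(phenotype)
    r_scores = [abs(pearson(g, phenotype)) for g in genotypes]
    mi_scores = [mutual_information(g, y_bins) for g in genotypes]

    def top(scores):
        ranked = sorted(range(len(scores)), key=lambda j: -scores[j])
        return set(ranked[:top_k])

    return sorted(top(r_scores) | top(mi_scores))

# Toy data (hypothetical): 9 individuals, 3 markers coded 0/1/2.
phenotype = [0.1, 0.2, 1.1, 1.0, 2.1, 2.2, 0.0, 1.2, 2.0]
genotypes = [
    [0, 0, 1, 1, 2, 2, 0, 1, 1],  # roughly additive (linear) effect
    [1, 1, 2, 2, 0, 0, 1, 2, 0],  # non-monotone effect, weaker linear signal
    [1, 0, 1, 0, 1, 0, 1, 0, 1],  # unrelated to the phenotype
]
selected = screen_markers(genotypes, phenotype, top_k=1)
```

In this toy example the first marker ranks highest on Pearson correlation and the second ranks highest on mutual information, so the two scores retain different markers. That is the motivation for combining them, though the actual screening rule used in the thesis may differ.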
Keywords/Search Tags: association analysis, complex diseases, genetic correlation, shared risk genes, biological pathway