
Identification Of Common Risk Genes And Biological Pathways In Seven Autoimmune Diseases Using MetaCCA Method

Posted on: 2020-10-28  Degree: Doctor  Type: Dissertation
Country: China  Candidate: X C Jia  Full Text: PDF
GTID: 1364330575453030  Subject: Epidemiology and Health Statistics
Abstract/Summary:
With changes in lifestyle, the ecological environment and the disease spectrum, complex diseases have gradually become the leading threat to human health in recent years. Autoimmune diseases are a common concern of clinicians, geneticists and bioinformaticians because of their diverse clinical manifestations, complex pathogenesis, high heritability and familial aggregation. Genome-wide association studies (GWAS) have been widely used to identify potentially causal or risk-conferring genetic variants for common human diseases from individual-level measurements, and have made great progress in revealing genes related to the occurrence, development and treatment of autoimmune diseases. However, this univariate approach has had limited success in detecting complex genotype-phenotype correlations because it ignores the correlation between different disease phenotypes. Existing GWAS and bioinformatics studies have shown not only that genetic susceptibility to autoimmune diseases is affected by multiple genes, but also that several autoimmune diseases share common risk genes and biological mechanisms. How to identify these shared genetic and biological mechanisms by multivariate statistical analysis of massive GWAS data has therefore become a research hotspot. This study aims to identify shared risk genes and biological pathways in seven autoimmune diseases: celiac disease (CEL), inflammatory bowel disease (IBD, which includes Crohn's disease (CRO) and ulcerative colitis (UC)), multiple sclerosis (MS), primary biliary cirrhosis (PBC), rheumatoid arthritis (RA), systemic lupus erythematosus (SLE) and type 1 diabetes (T1D), based on publicly available GWAS summary statistics.

Objective
1. To establish a multivariate statistical model for analyzing the common risk genes among seven autoimmune diseases based on publicly available GWAS summary statistics, providing a reference for genome-wide multivariate statistical analysis and further biological experiments.
2. To analyze the biological pathways and protein interactions of the common risk genes in autoimmune diseases using STRING, Enrichr, DAVID and other online databases, providing a biological basis for understanding the occurrence, development and treatment of these diseases.

Methods
1. The GWAS summary statistics of the seven autoimmune diseases were downloaded from ImmunoBase. PLINK and R were used for data consolidation, gene annotation and single-nucleotide polymorphism (SNP) pruning according to the 1000 Genomes datasets.
2. A summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis (metaCCA) was established to identify the common risk genes and loci among the seven autoimmune diseases.
3. Gene-based association analysis with the versatile gene-based association study (VEGAS) method, applied to the original GWAS summary statistics, was used to identify the association of each gene with a specific disease.
4. Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis and protein-protein interaction (PPI) network analysis were used to identify the common biological pathways in autoimmune diseases.

Results
1. Basic information of GWAS datasets
SNPs with large pairwise correlations were removed using a linkage disequilibrium threshold of r² > 0.2. After gene annotation and SNP pruning, 41,274 SNPs located in 11,516 gene regions were available. In the single-disease GWAS analysis, at the threshold of P < 5×10⁻⁸, 4 SNPs were associated with CEL, 55 with IBD, 6 with MS, 9 with PBC, 11 with RA, 10 with SLE and 12 with T1D.

2. Identification of common risk genes in autoimmune diseases
In the univariate SNP-multivariate phenotype metaCCA analysis, 4,962 SNPs reached the Bonferroni-corrected threshold (P < 1.21×10⁻⁶), and the canonical correlation r between each SNP and the phenotypes ranged from 0.0372 to 0.6586. In the multivariate SNP-multivariate phenotype metaCCA analysis, 1,044 genes passed the significance threshold (P < 4.34×10⁻⁶) and were identified as potential pleiotropic genes; the canonical correlation r between genotype and phenotype ranged from 0.0322 to 0.5899. In the gene-based association analysis with VEGAS2, 9 genes were identified for CEL, 111 for IBD, 18 for MS, 21 for PBC, 20 for RA, 20 for SLE and 33 for T1D at P < 1.0×10⁻⁶. Screening the gene-based p-values yielded 67 common risk genes that were significant in the metaCCA analysis and associated with at least one disease in the VEGAS2 analysis (P_metaCCA < 4.34×10⁻⁶ and P_VEGAS2 < 1.0×10⁻⁶). When these 67 genes were looked up in ImmunoBase, the GWAS Catalog and Web of Science, 27 had previously been reported to be associated with more than one of the seven diseases, 16 had been reported to be associated with only one autoimmune disease, and the remaining 24 had never been reported to be associated with any autoimmune disease.

3. Identification of common biological pathways in autoimmune diseases
When the 67 common risk genes were used as the gene set for GO term enrichment analysis, 6 significant molecular function terms and 62 significant biological process terms were enriched. The 6 significant molecular function terms were growth hormone receptor binding (GO:0005131), kinase activity (GO:0016301), protein kinase activity (GO:0004672), MAP kinase kinase kinase kinase activity (GO:0008349), phosphotransferase activity, alcohol group as acceptor (GO:0016773) and protein tyrosine kinase activity (GO:0004713). The top five significant GO biological process terms were positive regulation of gene expression (GO:0010628), interleukin-23-mediated signaling pathway (GO:0038155), activation of protein kinase activity (GO:0032147), cellular response to cytokine stimulus (GO:0071345) and regulation of tyrosine phosphorylation of STAT protein (GO:0042509). The KEGG analysis showed that the 67 risk genes were enriched in 5 pathways: JAK-STAT signaling pathway (hsa04630), Toxoplasmosis (hsa05145), Longevity regulating pathway - multiple species (hsa04213), Measles (hsa05162) and Leishmaniasis (hsa05140). Twelve common risk genes were involved in these five pathways. The JAK-STAT signaling pathway (hsa04630) contained the most enriched genes, including IL23R, TYK2, JAK2, PTPN2 and IL22RA2. PPI network analysis of the 67 common risk genes showed that 10 encoded proteins interacted strongly with other proteins (more than 5 interaction nodes each), and that 15 protein interaction nodes had a combined score above 0.9. Eight genes (FGF2, IL23R, IRF1, ITGAM, JAK2, PTPN2, TNFAIP3 and TYK2) had both more than 5 interaction nodes and a combined interaction score above 0.9.

Conclusions
1. MetaCCA can analyze the correlation between multivariate phenotypes and multivariate genotypes, with the advantages of high throughput, low cost and no need to pre-select candidate genes. It can therefore identify the genetic variation underlying autoimmune diseases efficiently.
2. This study validated 27 genes already identified as common risk genes in previous studies of different types and identified 40 novel common risk genes in autoimmune diseases, providing a reference for genome-wide multivariate statistical analysis and further biological experiments.
3. TYK2 and JAK2 play a key role in the development of autoimmune diseases. The JAK-STAT signaling pathway (hsa04630) is recognized as an important biological pathway in the occurrence of autoimmune diseases. Two novel common risk genes, FGF2 and ITGAM, are significantly enriched in several biological pathways and protein interactions, which may provide insights into the biological mechanism of autoimmune diseases.
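
The significance thresholds and the two-stage gene screen reported in the Results can be illustrated with a minimal Python sketch. It is not the author's pipeline. It assumes a family-wise alpha of 0.05, which the abstract does not state but which reproduces the reported thresholds when divided by the 41,274 SNPs and 11,516 gene regions. The function names and the toy p-values for `GENE_A`, `GENE_B` and `GENE_C` are hypothetical and serve only to show the selection rule (significant in metaCCA and in at least one disease-specific VEGAS2 test).

```python
from typing import Dict, List


def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Bonferroni-corrected per-test threshold (alpha = 0.05 is an assumption)."""
    return alpha / n_tests


# Counts taken from the abstract: SNPs and gene regions left after pruning.
snp_threshold = bonferroni_threshold(41_274)
gene_threshold = bonferroni_threshold(11_516)
print(f"{snp_threshold:.2e}")   # 1.21e-06
print(f"{gene_threshold:.2e}")  # 4.34e-06


def common_risk_genes(
    metacca_p: Dict[str, float],
    vegas_p: Dict[str, Dict[str, float]],
    metacca_cutoff: float,
    vegas_cutoff: float = 1.0e-6,
) -> List[str]:
    """Keep genes below the metaCCA cutoff that are also below the
    VEGAS2 cutoff for at least one disease."""
    selected = []
    for gene, p_meta in metacca_p.items():
        if p_meta >= metacca_cutoff:
            continue  # not significant in the multivariate analysis
        per_disease = vegas_p.get(gene, {})
        if any(p < vegas_cutoff for p in per_disease.values()):
            selected.append(gene)
    return sorted(selected)


# Hypothetical toy p-values, made up for illustration only.
toy_metacca = {"GENE_A": 1e-9, "GENE_B": 1e-8, "GENE_C": 1e-4}
toy_vegas = {
    "GENE_A": {"IBD": 2e-7, "RA": 0.3},  # passes both stages
    "GENE_B": {"MS": 1e-3},              # fails the VEGAS2 stage
    "GENE_C": {"T1D": 1e-8},             # fails the metaCCA stage
}
result = common_risk_genes(toy_metacca, toy_vegas, gene_threshold)
print(result)  # ['GENE_A']
```

The screen requires both conditions, so a gene that is strongly associated with a single disease but not significant in the multivariate analysis (like the toy `GENE_C`) is excluded.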
Keywords/Search Tags:Autoimmune diseases, GWAS, MetaCCA, Common risk genes, Pleiotropic genes