Multilocus Association Analysis For Case-control Study Incorporating X Chromosome Inactivation

Posted on: 2022-04-17 | Degree: Master | Type: Thesis
Country: China | Candidate: B H Li | Full Text: PDF
GTID: 2480306335482964 | Subject: Public Health
Abstract/Summary:
Background: Only a few genome-wide association studies (GWAS) focus on the X chromosome. Although X-chromosome genes account for only about 5% of the human genome, mutations on the X chromosome often have a greater impact on complex diseases than those on autosomes. Several biological features distinguish the X chromosome from autosomes, such as dosage compensation, X chromosome inactivation (XCI), and the difference in the number of X-chromosome copies between the sexes. X-chromosomal association analyses are therefore more complicated than autosomal ones. Some X-chromosomal association methods have been proposed that account for XCI or dosage compensation, but they are mostly based on single loci. When many loci are analyzed, linkage disequilibrium (LD) exists among them, and single-locus analyses lose power because of multiple comparisons. It is therefore essential to develop association analyses based on multiple loci. Several X-chromosomal multilocus association tests exist for case-control data, but they consider only the different patterns of XCI or the effect of dosage compensation and do not incorporate the LD information among loci. For case-control data, no association analysis has been found that simultaneously considers XCI and the LD information among multiple loci. Hence, a method that concurrently considers different patterns of XCI and the LD information among multiple loci is needed.

Objective: To propose, for case-control data, an association analysis that simultaneously considers different patterns of XCI and the LD information among multiple loci.

Methods: (1) Based on the prevalence, multilocus association tests (Qf and Qm) are proposed for females and males, respectively. (2) We account for the difference in the number of X chromosomes between the sexes and for the linkage disequilibrium among multiple loci. Based on the LD information among multiple loci, association tests (Tf and Tm) are proposed for females and males, respectively. To calculate the LD among multiple loci, we use the EM algorithm to estimate the haplotype frequencies of males and females from external data. (3) A grid search is used to construct score test statistics from the female data under different inactivation patterns, and the Cauchy method is used to combine the P values of these test statistics. The Fisher method is then used to combine the P values of the test statistics for females and males, giving two multilocus, XCI-based combined association tests, Q and T, which are based on the estimated prevalence and the estimated LD, respectively. For comparison, we also consider the test statistics based on the true prevalence and the true LD value, denoted QR and TR, respectively. Finally, simulation studies are used to verify the robustness and validity of the proposed multilocus methods. (4) The proposed methods are applied to data from the Minnesota Center for Twin and Family Research to demonstrate their practical use.

Results: In most cases, Q, QR, T and TR all control the type I error rate well, with rates ranging from 0.036 to 0.063. The powers of T and TR are higher than those of Q and QR; the power of TR is slightly higher than that of T, and the power of QR is slightly higher than that of Q. The power of Q, QR, T and TR improves as the sample size increases, as the proportion of risk loci increases, and as the ratio of cases to controls becomes more balanced. As the LD between loci increases, the four proposed methods become less powerful. In the real data application, the LD-based method identified a haplotype block (from SNP rs7063887 to SNP rs12841491) that is associated with the alcohol consumption composite score (P value = 0.0378) at the 5% significance level.

Conclusion: In this thesis, we propose association tests for case-control data that simultaneously consider different patterns of XCI and the LD information between loci, with one method based on the prevalence and the other based on the LD information between loci. We recommend the LD-based method and a balanced design with a 1:1 ratio of cases to controls.
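
The P-value combination in step (3) of the Methods can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names `cauchy_combine` and `fisher_combine`, the equal default weights, and all P values in the usage lines are assumptions made for illustration. The sketch applies the standard Cauchy combination formula to the female P values obtained under different inactivation patterns, then applies Fisher's method to the female and male P values.

```python
import math


def cauchy_combine(p_values, weights=None):
    """Cauchy combination of P values (equal weights assumed by default)."""
    k = len(p_values)
    if weights is None:
        weights = [1.0 / k] * k
    # Each P value is mapped to a standard Cauchy variate and averaged.
    t = sum(w * math.tan((0.5 - p) * math.pi)
            for w, p in zip(weights, p_values))
    # Tail probability of the standard Cauchy distribution at the weighted mean.
    return 0.5 - math.atan(t / sum(weights)) / math.pi


def fisher_combine(p_values):
    """Fisher's method: X = -2 * sum(ln p) is chi-square with 2k df under H0."""
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)  # equals X / 2
    # Closed-form chi-square survival function for even degrees of freedom 2k:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical P values, for illustration only (not results from the thesis).
p_female_by_pattern = [0.04, 0.01, 0.20]  # one per assumed inactivation pattern
p_male = 0.03

p_female = cauchy_combine(p_female_by_pattern)
p_overall = fisher_combine([p_female, p_male])
print(p_female, p_overall)
```

Both functions assume P values strictly between 0 and 1; values of exactly 0 or 1 would need to be handled before calling them. Fisher's method assumes independent inputs, which is reasonable for P values computed from separate female and male samples, whereas the Cauchy combination is used for the female tests because statistics computed from the same data under different inactivation patterns are correlated.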
Keywords/Search Tags: X chromosome inactivation, Genome-wide association analysis, Multilocus, Case-control study