
Comparative Analysis On Two-stage Sequence Kernel Association Test Methods In Genetic Association Study

Posted on: 2022-03-01, Degree: Master, Type: Thesis
Country: China, Candidate: W P Zhao
GTID: 2480306335954699, Subject: Biology
Abstract/Summary:
Genome-wide association studies have become a popular tool for identifying genetic variants associated with disease risk. An incorrectly specified genetic model can cause a serious loss of test power, so there is a pressing need for an association test that does not assume a genetic model and retains satisfactory statistical power across a range of genetic models. A review of the literature shows that a two-phase sequence kernel association test (tpSKAT) has been proposed, which can effectively identify genetic variants significantly associated with complex diseases under a case-control design. The choice of kernel function in tpSKAT affects the statistical power of the test, and that choice is the subject of this thesis. We consider the most commonly used linear kernel function, the quadratic kernel function, and the original IBS kernel function, and compare the statistical power of tpSKAT under these three kernels, aiming to identify the kernel that gives the best statistical efficiency and robustness.

First, the test is designed as follows: in the first phase, the genetic model is determined on the basis of the Hardy-Weinberg equilibrium test; in the second phase, the sequence kernel association test is carried out with the chosen kernel function under the genetic model determined in the first phase. Because real genetic disease data sets are difficult to obtain and the computational burden is heavy, this thesis uses a series of parameter settings to randomly generate genotype data with compound-symmetry and AR(1) correlation structures, and the simulation study of tpSKAT is carried out on these data.

Second, the type I error rates of tpSKAT under the three kernel functions are simulated. The values obtained are very close to the pre-specified significance level of 0.05, which indicates that tpSKAT controls the type I error well regardless of the choice of kernel function.

Finally, five different scenarios are used to simulate the statistical power of tpSKAT under the three kernel functions. The following conclusions are drawn:
(1) Under the recessive and dominant genetic models, tpSKAT is effective with all three kernel functions, but its power is reduced under the additive model, so its performance under the additive genetic model is not ideal.
(2) Under the recessive genetic model, when the two causal loci have the same effect, the statistical power of tpSKAT under the three kernel functions is much higher than when the two causal loci have opposite effects.
(3) Under the recessive genetic model, when the complex trait depends linearly on the genetic variants, the linear kernel function gives tpSKAT its best performance. Under the dominant genetic model, tpSKAT with the IBS kernel function is the most effective.
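For reference, the three kernel functions compared in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: it assumes genotypes coded as minor-allele counts (0, 1, 2) and uses the unweighted forms of the linear, quadratic, and IBS kernels as they are commonly written in the sequence kernel association test literature. The function names are hypothetical.

```python
# Illustrative sketch only. Genotypes are assumed to be coded 0/1/2
# (minor-allele counts); kernels are the unweighted textbook forms.
# Function names are hypothetical, not taken from the thesis.

def linear_kernel(g1, g2):
    """Linear kernel: inner product of two genotype vectors."""
    return sum(a * b for a, b in zip(g1, g2))


def quadratic_kernel(g1, g2):
    """Quadratic kernel: (1 + inner product) squared."""
    return (1 + linear_kernel(g1, g2)) ** 2


def ibs_kernel(g1, g2):
    """IBS kernel: proportion of alleles shared identical-by-state.

    At each locus, two subjects share 2 - |a - b| alleles out of 2.
    """
    p = len(g1)
    return sum(2 - abs(a - b) for a, b in zip(g1, g2)) / (2 * p)


def kernel_matrix(genotypes, kernel):
    """n x n similarity matrix over all pairs of subjects."""
    return [[kernel(gi, gj) for gj in genotypes] for gi in genotypes]


# Three subjects genotyped at three loci.
genotypes = [[0, 1, 2], [2, 1, 0], [0, 1, 2]]

print(linear_kernel(genotypes[0], genotypes[1]))     # 1
print(quadratic_kernel(genotypes[0], genotypes[1]))  # 4
print(ibs_kernel(genotypes[0], genotypes[2]))        # 1.0
```

The linear kernel only captures additive similarity between subjects, the quadratic kernel additionally allows for pairwise interaction terms, and the IBS kernel measures allele sharing without assuming a functional form, which is one reason the choice of kernel can change the power of the test.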
Keywords/Search Tags: Genome-wide association study, Two-phase sequence kernel association test, Genetic model, Kernel function, Simulation comparative study