
Application Research Of Latent Class Model In Rare Variants Association Studies

Posted on: 2016-09-25  Degree: Master  Type: Thesis
Country: China  Candidate: T Bu  Full Text: PDF
GTID: 2284330479989582  Subject: Epidemiology and Health Statistics
Abstract/Summary:
Background Genome-wide association studies (GWAS) have been widely used to reveal the genetic architecture of complex diseases and quantitative traits. These studies systematically evaluate the association between common variants (MAF > 5%) and complex diseases. GWAS have identified more than 2000 common variants, and these findings provide new clues for research into disease mechanisms. However, these common variants explain only a small proportion of genetic risk. In response, some researchers have proposed the concept of "missing heritability": low-frequency or rare variants may explain part of the additional disease risk, and there is evidence that low-frequency and rare variants are associated with disease. The rapid development of next-generation sequencing technologies offers unprecedented opportunities for studying rare variants, but it also brings new challenges for statistical analysis. New statistical methods are therefore needed to identify disease-associated variants. This study extends the latent class model to the analysis of genetic data in order to provide statistical support for future genetic analyses.

Methods The data for this study come from the GAW17 data set created for the Texas Medical Research Center Genetic Analysis Workshop 17. The variant information in this data set comes from 697 multi-racial individuals, and the quantitative and binary traits were simulated 200 times based on the sequence data. Rare variants within the same gene were collapsed into a new variable; linear regression (for quantitative traits) and logistic regression (for binary traits) were then applied to analyze the association between traits and genes, and the power and type I error of these two methods were calculated.
For pathway analysis, latent class analysis (LCA) and latent class factor analysis (LCFA) were applied, the association between quantitative traits and the pathway was analyzed using linear regression, and the type I error and power of the two methods were calculated and compared.

Results Gene-based analysis: for quantitative traits, LCA has high power and can classify individuals when the effects of rare and common variants are strong, and it is not influenced by the number of non-effect variants; when the variant effects are weak, LCA has low power and the model fails to converge. Compared with PCC, LCA has a lower type I error. Both PCC and LCA have low power for binary traits associated with low-effect variants, but LCA still has the advantage of a lower type I error. Pathway-based analysis: LCA has high power (1.000) and low type I error (0.030) when analyzing a whole pathway. For LCFA, the pathway was grouped into three latent factors. The power for the three latent factors is 0.595, 1.000 and 0.980, respectively, and the corresponding type I error is 0.070, 0.040 and 0.045.

Conclusions LCA and LCFA can be combined with a rare-variant set strategy to analyze rare variant associations by constructing latent variables from genetic variant data. Gene-based analysis: for quantitative traits with a sample size of 697, LCA can identify strong-effect rare and common variants even when the variant set contains more non-effect variants. However, LCA is more likely to fail to converge when the variants have weak effects, and for binary traits with weak effects LCA has low power. Pathway-based analysis: LCFA can classify heterogeneous groups and, at the same time, identify the latent factors of a pathway by grouping genetic variants with shared characteristics into latent factors, which provides a reference for the study of biological mechanisms.
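
The gene-based step described in the Methods (collapse the rare variants of a gene into one variable, regress a quantitative trait on it, and estimate power and type I error over repeated simulations) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration, not the thesis's implementation: the carrier-indicator collapsing rule, the `maf_threshold` of 0.01, the function names, the toy genotypes (not the GAW17 data), and the normal approximation used in place of the exact t-test. The latent class models themselves are not shown.

```python
import math
import random

def collapse_rare_variants(genotypes, mafs, maf_threshold=0.01):
    """Collapse a gene's rare variants into one 0/1 carrier indicator per person.

    genotypes: list of rows, one per individual, holding minor-allele counts (0/1/2).
    mafs: minor allele frequency of each variant column.
    maf_threshold: assumed cutoff for "rare" (hypothetical, not from the source).
    """
    rare = [j for j, maf in enumerate(mafs) if maf < maf_threshold]
    return [1 if any(row[j] > 0 for j in rare) else 0 for row in genotypes]

def slope_p_value(x, y):
    """Two-sided p-value for the slope of a simple linear regression of y on x.

    Uses a normal approximation to the t statistic, which is reasonable
    for samples of several hundred individuals.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:  # no variation in the collapsed variable: nothing to test
        return 1.0
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    intercept = mean_y - beta * mean_x
    rss = sum((yi - intercept - beta * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:  # perfect fit
        return 0.0
    return math.erfc(abs(beta / se) / math.sqrt(2))

def rejection_rate(genotypes, mafs, effect, n_reps=200, alpha=0.05, seed=0):
    """Fraction of simulated replicates in which the gene is declared associated.

    With effect != 0 this estimates power; with effect == 0 it estimates
    the type I error. The trait model (effect * carrier + N(0, 1) noise)
    is a toy assumption.
    """
    rng = random.Random(seed)
    x = collapse_rare_variants(genotypes, mafs)
    hits = 0
    for _ in range(n_reps):
        y = [effect * xi + rng.gauss(0.0, 1.0) for xi in x]
        if slope_p_value(x, y) < alpha:
            hits += 1
    return hits / n_reps

if __name__ == "__main__":
    rng = random.Random(1)
    mafs = [0.005] * 10  # ten rare variants in one hypothetical gene
    # 697 individuals, two alleles each, drawn independently per variant
    genotypes = [[sum(rng.random() < m for _ in range(2)) for m in mafs]
                 for _ in range(697)]
    print("estimated power:", rejection_rate(genotypes, mafs, effect=1.0))
    print("estimated type I error:", rejection_rate(genotypes, mafs, effect=0.0))
```

For a binary trait, the same collapsed variable would be entered into a logistic regression instead; a statistics library would normally be used for that fit and for exact t-based p-values.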
Keywords/Search Tags: Latent class analysis, Latent class factor analysis, Gene-based analysis, Pathway-based analysis