Research On The Impact Of Gene-gene And Gene-environment Interaction On Complex Diseases

Posted on: 2020-02-12
Degree: Master
Type: Thesis
Country: China
Candidate: X H Zhao
GTID: 2430330572479819
Subject: Applied Statistics
Abstract/Summary:
Genome-wide association studies (GWAS) have identified many genetic variants associated with complex human diseases. However, with the rapid development of second-generation sequencing technology, the volume of genetic data has grown geometrically. Because genetic data are characterized by small sample sizes, high dimensionality and heavy noise, studying gene-gene and gene-environment interactions with traditional statistical methods is time-consuming, laborious and unsatisfactory. Statistical methods therefore need to be combined with data mining techniques in order to analyze gene-gene and gene-environment interactions accurately in the study of complex diseases. In this paper, the real GAW17 dataset is used to study gene-gene and gene-environment interactions with a two-stage method. In the first stage, the Lasso algorithm is used to reduce the dimensionality of the gene data from chromosome 1 of the GAW17 data, filtering the variables to obtain the significant main-effect variables and environmental variables. In the second stage, a random forest model and a support vector machine model are built on the main effects obtained by the Lasso algorithm, the parameters of the two models are optimized separately, and relatively significant gene-gene and gene-environment interactions are obtained. The random forest model and the support vector machine model are then compared on a series of evaluation indexes. The results show that the support vector machine model performs better than the random forest model. Compared with previous studies, the AUC values of both models established in this paper are higher than those of the weighted empirical Bayes method and the joint covariance model, which shows that the models built in this paper have good application value.
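The first-stage screening and the AUC index mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the data are synthetic toy genotypes (not GAW17), a continuous toy phenotype stands in for disease status, the Lasso is solved by plain coordinate descent on a squared-error loss, and every name here (`soft_threshold`, `lasso_coordinate_descent`, `auc`, the penalty `alpha=0.3`) is an assumption chosen for illustration. The second-stage random forest and support vector machine models are not reproduced.

```python
import random
from itertools import combinations


def soft_threshold(z, t):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=100):
    """Minimise (1/2n)*||y - Xb||^2 + alpha*||b||_1 on centred data."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    cols = [[row[j] - col_means[j] for row in X] for j in range(p)]
    resid = [v - y_mean for v in y]  # residual when all coefficients are zero
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            xj = cols[j]
            norm = sum(v * v for v in xj) / n
            if norm == 0.0:
                continue  # a constant column carries no signal
            # Covariance of x_j with the partial residual (x_j's own term added back).
            rho = sum(xj[i] * resid[i] for i in range(n)) / n + norm * beta[j]
            new_b = soft_threshold(rho, alpha) / norm
            delta = new_b - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * xj[i]
                beta[j] = new_b
    return beta


def auc(labels, scores):
    """Probability that a random case outscores a random control (ties count 1/2)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in data (assumption): 300 samples, 10 variants coded 0/1/2,
# with only variants 0 and 3 carrying a main effect on the toy phenotype.
random.seed(0)
n, p = 300, 10
X = [[random.choice([0, 1, 2]) for _ in range(p)] for _ in range(n)]
y = [2.0 * r[0] + 1.5 * r[3] + random.gauss(0.0, 0.1) for r in X]

beta = lasso_coordinate_descent(X, y, alpha=0.3)
selected = [j for j, b in enumerate(beta) if b != 0.0]  # stage-1 survivors
pairs = list(combinations(selected, 2))  # candidate pairs for interaction analysis
print(selected, pairs)
```

Variables whose coefficients are shrunk exactly to zero are dropped, and only the surviving variables (and pairs built from them) would be passed to a second-stage classifier. The `auc` helper shows the rank-based definition of the index on which the models in the abstract are compared.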
Keywords/Search Tags: GAW17, Interaction effects, Two-stage method, Lasso algorithm, Random forest, Support vector machine