High Dimensional Semi-parametric Gene Association Analysis Based On Gaussian Mixture Model

Posted on: 2022-11-26
Degree: Master
Type: Thesis
Country: China
Candidate: F Y Chen
GTID: 2480306770478414
Subject: Policy and Law Research of Medicine and Sanitation
Abstract/Summary:
Genome-wide association analysis (GWAS) is an important method for studying complex diseases; it aims to detect disease-related single nucleotide polymorphisms (SNPs) on a genome-wide scale. Because of the large scale of the data involved, some studies have used high-dimensional statistical tools to screen for disease-causing genes. As data complexity has increased, some scholars have further taken the heterogeneity of the data into account and proposed using the Gaussian mixture model for gene association analysis, but such studies have not fully considered the uncertainty of the genetic model. In practice the true genetic model is often unknown, and using the wrong model may reduce detection efficiency. In addition, in gene association analysis only some gene variables may affect the disease; that is, the high-dimensional gene data are sparse. Moreover, disease data are often collected together with non-genetic variables such as height, weight, and age. These variables may also affect the disease, and a nonlinear relationship can be used to describe their effect. Therefore, this thesis conducts gene association analysis based on the Gaussian mixture model: it accounts for the uncertainty of the genetic model, describes the influence of the non-genetic variables on the disease through nonlinear functions approximated with B-spline basis functions, and constructs a semi-parametric additive model under genetic model uncertainty. To screen for pathogenic genes, a penalized likelihood method based on the Gaussian mixture model is used to perform variable selection on the high-dimensional gene data and thereby identify pathogenic gene loci. Numerical simulations show that the variable selection method based on the Gaussian mixture model performs well. Finally, a summary and outlook are given for the proposed method.
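
To make two of the ingredients named in the abstract more concrete, the following is a minimal Python sketch, not the thesis's implementation. It assumes that "genetic model uncertainty" refers to the common additive, dominant, and recessive codings of a SNP genotype (the abstract does not say which models are considered), and it evaluates a B-spline basis for one non-genetic covariate with the standard Cox-de Boor recursion. The function names `code_genotype`, `bspline_basis`, and `design_row`, and the knot vector used, are hypothetical and chosen only for illustration.

```python
def code_genotype(g, model):
    """Map a minor-allele count g in {0, 1, 2} to a covariate value.

    The three codings are an assumption about what the candidate
    genetic models are; the abstract does not list them.
    """
    if model == "additive":
        return float(g)
    if model == "dominant":
        return 1.0 if g >= 1 else 0.0
    if model == "recessive":
        return 1.0 if g == 2 else 0.0
    raise ValueError(f"unknown genetic model: {model}")


def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion)."""
    # Degree-0 functions are indicators of the half-open knot intervals.
    b = [1.0 if knots[i] <= x < knots[i + 1] else 0.0
         for i in range(len(knots) - 1)]
    # Close the right end: assign x == last knot to the last non-empty interval.
    if x == knots[-1]:
        for i in range(len(knots) - 2, -1, -1):
            if knots[i] < knots[i + 1]:
                b[i] = 1.0
                break
    # Raise the degree one step at a time; zero-width spans contribute 0.
    for d in range(1, degree + 1):
        nxt = []
        for i in range(len(knots) - d - 1):
            left_den = knots[i + d] - knots[i]
            right_den = knots[i + d + 1] - knots[i + 1]
            left = (x - knots[i]) / left_den * b[i] if left_den > 0 else 0.0
            right = ((knots[i + d + 1] - x) / right_den * b[i + 1]
                     if right_den > 0 else 0.0)
            nxt.append(left + right)
        b = nxt
    return b


def design_row(genotypes, model, covariate, knots, degree=3):
    """One design-matrix row: coded SNPs followed by spline features."""
    snp_part = [code_genotype(g, model) for g in genotypes]
    return snp_part + bspline_basis(covariate, knots, degree)


# Hypothetical clamped cubic knot vector on [0, 1] with one interior knot.
KNOTS = [0.0, 0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 1.0, 1.0]
row = design_row([0, 1, 2], "dominant", 0.3, KNOTS)
```

Here `row` holds three coded SNP values followed by five spline features for a covariate rescaled to [0, 1]. In a penalized-likelihood fit of the kind the abstract describes, a sparsity penalty would typically be applied to the SNP coefficients; the specific penalty and fitting procedure are not stated in the abstract and are not shown here.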
Keywords/Search Tags:gene model, B-splines, Gaussian mixture model, variable selection