Three-Stage Design of Genome-Wide Association Analysis for High-Dimensional Complex Data

Posted on: 2021-07-21 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Zhou | Full Text: PDF
GTID: 2480306293956069 | Subject: Applied Statistics
Abstract/Summary:
Genome-wide association analysis (GWAS) is among the most effective methods for studying complex human diseases from the perspective of genetic variation. It uses single nucleotide polymorphisms (SNPs) across the genome as molecular genetic markers for genome-wide comparative or association analysis. Early studies directly analyzed association under a given genetic model in order to screen for significant loci. To reduce cost and improve efficiency, researchers proposed a two-stage design: the first stage selects a set of candidate loci, and the second stage studies only the loci selected in the first stage. However, most two-stage analyses assume that the genetic model is known and do not consider the case in which it is unknown, which leads to unstable statistical power. To account for uncertainty in the genetic model, this thesis proposes a three-stage design of genome-wide association analysis for high-dimensional complex data. First, a linear model representing the genetic model is introduced. Variable selection is then applied to this model: the high-dimensional variable selection method SCAD performs a preliminary screening of the large number of loci (that is, dimension reduction), least-squares regression then identifies the genetic model, and finally linear regression yields the parameter estimates for the selected loci. Computer simulation results show that the proposed three-stage design is more effective than the traditional two-stage design when the genetic model is uncertain. In general, the proposed three-stage design can be used to screen for disease-causing genes when the genetic model is uncertain.
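The three stages described in the abstract can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the thesis's implementation: the abstract does not give the exact linear model, tuning parameters, or identification rule. Here it is assumed that stage 1 screens SNPs with a SCAD-penalized regression on additive (0/1/2) genotype codes fitted by coordinate descent, that stage 2 picks, for each retained SNP, whichever of the additive, dominant, or recessive codings gives the smallest least-squares residual sum of squares, and that stage 3 refits ordinary least squares on the retained SNPs under their identified codings. All function names, the penalty level `lam`, and the simulated data are hypothetical.

```python
import numpy as np

# Assumed genotype codings for the three classical genetic models.
CODINGS = {
    "additive": lambda g: g.astype(float),
    "dominant": lambda g: (g >= 1).astype(float),
    "recessive": lambda g: (g == 2).astype(float),
}


def scad_threshold(z, lam, a=3.7):
    """Univariate SCAD solution for a standardized predictor."""
    az = abs(z)
    if az <= 2 * lam:
        return np.sign(z) * max(az - lam, 0.0)  # soft-thresholding zone
    if az <= a * lam:
        return ((a - 1) * z - np.sign(z) * a * lam) / (a - 2)
    return z  # large effects are left unshrunk


def scad_select(G, y, lam, a=3.7, max_iter=200, tol=1e-6):
    """Stage 1 (assumed form): SCAD screening by coordinate descent."""
    n, p = G.shape
    sd = G.std(axis=0)
    sd[sd == 0] = 1.0  # guard against monomorphic SNPs
    X = (G - G.mean(axis=0)) / sd
    resid = y - y.mean()
    beta = np.zeros(p)
    for _ in range(max_iter):
        biggest = 0.0
        for j in range(p):
            z = X[:, j] @ resid / n + beta[j]
            delta = scad_threshold(z, lam, a) - beta[j]
            if delta != 0.0:
                resid -= delta * X[:, j]
                beta[j] += delta
                biggest = max(biggest, abs(delta))
        if biggest < tol:
            break
    return np.flatnonzero(beta)


def ols_rss(x, y):
    """Residual sum of squares of y regressed on an intercept and x."""
    A = np.column_stack([np.ones(len(y)), x])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.sum((y - A @ coef) ** 2))


def identify_model(g, y):
    """Stage 2 (assumed rule): coding with the smallest least-squares RSS."""
    rss = {name: ols_rss(code(g), y) for name, code in CODINGS.items()}
    return min(rss, key=rss.get)


def three_stage(G, y, lam):
    """Run screening, model identification, and the final OLS refit."""
    selected = scad_select(G, y, lam)
    models = [identify_model(G[:, j], y) for j in selected]
    cols = [CODINGS[m](G[:, j]) for j, m in zip(selected, models)]
    A = np.column_stack([np.ones(len(y))] + cols)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # coef[0] is the intercept
    return selected, models, coef


if __name__ == "__main__":
    # Hypothetical simulated data: 3 causal SNPs, one per genetic model.
    rng = np.random.default_rng(0)
    G = rng.binomial(2, 0.4, size=(300, 40))
    y = (1.5 * (G[:, 0] >= 1) + 1.5 * G[:, 1]
         + 1.5 * (G[:, 2] == 2) + rng.normal(size=300))
    selected, models, coef = three_stage(G, y, lam=0.3)
    for j, m, b in zip(selected, models, coef[1:]):
        print(f"SNP {j}: model={m}, estimate={b:.2f}")
```

In practice `lam` would be chosen by cross-validation or an information criterion, and screening on additive codes alone may miss loci with a purely recessive effect; both choices are simplifications made for this sketch.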
Keywords/Search Tags: three-stage design, genetic model, least-squares estimation, variable selection, SCAD