Conditional Nonparametric Independence Screening And Its Application In Genetic Data

Posted on: 2022-10-03
Degree: Master
Type: Thesis
Country: China
Candidate: P Xu
GTID: 2480306485984049
Subject: Statistics
Abstract/Summary:
In genetic research, genome-wide association analysis (GWAS) is a useful method for analyzing the relationship between complex diseases and genes. It aims to identify the genes associated with a specific disease. Since Klein et al. identified genes affecting macular degeneration through genetic association analysis in 2005, GWAS has helped scientists successfully screen out gene positions related to complex genetic diseases such as coronary heart disease, obesity, type 2 diabetes, and schizophrenia. As genotyping technology has matured, it has become possible to obtain data on thousands of gene loci. This leads to high-dimensional data, which poses a major challenge to traditional statistical theories and methods. In fact, gene-gene interactions and gene-environment interactions (with factors such as age and gender) have a strong effect on disease. Traditional methods consider only the effect of a single gene locus on the disease. Ignoring these interactions may lead to misjudgments and reduce the accuracy of genetic screening. At the same time, it is difficult to measure the genotype at each locus accurately during gene sequencing. Often only the probabilities of the three genotypes at a locus are known, and the true genotype cannot be obtained. Conventional correlation analysis relies mainly on linear models, whose assumed model structure limits their scope of application. Nonparametric models do not require an assumed model structure and are therefore more widely applicable. This paper therefore considers the setting in which some important variables are already known, uses a nonparametric additive model to screen data with uncertain genotypes for pathogenic genes, and proposes a conditional nonparametric independence screening (CNIS) method. Under appropriate conditions, we prove that the first-stage screening of the method is screening-consistent and retains the important variables with probability 1. The second-stage variable selection also has good consistency. Monte Carlo simulation results show that this method performs better than the nonparametric independence screening (NIS) method.
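The first-stage screening idea described in the abstract can be illustrated with a rough sketch. This is not the thesis's implementation. It rests on several assumptions: uncertain genotypes are summarized by their expected dosage computed from the three genotype probabilities, a centered polynomial basis stands in for whatever basis the thesis uses, and each locus is ranked by how much it reduces the residual sum of squares of an additive fit that already contains the known variables. The names `poly_basis`, `rss`, and `conditional_screen` are hypothetical, and the second-stage variable selection is not shown.

```python
import numpy as np


def poly_basis(x, degree=3):
    """Centered polynomial basis (an assumed stand-in for a spline basis)."""
    x = (x - x.mean()) / (x.std() + 1e-12)
    B = np.column_stack([x ** d for d in range(1, degree + 1)])
    return B - B.mean(axis=0)


def rss(y, B):
    """Residual sum of squares of a least-squares fit of y on [1, B]."""
    D = np.column_stack([np.ones(len(y)), B])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    r = y - D @ coef
    return float(r @ r)


def conditional_screen(y, Z, G_prob, keep, degree=3):
    """Rank loci by marginal additive fit, conditional on known variables Z.

    y      : (n,) response
    Z      : (n, q) known important variables (the conditioning set)
    G_prob : (n, p, 3) probabilities of genotypes 0, 1, 2 at each locus
    keep   : number of loci to retain after screening
    """
    # Assumption: replace each uncertain genotype by its expected dosage.
    dosage = G_prob @ np.array([0.0, 1.0, 2.0])  # shape (n, p)

    # Additive basis for the known variables, fitted once as the baseline.
    B_z = np.column_stack(
        [poly_basis(Z[:, k], degree) for k in range(Z.shape[1])]
    )
    base = rss(y, B_z)

    # Assumed utility: per-sample drop in RSS when locus j joins the baseline.
    utility = np.empty(dosage.shape[1])
    for j in range(dosage.shape[1]):
        B_j = poly_basis(dosage[:, j], degree)
        utility[j] = (base - rss(y, np.column_stack([B_z, B_j]))) / len(y)

    order = np.argsort(-utility)  # largest utility first
    return order[:keep], utility
```

Calling `conditional_screen(y, Z, G_prob, keep=d)` returns the indices of the `d` highest-ranked loci together with the utility of every locus. Fitting each locus jointly with the known variables, rather than alone, is what makes the screening conditional in this sketch.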
Keywords/Search Tags: Genome-wide association analysis, Uncertain genotype, Additive model, Conditional non-parametric independent screening, Consistency