
Research On SNPs Identification In Genetic Association Analysis

Posted on: 2010-08-29; Degree: Master; Type: Thesis
Country: China; Candidate: Y K Wang; Full Text: PDF
GTID: 2120360278972775; Subject: Probability theory and mathematical statistics
Abstract/Summary:
Advances in biotechnology are producing biological data on a massive scale, while the methods available for analysing such data remain comparatively underdeveloped. Extracting the knowledge and information contained in these data requires combining tools from mathematics, computer science, and biology, and this need has driven the rapid development of bioinformatics. Single nucleotide polymorphisms (SNPs) are among the most common forms of polymorphism in the genome, and SNP identification is an important problem in bioinformatics. SNP detection has a wide range of applications in the prevention and treatment of complex diseases, especially multi-gene diseases such as tumors, coronary heart disease, and diabetes. Consequently, a large number of association studies of complex diseases now use SNPs as genetic markers. Because of the particular characteristics of biological data, traditional single-locus analysis cannot meet the needs of SNP identification, especially when the SNPs are in linkage disequilibrium or when the number of SNPs is much larger than the sample size. In this thesis, we apply ridge regression, stepwise regression, the lasso, and a boosting algorithm to SNP identification and compare their performance using ROC curves and the corresponding area under the curve (AUC). All four methods perform better than single-locus analysis in identifying SNPs.
Keywords/Search Tags: SNPs identification, Linkage disequilibrium, Single Locus, Ridge Regression, Stepwise, Lasso, Boosting
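
The AUC criterion named in the abstract can be illustrated with a minimal sketch. It assumes, as the abstract does not spell out, that each method assigns every SNP an importance score (for example an absolute regression coefficient or a -log10 p-value) and that the truly associated SNPs are known, as in a simulation. The function name `auc` and all numbers below are hypothetical and are not taken from the thesis.

```python
def auc(scores, is_causal):
    """Area under the ROC curve, computed as the Mann-Whitney rank statistic.

    scores    : one importance score per SNP (higher = more likely associated)
    is_causal : matching booleans marking the SNPs assumed to be truly associated
    """
    pos = [s for s, c in zip(scores, is_causal) if c]
    neg = [s for s, c in zip(scores, is_causal) if not c]
    if not pos or not neg:
        raise ValueError("need at least one causal and one non-causal SNP")
    # Count (causal, non-causal) pairs that are ranked correctly; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up scores for six SNPs; SNPs 0 and 3 are assumed to be the causal loci.
truth = [True, False, False, True, False, False]
scores_a = [0.9, 0.8, 0.1, 0.3, 0.4, 0.2]  # hypothetical output of one method
scores_b = [1.2, 0.0, 0.0, 0.7, 0.1, 0.0]  # hypothetical output of another

print(auc(scores_a, truth), auc(scores_b, truth))
```

An AUC of 0.5 corresponds to a ranking no better than chance and 1.0 to a ranking that places every associated SNP above every unassociated one, which is what makes it usable as a single number for comparing methods.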