Research On Informative SNP Selection Method Based On Genetic Algorithm

Posted on: 2014-04-30
Degree: Master
Type: Thesis
Country: China
Candidate: M Li
Full Text: PDF
GTID: 2370330488499524
Subject: Computer Science and Technology
Abstract/Summary:
A single nucleotide polymorphism (SNP) is a DNA sequence polymorphism caused by variation at a single nucleotide at the genomic level. A small subset of SNPs, known as tag SNPs or informative SNPs, is used in haplotype association studies. Many methods for informative SNP selection exist, but they have drawbacks: high time complexity, informative SNP sets that are not compact, low reconstruction ratios, or the loss of much of the information needed for genetic association. This thesis therefore proposes a method based on a genetic algorithm to improve performance and meet the needs of haplotype association studies.

Informative SNP selection methods based on haplotype reconstruction consist of two components: constructing a subset of informative SNPs and reconstructing haplotypes from it. A genetic algorithm is designed for these steps to handle the high dimensionality and small sample sizes of SNP datasets. The main contributions are as follows.

First, SNP datasets are encoded in binary. To reduce the number of SNPs, those with a very low minor allele frequency (MAF) are filtered out, and redundant SNPs are then removed by calculating the linkage disequilibrium (LD) between SNPs. This greatly reduces the computational cost of informative SNP selection. However, the number of SNPs remains large and the combinatorial feature space is huge, so searching for the optimal solution is very difficult. A novel method is proposed that finds an approximately optimal solution in a relatively short time while maintaining the reconstruction ratio.

Meanwhile, because there are many non-informative SNPs, traditional methods have to retrain the prediction model repeatedly, which is very time-consuming. In this work, I make full use of the multiple output nodes of a neural network to avoid repeatedly retraining the learning model.

Finally, to facilitate association studies of complex diseases, I implement visual software for informative SNP selection. It is used to process various real and simulated datasets and is compared with other recent methods. Experimental results show that our method finds better informative SNP sets with a high reconstruction ratio in a relatively short time.
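
The preprocessing step mentioned in the abstract (binary encoding, MAF filtering, then LD-based removal of redundant SNPs) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the thresholds `maf_min=0.05` and `r2_max=0.99`, the use of r² as the LD measure, and the greedy left-to-right pruning order are all assumptions made for illustration.

```python
def minor_allele_freq(column):
    """MAF of one binary-encoded SNP (list of 0/1 alleles, one per haplotype)."""
    p = sum(column) / len(column)
    return min(p, 1.0 - p)


def r_squared(a, b):
    """Pairwise LD between two binary SNP columns, measured as r^2."""
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    pab = sum(x & y for x, y in zip(a, b)) / n  # frequency of the 1-1 haplotype
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    d = pab - pa * pb
    return d * d / denom


def prefilter_snps(haplotypes, maf_min=0.05, r2_max=0.99):
    """Return indices of SNPs kept after MAF filtering and greedy LD pruning.

    haplotypes: list of rows, each row a list of 0/1 alleles (one per SNP).
    Thresholds are illustrative assumptions, not values from the thesis.
    """
    n_snps = len(haplotypes[0])
    columns = [[row[j] for row in haplotypes] for j in range(n_snps)]
    kept = []
    for j, col in enumerate(columns):
        if minor_allele_freq(col) < maf_min:
            continue  # drop SNPs with very low MAF
        if any(r_squared(col, columns[k]) >= r2_max for k in kept):
            continue  # drop SNPs redundant with one already kept
        kept.append(j)
    return kept


if __name__ == "__main__":
    haps = [
        [0, 0, 0, 1],
        [1, 1, 0, 0],
        [0, 0, 0, 1],
        [1, 1, 0, 1],
    ]
    print(prefilter_snps(haps))  # [0, 3]
```

In this toy input, SNP 1 is an exact copy of SNP 0 (r² = 1) and SNP 2 is monomorphic (MAF = 0), so only SNPs 0 and 3 survive. The surviving indices would then form the search space for a genetic algorithm, where each candidate solution could be a binary mask over the kept SNPs.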
Keywords/Search Tags: Single Nucleotide Polymorphism, tag SNPs, genetic algorithm, artificial neural network