
A Study Of Tag SNP Selection Method Based On Multilocus Linkage Disequilibrium

Posted on: 2015-07-02
Degree: Master
Type: Thesis
Country: China
Candidate: Y H He
GTID: 2180330434453095
Subject: Computer technology
Abstract: A variation at a single nucleotide in a genome sequence is called a single nucleotide polymorphism (SNP). Studies have found that different individuals can be represented by a small number of SNP loci that carry most of the information in the whole sample. Such SNPs are called tag SNPs. The process of determining the haplotypes of sequences is called haplotyping. Although biological experiments can yield more accurate and reliable haplotyping results, they are costly and cannot easily meet the need for real-time analysis of large-scale biological data. Using bioinformatics methods to select tag SNP loci, and then analyzing only those tag loci, can therefore greatly reduce cost while retaining the information in the original sequence.

Selecting tag SNPs from a genome containing hundreds of thousands of SNPs has been proven to be an NP-hard problem. Several methods for tag SNP selection exist, but they still have drawbacks such as high time complexity, a large number of selected tags, and unsatisfactory reconstruction accuracy. This thesis proposes a selection method based on a multilocus linkage disequilibrium measure. Experiments show that the method is better suited to practical studies. The main work of this thesis is as follows:

Firstly, we describe the model of the tag SNP selection problem and compare the characteristics of current analytical methods that are based on different ideas. We also describe the basic steps of the reconstruction strategy.

Secondly, we propose a method that constructs the candidate tag SNP set using an ant colony algorithm. To reduce the computational complexity of the algorithm, this stage uses only the multilocus linkage disequilibrium measure as its objective and applies the ant colony algorithm to find near-optimal solutions for it. This part of the work mainly covers the design of the heuristic factors and the path selection function, which improve the performance of the algorithm.

Thirdly, we propose a backward scheme to refine the candidate subset. The target of this step is to reconstruct the samples accurately: it further improves reconstruction accuracy and reduces the number of tag SNPs.

Fourthly, to validate the effectiveness of the method, we designed and implemented the improved algorithm in C++ and compared it with other methods on several datasets. Experiments show that our method is practical.
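The two stages described above can be illustrated with a minimal Python sketch. The abstract does not give the thesis's heuristic factors, its path selection function, or its reconstruction accuracy measure, so everything here is an assumption: `selection_probabilities` uses the textbook ant colony transition rule (probability proportional to pheromone^alpha times heuristic^beta), `backward_refine` is a plain greedy backward elimination, and the `accuracy` callback, the parameter names, and the default values of `alpha` and `beta` are hypothetical. The thesis itself was implemented in C++; Python is used here only for brevity.

```python
import random


def selection_probabilities(candidates, pheromone, heuristic, alpha=1.0, beta=2.0):
    """Textbook ACO transition rule (an assumption, not the thesis's own function).

    The probability of picking SNP j is proportional to
    pheromone[j] ** alpha * heuristic[j] ** beta.
    """
    weights = [(pheromone[j] ** alpha) * (heuristic[j] ** beta) for j in candidates]
    total = sum(weights)
    if total == 0:
        # No pheromone or heuristic information: fall back to a uniform choice.
        return [1.0 / len(candidates)] * len(candidates)
    return [w / total for w in weights]


def choose_next_snp(candidates, pheromone, heuristic, rng, alpha=1.0, beta=2.0):
    """Roulette-wheel selection of the next SNP an ant adds to its candidate set."""
    probs = selection_probabilities(candidates, pheromone, heuristic, alpha, beta)
    return rng.choices(candidates, weights=probs, k=1)[0]


def backward_refine(tags, accuracy, min_accuracy):
    """Greedy backward elimination over a candidate tag set.

    Each tag is tentatively removed in turn; the removal is kept only if the
    hypothetical accuracy(tags) callback still reports at least min_accuracy.
    """
    tags = list(tags)
    for snp in list(tags):
        trial = [t for t in tags if t != snp]
        if trial and accuracy(trial) >= min_accuracy:
            tags = trial
    return tags
```

In a fuller implementation, `heuristic` would presumably be derived from the multilocus linkage disequilibrium measure and `accuracy` from reconstructing the non-tag SNPs of the samples; both are left as inputs here because the abstract does not define them.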
Keywords: Single Nucleotide Polymorphism, tag SNPs, ant colony algorithm, bioinformatics