Font Size: a A A

Research On The Combinatorial Optimization Problem In Detection Of Genetic Diversities

Posted on:2009-07-13Degree:DoctorType:Dissertation
Country:ChinaCandidate:J L WuFull Text:PDF
GTID:1118360278957312Subject:Computer application technology
Abstract/Summary:PDF Full Text Request
The detection of genetic diversities is a key step for investigating genetic diversities. In recent years, the methods assisted by computation for detecting genetic diversities have been applied extensively. There have occurred a series of combinatorial optimization problems in the research field, which focus on improving detection efficiency and decreasing detection cost. In this dissertation, three of these combinatorial optimization problems are studied, which are individual haplotyping problem, linkage disequilibrium (LD) tag SNPs selection problem in multiple populations and multiplex polymerase chain reaction primer set selection problem.Based on the weighted minimum letter flips model of the individual haplotyping problem, a heuristic algorithm HAW is proposed. The algorithm computes a global agreeable set for each SNP fragment, and chooses two global agreeable sets including the most fragments (the intersection of the two sets is null) to construct a pair of initial haplotypes, and the final pair of haplotypes are built gradually by extending the initial ones with the rest fragments. Extensive experimental results indicate that HAW algorithm can solve the model efficiently, and gets higher reconstruction rate than previously presented algorithm.Based on the minimum fragment removal model of the individual haplotyping problem, a particle swarm optimization algorithm PSO-MFR is presented. In this algorithm, particles are denoted by binary codes, and the computation operations between particle position and velocity are redefined. Because a kind of short particle code is designed for PSO-MFR algorithm by taking advantage of the low heterozygous frequencies of single nucleotide polymorphisms (SNPs), PSO-MFR algorithm can generate a small solution space, which makes it get a good solution and be fast. Experiments results show that PSO-MFR algorithm is effective for solving the model. It can obtain relatively high reconstruction rate in a short time, and get higher reconstruction rate than Fast Hare algorithm.Based on the minimum error correction model of the individual haplotyping problem, a thorough analysis is given on the reasons for how the real best result can be lost during a haplotyping process. We propose a new idea to reduce the probability of losing the best result by generating a small set of optimal results, instead of a single optimal result. A parthenogenetic algorithm PGA-MEC is presented to solve the model. In this algorithm, a kind of short particle code is adopted, and a novel recombination operator is designed, which makes full use of the information in fragments to adjust the chromosomes step by step. Experimental results indicate that PGA-MEC algorithm can solve the model effectively, and get higher reconstruction rate haplotypes with shorter running time than previous work. In addition, an improved algorithm IPGA-MEC is proposed by applying the idea of generating a small set of optimal results to algorithm PGA-MEC. Experimental results show that the idea of generating a small optimal result set may effectively avoid losing the real best result and improve the reconstruction rate further.In this dissertation, a mathematical model of the minimum LD tag SNPs selection problem in multiple populations (MTSMP) is formulated, and which is proved to be NP-hard by reducing from set cover problem in polynomial time. A greedy algorithm MP-Tagging is proposed for solving this problem. The algorithm selects either two tag SNPs or one SNP at each iteration. Experimental results indicate that MP-Tagging algorithm has very high efficiency for solving this problem, it can find fewer tag SNPs than previous algorithm.The problem of designing multiplex polymerase chain reaction primer set is studied in this dissertation. A mathematical model is presented for the minimum primer set selection problem with multiple constraints. A greedy algorithm MG is proposed for solving this model. In addition, by introducing a novel recombination operator based on MG algorithm, a parthenogenetic algorithm MG-PGA is developed to solve the model. Experimental results show that MG-PGA algorithm can not only find a smaller primer set than previous work, but also satisfy multiple biological constraints. The presented model and algorithm provide a practical solution for MP-PCR primer set design.In this paper, several effective algorithms are presented for solving three typical combinatorial optimization problems for detecting genetic diversities. These presented algorithms will be helpful to improve detection efficiency and decrease detection cost.
Keywords/Search Tags:genetic diversity, combinatorial optimization, haplotype, tag single nucleotide polymorphisms (Tag SNPs), primer set
PDF Full Text Request
Related items