
Solving Gene-Duplication And Gene-Loss Problems By Tree Operations

Posted on: 2012-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhang
GTID: 2218330338463795
Subject: Computer software and theory
Abstract/Summary:
On November 24, 1859, the British biologist Charles Robert Darwin published "On the Origin of Species" and proposed the theory of biological evolution. Since then, more and more scientists have come to accept the view that all species on Earth are linked by an inextricable relationship, known as a genealogical relationship. Scientists have observed that this genealogical relationship can be represented by a vast evolutionary tree. An evolutionary tree, also called a phylogenetic tree, depicts the evolutionary relationships between species, and how to construct this tree for all species is a fundamental scientific problem today.

With the development of genetic technology, more and more genomic sequence data have become available, providing a large amount of potential information for phylogenetic analyses. Many models and methods have been developed to build evolutionary trees from this genomic sequence information. A common feature of most of these models is that they start from genes, which are fragments of the genome and carry abundant genetic information. Generally speaking, to build a phylogenetic tree for a set of species, one constructs a phylogenetic tree from genes taken from those species. Such trees are called gene trees. The implicit assumption is that the evolution of the chosen genes mimics the evolution of the species themselves. However, due to complex evolutionary processes such as gene duplication, gene loss, and gene recombination, trees constructed from genes do not always accurately represent the evolutionary history of the corresponding species.

Gene duplication and gene loss are common evolutionary phenomena, and they play a major role in the evolution of all life on Earth. Goodman et al. proposed a gene duplication-loss model. In this model, one can infer the true species tree by solving the following two optimization problems: the gene duplication problem and the gene loss problem.
The input of these two problems is a set of gene trees, and the goal is to find an optimal species tree for which the number of gene duplications (or gene losses) is minimized. Ma Bin et al. have proved that both of these problems are NP-hard. Therefore, in practice, heuristic algorithms based on local search are used to solve them.

In this thesis, we study in depth the existing algorithms for the gene duplication problem and the gene loss problem. We then improve the algorithms based on the SPR (rooted subtree pruning and regrafting) operation and the TBR (tree bisection and reconnection) operation:
(1) We analyse the implementation of the SPR-based algorithm for the gene duplication problem and find that it performs some redundant computation. We design a new algorithm that eliminates this redundant computation. The experimental results show that the improved algorithm performs well.
(2) Based on the link between the SPR operation and the TBR operation, we propose a new algorithm to solve the gene loss problem based on the TBR operation. ...
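
The gene duplication cost and an SPR-based local search of the kind mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's improved algorithm, whose details are not given here. It is a naive version that rescores every SPR neighbor from scratch, which is the kind of repeated work a faster algorithm would try to avoid. The tree representation (nested 2-tuples with species names as leaves) and all function names are assumptions made for illustration. Duplications are counted with the standard LCA mapping: a gene-tree node is a duplication if it maps to the same species-tree node as one of its children.

```python
def clusters(tree):
    """Leaf set of every node, in post-order; the last entry is the root's."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    left, right = clusters(tree[0]), clusters(tree[1])
    return left + right + [left[-1] | right[-1]]

def lca_map(species_clusters, leaves):
    """Smallest species-tree cluster containing `leaves`, i.e. their LCA."""
    return min((c for c in species_clusters if leaves <= c), key=len)

def duplication_cost(gene_tree, species_tree):
    """Number of gene-tree nodes that the LCA mapping marks as duplications."""
    sc = clusters(species_tree)

    def visit(g):
        # Returns (leaf set, mapped species cluster, duplications below g).
        if isinstance(g, str):
            s = frozenset([g])
            return s, lca_map(sc, s), 0
        (ls, lm, ld), (rs, rm, rd) = visit(g[0]), visit(g[1])
        m = lca_map(sc, ls | rs)
        dup = 1 if m == lm or m == rm else 0
        return ls | rs, m, ld + rd + dup

    return visit(gene_tree)[2]

def subtrees(tree, path=()):
    """Yield (path, subtree) pairs; a path is a tuple of child indices."""
    yield path, tree
    if not isinstance(tree, str):
        yield from subtrees(tree[0], path + (0,))
        yield from subtrees(tree[1], path + (1,))

def prune(tree, path):
    """Remove the subtree at a non-empty path; its sibling replaces the parent."""
    if len(path) == 1:
        return tree[1 - path[0]]
    children = list(tree)
    children[path[0]] = prune(tree[path[0]], path[1:])
    return tuple(children)

def regraft(tree, path, sub):
    """Attach `sub` on the edge above the node at `path` (empty path = root)."""
    if not path:
        return (tree, sub)
    children = list(tree)
    children[path[0]] = regraft(tree[path[0]], path[1:], sub)
    return tuple(children)

def spr_neighbors(tree):
    """All rooted SPR neighbors (may repeat topologies, including the original)."""
    for path, sub in subtrees(tree):
        if not path:
            continue  # the whole tree cannot be pruned
        rest = prune(tree, path)
        for target, _ in subtrees(rest):
            yield regraft(rest, target, sub)

def local_search(gene_trees, start):
    """Hill climbing: move to an SPR neighbor while the total cost decreases."""
    def total(s):
        return sum(duplication_cost(g, s) for g in gene_trees)

    best, best_cost = start, total(start)
    improved = True
    while improved:
        improved = False
        for cand in spr_neighbors(best):
            cost = total(cand)
            if cost < best_cost:
                best, best_cost, improved = cand, cost, True
                break
    return best, best_cost

# Hypothetical input: two identical gene trees and a poor starting species tree.
genes = [(("a", "b"), "c"), (("a", "b"), "c")]
tree, cost = local_search(genes, (("a", "c"), "b"))
```

Because each accepted move strictly lowers the total cost, the loop terminates, but like any local search it may stop at a local optimum rather than the global one.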
Keywords/Search Tags: Gene Tree, Species Tree, Gene Duplication, Gene Loss, Heuristic Strategies, Local Search, SPR, TBR