Font Size: a A A

An Improved Algorithm For The Zero Exemplar Breakpoint Distance

Posted on:2016-06-01Degree:MasterType:Thesis
Country:ChinaCandidate:R J LiFull Text:PDF
GTID:2180330461987396Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
Genomics appeared in 1980, with a few species genome project’s startup, genomics has haved a very good development. In recent years, with the emergence of new technologies, genomics obtains further development. Enormous genomics data emerging from these new technologies constantly, in the face of the explosive growth of data, there is an urgent need for new method to process these data, and comparative genomics emerge as the times requires.The exemplar breakpoint distance problem is one of the most important problems in genome comparison and has been studied extensively in the literature. The description of the exemplar breakpoint problem is as following:for the two input genome, to find their exemplar sequence respectively, and satisfying the distance between them is minimum. The basic version of this problem is:each gene family occurs in the two genomes at most twice, and decide whether the breakpoint distance of their exemplar sequence is zero or not, i.e. the two input genomes can be reduced to the same genome. We call the problem zero exemplar breakpoint distance, and ZEBD(2,2) for short. Blin et al. have proved that the problem is NP-Hard, and the problem can’t be approximated within any factor, even if each gene family occurs at most twice in the genome.The best existing algorithm for ZEBD(2,2) runs in O(n21.86121"). In this paper, we improved the algorithm with running in O(n21.84931"). If each subgraph constructed by a gene family has at most two edges in the exemplar graph, we can decide whether they can be reduced to a genome in polynomial time. We firstly make the subgraph constructed by each gene family having at most two edges in the exemplar graph constructed by the two genomes. Our method is by exhaustive. We improved the time complexity by reducing the number of exhaustive. The detail is as following:we decompose the component pair constructed by two different gene family into at most three simple subgraph, and decompose the component triple constructed by a component pair which satisfies some condition with another different gene family into at most five simple subgraph. This technique reduces the number of cases for handling two gene family from 4 to 3 and three gene family from 8 to, hence leading to faster algorithm.
Keywords/Search Tags:exemplar breakpoint distance, genome, gene family, algorithm, time complexity
PDF Full Text Request
Related items