Scaffold Filling Under The Common Adjacency Distance

Posted on:2014-02-24

Degree:Master

Type:Thesis

Country:China

Candidate:Q Zhao

Full Text:PDF

GTID:2248330398959204

Subject:Computer software and theory

Abstract/Summary:

PDF Full Text Request

With the development of genome sequencing technology, it is possible to sequence more organisms for genomic analysis; however, many released genomes are incomplete genomes which are called scaffolds. Scaffolds cannot be used for genomic analysis because they could introduce errors. A feasible method is filling the missing genes into scaffolds, which results a problem called scaffold filling that gradually becomes a hot research field of computational biology. Scaffold filling uses a genome from a close species as a reference, and fills the missing genes into scaffolds to make scaffolds complete and close to the original genome. Scaffold filling is one-sided or two-sided according to the reference genome is complete or not.In this paper, a character represents a gene family and a string represents a genome. When the input genome contains no gene repetition, scaffold filling problem is polynomially solvable under the breakpoint distance or DCJ distance. When the input genome contains gene repetition, scaffold filling problem is NP-Complete under the breakpoint distance or common adjacency distance.At present, the major study has focused on the general case of scaffold filling problem with gene repetition, that is filling a string of arbitrary length at the same position is allowed, however, this is not common in many cases that only1or2genes are filled at the same position. Therefore, in this paper, we consider a special scaffold filling problem that if the string of length l can be filled at the same position of a scaffold to make the adjacencies increased by1+1under the common adjacency distance, then the value of l is only1or2. In the special scaffold filling problem, maximize the distance between two genomes is our objective. The above problem is abbreviated as SF-MNCA(2)(Scaffold Filling to Maximize the Number of Common Adjacencies). The main contribution of this paper is as follows.(1) Firstly prove that SF-MNCA(2) is NP-Complete. This paper uses the proof method of NP-Complete theory to prove that SF-MNCA(2) is NP-Complete by making a reduction from the NP-Complete problem3DM to one-sided SF-MNCA(2) in polynomial time.(2)Design a2-approximation algorithm for one-sided SF-MNCA. This algorithm has deep theoretical significance because it lays a theoretical foundation for the subsequent algorithms. This paper gives the detail process and correctness proof of the2-approximation algorithm.(3) Design a6/5+ε-approximation algorithm for one-sided SF-MNCA(2). This is a local search algorithm base on the2-approximation algorithm of one-sided SF-MNCA. This paper designs the algorithm, analyzes its running time and proves its factor is6/5+ε.

Keywords/Search Tags:

Scaffold Filling, Common Adjacencies, Approximation Algorithm, Local Search

PDF Full Text Request

Related items

1	Studies On Algorithms For Genome Scaffold Filling Problem
2	Research On Algorithms Of Scaffold Filling Problems
3	Local Search Algorithm For Max Nae-(K)-Sat And Max-(K)-Sat
4	Local Search Algorithms For Capacitated Facility Location And K-means Problems
5	3D Printing And Performance Analysis Of Cartilage Tissue Scaffold
6	An Approximation Algorithm For Spherical K-means Problem With Penalty
7	Research And Implementation Of Automatic Control System For Scaffold Hoist Based On PLC
8	A Kind Of Moving Target Search Algorithm For The Local Search Space
9	Research Of Computer-Aided Bonic Tissue Scaffold Modeling Method
10	Oblivious and Non-Oblivious Local Search for Combinatorial Optimization