A machine learning approach for designing DNA sequence assembly algorithms

Posted on:2004-04-10

Degree:Ph.D.

Type:Dissertation

University:Rensselaer Polytechnic Institute

Candidate:Lim, Darren Troy


GTID:1468390011474783

Subject:Computer Science

Abstract/Summary:

We present two separate algorithms for solving the DNA sequence assembly problem: the reconstruction of a large DNA sequence from a set of subsequences called fragments. Fragments are created by breaking copies of the original DNA sequence at random intervals, which yields a set of fragments in which many overlap one another. Identifying these overlapping fragments is the key to re-forming the original strand.

The first algorithm begins by identifying a “correct” series of fragment merges, one that would reproduce the original sample from which the fragments were obtained. It enters each such series into a database of solutions, which is then used to sequence DNA different from that used to create the database.

The second algorithm uses a k-mer based approach to identify overlapping regions in fragments. It improves on the first algorithm in two ways: (1) it is designed to sequence real fragments, which differ in composition from simulated fragments; (2) it can be used to sequence much longer strands of DNA.

For both algorithms, parameters of computation are learned through experimentation with sequences of previously assembled DNA. Our experiments show that the parameters learned on one set of DNA sequences can be used to successfully sequence a separate set of DNA sequences.
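The k-mer idea mentioned for the second algorithm can be illustrated with a minimal sketch. This is not the dissertation's method, whose details the abstract does not give; it only shows the general technique of indexing fragments by their length-k substrings and treating fragments that share k-mers as candidate overlaps. The function names, the toy fragments, and the choice of k = 4 are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def candidate_overlaps(fragments, k):
    """Map each pair of fragment indices to the number of k-mers they share.

    Illustrative only: a real assembler would also handle reverse
    complements, sequencing errors, and repeats.
    """
    index = defaultdict(set)  # k-mer -> indices of fragments containing it
    for idx, frag in enumerate(fragments):
        for kmer in kmers(frag, k):
            index[kmer].add(idx)

    shared = defaultdict(int)
    for owners in index.values():
        # Every pair of fragments containing this k-mer shares it.
        for a, b in combinations(sorted(owners), 2):
            shared[(a, b)] += 1
    return dict(shared)


# Hypothetical toy data: three fragments cut from "ACGTTGCAAGGCTA".
fragments = ["ACGTTGCA", "TGCAAGGC", "AGGCTA"]
pairs = candidate_overlaps(fragments, k=4)
```

With these toy fragments, `pairs` contains only the pairs (0, 1) and (1, 2), each sharing one 4-mer, which matches the order in which the fragments tile the example sequence. A larger k reduces chance matches at the cost of missing short overlaps; the abstract indicates that such parameters were learned from previously assembled DNA rather than fixed by hand.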

Keywords/Search Tags:

DNA sequence, Algorithms, Fragments

Related items

1	Research On The Planar Fragments Matching
2	Research On Automatic Reassembly Of 2D Fragments
3	Protein Structure Classification Algorithms Based On Sequence Similarity
4	Study On The Detection And Parameter Statistical Method Of Glass Fragments Under Impact Fragmentation
5	Biological sequence analyses - Theory, algorithms, and applications
6	Research And Application On File Fragments Identification And Reassembly Technology
7	Research On The Method Of Generating Code Fragments In Response To Free-form Queries
8	Space & distance as I require: The journals & prose fragments of Philip Whalen 1950 - 1966
9	Bank Notes Fragments Reassembly Method Based On Digital Image Processing
10	Research On The Technology Of 2D Fragments Re-assembly Based On Contour Features