Research On Sequence Alignment Algorithms In Bioinformatics

Posted on: 2005-10-12
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y R Tang
Full Text: PDF
GTID: 1118360122988910
Subject: Agricultural Electrification and Automation
Abstract/Summary:
Sequence alignment is a basic information-processing method in bioinformatics. It is useful for discovering functional, structural, and evolutionary information in DNA and protein sequences. Because the volume of data in biological sequence databases is growing rapidly, there is an urgent need for algorithms that combine high biological sensitivity with high efficiency. This dissertation studies pairwise and multiple sequence alignment algorithms in bioinformatics. The main contents and results can be summarized as follows:

1. Based on Ukkonen's fast pairwise sequence alignment algorithm, an efficient and practical global pairwise sequence alignment algorithm is presented. It borrows the storage scheme of the FA (Fast Alignment) algorithm, which records the origin of each element while the score matrix is computed. It also adopts the checkpoint technique to obtain a number of checkpoints in the replacement matrix.

2. Following a review of the current state of research on pairwise sequence alignment, several classic pairwise sequence alignment algorithms were analyzed. They were implemented in C++ and served as the baseline for comparative experiments against the pairwise alignment algorithm presented in this dissertation.

3. After analyzing the current state of research on multiple sequence alignment, a solution that applies a genetic algorithm to multiple sequence alignment was presented. Several genetic strategies were designed for the problem, including the encoding, the initial population, the fitness function, and the selection, crossover, and mutation operators.

4. Because a genetic algorithm sometimes converges to a local optimum, an approach was proposed that incorporates simulated annealing into the genetic algorithm to improve its performance. The added annealing operation markedly improves the convergence rate in multiple sequence alignment.

5. CLUSTAL is one of the most popular multiple sequence alignment programs in biology. When a large number of sequences are aligned at once, the time complexity of the CLUSTAL algorithm is very high, which makes it unsuitable for very large sequence sets. To reduce the running time, a parallel CLUSTAL algorithm was designed for a multiprocessor PC running Windows. In addition, the core code of the fast pairwise alignment in CLUSTAL was optimized.

6. The neighbor-joining algorithm is used to construct the guide tree in the CLUSTAL program, and it is the core algorithm of several dedicated guide tree programs. Because the resulting guide tree is not near-optimal, an improved neighbor-joining algorithm was designed whose central idea is to look for a leading taxon.

7. Based on the sequence alignment algorithms presented in this dissertation, a Chinese-language sequence alignment software system was designed and implemented. The system consists of five functional modules: sequence file management, alignment algorithm selection, parameter setting, alignment result display, and guide tree display.

The research contributes improved and novel sequence alignment algorithms for bioinformatics. Compared with traditional algorithms, they offer clear gains in biological sensitivity and computational efficiency. The software system built on these algorithms can support bioinformatics research and practice.
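
To make the idea in the first contribution more concrete (recording where each element of the score matrix came from while the matrix is computed), the following is a minimal sketch of a plain global pairwise alignment in Python. It is not the dissertation's algorithm: it is a textbook Needleman-Wunsch recurrence without Ukkonen's speed-up or the checkpoint technique, and the function name `global_align`, the `origin` table, and the scoring values (match 1, mismatch -1, gap -2) are assumptions chosen only for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of strings a and b with a linear gap penalty.

    Illustrative sketch only: scoring values are assumed, not taken
    from the dissertation.
    """
    n, m = len(a), len(b)
    # score[i][j]: best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    # origin[i][j]: which neighbouring cell produced score[i][j]
    origin = [[None] * (m + 1) for _ in range(n + 1)]

    # First column and first row can only be reached through gaps.
    for i in range(1, n + 1):
        score[i][0] = i * gap
        origin[i][0] = "up"
    for j in range(1, m + 1):
        score[0][j] = j * gap
        origin[0][j] = "left"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # a[i-1] aligned to a gap
            left = score[i][j - 1] + gap    # b[j-1] aligned to a gap
            best = max(diag, up, left)
            score[i][j] = best
            # Record the origin at the moment the cell is filled.
            if best == diag:
                origin[i][j] = "diag"
            elif best == up:
                origin[i][j] = "up"
            else:
                origin[i][j] = "left"

    # Follow the recorded origins back from the bottom-right corner.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        move = origin[i][j]
        if move == "diag":
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif move == "up":
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `global_align("ACGT", "AGT")` returns the total score together with the two gapped strings. Because the origin of every cell is stored as it is computed, the traceback never has to recompute the recurrence; the price is a second table of the same size as the score matrix, which is the kind of memory cost that banding and checkpointing are meant to reduce.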
Keywords/Search Tags: Pairwise sequence alignment, Multiple sequence alignment, Bioinformatics