Font Size: a A A

Fast Computation Of RNA Secondary Structure Prediction

Posted on:2008-12-26Degree:MasterType:Thesis
Country:ChinaCandidate:C K JiangFull Text:PDF
GTID:2178360242473263Subject:Computer technology
Abstract/Summary:PDF Full Text Request
The prediction of RNA secondary structure is a kind of computation, according to the RNA sequence, which is enacted by discerning base pair set, to attain a combined structure with minimal free energy. How to predict more precise structures is one of the focuses of bio-informatics in the past three decades.Anciently, people always adopt phylogenetic comparison method to find the actual structure. However this method which performs with magnificent artificial work and needs some grim constrains, could not be applied extensively. So far, the method of computing thermodynamic minimal free energy is generally adopted to predict RNA secondary structure. Mfold algorithm which is based on dynamic programming and nearest energy model has become the classic method of lowest thermodynamic free energy predicting.Pseudoknots rife exist in RNA secondary structures. But Mfold algorithm includes no pseudoknots model. So far, the algorithm from Rivas is the best algorithm for predicting pseudoknots. This algorithm can find all the optimal planer pseudoknots and some non-planer ones, but it is so high in time complexity.In this paper, first of all, we implement and improve the Mfold algorithm. With the concepts of loop and nearest energy model plus the energy data from lab, this Mfold algorithm can compute the lowest free energy of any sub-chain in a dynamic way, and the traceback algorithm can find the optimal secondary structure. Then we promote Mfold algorithm by incorporating the coaxial stack to Mfold algorithm, with doing this we can obtain lower energy and higher accuracy than original algorithm. The complexity of improved algorithm is equal to Mfold.After operating magnitude experiments to compare these two algorithms, the results tell us that, the accuracy of our algorithm is promoted from 76.43% to 78.25.Later we implement a suboptimal multi-fold algorithm. First, imagine an alignment as a circle chain, which is divided into the included fragment and the excluded fragment by a random base pair. Then compute energy and structures of the two fragments, and unite them, to attain suboptimal energy and structure. Its time complexity is cube and space complexity is square. This suboptimal multi-fold algorithm can provide a lot of information for RNA secondary structure analysis.After that, we give out an algorithm with simple pseudoknots model. By randomly choosing two overlapped base pairs, and computing fragments divided by them, we find a secondary structure with one simple pseudoknot. The complexity of this algorithm is equal to Mfold algorithm.At last we provide more than thirty actual alignments testing experiments, which include structure comparison and computation of accuracy.The innovation of this paper:1. We improve Mfold algorithm by combining coaxial stack model. This algorithm enhances the accuracy from 76.43% to 78.25%.2. We provide an artificially interfering algorithm to compute secondary structure with simple pseudoknots model, which is cube in time complexity and square in space.3. We actualize Mfold algorithm and improved algorithm with Java, adduce more man thirty examples to compare these two algorithms in structures and accuracy aspects.
Keywords/Search Tags:Algorithm, Suboptimal, Pseudoknots, Secondary structure, Accuracy
PDF Full Text Request
Related items