
Computational Prediction of RNA Secondary Structure with Applications to RNA Viruses

Posted on: 2014-07-09
Degree: Ph.D.
Type: Dissertation
University: University of Rochester
Candidate: DiChiacchio, Laura
Full Text: PDF
GTID: 1450390008455622
Subject: Biophysics
Abstract/Summary:
RNA is key to gene expression and is responsible for multiple catalytic and regulatory mechanisms in the cell. An important step toward understanding RNA function is determining its structure. Computational approaches have become integral to this process, particularly those that predict structure based on the thermodynamics of RNA folding. RNA secondary structure, defined as the collection of canonical base pairs, provides significant information about function. Methods for predicting the secondary structure of single-stranded molecules have been refined to become both sophisticated and accurate, but prediction of RNA-RNA interactions has remained a computational challenge. Intramolecular structure formation significantly influences RNA-RNA interactions, but accounting for it is a difficult problem that is computationally expensive to solve. A novel algorithm for predicting RNA-RNA interactions was developed that utilizes pseudo-free energy minimization. This is an extension to standard free energy minimization that uses single-stranded partition function calculations to predict the probability that each nucleotide is involved in self-structure. A pseudo-energy penalty is applied to each nucleotide based on its propensity to form self-structure, augmenting the prediction of RNA-RNA interactions. This algorithm provides a statistically significant increase in sensitivity over the best known method for generalized RNA-RNA interaction prediction.

The gold standard for structure determination is comparative sequence analysis. In this method, multiple homologous RNA sequences are compared to identify conserved structure. The accuracy of such structure prediction methods relies on an initial primary sequence alignment that is informative and accurate. Pairwise hidden Markov models (HMMs) are frequently used for RNA sequence alignment. These algorithms are trained to generate a probabilistic alignment for two sequences at a time. Probabilistic pairwise alignment is currently an important first step in building a multiple sequence alignment for use in automated prediction of conserved secondary structure. An HMM that simultaneously aligns three sequences was developed and benchmarked using the Rfam database of multiple sequence alignments. To our knowledge, this is the first algorithm to consider three sequences simultaneously, as each sequence added beyond pairwise alignment significantly increases the computational demand.
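
The pseudo-energy idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form of the penalty, so the form used here, -RT ln(1 - p_i) with p_i the probability that nucleotide i is paired in self-structure, is an assumption chosen for illustration and is not necessarily the one used in the dissertation. The function names, the probability cap, and the input format (a dictionary of base-pair probabilities taken from some single-stranded partition function calculation) are likewise hypothetical.

```python
import math

# Gas constant in kcal/(mol*K) times 310.15 K (37 degrees C).
RT_KCAL_PER_MOL = 0.0019872 * 310.15


def self_structure_probability(pair_probs, n):
    """Per-nucleotide probability of being paired in self-structure.

    pair_probs maps 0-based (i, j) index pairs to base-pair probabilities,
    as would be produced by a single-stranded partition function calculation.
    """
    paired = [0.0] * n
    for (i, j), p in pair_probs.items():
        paired[i] += p
        paired[j] += p
    return paired


def pseudo_energy_penalties(paired, cap=0.999):
    """Assumed penalty form: -RT ln(1 - p_i), in kcal/mol.

    The cap (hypothetical) keeps the logarithm finite when p_i approaches 1.
    """
    return [-RT_KCAL_PER_MOL * math.log(1.0 - min(p, cap)) for p in paired]


def pseudo_free_energy(duplex_energy, penalties, positions):
    """Add the penalties of the nucleotides used in an intermolecular duplex
    to that duplex's folding free energy change."""
    return duplex_energy + sum(penalties[i] for i in positions)


# Example: nucleotides 0 and 5 are strongly paired within the strand,
# so an interaction that uses them is penalized more than one using 2 and 3.
pair_probs = {(0, 5): 0.9, (1, 4): 0.5}
paired = self_structure_probability(pair_probs, 6)
penalties = pseudo_energy_penalties(paired)
open_site = pseudo_free_energy(-5.0, penalties, [2, 3])
closed_site = pseudo_free_energy(-5.0, penalties, [0, 1])
```

Under this assumed form, a binding site that is likely to be tied up in self-structure receives a less favorable pseudo-free energy than an accessible site with the same duplex stability, which is the qualitative behavior the abstract describes.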
Keywords/Search Tags: RNA, Structure, Computational, Prediction, Sequence, Multiple