Study On Several Computational Issues Related To RNA Secondary Structure

Posted on: 2009-12-14
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Liu
GTID: 1100360245972715
Subject: Bioinformatics

Abstract/Summary:
The computational study of RNA structure and function is one of the hottest research topics in bioinformatics. As more and more non-coding RNAs have been identified in recent years, RNA has come to be viewed as a molecule with diverse functions rather than merely a messenger between DNA and protein. An emerging view is that the RNA world is much more complicated than previously thought, and that RNAs may play roles in the central dogma as important as those of DNA and protein. In addition, "RNomics" and "Ribonomics" have become two new interdisciplinary fields, following the introduction of "Genomics" and "Proteomics".

Computational techniques can serve as an extremely helpful complement to the experimental study of RNA structure and function, especially when that study is large-scale, high-throughput and systematic. To facilitate such study, more efficient bioinformatic analysis approaches are needed. In this thesis, we carry out basic computational studies of RNA secondary structure and its function from the perspective of algorithm design and platform implementation. The thesis covers: representation of RNA secondary structure; RNA secondary structure prediction; comparison of RNA secondary structures; compression of RNA secondary structure and measurement of informational complexity; construction of an integrated platform for RNA structure analysis; and non-coding RNA prediction.

The main contributions of this thesis are as follows:

(1) Different representations of RNA secondary structure are compared. A representation method based on a 6-D encoding is presented. The RNA secondary structure is represented as a structure matrix, and the singular value vector of this matrix is calculated to extract the main information of the structure. This representation provides an accurate mathematical description of RNA secondary structure.

(2) A thorough discussion of RNA secondary structure prediction approaches is given. Based on the definitions of the Discrete Hopfield Neural Network (DHNN) and the Maximal Independent Set (MIS) in graph theory, a heuristic algorithm is presented for selecting stems in an RNA structure, together with its application to RNA secondary structure prediction. This method substantially improves the efficiency of RNA secondary structure prediction.

(3) Different RNA secondary structure comparison methods are discussed. A novel method is presented that measures the similarity of RNA secondary structures using a matrix representation of the structures. Relevant features of the structures can be easily extracted through singular value decomposition (SVD) of the representing matrices.

(4) A fuzzy kernel clustering method is applied, using the similarity metric defined above, to cluster RNA secondary structure ensembles. Our results suggest that the fuzzy kernel clustering method is highly promising for classifying RNA structure ensembles, because of its low computational complexity and high clustering accuracy.

(5) An integrated platform (RNACluster) is constructed to calculate and compare different distances between RNA secondary structures, and to identify clusters in RNA structure ensembles using a minimum spanning tree (MST) based clustering algorithm. RNACluster can be used in the analysis of RNA structure ensemble clustering, RNA conformational switches and non-coding RNA prediction.

(6) A universal algorithm for the compression of RNA secondary structure (RNACompress) is discussed. RNACompress employs an efficient context-free grammar based model to compress RNA sequences and their secondary structures simultaneously.

(7) A novel informational complexity measure for RNA secondary structure is presented, based on the notion of Kolmogorov complexity. A test comparing the activities of 11 distinct GTP-binding RNAs (aptamers) with their structural complexity shows that the informational complexity we define can be used to describe how complexity varies with activity.

(8) A survey of the concept of non-coding RNA and its computational prediction is given. A complete list of web resources for non-coding RNA analysis is compiled. Several future research directions and topics in this field are discussed.
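
As an illustration of two ideas summarized in the abstract, distances between RNA secondary structures and compression-based complexity, the following is a minimal Python sketch and not the thesis's implementation. It assumes structures are written in dot-bracket notation without pseudoknots. The base-pair distance used here is a standard metric chosen for illustration; the abstract does not say which distances RNACluster computes. Likewise, `zlib` is only a general-purpose stand-in for the grammar-based model of RNACompress, and the compressed size is a rough upper-bound proxy for Kolmogorov complexity rather than the measure defined in the thesis. All function names are hypothetical.

```python
import zlib


def parse_pairs(dot_bracket: str) -> set:
    """Return the set of base pairs (i, j), with i < j, in a dot-bracket string."""
    stack = []   # positions of '(' still waiting for a partner
    pairs = set()
    for pos, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.add((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r} at position {pos}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return pairs


def base_pair_distance(a: str, b: str) -> int:
    """Count the base pairs present in exactly one of the two structures."""
    if len(a) != len(b):
        raise ValueError("structures must have the same length")
    return len(parse_pairs(a) ^ parse_pairs(b))


def compressed_size(dot_bracket: str) -> int:
    """Size in bytes of the zlib-compressed structure string.

    Assumption: this is only a crude, computable proxy for Kolmogorov
    complexity, which itself is not computable.
    """
    return len(zlib.compress(dot_bracket.encode("ascii"), 9))


if __name__ == "__main__":
    hairpin = "((((....))))"
    shorter_stem = "(((......)))"
    print(base_pair_distance(hairpin, shorter_stem))  # 1: only pair (3, 8) differs
    print(compressed_size(hairpin))
```

A pairwise distance matrix built with `base_pair_distance` could then feed a clustering step such as the MST-based one described in contribution (5).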
Keywords/Search Tags: RNA, secondary structure, secondary structure prediction, structure comparison, clustering of structure ensemble, compression, informational complexity, non-coding RNA