Statistical Analysis Of RNAs And Coaxial-stacking Prediction For RNA Pseudoknots Based On The 3D Structures

Posted on: 2023-04-18 | Degree: Master | Type: Thesis
Country: China | Candidate: Z H Guo | Full Text: PDF
GTID: 2568306779989149 | Subject: Computer Science and Technology
Abstract/Summary:
RNA performs a variety of functions in organisms, such as the transmission of genetic information and catalysis. Because structure largely determines a molecule's biological function, the exact spatial structure of an RNA must be known in order to understand its function accurately. Measuring RNA spatial structures experimentally is difficult, so computational methods are needed to complement experiments: an algorithm samples conformations, and a well-designed scoring function distinguishes the native state from the candidate structures. Several RNA tertiary structure prediction methods currently exist, such as 3dRNA and RNAComposer, which build on known structures, and the physics-based MC-Fold/MC-Sym, FARNA, and HiRE-RNA. A reliable statistical potential can guide both RNA folding and the selection of predicted candidate structures. Taken together, these advances in RNA structure modeling suggest that collecting statistics on RNA 3D structures is essential for predicting RNA tertiary structures.

Although an accurate scoring function based on the statistics of known RNA structures is a key component of successful RNA structure prediction and evaluation, few tools or web servers can be used directly for comprehensive statistical analysis of RNA 3D structures. In this work, we developed RNAStat, an integrated tool for computing statistics on RNA 3D structures. For a given set of RNA structures, RNAStat automatically calculates structural properties such as size and shape and shows their distributions. Based on the RNA structure annotation from DSSR, RNAStat provides statistical information on RNA secondary structure motifs, including canonical and non-canonical base pairs, stems, and various loops. In particular, RNAStat calculates the geometry of base pairing and base stacking by constructing a local coordinate system for each base. RNAStat also supplies the distribution of distances between any two atoms to help users build distance-based RNA statistical potentials. To test the usability of the tool, we built a non-redundant RNA 3D structure dataset and used it to carry out a comprehensive statistical analysis of RNA structures, which could help guide RNA structure modeling.

In addition, this thesis statistically analyzes the pseudoknots whose spatial structures have been determined experimentally. Several mainstream machine learning algorithms, including support vector machine (SVM), naive Bayes (BNB), logistic regression (LR), decision tree (DT), and random forest (RF), were trained on 11 features of pseudoknots to predict coaxial stacking in pseudoknots. The random forest model performed best, with a prediction accuracy of 0.829. We then used this model to predict whether coaxial stacking occurs in the pseudoknots of the PseudoBase++ database, which supports subsequent prediction of spatial structures that contain pseudoknots.
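
The abstract mentions distributions of distances between atoms as the raw material for distance-based statistical potentials. The following is a minimal sketch of how such a distribution could be tabulated. It is not RNAStat's implementation: the function names, the bin width, the distance cutoff, and the toy coordinates are all assumptions made for illustration.

```python
import math
from collections import Counter


def distance_histogram(coords, bin_width=1.0, max_dist=20.0):
    """Count pairwise distances (in angstroms) in fixed-width bins.

    coords is a list of (x, y, z) tuples. Pairs at or beyond max_dist
    are ignored. Both parameters are illustrative choices.
    """
    counts = Counter()
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            if d < max_dist:
                counts[int(d // bin_width)] += 1
    n_bins = int(math.ceil(max_dist / bin_width))
    return [counts.get(b, 0) for b in range(n_bins)]


def normalize(hist):
    """Turn raw bin counts into observed frequencies that sum to 1."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)


# Hypothetical atom coordinates, not taken from any real RNA structure.
atoms = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 6.5), (10.0, 0.0, 0.0)]
hist = distance_histogram(atoms, bin_width=1.0, max_dist=20.0)
freq = normalize(hist)
```

In a knowledge-based potential, observed frequencies like `freq` would typically be compared against a reference-state distribution; the choice of reference state is outside the scope of this sketch.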
Keywords/Search Tags:RNA spatial structure, Statistical analysis, RNA Pseudoknots, Coaxial stacking, Machine learning