| Nucleic acids are a unique class of biological macromolecules that play an important role in the activities of life. Over the past 30 years, efforts have been made to study the tertiary structure of nucleic acids in order to reveal their various biological functions. Compared with proteins, nucleic acids are more flexible and are easily affected by the cellular environment and ions, and the number of experimentally determined high-resolution RNA tertiary structures is very limited. It has therefore become increasingly important to predict unknown nucleic acid structures by computational methods that draw on existing nucleic acid structure data. Great progress has been made in methods for predicting the tertiary structure of linear RNA molecules, but prediction accuracy, especially for long RNAs, still needs to be improved. Moreover, there is no method for predicting the tertiary structure of circular RNA (CircRNA) or DNA, both of which are currently active research topics. In response to these problems, this paper improves and extends the 3dRNA method proposed by our research group. The specific research contents and results are as follows: 1) Improvement of the 3dRNA template library: 3dRNA is a template-based method for predicting the tertiary structure of RNA. Because its template library is severely short of templates, 3dRNA has difficulty predicting the tertiary structures of large, topologically complex RNAs. In this paper, we make full use of the RNA tertiary structures determined experimentally in recent years and design an algorithm that generates two new template libraries, 3dRNA_Lib1 and 3dRNA_Lib2, from the experimental structures in the PDB; the algorithm also monitors new RNA structures in the PDB monthly and automatically updates the two libraries. Unlike 3dRNA's original template library, 3dRNA_Oldlib, the new libraries retain homologous templates, modified bases and non-standard base pairs, which not only enriches the template library but also enables 3dRNA to better predict non-standard base pairs
in RNA. The results show that the new template libraries significantly improve the predictive ability of 3dRNA, especially for some large ribosomal RNAs. 2) Tertiary structure prediction of CircRNA: CircRNA is a class of non-coding RNA molecules that lack a 5' cap and a 3' poly(A) tail and form a closed circular structure through covalent bonds. However, no tertiary structure of a CircRNA has yet been determined experimentally, which hinders understanding and explaining the function of CircRNA. This paper extends 3dRNA to predict the tertiary structure of CircRNA, mainly by developing an algorithm that decomposes a CircRNA into its minimal secondary structure elements (SSEs) and a set of root node numbering rules that guide the assembly of root nodes and child nodes. Analysis of the assembled structures of 34 CircRNAs and their corresponding linear RNAs shows that the two structures differ markedly in some cases. We also found that CircRNA binds PKR preferentially compared with its homologous linear RNA, which is consistent with experiment and indirectly supports the effectiveness of 3dRNA in predicting CircRNA structure. 3) Tertiary structure prediction of DNA: Breakthrough progress has been made in RNA tertiary structure prediction. In contrast, there are few studies on DNA 3D structure prediction. This paper presents a template-based method, 3dDNA, for predicting the three-dimensional structure of DNA. Compared with 3dRNA, it makes the following improvements: both DNA templates and RNA templates are included in the template library; break loops are taken into account, so that 3dDNA can handle broken or multi-stranded DNA; and the template search algorithms have been redesigned to help each SSE find more suitable templates. Systematic tests showed that 3dDNA gave reliable predictions for the tertiary structures of single-stranded, double-stranded and multi-stranded DNA, with mean RMSDs of 3.13 Å, 2.74 Å
and 5.28 Å, respectively. 4) A major bottleneck in RNA tertiary structure prediction is finding a suitable scoring function to identify the model closest to the native structure among a large number of candidate models. Traditional inverse Boltzmann scoring functions leave limited room for improvement. Machine learning methods have recently attracted much attention, but their performance is still insufficient. In this paper, we propose GCNscore, a scoring function based on a graph convolutional network. Each candidate RNA three-dimensional structure is represented as an atomic-resolution graph with node features and edge features, and these features are then updated repeatedly by multiple message-passing layers of the graph convolutional network to capture global structural information. The results show that GCNscore is more accurate than the recent method ARES. |
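
The idea that stacked message-passing layers let local node and edge features capture global structure can be illustrated with a toy example. The sketch below is only an illustration under stated assumptions, not GCNscore itself: the scalar features, the averaging update rules, and the names `message_passing_layer` and `run_layers` are all hypothetical choices made for this example.

```python
# Toy message passing on an undirected graph with scalar node and edge features.
# The update rules and names are illustrative assumptions, not the GCNscore model.

def message_passing_layer(nodes, edges):
    """Run one round: update every edge from its endpoints, then every node.

    nodes: list of floats, one feature per node
    edges: dict mapping (i, j) to a float edge feature (undirected edge i-j)
    Returns new (nodes, edges); the inputs are not modified.
    """
    # Edge update: average the edge feature with its two endpoint features.
    new_edges = {
        (i, j): (e + nodes[i] + nodes[j]) / 3.0
        for (i, j), e in edges.items()
    }
    # Collect messages: neighbour feature plus the updated edge feature.
    incoming = {k: [] for k in range(len(nodes))}
    for (i, j), e in new_edges.items():
        incoming[i].append(nodes[j] + e)
        incoming[j].append(nodes[i] + e)
    # Node update: average the node's own feature with the mean message.
    new_nodes = []
    for k, h in enumerate(nodes):
        msgs = incoming[k]
        agg = sum(msgs) / len(msgs) if msgs else 0.0
        new_nodes.append(0.5 * (h + agg))
    return new_nodes, new_edges


def run_layers(nodes, edges, n_layers):
    """Apply n_layers rounds of message passing in sequence."""
    for _ in range(n_layers):
        nodes, edges = message_passing_layer(nodes, edges)
    return nodes, edges


# Path graph 0-1-2-3; only node 0 carries a signal at the start.
nodes0 = [1.0, 0.0, 0.0, 0.0]
edges0 = {(0, 1): 0.0, (1, 2): 0.0, (2, 3): 0.0}

after_one, _ = run_layers(nodes0, edges0, 1)
after_three, _ = run_layers(nodes0, edges0, 3)
print(after_one)
print(after_three)
```

In this toy graph, one layer carries the signal from node 0 only to its direct neighbour, while three layers are enough for it to reach node 3 at the far end of the path. Each additional layer widens the neighbourhood a node can see by one hop, which is the sense in which stacking layers moves from local to global information.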