| RNA (ribonucleic acid) is an important form of genetic material that exists widely in a variety of organisms and plays an important role in many life activities. The study of RNA structure is therefore the basis for analyzing the rules of gene expression, regulating gene expression, and even changing biological traits. In the past, researchers used biochemical methods to obtain the secondary structure of RNA directly, but these methods could not be widely adopted because of their low efficiency and hard-to-control cost. To meet the large demand for RNA secondary structure prediction, researchers turned to computational methods, among which the minimum free energy algorithm is the most widely used; it offers low cost and high efficiency when predicting the secondary structure of short sequences. However, because it relies on dynamic programming, its efficiency drops sharply as sequence length increases. The accuracy of this method is also limited, and it does not perform well on RNA secondary structures that contain pseudoknots. Many machine learning methods have since been applied to this field, but their results have not been satisfactory. Among previous studies on RNA secondary structure prediction, no solution simultaneously meets the requirements of high efficiency, high prediction accuracy, and the ability to predict pseudoknots. To improve accuracy and efficiency while achieving end-to-end prediction of RNA secondary structure, this paper applies deep learning and designs a method called ‘LTPConstraint’. The method uses neural networks built from Transformer, Generator, and other structures, and places constraint layers at the top of the network that constrain the predicted results according to the natural characteristics of RNA structure, which keeps the predictions consistent with valid RNA structures. Transfer learning is used to train LTPConstraint’s network: this paper sets the data of different families as the target domains, and a model suited to each target domain is trained by transferring the pre-trained model. To evaluate the model, it was tested on multiple RNA family databases with different sequence lengths and numbers of sequences, and five classical RNA secondary structure prediction models were tested on the same data as a control group. The results show that LTPConstraint achieves significant improvements in both prediction accuracy and stability. To test LTPConstraint’s ability to predict pseudoknots, a corresponding data set was used, with the ProbKnot and Knotty models predicting the same structures as a control group. The results show that LTPConstraint also achieves good accuracy and stability on structures containing pseudoknots, a substantial improvement over the other two models. To test the effect of transfer learning, a further control experiment was set up: models were trained on the target-domain data with conventional training and with transfer learning, respectively. The experimental results show that the model obtained with transfer learning has better prediction accuracy and stability, is more robust, and indirectly reduces the computational cost of training. All experiments demonstrate that LTPConstraint improves the accuracy, stability, and robustness of RNA secondary structure prediction. In the future, this method may become a mainstream prediction method. |
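
The idea of constraining predictions "according to the natural characteristics of RNA structure" can be illustrated with a small sketch. This is not the LTPConstraint constraint layer, whose details the abstract does not give. It is a minimal example under three assumptions commonly made about RNA secondary structure: only canonical and wobble pairs (A-U, G-C, G-U) are allowed, a hairpin loop must enclose at least three unpaired bases, and the pairing matrix is symmetric. The names `CANONICAL_PAIRS`, `MIN_LOOP`, `constraint_mask`, and `apply_constraints` are hypothetical.

```python
# Hypothetical illustration only; not the LTPConstraint implementation.

# Assumed set of allowed base pairs: Watson-Crick pairs plus the G-U wobble pair.
CANONICAL_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}

# Assumed minimum number of unpaired bases enclosed by a hairpin loop.
MIN_LOOP = 3


def constraint_mask(seq):
    """Return an n x n 0/1 matrix marking positions (i, j) that are allowed to pair."""
    n = len(seq)
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        # j starts far enough from i to leave at least MIN_LOOP bases in between.
        for j in range(i + MIN_LOOP + 1, n):
            if (seq[i], seq[j]) in CANONICAL_PAIRS:
                mask[i][j] = 1
                mask[j][i] = 1
    return mask


def apply_constraints(scores, seq):
    """Symmetrize raw pairing scores and zero out positions that cannot pair."""
    n = len(seq)
    mask = constraint_mask(seq)
    return [
        [0.5 * (scores[i][j] + scores[j][i]) * mask[i][j] for j in range(n)]
        for i in range(n)
    ]
```

The sketch only filters scores. It does not enforce that each base pairs with at most one partner, which a complete constraint layer would also need to handle.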