Study On Atomic Multipole Moments Prediction For RNA Based On ARDGPR

Posted on: 2021-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Liu
GTID: 2404330611952001
Subject: Computer Science and Technology
Abstract/Summary:
RNA analysis is an important subject in modern analytical science and underpins the interpretation of RNA function and the study of the molecular mechanisms of disease. Determining RNA structure with traditional experimental methods is expensive, and those methods cannot observe and record the state of the molecule at each moment of a biological process. With the growth of interdisciplinary science, computing is increasingly applied in chemistry, biology, and related fields to solve problems that are difficult or even intractable for traditional experiments. Computational chemistry uses computer simulation to calculate molecular properties such as energy, dipole and quadrupole moments, vibrational frequencies, and reactivity, which gives researchers more chemical information and compensates for the shortcomings of traditional experiments.

Computational chemistry uses force fields to calculate and study the secondary structure of RNA. A force field is less accurate than quantum mechanics, but it is fast and cheap. The non-bonded interaction energies within the molecule, which comprise the interatomic electrostatic interaction energy and the van der Waals interaction, play an important role in stabilizing RNA structure, and the electrostatic term is the dominant one. Computing the interatomic electrostatic interaction from atomic multipole moments improves the force field and the accuracy of its results, and so yields more reliable RNA structure predictions. Because calculating atomic multipole moments by the traditional method is very time-consuming, this thesis introduces machine learning to predict the atomic multipole moments in RNA molecules.

The main work of this thesis is to predict the atomic multipole moments of pentose molecules in RNA by machine learning. First, 5000 pentose molecules were obtained: 300 RNA molecules were downloaded at random from the PDB database, and small pentose fragments were cut from them and saturated. The atomic multipole moments of the target atom in each saturated pentose molecule were then calculated, and pentose molecules for which the calculation failed were discarded. The input features of the target atom were built with the atomic local frame (ALF), and the experimental data set was constructed.

Second, Chapter 3 studies the Gaussian process regression model with an anisotropic kernel (ARDGPR) and compares it with four other prediction models: Gaussian process regression with an isotropic kernel (GPR), the generalized regression neural network (GRNN), the radial basis function neural network (RBFNN), and Bagging. The anisotropic kernel of ARDGPR is obtained by adding automatic relevance determination (ARD) to the GPR kernel. The experimental results show that ARDGPR achieves the highest prediction accuracy, followed by Bagging, while RBFNN and GRNN have the lowest. The comparison between ARDGPR and GPR also shows that ARDGPR better describes the relationship between the data features and the prediction targets, which verifies that the ALF coordinate system embeds orientation information in the data when the input features of the target atom are constructed. Although ARDGPR is much more accurate than GPR, it also takes longer to compute. Analysis of the ARDGPR prediction results identifies the reasons for this increase in computation time.

In Chapter 4, drawing on knowledge of atomic fields and atomic characteristics from computational chemistry, the feature dimension of the data set for non-hydrogen atoms is reduced from 75 to 30. The experimental results show that the ARDGPR model performs stably and well on the reduced data set, and that its prediction accuracy improves further. The reduced feature set both improves the prediction accuracy of ARDGPR and greatly reduces its computation time, which demonstrates the effectiveness of the approach. The results also show that ARDGPR, whose anisotropic kernel is formed by adding automatic relevance determination, is better suited to predicting atomic multipole moments.
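
The step from an isotropic GPR kernel to the anisotropic ARD kernel described in the abstract can be illustrated with a minimal sketch. It assumes a squared-exponential kernel, since the abstract does not name the kernel family, and it fixes the length scales by hand, whereas ARD normally learns them from the data (typically by maximizing the marginal likelihood). All function names and the toy data are hypothetical and are not taken from the thesis.

```python
import math


def ard_rbf(x, y, length_scales, variance=1.0):
    """Squared-exponential kernel with one length scale per input dimension."""
    sq_dist = sum(((a - b) / l) ** 2 for a, b, l in zip(x, y, length_scales))
    return variance * math.exp(-0.5 * sq_dist)


def iso_rbf(x, y, length_scale, variance=1.0):
    """Isotropic special case: one shared length scale for all dimensions."""
    return ard_rbf(x, y, [length_scale] * len(x), variance)


def solve_spd(a, b):
    """Solve a @ x = b for a symmetric positive-definite matrix via Cholesky."""
    n = len(b)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    z = [0.0] * n
    for i in range(n):  # forward substitution: low @ z = b
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: low.T @ x = z
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def gp_mean(X, y, x_new, length_scales, noise=1e-6):
    """Posterior mean of a zero-mean GP at a single new point."""
    K = [[ard_rbf(a, b, length_scales) + (noise if i == j else 0.0)
          for j, b in enumerate(X)] for i, a in enumerate(X)]
    alpha = solve_spd(K, y)
    return sum(ard_rbf(x_new, a, length_scales) * w for a, w in zip(X, alpha))


# Toy data (hypothetical): the target equals feature 0; feature 1 is irrelevant.
X_train = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
y_train = [0.0, 1.0, 2.0]
x_test = [1.0, 5.0]  # same feature 0 as the second sample, different feature 1

# Hand-picked length scales; a real ARD model would fit these to the data.
pred_iso = gp_mean(X_train, y_train, x_test, [1.0, 1.0])
pred_ard = gp_mean(X_train, y_train, x_test, [1.0, 1e6])
print(f"isotropic: {pred_iso:.4f}, ARD: {pred_ard:.4f}")
```

With a very long length scale on the irrelevant feature, the ARD kernel effectively ignores it and the prediction stays close to the training value 1. The isotropic kernel treats the test point as far from all training data and falls back toward the prior mean of 0. The price is one hyperparameter per input dimension, which is consistent with the abstract's observation that ARDGPR costs more computation time than GPR and that reducing the feature dimension lowers that cost.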
Keywords/Search Tags: RNA, atomic multipole moments, force field, Gaussian process regression, automatic relevance determination