
Prediction Of Protein-RNA Binding Sites Based On Graph Convolution Neural Network

Posted on: 2022-08-15
Degree: Master
Type: Thesis
Country: China
Candidate: H W Yang
Full Text: PDF
GTID: 2480306329498994
Subject: Computer technology
Abstract/Summary:
The interaction between proteins and RNA underlies many cellular regulation and gene expression processes. Many studies have shown that almost all proteins need to interact with RNA to function fully. Analyzing protein-RNA interactions therefore not only deepens our understanding of proteins but also provides effective ways to study other biological processes.

At present, there are two main approaches to identifying protein-RNA interactions: biological experiments and computational methods. Biological experiments use NMR and X-ray diffraction to identify binding sites. These methods are highly accurate, but they consume a great deal of manpower and material resources and are not suitable for large-scale research. Computational methods, in contrast, predict interactions by analyzing the binding patterns between different sites. Current computational methods mainly make predictions at the sequence level or for small fragments of a single sequence, and cannot predict specific amino acid-nucleotide binding pairs. To address this, this thesis makes the following contributions.

(1) Because existing data sets cannot meet our research needs, we use web crawler technology to build a new data set. Structural information for 2706 protein-RNA macromolecular complexes was collected by crawling the PDB database. After subsequent data processing, a total of 439 valid protein-RNA binding pairs were used to extract positive and negative samples and train the model.

(2) For RNA sequences, we use a new method to generate word-vector features based on 3-mer short sequences. This feature not only contains the context information of the sequence but also captures hidden dependencies within it.

(3) A prediction model based on a graph convolutional neural network (GCN) is proposed. For a given protein and RNA sequence, the model predicts the amino acid-nucleotide binding pairs between the two sequences and constructs the corresponding binding site map from these pairs.

With 10-fold cross-validation, the precision, recall and F1-score of our model on the independent test set are 0.814, 0.772 and 0.805, respectively. After replacing the GCN with a GAT (graph attention network), the precision, recall and F1-score of the model are 0.827, 0.798 and 0.813. Experiments show that the proposed model not only predicts binding sites effectively but also offers a new direction and new ideas for further study of protein-RNA interaction.
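The 3-mer step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not describe the actual tokenization or embedding procedure, so the overlapping stride-1 window, the function names `kmer_tokens` and `build_vocab`, and the example sequence are all assumptions made for illustration. The resulting token list is the kind of "sentence" that a word-vector model could be trained on.

```python
from typing import Dict, List


def kmer_tokens(seq: str, k: int = 3) -> List[str]:
    """Split an RNA sequence into overlapping k-mers.

    Assumption: a sliding window with stride 1; the thesis abstract
    does not state which windowing scheme was used.
    """
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(tokens: List[str]) -> Dict[str, int]:
    """Map each distinct k-mer to an integer id in order of first appearance."""
    vocab: Dict[str, int] = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


# Hypothetical example sequence, not taken from the thesis data set.
tokens = kmer_tokens("AUGGCUA")
vocab = build_vocab(tokens)
ids = [vocab[t] for t in tokens]
print(tokens)
print(ids)
```

Each 3-mer plays the role of a word, so the integer ids could index rows of an embedding matrix learned by any word-vector method. Sequences shorter than `k` yield an empty token list.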
Keywords/Search Tags: protein-RNA interaction prediction, graph convolution neural network, prediction of binding sites