
Prediction of Protein and RNA Contact Map Based on a Fully Convolutional Network

Posted on: 2021-04-17
Degree: Master
Type: Thesis
Country: China
Candidate: B T Jia
GTID: 2370330620972173
Subject: Computer technology
Abstract/Summary:
Protein-RNA interactions play a vital role in many biological processes. Gene regulation, RNA splicing, and RNA degradation all depend on protein-RNA binding sites, so analyzing these binding sites helps explain how proteins and RNA interact and is of great significance for understanding these processes. Binding sites can currently be determined by experimental methods such as X-ray crystallography, nuclear magnetic resonance, and CLIP-seq. These experiments identify the binding sites of protein-RNA complexes accurately, but they require substantial manpower, materials, and funding. As spatial structure data on protein-RNA complexes have accumulated, researchers have turned to computational methods that use these data to predict and analyze binding sites. Most existing methods analyze individual amino acid and nucleotide pairs in the sequences and ignore the influence of the entire sequence on the binding sites. In addition, the data sets these methods use are very small, so their predictions may be biased.

To address these problems, this thesis makes the following contributions: (1) The structural data of proteins and RNA molecules were re-collected; a total of 1130 protein-RNA complexes were gathered from the PDB database. (2) The position-specific scoring matrix (PSSM), which carries protein co-evolution information, and the one-hot encoding of RNA were used as data features. (3) The contact map was regarded as the label of the data. (4) A protein-RNA binding site map prediction model based on a fully convolutional network (PPRCM) was proposed.

On the test set, the precision, recall, and F1-score of the model were 67%, 56%, and 61%, respectively. Five-fold cross-validation gave average precision, recall, and F1-score of 64%, 53%, and 58%, respectively, which indicates that the model is stable and reliable. The model offers a new idea for the study of protein-RNA binding sites and gives researchers a new way to predict them.
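
To make the feature, label, and metric definitions above concrete, the following is a minimal Python sketch, not the thesis implementation. It shows one plausible way to one-hot encode an RNA sequence, derive a binary protein-RNA contact map from coordinates, and score a predicted map. The function names, the `ACGU` channel order, the one-point-per-residue representation, and the 5.0 angstrom distance cutoff are all assumptions for illustration; the abstract does not state how contacts were defined.

```python
import math

RNA_ALPHABET = "ACGU"  # assumed channel order for the one-hot encoding


def one_hot_rna(seq):
    """Return one 4-element row per nucleotide; unknown symbols become all zeros."""
    rows = []
    for base in seq.upper():
        row = [0] * len(RNA_ALPHABET)
        idx = RNA_ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def contact_map(protein_xyz, rna_xyz, cutoff=5.0):
    """Binary (protein length x RNA length) matrix of contacts.

    A cell is 1 when the residue-nucleotide distance is <= cutoff.
    Both the single representative point per residue and the cutoff
    value are assumptions, not values taken from the thesis.
    """
    return [
        [1 if math.dist(p, r) <= cutoff else 0 for r in rna_xyz]
        for p in protein_xyz
    ]


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def precision_recall_f1(pred, true):
    """Cell-wise precision, recall, and F1 for two binary maps of equal shape."""
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred, true):
        for p, t in zip(pred_row, true_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1:
                fp += 1
            elif t == 1:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)
```

As a consistency check on the reported figures, `f1_score(0.67, 0.56)` is about 0.61 and `f1_score(0.64, 0.53)` is about 0.58, which matches the F1-scores quoted for the test set and for five-fold cross-validation.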
Keywords/Search Tags:Fully convolutional network, protein and RNA contact map, PSSM, sequence encoding