Genome-wide RNA-binding Proteins Identification Based On Evolutionary Deep Convolutional Neural Network

Posted on: 2022-11-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y W Wang
Full Text: PDF
GTID: 2480306758980239
Subject: Computer Science and Technology
Abstract/Summary:
RNA-binding proteins (RBPs) are a class of proteins involved in RNA regulation and metabolism, and they play important roles in RNA maturation, transport, localization and translation. However, experimental genome-wide methods for detecting RNA binding are costly and time-consuming, so there is an urgent need for fast, efficient methods that predict RBP binding sites from sequence patterns learned from existing annotations. The rapid development of crosslinking immunoprecipitation combined with high-throughput sequencing (CLIP-seq) has produced a large amount of data on interactions between RNA molecules and RNA-binding proteins, which makes data-driven prediction of RBP binding possible. Computational methods for genome-wide detection of RNA-binding events have recently been proposed, but they usually suffer from limitations such as high dimensionality, data sparsity, and low model performance. How to represent RBP feature information effectively and how to design efficient computational methods for identifying RBPs therefore remain challenging research problems.

The main contribution of this thesis is to improve the traditional deep neural network in two respects, model optimization and richer feature representation. To this end, it designs an evolutionary deep neural network and a multi-code ensemble deep neural network to improve the accuracy of RBP identification.

Deep convolutional neural networks are well suited to high-dimensional sparse data. To further improve their performance, we propose an evolutionary deep convolutional neural network (EDCNN), which enhances a traditional deep convolutional neural network by letting evolutionary optimization work in coordination with gradient descent to identify protein-RNA interactions. EDCNN combines evolutionary algorithms and different gradient descent algorithms in a complementary scheme, in which gradient descent steps and evolution steps alternately optimize RNA-binding protein recognition performance. To validate EDCNN, we conduct experiments on two large-scale CLIP-seq datasets, and the results show that EDCNN outperforms other state-of-the-art methods. We also verify the effectiveness of the algorithm from multiple perspectives, including time complexity analysis, parameter analysis and motif analysis.

Furthermore, we design a multi-code ensemble deep neural network (MCEDNN) to improve the accuracy of identifying RBPs. The RNA sequence is first converted into multiple encoding representations, a different feature extractor is designed for each representation, the learned high-level features are then aggregated, and finally a multi-layer perceptron is used to identify RBP binding sites. We conduct multiple sets of experiments on 55 large-scale RBP datasets to verify the effectiveness of the algorithm.
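
The alternation of gradient descent steps and evolution steps described for EDCNN can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the thesis: a toy quadratic loss stands in for the network's training loss, the evolution step is a simple Gaussian mutation with elitist selection, and the names `gradient_step`, `evolution_step` and `alternate_optimize` are hypothetical.

```python
import random


def loss(w):
    # Toy quadratic objective standing in for a network's training loss
    # (assumption; the thesis trains a convolutional network instead).
    return sum((wi - 3.0) ** 2 for wi in w)


def grad(w):
    # Analytic gradient of the toy loss.
    return [2.0 * (wi - 3.0) for wi in w]


def gradient_step(w, lr=0.1):
    # One plain gradient descent update on a single parameter vector.
    return [wi - lr * gi for wi, gi in zip(w, grad(w))]


def evolution_step(population, rng, sigma=0.1):
    # Mutate every individual with Gaussian noise, then keep the best
    # individuals among parents and children (elitist selection).
    children = [[wi + rng.gauss(0.0, sigma) for wi in w] for w in population]
    merged = sorted(population + children, key=loss)
    return merged[:len(population)]


def alternate_optimize(pop_size=4, dim=3, rounds=20, grad_steps=5, seed=0):
    rng = random.Random(seed)
    population = [[rng.uniform(-5.0, 5.0) for _ in range(dim)]
                  for _ in range(pop_size)]
    for _ in range(rounds):
        # Gradient phase: every individual takes a few descent steps.
        for _ in range(grad_steps):
            population = [gradient_step(w) for w in population]
        # Evolution phase: mutation and selection across the population.
        population = evolution_step(population, rng)
    return min(population, key=loss)


best = alternate_optimize()
print(loss(best))
```

In this sketch the gradient phase refines each candidate locally, while the evolution phase explores nearby parameter settings and discards the worst candidates; how the thesis schedules and combines the two phases may differ.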
Keywords/Search Tags: RNA sequence, RNA binding protein identification, convolutional neural network, deep learning, evolutionary algorithm