Prediction Of Plant Long Noncoding RNAs Interactions With Proteins By Deep Learning

Posted on: 2022-09-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Jael Sanyanda Wekesa
GTID: 1480306341985929
Subject: Computer Application Technology
Abstract/Summary:
Identifying RNA-binding protein sites is critical to cell biology at the transcriptional, post-transcriptional, translational, and post-translational levels. Studies have revealed that lncRNAs exert regulatory effects on various biochemical pathways partly by interacting with DNA, RNA, and proteins. Notably, predicting the interactions between lncRNAs and proteins is essential for studying molecular mechanisms, understanding the pathogenesis of diseases, and deciphering lncRNA functions. Therefore, building a high-performing system for predicting lncRNA-protein interactions, and for the subsequent functional annotation of lncRNAs, is crucial for crop development and related research.

This dissertation investigates the intersection of plant genomics and deep learning across different forms of data. For versatility, and to explore different model design principles, biological information obtained through a variety of feature extraction and selection methods is used to develop lncRNA-protein interaction prediction algorithms. Function inference is then performed based on the interactions. Experiments were carried out on Arabidopsis thaliana and Zea mays datasets to verify the performance of the proposed methods. The central hypothesis is that lncRNAs with no known functions that interact with similar proteins may display similar functions, which can be learned from the analysis of their interaction partners. The main challenges include feature engineering and the interpretation of representations learned by deep learning models so that they align with knowledge of the target domain and genome.

Firstly, an efficient deep learning model based on optimal sequence features is proposed to predict interactions between lncRNAs and proteins. A recurrent neural network is applied to capture long-range contextual dependencies, since lncRNAs are characterized by long sequences. Feature selection using a recursive feature elimination algorithm is then employed to achieve optimal performance. The model achieved accuracies of 88.12% and 90.74% on the two plant species.

Secondly, a graph-based deep learning model that uses graph representation learning and structural features is proposed for the prediction of lncRNA-protein interactions. The model demonstrates the effectiveness of combining chaos game representation with graph attention, and obtained accuracies of 85.76% and 91.97%.

Thirdly, a multi-model ensemble deep learning method that integrates sequence and structural features and implements self-attention mechanisms is proposed to demonstrate scalability and interpretability in the prediction of lncRNA-protein interactions. The techniques employed yield high performance, with accuracies of 89.50% and 92.32% on the two plant species.

Lastly, a hybrid method that integrates a deep neural network with ensemble learning algorithms is proposed. The method predicts lncRNA-protein interactions and analyzes them for the functional annotation of lncRNAs. The experimental results show that sequence information alone produces reliable predictions of interaction partners, because lncRNA-protein interaction is largely influenced by sequence complementarity. Accuracies of 89.98% and 93.44% were achieved on the two plant species.

The key factors that influence the performance of deep learning-based prediction methods are investigated, demonstrating the research value of this dissertation for improving interaction prediction. The framework contains methods for the integrative analysis of large-scale lncRNA and protein data for interaction prediction and functional analysis. The proposed methods are anticipated to broaden our knowledge of plant lncRNA-protein interactions and to support lncRNA functional research.
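
The abstract names chaos game representation (CGR) as one of the sequence encodings but gives no detail. As a rough illustration of the general technique only, the sketch below maps an RNA sequence to CGR points and bins them into a frequency grid. The corner assignment, the grid resolution `k`, and the function names `cgr_points` and `cgr_grid` are assumptions made for this example; they are not taken from the dissertation.

```python
from collections import Counter

# Corner assignment is one common convention, assumed here for illustration.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "U": (1.0, 0.0)}


def cgr_points(seq):
    """Return the chaos game trajectory of an RNA (or DNA) sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper().replace("T", "U"):
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def cgr_grid(seq, k=2):
    """Bin CGR points into a 2^k x 2^k grid of relative frequencies."""
    size = 2 ** k
    counts = Counter()
    for x, y in cgr_points(seq):
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        counts[(row, col)] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on empty input
    return [[counts[(r, c)] / total for c in range(size)] for r in range(size)]


if __name__ == "__main__":
    for row in cgr_grid("AUGGCUACGUAGCUAGCUAGGCUA", k=2):
        print(["%.3f" % v for v in row])
```

Flattening the grid gives a fixed-length vector regardless of sequence length, which is one reason CGR-style encodings are convenient inputs for models that must handle long, variable-length lncRNA sequences.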
Keywords/Search Tags: Long non-coding RNA, Protein, Interaction, Deep learning, Prediction