
Research On Long Non-Coding RNA-Protein Interaction And Disease Association Algorithms

Posted on: 2020-10-17
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhang
Full Text: PDF
GTID: 2370330575481223
Subject: Computer technology
Abstract/Summary:
Long non-coding RNAs (lncRNAs) are RNAs longer than 200 bases that can fold into stable spatial structures but cannot encode proteins. Benefiting from the development of bioinformatics and next-generation sequencing technology, research on lncRNAs has attracted increasing attention in recent years. Exploring the interactions between lncRNAs and proteins, and the associations between lncRNAs and diseases, is the main way to infer the functions of lncRNAs and to study them further. The prediction of lncRNA-protein interactions is still at a preliminary stage. LncRNAs play an important role in many biological pathways and molecular functions, and they are closely related to the occurrence and development of many diseases. They achieve their biological functions by interacting with biological macromolecules, and interaction with proteins is one of their most important molecular mechanisms. The development and application of network science in bioinformatics has also contributed to the in-depth exploration of lncRNAs.

Many methods have been proposed to predict lncRNA-protein interactions. They fall into two main kinds: the first comprises computational methods based on intrinsic features, namely sequence, structure and physicochemical property information; the second comprises computational methods based on network analysis with external associations. On the one hand, this reflects the strong interest in, and the importance of, the interactions between lncRNAs and proteins; on the other hand, it shows the broad prospects of machine learning and network science in bioinformatics.

In the first part of this study, we systematically analyzed and compared the computational models for predicting lncRNA-protein interactions based on machine learning and on network analysis, and summarized the advantages, disadvantages and scope of application of these two kinds of methods. This part not only shows the current research progress on lncRNA-protein interactions, but also helps users choose an appropriate method for a given dataset and ultimately obtain more reliable interaction predictions.

In the second part of this study, we propose a novel computational method for predicting lncRNA-protein interactions. We construct a heterogeneous network and apply the DeepWalk network representation learning algorithm to mine the external associations of molecules and the topological structure information among the biological molecules in the network, and then build a variety of classifier models to predict lncRNA-protein interactions. Compared with other algorithms for predicting lncRNA-protein interactions, our algorithm performs well.

In addition, the number of lncRNA-disease associations recorded in public databases is far smaller than the number of identified lncRNAs, so disease-related data are scarce. Therefore, in the third part of this study, we constructed a prediction model of lncRNA-disease associations using topological similarity in a heterogeneous network. First, a heterogeneous network is constructed from lncRNA-microRNA interactions, known lncRNA-disease associations, microRNA-disease associations, lncRNA-lncRNA interactions and disease-disease interactions. Vector representations of the nodes are then obtained with the DeepWalk algorithm, and the topological similarity between them is used to predict associations between lncRNAs and diseases through a rule-based inference method. Using 10-fold cross-validation, we compared the proposed method with RWRHLD and RWRlncD, which are network-based models for predicting lncRNA-disease associations, and further validated the predicted results by text mining. The proposed method achieved satisfactory performance and outperformed both RWRHLD and RWRlncD.

In summary, we review and discuss the computational methods for predicting lncRNA-protein interactions, and propose a new method that mines hidden topological information in a heterogeneous network to extract features of the external associations of molecules and then builds a machine learning model to predict lncRNA-protein interactions. Moreover, a heterogeneous network containing known lncRNA-disease associations is constructed to predict associations between lncRNAs and diseases based on association rules of topological similarity. This work can help to infer the functions of lncRNAs and to promote lncRNA research more broadly.
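The lncRNA-disease pipeline summarized above (heterogeneous network, DeepWalk-style node vectors, topological similarity) can be illustrated with a minimal sketch. It is not the thesis implementation: the toy network and every node name are hypothetical, plain co-occurrence count vectors stand in for the skip-gram embeddings that DeepWalk would train on the walks, and ranking by cosine similarity replaces the rule-based inference step, whose details the abstract does not give.

```python
import math
import random
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency lists; neighbours are sorted so runs are reproducible."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return {node: sorted(nbrs) for node, nbrs in adj.items()}


def random_walks(adj, num_walks, walk_length, rng):
    """Truncated uniform random walks from every node (DeepWalk's sampling step)."""
    walks = []
    for _ in range(num_walks):
        for start in sorted(adj):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(adj[walk[-1]]))
            walks.append(walk)
    return walks


def context_vectors(walks, window):
    """Count how often two nodes fall within `window` steps of each other.

    Assumption: these sparse count vectors replace the skip-gram
    embeddings that DeepWalk would normally learn from the same walks.
    """
    vecs = defaultdict(lambda: defaultdict(int))
    for walk in walks:
        for i, u in enumerate(walk):
            for v in walk[i + 1 : i + 1 + window]:
                if u != v:
                    vecs[u][v] += 1
                    vecs[v][u] += 1
    return vecs


def cosine(a, b):
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[node] for node, count in a.items() if node in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidates(edges, lncrnas, diseases, known,
                    num_walks=20, walk_length=10, window=2, seed=0):
    """Score every lncRNA-disease pair not already known, highest similarity first."""
    rng = random.Random(seed)
    adj = build_adjacency(edges)
    vecs = context_vectors(random_walks(adj, num_walks, walk_length, rng), window)
    scored = [
        (cosine(vecs[lnc], vecs[dis]), lnc, dis)
        for lnc in lncrnas
        for dis in diseases
        if (lnc, dis) not in known
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy network using the five edge types named in the abstract.
EDGES = [
    ("lnc:A", "mir:1"), ("lnc:B", "mir:1"), ("lnc:C", "mir:2"),  # lncRNA-microRNA
    ("mir:1", "dis:X"), ("mir:2", "dis:Y"),                      # microRNA-disease
    ("lnc:A", "dis:X"), ("lnc:C", "dis:Y"),                      # known lncRNA-disease
    ("lnc:A", "lnc:B"),                                          # lncRNA-lncRNA
    ("dis:X", "dis:Y"),                                          # disease-disease
]
KNOWN = {("lnc:A", "dis:X"), ("lnc:C", "dis:Y")}

ranking = rank_candidates(EDGES, ["lnc:A", "lnc:B", "lnc:C"], ["dis:X", "dis:Y"], KNOWN)
for score, lnc, dis in ranking:
    print(f"{lnc} - {dis}: {score:.3f}")
```

In a fuller experiment the count vectors would presumably be replaced by trained embeddings, and the ranking would be evaluated by hiding known associations fold by fold, as in the 10-fold cross-validation described above.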
Keywords/Search Tags:lncRNA-protein interactions, lncRNA-disease associations, network science, machine learning, predictions, performance comparison