
Disease-related LncRNA Prediction Methods

Posted on: 2020-12-10
Degree: Master
Type: Thesis
Country: China
Candidate: X F Xiao
Full Text: PDF
GTID: 2370330623451442
Subject: Software engineering
Abstract/Summary:
Long noncoding RNAs (lncRNAs) are noncoding RNAs longer than 200 nucleotides. They can regulate their target genes at the post-transcriptional level and play key regulatory roles in many important biological processes, such as cell differentiation and chromatin remodeling. In recent years, it has become increasingly clear that lncRNAs play critical roles in many biological processes associated with human diseases. Inferring potential lncRNA-disease associations is essential for revealing disease mechanisms, developing novel drugs, and optimizing personalized treatments. However, biological experiments to validate lncRNA-disease associations are time-consuming and costly. Moreover, existing computational approaches face three difficulties: no negative samples are available for predicting disease-related lncRNAs, associations cannot be predicted for isolated lncRNAs or isolated diseases (those with no known associations), and prediction accuracy is limited. To address these problems, this thesis studies effective methods for predicting lncRNA-disease associations. The specific research work is as follows:

(1) This thesis proposes a method called BPLLDA, which predicts lncRNA-disease associations based on paths of limited length in a heterogeneous lncRNA-disease association network. Specifically, BPLLDA first constructs a heterogeneous network by integrating the lncRNA-disease association network, the lncRNA functional similarity network, and the disease semantic similarity network. It then infers the probability of an lncRNA-disease association from the paths connecting the two nodes and the lengths of those paths. Compared with existing methods, BPLLDA has two advantages: it does not require negative samples, and it can predict associations for novel lncRNAs or novel diseases. Leave-one-out cross-validation (LOOCV) is used to evaluate the predictive performance of BPLLDA. Under LOOCV, BPLLDA achieves an area under the receiver operating characteristic curve (AUC) of 0.87117, higher than those of the two compared methods. In addition, cervical cancer, glioma, and non-small-cell lung cancer were selected as case studies, for which the top five predicted lncRNA-disease associations were verified against recently published literature.

(2) This thesis proposes a method called ALSBMF, which predicts hidden lncRNA-disease associations using matrix factorization solved by alternating least squares. ALSBMF first decomposes the known lncRNA-disease association matrix into two feature matrices. It then defines an optimization function using disease semantic similarity, lncRNA functional similarity, and the known lncRNA-disease associations, and solves for the two optimal feature matrices by least squares. Finally, the two optimal feature matrices are multiplied to reconstruct a score matrix, which fills in the missing values of the original matrix and thereby predicts hidden lncRNA-disease associations. ALSBMF shares the advantages of BPLLDA: it does not require negative samples, and it can predict associations for novel lncRNAs or novel diseases. LOOCV and five-fold cross-validation are used to evaluate the prediction performance of ALSBMF. The AUCs are 0.9501 and 0.9215, respectively, which are higher than those of the existing methods compared. Colon cancer, kidney cancer, and liver cancer were selected as case studies. The top three predicted lncRNAs for each of these cancers were validated in the latest LncRNADisease database and the related literature.
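As an illustration of the path-based idea in (1), the sketch below scores one lncRNA-disease pair by enumerating simple paths of limited length in the heterogeneous network. It is a minimal sketch under stated assumptions, not the BPLLDA implementation: the function name `path_score`, the parameters `max_len` and `decay`, and the rule that each path contributes the product of its edge weights damped by `decay ** length` are all hypothetical choices made for this example, since the abstract does not give the exact scoring formula.

```python
import numpy as np

def path_score(A, S_l, S_d, lnc, dis, max_len=3, decay=0.5):
    """Score one lncRNA-disease pair from paths of at most max_len edges.

    A   : (n_l, n_d) known lncRNA-disease association matrix (0/1)
    S_l : (n_l, n_l) lncRNA functional similarity
    S_d : (n_d, n_d) disease semantic similarity
    Assumption: a path contributes (product of edge weights) * decay**length.
    """
    n_l = A.shape[0]
    # Heterogeneous network: lncRNA nodes first, then disease nodes.
    W = np.block([[S_l, A], [A.T, S_d]]).astype(float)
    src, dst = lnc, n_l + dis
    total = 0.0

    def dfs(node, weight, length, visited):
        nonlocal total
        if node == dst:
            # Reached the disease node: add this path's damped weight.
            total += weight * decay ** length
            return
        if length == max_len:
            return
        for nxt in np.nonzero(W[node])[0]:
            nxt = int(nxt)
            if nxt not in visited:  # simple paths only, no revisits
                dfs(nxt, weight * W[node, nxt], length + 1, visited | {nxt})

    dfs(src, 1.0, 0, {src})
    return total
```

Because the score only needs some path through the similarity edges, an lncRNA with no known associations can still receive a nonzero score for a disease, and no negative samples enter the computation.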
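The factorize, alternate, and reconstruct steps described in (2) can be sketched as follows. This is a minimal sketch of plain regularized alternating least squares, assuming a squared-error objective with an L2 penalty; it deliberately omits the disease semantic similarity and lncRNA functional similarity terms that the ALSBMF objective includes, because the abstract does not specify their form. The function name `als_complete` and the parameters `rank`, `lam`, and `iters` are hypothetical.

```python
import numpy as np

def als_complete(Y, rank=2, lam=0.1, iters=50, seed=0):
    """Reconstruct a score matrix U @ V.T from an association matrix Y.

    Minimises ||Y - U V^T||_F^2 + lam * (||U||_F^2 + ||V||_F^2)
    by alternating closed-form ridge-regression updates.
    Assumption: similarity terms of the real objective are left out.
    """
    rng = np.random.default_rng(seed)
    n_l, n_d = Y.shape
    U = rng.random((n_l, rank))   # lncRNA feature matrix
    V = rng.random((n_d, rank))   # disease feature matrix
    reg = lam * np.eye(rank)
    for _ in range(iters):
        # Fix V, solve for U:  U = Y V (V^T V + lam I)^-1
        U = np.linalg.solve(V.T @ V + reg, V.T @ Y.T).T
        # Fix U, solve for V:  V = Y^T U (U^T U + lam I)^-1
        V = np.linalg.solve(U.T @ U + reg, U.T @ Y).T
    return U @ V.T                # reconstructed score matrix
```

Each update is a linear least-squares problem with a closed-form solution, which is why alternating between the two factors is practical even though the joint problem is non-convex. Entries of the returned matrix at positions where `Y` is zero serve as prediction scores for unobserved pairs.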
Keywords/Search Tags: Disease similarity, LncRNA similarity, Path with limited length, Alternating least squares, Matrix factorization