
Computational Methods For Elucidating Function Of Long Non-coding RNA

Posted on: 2021-01-02  Degree: Doctor  Type: Dissertation
Country: China  Candidate: X N Fan  Full Text: PDF
GTID: 1520307100473954  Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Long non-coding RNAs (lncRNAs) play crucial roles in diverse biological processes and carry out their functions by interacting with other molecules (e.g., proteins, miRNAs, DNA). Mutations and dysregulation of lncRNAs are often associated with complex human diseases. Discovering lncRNA-protein interactions and lncRNA-disease associations can help to elucidate the functions of lncRNAs, their regulatory mechanisms, and their pathological mechanisms in complex diseases, and can guide more effective therapeutic intervention for those diseases. This thesis studies computational methods for elucidating the function of lncRNAs in depth. The main contributions are as follows:

1. To distinguish lncRNAs from protein-coding transcripts, we present a multimodal deep learning framework named lncRNA_Mdeep. It takes two input modalities (manually crafted features and raw transcript sequences), learns high-level abstract representations from them, and predicts the probability that a transcript is a lncRNA. lncRNA_Mdeep achieves 98.73% prediction accuracy in a 10-fold cross-validation test on human data. In an independent test on human data against eight other state-of-the-art methods, it reaches 93.12% prediction accuracy, which is 0.94% to 15.41% higher than those methods. In addition, results on 11 cross-species datasets show that lncRNA_Mdeep is a powerful predictor for identifying lncRNAs.

2. We propose LPI-BLS, a computational method that predicts lncRNA-protein interactions by using the broad learning system and a stacked ensemble classifier built with a logistic regression model. LPI-BLS first applies the broad learning system to predict lncRNA-protein interactions. The broad learning system is an alternative to learning in deep structures: it is a flat network with few parameters. The outputs of multiple individual broad learning systems are then fed into the stacked ensemble classifier, built with logistic regression, to further improve predictive performance. Compared with other state-of-the-art methods in a 5-fold cross-validation test, LPI-BLS has the best performance on the RPI488 and RPI7317 datasets. The results of the independent test also show that LPI-BLS can effectively predict lncRNA-protein interactions.

3. We develop the IDHI-MIRW method to identify potential lncRNA-disease associations by integrating diverse heterogeneous information sources with positive pointwise mutual information and the random walk with restart algorithm. IDHI-MIRW first constructs multiple lncRNA similarity networks and disease similarity networks from diverse lncRNA-related and disease-related datasets. It then runs random walk with restart on these similarity networks to extract topological features, which are fused with positive pointwise mutual information to build a large-scale lncRNA-disease heterogeneous network. Finally, IDHI-MIRW runs random walk with restart on the lncRNA-disease heterogeneous network to infer potential lncRNA-disease associations. Compared with other state-of-the-art methods, IDHI-MIRW achieves the best prediction performance. In case studies of breast cancer, stomach cancer, and colorectal cancer, 36 of 45 (80%) novel lncRNA-disease associations predicted by IDHI-MIRW are supported by recent literature. Furthermore, we find that the lncRNA LINC01816 is associated with the survival of colorectal cancer patients.

4. We develop the LDA-SSBRW method, based on bi-random walks on a heterogeneous network, to identify potential lncRNA-disease associations. LDA-SSBRW first fuses the lncRNA Gaussian interaction profile kernel similarity with a sequence similarity based on k-mer content to construct the lncRNA similarity network, and uses disease semantic similarity based on the Disease Ontology to construct the disease similarity network. It then builds a lncRNA-disease heterogeneous network by integrating the lncRNA similarity network, the disease similarity network, and known lncRNA-disease associations. Finally, LDA-SSBRW runs a tailored bi-random walk on the heterogeneous network to predict potential lncRNA-disease associations. In leave-one-out cross-validation (LOOCV), LDA-SSBRW outperforms other state-of-the-art methods. In addition, case studies on four diseases (melanoma, esophageal cancer, acute myeloid leukemia, and papillary thyroid carcinoma) indicate that LDA-SSBRW can effectively predict potential lncRNA-disease associations.
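
As a rough illustration of the two building blocks named in the IDHI-MIRW description, the sketch below runs a random walk with restart on a small similarity network and applies positive pointwise mutual information (PPMI) to a non-negative matrix. It is a minimal sketch under stated assumptions, not the thesis implementation: the function names, the restart probability of 0.5, the toy network, and the choice to apply PPMI to the stacked walk profiles are all illustrative, since the abstract does not specify these details.

```python
import math


def random_walk_with_restart(adjacency, seed, restart=0.5, tol=1e-12, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at `seed`.

    `adjacency` is a square list of lists of non-negative edge weights.
    The restart probability of 0.5 is an assumed default, not a thesis value.
    """
    n = len(adjacency)
    # Column-normalize: column j holds the transition probabilities out of node j.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    start = [0.0] * n
    start[seed] = 1.0
    current = start[:]
    for _ in range(max_iter):
        # p_next = (1 - r) * W * p + r * p_start
        updated = [
            (1.0 - restart) * sum(transition[i][j] * current[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        converged = sum(abs(a - b) for a, b in zip(updated, current)) < tol
        current = updated
        if converged:
            break
    return current


def ppmi(matrix):
    """Positive pointwise mutual information of a non-negative matrix."""
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    result = []
    for i, row in enumerate(matrix):
        out_row = []
        for j, value in enumerate(row):
            if value > 0:
                pmi = math.log(value * total / (row_sums[i] * col_sums[j]))
                out_row.append(max(pmi, 0.0))  # keep only positive associations
            else:
                out_row.append(0.0)
        result.append(out_row)
    return result


# Hypothetical toy similarity network: a path 0 - 1 - 2 - 3.
toy_network = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
# One walk per seed node gives a topological profile for every node.
profiles = [random_walk_with_restart(toy_network, seed) for seed in range(4)]
features = ppmi(profiles)
```

Each row of `profiles` is a probability distribution that decays with distance from its seed, and `features` keeps only the entries where a node is visited more often than the marginal visiting rates would predict.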
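
The two lncRNA similarity ingredients named in the LDA-SSBRW description can be sketched in the same spirit. The code below computes k-mer frequency vectors from a sequence and a Gaussian interaction profile kernel from binary association profiles, using the commonly used bandwidth normalization (the bandwidth parameter divided by the mean squared norm of the profiles). The function names, `k=3`, `gamma_prime=1.0`, and the toy inputs are assumptions for illustration; the abstract does not state how the thesis sets them or how the two similarities are fused.

```python
import math
from itertools import product


def kmer_frequencies(sequence, k=3):
    """Relative frequency of every k-mer over the alphabet A, C, G, U.

    Windows containing other symbols (e.g. N) are skipped. k=3 is an
    assumed value for illustration only.
    """
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = sequence.upper().replace("T", "U")  # accept DNA-style input
    windows = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
            windows += 1
    return [counts[m] / windows if windows else 0.0 for m in kmers]


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of `profiles`.

    Each row is the association profile of one lncRNA, for example a 0/1
    vector over diseases. K[i][j] = exp(-gamma * ||row_i - row_j||^2),
    with gamma = gamma_prime / mean squared row norm.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm else 0.0
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical inputs: one short sequence and three lncRNA-by-disease profiles.
sequence_features = kmer_frequencies("ACGACGUUAG", k=3)
association_profiles = [
    [1, 0],
    [1, 0],
    [0, 1],
]
lncrna_similarity = gip_kernel(association_profiles)
```

lncRNAs with identical association profiles receive a kernel value of 1, and the value falls toward 0 as their profiles diverge. A sequence similarity could then be derived from the k-mer vectors, for instance by a cosine or correlation measure, which is again an assumption rather than the documented choice.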
Keywords/Search Tags: Long non-coding RNAs, lncRNA-protein interactions, lncRNA-disease associations, deep learning, broad learning, heterogeneous network