MiRNA-Disease Association Prediction Method Based On Meta-path

Posted on: 2022-03-12  Degree: Master  Type: Thesis
Country: China  Candidate: Y J Zheng  Full Text: PDF
GTID: 2504306602994919  Subject: Computer Science and Technology
Abstract/Summary:
With the development of genomics and bioinformatics, non-coding RNA (ncRNA) has gradually gained more attention from scientists. Researchers have discovered that gene sequences once dismissed as junk genes actually participate in many important biological processes. Because only a limited number of proteins can be targeted by drugs, scientists began to explore the association of ncRNA with human diseases. Many ncRNAs, especially miRNAs, have been confirmed to play an important role in the regulation of human disease genes. Whether miRNAs can serve as markers for disease diagnosis, treatment and prognosis has become a new research direction. Given the long cycle and high cost of biological experiments, and the growing volume of miRNA-related data, more and more researchers in recent years have turned to computational methods to predict miRNA-disease associations.

In this thesis, we propose a miRNA-disease association prediction method based on meta-paths, denoted MDPBMP. First, a miRNA-disease-gene heterogeneous information network is constructed from known miRNA-disease associations, disease-gene associations and miRNA functional similarities. Second, seven symmetric meta-paths are selected according to their different semantics. Then, the initial features of the miRNA, disease and gene nodes in the network are mapped into the same vector space, and the vector information carried by each meta-path instance is extracted. For each meta-path, a weighted summation over the meta-path instances that share the same starting node yields the vector information of that starting node on that meta-path. Next, the vectors obtained for the same node on different meta-paths are combined by weighted summation, and the dimensionality of the aggregated vector is reduced to obtain the final feature vector of the node. Finally, the feature vectors of miRNAs and diseases are used to calculate their association probability scores.

In five-fold cross-validation, MDPBMP obtained a higher AUC value than the three other prediction models it was compared with. In the case study, we listed the top 50 miRNAs predicted by MDPBMP to be associated with lung cancer, esophageal cancer, colon cancer and breast cancer; 49, 48, 49 and 50 of them were validated, respectively. It is worth noting that breast cancer was used to test the predictive ability of the model for new diseases.

Compared with existing meta-path-based prediction models, MDPBMP has two distinguishing features. First, it creates feature vectors for all nodes and updates the feature vector of the starting node by fusing the feature information of every node on the meta-path instance. This approach not only extracts the characteristic information of the miRNA and disease nodes themselves, but also effectively captures the information carried by intermediate nodes on the meta-path. Second, MDPBMP introduces disease-gene associations. Compared with association networks that contain only miRNAs and diseases, the resulting network offers more types of meta-paths to choose from, which helps to extract the structural features of the network and to predict associations between new diseases and miRNAs.

MDPBMP not only demonstrates good predictive performance, but also accurately predicts miRNAs associated with new diseases. Using its prediction results as a reference for biological experiments can effectively reduce the duration and cost of those experiments, which is important for the study of miRNA-disease associations.
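The aggregation steps described in the abstract (projection into a shared space, instance-level weighted summation, meta-path-level weighted summation, dimensionality reduction, scoring) can be illustrated with a small NumPy sketch. This is not the thesis implementation: the toy network, the four meta-paths used here (the thesis selects seven, which the abstract does not list), the mean-based instance encoder, the softmax attention weights, the random untrained parameters and the sigmoid-of-dot-product score are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, OUT_DIM = 8, 4  # shared and final dimensions (assumed values)

# Toy network (made-up data): 3 miRNAs, 2 diseases, 3 genes.
MD = np.array([[1, 0], [1, 1], [0, 1]])   # miRNA x disease associations
DG = np.array([[1, 1, 0], [0, 1, 1]])     # disease x gene associations
ADJ = {("m", "d"): MD, ("d", "m"): MD.T, ("d", "g"): DG, ("g", "d"): DG.T}

# Initial features differ in size per node type; project them to DIM.
raw = {"m": rng.normal(size=(3, 5)),
       "d": rng.normal(size=(2, 6)),
       "g": rng.normal(size=(3, 7))}
h = {t: x @ rng.normal(scale=0.1, size=(x.shape[1], DIM)) for t, x in raw.items()}

# Random, untrained parameters (a real model would learn these).
attn_instance = rng.normal(size=DIM)
attn_path = rng.normal(size=DIM)
W_out = rng.normal(scale=0.1, size=(DIM, OUT_DIM))


def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()


def instances(meta_path, start):
    """Enumerate instances of a meta-path such as 'mdm' starting at `start`."""
    paths = [[start]]
    for src, dst in zip(meta_path, meta_path[1:]):
        paths = [p + [int(n)] for p in paths
                 for n in np.flatnonzero(ADJ[src, dst][p[-1]])]
    return paths


def encode_instance(meta_path, path):
    """Instance vector = mean of all node vectors on the path (assumed encoder)."""
    return np.mean([h[t][i] for t, i in zip(meta_path, path)], axis=0)


def aggregate(vectors, attn):
    """Weighted sum of vectors with softmax weights derived from `attn`."""
    vectors = np.stack(vectors)
    weights = softmax(vectors @ attn)
    return weights @ vectors, weights


def embed(meta_paths, start):
    """Final node vector: instance-level, then meta-path-level aggregation."""
    # Assumes the start node has at least one instance on every meta-path.
    per_path = [aggregate([encode_instance(mp, p) for p in instances(mp, start)],
                          attn_instance)[0]
                for mp in meta_paths]
    fused, path_weights = aggregate(per_path, attn_path)
    return fused @ W_out, path_weights  # linear dimensionality reduction


def score(m, d):
    """Association probability as sigmoid of a dot product (assumed scorer)."""
    z_m, _ = embed(["mdm", "mdgdm"], m)
    z_d, _ = embed(["dmd", "dgd"], d)
    return 1.0 / (1.0 + np.exp(-(z_m @ z_d)))


print(instances("mdm", 0))
print(score(0, 1))
```

Because the parameters are random, the printed score carries no biological meaning; the sketch only shows how information from intermediate nodes (diseases and genes on the path) flows into the starting node's vector.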
Keywords/Search Tags: miRNA-Disease Association, Meta-Path, Association Prediction, Heterogeneous Information Network