
Representation-Learning-Based Algorithms for Predicting MicroRNA and Gene Relationships

Posted on: 2021-04-18
Degree: Master
Type: Thesis
Country: China
Candidate: W D Xie
Full Text: PDF
GTID: 2480306122464204
Subject: Computer technology
Abstract/Summary:
Recently, representation learning and deep learning methods have shown great promise in solving problems across different fields. In bioinformatics, they are widely used to discover potential relationships. Next-generation sequencing, together with large volumes of omics data, has given scientists the opportunity to explore potential interactions between biological molecules. MicroRNA (miRNA) is a type of small biological molecule that plays an important role in disease mechanisms and regulates genes in different ways. The interactions among miRNAs, genes and diseases are complicated. If the relationships between miRNAs and genes could be predicted precisely, researchers would better understand the mechanisms of complex diseases such as cancers and could discover potential cures or treatments. Therefore, we proposed two representation-learning-based miRNA-gene relationship prediction algorithms that combine the sequence information and topological information of miRNAs and genes. The main work of this thesis was as follows:

(1) For miRNA-gene relationship prediction, previous sequence-based methods have several drawbacks: accurate features are hard to acquire, existing or validated knowledge is hard to exploit fully, and the features are hard to fit to the input format of deep learning methods. To solve these problems, this thesis proposed a representation-learning-based algorithm named SG-LSTM that combines sequence and topological information to predict relationships between miRNAs and genes. On the one hand, Doc2vec was used to extract sequence information from miRNAs and genes. On the other hand, Role2vec was used to learn topological information. After these two embeddings were merged, validated relationships were used to construct the data set. Finally, to calculate a score for every pair, we introduced a deep learning method, LSTM, for model training and relationship prediction. Cross validation showed that the proposed algorithm achieved the highest AUC value among the compared methods. Its strong performance on large, small and imbalanced data sets indicated the robustness of the proposed method. A predictive ability vector was constructed to measure the predictive ability of our method. Meanwhile, the intersection between the predictions of our method and those of some traditional prediction methods indicated that our framework could be helpful in discovering potential miRNA-gene relationships.

(2) The problem of choosing negative samples has troubled researchers for a long time. Although some approaches, such as distance-based methods, can be used to select negative samples, they cannot avoid mislabeling some potential relationships as negative samples. To solve this problem, we proposed a representation-learning-based algorithm for generating artificial ("manual") negative samples, named GAN-NEG, for predicting relationships between miRNAs and genes. In this method, we first matched the miRNA seed region against the gene 3'-UTR region and filtered for biologically meaningful positive samples. Then, we introduced negative samples from other methods to enlarge our negative sample set. Finally, we used these negative samples to train a WGAN-GP model to generate a large number of artificial negative samples. Introducing artificial negative samples could reduce the number of samples needed for training. Cross validation indicated that the AUC value of this algorithm was further improved. A stronger predictive ability vector, together with a larger intersection with traditional prediction methods, indicated that the predictive ability of GAN-NEG was further enhanced.
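
The seed-matching filter mentioned for GAN-NEG can be illustrated with a minimal sketch. The abstract does not say how the seed region is defined or how strict the match is, so the code below makes assumptions: it uses the canonical 7-nucleotide seed (miRNA positions 2 to 8) and requires a perfect Watson-Crick complement in the 3'-UTR. The function names seed_site, find_seed_matches and filter_positive_pairs are hypothetical and are not taken from the thesis.

```python
# Sketch only: canonical 7mer seed (positions 2-8) and perfect complementarity
# are assumptions; the thesis abstract does not specify its matching rule.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna, start=2, end=8):
    """Return the 3'-UTR motif (5'->3', RNA alphabet) that pairs with the seed.

    start and end are 1-based, inclusive positions on the miRNA.
    """
    rna = mirna.upper().replace("T", "U")
    seed = rna[start - 1:end]
    # The target site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, utr):
    """Return 0-based start offsets of every perfect seed match in the 3'-UTR."""
    site = seed_site(mirna)
    target = utr.upper().replace("T", "U")
    hits = []
    pos = target.find(site)
    while pos != -1:
        hits.append(pos)
        pos = target.find(site, pos + 1)  # allow overlapping matches
    return hits


def filter_positive_pairs(pairs, mirnas, utrs):
    """Keep (miRNA id, gene id) pairs whose 3'-UTR has at least one seed match.

    mirnas and utrs are hypothetical dicts mapping ids to sequences.
    """
    return [(m, g) for m, g in pairs if find_seed_matches(mirnas[m], utrs[g])]
```

For example, calling filter_positive_pairs on a list of validated pairs with dictionaries of miRNA and 3'-UTR sequences would keep only the pairs that have a seed site. A looser rule (6mer seeds, G:U wobble pairs) would retain more pairs, so the choice of rule directly affects which positives survive the filter.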
Keywords/Search Tags:Representation Learning, LSTM, Generation of manual negative samples, microRNA-gene relations prediction