
Prediction Of Enhancer-Promoter Interactions Based On Deep Learning

Posted on: 2021-05-26
Degree: Master
Type: Thesis
Country: China
Candidate: C Y Hong
GTID: 2480306017972839
Subject: Computer technology
Abstract/Summary:
As important gene regulatory elements, enhancers and promoters regulate gene expression. Studying the patterns of enhancer-promoter interactions (EPIs) is therefore of great significance for human development. However, identifying EPIs through biological experiments is time-consuming and laborious, so many researchers have turned to computational methods. In recent years, the rapid development of gene sequencing technology has provided a large amount of biological sequence data, which makes it possible to predict EPIs by analyzing gene sequences. Even so, existing methods leave considerable room for improvement.

In this thesis, we combine gene sequence analysis with deep learning techniques from natural language processing and propose two new deep learning models from two perspectives: sequence interaction information analysis and sequence structure information analysis. The main research contents are summarized as follows.

First, from the perspective of sequence interaction information analysis, we construct a deep learning model, MatchEPI. The model obtains the mutual information between the two sequences by following the idea of making them meet as early as possible. Experiments on six cell lines show that MatchEPI performs better than existing models.

Second, from the perspective of sequence structure information analysis, we construct another deep learning model, EPIVAN. This model uses pre-trained DNA vectors to encode the gene sequences and uses an attention mechanism to focus on key features. EPIVAN also uses a pre-training strategy to strengthen its learning of cell line-specific and cell line-common features. This model also outperforms existing models on six cell lines. In the experiments, we analyze the contribution of the pre-trained DNA vectors and the attention mechanism to the model. In addition, we show that the pre-trained model, EPIVAN-general, is transferable and can serve as a starting point for training on new cell lines in the future.

Both models proposed in this thesis achieve excellent performance. They show that there is an interaction pattern between promoter and enhancer sequences, and that these two elements carry specific information in their sequence structure (such as cell line-specific features and cell line-common features). We hope that this work can provide new ideas for research in this field.
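The abstract does not say how sequences are tokenized or how the attention mechanism is computed, so the following is only an illustrative sketch of the general idea of encoding a DNA sequence with per-token vectors and pooling them with attention. It assumes overlapping k-mers as tokens, randomly generated vectors standing in for the pre-trained DNA vectors, and simple dot-product attention against a fixed query. All function names and parameters are hypothetical and are not taken from EPIVAN.

```python
import math
import random
from itertools import product


def kmers(seq, k=3):
    """Split a DNA string into overlapping k-mers (stride 1). Assumed tokenization."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def make_toy_embeddings(k=3, dim=4, seed=0):
    """Random vectors standing in for pre-trained DNA vectors (one per k-mer)."""
    rng = random.Random(seed)
    return {
        "".join(p): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for p in product("ACGT", repeat=k)
    }


def attention_pool(vectors, query):
    """Dot-product attention: softmax-weight each vector by its score against the query."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    pooled = [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]
    return pooled, weights


table = make_toy_embeddings()
tokens = kmers("ACGTAC")
vectors = [table[t] for t in tokens]
query = [0.5, 0.5, 0.5, 0.5]  # hypothetical query; a trained model would learn this
pooled, weights = attention_pool(vectors, query)
```

In a trained model the embedding table would be loaded from pre-trained vectors and the query (or a richer scoring function) would be learned, so the attention weights would indicate which positions the model treats as key features.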
Keywords/Search Tags: Enhancer-promoter interactions, Sequence analysis, Deep learning