DNA Methylation Prediction Model Based On Recurrent Neural Network And Its Fusion Method

Posted on: 2021-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: X C Zhou
GTID: 2370330623968614
Subject: Engineering
Abstract/Summary:
DNA methylation is an epigenetic mechanism involved in many important biological processes, such as cell proliferation, biological aging, and tumorigenesis. Studying DNA methylation is of great significance for gene expression regulation, disease prevention, and tumor recognition. Experimental methods for detecting DNA methylation status achieve high accuracy, but their high cost makes them difficult to apply on a large scale. Using machine learning models to predict DNA methylation has therefore become an important supplement to experimental methods. In recent years, with the development of deep learning, researchers have begun to use deep learning frameworks to study DNA methylation. Compared with traditional machine learning methods, deep learning can make full use of existing methylation databases and automatically learn latent methylation features from large amounts of data. At present, deep learning-based methylation prediction models such as the DeepCpG model and the methylation regression convolutional neural network model have achieved good results, but they still have shortcomings: they have difficulty extracting the sequential features of DNA sequences, and they perform poorly in some regions. In response to these problems, this thesis constructs three deep learning models that predict DNA methylation from local DNA sequences. Compared with existing models, they improve DNA methylation prediction performance. The specific work is as follows:

(1) Because the neurons within each layer of a convolutional neural network are independent of one another, it is difficult for such a network to make effective use of the sequential information in a DNA sequence. The neurons within a recurrent neural network layer, by contrast, are connected to one another and are therefore sensitive to sequence order. Based on this property, a DNA methylation prediction model based on a recurrent neural network was constructed. In a comparison experiment against the methylation regression convolutional neural network model on the same data set, the mean square error of methylation level regression for the recurrent neural network model dropped to 0.0361, and the accuracy of methylation status classification rose to 90.66%. The recurrent neural network model also achieves high prediction accuracy at low-methylation loci, which indicates that the features it extracts can make a greater contribution when studying methylation patterns in low-methylation regions.

(2) The recurrent neural network model performs well in low-methylation regions but falls short in high-methylation regions. To make up for this deficiency and further improve the model's classification and regression performance, this thesis uses feature fusion: the methylation regression convolutional neural network model, which performs well in hypermethylated regions, is fused with the recurrent neural network model to construct a feature fusion model. Compared with both the recurrent neural network model and the methylation regression convolutional neural network model, the trained feature fusion model performs better in both high- and low-methylation regions and achieves better predictions. Its mean square error for methylation level regression is reduced to 0.0305, and its methylation status classification accuracy is increased to 91.72%.

(3) The feature fusion model improves overall performance across the whole genome, but compared with the recurrent neural network model, the performance gap between regions widens, and prediction performance in the intergenic and open sea regions decreases. In response to this problem, this thesis proposes a multi-task learning method that divides the prediction task by DNA region and builds a multi-task learning model in shared-private mode. The private features of each region's methylation pattern are extracted by a private module, the public features of the genome-wide methylation pattern are extracted by a shared module, and the private and public features are fused within each task. The multi-task learning method retains what the methylation patterns of different regions have in common, reduces the influence of differences between regions, and improves the model's performance in every region. Finally, the mean square error of genome-wide methylation level regression is reduced to 0.247, and the classification accuracy of methylation status is improved to 93.16%.
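The data flow of the shared-private multi-task model described in (3) can be illustrated with a minimal sketch. This is not the thesis's implementation: the class name `SharedPrivateModel`, the region list, the hidden size, and the use of small dense layers with random, untrained weights in place of the recurrent and convolutional encoders are all assumptions made for illustration. Only the intergenic and open sea regions are named in the abstract; `promoter` is a hypothetical third region.

```python
import math
import random

BASES = "ACGT"
# Hypothetical region tasks; only "intergenic" and "open_sea" appear in the abstract.
REGIONS = ["promoter", "intergenic", "open_sea"]


def one_hot(seq):
    """Flatten a DNA sequence into a one-hot vector (4 values per base)."""
    vec = []
    for base in seq:
        vec.extend(1.0 if base == b else 0.0 for b in BASES)
    return vec


def make_linear(n_in, n_out, rng):
    """Random dense-layer weights; a stand-in for a trained sequence encoder."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def dense_tanh(weights, x):
    """Apply a dense layer followed by tanh."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


class SharedPrivateModel:
    """One shared encoder for all regions plus one private encoder and head per region."""

    def __init__(self, seq_len, hidden=8, seed=0):
        rng = random.Random(seed)
        self.n_in = 4 * seq_len
        self.shared = make_linear(self.n_in, hidden, rng)
        self.private = {r: make_linear(self.n_in, hidden, rng) for r in REGIONS}
        # Each task head reads the concatenated shared + private features.
        self.heads = {
            r: [rng.uniform(-0.1, 0.1) for _ in range(2 * hidden)] for r in REGIONS
        }

    def predict(self, seq, region):
        """Return a methylation level in (0, 1) for one sequence in one region."""
        x = one_hot(seq)
        if len(x) != self.n_in:
            raise ValueError("sequence length does not match the model input size")
        shared_feat = dense_tanh(self.shared, x)            # genome-wide features
        private_feat = dense_tanh(self.private[region], x)  # region-specific features
        fused = shared_feat + private_feat                  # fusion by concatenation
        logit = sum(w * f for w, f in zip(self.heads[region], fused))
        return 1.0 / (1.0 + math.exp(-logit))               # sigmoid output
```

For example, `SharedPrivateModel(seq_len=5).predict("ACGTA", "open_sea")` returns a value between 0 and 1. In a trained model of this shape, the shared encoder would receive gradients from every region's task while each private encoder would be updated only by its own region's examples, which is how a shared-private design keeps common patterns and regional differences apart.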
Keywords/Search Tags: DNA sequence, DNA methylation, recurrent neural network, feature fusion, multi-task learning