
Prediction Of DNA Methylation Sites Based On Nucleotide Coding

Posted on: 2024-02-09
Degree: Master
Type: Thesis
Country: China
Candidate: W Z Fu
Full Text: PDF
GTID: 2530306935499504
Subject: Computer Science and Technology
Abstract/Summary:
As a major epigenetic modification, DNA methylation is involved in most life activities of organisms. Through the regulation of gene expression, it plays an important role in X-chromosome inactivation, genomic imprinting, aging, and cancer. The Human Genome Project has driven the rapid development of sequencing technology, which in turn has produced a large amount of unorganized DNA methylation site data. Accurate prediction of methylation sites in DNA sequences is therefore of great importance for the development of new drugs and the regulation of human biochemical traits.

Identifying methylation sites by experimental methods is complicated and time-consuming, and it is not feasible to detect methylation sites in large-scale sequences this way. Computational methods can identify methylation sites in large-scale sequences efficiently. They are of practical significance for reducing experimental costs and obtaining reliable localization, and they support further experimental studies of methylation.

This thesis focuses on the prediction of DNA methylation sites. It mainly studies feature extraction methods for DNA sequences, represents the characteristics of DNA sequences more comprehensively through feature fusion, and constructs machine learning classification models to achieve high-precision prediction of DNA methylation sites. The main work of this thesis is summarized as follows:

(1) A new method that combines traditional feature coding with deep learning is proposed for the classification of DNA methylation sites. First, the DNA sequence is converted into a machine-readable binary representation by one-hot encoding. Second, to account for the interdependence between nucleotides, the model is built from a convolutional neural network (CNN) and a long short-term memory (LSTM) network. Prediction performance is analyzed for models based on CNN and LSTM, and the experimental results show that the combination of CNN and LSTM performs best.

(2) A feature extraction method that combines spatial embedding features with a convolutional neural network is proposed for the classification of DNA methylation sites. First, k-mer segmentation is used to split the DNA sequence into words. Second, to account for the correlation between non-adjacent nucleotides in DNA sequences, a feature extraction method based on spatial embedding is proposed to capture the context information of nucleotide sequences. Finally, a classification model of DNA methylation sites is built on a convolutional neural network. The experimental results show that this model has better classification performance than other DNA methylation classification models.

(3) A method that extracts and fuses traditional features and abstract features of DNA sequences is proposed to construct a classification model of DNA methylation sites. First, the traditional coding methods one-hot, NCP, and EIIP are used to extract features from DNA sequences. Then, to account for the correlation between distant nucleotides in DNA sequences, the combined one-hot, NCP, and EIIP features are fused with word-embedding features to describe the information in DNA sequences comprehensively. Experimental results demonstrate that the proposed method further improves the accuracy of DNA methylation site prediction. The model is also tested on cross-species datasets, and the results show that it retains excellent classification ability when predicting DNA methylation sites in other species.
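The encodings named above (one-hot, NCP, EIIP, and k-mer segmentation) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names `encode_sequence` and `kmer_words` are hypothetical, the NCP triples and EIIP values are the ones commonly cited in the literature and may differ from the conventions used in the thesis, and per-position concatenation is only one possible way to combine the three codings.

```python
# Minimal sketch of the nucleotide codings named in the abstract.
# All names and the fusion-by-concatenation choice are illustrative assumptions.

# One-hot: one binary indicator per nucleotide.
ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
           "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}

# NCP (nucleotide chemical property), commonly cited triples:
# (ring structure, functional group, hydrogen bond strength).
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "T": (0, 0, 1)}

# EIIP (electron-ion interaction pseudopotential), commonly cited values.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def encode_sequence(seq):
    """Return one 8-dimensional row per nucleotide: one-hot + NCP + EIIP."""
    rows = []
    for base in seq.upper():
        # A KeyError here signals a symbol outside A/C/G/T (for example N).
        rows.append(list(ONE_HOT[base]) + list(NCP[base]) + [EIIP[base]])
    return rows


def kmer_words(seq, k=3, stride=1):
    """Split a sequence into overlapping k-mer 'words' for an embedding model."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


if __name__ == "__main__":
    features = encode_sequence("ACGT")
    print(len(features), len(features[0]))
    print(kmer_words("ACGTA", k=3))
```

The per-position rows could feed a CNN or LSTM directly, while the k-mer words would first be mapped to embedding vectors by a separately trained word-embedding model, which is outside this sketch.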
Keywords/Search Tags: DNA methylation site, deep learning, feature extraction, feature fusion, spatial embedding