Protein Phosphorylation And Hydroxylation Sites Prediction Algorithm Based On Mixed Deep Learning Model

Posted on: 2020-12-14 | Degree: Master | Type: Thesis
Country: China | Candidate: X Y Xu | Full Text: PDF
GTID: 2370330575998012 | Subject: Biology
Abstract/Summary:
As an important component of cells and tissues, proteins are the material basis of living organisms. Protein post-translational modification (PTM) is an important mechanism for regulating protein function, and there are many types of PTM. In-depth study and identification of PTMs plays an important role in understanding protein structure and function. With the emergence of large numbers of unannotated protein sequences in the post-genome era, the ability to identify post-translationally modified residue sites quickly, accurately, and efficiently is essential for basic research and drug development. It is also a core issue in the field, especially in the study of protein phosphorylation and hydroxylation. However, existing biological methods are expensive and time-consuming, and existing prediction models are simple and their accuracy is limited. In recent years, deep learning has made breakthroughs in image recognition, machine vision, and natural language processing, yet there are almost no deep learning algorithms for the study of protein post-translational modification. This thesis builds a hybrid deep neural network model and applies it to the prediction of protein phosphorylation and hydroxylation sites. The main work is as follows:

1. The current research status of PTM, an overview of phosphorylation and hydroxylation site prediction methods, and the status of deep learning research are reviewed.

2. A hybrid deep learning neural network model is constructed. Characteristic peptides are extracted from the protein sequence, and the 13-residue peptide centered on the PTM site is selected as the optimal peptide segment. The character sequence is then converted into a numerical matrix by one-hot encoding and used as the input of the neural network. High-dimensional features of the protein sequence are extracted by a convolutional neural network (CNN), the relationships between amino acid residues are extracted by a recurrent neural network (RNN), and the features of the two networks are combined to construct a hybrid CNN+RNN model for PTM prediction.

3. The hybrid deep learning neural network model is used to predict phosphorylation and hydroxylation modification sites. The model is trained, tested, and used for prediction on positive and negative protein sample data sets. For phosphorylation site prediction, we constructed a hybrid of a convolutional neural network and a bidirectional long short-term memory network (BLSTM). With this hybrid network, the lowest overall accuracy of phosphorylation site prediction on the data set is 0.914 and the lowest AUC value is 0.994. For hydroxylation site prediction, we constructed a hybrid of a convolutional neural network and a long short-term memory network (LSTM). With this hybrid network, the lowest overall accuracy of hydroxylation site prediction on the data set is 0.892 and the lowest AUC value is 0.96. Overall, the hybrid neural network model is superior to the existing prediction models.
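The input preparation described in item 2 (a 13-residue window centered on a candidate site, converted to a numerical matrix by one-hot encoding) can be illustrated with a minimal sketch. The details below are assumptions for illustration, not taken from the thesis: the 20-letter amino acid alphabet and its ordering, the padding symbol `X` for windows that run past the sequence ends, the choice of S/T/Y as candidate phosphorylation residues, and the function names `extract_windows` and `one_hot`.

```python
# Minimal sketch: 13-residue windows around candidate sites, one-hot encoded.
# All names and conventions here are illustrative assumptions.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabet: 20 standard residues
PAD = "X"                             # assumed padding symbol for sequence ends
WINDOW = 13                           # window length stated in the abstract


def extract_windows(sequence, targets="STY", window=WINDOW):
    """Return (position, peptide) pairs, one per candidate residue.

    Each peptide has length `window` and is centered on the candidate
    residue; positions beyond the sequence ends are filled with PAD.
    """
    half = window // 2
    padded = PAD * half + sequence + PAD * half
    windows = []
    for i, residue in enumerate(sequence):
        if residue in targets:
            # padded[i + half] is sequence[i], so this slice is centered on it
            windows.append((i, padded[i:i + window]))
    return windows


def one_hot(peptide):
    """Encode a peptide as a len(peptide) x 20 matrix of 0/1 values.

    Residues outside AMINO_ACIDS (including PAD) become all-zero rows.
    """
    index = {aa: k for k, aa in enumerate(AMINO_ACIDS)}
    matrix = []
    for residue in peptide:
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    for position, peptide in extract_windows("MKTAYIAKQRQISFVK"):
        encoded = one_hot(peptide)
        print(position, peptide, len(encoded), "x", len(encoded[0]))
```

Each window yields a 13 x 20 matrix, which is the kind of input a CNN+RNN model would consume. Using all-zero rows for padding is one common convention; a 21st "padding" channel would be an equally reasonable alternative.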
Keywords/Search Tags: protein post-translational modification, convolutional neural network, recurrent neural network, phosphorylation sites, hydroxylation sites, ROC curve, PR curve