Study On Explainable Deep Learning Model For Predicting DNA-Binding Protein

Posted on:2019-06-21

Degree:Master

Type:Thesis

Country:China

Candidate:S Liu

Full Text:PDF

GTID:2370330626952104

Subject:Computer technology

Abstract/Summary:

DNA-binding proteins (DBPs) play key roles in cellular activities through their interactions with DNA. Deep-learning networks can currently predict DNA-binding proteins from their primary sequences with high accuracy. However, because of the large amount of data passed between layers and the different processing performed by the different functional layers of a deep-learning model, the trained models and their results are difficult to explain. We design the following experiments to address these problems:

(1) Simplified feature engineering, sequence processing modules and classifier. We process the primary sequence of a DNA-binding protein with one-hot encoding and use a one-dimensional convolutional layer with multiple convolution kernels as the first processing layer. We then transpose the result and convolve it with a single kernel of size one. Logistic regression classifies the output of the sequence processing. Because sequence processing and classification are both completely linear, the successive operations can be merged directly, which yields a single convolution kernel that is the weighted sum of the kernels used in the convolution. The cross-validation accuracy of this model ranges from 80.5% to 86.6%. We used this model to identify similarities in the expression of some amino acids across DNA-binding proteins.

(2) Using explainable subnetworks. We replace the logistic regression with explainable subnetworks. This approach uses a multi-layer perceptron to approximate an arbitrary rational function to a given precision and introduces one subnetwork per feature to learn that feature's nonlinear contribution. With the explainable subnetworks we obtained a set of convolution kernels together with their nonlinear contributions, and we conclude that the nonlinear model converges more quickly and more accurately.

(3) Generation model based on a convolutional discriminator. We design a score-increasing sequence generation algorithm that generates sequences recognized by existing classification models. We then attempt to evaluate the generated primary structure sequences of DNA-binding proteins.
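The linear-folding argument in step (1) can be illustrated with a small NumPy sketch. This is not the thesis implementation: the toy sequence, the kernel count and width, the random weights, and the names `one_hot`, `conv1d`, `mix`, `lr_w` and `lr_b` are all assumptions made for illustration, and the size-one convolution is read here as a weighted sum over the kernel channels. Under those assumptions, the sketch checks that the layer-by-layer score equals the score from one folded, weighted-sum kernel.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(seq):
    """Encode a protein sequence as a (length, 20) one-hot matrix."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    x = np.zeros((len(seq), len(AMINO_ACIDS)))
    for pos, aa in enumerate(seq):
        x[pos, index[aa]] = 1.0
    return x


def conv1d(x, kernels):
    """Valid 1-D convolution without bias.

    x has shape (L, 20), kernels has shape (K, w, 20);
    the result has shape (L - w + 1, K).
    """
    n_kernels, width, _ = kernels.shape
    n_pos = x.shape[0] - width + 1
    out = np.zeros((n_pos, n_kernels))
    for p in range(n_pos):
        window = x[p:p + width]  # (w, 20) slice of the sequence
        out[p] = np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
    return out


rng = np.random.default_rng(0)
seq = "MKRLSTAVGHEDKRWYNQ"            # hypothetical toy sequence
x = one_hot(seq)

K, w = 4, 5                           # assumed kernel count and width
kernels = rng.normal(size=(K, w, 20))  # first-layer convolution kernels
mix = rng.normal(size=K)              # size-one kernel mixing the K channels
n_pos = len(seq) - w + 1
lr_w = rng.normal(size=n_pos)         # logistic-regression weights
lr_b = 0.1                            # logistic-regression bias

# Layer by layer: convolution, size-one mixing, then the linear classifier.
features = conv1d(x, kernels) @ mix
logit_layered = features @ lr_w + lr_b

# Folded: one kernel that is the weighted sum of the K original kernels.
folded_kernel = np.tensordot(mix, kernels, axes=(0, 0))  # (w, 20)
folded_features = conv1d(x, folded_kernel[None])[:, 0]
logit_folded = folded_features @ lr_w + lr_b

prob = 1.0 / (1.0 + np.exp(-logit_folded))  # predicted DBP probability
assert np.allclose(logit_layered, logit_folded)
```

Each row of `folded_kernel` assigns one weight per amino acid at one window position, so the folded kernel can be read directly as a position-specific scoring matrix, which is what makes the linear model easy to inspect.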

Keywords/Search Tags:

Protein primary structure, Deep learning, Convolutional neural network, Explainable Subnetwork, Sequence generation

Related items

1	Application Of Deep Learning Algorithm In Protein Structure Prediction
2	On The Prediction Of DNA-binding Proteins Only From Primary Sequences:A Deep Learning Approach
3	Research On Protein Function Prediction Based On Deep Learning
4	Extraction Of Shortest Representation Of Protein Folds Based On Convolutional Neural Network
5	Genome-wide RNA-binding Proteins Identification Based On Evolutionary Deep Convolutional Neural Network
6	Research On Protein Classification Prediction Based On Deep Learning
7	Protein Contact Map Prediction Using Deep Convolutional Neural Network
8	Application Research On Prediction Of Protein Function Using Deep Learning
9	Study On Predicting Nucleotide-Binding Protein Using Deep Learning Approach
10	Research On Protein Secondary Structure Prediction Based On Deep Learning Method