The Prediction Of Secondary Structure Of Protein Based On Deep Learning

Posted on:2019-01-28

Degree:Master

Type:Thesis

Country:China

Candidate:S X Xie

GTID:2310330542973649

Subject:Software engineering

Abstract/Summary:

The prediction of protein secondary structure is an important research field in bioinformatics. With the development of artificial intelligence, many researchers have begun to predict protein secondary structure using machine learning. Although some satisfactory results have been achieved, further improvement is needed. In this paper, we employ three methods to predict the secondary structure of proteins: a fuzzy support vector machine (FSVM), a convolutional neural network (CNN) combined with FSVM, and a CNN combined with Long Short-Term Memory (LSTM).

(1) Prediction of protein secondary structure by FSVM. First, two initial hyperplanes are constructed; an iterative process in the feature space then locates the class centers and, starting from the initial hyperplanes, an approximate hyperplane. Next, membership values are assigned to the samples in the training set according to the distance between each sample and the approximate hyperplane. Finally, an FSVM is trained in the feature space on the training set. In addition, our method exploits information on sequence-based structural similarity. On four datasets (RS126, CB513, data1199 and CASP) our method achieves Q₃ accuracies of 94.2%, 93.1%, 96.7% and 92.1% and SOV values of 91.7%, 89.7%, 94.1% and 89.6%, respectively.

(2) Prediction of protein secondary structure by CNN combined with FSVM. First, we transform the vector features of a protein into matrix features. Next, a CNN extracts feature representations of the protein from the original features. Finally, we train an FSVM classifier on the CNN features and make predictions on the test sets. On the same four datasets (RS126, CB513, data1199 and CASP) our method achieves Q₃ accuracies of 94.3%, 93.8%, 97.1% and 92.7% and SOV values of 92.5%, 90.4%, 94.5% and 90.2%, respectively.

(3) Prediction of protein secondary structure by CNN combined with LSTM. Since CNNs are shift-invariant, we first use multiple kernels of different sizes to extract local features. Then, considering the long-term dependencies between the residues in a protein sequence, we use a bidirectional LSTM to extract global features. Finally, the local and global features are combined to form the final feature, and a softmax classifier predicts the secondary structure. On the same four datasets (RS126, CB513, data1199 and CASP) our method achieves Q₃ accuracies of 94.5%, 94.2%, 97.2% and 93.5% and SOV values of 92.2%, 90.3%, 94.8% and 90.2%, respectively.

Experimental results show that the three methods achieve high accuracy in predicting protein secondary structure. Finally, this paper analyzes the shortcomings of the methods above and proposes directions for further research.
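
The Q₃ accuracy reported in the abstract is conventionally the percentage of residues whose predicted three-state label (helix H, strand E, coil C) matches the observed one. A minimal Python sketch of that metric follows. The function name `q3_accuracy`, the single-letter state encoding and the toy sequences are assumptions for illustration and are not taken from the thesis.

```python
# Sketch of the standard three-state Q3 accuracy.
# Assumption: structures are encoded as strings over the letters H, E, C.
VALID_STATES = {"H", "E", "C"}


def q3_accuracy(observed: str, predicted: str) -> float:
    """Return the percentage of residues whose predicted state equals the observed state."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    if not (set(observed) <= VALID_STATES and set(predicted) <= VALID_STATES):
        raise ValueError("states must be one of H, E, C")
    # Count positions where the two labelings agree.
    matches = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Toy example (not data from the thesis): 8 of 10 residues agree.
    print(q3_accuracy("HHHHEEEECC", "HHHCEEECCC"))  # 80.0
```

To score a whole dataset, one common choice is to concatenate all residues before counting; averaging per-protein scores is another. The abstract does not say which was used.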

Keywords/Search Tags:

Secondary structure of protein, FSVM, CNN_FSVM, CNN_LSTM, Sequence-based structural similarity

Related items

1	A Study On The Protein Secondary Structure Prediction And The Connection Between Protein Secondary Structure And Its 3D Structure
2	The Machine Learning Model Of Protein Structural Prediction Based On Protein Sequence
3	Protein sequence divergence relative to protein folding
4	The Research On Analysis Methods Of DNA Sequences And RNA Secondary Structures Based On New Representations Models
5	The Statistical Relationship Between mRNA Sequence, Structure, Energy And Protein Secondary Structure
6	Research On Several Problems For Protein And RNA
7	Structural Matrix Is Applied In Comparison Of Similarity For Biological Sequences
8	Protein Secondary Structure Prediction Based On A Balanced Classification Algorithm
9	Prediction Of Protein Tertiary Structural Classes Based On Predicted Secondary Structure
10	Thermodynamic and structural properties of small RNA secondary structural motifs and their role in RNA-protein interactions