Prediction Of Neuropeptide Precursor And Its Cleavage Site Based On Machine Learning

Posted on:2022-10-01

Degree:Master

Type:Thesis

Country:China

Candidate:Y Wang

GTID:2480306524982389

Subject:Biophysics

Abstract/Summary:

Neuropeptides are a class of bioactive peptides about 5 to 50 amino acids in length. They are ubiquitous in the central and peripheral nervous systems and play crucial roles in activating signal cascades involved in reproduction, metabolism, sensation, memory, learning and other life activities. Neuropeptides are derived from neuropeptide precursor proteins, which are translated directly from mRNAs and usually consist of a signal peptide, one or several neuropeptide sequences and some other sequences. After proteolysis and a series of post-translational modifications of the precursor, one or more mature neuropeptides are produced. Given the exponential growth in protein sequences of unknown function and the limited number of known neuropeptide types, accurate identification of neuropeptide precursor sequences and their cleavage sites is significant for neuroscience, and especially for neuropeptide research. However, existing methods rely mainly on experiments, such as site-directed mutagenesis and animal experiments, which are time-consuming and laborious, and sometimes unsatisfactory because of their relatively low accuracy. With the rapid development of bioinformatics, computational methods have been widely used in life science research, including protein structure modeling, RNA-RNA interaction, drug design and many other fields, and neuropeptide research is no exception.

In this work, support vector machine (SVM), random forest and other machine learning methods were applied to two tasks. First, an SVM model based on pseudo amino acid composition was constructed to predict neuropeptide precursor sequences. The dataset, collected from a published article, comprises 405 neuropeptide precursors (positive data) and 405 non-neuropeptide precursors with the same length distribution as the neuropeptide precursor sequences (negative data). The prediction accuracy of this model reached 87.14%, and the AUC was 0.9391.

Second, using SVM, random forest, K-nearest neighbors, neural networks and other machine learning methods, we constructed several models based on different featurization methods related to amino acid sequence composition, distribution and physicochemical properties. The original data source was the neuropeptide precursor sequences from the first task, which were further processed according to their annotations in UniProt. Model construction and prediction were then carried out on the resulting 937 positive samples and 937 randomly selected negative samples of the same sequence length. The best-performing model was a support vector machine with enhanced amino acid composition features, with an accuracy of 90.37% and an AUC of 0.9576. We therefore developed a predictive tool based on this model, called NeuroCS. For convenience, the tool is available as a free online service: http://i.uestc.edu.cn/NeuroCS/dist/index.html#/...
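To make the composition-based features and the reported metrics more concrete, the following is a minimal Python sketch, not the thesis's actual implementation. It shows a plain 20-dimensional amino acid composition vector, a sliding-window variant (one common way "enhanced" amino acid composition is defined; whether the thesis uses exactly this definition, and the window size of 5, are assumptions), and how accuracy and AUC can be computed from labels and classifier scores. All function names are hypothetical, and the classifier itself (for example an SVM) is left out.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the 20-dimensional residue-frequency vector of a sequence."""
    sequence = sequence.upper()
    # Non-standard symbols (e.g. X) are ignored; this is an assumption.
    counts = Counter(res for res in sequence if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def windowed_composition(fragment, window=5):
    """Concatenate composition vectors of every sliding window.

    Assumed stand-in for an 'enhanced' composition feature: it keeps
    positional information that plain composition discards.
    """
    if len(fragment) < window:
        raise ValueError("fragment is shorter than the window")
    features = []
    for start in range(len(fragment) - window + 1):
        features.extend(amino_acid_composition(fragment[start:start + window]))
    return features


def accuracy(labels, predictions):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels, scores):
    """Area under the ROC curve by pairwise comparison.

    Equals the probability that a random positive scores higher than a
    random negative, with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

In such a setup, each sequence (or each fixed-length fragment around a candidate cleavage site) would be turned into a feature vector with one of the two functions, a classifier would be trained on the balanced positive and negative sets, and `accuracy` and `auc` would be evaluated on held-out predictions. The pairwise AUC is quadratic in the number of samples, which is acceptable for datasets of the size described here.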

Keywords/Search Tags:

neuropeptides, neuropeptide precursors, cleavage sites, machine learning, support vector machine

Related items

1	Predicting Functional Sites Based On Support Vector Machine And Extreme Learning Machine
2	Prediction Of Neuropeptide Cleavage Sites Based On Random Forest
3	Support Vector Machine Data Classification
4	Research On Prediction Of Phosphorylation Modification Sites Based On Machine Learning
5	Support Vector Machines Classifier Based On Margin Vectors
6	Support Vector Machine Based On Artificial Error
7	Eukaryotic Gene Promoter Recognition Based On Optimized Support Vector Machine
8	Research On Classification Learning Machine Based On The Rescaled Hinge Loss Function
9	Database Construction And Precursor Prediction For Neuropeptide
10	Study Of Algorithms For Support Vector Machine