Predicting Amidation Sites Based On Amino Acid Sequence Information

Posted on: 2019-07-30
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Zhao
Full Text: PDF
GTID: 2370330593451041
Subject: Software engineering
Abstract/Summary:
Amidation is associated with diseases such as hypertension, cancer, neurological dysfunction, and sleep apnea. Identifying amidation sites therefore helps clarify both the amidation process itself and the underlying causes of these diseases. However, traditional experimental methods for identifying amidation sites are time-consuming, laborious, and expensive. In this work, we propose a computational method that identifies amidation sites from amino acid sequence features using machine learning, which offers clear advantages in time and cost while maintaining good predictive performance. The method builds a feature vector by combining three feature extraction methods: position-specific amino acid propensity, pertinence of k-space amino acid pair, and high-quality indices. The extremely randomized trees algorithm is then used, in a supervised fashion, to remove redundancy and dependence among the components of the feature vector, that is, to select the optimal feature subset. Finally, a support vector machine is used to identify the amidation sites. The method captures both the physicochemical properties of the amino acids in a sequence and their position-related information, so it describes the sequence to be predicted more comprehensively, and the weighted support vector machine classifier also performs well on imbalanced data sets. On independent datasets, the method achieved an accuracy of 0.962, a Matthews correlation coefficient of 0.89, and an AUC of 0.964. These results indicate that the proposed method has clear advantages in identifying amidation sites.
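To make the idea of a k-space amino acid pair feature concrete, here is a minimal sketch that counts residue pairs separated by exactly k intervening positions in a sequence window and normalizes the counts into a fixed-length vector. This is an illustration only: the abstract does not define how the "pertinence" score is computed, so plain relative frequencies are used instead, and the function names, the `max_k` parameter, and the example window are all hypothetical.

```python
from itertools import product

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def k_spaced_pair_counts(window, k):
    """Count residue pairs (a, b) with exactly k residues between them.

    Pairs containing a non-standard symbol (for example a padding
    character such as 'X') are ignored.
    """
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(len(window) - k - 1):
        pair = window[i] + window[i + k + 1]
        if pair in counts:
            counts[pair] += 1
    return counts


def k_spaced_pair_features(window, max_k=2):
    """Concatenate normalized pair frequencies for k = 0 .. max_k.

    Assumption: simple relative frequencies stand in for the thesis's
    'pertinence' score, which the abstract does not specify.
    """
    features = []
    for k in range(max_k + 1):
        counts = k_spaced_pair_counts(window, k)
        n_pairs = max(len(window) - k - 1, 1)  # avoid division by zero
        features.extend(counts[p] / n_pairs for p in sorted(counts))
    return features


if __name__ == "__main__":
    window = "ACACA"  # hypothetical toy window, not real data
    print(k_spaced_pair_counts(window, 1)["AA"])
    print(len(k_spaced_pair_features(window, max_k=2)))
```

Each value of k contributes 400 components (one per ordered residue pair), so the vector grows quickly with `max_k`. That growth is one reason a feature selection step such as the one described in the abstract is useful before classification.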
Keywords/Search Tags: Post-translational modification, Amidation, Feature extraction, Machine learning