Prediction Of Cysteine S-sulphenylation Sites Based On Deep Learning

Posted on: 2022-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: X R Lv
GTID: 2480306566479504
Subject: Bioinformatics
Abstract/Summary:
Cysteine S-sulphenylation (CSO) is the reversible oxidation of protein cysteinyl thiols to sulphenic acids. CSO functions as an intermediate on the path toward other redox modifications, such as disulfide formation and S-glutathionylation. This modification has been reported to influence protein function, regulate signal transduction and affect the cell cycle. Because of its functional significance, several prediction approaches have been developed. However, they are based on a limited dataset from Homo sapiens, and prediction tools for CSO sites in other species are lacking. Recently, this modification has been investigated at the proteomic scale in a few species, and the number of identified CSO sites has increased significantly. It is therefore essential to explore the characteristics of this modification across species and to construct better-performing prediction models based on the enlarged dataset.

This study builds predictive models of CSO sites and compares them with previous algorithms. The work comprises three parts:
(1) Construction of a dataset. The dataset includes experimentally identified modification sites from humans and Arabidopsis. It is more than three times the size of the data used for previous models.
(2) Development and comparison of multiple models. Both traditional machine learning models and deep learning models were developed and compared, the latter including a one-dimensional Convolutional Neural Network model, a two-dimensional Convolutional Neural Network model and a Recurrent Neural Network model.
(3) Construction of species-specific and general models. Two species-specific prediction models were constructed, one for humans and one for Arabidopsis. In addition, to facilitate the study of modification sites in other species, a universal prediction model was constructed.

The comparisons showed that the Long Short-Term Memory model with a word-embedding layer, dubbed LSTMWE, compared favourably with the traditional machine learning models and the other deep learning models across species, in both cross-validation and independent tests. Its area under the receiver operating characteristic curve ranged from 0.82 to 0.85 for the different organisms, which was superior to that of the previously reported CSO predictors. Moreover, the general model, developed from the integrated data of different species, showed good universality and effectiveness. For the convenience of the community, we provide an online prediction service called DeepCSO, which includes both the species-specific and general models and is accessible at http://www.bioinfogo.org/DeepCSO.
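As a rough illustration of the kind of input a word-embedding layer consumes and of how the reported metric is computed, the sketch below extracts a sequence window around each cysteine, maps residues to integer tokens, and computes the area under the receiver operating characteristic curve from labels and scores. It is a minimal sketch under stated assumptions, not the thesis's implementation: the flank size, the padding symbol, the token numbering, and the function names `cysteine_windows`, `encode` and `roc_auc` are all hypothetical choices made for this example.

```python
from typing import List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"  # assumed padding symbol for windows that run past the protein ends
# Assumed numbering: 0 = padding, 1..20 = standard residues, 21 = anything else.
TOKEN_INDEX = {PAD: 0, **{aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}}
UNKNOWN = len(AMINO_ACIDS) + 1


def cysteine_windows(sequence: str, flank: int = 3) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every cysteine in the sequence.

    Each window has length 2 * flank + 1 with the cysteine at its centre.
    The flank size is an illustrative assumption.
    """
    seq = sequence.upper()
    padded = PAD * flank + seq + PAD * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "C":
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


def encode(window: str) -> List[int]:
    """Map a window to integer tokens, the usual input to an embedding layer."""
    return [TOKEN_INDEX.get(ch, UNKNOWN) for ch in window]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive scores higher, counting ties as one half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    for position, window in cysteine_windows("MKCAAC", flank=2):
        print(position, window, encode(window))
    print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

In a model of the kind described above, the integer tokens would feed an embedding layer followed by a recurrent layer that outputs a score per window; `roc_auc` could then be applied to those scores on cross-validation folds or on an independent test set.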
Keywords/Search Tags:modification site prediction, Cysteine S-sulphenylation, posttranslational modification, Deep Learning