Research On The Prediction Of Cross-species Protein Crotonylation Modification Sites Based On Deep Learning

Posted on: 2021-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Zhao
GTID: 2430330611492474
Subject: Software engineering
Abstract/Summary:
Protein post-translational modification (PTM) refers to the covalent attachment of small chemical groups to the amino acid side chains of proteins, under either enzymatic or non-enzymatic conditions. It greatly expands the range of protein functions. Among PTMs, lysine crotonylation (Kcr) was originally identified on histones; it is involved in a variety of biological processes and is associated with acute kidney injury, the suppression of HIV reactivation, depression and other diseases. Accurate identification of crotonylation sites from protein sequence data is therefore of great significance for basic research and drug development. Experimental methods for identifying crotonylation sites are time-consuming and expensive, so computational prediction methods are needed.

Several histone-based models, trained on at most 169 crotonylation sites, have been developed. Recently, thousands of crotonylation sites have been experimentally verified on non-histone proteins of human, papaya, rice and tobacco. It is unclear whether the classifiers previously developed on histone data can identify non-histone crotonylation sites, so there is an urgent need for a cross-species model that can recognize crotonylation sites on both histone and non-histone proteins. To address these problems in the prediction of protein crotonylation sites, the main research work is as follows:

(1) A standard non-histone crotonylation data set was constructed for the first time from published experimental papers. After the biological experiment data were retrieved and collected, the standard data set was built through a four-step data cleaning process.

(2) Feature extraction and feature selection for crotonylation were studied. Twelve classifiers were constructed by combining different features and algorithms to identify non-histone crotonylation sites. The experimental results show that enhanced grouped amino acid composition (EGAAC) is more effective than composition of k-spaced amino acid pairs (CKSAAP), the best feature extraction algorithm described in the published literature, and than other classic amino acid sequence feature extraction algorithms.

(3) The first deep-learning-based cross-species crotonylation site prediction model, named Deep Kcrot, was built. The models are compared and their performance visualized, and the influence of data volume on the prediction performance of the deep learning model is discussed. The thesis examines whether the published histone-based models and the non-histone-based model can be applied to each other's data. After Deep Kcrot is retrained with histone crotonylation data, it performs well in predicting both histone and non-histone crotonylation sites. The performance of models trained on individual species is then compared with that of the model trained on data from all species. Based on the species differences, Deep Kcrot retains one general cross-species model and four species-specific models.

(4) An online Kcr prediction platform for the Deep Kcrot algorithm was developed (http://www.bioinfogo.org/deepkcrot/).
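The two sequence encodings compared in item (2) can be illustrated with a minimal sketch. It is not the thesis code: the five residue groups, the window size of 5, and the maximum gap of 5 are assumptions based on how these descriptors are commonly defined in feature-extraction toolkits, and the function names `egaac` and `cksaap` and the example peptide are hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumed physicochemical grouping (not stated in the abstract).
GROUPS = {
    "aliphatic": "GAVLMI",
    "aromatic": "FYW",
    "positive": "KRH",
    "negative": "DE",
    "uncharged": "STCPNQ",
}


def egaac(peptide, window=5):
    """Group frequencies in each sliding window along the peptide."""
    features = []
    for start in range(len(peptide) - window + 1):
        segment = peptide[start:start + window]
        for members in GROUPS.values():
            hits = sum(segment.count(aa) for aa in members)
            features.append(hits / window)
    return features


def cksaap(peptide, max_gap=5):
    """Frequencies of residue pairs separated by 0..max_gap residues."""
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    features = []
    for gap in range(max_gap + 1):
        counts = dict.fromkeys(pairs, 0)
        total = 0
        for i in range(len(peptide) - gap - 1):
            pair = peptide[i] + peptide[i + gap + 1]
            if pair in counts:  # skip padding or non-standard residues
                counts[pair] += 1
                total += 1
        features.extend(counts[p] / total if total else 0.0 for p in pairs)
    return features


# Hypothetical 7-residue peptide with a lysine (K) in the centre.
peptide = "GAVKDEF"
egaac_vector = egaac(peptide)               # 3 windows x 5 groups
cksaap_vector = cksaap(peptide, max_gap=1)  # 2 gaps x 400 pairs
```

Under these assumptions EGAAC yields 5 values per window position, whereas CKSAAP yields 400 values per gap, so EGAAC gives a far more compact encoding of the same peptide.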
Keywords/Search Tags: bioinformatics, feature extraction, deep learning, convolutional neural network, crotonylation