| Post-translational modification (PTM) of proteins refers to the breaking or formation of covalent bonds on the backbone or side chains of a protein. Over 500 PTMs have been discovered to date, among which protein methylation, acetylation and phosphorylation are three of the most studied. Protein methylation mainly occurs on the side chains of lysine and arginine residues. Its best-known function is in the "histone code", which determines the epigenetic features of cells. Recently, rapid progress in the field has revealed functions of non-histone methylation that extend far beyond the "histone code". Lysine acetylation is another modification that participates in epigenetics, and recent findings on the roles of acetylation in metabolism have drawn attention to this field. As the most studied PTM, protein phosphorylation is involved in almost all aspects of cellular activity. Lysine and arginine methylation can be classified into different types according to the position and number of substitution reactions. For lysine, the substitution can occur one, two or three times, resulting in mono-, di- and tri-methylation, respectively. Meanwhile, arginine methylation has three types, namely mono-, symmetric di- and asymmetric di-methylation. In this work, we manually collected 1,521 lysine methylation sites and 1,751 arginine methylation sites, all annotated with methylation type information. With this dataset, we trained and developed GPS-MSP, the first type-specific methylation site predictor, using a modified GPS 3.0 algorithm. By classifying the collected data by species, we developed 28 species-specific methylation predictors. Because different histone acetyltransferases (HATs) have different substrates, we manually collected 702 experimentally verified HAT-specific lysine acetylation sites from the literature. In our dataset, 7 HATs have enough substrates for training prediction models: CREBBP, EP300, HAT1, KAT2A, KAT2B, KAT5 and KAT8. Using the GPS 2.2 algorithm, we developed the HAT-specific lysine acetylation site predictor GPS-PAIL, which is implemented as both a web service and a local software package. To investigate proteins involved in autophagy and cell death, we manually collected 4,243 proteins that participate in autophagy and cell death pathways. Furthermore, we identified 183,319 potential proteins related to autophagy and cell death by ortholog detection, and built a database of autophagy, necrosis and apoptosis orchestrators, named THANATOS. By investigating orthologs of 41 ATG genes, we performed a comprehensive analysis of core ATG genes and showed that the core autophagy machinery is highly conserved across all 148 species. By analyzing data from the ICGC and DrugBank databases, we revealed that known cancer genes and drug targets are dramatically over-represented among human autophagy proteins, which are significantly associated with a number of signaling and disease pathways and are frequently mutated in pancreatic cancer. By compiling a large PTM dataset, we analyzed the roles of PTMs in the regulation of autophagy. Our results showed that phosphorylation plays an essential role in regulating autophagy. Meanwhile, other types of PTMs, for example ubiquitination, are also preferentially enriched in human autophagy. In this work, we also built the plant phosphorylation site database dbPPT, containing 82,175 phosphorylation sites on 31,012 proteins across 20 plant species. Furthermore, we constructed a comprehensive phosphorylation site database for animals and fungi, named dbPAF, which hosts 483,001 phosphorylation sites on 54,148 proteins. With an experimentally verified dataset of 72 HATs, 97 HDACs, 116 acetyl-readers, 112 HMTs, 76 HDMs and 156 methyl-readers, we developed a database of eukaryotic writers, erasers and readers of the histone acetylation and methylation systems. By integrating comprehensive information for rice, we built a knowledge base for rice called IC4R. |