Prediction And Functional Analysis For Protein Post-translational Modification Sites

Posted on: 2014-07-13 | Degree: Master | Type: Thesis
Country: China | Candidate: S B Suo
GTID: 2250330401971664 | Subject: Analytical Chemistry
Abstract/Summary:
Post-translational modifications (PTMs) are important cellular control mechanisms. They affect many protein properties, such as folding, activity and function, so further study of PTMs plays a key role in understanding various disease conditions. The rapid growth of proteomics over the last decade has generated a prodigious quantity of data, which is greatly advancing PTM research. Although experimental technologies, including high-throughput ones, have achieved a great deal in PTM research, most of them remain laborious and limited in throughput. Efficient and reliable computational approaches for predicting and analysing PTMs are therefore desirable and necessary. In this work, multiple feature extraction algorithms based on amino acid sequences were used, and machine learning and evolutionary conservation methods were then combined to construct different predictors for different PTMs. Functional analysis of PTMs and of the relationship between PTMs and diseases was also performed. The main contents are summarized as follows:
1. A position-specific predictor was developed to predict lysine acetylation sites. Residues around the acetylation sites were selected or excluded based on their information gain values. Features describing amino acid composition, evolutionary similarity and physicochemical properties were incorporated to predict lysine acetylation sites. The prediction model achieved an accuracy of 79.84% and a Matthews correlation coefficient of 59.72% using 10-fold cross-validation on balanced positive and negative samples. A feature analysis showed that all features applied in this method contributed to the acetylation process. A position-specific analysis showed that the features derived from the important neighboring residues contributed to the acetylation determination. Finally, a user-friendly web service was developed by combining MATLAB and ASP.NET techniques: http://bioinfo.ncu.edu.cn/inquiries_PSKAcePred.aspx.
2. A proteome-wide analysis of amino acid variations that influence protein lysine acetylation was performed. First, the predictor KAcePred, which uses position-specific scoring matrix (PSSM) profiles and the best physicochemical properties as features, was developed for predicting human acetylation sites. KAcePred was then used to predict lysine acetylation sites in both the original sequences and the variant sequences. Acetylation-related amino acid variations were defined as AcetylAAVs and categorized into three types. Using KAcePred, we found that 50.87% of amino acid variations are potential AcetylAAVs and that 12.32% of disease variations are AcetylAAVs. More interestingly, the statistical analysis showed that amino acid variations that directly create new potential lysine acetylation sites are more likely to cause diseases. A user-friendly web interface for the analysis of AcetylAAVs is now freely available at http://bioinfo.ncu.edu.cn/AcetylAAVs_Home.aspx.
3. We systematically analysed the kinase characteristics of all disease-related phosphorylation substrates using our kinase-specific predictors, which were developed on the basis of the Phosphorylation Set Enrichment Analysis (PSEA) method. We evaluated the method with an independent test and concluded that the approach is helpful for identifying the kinases responsible for phosphorylated substrates. More interestingly, the systematic analysis showed that the mitogen-activated protein kinase (MAPK) and glycogen synthase kinase (GSK) families are more inclined to catalyse abnormal phosphorylation, which in turn results in disease. The characteristic analysis of disease-related phosphorylation kinases may help promote the development of protein kinase inhibitor drugs and help identify the mechanisms of phosphorylation-related diseases. A user-friendly web interface for kinase-specific prediction is now freely available at http://bioinfo.ncu.edu.cn/PKPred_Home.aspx.
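The abstract states that residues around acetylation sites were selected or excluded by information gain but does not give the procedure. The following is a minimal sketch of one common way to score window positions, not the thesis implementation. It assumes each sample is a fixed-length sequence window centred on a lysine, and treats the residue at each position as a categorical feature. The function names and the toy windows are hypothetical and were invented for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(windows, labels, position):
    """Information gain of the residue at `position` with respect to the labels."""
    # Group the labels by the residue observed at this window position.
    groups = {}
    for window, label in zip(windows, labels):
        groups.setdefault(window[position], []).append(label)
    total = len(labels)
    # Conditional entropy: label entropy within each residue group, weighted by group size.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_positions(windows, labels):
    """Return (position, gain) pairs sorted from most to least informative."""
    length = len(windows[0])
    gains = [(p, information_gain(windows, labels, p)) for p in range(length)]
    return sorted(gains, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): 5-residue windows with the candidate lysine (K) at index 2.
windows = ["AGKLE", "SGKRD", "TGKLE", "ALKRE", "SLKLD", "TLKRD"]
labels = [1, 1, 1, 0, 0, 0]  # 1 = acetylated, 0 = not acetylated

ranking = rank_positions(windows, labels)
```

In this toy set, index 1 separates the two classes perfectly, so it ranks first, while the central lysine is constant and carries no information. A real pipeline would keep only positions whose gain exceeds some chosen threshold; the abstract does not report that threshold.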
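For readers unfamiliar with the two metrics reported for the acetylation predictor, the sketch below computes accuracy and the Matthews correlation coefficient (MCC) from confusion-matrix counts using their standard definitions. MCC ranges from -1 to 1, so the reported 59.72% corresponds to 0.5972. The counts used here are hypothetical and are not the thesis results.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # By convention, return 0 when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts for a balanced test set (100 positives, 100 negatives).
tp, tn, fp, fn = 80, 80, 20, 20
acc = accuracy(tp, tn, fp, fn)
mcc = matthews_corrcoef(tp, tn, fp, fn)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it a useful complement even on balanced positive and negative samples.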
Keywords/Search Tags: post-translational modifications, acetylation, information gain, amino acid variation, phosphorylation, kinase, disease, support vector machine, evolutionary conservation, sequence set enrichment analysis