
Study and Implementation of an Algorithm for Inferring Phosphorylation Relationships Based on Protein Domains

Posted on: 2018-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: R Y Li
GTID: 2310330521450902
Subject: Software engineering
Abstract/Summary:
Protein phosphorylation is one of the most widespread post-translational modifications and is involved in almost all biological processes, including cell metabolism, growth, differentiation, and signal transduction. In short, protein phosphorylation is the process by which protein kinases add phosphate groups to substrates, thereby changing the structure and function of those substrates. Studies have shown that phosphorylation is closely related to the development of diseases, including cancer, because disruptions of the phosphorylation process, such as the mutation, gain, or loss of modification sites, are likely to cause abnormal protein function. In addition, some protein kinases have been used as drug targets in cancer treatment. Although many phosphorylation sites have been identified, most of them lack information about the corresponding protein kinases. Consequently, it is necessary to develop computational methods for predicting the phosphorylation relationships between protein kinases and substrates.

Several computational methods have been proposed for the phosphorylation prediction problem. Most of them predict phosphorylation relationships from protein sequence information using machine learning algorithms such as support vector machines, decision trees, and Bayesian networks. Other biological information, such as protein co-expression, co-localization, and protein interactions, has also been used to improve prediction performance. However, these methods tend to perform differently on different data sets, and their prediction accuracy and robustness still need improvement.

In this thesis, we propose a new probability model named "PhosD" to predict phosphorylation relationships. Because the protein domain is the structural and functional unit of proteins and is comparatively stable and conserved, we assume that kinase-substrate interactions are accomplished through kinase-domain interactions. Based on this assumption, PhosD identifies patterns of kinase-domain interactions from known phosphorylation data and uses these patterns as the primary feature for phosphorylation prediction. For a specific kinase, proteins that contain the domains associated with that kinase are considered candidate substrates, and a probability model is designed to measure the phosphorylation probability of each kinase-substrate pair. Phosphorylation relationships between kinases and substrates are a special kind of protein interaction, and a large amount of protein interaction data is available. Hence, it is feasible to integrate protein interaction information into our method as a secondary filter to further improve prediction precision.

Compared with six other popular phosphorylation prediction approaches, PhosD achieves robust results on four benchmark databases and outperforms all of them in precision. Furthermore, we noticed that, for a given kinase, the more substrates are known for it, the more accurate its predicted substrates are, and that the domains involved in kinase-substrate interactions are more conserved across proteins phosphorylated by multiple kinases. These findings can help in developing more efficient computational approaches in the future. In addition, some of our predicted kinase-substrate relationships are validated by signaling pathways, which indicates the predictive power of our approach and also shows the biological significance of phosphorylation in signal transduction.
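
The pipeline described in the abstract (learn kinase-domain patterns from known phosphorylation data, score candidate substrates, then filter by protein interactions) can be illustrated with a minimal Python sketch. The abstract does not give the actual probability model, so everything below is an assumption made for illustration: the function names, the frequency-based kinase-domain score, the noisy-OR combination over a protein's domains, and the threshold are hypothetical and are not the thesis's formulas.

```python
from collections import defaultdict


def learn_kinase_domain_scores(known_pairs, protein_domains):
    """Score each (kinase, domain) as the fraction of the kinase's known
    substrates that contain the domain (an assumed, illustrative estimate)."""
    substrates_by_kinase = defaultdict(set)
    for kinase, substrate in known_pairs:
        substrates_by_kinase[kinase].add(substrate)

    scores = {}
    for kinase, substrates in substrates_by_kinase.items():
        counts = defaultdict(int)
        for substrate in substrates:
            for domain in protein_domains.get(substrate, ()):
                counts[domain] += 1
        for domain, count in counts.items():
            scores[(kinase, domain)] = count / len(substrates)
    return scores


def score_pair(kinase, protein, protein_domains, scores):
    """Combine per-domain scores with a noisy-OR (an assumption, not the
    thesis's model): the pair scores high if any domain scores high."""
    p_no_domain_interacts = 1.0
    for domain in protein_domains.get(protein, ()):
        p_no_domain_interacts *= 1.0 - scores.get((kinase, domain), 0.0)
    return 1.0 - p_no_domain_interacts


def predict_substrates(kinase, protein_domains, scores, ppi, threshold=0.5):
    """Return (protein, score) pairs that pass the protein-interaction
    filter and the score threshold, sorted by descending score."""
    results = []
    for protein in protein_domains:
        if protein == kinase:
            continue
        # Secondary filter: keep only proteins known to interact with the kinase.
        if frozenset((kinase, protein)) not in ppi:
            continue
        p = score_pair(kinase, protein, protein_domains, scores)
        if p >= threshold:
            results.append((protein, p))
    return sorted(results, key=lambda item: -item[1])


# Toy data (invented for illustration only).
known_pairs = [("K1", "S1"), ("K1", "S2")]
protein_domains = {
    "S1": {"SH2"},
    "S2": {"SH2", "PH"},
    "C1": {"SH2"},
    "C2": {"PH"},
    "C3": {"WD40"},
}
ppi = {frozenset(("K1", "C1")), frozenset(("K1", "C2"))}

scores = learn_kinase_domain_scores(known_pairs, protein_domains)
predictions = predict_substrates("K1", protein_domains, scores, ppi)
print(predictions)
```

In this toy setup the interaction filter discards every protein without a recorded interaction with the kinase, regardless of its domain score, which mirrors the role of the secondary filter described above.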
Keywords/Search Tags:phosphorylation, protein domains, protein-protein interactions, probability model