Clustering Function Of Phosphorylation Based On GSVM

Posted on: 2017-01-19 | Degree: Master | Type: Thesis
Country: China | Candidate: Y Q Wang
GTID: 2180330488459214 | Subject: Computer software and theory
Abstract/Summary:
Protein kinases play a vital role throughout cell growth, differentiation, and apoptosis. A kinase acts by phosphorylating protein sites, which can switch one or more functions of the protein on or off. This phosphorylation can in turn be shut down by an inhibitor. Recent studies show that abnormal phosphorylation is one of the major causes of disease. Phosphorylation can occur at different sites of a given protein, so studying different phosphorylated proteins is key to understanding their functional roles. Although many phosphorylation sites have been found, most of them are not annotated with any kinase information. Existing research on protein phosphorylation mainly focuses on single sites of a protein, without knowing the associated kinase. Abnormal regulation of proteins by phosphorylation can give rise to many serious diseases, yet the understanding of phosphorylation is still insufficient. In this study, we approach the problem in two parts: peptide-kinase prediction and kinase-inhibitor interaction prediction. Specifically, we study the following problems:

(1) To address the lack of kinase annotations for peptides, we propose an algorithm that combines a Bayes approach with the Otsu (OTSU) automatic parameter selector. Using the optimal peptide window size, our approach significantly improves prediction performance.

(2) To address the high dimensionality of the kinase and inhibitor features, we propose the GSVM algorithm, which is based on a granulized feature space of kinase and inhibitor. This granulized SVM markedly improves prediction performance. In addition, we use Platt scaling to estimate the weight of each sample and feed the weighted samples into the SVM again to further improve prediction performance.

(3) Because positive kinase-inhibitor samples are rare while unlabeled pairs are abundant, we combine the PU learning algorithm with a layered GSVM. PU learning is designed for exactly this setting, which explains its good prediction performance, while the layered GSVM reinforces that performance.

The experimental results indicate that the algorithms designed in this study achieve good prediction performance, particularly in specificity, sensitivity, and accuracy. Our approach also shows notable generalization ability for predicting kinases and kinase-inhibitor interactions. This demonstrates that combining PU learning with GSVM is a good choice for the problem, and that Otsu automatic parameter learning and the layered GSVM improve prediction performance as well.
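
The abstract does not say how the Otsu selector is applied to the Bayes scores. As a rough illustration only, the sketch below assumes it is used to pick a cutoff on a list of one-dimensional scores by maximizing the between-class variance, which is Otsu's standard criterion. The function name `otsu_threshold` and the example scores are hypothetical and do not come from the thesis.

```python
from typing import Sequence


def otsu_threshold(scores: Sequence[float]) -> float:
    """Return the cutoff that maximizes between-class variance (Otsu's criterion).

    Assumption: scores are plain 1-D numbers (e.g. per-peptide Bayes scores).
    Candidate cutoffs are midpoints between consecutive distinct sorted scores.
    """
    values = sorted(scores)
    n = len(values)
    if n < 2 or values[0] == values[-1]:
        raise ValueError("need at least two distinct scores")

    total = sum(values)
    best_cut, best_var = values[0], -1.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += values[i - 1]
        if values[i] == values[i - 1]:
            continue  # no cutoff fits between two equal scores
        w0 = i / n                        # fraction of scores below the cutoff
        w1 = 1.0 - w0                     # fraction of scores above the cutoff
        mu0 = left_sum / i                # mean of the lower group
        mu1 = (total - left_sum) / (n - i)  # mean of the upper group
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var = between
            best_cut = (values[i - 1] + values[i]) / 2.0
    return best_cut
```

For instance, calling `otsu_threshold([0.1, 0.2, 0.15, 0.8, 0.9, 0.85])` yields a cutoff that falls between the low group and the high group. The appeal of this criterion is that it needs no hand-tuned parameter, which matches the "automatic parameter selector" role described in item (1).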
Keywords/Search Tags: GSVM, PU learning, Phosphorylation, Inhibitor