Research On Protein Phosphorylation Sites Prediction And Rules Extraction

Posted on: 2007-03-08
Degree: Master
Type: Thesis
Country: China
Candidate: J J Cai
GTID: 2120360185454141
Subject: Computer software and theory
Abstract/Summary:
Protein phosphorylation is one of the most important reversible post-translational modifications (PTMs). Phosphorylation and dephosphorylation provide a regulatory mechanism in eukaryotic cells. High-throughput methods for identifying PTMs are being developed, in particular the application of mass spectrometry to the field of proteomics. With the recent increase in the number of phosphorylation sites identified by mass spectrometry, in silico prediction of potential phosphorylation sites may facilitate the identification of phosphorylated proteins. Integrating computational approaches into phosphorylated protein research helps to validate biological experiments and to discover new rules of phosphorylation.

Computational intelligence is a good choice for high-performance prediction of phosphorylation sites. Furthermore, explaining how a prediction is made is the key to its credibility, especially in bioinformatics applications. The extracted rules are not only reasonable interpretations that can guide biological experiments, but also help to integrate computational techniques for further deduction.

In this thesis, after comprehensive comparisons among the different features of phosphorylation sites, we select physicochemical and biological properties of the amino acids around the sites, taken from the primary structure of the protein, for feature extraction. We design a new phosphorylation site prediction method named AproPhos, which uses AdaBoost for both feature selection and classification. Compared with other prediction methods, which have lower sensitivity, our method achieves about 10% higher sensitivity and about 2% higher specificity. To provide an understandable explanation of each prediction, we design a novel approach to extract rules from the AdaBoost classifier. AproPhos and the rule extraction method expand the application field of phosphorylation site prediction. They yield distribution formulas for the amino acid properties around the sites while maintaining good prediction performance, and they can improve the efficiency of phosphorylated protein identification with tandem mass spectra.

In this thesis, we also develop a new method, FFP (Fragment ion Formula Prediction), which predicts the best formulas of fragment ions more accurately and in less time by minimizing the distance between theoretical and observed isotope patterns. It can help to preprocess mass spectrum data and improve the reliability of protein identification (including phosphorylated proteins) with tandem mass spectra.
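
The general idea of boosting simple threshold classifiers and then reading them back as rules can be illustrated with a minimal sketch. This is not AproPhos or the thesis's rule extraction method: it is a generic AdaBoost over one-feature decision stumps, and the feature names (`hydrophobicity_minus1`, `charge_plus1`), the toy data, and the helper functions are all hypothetical, chosen only to show how each weak learner maps to an IF-THEN statement about a property near the candidate site.

```python
import math

def stump_predict(row, j, thr, polarity):
    # Predict +1 when polarity * (x_j - thr) >= 0, otherwise -1.
    return 1 if polarity * (row[j] - thr) >= 0 else -1

def train_stump(X, y, w):
    # Exhaustively search for the stump with the lowest weighted error.
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, thr, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, polarity)
    return best

def adaboost(X, y, rounds=5):
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    model = []
    for _ in range(rounds):
        err, j, thr, pol = train_stump(X, y, w)
        perfect = err < 1e-10
        err = min(max(err, 1e-10), 1 - 1e-10)   # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, j, thr, pol))
        if perfect:                        # a single stump already separates the data
            break
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, thr, pol))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def predict(model, row):
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in model)
    return 1 if score >= 0 else -1

def rules(model, names):
    # Each stump reads as one weighted IF-THEN rule, strongest first.
    out = []
    for a, j, t, p in sorted(model, reverse=True):
        op = ">=" if p == 1 else "<="
        out.append(f"IF {names[j]} {op} {t} THEN site (weight {a:.2f})")
    return out

# Hypothetical toy data: two properties of residues flanking a candidate site.
names = ["hydrophobicity_minus1", "charge_plus1"]
X = [[0.2, 1.0], [0.3, 0.8], [0.9, 0.9], [0.8, -0.5], [0.7, -1.0], [0.1, -0.7]]
y = [1, 1, 1, -1, -1, -1]                  # +1 = phosphorylated, -1 = not

model = adaboost(X, y)
for line in rules(model, names):
    print(line)
```

Because every weak learner tests a single feature against a threshold, the boosted model also acts as a feature selector: properties that never appear in a stump contribute nothing to the prediction.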
Keywords/Search Tags:phosphorylation, prediction, rules extraction, SVM, AdaBoost