
The Research Of Algorithms For Protein Phosphorylation Motif Discovery

Posted on: 2015-11-17
Degree: Master
Type: Thesis
Country: China
Candidate: H P Gong
GTID: 2298330467984638
Subject: Computer application technology
Abstract/Summary:
A phosphorylation motif is a position-specific amino acid pattern around phosphorylation sites. Because discovering such motifs reveals the underlying regulatory mechanisms and facilitates the prediction of unknown phosphorylation events, many researchers have paid attention to their biological significance. In recent years, the advent of high-throughput methods such as tandem mass spectrometry has greatly advanced the investigation of phosphorylation and provides a unique opportunity to study phosphorylation motif discovery. Many phosphorylation sites have been annotated in databases, which makes it possible to use computational methods to discover phosphorylation motifs. Several methods have been proposed, such as Motif-X, MoDL, Motif-All, and MMFPh. All of these methods can uncover a certain number of motifs; however, the problem of how to efficiently discover all significant motifs without redundancy remains unsolved.
This thesis proposes two kinds of methods for motif discovery. For the first, we give a new formulation of the problem, called conditional phosphorylation motif discovery, and propose a method named C-Motif to solve it. Compared with MMFPh and Motif-All, C-Motif is guaranteed to find all significant motifs, and motifs whose over-expression is mainly due to their constituent parts can be filtered out in an elegant manner. It is also very efficient. For the second, we use frequent pattern mining algorithms to mine frequent motifs and then apply permutation tests to accurately assess their statistical significance. We suggest three permutation methods: Standard Permutation (SP), Adaptive Marginal Effect Permutation (AMEP), and Modified Adaptive Marginal Effect Permutation (MAMEP).
Experimental results on real data and simulation studies show that all three permutation methods are capable of removing potential false positives. In particular, AMEP and MAMEP are of practical use: they can satisfy biological researchers' requirements for higher power or lower FDR, respectively.
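To make the permutation-testing step concrete, the following is a minimal Python sketch of a generic label-permutation test for one candidate motif. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not define SP, AMEP, or MAMEP in detail, so the sketch simply pools phosphorylated (foreground) and non-phosphorylated (background) peptides, reshuffles which peptides count as foreground, and asks how often the motif's foreground support is at least as large as observed. The function names, the motif encoding (a mapping from 0-based position to residue), and the toy peptides are all hypothetical.

```python
import random

def matches(peptide, motif):
    """Return True if the peptide has the required residue at every motif position."""
    return all(peptide[pos] == aa for pos, aa in motif.items())

def support(peptides, motif):
    """Count the peptides that contain the motif."""
    return sum(matches(p, motif) for p in peptides)

def permutation_p_value(foreground, background, motif, n_perm=1000, seed=0):
    """Estimate a p-value for the motif by permuting foreground/background labels.

    Assumption: a plain label permutation; the thesis's SP, AMEP, and MAMEP
    procedures may differ from this.
    """
    rng = random.Random(seed)
    observed = support(foreground, motif)
    pooled = list(foreground) + list(background)
    n_fg = len(foreground)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        # Treat the first n_fg shuffled peptides as a random "foreground".
        if support(pooled[:n_fg], motif) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Toy data (hypothetical): 7-residue windows centred on a serine at index 3.
foreground = ["AAASPAA"] * 8 + ["AAASGAA"] * 2
background = ["AAASGAA"] * 18 + ["AAASPAA"] * 2

p_sp = permutation_p_value(foreground, background, {4: "P"})  # proline at +1
p_g = permutation_p_value(foreground, background, {4: "G"})   # glycine at +1
print(p_sp, p_g)
```

In this toy setting the proline motif is concentrated in the foreground and receives a small p-value, while the glycine motif, which is more common in the background, does not. A frequent pattern mining step would first generate the candidate motifs whose foreground support exceeds a chosen threshold, and each candidate would then be assessed in this way.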
Keywords/Search Tags: Phosphorylation Motif, Frequent Pattern Mining, Permutation Test, General Significance, Local Significance