Identifying Protein Complexes in Large-Scale Protein Interaction Networks

Posted on: 2012-01-12
Degree: Master
Type: Thesis
Country: China
Candidate: B B Liu
GTID: 2120330335989546
Subject: Computer Science and Technology
Abstract/Summary:
Biomedical research is now in the post-genome era. One of its most important challenges is to analyze systematically, and understand comprehensively, how proteins carry out life activities by interacting with each other. Identifying protein complexes from large-scale protein interaction networks plays an important role in predicting protein functions and understanding specific biological processes.

When applied to protein interaction networks, CPM has low recognition accuracy and is ill-suited to identifying meso-scale protein complexes. We therefore introduce a distance restriction and develop a parameterized algorithm, CPM-DR, and a non-parameterized algorithm, CP-DR, for identifying protein complexes. Both algorithms are applied to the protein interaction network of Saccharomyces cerevisiae. The experimental results show that CPM-DR and CP-DR detect a large number of protein complexes with specific biological significance and biological functions more effectively, more precisely, and more comprehensively.

When identifying protein complexes, traditional density-based and local search algorithms neglect many peripheral proteins that connect to the core protein clusters through only a few links. As a result, biologically meaningful protein complexes that lack highly connected topologies are missed. In addition, previous methods seldom take into account that proteins differ in how essential they are to cellular life. To overcome these shortcomings, we propose a protein complex discovery algorithm based on Essential Protein and lOcal Fitness, named EPOF. EPOF is applied to the unweighted and weighted yeast protein interaction networks. Experimental results show that EPOF outperforms previous competing algorithms and can identify significant protein complexes with low density. The results also confirm the importance of essential proteins for identifying protein complexes.

Because protein-protein interactions are dynamic, and the available interaction data are incomplete and contain many false positives, we propose a protein complex detection algorithm based on Tissue-Specificity and lOcal Fitness (called TSOF) that integrates tissue-specific gene expression data with the human protein interaction network. TSOF is applied to the static human protein interaction network, and the experimental results show that it outperforms previous competing algorithms. The results also confirm the importance, for identifying protein complexes, of the tissue-specific seed set, which integrates biological characteristics with network topological features.

The protein complex mining algorithms proposed in this thesis start from different angles and effectively solve several problems that arise when clustering protein interaction networks. The identified protein complexes are shown to be statistically significant and can serve as a reference for biologists in their biochemical experiments.
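
For readers unfamiliar with the baseline that CPM-DR and CP-DR build on, the following is a minimal sketch, assuming that CPM here refers to the standard clique percolation method: every k-clique is enumerated, and k-cliques that share k-1 nodes are merged into one community. It does not implement the distance restriction, whose definition the abstract does not give. The function name `k_clique_communities`, the variable `toy_edges`, and the single-letter protein labels are hypothetical, chosen only for illustration, and the brute-force clique enumeration is suitable for toy graphs only.

```python
from itertools import combinations


def k_clique_communities(edges, k):
    """Baseline clique percolation: merge k-cliques that share k-1 nodes.

    Illustrative sketch only; this is NOT the thesis's CPM-DR or CP-DR.
    """
    # Build an undirected adjacency map from the edge list.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Enumerate every k-clique by brute force (toy graphs only).
    cliques = [
        frozenset(c)
        for c in combinations(sorted(adj), k)
        if all(b in adj[a] for a, b in combinations(c, 2))
    ]

    # Union-find over clique indices.
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Two k-cliques are adjacent when they overlap in exactly k-1 nodes.
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    # Each connected group of cliques yields one community (union of nodes).
    groups = {}
    for i, clique in enumerate(cliques):
        groups.setdefault(find(i), set()).update(clique)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy network: two triangles sharing edge B-C, a bridge D-E,
# and a separate triangle E-F-G.
toy_edges = [
    ("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("D", "E"), ("E", "F"), ("E", "G"), ("F", "G"),
]
communities = k_clique_communities(toy_edges, 3)
print(communities)  # [['A', 'B', 'C', 'D'], ['E', 'F', 'G']]
```

In this toy network the bridge D-E belongs to no triangle, so the two groups stay separate. The sketch also shows why the choice of k matters: with k = 3, any protein attached to a cluster by a single link is left out of every community.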
Keywords/Search Tags: systems biology, protein interaction network, clustering algorithm, protein complex