Path-based Similarity Method Of Detecting Protein Complexes

Posted on: 2013-06-24
Degree: Master
Type: Thesis
Country: China
Candidate: D Y Wang
GTID: 2230330395955318
Subject: Computer application technology
Abstract/Summary:
Detecting protein complexes is important for studying cellular organization and function. High-throughput techniques now produce large-scale protein-protein interaction (PPI) network data, which creates an opportunity to detect protein complexes and to study the topology of PPI networks. However, PPI data contain high rates of false positives and false negatives, owing to the limitations of current experimental techniques and the complexity of organisms, and this makes accurate detection of protein complexes difficult. Many methods have recently been proposed to detect protein complexes; a protein complex is usually predicted as a dense subgraph of the PPI network.

We propose an algorithm, TLP (Two Level Paths), that detects protein complexes based on the similarity of two-level paths. This similarity consists of two quantities: the probability that direct paths exist between two groups of proteins, and the probability that indirect paths passing through a single shared neighbor exist between the two groups. The initial similarity of each protein pair is set to its two-level path probability. Using hierarchical clustering, we greedily merge the two groups of proteins with the highest similarity, provided that the density of the resulting group is larger than a predefined threshold. After each merge, we update the two-level path similarity between the new group and its neighboring groups. When no pair of groups satisfies the merging condition, the current groups are predicted as protein complexes. We compare TLP with six existing protein complex prediction methods, using both existing evaluation measures and newly proposed ones, on reference protein complexes from benchmark data sets. Experimental results on three yeast PPI networks of different scales and properties show that TLP has the best performance. The complexes identified by our algorithm match the standard data set very well and thus provide further insight for future biological study.
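
The greedy merging procedure described above can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not give the exact similarity formula, so the function `two_level_similarity` below assumes the similarity is the fraction of cross-group protein pairs joined by an edge plus the fraction joined through at least one shared neighbor. The names `build_adjacency`, `density`, `greedy_merge`, and the default threshold `min_density=0.7` are likewise hypothetical. For brevity the sketch recomputes all similarities in every round instead of updating only the neighbors of the newly merged group.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) interactions."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def density(group, adj):
    """Edge density of a group: internal edges / possible internal edges."""
    n = len(group)
    if n < 2:
        return 1.0
    internal = sum(1 for u, v in combinations(group, 2) if v in adj[u])
    return 2.0 * internal / (n * (n - 1))


def two_level_similarity(a, b, adj):
    """ASSUMED formula: share of cross-group pairs linked directly, plus
    share of cross-group pairs linked through at least one common neighbor."""
    pairs = len(a) * len(b)
    direct = sum(1 for u in a for v in b if v in adj[u])
    indirect = sum(1 for u in a for v in b if adj[u] & adj[v])
    return (direct + indirect) / pairs


def greedy_merge(edges, min_density=0.7):
    """Repeatedly merge the most similar pair of groups whose union stays
    denser than min_density; stop when no pair qualifies."""
    adj = build_adjacency(edges)
    groups = [frozenset([p]) for p in sorted(adj)]
    while True:
        best = None
        for a, b in combinations(groups, 2):
            sim = two_level_similarity(a, b, adj)
            # Skip unrelated pairs and pairs whose union is too sparse.
            if sim <= 0 or density(a | b, adj) <= min_density:
                continue
            if best is None or sim > best[0]:
                best = (sim, a, b)
        if best is None:
            return groups
        _, a, b = best
        groups = [g for g in groups if g not in (a, b)] + [a | b]


if __name__ == "__main__":
    # Toy network: two triangles joined by a single bridge edge C-D.
    toy_edges = [("A", "B"), ("B", "C"), ("A", "C"),
                 ("D", "E"), ("E", "F"), ("D", "F"),
                 ("C", "D")]
    for complex_ in greedy_merge(toy_edges):
        print(sorted(complex_))
```

On the toy network, the density threshold prevents the two triangles from being merged across the bridge edge, so each triangle is reported as a separate group. A real run would also need a rule for discarding singleton groups, which the abstract does not specify.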
Keywords/Search Tags: protein-protein interaction network, protein complex, detection, evaluation