
Research On Algorithm Of Identifying Protein Complexes Based On Gene Ontology And Network Topology In Protein-protein Interaction Network

Posted on: 2015-12-26
Degree: Master
Type: Thesis
Country: China
Candidate: X P Wang
GTID: 2370330488499634
Subject: Computer Science and Technology
Abstract/Summary:
Identifying protein complexes and functional modules in protein-protein interaction (PPI) networks is important for understanding the organization and function of biological systems. High-throughput experimental techniques have produced large datasets of experimentally detected PPIs, and many researchers now focus on identifying protein complexes in large-scale PPI networks. However, because PPI networks contain false positives and false negatives, and because the topological structure of protein complexes is uncertain, identifying complexes accurately and efficiently remains challenging. To address these problems, this thesis focuses on reconstructing the protein network and designing a protein complex identification algorithm. The main research work is as follows.

First, current PPI data contain considerable noise, which severely limits mining protein complexes directly from the raw network. We therefore propose two weighting methods for reconstructing protein networks that combine Gene Ontology (GO) data with network topology, namely AdjustCD + GO and CD-distance + GO. In the experiments, we selected four existing algorithms and two datasets, DIP and Krogan, and compared the de-noising performance of our methods with that of two existing weighting methods (i.e., AdjustCD and the GO method). The results show that the proposed methods de-noise better than the other two methods and yield larger improvements in the algorithms' specificity (Sp), sensitivity (Sn), and F-score.

Second, building on the proposed weighting method and the topological features of complexes, we propose a novel core-periphery algorithm on the weighted network for identifying protein complexes, PCALG (A Novel Protein Complex Identification Algorithm Based on the Integration of Local Network Topology and Gene Ontology). Using the DIP and Krogan datasets, we compare the algorithm with six existing classical algorithms in terms of sensitivity (Sn), specificity (Sp), F-score, coverage rate, and p-value. The experimental results show that PCALG outperforms the other algorithms, especially in specificity, F-score, and p-value.
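The abstract does not give the formulas behind AdjustCD + GO or CD-distance + GO, so the following is only a minimal sketch of the general idea of weighting each interaction by both network topology and Gene Ontology agreement. The topology term is loosely modeled on a single, unweighted pass of an AdjustCD-style shared-neighbor score; the GO term is a plain Jaccard overlap of annotation sets; and the linear mix controlled by `alpha` is an assumption, not the thesis's combination rule. All function names and the toy data are hypothetical.

```python
from statistics import mean


def adjust_cd_like(adj, u, v):
    """Shared-neighbor score in [0, 1]; low-degree nodes are penalized.

    Assumption: neighbor sets include the node itself, and the penalty
    is the shortfall of a node's degree below the network average.
    """
    avg_deg = mean(len(nbrs) for nbrs in adj.values())
    nu, nv = adj[u] | {u}, adj[v] | {v}
    lam_u = max(0.0, avg_deg - len(adj[u]))
    lam_v = max(0.0, avg_deg - len(adj[v]))
    return 2 * len(nu & nv) / (len(nu) + lam_u + len(nv) + lam_v)


def go_jaccard(go, u, v):
    """Jaccard overlap of the GO term sets of two proteins (assumed measure)."""
    a, b = go.get(u, set()), go.get(v, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reweight(adj, go, alpha=0.5):
    """Return {(u, v): weight} for each undirected edge, with u < v."""
    weights = {}
    for u in adj:
        for v in adj[u]:
            if u < v:  # visit each undirected edge once
                topo = adjust_cd_like(adj, u, v)
                sem = go_jaccard(go, u, v)
                weights[(u, v)] = alpha * topo + (1 - alpha) * sem
    return weights


# Toy network: a triangle A-B-C plus a pendant protein D attached to C.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
# Toy GO annotations (made-up term identifiers).
go = {"A": {"g1", "g2"}, "B": {"g1", "g2"}, "C": {"g1"}, "D": {"g9"}}

weights = reweight(adj, go)
for edge, w in sorted(weights.items()):
    print(edge, round(w, 3))
```

In this toy network the pendant edge C-D receives a lower weight than the edges inside the triangle, because D shares few neighbors and no GO terms with C. A de-noising step could then drop edges below a chosen threshold before running a complex detection algorithm; the threshold is a tuning choice that the abstract does not specify.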
Keywords/Search Tags:Protein complexes, noise network, Gene Ontology, weighted network, PCALG algorithm, network topology