Research And Implementation Of Protein Complex Identification Algorithm For Protein-Protein Interaction Network

Posted on: 2020-04-11
Degree: Master
Type: Thesis
Country: China
Candidate: P H Wang
GTID: 2370330602468354
Subject: Computer technology

Abstract/Summary:
Protein complexes are among the most important functional units in cellular biological processes, so detecting them is essential for understanding the principles of cellular organization and function. Traditional experimental methods for identifying protein complexes are overly complicated, while complexes identified by computational methods suffer from high false-positive rates. This thesis considers both the topological and the biological characteristics of the protein-protein interaction (PPI) network and studies two problems in protein complex identification: noisy interaction data and the inefficient identification of overlapping structures.

To address the high false-positive rate in existing protein interaction data, we construct a weighted protein interaction network from the topological characteristics of PPI network nodes and propose BTW, a backbone-degree tree clustering algorithm based on the weighted protein-protein interaction network for protein complex identification. The algorithm first analyzes the topological features of the PPI network nodes and uses a weighted backbone algorithm to assign weights to the PPI network. It then clusters the weighted network with the Walktrap algorithm. Experiments on several Saccharomyces cerevisiae PPI network datasets show that the algorithm produces fewer false positives than the MCL and Walktrap algorithms and significantly improves recognition accuracy and performance.

To address the inability of existing methods to identify overlapping protein complexes and their neglect of functional information between proteins, we propose WCFM, a clustering algorithm based on topological features and gene ontology information for protein functional module identification. A weighted network model is built by using the semantic similarity of gene ontology terms to measure how strong or weak each protein interaction is. The method weights the edges of the PPI network accordingly, which reduces its dependence on network topology. The experimental results show that combining gene ontology information with PPI data improves the accuracy of protein complex identification and makes the results more biologically meaningful.

Finally, Cluster C, a comprehensive analysis system for PPI network clustering methods, is designed and developed. The platform currently integrates eight clustering algorithms, including ClusterONE, SPICi and MCL, and five evaluation methods, including F-measure and Accuracy. D3.js visualization is applied to large-scale protein interaction networks to display the PPI network and the clustering results, which helps in explaining biological phenomena.
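
The edge-weighting idea behind both algorithms can be illustrated with a minimal Python sketch. It is not the thesis's method: the abstract does not specify the weighted backbone algorithm or the gene ontology semantic similarity measure, so the sketch substitutes a plain Jaccard overlap for each (shared interaction partners for topology, shared annotation terms for function). The function names, the mixing parameter `alpha`, and the placeholder annotation labels are all assumptions introduced for illustration.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_edges(edges, go_terms, alpha=0.5):
    """Assign each PPI edge a weight in [0, 1].

    edges    : iterable of (protein_u, protein_v) pairs
    go_terms : dict mapping a protein to its set of annotation terms
    alpha    : hypothetical mixing factor between the topological
               score and the functional score (an assumption)
    """
    edges = list(edges)

    # Build the neighbor set of every protein in the network.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    weights = {}
    for u, v in edges:
        # Stand-in topological score: overlap of closed neighborhoods,
        # so the two endpoints themselves count as shared context.
        topo = jaccard(neighbors[u] | {u}, neighbors[v] | {v})
        # Stand-in functional score: overlap of annotation term sets.
        func = jaccard(go_terms.get(u, set()), go_terms.get(v, set()))
        weights[(u, v)] = alpha * topo + (1 - alpha) * func
    return weights


# Toy network with placeholder annotation labels (not real GO identifiers).
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_terms = {
    "A": {"term1", "term2"},
    "B": {"term1", "term2"},
    "C": {"term2"},
    "D": {"term9"},
}
toy_weights = weight_edges(toy_edges, toy_terms)
for edge, weight in sorted(toy_weights.items()):
    print(edge, round(weight, 3))
```

In this toy network the edge between A and B, whose endpoints share all partners and all annotations, receives the highest weight, while the edge between C and D, which shares no annotations, receives the lowest. A weighted network of this kind could then be handed to a clustering step such as Walktrap, which the sketch does not implement.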
Keywords/Search Tags: Protein-protein interaction network, Clustering algorithm, Protein complex, Topological features, Gene ontology