Essential proteins are indispensable for the survival and development of an organism; without them, the organism cannot survive. Identifying essential proteins not only helps in understanding how life operates, but also supports research into the mechanisms of biological evolution. It is therefore significant for systems biology and can provide valuable theories and methods for disease treatment and drug design. In the post-genomic era, as protein-protein interaction data accumulate, identifying essential proteins from protein-protein interaction networks has become a hot topic in bioinformatics.

Based on the topological characteristics of networks, this paper analyzes the characteristics of nodes in protein-protein interaction networks and explores the local properties of proteins within protein complexes. New methods are then proposed for identifying essential proteins in protein-protein interaction networks.

Most methods for identifying essential proteins are based on the topological properties of the whole network. Previous studies have shown that protein essentiality is highly correlated with protein complexes. Building on this, we combine the edge clustering coefficient with protein complex information and propose a new method, the CSC (Combining SoECC and Complex Centrality) algorithm, to identify essential proteins. The results show that CSC identifies more essential proteins than ten other centrality methods, which indicates that it is more accurate than those methods. Moreover, CSC yields a significant improvement in predicting low-connectivity essential proteins. Integrating the global properties of the network with the local properties of protein complexes is therefore worthwhile for identifying essential proteins.

Proteins within a protein complex differ in importance, and protein-protein interaction networks contain many false positives.
In consideration of these facts, we propose a new centrality algorithm, Graph Entropy Centrality (GEC), to identify essential proteins. GEC evaluates the importance of each protein by computing its entropy value and its GO semantic similarity value within protein complexes. The results show that GEC identifies more essential proteins than other methods, which indicates that it is more accurate than those methods. GEC also identifies essential proteins that the other methods fail to identify. Moreover, GEC performs well in predicting low-connectivity essential proteins.
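
As an illustration of the edge clustering coefficient that CSC builds on, the following minimal Python sketch computes a per-protein sum of edge clustering coefficients (SoECC) on a toy network. The abstract does not give the formulas, so the sketch assumes a commonly used definition: for an edge (u, v), the number of triangles containing that edge divided by min(deg(u) - 1, deg(v) - 1). The function names and the toy protein labels are hypothetical, and the complex-centrality part of CSC is not reproduced here.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-interactions
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: triangles on edge (u, v) / min(deg(u)-1, deg(v)-1)."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        # An endpoint of degree 1 cannot take part in any triangle.
        return 0.0
    triangles = len(adj[u] & adj[v])  # common neighbours close a triangle
    return triangles / denom


def soecc(adj):
    """Sum of edge clustering coefficients over each protein's edges."""
    return {
        u: sum(edge_clustering_coefficient(adj, u, v) for v in adj[u])
        for u in adj
    }


# Toy network: a triangle A-B-C plus a pendant protein D attached to C.
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
scores = soecc(build_adjacency(toy_edges))
for protein in sorted(scores):
    print(protein, scores[protein])
```

In this toy network the pendant protein D scores zero because its only edge lies in no triangle, while C gains nothing from its extra edge to D. A score of this kind rewards membership in densely connected neighbourhoods rather than raw degree, which is one plausible reason such measures help with low-connectivity essential proteins.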