Research On Key Protein Recognition Algorithms Based On Protein Interaction Network

Posted on: 2019-11-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z G Yang
GTID: 2430330602461021
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Studies have shown that identifying essential proteins helps in understanding the growth and regulation of cellular life and is also useful for disease research and drug design. Biomedical experimental methods for identifying essential proteins, however, are usually costly and inefficient. With the development of high-throughput technologies, a growing number of protein-protein interactions (PPI) have become available, which has promoted the study of essential proteins at the network level. Although a series of network-based methods for identifying essential proteins have been proposed, improving prediction accuracy remains a challenge because PPI networks contain many false negatives and false positives. To address these problems, the main research contents of this thesis are as follows:

(1) We propose a method for constructing a reliable protein interaction network. The original static protein interaction network (S-PIN) contains many false negatives and false positives, which reduce the reliability of the network and the accuracy of some algorithms. We therefore propose a method for constructing a weighted, reliable protein interaction network (RE-PIN) using subcellular location information and protein complex information. Experimental results show that RE-PIN effectively improves the accuracy of network-based essential protein identification algorithms.

(2) We improve the edge clustering coefficient (ECC) and the neighborhood centrality (NC) algorithm. ECC is not applicable to weighted protein interaction networks and ignores the effects of false negatives and false positives, so it describes the network topology inaccurately. First, we improve ECC using the concept of protein-protein interaction reliability and define the reliable edge clustering coefficient (RE-ECC). Then, we propose an improved algorithm for identifying essential proteins based on RE-ECC and NC, named reliable neighborhood centrality (RE-NC). Experimental results show that RE-NC performs clearly better than eight other algorithms.

(3) We propose an algorithm for identifying essential proteins based on protein domain specificity. Because topology-based algorithms ignore the intrinsic biological significance of the protein interaction network, their accuracy is limited. Inspired by TF-IDF, we propose two new algorithms, Do-NC for unweighted networks and Do-ReNC for weighted networks, which identify essential proteins by integrating protein domain information with the topological features of the protein interaction network. Experimental results show that Do-NC and Do-ReNC perform better than eight other algorithms on the corresponding networks.

(4) We study a method for identifying essential proteins based on multi-feature fusion using D-S evidence theory. Different algorithms usually use different features to assess the significance of a protein, so their results differ. We therefore introduce a method, named DS-ESS, that identifies essential proteins by comprehensively considering the results of different algorithms based on D-S evidence theory. The results show that DS-ESS effectively improves identification accuracy.
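As background for item (2), the following is a minimal sketch of the unweighted edge clustering coefficient and neighborhood centrality as they are commonly defined in the literature: ECC(u, v) is the number of triangles containing edge (u, v) divided by min(deg(u) - 1, deg(v) - 1), and NC(u) is the sum of ECC over the edges incident to u. It is not the thesis's RE-ECC or RE-NC, whose weighting scheme the abstract does not specify. The function names and the toy network are assumptions made for illustration.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, skipping self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """ECC(u, v) = triangles through (u, v) / min(deg(u) - 1, deg(v) - 1)."""
    triangles = len(adj[u] & adj[v])  # common neighbors close a triangle
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return triangles / denom if denom > 0 else 0.0


def neighborhood_centrality(adj):
    """NC(u) = sum of ECC(u, v) over all neighbors v of u."""
    return {
        u: sum(edge_clustering_coefficient(adj, u, v) for v in adj[u])
        for u in adj
    }


# Toy network (hypothetical proteins): two triangles sharing node C.
edges = [("A", "B"), ("A", "C"), ("B", "C"),
         ("C", "D"), ("C", "E"), ("D", "E")]
adj = build_adjacency(edges)
nc = neighborhood_centrality(adj)
ranking = sorted(nc, key=nc.get, reverse=True)  # highest NC first
print(ranking[0], nc)
```

In this toy network, C sits in both triangles and therefore receives the highest score. A weighted variant would replace the triangle count with some function of edge reliabilities, but the abstract does not say which one the thesis uses.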
Keywords/Search Tags:protein interaction network, essential protein, subcellular location, protein complex, reliable edge clustering coefficient, protein domain, TF-IDF, D-S evidence theory
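To illustrate the D-S evidence theory named in the keywords, here is a minimal sketch of Dempster's rule of combination applied to two mass functions over the frame {essential, nonessential}. It shows only the generic fusion rule. How DS-ESS derives mass functions from the individual algorithms' scores is not described in the abstract, so the mass values, labels, and function name below are assumptions.

```python
def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb  # mass assigned to contradictory hypotheses
    if 1.0 - conflict < 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Normalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}


E = frozenset({"essential"})
N = frozenset({"nonessential"})
EN = E | N  # ignorance: mass not committed to either hypothesis

# Hypothetical evidence about one protein from two different algorithms.
m1 = {E: 0.6, N: 0.1, EN: 0.3}
m2 = {E: 0.5, N: 0.2, EN: 0.3}

fused = dempster_combine(m1, m2)
print(fused)
```

When both sources lean toward the same hypothesis, the fused mass on that hypothesis is higher than either source's alone, which is the intuition behind combining the results of several algorithms.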