
Essential Protein Discovery Based on the Integration of Protein-Protein Interaction Network and Gene Expression Data

Posted on: 2014-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: H H Zhang
GTID: 2250330425470933
Subject: Computer technology
Abstract/Summary:
Essential proteins are proteins that are indispensable to the viability and reproduction of an organism, and they play an important role in cell activities. Identifying essential proteins matters not only for life science research but also for practical purposes such as disease diagnosis and treatment and drug design. With the development of high-throughput technologies, methods that identify essential proteins from protein-protein interaction networks have become increasingly popular. The main original contributions of this thesis are as follows.

First, ten existing methods for identifying essential proteins from protein-protein interaction networks (DC, BC, CC, SC, EC, IC, BN, DMNC, SoECC and LAC) are studied and each is tested on Saccharomyces cerevisiae. The results show that methods based only on the protein-protein interaction network depend heavily on the network and are sensitive to noise such as false positives.

Second, in view of this, a new centrality measure named PeC is proposed that integrates the protein-protein interaction network with gene expression data. Unlike other centrality measures, PeC determines a protein's essentiality based not only on its connectivity but also on whether it has a high probability of being co-clustered and co-expressed with its neighbors. The edge clustering coefficient is used to describe how strongly two interacting proteins are co-clustered, and the Pearson correlation coefficient is used to evaluate how strongly they are co-expressed. PeC is tested on Saccharomyces cerevisiae, and its prediction precision clearly exceeds that of the ten previously proposed essential protein discovery methods.

Third, a new prior-knowledge-based scheme for discovering new essential proteins from protein-protein interaction networks is proposed. Based on this scheme, two essential protein discovery algorithms, CPPK and CEPPK, are developed. CPPK predicts new essential proteins based on network topology, while CEPPK detects new essential proteins by integrating the protein-protein interaction network with gene expression data. Both are tested on the well-studied species Saccharomyces cerevisiae, and their prediction precision exceeds that of the ten previously proposed essential protein discovery methods within a certain sample level.

By integrating protein-protein interaction networks with gene expression data, the methods proposed in this thesis provide new approaches and ideas for predicting essential proteins.
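
The abstract describes PeC only in words: an edge clustering coefficient for co-clustering and a Pearson correlation coefficient for co-expression. The Python sketch below shows one way the two quantities could be combined. Several details are assumptions not stated in the abstract: the edge clustering coefficient is taken as the number of common neighbors of an edge divided by min(deg(u) - 1, deg(v) - 1), the per-edge weight is the product of the two coefficients, and a protein's score is the sum of these weights over its neighbors. The function names and the toy network are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # assumption: a flat profile carries no co-expression signal
    return cov / (norm_x * norm_y)


def build_adjacency(edges):
    """Undirected adjacency sets from a list of (u, v) interaction pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def edge_clustering(adj, u, v):
    """Assumed form: triangles through edge (u, v) over the maximum possible."""
    common = len(adj[u] & adj[v])
    max_triangles = min(len(adj[u]) - 1, len(adj[v]) - 1)
    return common / max_triangles if max_triangles > 0 else 0.0


def pec_scores(edges, expression):
    """Assumed PeC-style score: sum over neighbors of ECC(u, v) * PCC(u, v)."""
    adj = build_adjacency(edges)
    return {
        u: sum(
            edge_clustering(adj, u, v) * pearson(expression[u], expression[v])
            for v in neighbors
        )
        for u, neighbors in adj.items()
    }


# Hypothetical toy network and expression profiles, for illustration only.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_expression = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8],
    "C": [1, 3, 2, 5],
    "D": [4, 3, 2, 1],
}
scores = pec_scores(toy_edges, toy_expression)
ranking = sorted(scores, key=scores.get, reverse=True)
```

In this sketch, proteins near the top of `ranking` would be the candidates for essentiality. An edge that closes no triangle, such as C-D in the toy network, contributes nothing to the score, which illustrates how a measure of this kind could discount isolated, possibly false-positive interactions.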
Keywords/Search Tags: Essential protein, protein-protein interaction network, centrality measure, gene expression, prior knowledge