The Evolution Of Protein Interaction Networks

Posted on: 2012-02-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Y Liu
GTID: 1110330371462874
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Protein interaction networks (PINs) define most cellular functions. Research on the evolution of PINs may not only shed light on the principles driving the evolution of living organisms, but also help us understand biological organization and the functions of cellular proteins. The evolution of PINs has been studied at multiple levels for several years; however, there is still considerable debate in this field, and the evolutionary mechanism of PINs has not been fully uncovered. In this thesis, we focus on the origin and evolutionary mode of PINs, and on protein self-interactions, which play an important role in the evolution of PINs.

First, we study the evolutionary mode of PINs from the perspective of network motifs, and provide novel evidence supporting the hypothesis of clustered additions during the growth of PINs. How proteins were added to a PIN during its growth is a basic and important question in the study of PIN evolution. Network motifs are recurring interconnected patterns of specific topology in complex networks; they may represent the simplest building blocks of cellular machines and are of biological significance. We first classify proteins by their time of origin and find that, in today's PINs, proteins of the same age class tend to interact with each other and further cluster to form network motifs. Moreover, such co-origin of motif constituents is affected by motif topology and biological function. We then find that the proteins within motifs whose constituents are of the same age class tend to be densely interconnected, to co-evolve, and to share the same biological functions, and that these motifs tend to be constituent units of protein complexes. These findings provide novel evidence for the hypothesis of additions of clustered interacting nodes and indicate that network motifs, especially those with dense topology and specific function, may play important roles in this process.
The results suggest that functional constraints (natural selection) may be the underlying driving force for such additions of clustered interacting nodes. This work may help us understand the evolutionary mechanism of PINs.

Second, we systematically study self-interacting proteins from multiple aspects, and develop a self-interacting protein prediction model that integrates multiple data sources using a naïve Bayesian network. Protein self-interactions, that is, interactions between two or more copies of the same protein, play an important role in the evolution of PINs, especially in the emergence of their modularity. The two most common high-throughput assays used to detect protein interactions, yeast two-hybrid (Y2H) and affinity purification with mass spectrometry (AP-MS), have limited ability to discern self-interactions, which may lead to underestimation of their number. Under-representation of self-interactions in interaction data may in turn lead to erroneous conclusions about the evolution of PINs. We first systematically characterize self-interacting proteins from multiple aspects. Compared with non-self-interacting proteins, self-interacting proteins tend, in terms of sequence, to have more domains and to include a lower fraction of disordered proteins; in terms of function, to be enriched in signature molecules, enzyme genes, and housekeeping genes (human) or essential genes (yeast); in terms of evolution, to be conserved and enriched in proteins that originated at the time of the common ancestor of the three domains of life; and in terms of topology, to be important in multiple types of biological networks. Then, based on a naïve Bayesian network, we develop a human self-interacting protein prediction model that draws on multiple data sources, including self-interacting protein data from model organisms, interaction domain data, and network topological information.
Under five-fold cross-validation, the prediction model performs well. For the first time, we systematically study self-interacting proteins and develop a prediction model for them. This work may help us understand the cellular function of self-interacting proteins, and the prediction model may efficiently extend the list of known self-interacting proteins, laying a foundation for future research on the role of protein self-interactions in the evolution of PINs.

Finally, to mine the functions of the proteins identified by the Human Liver Organelle Proteome (HLOP) research in our lab, we construct bioinformatics systems that automatically predict protein functions from the protein interaction network. HLOP identified a large number of proteins located in the nucleus, mitochondria, plasma membrane and endoplasmic reticulum of human liver cells. Mining important biological knowledge from these proteins is a challenge for bioinformatics researchers. Based on the protein interaction network and on GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway annotations, we construct two prediction systems for gene function mining: a "direct method" (using the neighbors of the protein of interest in the PIN) and a "module-assisted method" (using the functional module in which the protein of interest is located). Using the two methods, and taking subcellular location and liver-expressed gene data into account, we predict new GO functions for 1753 HLOP genes, including 180 genes with no previous GO biological process annotation and 511 genes with no previous corresponding subcellular annotation; we also predict potential KEGG pathways for 1592 HLOP genes, including 154 genes with no GO biological process annotation and 477 genes with no previous subcellular annotation.
After a preliminary manual filtering of the prediction results, we find 6 genes that may be closely related to organelle functions, providing important functional clues for further wet-lab experiments. The gene function prediction systems constructed here will play an important role in proteomics research in our lab.

This thesis will help us understand the evolutionary and functional mechanisms of living organisms, laying a foundation for future systems biology research.
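
The "direct method" mentioned in the abstract (predicting a protein's function from its neighbors in the PIN) can be illustrated with a minimal Python sketch. The abstract does not give the actual scoring scheme, so the simple neighbor-counting rule below is an assumption, as are the function name `predict_functions_direct`, the `min_support` threshold, and the toy protein and term labels; the thesis additionally uses subcellular location and liver-expression data, which are not modeled here.

```python
from collections import Counter


def predict_functions_direct(protein, interactions, annotations, min_support=2):
    """Rank candidate annotation terms for `protein` by the number of its
    interaction partners that carry each term (neighbor counting).

    interactions: iterable of (protein_a, protein_b) undirected edges
    annotations:  dict mapping protein -> set of annotation terms
    min_support:  hypothetical cutoff on the number of supporting neighbors
    """
    # Collect the direct neighbors of the query protein, skipping self-loops.
    neighbors = set()
    for a, b in interactions:
        if a == b:
            continue
        if a == protein:
            neighbors.add(b)
        elif b == protein:
            neighbors.add(a)

    # Count how many neighbors carry each annotation term.
    counts = Counter()
    for neighbor in neighbors:
        for term in annotations.get(neighbor, ()):
            counts[term] += 1

    # Keep only new terms with enough support; sort by support, then by name.
    known = set(annotations.get(protein, ()))
    ranked = [(t, c) for t, c in counts.items() if c >= min_support and t not in known]
    ranked.sort(key=lambda tc: (-tc[1], tc[0]))
    return ranked


# Toy network with placeholder identifiers (not real HLOP data).
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P4")]
terms = {
    "P2": {"term_A", "term_B"},
    "P3": {"term_A"},
    "P4": {"term_B", "term_C"},
}
print(predict_functions_direct("P1", edges, terms))
```

In this toy network, `term_A` and `term_B` are each carried by two neighbors of `P1` and are returned, while `term_C` has only one supporting neighbor and falls below the cutoff. A module-assisted variant would count terms over the members of the functional module containing the protein rather than over its direct neighbors.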
Keywords/Search Tags: Protein interaction networks, Biological evolution, Network motifs, Protein self-interactions, Prediction of gene functions