Structural Modeling And Characterization Of Protein Interaction Network

Posted on: 2006-08-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Zhang
Full Text: PDF
GTID: 1118360185995694
Subject: Computer application technology
Abstract/Summary:
As sequence information continues to accumulate rapidly from the Human Genome Project, and as structural information becomes increasingly available from the Structural Genomics Initiative, the next logical step is to integrate biological information from the molecular level of sequence and structure up to the assembly of protein interaction networks. The central idea of this dissertation is to describe protein interactions in terms of the evolutionarily conserved structural domains that form the interfaces between interacting proteins. We first used sequence-based comparisons to cluster homologous sequences. Concurrently, we integrated high-throughput proteomics results on protein interactions. Because such data sets were largely available only for a few model systems such as yeast, fly, and worm, our integration and clustering allow protein interaction information to be extrapolated from model systems to all other species with homologous sequences. Such integration also allows cross-validation and assessment of all known protein interactions. With interactions mapped between the sequence clusters, the central hypothesis of this work is that biological systems are conserved at the level of protein interaction networks, and that biological universality generally increases from the sequence level to the structure level to the network level. We believe strongly that this biological conservation, or universality, can largely be described by a network in which patterns of protein interactions are conserved, including both the components of the network and their organization. Such an interaction network must, once again, be mediated by conserved protein domains.
The key to creating such an assembly of protein interaction networks at the domain level is to create a partition of the sequence and structural representations of all existing protein domains. We believe these domains are the actual biological building blocks.
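The clustering and extrapolation step described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the toy protein identifiers, and the rule that a cluster-level link is transferred to every pair of cluster members in the target species are all assumptions made for illustration.

```python
from collections import defaultdict


def project_to_clusters(cluster_of, interactions):
    """Map protein-level interactions onto pairs of homology clusters.

    cluster_of:   dict protein id -> cluster id (hypothetical clustering result)
    interactions: iterable of (protein_a, protein_b, source) tuples
    Returns a dict (cluster_1, cluster_2) -> set of supporting sources.
    """
    support = defaultdict(set)
    for a, b, source in interactions:
        if a in cluster_of and b in cluster_of:
            key = tuple(sorted((cluster_of[a], cluster_of[b])))
            support[key].add(source)
    return dict(support)


def predict_for_species(cluster_of, species_of, cluster_links, species):
    """Transfer cluster-level links to the proteins of one target species."""
    members = defaultdict(list)
    for protein, cluster in cluster_of.items():
        if species_of[protein] == species:
            members[cluster].append(protein)
    predictions = set()
    for c1, c2 in cluster_links:
        for a in members.get(c1, []):
            for b in members.get(c2, []):
                if a != b:
                    predictions.add(tuple(sorted((a, b))))
    return predictions


# Toy data (assumed): two clusters with members in yeast, fly, and human.
cluster_of = {"yA": "C1", "yB": "C2", "fA": "C1", "fB": "C2", "hA": "C1", "hB": "C2"}
species_of = {"yA": "yeast", "yB": "yeast", "fA": "fly", "fB": "fly",
              "hA": "human", "hB": "human"}
interactions = [("yA", "yB", "yeast two-hybrid"), ("fA", "fB", "fly screen")]

links = project_to_clusters(cluster_of, interactions)
predicted = predict_for_species(cluster_of, species_of, links, "human")
```

In this sketch, a cluster pair supported by more than one independent source plays the role of a cross-validated interaction, and `predicted` holds the interactions extrapolated to a species with no direct data.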
We then used a machine learning approach to deduce the protein interaction map that is most consistent with the underlying domain information. Current interaction information is still obtained largely at the sequence level, whereas each sequence may contain multiple conserved domains, so an exponential number of possible domain-level interactions has to be compared. Correlating the interaction maps between the sequence and domain levels would be intractable if an exhaustive search were required. Our strategy for obtaining an optimal map in terms of conserved protein domains is to apply an expectation-maximization (EM) algorithm as a shortcut for the search. It was our hope that, with interaction information available at the domain level, we could ultimately model as many...
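To make the EM idea concrete, here is a minimal sketch of one common formulation of the problem, not necessarily the one used in the dissertation. It assumes that two proteins interact if at least one pair of their domains interacts, that domain pairs interact independently with probability `lam`, and that observations carry fixed false-positive and false-negative rates (`fp`, `fn`). All names, parameter values, and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import product


def em_domain_interactions(protein_domains, observed, fp=0.01, fn=0.2,
                           n_iter=50, init=0.1):
    """Estimate domain-domain interaction probabilities with EM.

    protein_domains: dict protein id -> set of domain ids
    observed:        iterable of (protein_a, protein_b) observed interactions
    Returns a dict (domain_1, domain_2) -> estimated interaction probability.
    """
    proteins = sorted(protein_domains)
    pairs = [(a, b) for i, a in enumerate(proteins) for b in proteins[i + 1:]]
    obs = {frozenset(p) for p in observed}

    # Candidate domain pairs that could explain each protein pair.
    pair_doms = {
        (a, b): {tuple(sorted(dp))
                 for dp in product(protein_domains[a], protein_domains[b])}
        for a, b in pairs
    }
    lam = {dp: init for dps in pair_doms.values() for dp in dps}

    for _ in range(n_iter):
        total = defaultdict(float)
        count = defaultdict(int)
        for pair, dps in pair_doms.items():
            seen = frozenset(pair) in obs
            # Probability that the two proteins truly interact.
            no_interaction = 1.0
            for dp in dps:
                no_interaction *= 1.0 - lam[dp]
            p_int = 1.0 - no_interaction
            # Probability of the observation actually made for this pair.
            p_obs1 = p_int * (1.0 - fn) + (1.0 - p_int) * fp
            p_o = p_obs1 if seen else 1.0 - p_obs1
            # Likelihood of the observation if a given domain pair interacts.
            like = (1.0 - fn) if seen else fn
            # E-step: posterior that each domain pair interacts in this pair.
            for dp in dps:
                total[dp] += lam[dp] * like / p_o
                count[dp] += 1
        # M-step: average the posteriors over all protein pairs containing dp.
        lam = {dp: total[dp] / count[dp] for dp in lam}
    return lam


# Toy data (assumed): domain A proteins are always seen binding domain B proteins.
protein_domains = {"P1": {"A"}, "P2": {"B"}, "P3": {"A", "C"},
                   "P4": {"B"}, "P5": {"C"}, "P6": {"D"}}
observed = [("P1", "P2"), ("P1", "P4"), ("P3", "P2"), ("P3", "P4")]
lam = em_domain_interactions(protein_domains, observed)
```

Each iteration touches every protein pair and its candidate domain pairs once, so the cost grows with the number of candidate domain pairs instead of with the number of possible domain-level interaction maps, which is what makes EM a shortcut relative to exhaustive search.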
Keywords/Search Tags: sequence clustering, protein-protein interaction, comparative modeling, protein docking, protein interaction network