Identifying protein-protein binding sites and binding partners using sequence and structure information

Posted on: 2008-05-04
Degree: Ph.D.
Type: Dissertation
University: University of California, San Diego
Candidate: Chung, Jo-Lan
Full Text: PDF
GTID: 1440390005477763
Subject: Chemistry
Abstract/Summary:
High-throughput DNA sequencing has rapidly increased the amount of available protein sequence data. Concurrently, structural genomics projects are driving strong growth in the number of experimentally determined protein structures. One of the challenges in current biology is to exploit these data to understand and predict protein functions, including protein-protein interactions. The goal of this work is to use the increasing amount of structural information, provided by both structural genomics and traditional structural biology, together with sequence information, to identify where proteins bind to each other (binding site locations) and to which proteins these sites bind (binding specificity).

First, a graphical tool and a scoring function were developed to extract information on the structural conservation of residues from multiple protein structures. The structurally conserved residues identified by the scoring function were observed to correlate with protein binding sites. A new method, based on machine learning techniques, was then developed to identify the location of protein-protein binding sites using sequence information and structural conservation. We found that incorporating structural conservation significantly improved prediction performance.

Subsequently, to identify the specificity of protein binding sites, we developed another method to detect whether two protein binding sites, once identified, interact with each other. This method extracts sequence and structural complementarity across protein interfaces using machine learning techniques. Finally, we built a pipeline that links this method with a modified version of the binding site prediction method described above. We have demonstrated that this high-throughput pipeline is capable of identifying, on a large scale, binding sites for proteins, their interacting binding sites and, ultimately, their binding partners.
Keywords/Search Tags: Protein, Binding sites, Sequence, Identify, Structural, Information, Using