
Integrating sequence and structure data for identifying functional sites on protein structures

Posted on: 2006-02-16
Degree: Ph.D.
Type: Dissertation
University: Stanford University
Candidate: Liang, Mike Hsin-Ping
Full Text: PDF
GTID: 1450390008955267
Subject: Biology
Abstract/Summary:
Genomes of hundreds of organisms have now been sequenced, and the sequencing of many more continues to accelerate. These sequencing projects have revealed many new proteins whose functions are unknown. Many of these proteins have no sequence similarity to known, functionally characterized proteins. To aid in functionally annotating these proteins, structural genomics initiatives are developing high-throughput methods for determining the 3D structures of these uncharacterized proteins. An important challenge in analyzing these data is to use the increasingly available 3D structure data to help identify protein function.

To address this challenge, I have developed a method for integrating 3D structure information with 1D sequence information to identify functional sites on protein structures. The method automatically characterizes the 3D physicochemical environment around a 1D sequence pattern. I have shown that the resulting 3D structural model has better sensitivity and specificity than the 1D sequence pattern alone. I have evaluated my method on calcium binding sites and serine protease active sites. I have also automatically created a 3D structural library of protein functional sites for automatic function prediction of protein structures. I have implemented a web-accessible system for automatic function annotation of 3D protein structures and made the structural library of functional sites available at http://feature.stanford.edu/webfeature/.
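The idea of describing the 3D environment around a 1D sequence pattern can be illustrated with a small sketch. This is not the dissertation's implementation; it only assumes that a pattern match locates a site in the sequence, and that the surroundings of that site can be summarized by counting atoms of each element in concentric shells. The function names, the toy pattern `D.D.D`, the shell width, and the atom coordinates are all hypothetical and chosen for illustration.

```python
import math
import re
from collections import Counter


def find_pattern_sites(sequence, pattern):
    """Return 0-based start indices of non-overlapping regex matches in a 1D sequence."""
    return [m.start() for m in re.finditer(pattern, sequence)]


def shell_profile(center, atoms, shell_width=1.0, n_shells=6):
    """Count atoms by element in concentric shells around a 3D point.

    atoms is a list of (element, (x, y, z)) tuples; atoms beyond the
    outermost shell are ignored. Shell i covers distances in
    [i * shell_width, (i + 1) * shell_width).
    """
    profile = [Counter() for _ in range(n_shells)]
    for element, xyz in atoms:
        distance = math.dist(center, xyz)
        index = int(distance // shell_width)
        if index < n_shells:
            profile[index][element] += 1
    return profile


def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Toy inputs (hypothetical, not real protein data).
toy_sequence = "MKDADGDGKLT"
toy_atoms = [
    ("O", (1.5, 0.0, 0.0)),
    ("O", (0.0, 2.2, 0.0)),
    ("C", (0.0, 0.0, 3.7)),
    ("N", (9.0, 9.0, 9.0)),  # too far away to fall in any shell
]

sites = find_pattern_sites(toy_sequence, r"D.D.D")
profile = shell_profile((0.0, 0.0, 0.0), toy_atoms)
```

A per-shell count vector of this kind could then be compared between known sites and non-sites, and `sensitivity_specificity` shows how the two evaluation measures named in the abstract are computed from a confusion matrix.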
Keywords/Search Tags:Functional sites, Protein structures, Sequence, Data, Structural