
Computational protein structure prediction

Posted on: 2004-08-22
Degree: Ph.D
Type: Thesis
University: University of Toronto (Canada)
Candidate: Feldman, Howard Jonathan
Full Text: PDF
GTID: 2460390011968671
Subject: Biophysics
Abstract/Summary:
How does a given protein sequence arrive at a native, low energy conformation from amongst the seemingly infinite number of possibilities? Proteins fold along a directed pathway, and do not sample all of conformational space. However, understanding how that pathway is chosen has eluded scientists. The purpose of this thesis is to investigate this question, the protein folding problem, taking the sampling and scoring aspects into consideration independently.

A novel, speed-optimized method has been developed to sample protein conformational space by building probabilistic all-atom off-lattice protein models. This is accomplished through a kinetic random walk in either torsional or Euclidean space, optionally biased by secondary structure prediction. We demonstrate that the distribution of deviations from the native state for small, globular proteins is an Extreme Value Distribution. This tells us the amount of sampling required to get within a given deviation. We have also applied our method to homology modelling and to coarse-grained unfolding of proteins. For four out of five proteins compared, our method reproduces the major unfolding events and predicts fold initiation sites as well as molecular dynamics does, in a fraction of the time.

Having characterized the sampling portion of protein folding, one must also be able to pick out the best structure(s). Using a number of existing potentials, an extensive test of scoring functions on seventeen different folds was performed. Crease Energy was derived as an approximation to the steepness of the local free energy landscape at a given structure. We found that these scoring functions are often able to select good structures from large samples based on energy score, but usually cannot pick out the very best.

The Distributed Folding Project is a distributed computing project which samples conformational space and scores structures as they are built, attempting to pick out the best ones generated. Over 30 billion structures of over 20 proteins have been sampled, using 6000 CPUs across the world, making this the largest protein structure prediction computation to date. This project will serve as an ideal platform for rapidly testing and evaluating new sampling and scoring methods, with the aim of developing better methods for the future.
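The statement that an Extreme Value Distribution fit tells us the amount of sampling required can be illustrated with a minimal sketch. It assumes the Gumbel (type I, maximum) form of the distribution, independent samples, and purely hypothetical location and scale parameters; none of these details are given in the abstract, and the function names are illustrative only. If a single sample falls within a deviation threshold with probability p, then n samples contain at least one such structure with probability 1 - (1 - p)^n, which can be solved for n.

```python
import math


def gumbel_cdf(x, mu, beta):
    """CDF of the Gumbel (type I extreme value, maximum form) distribution.

    Assumption: the abstract does not say which extreme value form was
    fitted; this form is chosen only for illustration.
    """
    return math.exp(-math.exp(-(x - mu) / beta))


def samples_needed(threshold, mu, beta, confidence=0.95):
    """Smallest n such that n independent samples contain at least one
    structure with deviation <= threshold, with the given confidence."""
    p = gumbel_cdf(threshold, mu, beta)
    if p <= 0.0:
        return math.inf  # threshold is unreachable at float precision
    if p >= 1.0:
        return 1  # every sample is within the threshold
    # Solve 1 - (1 - p)**n >= confidence for n; log1p stays accurate for tiny p.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p))


if __name__ == "__main__":
    # Hypothetical fit parameters (in Angstroms), not taken from the thesis.
    mu, beta = 10.0, 1.5
    for threshold in (8.0, 7.0, 6.0):
        print(threshold, samples_needed(threshold, mu, beta))
```

Under these assumptions the required sample count grows very quickly as the threshold tightens, which is consistent with the need for large-scale sampling described in the abstract.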
Keywords/Search Tags: Protein, Structure, Energy, Sampling