
Parallel algorithms and distributed systems for computational biophysics

Posted on: 2008-08-27
Degree: Ph.D.
Type: Dissertation
University: University of Notre Dame
Candidate: Brenner, Paul R
Full Text: PDF
GTID: 1448390005957270
Subject: Biophysics
Abstract/Summary:
The understanding of atomic scale biomolecular function is a key component in the prevention and treatment of disease. Computational biophysics has proven essential in this regard, accelerating the development and analysis of new biomolecular theories. The effective contribution of biophysical simulation is limited by the computational complexity of the existing models. In this work, new computationally efficient parallel algorithms and distributed system frameworks are developed to extend the capability of biophysical simulation. In tandem with this development, I present the simulation and analysis of a target protein domain linked to cancer, Huntington disease, and Alzheimer disease.

The Replica Exchange Method is a popular biomolecular sampling algorithm that utilizes multiple simulations (replicas) to more rapidly overcome energy landscape boundaries and accelerate sampling. The method has limitations in scale related to the size of the biomolecular system and the required number of replicas. I introduce a novel all pairs exchange implementation of the algorithm that provides an asymptotic fourfold speedup of conformation traversal for replica counts of 8 and larger with typical exchange rates. Experimental tests with the blocked alanine dipeptide show a 100% sampling improvement according to potential energy averages and an ergodic measure. The cluster sampling rate for a target protein domain was nearly twice that of the single exchange near neighbor method. The method meets the detailed balance criterion for Monte Carlo methods and introduces no new parameterizations, biases, or heuristics.

The development of distributed systems for scientific computation is an active research field propelled by the growing number of research projects relying on computationally complex simulations as part of the discovery process. Many proposed frameworks have been successfully matched with unique applications to provide the computational capacity required. Only recently has more focus been directed toward the efficient management of the distributed data. I introduce a 'processing in network storage' distributed system framework that efficiently couples computation with data management over heterogeneous, autonomous, and distributed resources. The framework provides a fault tolerant, scalable, and bandwidth conserving approach through the utilization of existing grid software utilities and a new hybrid database/filesystem developed with our collaborators. The performance is evaluated during the generation of 500 biomolecular simulations producing over 1 million output files distributed over volunteer resources.

The correlation of atomic scale simulations with existing experimental techniques provides complementary data sets that cross validate and more thoroughly map biomolecular motion of interest. This correlation, however, is complicated by the often disjoint nature of the observables accessible from simulation and experiment. In this work, biophysical simulations of the isomerase PIN1 WW domain yield insight into promising reaction coordinates to help map simulation observed recognition loop motion to experimental nuclear magnetic resonance (NMR) results. Post-processing analysis methods and metrics including dihedral distributions, conformational clustering, hydrogen bond determination, and committor probability calculations indicate that the observed motion of the arginine 12 residue is coupled to the multivariate conformational changes of the recognition loop.
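
As background for the Replica Exchange Method summarized above, the following is a minimal sketch of the standard pairwise Metropolis exchange test between two replicas held at different temperatures. It is not the all pairs exchange implementation introduced in the dissertation, whose details the abstract does not give. The function names, the kcal/mol energy units, and the example energies and temperatures are all assumptions made for illustration.

```python
import math
import random

# Boltzmann constant in kcal/(mol*K); the choice of units is an assumption.
K_B = 0.0019872041


def exchange_probability(energy_i, energy_j, temp_i, temp_j):
    """Metropolis acceptance probability for swapping two replicas.

    Replica i has potential energy energy_i at temperature temp_i, and
    likewise for replica j. The standard criterion is
    min(1, exp((beta_i - beta_j) * (energy_i - energy_j))).
    """
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    if delta >= 0.0:
        return 1.0  # the swap is always accepted
    return math.exp(delta)


def attempt_exchange(energy_i, energy_j, temp_i, temp_j, rng=random):
    """Return True if the proposed swap is accepted."""
    p = exchange_probability(energy_i, energy_j, temp_i, temp_j)
    return rng.random() < p


if __name__ == "__main__":
    # Hypothetical energies (kcal/mol) and temperatures (K).
    print(exchange_probability(-10.0, -20.0, 300.0, 400.0))
    print(exchange_probability(-20.0, -10.0, 300.0, 400.0))
```

In the near neighbor scheme that the abstract uses as its baseline, a test of this kind is typically applied only to replicas at adjacent temperatures. A swap that moves the lower energy configuration to the colder replica is always accepted, while the reverse swap is accepted with a probability that decays exponentially.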
Keywords/Search Tags:Computational, Distributed, Biomolecular, System