
Studies of DNA and protein structure prediction from sequence: Data mining and molecular mechanics methods

Posted on: 2002-02-08  Degree: Ph.D.  Type: Thesis
University: Wesleyan University  Candidate: Liu, Yongxing  Full Text: PDF
GTID: 2468390011498176  Subject: Chemistry
Abstract/Summary:
My thesis focuses on the prediction of protein structure using molecular mechanics methods and the prediction of DNA structure using data mining and machine learning methods based on DNA bending and crystal structure data. The thesis is organized as three projects: (a) the improvement of the Generalized Born solvation model, (b) the use of the Multicopy Simulated Annealing (MCSA) method for protein structure prediction, and (c) the prediction of DNA curvature and structure from crystal structure and experimental bending data. In the first project, the Generalized Born model for calculating the solvation free energy was modified, based on a pair-wise approximation, and optimized to account for both the solvation energy and the pKa shifts of dicarboxylic acids. The modified model gives better estimates of both solvation free energy and pKa shifts. The improved solvation model is suitable for conformational search and molecular dynamics simulation without explicit consideration of water molecules. In the second project, the Amber united atom force field and the Generalized Born solvation model were combined into a prediction method based on MCSA. The calculations are carried out in torsional space to reduce the number of degrees of freedom in the conformational search, and the method is well suited to parallelization. Several test cases on peptides and small proteins are presented. By analyzing the trajectory using energy component analysis, I estimated the dominant terms in protein folding during different stages of the folding process. In the third project, sequence effects on DNA structure and axis bending are investigated, and an apparent conflict between DNA bending models and crystal structure data is resolved. This project combines data mining and machine learning to obtain a refined set of dinucleotide model parameters for DNA bending. The new model accounts well for several DNA bending test cases and provides a way to unify crystal structure data and experimental DNA bending data. A DNA structure construct can be built based on this bending model. The results of this thesis contribute to both methodology and application in protein structure prediction and DNA bending, two areas expected to be of considerable future importance in structural molecular biology and bioinformatics.
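
For readers unfamiliar with the Generalized Born approach mentioned in the first project, the following is a minimal sketch of the widely used pair-wise form of the Generalized Born polarization energy (the form introduced by Still and co-workers). It is not the modified model developed in the thesis, whose functional form and parameters are not given in this abstract. The function name `gb_polarization_energy`, the dielectric defaults, and the rounded Coulomb constant are illustrative assumptions.

```python
import math

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); rounded, for illustration only.
COULOMB_KCAL = 332.06


def gb_polarization_energy(charges, coords, born_radii, eps_in=1.0, eps_out=80.0):
    """Pair-wise Generalized Born polarization energy (standard Still-type form).

    charges    : partial charges in units of e
    coords     : list of (x, y, z) positions in Angstrom
    born_radii : effective Born radii in Angstrom (assumed to be given)
    Returns the energy in kcal/mol. Hypothetical helper, not the thesis code.
    """
    prefactor = -0.5 * COULOMB_KCAL * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):  # the i == j terms give the Born self-energies
            r2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            rr = born_radii[i] * born_radii[j]
            # Smooth interpolation between the Born (r -> 0) and Coulomb (r -> inf) limits
            f_gb = math.sqrt(r2 + rr * math.exp(-r2 / (4.0 * rr)))
            total += charges[i] * charges[j] / f_gb
    return prefactor * total


# A single ion reduces to the Born formula; an ion pair is less well solvated when close.
single = gb_polarization_energy([1.0], [(0.0, 0.0, 0.0)], [2.0])
near = gb_polarization_energy([1.0, -1.0], [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)], [2.0, 2.0])
far = gb_polarization_energy([1.0, -1.0], [(0.0, 0.0, 0.0), (1000.0, 0.0, 0.0)], [2.0, 2.0])
```

Because the energy depends only on charges, interatomic distances, and effective Born radii, it can be evaluated without explicit water molecules, which is what makes this class of model attractive for conformational search.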
Keywords/Search Tags: Structure, DNA bending, Molecular mechanics methods, Prediction, Data mining, Generalized Born solvation model, Multicopy simulated annealing, Solvation free energy