New hybrid of empirical and knowledge-based scoring functions using novel geometrical descriptors, molecular surface descriptors and machine-learning methods

Posted on: 2005-03-17 | Degree: Ph.D | Type: Thesis
University: Rensselaer Polytechnic Institute | Candidate: Deng, Wei | Full Text: PDF
GTID: 2458390008998886 | Subject: Chemistry
Abstract/Summary:
Understanding protein-ligand interactions is an important and challenging problem in rational drug design. Protein-ligand docking and scoring are essential techniques for studying the functions of macromolecular targets and small compounds. Among scoring functions, knowledge-based scoring functions are the most recent and most promising approach.

The method described in this thesis is a hybrid of empirical and knowledge-based scoring functions. In contrast to the pair potentials of traditional knowledge-based scoring functions, it applies novel geometrical and molecular surface property descriptors. Like empirical scoring functions, it relies on QSAR modeling methods. Unlike empirical scoring functions, however, it conceptually accounts for all the physical effects in protein-ligand interactions.

Three types of descriptors are applied in this approach: atom pair descriptors, TAE/RECON surface property descriptors, and tessellated tetrahedron descriptors. Atom pair descriptors capture the fact that the strength of ligand binding is correlated with the nature of protein-ligand atom pairs in a distance-dependent manner. TAE/RECON descriptors characterize surface electronic properties and identify correlations and complementarities between the ligand and the protein binding site. Tessellated tetrahedron descriptors capture the geometrical and molecular properties of the three-dimensional binding-site space and are used to analyze how this information correlates with protein-ligand binding energies. All three types of descriptors gave reliable scoring and pattern recognition results using the bootstrapping mode of Kernel PLS (Partial Least Squares) modeling. Other machine-learning techniques, such as sensitivity-analysis feature selection and Y-scrambling, are also used in the study. The computational results and possible future enhancements are discussed at the end of the thesis.
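
To make the idea of distance-dependent atom pair descriptors concrete, the following is a minimal sketch and not the thesis's implementation. It counts protein-ligand atom pairs per combination of protein atom type, ligand atom type, and distance bin. The function name `atom_pair_descriptors`, the bin edges, and the plain element symbols used as atom types are all assumptions chosen for illustration; the abstract does not specify the actual atom typing or binning scheme.

```python
from collections import Counter
from math import dist

# Hypothetical distance bin edges in angstroms (an assumption, not from the thesis).
BIN_EDGES = (0.0, 2.0, 4.0, 6.0, 8.0)


def atom_pair_descriptors(protein_atoms, ligand_atoms, bin_edges=BIN_EDGES):
    """Count protein-ligand atom pairs per (protein type, ligand type, bin index).

    Each atom is a tuple (atom_type, (x, y, z)). A pair falls into bin i when
    bin_edges[i] <= distance < bin_edges[i + 1]. Pairs at or beyond the last
    edge are ignored.
    """
    counts = Counter()
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            d = dist(p_xyz, l_xyz)  # Euclidean distance between the two atoms
            for i in range(len(bin_edges) - 1):
                if bin_edges[i] <= d < bin_edges[i + 1]:
                    counts[(p_type, l_type, i)] += 1
                    break
    return counts
```

Each key of the resulting counter can serve as one column of a descriptor vector for a protein-ligand complex, which a regression model could then relate to measured binding energies.
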
Keywords/Search Tags:Scoring, Descriptors, Surface, Empirical, Molecular, Method, Geometrical, Protein-ligand