Novel computational techniques for combinatorial library design, database mining and QSAR analysis

Posted on: 1998-06-18
Degree: Ph.D.
Type: Thesis
University: The University of North Carolina at Chapel Hill
Candidate: Zheng, Weifan
Full Text: PDF
GTID: 2468390014475494
Subject: Chemistry
Abstract/Summary:
A suite of computational algorithms has been developed for the rational design of combinatorial libraries. It includes complementary modules for database mining, targeted and diverse library design, and Quantitative Structure-Activity Relationship (QSAR) analysis. Thus, this suite affords a comprehensive approach to lead identification and optimization.

In order to improve the efficiency of lead identification or generation, a novel method for the efficient diversity sampling of actual or virtual chemical libraries has been developed. The diversity of a subset of compounds is measured by a special function, and the most diverse subset is obtained using simulated annealing as the optimization tool. Application of this method to simulated datasets showed that the optimal subset of compounds (1) was representative and (2) was characterized by higher hit rates for "active compounds" than random sampling.

For lead evolution using targeted chemical libraries, a novel computational approach has been developed for rational library design. Virtual library compounds are represented by Kier-Hall topological descriptors. Molecular similarities are evaluated quantitatively by modified Euclidean distance metrics in multidimensional descriptor space. Virtual library compounds most similar to the lead molecule(s) are identified by means of a stochastic search of the library structural space using a simulated annealing protocol. Frequency analysis of the building block composition of the selected virtual compounds identifies building blocks that can be used in combinatorial synthesis of the targeted libraries. Application of this method to a peptoid library correctly identified building blocks found in known active peptoids with opioid activities.

For lead optimization, a new nonlinear QSAR technique has been developed that can be applied even to large databases of biologically active compounds. This method is based upon the k-nearest neighbor principle of pattern recognition and uses a stochastic optimization technique for variable selection. Application of this method to several experimental datasets yielded robust models in most cases. This method also provides an efficient way to enhance database searching: the hit rates for active compounds obtained using this method are consistently higher than those obtained by random sampling.
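
The diversity sampling step described in the abstract can be illustrated with a minimal sketch. The abstract does not define its "special function" for diversity, so the score used here (mean nearest-neighbor distance within the subset) is an assumption, as are the function names `diversity` and `anneal_diverse_subset`, the swap move, and the geometric cooling schedule. It is not the thesis implementation.

```python
import math
import random


def diversity(points, subset):
    """Assumed diversity score: mean distance from each selected point to its
    nearest selected neighbour (larger means the subset is more spread out)."""
    total = 0.0
    for i in subset:
        total += min(math.dist(points[i], points[j]) for j in subset if j != i)
    return total / len(subset)


def anneal_diverse_subset(points, k, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Pick k indices (2 <= k < len(points)) that maximise diversity()
    using simulated annealing with single-compound swap moves."""
    rng = random.Random(seed)
    current = rng.sample(range(len(points)), k)
    current_score = diversity(points, current)
    best, best_score = list(current), current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose swapping one selected compound for an unselected one.
        selected = set(current)
        outside = [i for i in range(len(points)) if i not in selected]
        candidate = list(current)
        candidate[rng.randrange(k)] = rng.choice(outside)
        cand_score = diversity(points, candidate)
        delta = cand_score - current_score
        # Metropolis criterion: always accept improvements, and accept a
        # worse subset with probability exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return sorted(best), best_score
```

Here `points` stands for a list of descriptor vectors, one per compound. A call such as `anneal_diverse_subset(points, 5)` returns the indices of the selected compounds together with their diversity score.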
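
The targeted design idea (rank virtual compounds by descriptor-space distance to a lead, then count building blocks among the closest ones) can be sketched on a toy library. Everything specific here is hypothetical: the block names, the two-dimensional descriptor values, and the additive `describe` function stand in for Kier-Hall descriptors, and plain Euclidean distance stands in for the modified metric. The sketch also substitutes exhaustive ranking for the simulated annealing search, which is feasible only because the toy library has nine compounds.

```python
import itertools
import math
from collections import Counter

# Hypothetical per-building-block descriptor vectors (illustrative numbers only).
BLOCKS = {
    "A1": (0.1, 0.9), "A2": (0.8, 0.2), "A3": (0.5, 0.5),
    "B1": (0.2, 0.7), "B2": (0.9, 0.1), "B3": (0.4, 0.6),
}
# Building blocks allowed at each position of the combinatorial scaffold.
POSITIONS = [("A1", "A2", "A3"), ("B1", "B2", "B3")]


def describe(compound):
    """Toy descriptor: element-wise sum of the block vectors (an assumption)."""
    return tuple(sum(vals) for vals in zip(*(BLOCKS[b] for b in compound)))


def block_frequencies(lead, top_n):
    """Rank every virtual compound by Euclidean distance to the lead and
    count how often each building block occurs among the top_n closest."""
    library = list(itertools.product(*POSITIONS))
    target = describe(lead)
    ranked = sorted(library, key=lambda c: math.dist(describe(c), target))
    hits = ranked[:top_n]
    counts = Counter(block for compound in hits for block in compound)
    return hits, counts


hits, counts = block_frequencies(lead=("A1", "B1"), top_n=3)
print(hits)
print(counts.most_common())
```

Building blocks with the highest counts would be the candidates for synthesizing the targeted library.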
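
A k-nearest neighbor QSAR with stochastic variable selection can be sketched as follows. The abstract names neither the optimizer nor the objective, so this example assumes simulated annealing over descriptor subsets and a leave-one-out cross-validated q² as the score. The names `loo_q2` and `select_descriptors`, the unweighted neighbor average, and all parameter defaults are illustrative assumptions.

```python
import math
import random


def loo_q2(X, y, features, k=3):
    """Leave-one-out cross-validated q^2 of a kNN regressor that uses only
    the descriptor columns listed in `features`."""
    n = len(X)
    y_mean = sum(y) / n
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        # Distances from compound i to every other compound in the chosen subspace.
        dists = sorted(
            (math.dist([X[i][f] for f in features], [X[j][f] for f in features]), j)
            for j in range(n) if j != i
        )
        # Predict activity as the plain average over the k nearest neighbours.
        pred = sum(y[j] for _, j in dists[:k]) / k
        press += (y[i] - pred) ** 2
    ss_total = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss_total


def select_descriptors(X, y, n_select, k=3, steps=300, t_start=0.5, t_end=0.01, seed=0):
    """Search for n_select descriptor columns (n_select < number of columns)
    that maximise loo_q2, using simulated annealing as the assumed optimizer."""
    rng = random.Random(seed)
    n_desc = len(X[0])
    current = rng.sample(range(n_desc), n_select)
    current_q2 = loo_q2(X, y, current, k)
    best, best_q2 = list(current), current_q2
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Propose replacing one selected descriptor with an unused one.
        unused = [f for f in range(n_desc) if f not in current]
        candidate = list(current)
        candidate[rng.randrange(n_select)] = rng.choice(unused)
        cand_q2 = loo_q2(X, y, candidate, k)
        delta = cand_q2 - current_q2
        # Metropolis criterion on the change in q^2.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_q2 = candidate, cand_q2
            if current_q2 > best_q2:
                best, best_q2 = list(current), current_q2
    return sorted(best), best_q2
```

Here `X` is a list of descriptor rows and `y` the measured activities. Each evaluation of `loo_q2` costs on the order of n² distance computations, so a sketch like this suits small datasets only.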
Keywords/Search Tags:Library design, Computational, QSAR, Combinatorial, Database, Compounds, Method, Novel