Prediction of chemical properties and biological activities of organic compounds from molecular structure and use of pattern recognition techniques for the analysis of data from an optical sensor array

Posted on: 2002-05-18
Degree: Ph.D.
Type: Thesis
University: The Pennsylvania State University
Candidate: Bakken, Gregory A
Full Text: PDF
GTID: 2468390011496658
Subject: Chemistry
Abstract/Summary:
Two areas of computational chemistry are described in this thesis. The first part involves the development of quantitative structure-activity relationships (QSARs), which seek relationships between the structure of a compound and a physical property or biological activity of interest. The second part covers methods for analyzing data from an optical sensor array. Such arrays, together with appropriate pattern recognition techniques, can be used to identify gas-phase analytes.

Methodology for QSAR formation is presented, along with discussion of specific applications. The first QSAR application involves the prediction of radical reaction rate constants. For methyl radical rate constants, a six-descriptor computational neural network (CNN) is developed that gives a root-mean-square error (RMSE) of 0.496 log units for a prediction set. For hydroxyl radical rate constants, a ten-descriptor CNN is generated that gives an RMSE of 0.254 log units for the prediction set. The second QSAR application involves prediction of binding affinities for inhibitors of type 1 5α-reductase. A ten-descriptor CNN is developed that gives an RMSE of 0.293 log units for compounds in the prediction set. In related work, linear discriminant analysis is used to generate models that classify 609 multidrug resistance reversal agents by activity. A model with six topological descriptors is developed that gives 92.0% correct classification for the prediction set.

The second part of this thesis presents work on the analysis of sensor array data. An overview of the technology is provided, along with discussion of an application in which an array of four bead types is used to collect responses for vapors containing nitroaromatic compounds (NACs) and for vapors without NACs. Models developed using k-nearest neighbors analysis correctly label all samples in an external prediction set according to the presence or absence of an NAC vapor in the sample. Additionally, models are generated with one NAC vapor left out of the training data. All such models provide prediction accuracy greater than 90%, indicating that, for the vapors investigated, the array is able to recognize the nitro functional group.
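
The k-nearest neighbors evaluation with one vapor withheld from training can be illustrated with a minimal sketch. This is not the thesis code: the four-value response vectors (one value per bead type), the analyte names `nac_a`, `nac_b`, `blank_a`, `blank_b`, the Euclidean distance, and k = 3 are all assumptions chosen for illustration, and the numbers are synthetic.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points (Euclidean distance)."""
    dists = sorted((math.dist(row, x), label) for row, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def leave_one_analyte_out(X, y, analytes, held_out, k=3):
    """Train without one analyte, then return the accuracy on that analyte's samples."""
    train = [(r, lab) for r, lab, a in zip(X, y, analytes) if a != held_out]
    test = [(r, lab) for r, lab, a in zip(X, y, analytes) if a == held_out]
    train_X = [r for r, _ in train]
    train_y = [lab for _, lab in train]
    correct = sum(knn_predict(train_X, train_y, r, k) == lab for r, lab in test)
    return correct / len(test)


# Synthetic responses: one feature per bead type (hypothetical values, not thesis data).
X = [
    [0.9, 0.8, 0.7, 0.9], [0.8, 0.9, 0.8, 0.8],   # nac_a
    [0.7, 0.9, 0.9, 0.8], [0.9, 0.7, 0.8, 0.9],   # nac_b
    [0.1, 0.2, 0.1, 0.2], [0.2, 0.1, 0.2, 0.1],   # blank_a
    [0.2, 0.2, 0.1, 0.1], [0.1, 0.1, 0.2, 0.2],   # blank_b
]
y = ["NAC", "NAC", "NAC", "NAC", "no_NAC", "no_NAC", "no_NAC", "no_NAC"]
analytes = ["nac_a", "nac_a", "nac_b", "nac_b",
            "blank_a", "blank_a", "blank_b", "blank_b"]

# Accuracy on nac_b samples when nac_b never appears in the training data.
accuracy = leave_one_analyte_out(X, y, analytes, held_out="nac_b", k=3)
```

If the withheld vapor is still labeled correctly, the classifier is responding to something its class shares rather than to one specific compound, which is the reasoning behind the nitro-group conclusion above.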
Keywords/Search Tags:Prediction, Array, Data, QSAR, Log, Compounds, Sensor