
A new model for molecular representation and classification: Formal approach based on the ETS framework

Posted on: 2004-08-17
Degree: Ph.D
Type: Dissertation
University: University of New Brunswick (Canada)
Candidate: Korkin, Dmitry
Full Text: PDF
GTID: 1468390011966704
Subject: Computer Science
Abstract/Summary:
The informatics-driven approach has become one of the major approaches in modern science. This trend is especially evident in the life sciences, where molecular informatics, cheminformatics, and bioinformatics have developed rapidly over the last decade. Among the key problems in molecular informatics are the representation of molecular objects (organic compounds, drugs, proteins, DNA, etc.), the representation of classes of such objects, the classification of existing objects, and the prediction of new molecular objects belonging to a specific molecular class. There is therefore a great demand for a model of molecular representation and classification. What would be the key features of such a model? First, the structural nature of molecular objects calls for a representation that preserves their structural features. Second, to classify and predict new molecular objects from existing data, the model must incorporate an inductive approach. Finally, the automation of scientific discovery in all scientific areas, including molecular informatics, requires a formal model, since computer systems still cannot be taught the implicit understanding and interpretation of objects that we hold in our minds. Unfortunately, none of the existing formal models has all of these key features.

In this work, we outline a new formal model for molecular representation and classification, based on the evolving transformations system (ETS) framework. This tentative model, called the ChemETS model, is developed primarily for small molecules such as organic compounds. As part of this model, we introduce a structural representation of molecular objects and their classes that captures some advanced molecular features such as molecular shape, define class typicality and similarity measures for molecular objects, and formulate the central problem of inductive learning for molecular classes along with related problems. We apply the ChemETS model to computer-aided drug design (CADD), and to the therapeutic class of androgens in particular. As a result, using the ChemETS model, we are able to (1) reconstruct a class representation of the androgens; (2) correctly classify existing compounds as either exhibiting androgenic properties or not, based on the obtained class representation; and (3) predict new androgen-like compounds that are very likely to be drug candidates. The results show the advantages of the new model and suggest future areas of application.
Keywords/Search Tags: Model, Molecular, New, Representation, Approach, Formal