
Chemical Compound Classification Research Based On Graph Kernels

Posted on: 2016-02-13 | Degree: Master | Type: Thesis
Country: China | Candidate: C Zhai | Full Text: PDF
GTID: 2308330503950607 | Subject: Computer Science and Technology
Abstract/Summary:
Drug discovery, design, and development are costly and challenging. Pharmaceutical chemists want more than to identify a particular compound with the desired efficacy; they also want to know which part of the compound produces that efficacy, so that new drugs can be synthesized rationally. Computational techniques that build models to assign chemical compounds correctly to classes can help overcome these difficulties, and they have therefore become popular and an active area of research.

A kernel function can make a non-linear classification problem linearly separable in the induced feature space. This work builds on a study of a large number of compound similarity calculation methods, and the conclusions of previous experiments gave it a clear direction. The main research contributions are:

(1) We present a method for generating descriptors based on connection, together with its detailed implementation steps, and take advantage of the information stored while the descriptors are generated. We improve on a state-of-the-art method by assigning different weights to descriptors of different sizes when computing the similarity of compounds, and this showed the best classification performance.

(2) Influenced by the random walk kernel and the optimal assignment kernel, we add neighborhood topology information to each atom and propose a substructure matching kernel. Because the activity of a compound is strongly related to its substructures, we generate pairs of substructures in each graph by controlling the distance between center atoms and the atom radius, compare graphs on the basis of these substructure pairs, and propose a substructure pairs kernel. Both custom kernels take into account more information relevant to compound classification.

In our experimental evaluation, our methods match or outperform existing methods in accuracy on several graph classification benchmark datasets.

Our methods offer another perspective on classifying compounds by activity, and may serve as a reference for similar classification problems.
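The idea in contribution (1), weighting descriptors by size when computing compound similarity, can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not specify how connection-based descriptors are built, so the sketch assumes, purely for illustration, that a descriptor is the atom-label sequence of a simple path, and the function names (`path_descriptors`, `weighted_kernel`, `normalized_kernel`) and the weight values are hypothetical.

```python
from collections import Counter
from math import sqrt


def path_descriptors(atoms, bonds, max_size):
    """Count label sequences of simple paths containing up to max_size atoms.

    Assumption: a "descriptor" is a path of atom labels; the thesis's own
    connection-based descriptors may be defined differently.
    """
    adj = {i: [] for i in range(len(atoms))}
    for u, v in bonds:
        adj[u].append(v)
        adj[v].append(u)

    counts = Counter()

    def extend(path):
        # Record the current path, then grow it by one unvisited neighbor.
        counts[tuple(atoms[i] for i in path)] += 1
        if len(path) < max_size:
            for nxt in adj[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return counts


def weighted_kernel(c1, c2, weights):
    """Sum of count products over shared descriptors, weighted by size."""
    return sum(weights[len(d)] * n * c2[d] for d, n in c1.items() if d in c2)


def normalized_kernel(c1, c2, weights):
    """Scale the kernel so that every compound has self-similarity 1."""
    k12 = weighted_kernel(c1, c2, weights)
    return k12 / sqrt(weighted_kernel(c1, c1, weights) *
                      weighted_kernel(c2, c2, weights))


# Heavy-atom graphs of ethanol (C-C-O) and methanol (C-O).
ethanol = path_descriptors(["C", "C", "O"], [(0, 1), (1, 2)], max_size=3)
methanol = path_descriptors(["C", "O"], [(0, 1)], max_size=3)

# Hypothetical weights: larger descriptors count more.
weights = {1: 1.0, 2: 2.0, 3: 3.0}

print(weighted_kernel(ethanol, methanol, weights))    # 7.0
print(normalized_kernel(ethanol, methanol, weights))
```

With positive weights this is a weighted linear kernel on descriptor counts, so the resulting similarity matrix is positive semidefinite and could be passed to an SVM that accepts precomputed kernels. How the weights should be chosen is not stated in the abstract.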
Keywords/Search Tags: Chemical compound database, Classification algorithm, Graph kernels, Support vector machine (SVM)