Prediction Of Organic Chemical Reaction Based On Gated Graph Convolutional Neural Network

Posted on: 2022-01-02  Degree: Master  Type: Thesis
Country: China  Candidate: Z C Lai  Full Text: PDF
GTID: 2518306476998709  Subject: Computer technology
Abstract/Summary:
Traditional drug research and development suffers from high investment, long development cycles, and low success rates. According to data published in the Journal of the American Medical Association, developing a new drug costs about 2.8 billion U.S. dollars and takes about 14 years on average. Drug synthesis route design is essentially a molecular retrosynthesis problem: how to design a route that synthesizes the target molecule from common molecules in a library of chemical raw materials. At present, the main approach is to search for routes by combining a single-step retrosynthesis prediction model with Monte Carlo tree search. Current route design models still face many difficulties, and the routes they recommend may fail to synthesize the target molecule.

The main bottlenecks of drug synthesis route design are: (1) The search space of chemical molecules is huge, so single-step retrosynthesis reactions are difficult to predict and model accuracy is low. (2) The number of chemical reactions available for training is insufficient, and the uneven distribution of reactions across reaction types can easily bias model training. (3) Route search is inefficient and takes too long. To address these problems, this thesis designs a single-step forward reaction prediction model to assist in generating synthesis routes for target molecules. The main research contents are as follows:

(1) To address the uneven distribution of reactions across reaction types, this thesis designs an active sampling training method: after a complete pass over the training set and before the next complete pass begins, samples that have higher loss values and belong to less frequent reaction types are collected and trained on first. Increasing the number of training iterations for under-represented reaction types alleviates the bias caused by the uneven distribution.

(2) To predict organic chemical reaction products efficiently and accurately, this thesis designs the Active Sampling-training Gated Graph Convolutional Neural-network (ASGGCN), a gated graph convolutional neural network trained with active sampling. The model takes the SMILES strings of the reactants as input, predicts the location of the reaction center with a gated graph convolutional network and an attention mechanism, enumerates possible chemical bond combinations to generate candidate products that satisfy chemical constraints, and then screens the candidates with a gated graph convolutional difference network to obtain the final reaction product. The gated graph convolutional neural network has three weight parameter matrices and integrates information through gating. Compared with a traditional graph convolutional neural network, it can capture richer hidden atomic feature information. Experimental results show that ASGGCN reaches a Top-1 accuracy of 87.2% for reaction product prediction, which is 1.6% higher than the WLDN [14] model and 6.9% higher than the Seq2Seq [26] model. The model can therefore predict organic chemical reactions more accurately.

(3) To address both the low accuracy of single-step retrosynthesis models and the low efficiency of synthesis route search, this thesis uses the single-step forward reaction prediction model to assist in generating the synthesis route for the target molecule. Because the forward prediction problem considers only the reactant molecules, whose number is limited, its accuracy is higher than that of single-step retrosynthesis prediction. The forward model is used to verify the results of the single-step retrosynthesis prediction model. Pruning the branches with incorrect results improves chemical feasibility and narrows the search range of the synthesis path, which improves search efficiency.
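
The active sampling step described in item (1) of the abstract can be illustrated with a small sketch. This is not the thesis implementation: the abstract only states that samples with higher loss and rarer reaction types are trained first between full passes over the training set. The scoring rule used here (per-sample loss divided by the relative frequency of its reaction type), the function name `select_active_samples`, and the `top_fraction` parameter are all assumptions made for the example.

```python
from collections import Counter


def select_active_samples(losses, reaction_types, top_fraction=0.2):
    """Return indices of samples to revisit before the next full epoch.

    Hypothetical scoring rule (an assumption, not taken from the thesis):
    score = loss / relative frequency of the sample's reaction type,
    so high-loss samples from rare reaction types rank first.
    """
    if len(losses) != len(reaction_types):
        raise ValueError("losses and reaction_types must have the same length")
    n = len(losses)
    if n == 0:
        return []
    counts = Counter(reaction_types)
    # Dividing by counts[rtype] / n is the same as multiplying by n / counts[rtype].
    scores = [loss * n / counts[rtype]
              for loss, rtype in zip(losses, reaction_types)]
    k = max(1, int(n * top_fraction))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Toy data: per-sample losses from the last epoch and reaction-type labels.
losses = [0.9, 0.2, 0.8, 0.1, 0.3, 0.7]
reaction_types = ["A", "A", "A", "A", "B", "C"]
priority = select_active_samples(losses, reaction_types, top_fraction=0.5)
print(priority)  # [5, 4, 0]
```

In this toy data the two samples from the rare types B and C are selected ahead of the high-loss sample from the common type A. In a training loop, the selected indices would be trained on first, after which the next full pass over the training set proceeds as usual.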
Keywords/Search Tags: Active Sampling, Gated graph convolution network, organic chemical reaction, Molecular retrosynthesis