| Designing and launching a new drug requires an average of 13 years of R&D and multi-billion-dollar R&D costs. Generating libraries of drug-like candidate molecules is a key step in drug synthesis, so generating such libraries with deep learning is of significant value. This article focuses on the research and optimization of deep-learning-based generation models for drug-like molecules. Two models were used in the experiments to study molecular generation and optimization. The first experiment uses a Dilated CNN as the decoder network for molecular generation within a variational autoencoder architecture, with indicators such as Valid and Unique as the evaluation criteria of the model. However, the goal of the pharmaceutical industry is to generate molecules with specific biological activity, so Experiment 1 also uses transfer learning to optimize the generation model. In the transfer learning experiment, the commonly used S. aureus and Plasmodium falciparum were taken as the research targets, and EOR was used as the evaluation index of the retrained models, verifying for each target whether molecules present in the test set can be reproduced in the generated set. Experiment 2 uses the improved variational autoencoder from Experiment 1 as the generation model and a random forest as the prediction model, and uses the policy gradient descent algorithm to integrate the two into one system. The system treats the generation model as the agent and the discriminant model as the environment, forming an optimization algorithm based on the Markov decision process framework. The system uses JAK2 as the target for both maximization and minimization. The improved variational autoencoder proposed in this paper reaches a Valid score of 0.981 on generated molecules, exceeding models based on the variational autoencoder and on the recurrent neural network. The optimization algorithm based on policy gradient descent uses a new reward function, which alleviates the rapid decline in the validity index of the generation model after model updates. |
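
As an illustration of the Valid and Unique indicators mentioned above, the following minimal sketch computes both over a list of generated SMILES strings. It assumes the common definitions (Valid: fraction of generated strings that pass a validity check; Unique: fraction of valid strings that are distinct), which may differ in detail from those used in the paper. The names `toy_is_valid` and `valid_and_unique` are hypothetical, and the validity check is only a placeholder that tests for balanced parentheses; a real evaluation would parse each string with a cheminformatics toolkit and compare canonical forms.

```python
from typing import Callable, Iterable, Tuple


def toy_is_valid(smiles: str) -> bool:
    """Placeholder validity check (assumption, NOT real chemistry).

    Accepts any non-empty string whose parentheses are balanced.
    A real pipeline would instead try to parse the string with a
    cheminformatics toolkit and reject strings that fail to parse.
    """
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # closing bracket with no matching opener
                return False
    return bool(smiles) and depth == 0


def valid_and_unique(
    generated: Iterable[str],
    is_valid: Callable[[str], bool] = toy_is_valid,
) -> Tuple[float, float]:
    """Return (Valid, Unique) for a collection of generated strings.

    Valid  = number of valid strings / number of generated strings
    Unique = number of distinct valid strings / number of valid strings
    Uniqueness is judged on the raw strings here (an assumption);
    canonicalising the molecules first would be stricter.
    """
    samples = list(generated)
    valid = [s for s in samples if is_valid(s)]
    valid_score = len(valid) / len(samples) if samples else 0.0
    unique_score = len(set(valid)) / len(valid) if valid else 0.0
    return valid_score, unique_score


if __name__ == "__main__":
    # Five generated strings: three pass the toy check, two of those are distinct.
    samples = ["CCO", "CCO", "C(C)O", "C(C", ""]
    print(valid_and_unique(samples))
```

Passing the validity check in as a parameter keeps the metric logic separate from the chemistry, so the placeholder can be swapped for a real parser without changing how the two scores are computed.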