| Drug development is a complex, systematic process characterized by high cost, long timelines, and high risk. The discovery and optimization of lead compounds are vital steps in new drug development and pose significant challenges. As an important method for lead compound discovery and optimization, de novo drug design uses molecular generation methods to optimize compound structures toward predetermined properties. Because it does not rely on known compound libraries, it is expected to explore drug-like chemical space more comprehensively. In recent years, artificial intelligence (AI) technologies, represented by deep learning (DL), have been introduced into de novo drug design because of their powerful information mining and non-linear fitting capabilities, providing significant opportunities for new drug discovery. Although these advanced de novo drug design methods have become a hotspot in AI-related interdisciplinary research, their practical application in drug development remains limited. From the algorithmic perspective, most current methods employ advanced AI technologies that are not well adapted to drug design tasks and extrapolate poorly across targets, making it difficult to meet the requirements of multi-property optimization and target specificity for drug candidates. From the user’s perspective, most current methods lack usability and the support of visualization software. In addition, the predictive performance of these new methods is unknown to users, and objective, systematic evaluations are lacking. In response to these problems, this thesis focuses on methodological research on AI-based de novo drug design. First, the predictive performance of various de novo drug design methods based on deep learning frameworks is systematically evaluated. Subsequently, a highly adaptable, target-specific algorithm for de novo drug design based on deep learning is developed. Finally, a user-friendly, customizable
modular de novo drug design software platform is built. The main research contents are as follows: (1) In the first part, the performance of DL-based de novo drug design models built on four DL frameworks (VAE, GAN, RNN, and RL) was comprehensively evaluated. First, the overall quality of the molecular sets generated by the models was analyzed, focusing on whether each model can generate enough valid, novel, and diverse molecules. Second, the performance of the molecular generation models was evaluated on different tasks, including goal-oriented tasks (drug rediscovery, scaffold optimization, and scaffold hopping) and objective-specified tasks (generating novel compounds with given properties). Finally, CDK2 and p38, two important targets in cancer treatment, were selected as practical application scenarios in which the performance of each model was systematically evaluated. (2) In the second part, RELATION, a target-specific model for de novo drug design, was proposed. Unlike other DL methods, RELATION takes the 3D grid conformations of ligand-receptor complexes, annotated with atomic physicochemical properties, as input, and employs DSN to conduct bidirectional transfer learning that facilitates the exchange of information between ligands and ligand-protein complexes. To improve the effectiveness of RELATION for de novo design, pharmacophore constraints and docking-score-based sampling were incorporated through conditional generation. Finally, the overall performance of RELATION in de novo drug design was investigated for two targets (AKT1 and CDK2). (3) In the third part, GARel, a novel drug design framework that combines genetic algorithms with deep learning, was proposed. In GARel, the mutation and crossover operators of genetic algorithms serve as the expert policy that consistently guides the bidirectional transfer learning (TL) generator to generate target-specific molecules. The effectiveness of the model in generating active compounds was verified for
three targets, namely AA2AR, EGFR, and SARS-CoV-2 Mpro. The validation examined how much the introduction of genetic algorithms improved the deep learning framework, as well as the balance between the novelty and drug-likeness of the generated molecules. (4) In the fourth part, ReMODE, a DL-based de novo drug design platform, was developed. ReMODE supports the generation of inhibitors for 23 commonly used kinase targets; users can select a target to create a target-specific generation task and quickly generate molecules. In addition, several customized generation methods were developed to meet users’ drug design requirements: users can create multi-objective constraint tasks in the “Properties” module and fragment-based design tasks in the “Structure features” module; they can also use the “Pharmacophore features” and “Bayesian optimization” modules to generate molecules with favorable pharmacophore matching scores and docked conformations. These capabilities are expected to improve the efficiency of discovering and optimizing high-quality lead compounds in the era of AI-based drug discovery. |
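
The generation-quality criteria named in part (1), namely whether a model produces enough valid, novel, and diverse molecules, are commonly summarized as simple ratios over the generated set. The following is a minimal sketch under stated assumptions, not the evaluation code used in this thesis: the function names `generation_metrics`, `tanimoto`, `internal_diversity`, and `balanced_parentheses` are hypothetical, fingerprints are assumed to be given as sets of on-bit indices, and the toy validity check only stands in for a real chemistry toolkit parser.

```python
from itertools import combinations
from typing import Callable, Iterable, Sequence, Set


def generation_metrics(generated: Sequence[str],
                       training: Iterable[str],
                       is_valid: Callable[[str], bool]) -> dict:
    """Return validity, uniqueness and novelty as fractions in [0, 1]."""
    valid = [s for s in generated if is_valid(s)]
    unique = set(valid)                      # distinct valid molecules
    novel = unique - set(training)           # not seen in the training set
    return {
        "validity": len(valid) / len(generated) if generated else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def internal_diversity(fingerprints: Sequence[Set[int]]) -> float:
    """One minus the mean pairwise Tanimoto similarity of the set."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def balanced_parentheses(smiles: str) -> bool:
    """Toy validity check (assumption): non-empty with balanced parentheses.

    A real evaluation would parse the string with a chemistry toolkit instead.
    """
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0 and bool(smiles)


if __name__ == "__main__":
    generated = ["CCO", "CCO", "CC(=O)O", "C(C", "c1ccccc1"]
    training = ["CCO"]
    print(generation_metrics(generated, training, balanced_parentheses))
    print(internal_diversity([{1, 2, 3}, {2, 3, 4}, {7, 8}]))
```

Passing the validity check as a callback keeps the ratios independent of any particular toolkit, so the placeholder can be swapped for a real parser without changing the metric definitions.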