| The outbreak of COVID-19 has had a major impact on people’s work and daily life. Developing relevant drugs and predicting the course of the epidemic are particularly important for controlling it, which involves two aspects: drug design and the study of the infection mechanism of infectious diseases. The activity of a drug is closely related to the properties of its molecules. An effective scoring function can enhance the understanding of disease pathogenesis, and understanding the internal mechanism of infectious diseases can help drugs be developed effectively. This paper reviews recent progress on scoring functions for evaluating the properties of drug molecules in protein-ligand interactions and, by combining topology with machine learning algorithms, analyzes feature extraction for drug molecules and the prediction of the controllability of infectious diseases. First, this paper reviews recent progress on protein-ligand scoring functions based on physical mechanisms, empirical approaches, statistical potentials, and descriptors. Second, this paper uses the three-dimensional spatial distribution of the atoms of a drug molecule to construct a series of simplicial complexes, takes the lifetimes of certain topological invariants as topological features of the molecular structure, extracts structural features of molecules with a pretrained model, and performs binary classification of drug molecular activity with the gradient boosting decision tree algorithm. Numerical experiments show that the classification accuracy obtained with topological and pretrained features is higher than that of other algorithms reported in the literature. Finally, this paper constructs an epidemic model to study the prediction of the controllability of infectious diseases. Four different types of networks are used to simulate the population structure, SIRD dynamics are simulated on the networks, and structural features of the networks are computed and used to predict the basic reproduction number with machine learning algorithms. Numerical experiments show that the accuracy of predicting the basic reproduction number reaches 86% with an artificial neural network and 90% with decision trees. By combining topological data analysis with machine learning, this paper proposes an algorithm for extracting topological features of drug molecules, which provides a new method for predicting molecular properties in drug design and is significant for revealing the relationship between molecular structure and function. In addition, an infectious disease transmission model is constructed to predict the basic reproduction number and deepen the understanding of the disease transmission mechanism, in support of the development of relevant drugs. |
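
The network-based epidemic step summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the Erdős-Rényi graph stands in for one of the four network types, the update is a simple synchronous discrete-time rule, and the names `random_graph`, `simulate_sird`, `mean_degree` and the parameters `beta`, `gamma`, `mu` are hypothetical choices made only for this example.

```python
import random


def random_graph(n, p, rng):
    """Erdos-Renyi G(n, p) graph as an adjacency list (assumed network type)."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def mean_degree(adj):
    """One simple structural feature of the network: the average degree."""
    return sum(len(nbrs) for nbrs in adj) / len(adj)


def count(state):
    """Number of nodes in each compartment, in the order (S, I, R, D)."""
    return tuple(state.count(c) for c in "SIRD")


def simulate_sird(adj, beta, gamma, mu, n_seeds, steps, rng):
    """Synchronous discrete-time SIRD dynamics on a network.

    beta  : per-step infection probability along each S-I edge (assumed)
    gamma : per-step recovery probability of an infected node (assumed)
    mu    : per-step death probability of an infected node (assumed)
    Requires gamma + mu <= 1. Returns a list of (S, I, R, D) counts.
    """
    n = len(adj)
    state = ["S"] * n
    for v in rng.sample(range(n), n_seeds):
        state[v] = "I"
    history = [count(state)]
    for _ in range(steps):
        new_state = list(state)
        for v in range(n):
            if state[v] != "I":
                continue
            # An infected node may infect each susceptible neighbour.
            for u in adj[v]:
                if state[u] == "S" and rng.random() < beta:
                    new_state[u] = "I"
            # The infected node then recovers, dies, or stays infected.
            r = rng.random()
            if r < gamma:
                new_state[v] = "R"
            elif r < gamma + mu:
                new_state[v] = "D"
        state = new_state
        history.append(count(state))
    return history


rng = random.Random(0)
adj = random_graph(200, 0.05, rng)
history = simulate_sird(adj, beta=0.05, gamma=0.1, mu=0.02,
                        n_seeds=5, steps=50, rng=rng)
print("mean degree:", mean_degree(adj))
print("final (S, I, R, D):", history[-1])
```

In a pipeline like the one the abstract describes, structural features such as the mean degree would be computed for many simulated networks and passed, together with a reproduction-number label derived from the simulations, to a classifier or regressor. That learning step is omitted here because the abstract does not specify the features or labels used.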