
Research On The Prediction Method Of Aqueous Solubility Of Compound Based On Multi-Population Whale Optimization Algorithm

Posted on: 2024-09-19
Degree: Master
Type: Thesis
Country: China
Candidate: Y Shen
GTID: 2531307139476534
Subject: Materials and Chemical Engineering (Professional Degree)
Abstract/Summary:
Aqueous solubility is an essential property of a compound that reflects its ability to dissolve in water. Research on aqueous solubility matters because the property plays a vital role in biopharmaceuticals, agricultural production, and environmental governance. In biopharmaceuticals, drugs made of highly water-soluble compounds dissolve quickly in body fluids and thus enter the circulatory system to exert their medicinal effects. In agricultural production, fertilizers made of poorly water-soluble compounds stay in the soil for a long time and continuously provide nutrients for crops. In environmental governance, highly water-soluble compounds are easily decomposed and absorbed by organisms, which shortens their residence time in the environment and reduces the environmental impact of chemical synthesis products. Knowing the aqueous solubility of existing compounds under different conditions and predicting that of new compounds is therefore of great significance to chemical production, pharmaceutical preparation, and other fields.

Methods based on the Quantitative Structure-Property Relationship (QSPR) have been widely investigated in recent years for predicting aqueous solubility because they are low in cost, fast, and able to predict the solubility of compounds in bulk without complex experimental measurements. QSPR uses an algorithm to train a prediction model on existing aqueous solubility data sets and then uses that model to predict the aqueous solubility of new compounds. With the rapid development of artificial intelligence, many machine learning and deep learning algorithms have shown great advantages in prediction tasks, and one widely used model is the long short-term memory (LSTM) neural network. This thesis therefore uses LSTM to build a QSPR-based model for predicting the aqueous solubility of compounds.

The hyperparameters of LSTM have a significant influence on its prediction performance, and the whale optimization algorithm (WOA) is an efficient optimizer that can be used to tune them. However, the basic WOA easily falls into local optima during optimization, and its convergence is slow. To overcome these shortcomings, this thesis first improves the basic WOA with a multi-population mechanism and proposes three multi-population whale optimization algorithms. Experimental results show that all three improved WOAs achieve higher optimization accuracy and faster convergence than the traditional WOA. The thesis then uses the three multi-population WOAs to optimize the hyperparameters of LSTM in order to improve its prediction accuracy for aqueous solubility. Finally, the optimized LSTM is validated on the aqueous solubility test set, and the experimental results show that the optimization significantly improves prediction accuracy.

The main research content and innovation points of this thesis are as follows:

(1) Based on particle swarm optimization (PSO) and WOA, a dual-population WOA is proposed. The algorithm uses PSO as the exploitation population and WOA as the exploration population. During the iterative process, the exploration and exploitation populations exchange their elite individuals to enhance optimization accuracy. The algorithm is tested on the CEC 2017 test set, and the experiments show that it has good optimization performance.

(2) A multi-population co-evolutionary WOA is proposed, which divides the original population into three sub-populations to increase population diversity and avoid falling into local optima prematurely. In addition, a novel multi-population evolution mechanism is introduced to further improve optimization accuracy and convergence speed. Experiments show that the algorithm runs fast, has high solution accuracy, and performs excellently on most functions of the CEC 2017 test set.

(3) A WOA based on an Excenter-based learning (EBL) mechanism and adaptive Gaussian mutation is proposed. First, the algorithm divides the population into three sub-populations to increase its ability to jump out of local optima. Second, a novel EBL mechanism is proposed to help individuals escape local optima. Finally, a novel adaptive Gaussian mutation strategy is proposed to maintain population diversity and speed up convergence in the later stage. Among the algorithms compared, this algorithm has the highest optimization accuracy and the fastest convergence speed on the CEC 2017 test set, and its optimization performance is significantly better than that of the traditional WOA.

(4) The three multi-population WOAs proposed in this thesis are used to optimize the hyperparameters of LSTM. Experiments show that the three improved WOAs optimize LSTM significantly better than the traditional WOA does, and that the optimized LSTM predicts the aqueous solubility of compounds more accurately than the unoptimized LSTM.
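For readers unfamiliar with the baseline, the traditional single-population WOA that the thesis sets out to improve can be sketched roughly as follows. This is a minimal illustration of the commonly published algorithm, not the thesis's multi-population variants, whose details the abstract does not give. The function names `woa_minimize` and `sphere`, the toy objective, the scalar coefficients per whale, the immediate update of the best solution, and all default parameter values are assumptions made for this example.

```python
import math
import random


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def woa_minimize(objective, dim, lower, upper, n_whales=20, n_iter=200, seed=0):
    """Minimal baseline WOA sketch: minimize `objective` over the box [lower, upper]^dim."""
    rng = random.Random(seed)

    def clamp(v):
        # Keep every coordinate inside the search bounds.
        return max(lower, min(upper, v))

    # Random initial population.
    whales = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n_whales)]
    fitness = [objective(w) for w in whales]
    best_idx = min(range(n_whales), key=fitness.__getitem__)
    best_x, best_f = list(whales[best_idx]), fitness[best_idx]

    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 toward 0
        for i in range(n_whales):
            x = whales[i]
            A = 2.0 * a * rng.random() - a  # step coefficient in [-a, a]
            C = 2.0 * rng.random()          # weight on the reference position
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale (exploitation);
                # otherwise move relative to a random whale (exploration).
                ref = best_x if abs(A) < 1.0 else whales[rng.randrange(n_whales)]
                new_x = [clamp(r - A * abs(C * r - xi)) for r, xi in zip(ref, x)]
            else:
                # Spiral (bubble-net) move around the best whale, shape constant b = 1.
                l = rng.uniform(-1.0, 1.0)
                spiral = math.exp(l) * math.cos(2.0 * math.pi * l)
                new_x = [clamp(abs(bx - xi) * spiral + bx) for bx, xi in zip(best_x, x)]
            whales[i] = new_x
            f = objective(new_x)
            if f < best_f:
                # Simplification: the best solution is updated immediately,
                # not once per iteration.
                best_x, best_f = list(new_x), f
    return best_x, best_f


if __name__ == "__main__":
    x, f = woa_minimize(sphere, dim=5, lower=-5.0, upper=5.0, seed=1)
    print("best value found:", f)
```

In a hyperparameter-tuning setting such as the one described above, `objective` would instead train a model with the candidate hyperparameters and return its validation error, so each evaluation is far more expensive than in this toy case.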
Keywords/Search Tags: Aqueous solubility, Quantitative Structure-Property Relationship (QSPR), Long short-term memory (LSTM), Whale Optimization Algorithm (WOA), Multi-population