| The plant cuticle-water partition coefficient is important for evaluating the environmental fate and risk of organic pollutants. Tomato fruit is used as a medium for measuring this coefficient because its cuticle is thick, contains no pores, and is easy to separate and treat. However, determining the tomato cuticle-water partition coefficient (Kcw) experimentally is time-consuming and expensive. Experiments cannot quickly determine the physicochemical properties of arbitrary compounds or reveal the microscopic mechanisms of pollutant transformation, and they cannot meet the needs of risk prediction and management for a large and growing variety of compounds. It is therefore of great significance to develop prediction models that can replace experimental methods for estimating the Kcw values of compounds. In this study, two quantitative structure-activity relationship (QSAR) models for predicting the Kcw of organic pollutants were established, one using a linear machine learning algorithm, stepwise multiple linear regression (MLR), and one using a nonlinear machine learning algorithm, an artificial neural network (ANN). The models were fully evaluated and their applicability domains were characterized in accordance with OECD guidelines. In addition, the mechanisms affecting the partitioning of organic pollutants between the tomato cuticle and the water phase were discussed through mechanistic interpretation. The specific research contents and results are as follows: (1) Taking 127 measured Kcw values of 74 organic pollutants collected from the literature (covering 13 classes of compounds, such as alcohols, ethers, esters, phenols and herbicides) as the dependent variable and 6 descriptors selected by MLR (H_Dz(p), J_Dz(p), ETA_Alpha_A, Hy, S2K, TI2_L) as independent variables, a linear MLR-QSAR model was established. The statistical parameters used to evaluate the model (R²adj = 0.799, RMSEtra = 0.741, Q²LOO = 0.797, Q²BOOT = 0.814; R²ext = 0.837, Q²ext = 0.830) show that the MLR-QSAR model performs well in terms of goodness of fit, robustness and 
external prediction ability. The applicability domain of the model was characterized with a Williams plot. Mechanistic interpretation revealed that the main factors affecting the partitioning of organic pollutants between the tomato cuticle and the water phase were atomic space density and polarizability. (2) Based on the six descriptors selected by the MLR algorithm, an ANN-QSAR model was established with the artificial neural network algorithm to capture the nonlinear relationship between Kcw and the descriptors. The ANN-QSAR model shows good performance in goodness of fit (R²adj = 0.900, RMSEtra = 0.522), robustness (Q²LOO = 0.867, Q²BOOT = 0.828) and external prediction ability (R²ext = 0.837, Q²ext = 0.830), and can effectively predict the Kcw of organic compounds. Because the model is based on Kcw values for 13 classes of compounds, including alcohols, ethers, esters, phenols and herbicides, it has a wide range of applications. By comparison, the goodness of fit and robustness of the ANN-QSAR model are better than those of the MLR-QSAR model. In this study, the relationship between molecular structure and Kcw was analyzed from both linear and nonlinear perspectives, with QSAR models established by the MLR and ANN algorithms respectively. The models provide an efficient and accurate tool for predicting the partition coefficient of organic compounds between the tomato cuticle and the water phase, which not only fills gaps in the experimental data and provides data support for evaluating the environmental fate and risk of organic pollutants, but also provides a theoretical basis for understanding the partitioning behavior of organic pollutants. |
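
The validation statistics named in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic (100 random samples with 6 descriptors standing in for the 6 selected descriptors), all function names are hypothetical, and the definitions are commonly used ones that the abstract does not spell out. Specifically, it assumes Q²LOO = 1 − PRESS/SS_tot, with SS_tot taken about the training mean, and the warning leverage h* = 3(p + 1)/n that usually marks the applicability-domain boundary on a Williams plot.

```python
import numpy as np


def design(X):
    """Prepend an intercept column to the descriptor matrix."""
    return np.column_stack([np.ones(len(X)), X])


def fit_mlr(X, y):
    """Ordinary least squares fit; returns intercept plus slopes."""
    coef, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return coef


def predict_mlr(coef, X):
    return design(X) @ coef


def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(y, y_hat, n_descriptors):
    """R² penalised for the number of descriptors in the model."""
    n = len(y)
    return 1.0 - (1.0 - r_squared(y, y_hat)) * (n - 1) / (n - n_descriptors - 1)


def q2_loo(X, y):
    """Leave-one-out Q² (assumed definition: 1 - PRESS / SS_tot)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i          # refit without sample i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict_mlr(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def leverages(X_train, X_query):
    """Leverage h_i = x_i (A^T A)^-1 x_i^T, the x-axis of a Williams plot."""
    A, Q = design(X_train), design(X_query)
    inv = np.linalg.pinv(A.T @ A)
    return np.einsum("ij,jk,ik->i", Q, inv, Q)


def warning_leverage(n_train, n_descriptors):
    """Commonly used threshold h* = 3(p + 1) / n (an assumption here)."""
    return 3.0 * (n_descriptors + 1) / n_train


# Synthetic stand-in data: NOT the Kcw data set from the study.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
true_coef = np.array([1.0, -0.5, 0.8, 0.3, -1.2, 0.6])
y = X @ true_coef + 0.5 * rng.normal(size=100)

coef = fit_mlr(X, y)
y_hat = predict_mlr(coef, X)
r2 = r_squared(y, y_hat)
adj_r2 = adjusted_r_squared(y, y_hat, n_descriptors=6)
q2 = q2_loo(X, y)
h_train = leverages(X, X)
h_star = warning_leverage(n_train=100, n_descriptors=6)

print(f"R2adj = {adj_r2:.3f}, Q2LOO = {q2:.3f}")
print(f"samples above h* = {h_star:.2f}: {int(np.sum(h_train > h_star))}")
```

On a Williams plot the leverages are drawn against standardized residuals. A compound whose leverage exceeds h* lies outside the structural domain of the training set, so its prediction should be treated with caution.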