Aromatic compounds are ubiquitous in nature and are important chemical raw materials. Quantitative structure-activity relationship (QSAR) modeling is of great value in studying the properties of aromatic compounds because it offsets the high cost and long research cycle of experimental methods. Compared with traditional 2D-QSAR and 3D-QSAR methods, hologram QSAR (HQSAR) is simpler to compute and has higher predictive ability. Ensemble modeling can reduce the influence of sample size or sample complexity on prediction results and improve the predictive ability and stability of a model. Applying ensemble modeling to HQSAR not only compensates for the shortcomings of a single model but also offers new ideas for QSAR research methods. In this study, therefore, HQSAR and ensemble modeling were used to study the quantitative relationship between the structure and properties of aromatic compounds. The main contents are as follows:

1. HQSAR was used to establish quantitative structure-activity relationships between the molecular structure of polychlorinated biphenyls (PCBs) and three sets of activity data: the n-octanol/air partition coefficient, the n-octanol/water partition coefficient, and the bioconcentration factor. All three are important parameters for characterizing the environmental behavior of PCBs. The predictive ability of the models was evaluated by leave-one-out cross-validation and external test set validation. The cross-validated correlation coefficients of the three HQSAR models were 0.957, 0.954, and 0.987, respectively, indicating good predictive power. The relationship between the molecular structure and activity of PCBs was analyzed using molecular contribution maps.

2. HQSAR was used to establish the quantitative structure-activity relationship between the molecular structure of hydrophobic organic pollutants and the low-density polyethylene (LDPE)-water partition coefficient. The predictive ability of the model was evaluated by leave-one-out cross-validation and external test set validation. The cross-validated correlation coefficient was 0.960 and the non-cross-validated correlation coefficient was 0.981, indicating good predictive ability. The relationship between the structure of hydrophobic organic pollutants and the LDPE-water partition coefficient was analyzed using molecular contribution maps.

3. HQSAR and ensemble modeling were used to study the quantitative relationship between the molecular structure of dioxins and the n-octanol/water partition coefficient. The predictive ability of the model was evaluated by leave-one-out cross-validation and external test set validation. The cross-validated correlation coefficient was 0.986 and the non-cross-validated correlation coefficient was 0.991, indicating good predictive ability and robustness. The relationship between the molecular structure of dioxins and the n-octanol/water partition coefficient was analyzed using molecular contribution maps. For ensemble modeling, member models with an average relative error below 0.8% were accepted, and the total number of member models was set to 80. The ensemble HQSAR model was then built and verified by external test set validation. The results showed that the ensemble HQSAR model improved both predictive ability and robustness.

4. HQSAR and ensemble modeling were used to study the quantitative relationship between the molecular structure of polycyclic aromatic hydrocarbons (PAHs) and the chromatographic retention index. The predictive ability of the model was evaluated by leave-one-out cross-validation and external test set validation. The cross-validated correlation coefficient was 0.994 and the non-cross-validated correlation coefficient was 0.973, showing that the HQSAR model is reliable and has strong predictive ability. The relationship between PAH structure and the chromatographic retention index was analyzed using molecular contribution maps. An average relative error below 2.0% was chosen as the acceptance criterion for ensemble modeling, and the total number of member models was set to 130. The ensemble HQSAR model was then built, and its predictive ability was evaluated by external test set validation. The results show that the model parameters are robust and reliable. Compared with the single HQSAR model, the ensemble HQSAR model improves predictive ability and robustness.
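
The member-selection step described in items 3 and 4 can be illustrated with a minimal sketch. The abstract does not say how member models are generated or combined, so the following are assumptions made purely for illustration: each member is trained on a random subset of the training set, a member is accepted when its average relative error on the held-out samples falls below the threshold, and the ensemble prediction is the mean of the accepted members. A one-descriptor least-squares line stands in for an actual HQSAR model, and the data are synthetic. Only the 2.0% threshold and the member count of 130 come from the text.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (a stand-in for one HQSAR member model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def average_relative_error(model, xs, ys):
    """Average relative error in percent of a fitted line on (xs, ys)."""
    a, b = model
    return 100.0 * mean(abs((a * x + b) - y) / abs(y) for x, y in zip(xs, ys))


def build_ensemble(xs, ys, n_members, max_are, subset_size, rng, max_tries=10000):
    """Collect member models whose held-out average relative error is below max_are.

    Random-subset training and held-out screening are assumptions of this
    sketch, not details taken from the abstract.
    """
    indices = list(range(len(xs)))
    members = []
    for _ in range(max_tries):
        if len(members) == n_members:
            break
        chosen = set(rng.sample(indices, subset_size))
        train = [i for i in indices if i in chosen]
        held_out = [i for i in indices if i not in chosen]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        are = average_relative_error(
            model, [xs[i] for i in held_out], [ys[i] for i in held_out]
        )
        if are < max_are:
            members.append(model)
    return members


def predict(members, x):
    """Ensemble prediction: mean of the member predictions (assumed consensus rule)."""
    return mean(a * x + b for a, b in members)


# Synthetic data: y = 2x + 5 plus small noise (hypothetical, for demonstration only).
rng = random.Random(0)
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x + 5.0 + rng.uniform(-0.1, 0.1) for x in xs]

# Threshold (2.0 %) and member count (130) follow the PAH case in the text.
members = build_ensemble(xs, ys, n_members=130, max_are=2.0, subset_size=14, rng=rng)
ensemble_prediction = predict(members, 10.0)
```

Averaging over many screened members damps the dependence of the prediction on any single training split, which is the general motivation for ensemble modeling given in the abstract.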