| Leukotrienes (LTs) are pro-inflammatory lipid mediators derived from arachidonic acid (AA) that play an important role in asthma, arthritis, cardiovascular disease, and cancer. Inhibiting 5-lipoxygenase-activating protein (FLAP) blocks leukotriene synthesis, so compounds targeting FLAP can act as broad-spectrum leukotriene modulators with broad application prospects. However, no drug targeting FLAP is currently on the market. The main work of this paper is as follows: (1) Classification models of FLAP inhibitors were established using a variety of machine learning algorithms. Starting from the ligands of FLAP, a small-molecule activity database containing 2112 FLAP inhibitors was established. Five kinds of fingerprint descriptors were calculated: Avalon, extended-connectivity fingerprint (ECFP4), MACCS, RDKit fingerprint (RDK), and topological torsion fingerprint (TT). Twenty-five classification models for discriminating the activity level of FLAP inhibitors were established by combining each fingerprint with support vector machine (SVM), random forest (RF), logistic regression (LR), multi-layer perceptron (MLP), and gradient-boosted decision tree (XGBoost) algorithms. The SVM and XGBoost algorithms were suitable for building classification models for FLAP inhibitors, and ECFP4 and TT were suitable for representing the relationship between FLAP inhibitor structure and activity. The SVM-ECFP4 model, which combines ECFP4 with SVM, performed best: its test-set accuracy was 0.862 and its Matthews correlation coefficient (MCC) was 0.722. The distance between a compound and the model (dSTD-PRO) was used to define the applicability domain of the model, which reflects how reliably the model predicts unknown compounds. At the expense of reduced coverage, the test-set accuracy of the SVM-ECFP4 model improved to 0.946. Comparing dSTD-PRO with prediction accuracy showed that the reliability of the prediction results mainly depended on the molecular
structure of the compound itself. The composition of the training set had a greater impact on model performance than the modeling algorithm or descriptor type. (2) The structural characteristics of FLAP inhibitors were grouped by K-means clustering. Based on the molecular structural characteristics of FLAP inhibitors, a 10-bit custom fingerprint suited to classifying the data set was designed. Using this custom fingerprint as the input to K-means, the 2112 FLAP inhibitors were divided into eight subsets. The clustering results indicated that most FLAP inhibitors contain halogen, N-containing fused ring, aryl oxadiazole/oxazolidine, sulfonyl, amide, triarylamine-heteroarene, and diarylamino-heteroarene substructures. The stereochemistry of the chiral carbon and the connection between aromatic rings had an important effect on the activity of FLAP inhibitors. (3) Quantitative structure-activity relationship (QSAR) models for FLAP inhibitors were established. A set of 1083 FLAP inhibitors whose enzyme bioactivities were measured with the same method was collected. CORINA and RDKit molecular descriptors were calculated for each FLAP inhibitor. Six quantitative prediction models were built by multiple linear regression (MLR), support vector machine regression (SVR), and random forest regression (RFR). The coefficient of determination (R²) on the test set ranged from 0.476 to 0.670, and the root mean square error (RMSE) ranged from 0.617 to 0.490. The models built with the SVR and RFR algorithms performed better than those built with MLR. The RDKit molecular descriptors represented the relationship between FLAP inhibitor structure and activity better than the CORINA descriptors. Three consensus models were built by combining multiple models. The best consensus model had an R² of 0.690 and an RMSE of 0.474 on the test set, an improvement over the best single-model values of 0.670 and 0.490. The number of hydrogen bond donors, hydrophobicity, charge distribution, and electronegativity in the
FLAP inhibitor molecule had an important effect on activity. The models developed in this study can be used to screen compounds prior to activity assays. The structure-activity relationships summarized here can help medicinal chemists design more active and safer FLAP drug candidates. |
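
The evaluation metrics named in the abstract (accuracy, MCC, R², RMSE) and the idea of a consensus model can be illustrated with a minimal sketch. It rests on assumptions not stated in the source: the consensus is taken here as a plain average of individual model predictions (the abstract does not say how the models were combined), and all labels and activity values below are invented toy numbers, not data from the study. The function names are hypothetical.

```python
import math

def accuracy_and_mcc(y_true, y_pred):
    """Accuracy and Matthews correlation coefficient for binary labels (1 = active)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return acc, mcc

def r2_and_rmse(y_true, y_pred):
    """Coefficient of determination and root mean square error."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot, math.sqrt(ss_res / len(y_true))

def consensus(*predictions):
    """Assumed consensus rule: unweighted mean of the per-model predictions."""
    return [sum(vals) / len(vals) for vals in zip(*predictions)]

# Toy classification example (invented labels).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 0, 1, 1, 0]
acc, mcc = accuracy_and_mcc(labels, predicted)

# Toy regression example (invented activity values and model outputs).
activity = [5.0, 6.0, 7.0, 8.0]
model_a = [5.4, 6.4, 6.6, 7.6]
model_b = [4.8, 5.8, 7.2, 8.6]
combined = consensus(model_a, model_b)

r2_a, rmse_a = r2_and_rmse(activity, model_a)
r2_b, rmse_b = r2_and_rmse(activity, model_b)
r2_c, rmse_c = r2_and_rmse(activity, combined)

print(f"accuracy={acc:.3f} MCC={mcc:.3f}")
print(f"RMSE: model_a={rmse_a:.3f} model_b={rmse_b:.3f} consensus={rmse_c:.3f}")
```

In this toy case the averaged prediction has a lower RMSE than either model because their errors partly cancel. Averaging does not guarantee an improvement in general; it helps most when the individual models make weakly correlated errors.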