Computer-aided Study On Structure-Activity Relationship Of Tyrosinase Inhibitors

Posted on: 2022-02-27
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Wu
Full Text: PDF
GTID: 2481306602959289
Subject: Chemical Engineering and Technology
Abstract/Summary:
As a key rate-limiting enzyme, tyrosinase (TYR) plays an important role in melanin production and browning reactions, and is therefore closely related to skin whitening, the treatment of pigmentation diseases, and food preservation. In this thesis, we collected mushroom tyrosinase (abTYR) inhibitors as research objects and applied a variety of machine learning algorithms to classification and quantitative regression studies of their bioactivities. We also used computational methods to design new molecules that may, in theory, inhibit tyrosinase. The research contents of this thesis are as follows:

(1) Classification study of TYR inhibitors using a variety of machine learning algorithms. Based on 1097 collected abTYR inhibitors with IC50 values, we characterized the data set with three kinds of fingerprint descriptors (MACCS, ECFP4, and Avalon) and used five machine learning algorithms, namely support vector machine (SVM), deep neural network (DNN), logistic regression (LR), decision tree (DT), and random forest (RF), to build 15 classification models. Among these, Model 5B, built with ECFP4 fingerprints and the DNN algorithm, showed the best classification performance, achieving a prediction accuracy (Q) of 91.36% and a Matthews correlation coefficient (MCC) of 0.81 on the test set. We calculated the applicability domain threshold (Threshold_0.90) for each of the 15 classification models; predictions for compounds within a model's applicability domain were considered approximately 90% reliable. The Threshold_0.90 of Model 5B was 0.4872, and this model showed the best coverage, covering 99.43% of the training set and 99.55% of the test set. An external test set and a decoy set were used to verify the generalization ability of the models. Model 5B again performed best: it covered 98.48% of the decoy set with a Q of 94.05%, and 99.34% of the external test set with a Q of 84.67% and an MCC of 0.71. In addition, we used the t-SNE and K-Means algorithms to divide the 1097 inhibitors into 8 subsets. By summarizing the main scaffolds and substructures in each subset, we found that subset 1 (thiourea tyrosinase inhibitors) had the highest proportion of highly active inhibitors, at about 72% of its compounds. Quinoline, amide, and thiophene substructures appeared frequently in the highly active inhibitors of subset 1.

(2) Quantitative regression study of TYR inhibitors using a variety of machine learning algorithms. In total, we collected 813 abTYR inhibitors with exact IC50 values for the quantitative study and characterized them with CORINA descriptors and RDKit 2D descriptors. Using three descriptor ranking methods (Pearson, SVM-RFE, and RF-RFE) and three machine learning algorithms (SVM, RF, and DNN), we built 14 quantitative regression models. The best model was Model 7G, built with RDKit 2D descriptors and the DNN algorithm, which achieved a coefficient of determination (R²) of 0.770 and a root mean square error (RMSE) of 0.482 on the test set. A Williams plot was used to define and visualize the applicability domain of Model 7G. With a warning leverage (h*) of 0.235, about 98.31% of the training set and 88.96% of the test set fell within the applicability domain of Model 7G, so the predictions of Model 7G for these compounds were considered reliable. In the analysis of the RDKit 2D descriptors, we found that the substructures (such as thiocarbonyl and tertiary amine) represented by several of the descriptors with larger contributions were consistent with the fragments found in the classification study.

(3) Design of new molecules with the theoretical potential to inhibit TYR using the scaffold hopping method. First, we chose the main scaffolds in subset 1 as the starting materials for scaffold hopping. The thiosemicarbazide and acylthiourea substructures of the scaffolds were retained, and scaffold hopping generated a total of 15344 new molecules. The bioactivities of these 15344 molecules were then predicted with the classification and quantitative regression models built in this study, and the 7600 molecules predicted to be highly active TYR inhibitors (IC50 ≤ 10 μM) were kept for further evaluation. We then evaluated and filtered these 7600 molecules using the quantitative estimate of drug-likeness (QED), undesirable substructure filtering (PAINS), synthetic complexity evaluation (SCScore), pharmacokinetic evaluation (ADMET), patent investigation, and molecular docking. Finally, we selected three new molecules with the potential to become tyrosinase inhibitors.

In summary, this thesis investigated the structure-activity relationship of TYR inhibitors with computational methods. A variety of machine learning algorithms were used to establish classification and quantitative regression models for predicting the bioactivities of TYR inhibitors; descriptor analysis and clustering analysis identified the structural fragments that strongly influence the activity of tyrosinase inhibitors; and new molecules with the potential to inhibit TYR were generated. These results may support the development of tyrosinase inhibitors.
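The classification models in part (1) are assessed by prediction accuracy (Q) and the Matthews correlation coefficient (MCC). The abstract does not state the formulas, so the following is only a minimal sketch that assumes the standard definitions of both metrics for a binary confusion matrix. The function name `accuracy_and_mcc` and the example counts are hypothetical and are not taken from the thesis.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (Q, MCC) from binary confusion-matrix counts.

    Assumes the standard definitions:
      Q   = (TP + TN) / (TP + TN + FP + FN)
      MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    q = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when any marginal sum is zero.
    mcc = 0.0 if denom == 0 else (tp * tn - fp * fn) / denom
    return q, mcc


# Hypothetical counts for illustration only (not data from the thesis).
q, mcc = accuracy_and_mcc(tp=90, tn=85, fp=15, fn=10)
print(f"Q = {q:.2%}, MCC = {mcc:.3f}")
```

Unlike Q, MCC uses all four cells of the confusion matrix, which is one reason it is often reported alongside accuracy when the active and inactive classes are unevenly sized.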
Keywords/Search Tags: tyrosinase inhibitors, classification research, clustering, quantitative prediction, scaffold hopping