Application of Machine Learning and Classical Molecular Dynamics Simulation in Protein-Protein and Protein-Drug Interaction

Posted on: 2021-10-11; Degree: Doctor; Type: Dissertation
Country: China; Candidate: Muhammad Junaid; Full Text: PDF
GTID: 1480306503462244; Subject: Biology
Abstract/Summary:
In the last few decades, computer simulation and machine learning have become potent tools for studying experimental systems such as enzyme function and protein-protein interactions. Advances in computer hardware and computational software continue to increase the available computing power, making it possible to study more complex systems, including protein and DNA structure, protein-protein interactions, and enzyme-substrate interactions. In the present thesis, computer simulation and machine learning approaches were used to explore different types of biological systems. The first study is a hot-spot analysis for drug discovery targeting the protein-protein interaction between Helicobacter pylori (H. pylori) CagA and the tumor suppressor protein ASPP2. In the second study, we developed a novel stacking model by machine learning that can be used for virtual screening of inhibitors against the urease enzyme of H. pylori. In the third study, extensive molecular dynamics simulations and machine learning techniques were used to study the interaction of three pyrazinamidase (PZase) mutants, N11K, P69T, and D126N, with the drug pyrazinamide (PZA) in order to reveal the mechanism of Mycobacterium tuberculosis (MTB) resistance to PZA.

Half of the world's population is infected by the Gram-negative bacterium H. pylori. It colonizes the stomach and is associated with severe gastric pathologies, including gastric cancer and peptic ulceration. The most important virulence factor of H. pylori is the cytotoxin-associated gene A (CagA) protein, which is injected into the host cell. CagA interacts with several host proteins and alters their function, thereby causing several diseases. The most well-known target of CagA is the tumor suppressor protein ASPP2. Subdomain I at the N-terminus of CagA interacts with the proline-rich motif of ASPP2. In this study, we carried out alanine scanning mutagenesis and extensive molecular dynamics simulations totaling 3.8 μs to identify hot-spot and least essential residues. The least critical residues were mutated to other residues to generate decoy peptides with stronger binding affinity for CagA. MMGBSA calculations confirmed the strong binding affinity of the decoy peptides for CagA. Alanine scanning showed that mutating Y207 and T211 to alanine decreased the binding affinity, so these residues were considered hot spots. Likewise, the dynamics simulations and MMGBSA analysis showed the importance of these two residues at the interface. A four-feature pharmacophore model was developed based on Y207 and T211, and the top ten molecules were filtered from the ZINC, NCI, and ChEMBL databases. The good binding affinity of these molecules supports the reliability of the adopted protocol for targeting binding hot-spot residues. The ASPP2-binding pocket in CagA possesses potential druggability and could be engaged by our designed decoy peptides and small molecules. We believe that our study provides new insight into using CagA as a therapeutic target for gastric cancer treatment and provides a platform for future experiments.

Urease is a nickel-containing enzyme found in algae, fungi, and H. pylori. It catalyzes the conversion of urea into ammonia and carbon dioxide. During catalysis, the carbonyl oxygen and an amino group of urea interact directly with the nickel ions, leading to the formation of ammonia and carbamate. The carbamate is then converted into ammonia and carbon dioxide. Urease is considered the most important drug target against many disease-causing bacteria, such as H. pylori. Urease helps H. pylori survive the acidic environment of the stomach by producing ammonia, which neutralizes the harsh acidic environment in its vicinity. Urease also contributes to the pathogenesis of many other illnesses, including urinary tract infections, hepatic coma, and kidney stones. The use of urease inhibitors has been an effective strategy to regulate urease activity. Designing highly selective compounds for the urease enzyme is a critical consideration for both drug discovery and mechanistic studies. Classifiers that distinguish active from inactive molecules are challenging to build but in demand. In the present study, eight machine learning models, namely neural network (NN), K-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), logistic regression (LR), gradient boosting (GBM), extreme gradient boosting (XGB), and a stacking model, were built to classify active and inactive molecules of the urease enzyme. Three types of features, molecular descriptors (2D and 3D), MACCS fingerprints, and ECFP4 fingerprints, were calculated for both active and inactive compounds. The performance of each algorithm with the different feature types was compared and discussed. According to the receiver operating characteristic curves and the calculated metrics, the stacking model ranked first, with the highest accuracy of 0.98. Among the feature types, the ECFP4 fingerprint gave the best results for the stacking model. Machine-learning-based decision-making models thus provide an alternative to conventional molecular docking for virtual screening of compounds. This study can be of value to the application of machine learning in drug discovery and compound development.

PZA is a first-line prodrug that effectively shortens the duration of tuberculosis therapy from 9 to 6 months. The pncA gene encodes PZase, which is responsible for converting the PZA prodrug into its active form, pyrazinoic acid (POA). POA is toxic to the bacteria and potently inhibits the growth of latent MTB even at low pH. PZA resistance is caused by mutations in three genes: pncA, rpsA, and panD. Among them, the pncA gene accounts for 72 to 99% of the resistance. Hence, the present study focused on the novel mutations N11K, P69T, and D126N in the pncA gene. Molecular docking and dynamics simulations were employed to unravel the mechanism of resistance. Furthermore, an SVM model was used to uncover the structural features responsible for the active and inactive states of the PZase enzyme. Our in-depth analysis and results are in strong agreement with experimental observations. Binding cavity analysis showed that the mutations decrease or increase the volume of the active site, which hinders the correct orientation of PZA in the active site. Moreover, the PatchDock and AutoDock scores were lower than those of the wild type (WT), owing to disturbed shape complementarity between PZase and PZA. MMGBSA analyses showed that these mutations decrease the binding affinity for PZA. In conclusion, the mutations N11K, P69T, and D126N not only weaken the binding affinity for the drug but also cause significant structural deformations that lead to PZA resistance. This study shows that mutations outside the active site may also affect protein folding and cause ligand displacement, altering biological function.

Our research provides an improved understanding of protein-protein and protein-small molecule interactions to better study the pathogenesis and treatment of diseases caused by Helicobacter pylori and Mycobacterium tuberculosis. Moreover, the present thesis combines molecular dynamics simulation and machine learning techniques to provide new ideas for drug design and development.
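
The alanine-scanning step in the first study can be illustrated with a small sketch. It assumes that a binding free energy (for example from MMGBSA) is available for the wild-type complex and for each alanine mutant, and that a residue counts as a hot spot when the mutation weakens binding by at least 2 kcal/mol. That cutoff is a common convention, not a value stated in the abstract, and the residue names and energies below are placeholders, not results from the thesis.

```python
from typing import Dict, List


def ddg_upon_alanine(dg_wild_type: float, dg_mutants: Dict[str, float]) -> Dict[str, float]:
    """Return ddG = dG(mutant) - dG(wild type) for each alanine mutant, in kcal/mol.

    A positive ddG means the mutant binds more weakly than the wild type.
    """
    return {residue: dg - dg_wild_type for residue, dg in dg_mutants.items()}


def hot_spots(ddg: Dict[str, float], threshold: float = 2.0) -> List[str]:
    """Residues whose mutation to alanine weakens binding by at least `threshold`.

    The 2.0 kcal/mol default is an assumed, conventional cutoff.
    """
    return sorted(residue for residue, value in ddg.items() if value >= threshold)


# Placeholder binding free energies (kcal/mol), for illustration only.
dg_wt = -40.0
dg_ala = {"res_A": -35.5, "res_B": -37.0, "res_C": -39.8}

ddg = ddg_upon_alanine(dg_wt, dg_ala)
print(hot_spots(ddg))
```

With these placeholder numbers, res_A and res_B exceed the cutoff and res_C does not, which mirrors how a residue with a small ddG would be treated as "least essential" and considered for substitution in a decoy peptide.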
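
The second study ranks classifiers by receiver operating characteristic curves and accuracy. As a minimal sketch of what those two metrics measure, the code below computes accuracy from hard labels and the area under the ROC curve from continuous scores, using the pairwise-ranking definition of AUC. The labels, scores, and the 0.5 decision threshold are toy assumptions, not data from the thesis, and a real workflow would typically use a library implementation instead.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of compounds whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen active (label 1) is scored
    above a randomly chosen inactive (label 0); ties count as one half.
    """
    actives = [s for t, s in zip(y_true, scores) if t == 1]
    inactives = [s for t, s in zip(y_true, scores) if t == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum(
        1.0 if a > i else 0.5 if a == i else 0.0
        for a in actives
        for i in inactives
    )
    return wins / (len(actives) * len(inactives))


# Toy labels (1 = active, 0 = inactive) and classifier scores, for illustration only.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while the AUC depends only on how the scores rank actives against inactives, which is why the two are usually reported together when comparing models.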
Keywords/Search Tags: Molecular dynamics simulation, Machine learning, CagA, ASPP2, Urease inhibitors, PZase mutations