Computational workflow for discovering small molecular binders for shallow binding sites by integrating molecular dynamics simulation, pharmacophore modeling, and machine learning: STAT3 as case study

Journal of Computer-Aided Molecular Design

A Correction to this article was published on 19 October 2023

This article has been updated

Abstract

STAT3 belongs to a family of seven transcription factors. It plays an important role in activating the transcription of genes involved in a variety of cellular processes. High levels of STAT3 are detected in several types of cancer; hence, STAT3 inhibition is considered a promising anti-cancer therapeutic strategy. However, because STAT3 inhibitors bind to the shallow SH2 domain of the protein, hydration water molecules are expected to play a significant role in ligand binding, which complicates the discovery of potent binders. To address this issue, we propose extracting pharmacophores from molecular dynamics (MD) frames of a potent co-crystallized ligand complexed within the STAT3 SH2 domain. We then employ a genetic function algorithm coupled with machine learning (GFA-ML) to search for the optimal combination of MD-derived pharmacophores that accounts for the variation in bioactivity among a list of inhibitors. To enhance the dataset, the training and testing lists were augmented nearly 100-fold by considering multiple conformers of the ligands. A single significant pharmacophore emerged after 188 ns of MD simulation as representative of STAT3-ligand binding. Screening the National Cancer Institute (NCI) database with this model identified one low-micromolar inhibitor that most likely binds to the SH2 domain of STAT3 and inhibits this pathway.
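
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' GFA-ML implementation: it assumes a plain genetic algorithm over binary masks, a nearest-centroid classifier as a stand-in learner, a small per-feature penalty as a parsimony term, and purely synthetic data in which each row is one conformer and each column records whether that conformer fits one MD-derived pharmacophore. All function names and parameter values are hypothetical. Because conformer augmentation produces many rows per ligand, the sketch holds out whole molecules rather than individual conformers.

```python
import random

def make_toy_data(rng, n_molecules=40, n_conformers=5, n_features=12, informative=(2, 7)):
    """Synthetic stand-in: rows are conformers, columns are 0/1 pharmacophore hits."""
    rows = []
    for mol_id in range(n_molecules):
        active = mol_id % 2  # alternate active and inactive molecules
        for _ in range(n_conformers):
            bits = [rng.randint(0, 1) for _ in range(n_features)]
            for j in informative:
                # informative columns agree with the activity label 90% of the time
                bits[j] = active if rng.random() < 0.9 else 1 - active
            rows.append((mol_id, bits, active))
    return rows

def grouped_split(rows, valid_fraction, rng):
    """Hold out whole molecules so conformers of one ligand never straddle the split."""
    mol_ids = sorted({r[0] for r in rows})
    rng.shuffle(mol_ids)
    n_valid = max(1, int(len(mol_ids) * valid_fraction))
    held_out = set(mol_ids[:n_valid])
    train = [r for r in rows if r[0] not in held_out]
    valid = [r for r in rows if r[0] in held_out]
    return train, valid

def fitness(mask, train, valid, penalty=0.01):
    """Validation accuracy of a nearest-centroid rule minus a parsimony penalty."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    centroids = {}
    for label in (0, 1):
        members = [bits for _, bits, y in train if y == label]
        centroids[label] = [sum(b[j] for b in members) / len(members) for j in cols]
    correct = 0
    for _, bits, y in valid:
        dist = {label: sum((bits[j] - c) ** 2 for j, c in zip(cols, centroids[label]))
                for label in (0, 1)}
        predicted = 0 if dist[0] <= dist[1] else 1
        correct += predicted == y
    return correct / len(valid) - penalty * len(cols)

def evolve(train, valid, n_features, rng, pop_size=20, generations=15, mutation_rate=0.1):
    """Genetic search over feature masks: tournament selection, crossover, mutation."""
    score = lambda m: fitness(m, train, valid)
    population = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        history.append(score(ranked[0]))
        next_gen = [ranked[0][:]]  # elitism: carry the best mask forward unchanged
        while len(next_gen) < pop_size:
            parent_a = max(rng.sample(ranked, 3), key=score)
            parent_b = max(rng.sample(ranked, 3), key=score)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            child = [1 - b if rng.random() < mutation_rate else b for b in child]
            next_gen.append(child)
        population = next_gen
    best = max(population, key=score)
    history.append(score(best))
    return best, history

rng = random.Random(0)
rows = make_toy_data(rng)
train, valid = grouped_split(rows, 0.2, rng)
best_mask, history = evolve(train, valid, n_features=12, rng=rng)
print("selected pharmacophore columns:", [j for j, keep in enumerate(best_mask) if keep])
print("best fitness per generation:", [round(h, 3) for h in history])
```

Splitting by molecule rather than by row is the design choice that matters most here: with near-duplicate conformers on both sides of a row-level split, validation accuracy would likely be inflated and the search could favor masks that memorize individual ligands.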

Data availability

Data are available upon request from the corresponding author.

Acknowledgements

The authors thank the Deanship of Academic Research at the University of Jordan for funding this project. The authors also thank Dr. Walhan al Shaer, Fadwa Daoud, and Suha Wehaibi of the Cell Therapy Center for their technical assistance with the biological testing experiments.

Funding

This project was funded by the Deanship of Scientific Research at the University of Jordan, Amman, Jordan.

Author information

Contributions

NJJ: Investigation, Formal analysis, Review & Editing. MH: Investigation, Review & Editing. DA: Formal analysis, Review & Editing. MOT: Conceptualization, Methodology, Supervision, Investigation, Resources, Writing, Review & Editing.

Corresponding author

Correspondence to Mutasem Omar Taha.

Ethics declarations

Competing interests

The authors declare that they have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper.

Ethical approval

Not applicable.

Consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 894 KB)

Supplementary file2 (RAR 43921 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Jaradat, N.J., Hatmal, M., Alqudah, D. et al. Computational workflow for discovering small molecular binders for shallow binding sites by integrating molecular dynamics simulation, pharmacophore modeling, and machine learning: STAT3 as case study. J Comput Aided Mol Des 37, 659–678 (2023). https://doi.org/10.1007/s10822-023-00528-y

  • DOI: https://doi.org/10.1007/s10822-023-00528-y
