pH-dependent solubility prediction for optimized drug absorption and compound uptake by plants

Journal of Computer-Aided Molecular Design

Abstract

Aqueous solubility is the most important physicochemical property for agrochemical and drug candidates and a prerequisite for uptake, distribution, transport, and ultimately bioavailability in living species. Here we present the first direct machine learning models for pH-dependent solubility in water. To build them, we combined almost 300,000 data points from 11 solubility assays performed over 24 years with more than one million data points from lipophilicity and melting point experiments. Data were split into three pH classes (acidic, neutral, and basic), representing the conditions of the stomach and intestinal tract in animals and humans and of the phloem and xylem in plants. We find that multi-task neural networks using ECFP-6 fingerprints outperform baseline random forests and single-task neural networks on the individual tasks. Our final model, with three solubility tasks that pool data from different assays by pH class and five helper tasks, achieves root mean square errors (RMSE) of 0.56 log units overall (acidic 0.61; neutral 0.52; basic 0.54) and Spearman rank correlations of 0.83 overall (acidic 0.78; neutral 0.86; basic 0.86), making it a valuable tool for profiling compounds in pharmaceutical and agrochemical research. The model also predicts compound pH profiles, with mean and median per-molecule RMSE of 0.62 and 0.56 log units, respectively.
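The evaluation metrics quoted in the abstract (RMSE, Spearman rank correlation, and RMSE per molecule across pH classes) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the record layout, the function names, and the toy values below are hypothetical and chosen only to show how such numbers might be computed, assuming each record holds a measured and a predicted log solubility for one molecule in one pH class.

```python
import math
from collections import defaultdict
from statistics import mean, median


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def _average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(y_true, y_pred):
    """Spearman rank correlation, computed as Pearson correlation of ranks."""
    rt, rp = _average_ranks(y_true), _average_ranks(y_pred)
    mt, mp = mean(rt), mean(rp)
    cov = sum((a - mt) * (b - mp) for a, b in zip(rt, rp))
    var_t = sum((a - mt) ** 2 for a in rt)
    var_p = sum((b - mp) ** 2 for b in rp)
    return cov / math.sqrt(var_t * var_p)


def per_molecule_rmse(records):
    """RMSE over the pH classes available for each molecule."""
    by_mol = defaultdict(list)
    for mol_id, _ph_class, y_true, y_pred in records:
        by_mol[mol_id].append((y_true, y_pred))
    return {m: rmse(*zip(*pairs)) for m, pairs in by_mol.items()}


# Hypothetical toy data (not from the paper):
# (molecule id, pH class, measured log solubility, predicted log solubility)
records = [
    ("mol_a", "acidic", -3.0, -3.5),
    ("mol_a", "neutral", -4.0, -4.0),
    ("mol_a", "basic", -4.5, -4.0),
    ("mol_b", "acidic", -2.0, -2.0),
    ("mol_b", "neutral", -2.5, -3.5),
    ("mol_c", "basic", -5.0, -5.0),
]

measured = [r[2] for r in records]
predicted = [r[3] for r in records]
overall_rmse = rmse(measured, predicted)
overall_rho = spearman(measured, predicted)
profile_rmse = per_molecule_rmse(records)
mean_profile_rmse = mean(profile_rmse.values())
median_profile_rmse = median(profile_rmse.values())
```

Per-class figures such as those reported for the acidic, neutral, and basic tasks would follow by filtering `records` on the pH class before calling `rmse` and `spearman`. Reporting both the mean and the median of the per-molecule values is useful because a few molecules with poorly predicted profiles can pull the mean well above the median.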

Author information

Contributions

AB performed the machine learning work and prepared all figures and data. SN and FM provided the machine learning concept and framework. AG identified the datasets and wrote the main manuscript. All authors reviewed the manuscript.

Corresponding author

Correspondence to Andreas H. Göller.

Ethics declarations

Competing interest

The authors declare no competing interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 10614 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Bonin, A., Montanari, F., Niederführ, S. et al. pH-dependent solubility prediction for optimized drug absorption and compound uptake by plants. J Comput Aided Mol Des 37, 129–145 (2023). https://doi.org/10.1007/s10822-023-00496-3

  • DOI: https://doi.org/10.1007/s10822-023-00496-3
