COSMO-RS blind prediction of distribution coefficients and aqueous pKa values from the SAMPL8 challenge

Journal of Computer-Aided Molecular Design

Abstract

The SAMPL8 blind prediction challenge, which covers acid/base dissociation constants (pKa) and distribution coefficients (logD), was addressed with the Conductor-like Screening Model for Realistic Solvation (COSMO-RS). The COSMOtherm implementation of COSMO-RS, combined with rigorous conformational sampling, yielded logD predictions with a root mean square deviation (RMSD) of 1.36 log units over all 11 compounds and seven biphasic systems of the data set, the most accurate of all logD submissions to the contest.
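As a reminder of how the quoted error metric is formed, the RMSD and the mean absolute deviation (MAD) between predicted and experimental values can be sketched as follows. This is a minimal illustration, not the evaluation script of the challenge; the function names and the numbers in the usage lines are hypothetical.

```python
import math


def rmsd(predicted, experimental):
    """Root mean square deviation between two equally long sequences."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def mad(predicted, experimental):
    """Mean absolute deviation between two equally long sequences."""
    if len(predicted) != len(experimental) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    absolute = [abs(p - e) for p, e in zip(predicted, experimental)]
    return sum(absolute) / len(absolute)


# Hypothetical logD values, for illustration only (not SAMPL8 data).
pred = [1.0, 2.0, 3.0]
expt = [1.0, 2.0, 5.0]
print(f"RMSD = {rmsd(pred, expt):.2f}, MAD = {mad(pred, expt):.2f}")
```

Because the deviations are squared before averaging, a single large error raises the RMSD more strongly than the MAD.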

For the SAMPL8 pKa competition, participants were asked to report the standard state free energies of all microstates, which were then used to calculate the macroscopic pKa values. We used COSMO-RS based linear free energy fit models to calculate the requested energies. Calculated and experimental pKa values were assigned on the basis of the popular transitions, i.e. the transitions that were predicted by the majority of the submissions. With this assignment and a model that covers both pKa and base pKa, we achieved an RMSD of 3.44 log units (18 pKa values of 14 molecules), which placed second among the six ranked submissions. Changing to an assignment based on the experimental transition curves reduces the RMSD to 1.65 log units. In addition to the ranked contribution, we submitted two more data sets, one for the standard pKa model and one for the standard base pKa model of COSMOtherm. Using the experiment-based assignment with the predictions of these two sets, we obtained an RMSD of 1.42 log units (25 pKa values of 20 molecules). The deviation arises mainly from a single outlier compound, the omission of which leads to an RMSD of 0.89 log units.
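The step from microstate free energies to a macroscopic pKa can be sketched as below. The sketch assumes a particular convention that is not taken from the challenge instructions: each microstate is given as a pair (number of bound protons relative to a reference state, standard state free energy at pH 0 in kcal/mol relative to that reference), and a macroscopic pKa is the pH at which two adjacent protonation levels are equally populated. The function name, the temperature, and the example energies are hypothetical.

```python
import math
from collections import defaultdict

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol K)


def macroscopic_pkas(microstates, temperature=298.15):
    """Return {(n, n - 1): pKa} for adjacent protonation levels.

    Assumed convention: microstates is an iterable of (n_protons, g_kcal),
    with g_kcal the free energy at pH 0 relative to a reference state.
    Setting the populations of levels n and n - 1 equal gives
    pKa = log10(Z_n / Z_{n-1}), where Z_n = sum_i exp(-g_i / RT).
    """
    rt = R_KCAL_PER_MOL_K * temperature
    exponents = defaultdict(list)
    for n_protons, g_kcal in microstates:
        exponents[n_protons].append(-g_kcal / rt)

    # ln Z_n per protonation level, via a numerically stable log-sum-exp.
    ln_z = {}
    for n_protons, terms in exponents.items():
        largest = max(terms)
        ln_z[n_protons] = largest + math.log(
            sum(math.exp(t - largest) for t in terms)
        )

    pkas = {}
    for n_protons in sorted(ln_z):
        if n_protons - 1 in ln_z:
            delta = ln_z[n_protons] - ln_z[n_protons - 1]
            pkas[(n_protons, n_protons - 1)] = delta / math.log(10)
    return pkas


# Hypothetical example: one deprotonated reference state and two
# equivalent protonated tautomers, each with a microscopic pKa of 5.
rt_ln10 = R_KCAL_PER_MOL_K * 298.15 * math.log(10)
states = [(0, 0.0), (1, -5.0 * rt_ln10), (1, -5.0 * rt_ln10)]
print(macroscopic_pkas(states))
```

In this example the two equivalent protonated tautomers shift the macroscopic pKa by log10(2) above the microscopic value, which is why the microstate energies rather than individual pKa values are the natural quantity to report.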

Data Availability

All data generated or analyzed during this study are available in this published article, its supplementary information files, or reference [30].

Notes

  1. In cases where the full protonation range could be described by the COSMOtherm pKa (SAMPL8-2) or the base pKa model (SAMPL8-10, SAMPL8-11, SAMPL8-15, SAMPL8-8), these models were used instead of the unified model.

Abbreviations

COSMO-RS: Conductor-like Screening Model for Realistic Solvation

COSMO: Conductor-like Screening Model

DMF: Dimethylformamide

LFER: Linear free energy relationship

MAD: Mean absolute deviation between predicted and experimental data

MEK: Methyl ethyl ketone

RMSD: Root mean square deviation between predicted and experimental data

TBME: tert-Butyl methyl ether

References

  1. Bahr MN, Nandkeolyar A, Kenna JK et al (2021) Automated high throughput pKa and distribution coefficient measurements of pharmaceutical compounds for the SAMPL8 blind prediction challenge. J Comput Aided Mol Des 35:1141–1155. https://doi.org/10.1007/s10822-021-00427-0

  2. Abramov YA (2018) Rational solvent selection for Pharmaceutical Impurity Purge. Cryst Growth Des 18:1208–1214. https://doi.org/10.1021/acs.cgd.7b01748

  3. Moss GP, Cronin MTD (2002) Quantitative structure–permeability relationships for percutaneous absorption: re-analysis of steroid data. Int J Pharm 238:105–109. https://doi.org/10.1016/S0378-5173(02)00057-1

  4. Mackay D, Celsie AKD, Powell DE, Parnis JM (2018) Bioconcentration, bioaccumulation, biomagnification and trophic magnification: a modelling perspective. Environ Sci: Processes Impacts 20:72–85. https://doi.org/10.1039/c7em00485k

  5. Walker TW, Frelka N, Shen Z, Chew AK, Huber GW (2020) Recycling of multilayer plastic packaging materials by solvent-targeted recovery and precipitation. Sci Adv 6:eaba7599. https://doi.org/10.1126/sciadv.aba7599

  6. Sánchez-Rivera KL, Zhou P, Kim MS, González Chávez LD, Grey S, Nelson K, Wang S-C, Hermans I, Zavala VM, Van Lehn RC, Huber GW (2021) Reducing Antisolvent Use in the STRAP process by enabling a temperature-controlled polymer dissolution and precipitation for the recycling of Multilayer Plastic Films. ChemSusChem 14:4317–4329. https://doi.org/10.1002/cssc.202101128

  7. Mohan M, Keasling JD, Simmons BA, Singh S (2022) In silico COSMO-RS predictive screening of ionic liquids for the dissolution of plastic. Green Chem 24:4140–4152. https://doi.org/10.1039/d1gc03464b

  8. Gutiérrez JP, Meindersma GW, de Haan AB (2012) COSMO-RS-Based ionic-liquid selection for extractive distillation processes. Ind Eng Chem Res 51:11518–11529. https://doi.org/10.1021/ie301506n

  9. Janoschek L, Grozdev L, Berensmeier S (2018) Membrane-assisted extraction of monoterpenes: from in silico solvent screening towards biotechnological process application. R Soc Open Sci 5:172004–172018. https://doi.org/10.1098/rsos.172004

  10. Yara-Varón E, Li Y, Balcells M, Canela-Garayoa R, Fabiano-Tixier AS, Chemat F (2017) Vegetable oils as alternative solvents for green Oleo-Extraction, purification and formulation of Food and Natural Products. Molecules 22:1474. https://europepmc.org/article/med/28872605

  11. Klamt A (2016) COSMO-RS for aqueous solvation and interfaces. Fluid Phase Equilibria 407:152–158. https://doi.org/10.1016/j.fluid.2015.05.027

  12. Klamt A, Diedenhofen M (2010) Blind prediction test of free energies of hydration with COSMO-RS. J Comput Aided Mol Des 24:357–360. https://doi.org/10.1007/s10822-010-9354-4

  13. Klamt A, Eckert F, Reinisch J, Wichmann K (2016) Prediction of cyclohexane-water distribution coefficients with COSMO-RS on the SAMPL5 data set. J Comput Aided Mol Des 30:959–967. https://doi.org/10.1007/s10822-016-9927-y

  14. Loschen C, Reinisch J, Klamt A (2020) COSMO-RS based predictions for the SAMPL6 logP challenge. J Comput Aided Mol Des 34:385–392. https://doi.org/10.1007/s10822-019-00259-z

  15. Warnau J, Wichmann K, Reinisch J (2021) COSMO-RS predictions of LogP in the SAMPL7 blind challenge. J Comput Aided Mol Des 35:813–818. https://doi.org/10.1007/s10822-021-00395-5

  16. Klamt A (1995) Conductor-like screening model for real solvents: a new approach to the quantitative calculation of solvation phenomena. J Phys Chem 99:2224–2235. https://doi.org/10.1021/j100007a062

  17. Klamt A, Schüürmann G (1993) COSMO: a new approach to dielectric screening in solvents with explicit expressions for the screening energy and its gradient. J Chem Soc Perkin Trans 2 1993:799–805. https://doi.org/10.1039/P29930000799

  18. Klamt A (2018) The COSMO and COSMO-RS solvation models: COSMO and COSMO-RS. Wiley Interdiscip Rev Comput Mol Sci 8:e1338. https://doi.org/10.1002/wcms.1338

  19. Eckert F, Klamt A (2002) Fast solvent screening via quantum chemistry: COSMO-RS approach. AIChE J 48:369–385. https://doi.org/10.1002/aic.690480220

  20. BIOVIA COSMOconf 21. Dassault Systèmes. https://www.3ds.com, Cologne, Germany

  21. BIOVIA COSMOquick 21. Dassault Systèmes. https://www.3ds.com, Cologne, Germany

  22. Loschen C, Klamt A (2012) COSMOquick: a novel interface for fast σ-profile composition and its application to COSMO-RS solvent screening using multiple reference solvents. Ind Eng Chem Res 51:14303–14308. https://doi.org/10.1021/ie3023675

  23. TURBOMOLE V7.5. University of Karlsruhe and Forschungszentrum Karlsruhe GmbH, 1989–2007, TURBOMOLE GmbH, since 2007; available from http://www.turbomole.com, Karlsruhe, Germany

  24. BIOVIA COSMObase 21. Dassault Systèmes. https://www.3ds.com, Cologne, Germany

  25. BIOVIA COSMOtherm 21. Dassault Systèmes. https://www.3ds.com, Cologne, Germany

  26. Klamt A, Eckert F, Diedenhofen M, Beck ME (2003) First Principles calculations of aqueous pKa values for Organic and Inorganic acids using COSMO-RS reveal an inconsistency in the slope of the pKa scale. J Phys Chem A 107(44):9380–9386. https://doi.org/10.1021/jp034688o

  27. Eckert F, Klamt A (2006) Accurate prediction of basicity in aqueous solution with COSMO-RS. J Comput Chem 27(1):11–19. https://doi.org/10.1002/jcc.20309

  28. Bergazin TD, Tielker N, Zhang Y et al (2021) Evaluation of log P, pKa, and log D predictions from the SAMPL7 blind challenge. J Comput Aided Mol Des 35:771–802. https://doi.org/10.1007/s10822-021-00397-3

  29. Gunner MR, Murakami T, Rustenburg AS, Işık M, Chodera JD (2020) Standard state free energies, not pKas, are ideal for describing small molecule protonation and tautomeric states. J Comput Aided Mol Des 34:561–573. https://doi.org/10.1007/s10822-020-00280-7

  30. Mobley DL, Amezcua M, Nandkeolyar A, Bergazin TD, Tielker N, Ray D (2023) samplchallenges/SAMPL8: 1.0.0 (1.0.0). Zenodo. https://doi.org/10.5281/zenodo.7535037

  31. Ingram T, Richter U, Mehling T, Smirnova I (2011) Modelling of pH dependent n-octanol/water partition coefficients of ionizable pharmaceuticals. Fluid Phase Equilibria 305:197–203. https://doi.org/10.1016/j.fluid.2011.04.006

  32. Chen C-S, Lin S-T (2016) Prediction of pH Effect on the octanol–water partition coefficient of Ionizable Pharmaceutical. Ind Eng Chem Res 55:9284–9294. https://doi.org/10.1021/acs.iecr.6b02040

  33. Scott DC, Clymer JW (2002) Estimation of distribution coefficients from the partition coefficient and pKa. Pharm Technol 26:30–39

  34. Dallos A, Liszi J (1995) (Liquid + liquid) equilibria of (octan-1-ol + water) at temperatures from 288.15 K to 323.15 K. J Chem Thermodyn 27:447–448. https://doi.org/10.1006/jcht.1995.0046

  35. Lladosa E, Montón JB, de la Torre J, Martínez NF (2011) Liquid–liquid and vapor–liquid–liquid equilibrium of the 2-Butanone + 2-Butanol + water system. J Chem Eng Data 56:1755–1761. https://doi.org/10.1021/je1004643

  36. Ashour I (2005) Liquid–liquid equilibrium of MTBE + ethanol + water and MTBE + 1-Hexanol + water over the temperature range of 288.15 to 308.15 K. J Chem Eng Data 50:113–118. https://doi.org/10.1021/je049799a

Acknowledgements

We thank the organizers for setting up the SAMPL8 challenge and the National Institutes of Health for its support of the SAMPL project via R01GM124270 to David L. Mobley (UC Irvine). We appreciate Juliana Gretz, Paul Czodrowski, Nicolas Tielker, and Stefan M. Kast of the TU Dortmund for sharing their acidity constants. MD thanks David L. Mobley and Aakankschit Nandkeolyar for fruitful discussions.

Funding

This research was funded solely by Dassault Systèmes.

Author information

Contributions

MD and FE wrote the manuscript. All authors reviewed the manuscript and approved the version to be published. FE and MD analyzed the data and interpreted the results. ST fitted the parameters of the unified pKa LFER model. MD ran the tautomer/conformer generation and performed the COSMO-RS predictions.

Corresponding author

Correspondence to Michael Diedenhofen.

Ethics declarations

Competing interests

The authors declare the following competing financial interests: MD, FE, and ST are employees of Dassault Systèmes, BIOVIA. Dassault Systèmes commercially distributes the software COSMOtherm, COSMOconf, COSMOquick, COSMObase, and TURBOMOLE, which were used in the present study.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Diedenhofen, M., Eckert, F. & Terzi, S. COSMO-RS blind prediction of distribution coefficients and aqueous pKa values from the SAMPL8 challenge. J Comput Aided Mol Des 37, 395–405 (2023). https://doi.org/10.1007/s10822-023-00514-4

  • DOI: https://doi.org/10.1007/s10822-023-00514-4
