Abstract
The SAMPL8 blind prediction challenge, which covers acid/base dissociation constants (pKa) and distribution coefficients (logD), was addressed with the Conductor like Screening Model for Realistic Solvation (COSMO-RS). The COSMOtherm implementation of COSMO-RS, combined with rigorous conformational sampling, yielded logD predictions with a root mean square deviation (RMSD) of 1.36 log units over all 11 compounds and seven biphasic systems of the data set, the most accurate of all logD submissions to the contest.
For the SAMPL8 pKa competition, participants were asked to report the standard state free energies of all microstates, which were then used to calculate the macroscopic pKa. We used COSMO-RS based linear free energy fit models to calculate the requested energies. Calculated and experimental pKa values were assigned on the basis of the popular transitions, i.e. the transitions predicted by the majority of the submissions. With this assignment and a model that covers both pKa and base pKa, we achieved an RMSD of 3.44 log units (18 pKa values of 14 molecules), which placed second among the six ranked submissions. Changing to an assignment based on the experimental transition curves reduces the RMSD to 1.65 log units. In addition to the ranked contribution, we submitted two more data sets, one for the standard pKa model and one for the standard base pKa model of COSMOtherm. Using the experiment-based assignment with the predictions of these two sets, we obtained an RMSD of 1.42 log units (25 pKa values of 20 molecules). The deviation mainly arises from a single outlier compound, the omission of which leads to an RMSD of 0.89 log units.
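The step from microstate free energies to a macroscopic pKa can be illustrated with a minimal sketch. It is not the challenge's evaluation script or the COSMOtherm workflow. It assumes that all free energies are given in kcal/mol at a reference pH of 0, that the microstates are grouped by hand into a protonated and a deprotonated macrostate differing by one proton, and that each macrostate is Boltzmann-averaged over its microstates. The function names and the example energies are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def macrostate_free_energy(energies, temperature=298.15):
    """Boltzmann-averaged free energy of one protonation macrostate.

    `energies` holds the standard state free energies (kcal/mol, pH 0)
    of all microstates (tautomers) that carry the same number of protons.
    """
    rt = R_KCAL * temperature
    g_min = min(energies)  # shift by the minimum for numerical stability
    partition_sum = sum(math.exp(-(g - g_min) / rt) for g in energies)
    return g_min - rt * math.log(partition_sum)


def macroscopic_pka(protonated, deprotonated, temperature=298.15):
    """Macroscopic pKa for the loss of one proton.

    Assumed convention: pKa = (G_deprotonated - G_protonated) / (RT ln 10),
    with both macrostate free energies taken at pH 0.
    """
    rt_ln10 = R_KCAL * temperature * math.log(10)
    delta_g = (macrostate_free_energy(deprotonated, temperature)
               - macrostate_free_energy(protonated, temperature))
    return delta_g / rt_ln10


# Hypothetical example: one protonated microstate and two degenerate
# deprotonated tautomers, each 6.0 kcal/mol above it at pH 0.
example_pka = macroscopic_pka([0.0], [6.0, 6.0])
```

In this convention, two degenerate deprotonated tautomers lower the macroscopic pKa by log10(2) relative to a single tautomer of the same energy, which shows why the complete set of microstates matters for the macroscopic value.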
Data Availability
All data generated or analyzed during this study are available in this published article, its supplementary information files, or reference [30].
Notes
In cases where the full protonation range could be described by the COSMOtherm pKa (SAMPL8-2) or the base pKa model (SAMPL8-10, SAMPL8-11, SAMPL8-15, SAMPL8-8), these models were used instead of the unified model.
Abbreviations
- COSMO-RS: Conductor like Screening Model for Realistic Solvation
- COSMO: Conductor like Screening Model
- DMF: Dimethylformamide
- LFER: Linear free energy relationship
- MAD: Mean absolute deviation between predicted and experimental data
- MEK: Methyl ethyl ketone
- RMSD: Root mean square deviation between predicted and experimental data
- TBME: Tert butyl methyl ether
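The two error measures defined above, RMSD and MAD, can be written out as a short sketch. The function names and the numbers in the example are hypothetical and are not taken from the SAMPL8 data.

```python
import math


def rmsd(predicted, experimental):
    """Root mean square deviation between predicted and experimental data."""
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def mad(predicted, experimental):
    """Mean absolute deviation between predicted and experimental data."""
    if not predicted or len(predicted) != len(experimental):
        raise ValueError("need two non-empty sequences of equal length")
    absolute = [abs(p - e) for p, e in zip(predicted, experimental)]
    return sum(absolute) / len(absolute)


# Hypothetical values in log units; the last pair acts as an outlier.
predicted = [1.0, 2.0, 3.0]
experimental = [1.0, 2.0, 5.0]
rmsd_all = rmsd(predicted, experimental)
mad_all = mad(predicted, experimental)
rmsd_without_outlier = rmsd(predicted[:2], experimental[:2])
```

Because the deviations are squared before averaging, a single outlier raises the RMSD more strongly than the MAD, so removing one compound can change the RMSD considerably.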
References
Bahr MN, Nandkeolyar A, Kenna JK et al (2021) Automated high throughput pKa and distribution coefficient measurements of pharmaceutical compounds for the SAMPL8 blind prediction challenge. J Comput Aided Mol Des 35:1141–1155. https://doi.org/10.1007/s10822-021-00427-0
Abramov YA (2018) Rational solvent selection for Pharmaceutical Impurity Purge. Cryst Growth Des 18:1208–1214. https://doi.org/10.1021/acs.cgd.7b01748
Moss GP, Cronin MTD (2002) Quantitative structure–permeability relationships for percutaneous absorption: re-analysis of steroid data. Int J Pharm 238:105–109. https://doi.org/10.1016/S0378-5173(02)00057-1
Mackay D, Celsie AKD, Powell DE, Parnis JM (2018) Bioconcentration, bioaccumulation, biomagnification and trophic magnification: a modelling perspective. Environ Sci: Processes Impacts 20:72–85. https://doi.org/10.1039/c7em00485k
Walker TW, Frelka N, Shen Z, Chew AK, Huber GW (2020) Recycling of multilayer plastic packaging materials by solvent-targeted recovery and precipitation. Sci Adv 6:eaba7599. https://doi.org/10.1126/sciadv.aba7599
Sánchez-Rivera KL, Zhou P, Kim MS, González Chávez LD, Grey S, Nelson K, Wang S-C, Hermans I, Zavala VM, Van Lehn RC, Huber GW (2021) Reducing Antisolvent Use in the STRAP process by enabling a temperature-controlled polymer dissolution and precipitation for the recycling of Multilayer Plastic Films. ChemSusChem 14:4317–4329. https://doi.org/10.1002/cssc.202101128
Mohan M, Keasling JD, Simmons BA, Singh S (2022) In silico COSMO-RS predictive screening of ionic liquids for the dissolution of plastic. Green Chem 24:4140–4152. https://doi.org/10.1039/d1gc03464b
Gutiérrez JP, Meindersma GW, de Haan AB (2012) COSMO-RS-Based ionic-liquid selection for extractive distillation processes. Ind Eng Chem Res 51:11518–11529. https://doi.org/10.1021/ie301506n
Janoschek L, Grozdev L, Berensmeier S (2018) Membrane-assisted extraction of monoterpenes: from in silico solvent screening towards biotechnological process application. R Soc Open Sci 5:172004–172018. https://doi.org/10.1098/rsos.172004
Yara-Varón E, Li Y, Balcells M, Canela-Garayoa R, Fabiano-Tixier AS, Chemat F (2017) Vegetable oils as alternative solvents for green Oleo-Extraction, purification and formulation of Food and Natural Products. Molecules 22:1474. https://europepmc.org/article/med/28872605
Klamt A (2016) COSMO-RS for aqueous solvation and interfaces. Fluid Phase Equilibria 407:152–158. https://doi.org/10.1016/j.fluid.2015.05.027
Klamt A, Diedenhofen M (2010) Blind prediction test of free energies of hydration with COSMO-RS. J Comput Aided Mol Des 24:357–360. https://doi.org/10.1007/s10822-010-9354-4
Klamt A, Eckert F, Reinisch J, Wichmann K (2016) Prediction of cyclohexane-water distribution coefficients with COSMO-RS on the SAMPL5 data set. J Comput Aided Mol Des 30:959–967. https://doi.org/10.1007/s10822-016-9927-y
Loschen C, Reinisch J, Klamt A (2020) COSMO-RS based predictions for the SAMPL6 logP challenge. J Comput Aided Mol Des 34:385–392. https://doi.org/10.1007/s10822-019-00259-z
Warnau J, Wichmann K, Reinisch J (2021) COSMO-RS predictions of LogP in the SAMPL7 blind challenge. J Comput Aided Mol Des 35:813–818. https://doi.org/10.1007/s10822-021-00395-5
Klamt A (1995) Conductor-like screening model for real solvents: a new approach to the quantitative calculation of solvation phenomena. J Phys Chem 99:2224–2235. https://doi.org/10.1021/j100007a062
Klamt A, Schüürmann G (1993) COSMO: a new approach to dielectric screening in solvents with explicit expressions for the screening energy and its gradient. J Chem Soc Perkin Trans 2 1993:799–805. https://doi.org/10.1039/P29930000799
Klamt A (2018) The COSMO and COSMO-RS solvation models: COSMO and COSMO-RS. Wiley Interdiscip Rev Comput Mol Sci 8:e1338. https://doi.org/10.1002/wcms.1338
Eckert F, Klamt A (2002) Fast solvent screening via quantum chemistry: COSMO-RS approach. AIChE J 48:369–385. https://doi.org/10.1002/aic.690480220
BIOVIA COSMOconf 21. Dassault Systèmes. https://www.3ds.com, Cologne, Germany
BIOVIA COSMOquick 21. Dassault Systèmes. https://www.3ds.com, Cologne, Germany
Loschen C, Klamt A (2012) COSMOquick: a novel interface for fast σ-profile composition and its application to COSMO-RS solvent screening using multiple reference solvents. Ind Eng Chem Res 51:14303–14308. https://doi.org/10.1021/ie3023675
TURBOMOLE V7.5. University of Karlsruhe and Forschungszentrum Karlsruhe GmbH, 1989–2007, TURBOMOLE GmbH, since 2007; available from http://www.turbomole.com, Karlsruhe, Germany
BIOVIA COSMObase 21. Dassault Systèmes. https://www.3ds.com, Cologne, Germany
BIOVIA COSMOtherm 21. Dassault Systèmes. https://www.3ds.com, Cologne, Germany
Klamt A, Eckert F, Diedenhofen M, Beck ME (2003) First Principles calculations of aqueous pKa values for Organic and Inorganic acids using COSMO-RS reveal an inconsistency in the slope of the pKa scale. J Phys Chem A 107(44):9380–9386. https://doi.org/10.1021/jp034688o
Eckert F, Klamt A (2006) Accurate prediction of basicity in aqueous solution with COSMO-RS. J Comput Chem 27(1):11–19. https://doi.org/10.1002/jcc.20309
Bergazin TD, Tielker N, Zhang Y et al (2021) Evaluation of log P, pKa, and log D predictions from the SAMPL7 blind challenge. J Comput Aided Mol Des 35:771–802. https://doi.org/10.1007/s10822-021-00397-3
Gunner MR, Murakami T, Rustenburg AS, Işık M, Chodera JD (2020) Standard state free energies, not pKas, are ideal for describing small molecule protonation and tautomeric states. J Comput Aided Mol Des 34:561–573. https://doi.org/10.1007/s10822-020-00280-7
Mobley DL, Amezcua M, Nandkeolyar A, Bergazin TD, Tielker N, Ray D (2023) samplchallenges/SAMPL8: 1.0.0 (1.0.0). Zenodo. https://doi.org/10.5281/zenodo.7535037
Ingram T, Richter U, Mehling T, Smirnova I (2011) Modelling of pH dependent n-octanol/water partition coefficients of ionizable pharmaceuticals. Fluid Phase Equilibria 305:197–203. https://doi.org/10.1016/j.fluid.2011.04.006
Chen C-S, Lin S-T (2016) Prediction of pH Effect on the octanol – water partition coefficient of Ionizable Pharmaceutical. Ind Eng Chem Res 55:9284–9294. https://doi.org/10.1021/acs.iecr.6b02040
Scott DC, Clymer JW (2002) Estimation of distribution coefficients from the partition coefficient and pKa. Pharm Technol 26:30–39
Dallos A, Liszi J (1995) (Liquid + liquid) equilibria of (octan-1-ol + water) at temperatures from 288.15 K to 323.15 K. J Chem Thermodyn 27:447–448. https://doi.org/10.1006/jcht.1995.0046
Lladosa E, Montón JB, de la Torre J, Martínez NF (2011) Liquid – liquid and vapor – liquid – liquid equilibrium of the 2-Butanone + 2-Butanol + water system. J Chem Eng Data 56:1755–1761. https://doi.org/10.1021/je1004643
Ashour I (2005) Liquid – liquid equilibrium of MTBE + ethanol + water and MTBE + 1-Hexanol + water over the temperature range of 288.15 to 308.15 K. J Chem Eng Data 50:113–118. https://doi.org/10.1021/je049799a
Acknowledgements
We thank the organizers for setting up the SAMPL8 challenge and the National Institutes of Health for its support of the SAMPL project via R01GM124270 to David L. Mobley (UC Irvine). We appreciate Juliana Gretz, Paul Czodrowski, Nicolas Tielker, and Stefan M. Kast of the TU Dortmund for sharing their acidity constants. MD thanks David L. Mobley and Aakankschit Nandkeolyar for fruitful discussions.
Funding
This research was funded solely by Dassault Systèmes.
Author information
Contributions
MD and FE wrote the manuscript. All authors reviewed the manuscript and approved the version to be published. FE and MD analyzed the data and interpreted the results. ST fitted the parameters of the unified pKa LFER model. MD ran the tautomer/conformer generation and performed the COSMO-RS predictions.
Ethics declarations
Competing interests
The authors declare the following competing financial interests: MD, FE, and ST are employees of Dassault Systèmes, BIOVIA. Dassault Systèmes commercially distributes the software COSMOtherm, COSMOconf, COSMOquick, COSMObase, and TURBOMOLE, which were used in the present study.
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Diedenhofen, M., Eckert, F. & Terzi, S. COSMO-RS blind prediction of distribution coefficients and aqueous pKa values from the SAMPL8 challenge. J Comput Aided Mol Des 37, 395–405 (2023). https://doi.org/10.1007/s10822-023-00514-4
DOI: https://doi.org/10.1007/s10822-023-00514-4