Introduction

The carbonic anhydrase (CA) superfamily of metalloenzymes is a class of enzymes which catalyse the reversible hydration of carbon dioxide. Fifteen isoforms with high structural similarity are found in humans, playing a role in CO2 transport, pH regulation, lipogenesis, gluconeogenesis, and ureagenesis [18, 68]. Different human CA (hCA) isoforms have been linked to a wide range of diseases including glaucoma and osteoporosis (hCA II), obesity (hCA VA and VB) and various cancers (hCA IX and XII) [8, 4, 10, 15, 41, 61, 69]. Consequently, designing CA inhibitors (CAIs) is of significant interest. To date, more than twenty-five drug compounds have been used clinically, with multiple others in clinical trials [45].

The general structure of hCAs is shown in Fig. 1. The active sites of all hCAs contain a Zn(II) ion that is coordinated to three histidine residues and a hydroxyl molecule in a tetrahedral arrangement [36]. The active site cavity has a narrow conical shape, giving a well-defined binding pocket. It is widely accepted that the hydration of CO2 is triggered by proton transfer from the Zn(II)-bound water to H64 to form a hydroxide anion followed by a nucleophilic attack of CO2 to form bicarbonate ion [3, 31].

Fig. 1
figure 1

Structure of hCA II (PDB ID 2CBA). Protein backbone coloured by secondary structure. Purple represents α-helix, Blue represents a 310 helix, yellow represent an extended β-sheet, white represents a coil, and brown represents a β-bridge

The most common form of hCA inhibition involves a ligand replacing the coordinating water and binding directly to the zinc(II) ion. In particular, sulfonamide groups are widely used as the zinc-binding head group of CA inhibitors [70]. Clinical antiglaucoma agents such as acetazolamide, and brinzolamide contain a sulfonamide head group connected to a heteroaromatic ring structure, and bind at low nanomolar concentrations to several isoforms [57, 76]. Similarly, arylsulfonamides such as the ureido-substituted arylsulfonamide SLC-0111 show excellent potency when binding to hCA IX, with low nanomolar inhibition constants. This compound is currently in stage II clinical trials for the treatment of metastatic hypoxic cancers [46, 54].

A major challenge in targeting hCAs is posed by the high degree of structural similarity between the different isoforms [58]. As a result, selective inhibition of individual isoforms is very challenging. Given the broad distribution, and the wide range of applications of different hCA isoforms throughout the body, the use of non-selective inhibitors can lead to off-target binding. One example of this is the drug acetazolamide, which has been shown to have applications for glaucoma, altitude sickness, and as a seizure medication, but has fallen out of clinical use due to the number of side effects [8, 35, 61]. Consequently, the design of novel hCA inhibitors must consider both the potency and the selectivity of the ligand. Generally, this selectivity is gained through appending chemical groups to the ligand which reach into the more chemically diverse outer region of the binding site [7, 49, 71]. For this reason, it is important to develop tools which can accurately model inhibitor binding affinity both within and between isozymes.

Molecular docking, molecular dynamics (MD) simulations, and more recently the use of machine learning are commonly used to predict protein–ligand binding energies. For example, molecular docking is often used to predict binding poses and for pre-screening large libraries of compounds [13, 19, 20, 63, 72, 81]. In the context of CA inhibitor design, molecular docking has been widely used to screen ligands against a variety of isoforms [17, 27, 30, 38]. Recently, a docking protocol was used to screen a library of compounds against CA VII and identified four ligands with low nanomolar potency [23]. However, the predictive power of docking methods for screening hCA ligands was limited until a scoring function (Autodock4Zn) was optimised to better describe the interactions between the ligand and the Zn(II) centre [64]. One recent assessment study of seven docking programs indicated this approach had the highest scoring and ranking powers for a set of 97 zinc metalloenzymes ligand complexes [16].

More theoretically robust but computationally intensive MD studies of the energetics of CA binding are less common. Rossi and co-workers used the AMBER forcefield and free energy perturbation to predict the relative binding affinities of three ligands to hCA II within 1 kcal mol−1 of experimental values [62]. In another study, a full reaction profile of acetazolamide binding to hCA II was acquired via steered molecular dynamics simulations, and the resulting binding affinity was in good agreement with experiment [77]. As an alternative, the MM/GBSA (or PBSA) approaches are relatively efficient MD-based methods for estimating binding affinities [78]. These ‘end-state’ methods calculate the binding energy based on the strength of the intermolecular interactions between the protein and ligand using configurations sampled from the MD trajectory. Solvent effects are accounted for in an implicit manner, with either a Poisson-Boltzmann (MM/PBSA) or generalized born (MM/GBSA) model. The non-electrostatic component of the solvent is recovered from the solvent accessible surface area.

MM/GBSA has previously been applied to the problem of ligand binding affinities of CAIs with some success. Guimarães used the OPLS-2005/TIP4P forcefields to compare the ability of MM/GBSA and two FEP-based approaches, Desmond and MCPRO +, to predict trends in the binding of a set of 13 simple substituted arylsulfonamides to hCA II. This study used default OPLS-AA charges, which are optimised to reproduce empirical properties such as heat of vaporisations and solvent densities. The MM/GBSA approach was shown to have a moderately strong correlation score of R2 = 0.60, and its performance was comparable with FEP approaches [28]. Further, work in the Meuwly group demonstrated that MM/GBSA binding energies yielded moderate correlation with experimentally derived binding free energies (R2 = 0.55) for a set of 17 diverse sulfonamide ligands [65].

Finally, our recent work examined whether the inclusion of a pKa correction in MM/GBSA simulations improves their correlation with experiment. This is because experimental binding constants take into account the deprotonation of the sulfonamide ligand whereas this energetic cost is neglected in classical MD simulations since the ligand is modelled in its deprotonated form. That study concluded that the inclusion of this pKa correction did not result in a noticeable change in the correlation with experimental data presumably because the uncertainty in MM/GBSA predictions are higher than these corrections [32].

In light of the promising performance of MM/GBSA as well as the recent improvements made to docking programs such as the extension of a zinc-optimised forcefield in Autodock Vina, this paper aims to address the following questions with the view to identifying effective protocols for prediction of relative CA-inhibitor binding energies:

  1. (1)

    Does the use of computationally expensive MD methods improve the prediction of the relative binding affinities compared to molecular docking?

  2. (2)

    Is it possible to further optimise the performance of MM/GBSA? Specifically, the performance of these methods is highly sensitive to the choice of atomic charges used to describe the electrostatic interactions between the enzyme and ligand [79]. For example, Guimarães employed empirically optimised OPLS charges while Meuwly used DFT derived Mulliken charges. In this paper, we sought to identify an optimal approach (level of theory and charge calculation scheme) that will maximise the correlation between MM/GBSA with experimental binding constants when considered across a diverse dataset.

Towards this end, we have assembled a dataset of 32 chemically diverse zinc-binding hCA inhibitors with experimentally measured binding free energies in hCA II (Fig. 2) [11, 37, 42, 53, 54, 65]. The first 15 ligands consist of arylsulfonamides studied in the Meuwly group [65]. In that study, two para-substituted arylsulfonamides were shown to be unstable during the MD simulations, and as such were not included here. Ligands 1625, and 3032 are a set of more chemically diverse binding groups, containing multiple unique heterocycles. Finally, 26 is ureidobenzenesulfonamide (SLC-0111) [46] that is currently in clinical trials as an anti-cancer agent, and 27-29 are related derivatives [13]. This test set is used to evaluate the performance of the zinc optimised AD4Zn forcefield used in conjunction with the Autodock Vina program as well as MM/GBSA calculations employing different ligand atomic charges obtained from Mulliken, electrostatic potential (ESP) and natural population analysis (NPA) schemes calculated at various levels of theory, viz. Hartree Fock, B3LYP-D3(BJ), and M06-2X. These atomic charge calculation schemes are chosen because they are used in popular force fields and/or used in previous MD simulations of CA-inhibitor complexes [43, 65].

Fig. 2
figure 2

Dataset of CA II inhibitors with experimental binding affinities

Computational details

Molecular docking

Docking was performed with the AutoDock Vina program with the zinc metalloenzyme optimised AutoDock4Zn [64]. The three dimensional protein structure (PDB ID: 1LUG, resolution 0.95 Å) [5] was obtained from the RCSB PDB database [6]. Ligand structures were obtained from crystal structures, or built in IQmol [25], and polar hydrogens were added in the AutoDock Tools suite [51]. Ligand structures were prepared with the Meeko package for docking [47] and docked poses were visualised with the Autodock Tools suite. The docking was conducted in triplicate with an exhaustiveness parameter of 32.

Molecular dynamics simulations

The starting structure of hCA II was obtained from the crystal structure (PDB ID: 1LUG). Protein protonation state was determined with the H +  + server [26] at a pH of 6.5 and a salinity of 0.15 M, and missing hydrogens were added by this server. Ligands were protonated with the AmberTools “Reduce”program [12]. One sulfonamide hydrogen was manually removed to form the ligand in its anionic state. The ligand was then cleaned with the pdb4amber function in AmberTools. TIP3P water molecules [33] were added to create a border of 8 Å with the solvate function in VMD, resulting in a box size of 63 × 60 × 71 Å. Counterions were added to maintain electroneutrality via the “autoionize” VMD plugin. Initial ligand configurations were obtained from crystal structures where available, or through docking with the Autodock Vina program using the AutoDock4Zn forcefield (ligands 9, 10, 13, 14, 16–18).

Simulations were conducted in NAMD 2.13 [55, 56] with periodic boundary conditions at a constant temperature of 300 K. The Langevin algorithm was used to keep a constant pressure of 1 bar with the Noose-Hoover Langevin Piston method using a timestep of 2 fs. The Particle Mesh Ewald algorithm [80] was used to calculate long range electrostatics, with a distance cut-off of 12 Å. The bonds of the TIP3P water molecules were kept rigid using the RATTLE algorithm. Harmonic restraints were placed on the backbone of the protein with a force constant of 2 kcal mol−1 Å−1. The system was minimized for 10 000 steps and heated to 300 K over 400 ps. The harmonic restraints were then gradually reduced to 0.1 kcal mol−1 Å−1 over 2.5 ns. The system was subsequently equilibrated for 4 ns, and production runs were carried out for 4 ns. For each ligand, 5 independent trajectories were generated. The CHARMM protein [9] and CGenFF [73] forcefields were used to model the complex. Ligand atom types and bonded parameters were assigned from the CGenFF server [74, 75]. For the sulfonamide ligands, the N–S–O angle was set to 111.00° with a force constant of 80 kcal mol−1 rad−2 to better reproduce the sulfonamide geometry.

MM/GBSA

In MM/GBSA, the free energy of a system is estimated from Eq. 1

$$\Delta G = E_{bond} + E_{el} + E_{vdw} + G_{polar} + G_{npolar} - TS$$
(1)

where Ebond, Eel, and EvdW correspond to the standard MM bonded, electrostatic, and vdW energy terms. Gpolar and Gnpolar are the polar and non-polar contributions to the solvation free energy respectively. The final term is the temperature T multiplied by the entropy S, estimated from a normal-mode analysis or quasi-harmonic approximation approach. For the change in free energy as the free ligand (L) binds to the protein (P) is typically computed through Eq. 2

$$\Delta G_{bind} = \left\langle {G_{P - L} - G_{P} - G_{L} } \right\rangle_{PL}$$
(2)

Here, the binding affinity is evaluated as the difference in the energy between the protein—ligand complex PL, and the isolated protein P and ligand L, averaged over all configurations sampled from the MD simulation of the protein–ligand (PL) complex. The inclusion of entropic effects is typically the most computationally intensive part of the MM/GBSA analysis. Consequently, this effect is often neglected in MM/GBSA calculations. Meuwly’s work demonstrated the inclusion of entropy did not improve the prediction of trends in binding affinities for hCA inhibitors [65] and hence it is not included in this work. Therefore, the binding energy is calculated from Eq. 3.

$$\Delta E_{MM} + \Delta G_{S} = \left\langle {(E_{gas}^{PL} + G_{polar}^{PL} + G_{npolar}^{PL} ) - (E_{gas}^{P} + G_{polar}^{P} + G_{npolar}^{P} ) - (E_{gas}^{L} + G_{polar}^{L} + G_{npolar}^{L} )} \right\rangle_{PL}$$
(3)

To save on computational cost and to reduce the noise in the calculations, it is common that each term is evaluated on frames from the trajectory of the bound complex (indicated by <  > PL) [24, 34]. Hence, the reorganisation energy needed to change the conformational state of the unbound protein and ligand are also not considered.

The polar and non-polar contributions to the solvation free energy is calculated using a Generalized Born solvent model and consideration of the solvent accessible surface area [50] MM/GBSA energies were evaluated with the MMPBSA.py script in the AmberTools21 package [12, 29, 44]. Frames were sampled at 20 ps intervals from 4 ns production runs, as the binding affinity was found to have converged by this point. A salt concentration of 0.15 M was specified for the MM/GBSA calculations.

QM calculation of atomic charges

QM calculations were performed using the Gaussian16 package [22]. Ligand geometries were optimised from their bound pose with either HF, M06-2X, or B3LYP-D3(BJ). All calculations were performed with the 6-31G(d,p) basis set. ESP charges from the Merz-Kollman scheme [66], Mulliken [52], and NPA charges [60] were evaluated from the optimised geometry at the corresponding level of theory in the gas phase. As recommended in previous MD studies [43, 65], the atomic charge for the Zn atom was set to + 1, in close agreement to the charge from a B3LYP-D3(BJ)/6-31G(d,p) calculation of a cluster containing a zinc, arylsulfonamide 1, and three methylimidazole rings. The zinc atomic charge was found to be 0.914 (Mulliken), 0.643 (ESP), 1.258 (NPA).

Results and discussion

Docking of ligands with Autodock Vina

To address the first question posed in the introduction, we have employed the AutoDock4Zn force field and Autodock Vina program to predict the binding affinities of the test set of 32 ligands docked in their deprotonated form. In a recent assessment study, this docking program was found to perform very well for zinc metalloenzymes [16]. In this study, we see similarly good performance for the test set of 32 ligands where the correlation between the predicted and experimental binding affinity is shown in Fig. 3.

Fig. 3
figure 3

Predicted binding affinity of the full dataset of ligands by docking with the AutoDock4Zn force field in AutoDock Vina. Trendline shown with all data points (dashed) and with outliers removed (solid). Outliers denoted with a hollow symbol

When the entire dataset is considered, Autodock Vina yielded a relatively weak correlation with experimental binding affinities (R2 = 0.41). However, if two notable outliers, ligands 22 and 29 were removed, the correlation is significantly improved (R2 = 0.64). Ligand 22 has an experimental pKa of 6.3 [37] which is about 2–4 pKa units lower compared to most of the other ligands. Including the effect of deprotonation is expected to increase the binding energy of 22 by approximately 2 to 5 kcal mol−1 (more negative) relative to the other ligands which would bring it closer to the trend line. Indeed, Figure S1 in the Supporting Information shows that for ligands where experimental pKa values are available, the inclusion of the pKa correction in the docking scores improves the correlation with experimental values. Interestingly, 28 is structurally very similar to 26, 27 and 29, however it is the only one of the four analogues that is an outlier.

Despite the reasonably good correlation between the docking scores and experimental binding energies, there was a consistent difference between the predicted bound pose of the ligands compared to crystal structures. For the 24 ligands with crystal structures, the average RMSD between the crystal structure pose and lowest energy docked pose is > 6 Å. No ligands had a lowest energy binding mode with an RMSD of less than 2.5 Å from the crystal structure—in one case, 25 is predicted to bind outside of the active site. Two example ligands, arylsulfonamides 4 and 11 are shown in Fig. 4. In the crystal structures of their complexes with hCA II, the sulfonamide oxygen is also in close proximity to the Zn(II) ion (ca. 3 Å) which resembles a bidentate binding mode. When 4 is docked into the protein, the predicted binding pose is unable to correctly model the bidentate zinc—ligand interaction where the Zn–O distance is about 1 Å larger than in the crystal structure. As a result, the orientation of the phenyl ring in the docked pose and crystal structure are significantly different. On the other hand, 11 is predicted to have the correct bidentate zinc coordination, however the position of the long tail is incorrect. When the docked poses with the smallest deviation from the crystal structure is considered regardless of the corresponding binding energy, the mean RMSD reduces to 4.30 Å. Ligand 3 has the lowest RMSD of 1.9 Å, for a binding mode 0.23 kcal mol−1 higher in energy than the lowest energy pose. However, for 4 ligands—11, 12, 23 and 24, the AutoDock Vina program is unable to predict a docked pose within 6 Å of the crystal structure. When the energy of the binding pose with the smallest deviation from the crystal structure is used, no correlation with experiment is observed (R2 = 0.09).

Fig. 4
figure 4

Comparison of crystal and docked binding poses of arylsulfonamide ligands 4 (left) and 11 (right). The crystal structure pose is shown in translucent stick and ball, and the docked pose is shown as full colour liquorice. Protein residues shown coloured by their residue type (non-polar residues in brown, polar residues in green, acidic residues in blue). Zn-sulfonamide oxygen (Zn-O) distances in the docked poses and crystal structures are also displayed

Validation of bonded parameters to maintain the zinc coordination geometry

In MD simulations of ligand binding to hCAs, it is important to preserve the tetrahedral coordination environment around Zn(II). Previously, both “bonded” [1, 40, 43] and “non-bonded” [67] approaches have been used to retain the tetrahedral zinc coordination. In the former, force field parameters are developed to model the Zn-ligand and Zn-Histidine bonds while in the latter, no bonded parameters are defined for Zn.

Lin and Wang systematically derived bonded parameters for zinc containing systems in the AMBER force field [40]. Similarly, Lu and Voth also developed bonded parameters defining the Zn(His)3 complex for carbonic anhydrase [43]. These bonded parameters were used in a partially bonded model in work within the Meuwly group [65]. In this approach, bonds are defined linking each of the three histidine nitrogen atoms to the zinc atom, whereas the bond between the zinc and sulfonamide nitrogen is retained using collective variables and there was a moderately strong correlation between their MM/GBSA and experimental binding energies [21]. Inspired by these efforts, a similar partially bonded model was applied here. As the choice of zinc Lennard–Jones parameters can affect the coordination in the active site [48], two different vdW parameters were tested: the default CHARMM Zn2+ Lennard-Jones parameters, and one published Lu and Voth. In the latter, the εmin (well depth at the minimum) is smaller than the parameter in the CHARMM forcefield, and so the attractive force due to the interaction is smaller. The use of the CHARMM Lennard–Jones parameters led to ligands adopting an incorrect coordination geometry, where the sulfonamide nitrogen is displaced by a sulfonamide oxygen in the coordination shell, with the trajectory-averaged distance between the zinc and sulfonamide nitrogen of 4.17 Å, which is 2.2 Å greater than found in the crystal structure. (Fig. 5). The use of the Zn(II) Lennard–Jones parameters optimised by Lu and Voth prevents this change in binding geometry and hence was used for all subsequent simulations [43]. It is clear that regardless of the choice of zinc vdW parameter, the ligand loses its bidentate binding mode, and instead retains a monodentate interaction with the zinc. The trajectory-averaged zinc–sulfonamide oxygen distance is 1.1 Å larger than the crystal structure for simulations with the Voth Zn (II) parameter. The zinc–histidine bond distance was restrained to a distance of 2.0 Å to better reproduce crystal structure geometries. A complete list of parameters can be found in Table 1 and Fig. 6.

Fig. 5
figure 5

Representative binding pose of Ligand 4 using different Lennard–Jones parameters. Crystal structure (PDB ID 6RH4, left), Snapshots from a simulation of the solvated protein–ligand system with the zinc(II) vdW radii optimised by Lu and Voth (centre), and snapshot from a simulation of the solvated protein–ligand system with the CHARMM zinc(II) vdW radii (right). Only ligand and Zn(His)3 binding site are depicted. Average interatomic distances over a 4 ns production run are shown

Table 1 Bonded and non-bonded parameters used to maintain the zinc tetrahedral geometry in conjunction with the CHARMM force field.
Fig. 6
figure 6

Atom typing for Zn(II) and coordinating histidine residues

The distribution of Zn-N(sulfonamide) distance along a MD trajectory for CAII and ligand 4 is plotted in Fig. 7 which coincides reasonably well with the crystal structure distance.

Fig. 7
figure 7

Normalized histogram of the distance between the zinc and sulfonamide nitrogen over an 8 ns trajectory for Ligand 4. Red line represents the crystal structure distance of 1.97 Å (PDB ID 6RH4)

Effect of QM ligand charges on MM/GBSA binding energies

Having established a suitable set of force field parameters, this section focuses on conducting MD simulations using different atomic charges to represent the ligand and the effect this has on the resulting MM/GBSA binding energies. Atomic charges were calculated using the Mulliken, ESP, and NPA schemes at the HF/6-31G(d,p), B3LYP-D3(BJ)/6-31G(d,p), and M06-2X/6-31G(d,p) levels of theory. These charges were compared for their ability to predict trends in binding affinities for a set of 15 ligands previously studied in the Meuwly group (ligands 115 from Fig. 2) [65]. In that work, a correlation coefficient of 0.55 was achieved for these 15 ligands which is somewhat lower compared to the values obtained in this work (vide infra). The effect of atomic charges on the binding affinity is compared in Figs. 8, 9 and 10.

Fig. 8
figure 8

Predicted binding affinities of ligands 115 from MM/GBSA simulations with ESP, Mulliken, and NPA charges from a B3LYP-D3(BJ)/6-31G(d,p) calculation on the ligands. MM/GBSA energies averaged from 5 independent 4 ns production runs

Fig. 9
figure 9

Predicted binding affinities of ligands 115 from MM/GBSA simulations with ESP, Mulliken, and NPA charges from an M06-2X/6-31G(d,p) calculation on the ligands. MM/GBSA energies averaged from 5 independent 4 ns production runs

Fig. 10
figure 10

Predicted binding affinities of ligands 115 from MM/GBSA simulations with ESP, Mulliken, and NPA charges from a HF/6-31G(d,p) calculation on the ligands. MM/GBSA energies averaged from 5 independent 4 ns production runs

For this set of ligands, the use of ESP atomic charges consistently gave the best correlation between the experimental and MM/GBSA binding affinities. Notably, ESP charges calculated at the B3LYP-D3(BJ)/6-31G(d,p) level of theory yielded very good correlation with the experimentally derived binding energies, with an R2 value of 0.84.

Mulliken charges also yielded good correlation with experimental data when calculated with the two DFT methods. In particular, the protocol used here in conjunction with charges obtained at the B3LYP-D3(BJ) level of theory showed an improved performance compared to that of Meuwly and co-workers (R2 = 0.55 c.f. 0.66 in this work). Compared to ESP charges, Mulliken charges are not suitable for the prediction of trends in binding affinities when computed at the HF level of theory. This is unsurprising, as Mulliken charges are known to be sensitive to the level of theory at which they are obtained [39].

Despite previous work demonstrating NPA charges can accurately model solute–solvent interactions [14], here the use of NPA charges gave no correlation with the experimental data irrespective of the choice of QM method. Scheme 1 compares the atomic charges obtained using different schemes for a representative ligand 4. The automatically assigned CGENFF atomic charges are also shown for comparison but were not considered in the MD simulations due to the high penalty scores associated with these assigned charges. As shown, the use of NPA charges significantly polarises the charges around the sulfonamide head. In particular, the charge on the sulfur atom is around 2.5 times greater when assigned using NPA charges compared to ESP charges. Figure 11 shows the correlation between Mulliken and ESP compared to the correlation between Mulliken and NPA charges. It is clear that the NPA charge scheme yields charges that are disproportionately larger in magnitude compared to Mulliken charges, most notably for the sulfonamide S atom, which sit in the top right of the graph, and the sulfonamide nitrogen at the bottom left.

Scheme 1
scheme 1

Distribution of partial charges of ligand 4 derived from CGENFF and B3LYP-D3(BJ)/6-31G(d,p) geometry optimisations and calculations. Partial charges obtained from Mulliken, ESP or NPA fittings. Partial charges rounded to three decimal places

Fig. 11
figure 11

Mulliken and NPA charges plotted against ESP charges for all atoms in all ligands. NPA charges are significantly more polarised than Mulliken charges

To test the broader performance of the best performing charge schemes, we expanded on the test set to include a total of 32 chemically diverse ligands (Fig. 2). Binding affinities were predicted from simulations using Mulliken or ESP charges from either B3LYP-D3(BJ) or M06-2X, and compared to experimental data. The binding affinities and uncertainty are presented in Table 2, and plotted in Fig. 12. The RMSD in predicted absolute binding energies and pairwise RMSD in relative binding energies are also presented in Table 2, where the latter is about 2–3 times smaller presumably due to systematic cancellation of errors.

Table 2 Predicted MM/GBSA Binding Affinities of ligands 1 to 32 to hCA II with different charge schemes. Binding data is the average of 5 independent 4 ns trajectories. Uncertainty presented as 2 times the standard error in the mean
Fig. 12
figure 12

MM/GBSA binding affinities of ligands 132 calculated with ESP or Mulliken atomic charges from DFT methods. Hollow symbols denote outliers, and dashed lines the correlation including the outliers. MM/GBSA energies averaged from 5 independent 4 ns production runs

Notably, for this larger and more diverse dataset the correlation between experimental and predicted binding affinities reduced significantly compared to the original set containing ligands 115. This effect was independent of the charge scheme used. ESP charges showed better correlation with the experimental data than Mulliken charges regardless of the QM level of theory. Both the B3LYP-D3(BJ) and M06-2X charges gave very similar correlation coefficients (R2 = 0.49 and 0.48 respectively). Nevertheless, when four outliers (22, 27, 28, and 30) were removed, this restored the good correlation with experimental binding affinities where the R2 increased to 0.77 in the best case for B3LYP ESP charges.

Further inspection of the four outliers reveals several interesting observations. As with the docking protocol, 22 (trifluomethylsulfonamide) has a significantly lower pKa (6.3) than the other ligands in the dataset, and hence the energetic cost of deprotonation is lower [37]. The pKa values of the other ligands range from 7.4 to 10.8, with the majority above 8.5. At a pH of 7, the difference in deprotonation energy between a ligand with a pKa of 6.3 and 8.3 translates to an approximately 2 kcal mol−1 change in binding free energy. Consequently, whilst this effect likely cancels for the other ligands with similar pKa values, the lower energetic requirement for deprotonation resulted in an underestimation of the binding affinity of 22 relative to other ligands and should bring it closer to the trend lines when the effect is included (Fig. 12).

Ligand 28, which contains a carbohydrate group has consistently underestimated binding affinities. This appears to be consistent with literature where the CHARMM carbohydrate forcefield is known to significantly and consistently underestimate protein—carbohydrate binding affinities [59].

Whilst ligands 2629 all had overestimated binding affinities, ureidobenzenesulfonamides 27 and 28 consistently showed the largest deviation from the trendline. These four ligands include SLC-0111 (26) and its derivatives which are more conformationally flexible than the majority of the ligands in the test set. Specifically, these compounds can adopt 3 distinct conformations, namely syn-syn, syn-anti, and anti-anti, corresponding to different orientations of the -NH groups pointing either towards (syn) or away from (anti) the chalcogen. For the ureidobenzensulfonamides 26, 28 and 29, different binding poses are observed in independent trajectories, corresponding to the anti-anti and syn-anti conformers. Whilst 26 and 29 have similar binding affinities for each pose, the ortho-substituted isopropyl phenyl group in 28 is significantly affected by the orientation of the urea as shown in Fig. 13. Binding in the anti-anti conformer results in a 7 kcal mol−1 increase in binding affinity relative to the syn-anti. In the syn-anti pose the isopropyl group points towards the hydrophobic pocket, whereas in the anti-anti conformer the phenyl ring is more easily able to form a strong hydrophobic interaction with a phenylalanine residue. This large variance results in the large uncertainties as shown in Table 2, with an uncertainty of 3.06 kcal mol−1 in MM/GBSA binding energies obtained from B3LYP-D3(BJ) ESP charges. To examine if this was due to insufficient sampling, 100 simulations were run with B3LYP-D3(BJ) ESP charges (50 with an anti-anti starting pose and 50 with a syn-syn starting pose). The binding affinity was predicted to be -25.6 kcal mol−1 (σ = 2.44) compared with -27.43 kcal mol−1 (σ = 3.46) when only five trajectories were run. This 2 kcal mol−1 shift brings the data point closer to the trend line indicating that insufficient sampling is likely an issue here. Notably, regardless of starting pose, trajectories containing both the syn-syn and syn-anti conformers were observed. The crystal structure shows that ligand 28 binds in the anti-anti conformation [53]. For the thiourea 27, the ligand adopts the syn-syn conformer in all trajectories, whilst the crystal structure shows the ligand binds in a syn-anti conformer [2]. However, fixing the ligand in the crystal structure conformer may not necessarily improve the correlation with experiment, as evidenced by the crystal structure anti-anti conformer of ligand 28 having a greater divergence from the predicted trendline of binding affinities.

Fig. 13
figure 13

Representative binding modes of 28 in two different simulations. The ureido group can adopt different conformations, which affects the position of the isopropyl group. The anti-anti (left) and syn-anti (right) conformations are observed. Protein residues shown coloured by their residue type. Non-polar residues shown in brown, polar residues shown in green. Hydrophobic carbon–carbon interaction distances indicated by red dashed lines

When these ligands are excluded, the use of B3LYP-D3(BJ) optimised ESP charges has a strong correlation with the experimental binding affinities (R2 = 0.77), which only slightly lower than the correlation observed with only the original 15 ligands. Similarly, ESP charges obtained with M06-2X also show reasonable correlation, with an R2 of 0.66. Mulliken charges show limited correlation between the predicted and experimental binding data even with these difficult to model inhibitors excluded.

Conclusions

This work evaluates methods for predicting relative binding affinities for a set of structurally diverse ligands. The AutoDock4Zn docking program showed a moderately strong ability to rank ligands (R2 = 0.64). However, this method was unreliable in reproducing the crystal structure binding pose of a set of 24 ligands, with a mean RMSD of 6.05 Å. The more theoretically robust MM/GBSA method was shown to improve the correlation between predicted and experimental binding affinities when ESP ligand charges were used. Simulations with charges obtained from DFT optimisations at the B3LYP-D3(BJ)/6-31G(d,p) level of theory showed the strongest correlation with the experimental values (R2 = 0.77). When applied to a subset of arylsulfonamides used in a previous study [65], this approach shows an excellent correlation of R2 = 0.84. Similarly, M06-2X/6-31G(d,p) ESP charges had a moderately strong correlation (R2 = 0.66) over the whole dataset. However, the use of other charge schemes showed limited to no correlation. Mulliken charges showed poorer agreement with the experimental binding data regardless of level of theory, with correlations of R2 = 0.36 and R2 = 0.21 for B3LYP-D3(BJ)/6-31G(d,p) and M06-2X/6-31G(d,p) charges respectively. Finally, NPA charges showed no ability to rank ligands by binding affinity. Overall, the results presented here demonstrate that MM/GBSA can be used to evaluate ligand binding affinities provided that a validated set of parameters is used.