Introduction

The treatment options for central nervous system (CNS) disorders are hindered by the blood–brain barrier (BBB), which is a highly selective barrier that regulates and restricts the transport of molecules from circulation into the CNS [1, 2]. The BBB is a complex physiological structure that separates the CNS from the bloodstream and is composed of a specialized layer of endothelial cells that line the blood vessels in the brain and spinal cord. The tight junctions between human brain microvascular endothelial cells (hBMECs; see Fig. 1) effectively prevent the paracellular transport of molecules and restricts transcellular transport into the brain to molecules with compatible chemical properties [3]. Therefore the primary mode for drugs to enter the brain is by passive diffusion across the luminal and abluminal hBMEC membranes. Furthermore, successful drugs must also not be substrates of BBB efflux pumps or need to diffuse sufficiently fast to overcome the action of efflux pumps. otherwise the effective concentration is significantly reduced. This means that the majority of drugs do not cross the BBB, making it difficult to deliver drugs to the CNS to treat neurodegenerative and psychiatric disorders.

Fig. 1
figure 1

Construct of our in silico model of the blood–brain barrier for MD simulation of molecular transport across the BBB. A A representation of the human brain microvascular endothelial cell (hBMEC) membrane that forms a key element of the blood–brain barrier (BBB). B Composition of the in silico BBB apical hBMEC membrane model (N = 96 lipid molecules). C An in silico representation of the apical hBMEC membrane, showing dimensions and a drug crossing event. The system used has a system with a larger z-dimension (12 nm at equilibration), which is different (60% larger in z dimension than x–y dimensions) to our previous systems [26]. (D) Library of CNS compounds spanning a representative set of permeabilities (10–7 cm/s to 10–3 cm/s)

The BBB plays a key physiological role in the development and progression of several neurodegenerative [4], including Alzheimer's disease [5], Parkinson's disease [6], multiple sclerosis and brain tumors [7]. In these and other CNS conditions, the BBB can become "leaky" which can lead to the accumulation of harmful molecules in the brain and contribute to the progression of these disorders. Additionally, molecular transport across BBB may also play a role in the development of psychiatric disorders [8] such as anxiety and depression [9], as well as schizophrenia [9] as the BBB also regulates the distribution of neurotransmitters and neurotrophic factors, which are essential for the proper functioning of the brain.

Therefore, the determination of the brain exposure of drugs is a challenging task that represents a major obstacle to the development of drugs for the treatment of CNS disorders. Despite the fact that there are approximately 1700 FDA-approved drugs, the brain exposure of only around 200 of these drugs is known [10, 11]. This lack of information about brain exposure makes it difficult to repurpose existing drugs for use in the treatment of CNS disorders. This highlights the need for new technologies that can identify therapeutics with the potential to cross the BBB early on in the drug development process [12, 13].

One of the more common screening models used for this purpose is Lipinski's rule of five [3], which is based on the idea that drugs with certain chemical properties, such as a low molecular weight and a neutral charge, are more likely to cross the BBB. Another methodology that is used is the QikProp program of Schrödinger, which uses the method of Duffy and Jorgensen [14]. This program uses a variety of cheminformatic methods to predict the properties of a drug, including its potential to cross the BBB. Generally, in silico models can be useful for identifying promising lead compounds early on in the drug development process, but existing solutions are limited in their ability to deliver unbiased atomic-detail insights into the physical process of BBB penetration. Atomistic insights are important to elucidate the underlying mechanism behind black-box or machine learning (ML) model predictions, which can only be obtained with MD simulations. As an example, two molecules can have their permeability predicted coarsely with ML models, but we can only rationalize the difference by observing the simulation trajectories.

Experimentally, the customary method for determining brain exposure of pharmaceutical compounds is the transwell assay, which is a widely used in vitro method [15]. In these assays, a BBB-mimetic confluent monolayer is used to mimic the tight junctions of the blood–brain barrier and the apparent permeability (Papp) of the drug is determined by measuring the amount of the molecule that crosses the monolayer [16, 17]. Ideally, the monolayer used in this assay is composed of hBMECs (i.e., primary cells which would form the BBB in vivo), but a variety of primary cell types have been utilized for transwell assays, including confluent monolayers of Madin–Darby Canine Kidney (MDCK) or Caco-2 cells [16, 17], or more recently BMECs derived from induced pluripotent stem cells [18]. These experiments provide a means to evaluate the rate of transport of drugs across the BBB from estimating the content from the sacrificed rat brain. The data obtained from these experiments may then be compared to in silico simulations of transcellular transport to gain insights into the mechanisms of transport of drugs across the BBB [19,20,21,22,23].

Molecular dynamics (MD) simulations are a powerful computational tool that can be used to investigate the free energy of drug permeation across the transcellular pathway of the blood–brain barrier (BBB). MD simulations have been used to study a wide range of membrane types [19,20,21,22,23], including plasma and mammalian membrane models. There have been few studies that have attempted to simulate the endothelial membranes using accurate compositions [24,25,26], and some of these studies have revealed the computational challenges in converging permeability estimates.

To overcome sampling limitations in MD simulations for cellular translocation, various enhanced sampling techniques have been developed [27, 28]. However, these methods do not directly provide transport rates without a reweighting of the resulting ensemble or inferences through an inhomogeneous-solubility diffusion (ISD) framework [19]. Unbiased all-atom MD simulations provide detailed insights into the molecular mechanisms of transport across the BBB without the need for a priori knowledge of the permeation pathway or constrained coordinate systems. Kinetics can be calculated from the transition-based counting (TBC) approach [24, 29], but this still requires simulation times in the tens to hundreds of microseconds per drug partition to achieve converged estimates at 37 °C. This is currently beyond the limit of realistic sampling for routine MD simulations. Other methodologies for permeability have only been parametrized for the fast regime (permeability, P > 10–5 cm/s) [23].

In previous research, we have shown that high-temperature simulations can be used to estimate permeabilities at body temperature from direct Arrhenius fits [26] or by fitting the rates based on a modified form of Kramer's theory of reaction rate [24]. Although these simulations are run at much higher than physiological temperature, we have demonstrated that membranes do not disintegrate, nor do they significantly differ, even at extremely elevated simulated temperatures of 500 K (~ 226.85 °C). Furthermore, it has well-established in the MD simulation literature that the thermostat does not permit a vapour transition, and the membrane stability at high temperatures has also been experimentally validated [30, 31]. Both of these procedures have limitations, as the first procedure has errors in comparison to experimental measurements, and the second procedure requires estimation of multiple per-compound physical fitting parameters (i.e., the Arrhenius pre-exponential constant A, the lateral diffusivity of the lipids, DL(T), and the apparent transmembrane free energy apparent barrier height, G0). There are currently no known rapid ways to obtain a relative ranking of compounds crossing the BBB that are not based on estimates from structure–activity relation approximations [1, 13, 32,33,34].

In this study, we propose a method for determining quantitative transmembrane transport based on TBC rates obtained from a single equilibrium MD simulation, with a temperature-based enhancement of sampling that can perform relative ranking of permeability. This method leverages the recent advances in atomic scale modeling and computing power to overcome the limitations of current methods. The proposed method utilizes a single MD simulation at high temperature for a central nervous system (CNS) compound, which can be converged using a single graphics processing unit (GPU) node within a time frame of 24 to 72 h. Our approach is based on a least-squares fitting procedure that utilizes a library of CNS compounds of clinical importance with permeabilities spanning a broad range of interest, from the slowest range (10–7 cm/s) to fast CNS penetration (10–4 cm/s). We use a sample size of N = 18 CNS compounds which are chosen based on their clinical relevance and that their permeabilities span a broad range of interest. This approach provides a rapid and efficient way to determine the transmembrane transport of CNS compounds, and it could be useful for identifying promising lead compounds early in the drug development process.

Materials and methods

Single MD simulations for a library of compounds

Key MD simulation parameters

Unbiased atomistic MD simulations were performed using GROMACS (www.gromacs.org) [35] with the CHARMM36 all-atom force field for lipids [36], the CHARMM general force field (CGenFF) for molecular solutes and the TIP3P water model as solvent [37]. Electrostatic interactions were computed using particle-mesh-Ewald (PME) [38], and a cut-off of 10 Å was used for non-bonded interactions. Bonds involving hydrogen atoms were restrained using LINCS [39] to permit a 2 fs time-step. Neighbor lists were updated every five steps.

All simulations were performed in the NPT (constant N, pressure and temperature) ensemble, with water, lipids, and drug molecules coupled separately to a heat bath with temperatures at, respectively, 127 °C (simulation set 1), and 167 °C (simulation set 2). A time constant τT = 0.1 ps was utilized in combination with the velocity rescaling algorithm [40]. In the equilibration stage, an atmospheric pressure of 1 bar was maintained in the simulation box using the Berendsen semi-isotropic pressure coupling [41] with compressibility κz = κxy = 4.6 × 10–5 bar−1. In the production runs we utilized the Parrinello–Rahman semi-isotropic pressure coupling [42] with compressibility κz = κxy = 4.6 × 10–5 bar−1 and time constant τP = 20 ps. In order to capture diffusion events at sufficiently high resolution, trajectories were recorded every 1 ps, such that each 1 ms of trajectory comprises a dataset of 107 observations.

A lipid bilayer model of human brain microvascular endothelial cells (BMECs)

An atomic detail molecular model of the apical BMEC lipid bilayer was utilized as described in previous work [24], consisting of a patch of 96 lipids (48 per leaflet) in an area of about 25 nm2. Information about the spatial lipid distribution within the individual leaflets was not available and was not incorporated in the model.

A compound library of the CNS drug space

A library of molecular solutes (N = 18; Table S1 and molecular structures in Figure S1) was chosen to be sufficiently broad and representative of CNS drugs, spanning a range of permeabilities (from 10–7 to 10–3 cm/s), charge (neutral, cationic, anionic, and zwitterionic), and lipophilicity (LogPw/oct values from − 1.8 (polar) to 5.10 (non-polar)). Solute transport is dependent, in part, on the choice of solute force field [22, 43, 44]. Force field parameters were obtained in a standardized fashion using the CGenFF program [45, 46] (version 2.0) to obtain bonded and nonbonded parameters via the automated parameter assignment tool (Paramchem) [47] of CGenFF. The quality of similarity assignment for bonded and non-bonded parameters was monitored by penalty scores (Table S.4). All molecules are simulated in their neutral form, for comparison. The list of compounds is given in Fig. 1.

To converge the permeability estimates, simulations of solute transbilayer BBB crossing are carried out at temperatures greater than 25 °C (167 °C). This high-T MD methodology dramatically increases the rate of solute transport across lipid bilayers, and is applicable to lipid membranes in conjunction with small-molecule solutes that do not denature at T > 25 °C.

Simulations of solute translocation

The standard simulation system is different from that reported in our previous work by having an increased amount of water content. The system contains 7031 water molecules, 96 lipids, and 20 solute molecules distributed randomly with PACKMOL [48]. Simulations at elevated temperatures were necessary to enable accurate determination of the translocation frequency for solutes with low permeability. This high-T MD methodology is applicable to lipid membranes in conjunction with small-molecule solutes that do not denature at 167 °C ≥ T > 36.8 °C[24, 26]. The use of non-polarizable water models, such as TIP3P, at high temperatures carries limitations. While high-temperature simulations are well established in the field of MD simulation, the water models do not accurately describe the phase transitions associated with heating (or cooling). However, since we are not aiming to describe transport properties in water, we hold that the methodology is applicable to solute translocation across low dielectric media, such as a lipid bilayer.

Quantifying solute translocation across the lipid bilayer

There are two general approaches for calculating permeability in simulations: Fick’s law counting-based methods (used in this work), and methods based on the inhomogeneous solubility-diffusion (ISD) equation [19]. The ISD model is more complex since it requires accurate determination of the diffusion coefficient and free energy surface at each location (i.e. position in the bilayer), both of which are challenging [49,50,51,52]. Here we use direct observation from the simulations to identify individual translocation events.

The solute translocation frequency (k) across the bilayer is the number of translocation events per unit time. A translocation event is defined when a solute molecule moves from bulk solution on one side of the lipid bilayer, across the bilayer, and crosses a plane 1.0 nm beyond the bilayer on the opposite side. In the simulations, we define the steady state translocation k frequency as the average value when the derivative is less than a threshold value: i.e., dk/dt = [(k(i + 1) − k(i))/∆t] < 0.004.

Free-energy surfaces (FES) and grouping of solutes

The FES profiles were calculated by first binning the z-positions of a solute to generate a 1-dimensional probability distribution P(z). The free-energy, F(z), was calculated by Boltzmann reweighting of the probability distribution (see Supplementary Information for details).

Experimental values of permeability

Experimental values of permeability derived from the 2D transwell assay (Papp) are widely used to predict brain penetration of small molecules, and the transwell assay is often considered the gold standard for validating simulations and in assessing barrier function of other in vitro models [53]. The most common cell lines are Madin–Darby Canine Kidney (MDCK) cells [16, 17], as well as Caco-2 cells with the PAMPA assay [54, 55]. We have chosen solute values of the cell line available for the compound of interest in published literature, with compounds missing a reference value measured as in-house measurements (Department of Materials Science & Engineering, Johns Hopkins). We have opted to use only Papp values as a benchmark, and not rat brain perfusion values P3D, due to the scarcity of data and the closer mimetics of our simulation system with the transwell assay. The final reference values are produced in Table S.1.

Results

The passive BBB transport of small molecules involves crossing the luminal and abluminal membranes of BMECs in the cerebrovasculature (Fig. 1A). Previous work demonstrated high similarity in transport across the apical and basolateral chambers, so we chose to simulate the apical chamber only. As long as transport within the cell (between the two membranes) is fast in comparison to transport across the membranes, then the experimentally determined unidirectional diffusional permeability from the transwell assay is equal to half of the simulated permeability from the bidirectional flux, Papp = Psim/2, where Psim is the permeability across a single bilayer (Fig. 1A). Here we do not include efflux pumps or other membrane proteins in the bilayer in order to focus on the effects of passive BBB translocation.

Solute transport

MD simulations were performed at 167 °C for a library of solutes (N = 18; Fig. 1D) in a box (~ 5 × 5 × 12 nm) containing a BBB lipid bilayer model (Fig. 1B, C). Simulations were performed at elevated temperatures to enable a sufficient number (i.e. at least 1 and on average 10 events in 100 ns; Figure S1) of translocation events for accurate assessment of the permeability. The number of translocation events per unit time, k, is related to the permeability by: Psim = r/2AC where r is the rate (k/NA), A is the lateral cross-sectional area of the membrane, and C is the concentration of the solute. The temperature required for efficient unbiased sampling is constrained, in large part, by the solutes with the lowest permeability, which require relatively long times to obtain accurate values of the rate constant. For the library of solutes reported here, each rate constant converged to a steady state value within 0.2–2.0 μs (Figures S1, S2), scaling at 0.1–0.2 μs per 24 h on a computing node with 4–8 computing processing units (CPUs) and GeForce GT × 1080 Ti GPU card.

As an example, we simulated the transport of ethanol (Fig. 2). Following simulation, we observe individual molecules crossing the bilayer over periods lasting ~ 1 ns (Fig. 2A). From the running time average of the number of crossings, a steady state in the rate constant, k, is reached after about 100 ns (using the plateau value from dk/dt), and from the plateau region, we determine a permeability of 8.9 × 100 cm/s (Fig. 2B) using the procedure outlined in Methods (Fig. 2C).

Fig. 2
figure 2

Obtaining membrane permeability (Psim) estimates from high-temperature MD simulations. For permeability estimates: A perform unbiased MD simulation of explicit atom transbilayer crossing at high temperature to obtain an estimate of the rate of crossing k based on the # of translocation events, B establish the steady state behaviour of the rate k (s), molar rate r (mol/s) to ensure the system has reached equilibrium, C calculate the permeability (cm/s)

The permeability values for all solutes at 167 °C are summarized in Table 1 (additional parameters for calculation are provided in Table S.5). The permeabilities Psim span more than two orders of magnitude from 7 × 10–2 to 1.69 × 101 cm/s. As expected, these values are much higher than the experimentally determined values at room temperature obtained from transwell measurements, which are typically in the range from 10–7 to 10–3 cm/s [56]. The temperature dependence on permeability can be inferred from the values of Psim at 127 °C (Table 1), which are 5 to tenfold smaller than at 167 °C.

Table 1 Kinetic parameters extracted from simulations of the library of solutes (127, 167 °C) ordered alphabetically with a unique index (#)

On comparing the values of permeability from the simulations (Psim) at 167 °C with experimental values (Papp) recorded at ranges of at 25–37 °C (Fig. 3), with the predominant temperature being 25 °C and the following assay sources including Caco-2 cell lines as well as MDCK cell line measurements at 37 °C and red blood cell estimates using the PAMPA assay [57]. There is a reasonable correlation (R2 = 0.59) with an offset of about 5 orders of magnitude.

Fig. 3
figure 3

Permeabilities correlation (Papp vs Psim) for N = 18 compounds. A At 167 °C, depicting the roughly linear distribution of Psim/Papp along a line that is parallel to the equivalency line by a constant. The value and origin of this shift are confirmed in (B) for simulations at 127 °C and 167 °C

To verify the statistical significance of the sample size (N = 18), we perform independent regressions in sample sizes of incremental N (steps of N = 3 compounds; N = 3, 6, 9, 12, 15, 18 compounds). To remove bias, we perform combinatorial sampling to all possible combinations of points (816, 18,564, 48,620, 18,564, 816) in the dataset. The mean and standard deviation of the slopes of these regression lines were recorded and were used to assess the variability of the slopes. In Figure S3, we report the results of the search, which demonstrate that the average slope reaches a plateau beyond N = 12 compounds, suggesting that this method can be utilized for datasets above 12 compounds.

In previous work [26], we outline computationally intensive ways to map the relationship between experimental and simulated permeabilities. We used the same library of compounds, with N = 18 converged at 167 °C and N = 13 converged at lower temperature. Here, we describe a relationship between the high-temperature simulated permeability and measured experimental permeability (N = 18 compounds) using a least-squares fit from the stats.linregress function of Scipy (Fig. 4) [58].

Fig. 4
figure 4

Least-squares fit of simulated permeability data and experimental permeability. A Least-squares fit regression of LogPsim vs LogPapp with N = 13 compounds (167 °C, 127 °C) indicating the resulting regression y = mx + c. For 167 °C, m = 1.726, c = − 5.20, R2 = 0.68, p-value = 0.00054. For 127 °C, m = 1.17, c = − 3.73, R2 = 0.54, p-value = 0.0042. B Least-squares fit regression of LogPsim vs LogPapp with N = 18 compounds (167 °C) and N = 13 (127 °C), indicating the resulting regression y = mx + c. For 167 °C, m = 1.124, c =  − 4.819, R2 = 0.59, p-value = 0.00021. For 127 °C, m = 1.17, c =  − 3.73, R2 = 0.54, p-value = 0.0042. Log denotes Log in base 10

This allows for a relative ranking of novel compounds of interest to one order of magnitude of precision (Table 2; 167 °C; R2 of 0.59, p = 0.00021; Eq. 1; 127 °C; R2 of 0.54, p = 0.0042; Eq. 2). Specifically, using the fit we propose (Eq. 1; fit using 167 °C), one can rank a novel compound, ‘X’, using a single simulation at 167 °C (Psim,x,167°C) to obtain a relative apparent permeability ranking, Papp,X,37°C at 37 °C.

$${\text{Log}}P_{{{\text{app}},{\text{X}},{37}^\circ {\text{C}}}} = {1}.{12}\left[ {{\text{Log}}P_{{{\text{sim}},{\text{X}},{167}^\circ {\text{C}}}} } \right]{-}{4}.{82}$$
(1)
$${\text{Log}}P_{{{\text{app}},{\text{X}},{37}^\circ {\text{C}}}} = {1}.{17}\left[ {{\text{Log}}P_{{{\text{sim}},{\text{X}},{127}^\circ {\text{C}}}} } \right]{-}{3}.{73}$$
(2)
Table 2 Regression prediction

The boundaries of the regression have been tested for a range of experimental permeabilities from 10–7 cm/s (doxorubicin) to 10–3 cm/s (ethanol) at two temperatures.

In Table 2, we demonstrate the application of the regressions to perform a relative ranking of a novel compound X. For a compound with a 167 °C simulated permeability Psim,X,167°C of 7 × 10–2 cm/s, this yields a predicted value of experimental permeability, Papp,X,37°C, of 7.64 × 10–7 cm/s, while for a 167 °C permeability Psim,X,167°C of 1.2 × 10–1 cm/s, this yields a value of Papp,X,37°C of 1.4 × 10–6 cm/s. Finally, for Psim,X,167°C of 9.85 × 100 cm/s, this yields a prediction for the value of Papp,X,37°C of 1.98 × 10–4 cm/s. The regression therefore spans the 4 orders of magnitude of Papp with significant predictive power. The error between the predicted experimental permeability (Papp,X,37°C) and the known in-vitro permeability Papp is less than one order of magnitude (0.44–0.55; Table 2). To verify the model with an external dataset, we generated a test set outside the range of regression compounds (N = 4) using the same setup as depicted in the Methods section. In Fig. 5 we depict the results of applying regression (Eq. 1) to an external dataset (details in Table S7) of 167°C permeability. We find that the external test set lies within the range described by the regression, and thus the regression can be used to predict the experimental permeability for an unknown compound X from a single high-temperature simulation.

Fig. 5
figure 5

Least-squares regression using external dataset (N = 4). Red: Least-squares fit regression of LogPsim vs LogPapp with N = 18 compounds (167 °C) and N = 13 (127 °C), indicating the resulting regression y = mx + c. For 167 °C, m = 1.124, c =  − 4.819, R2 = 0.59, p-value = 0.00021. For 127 °C, m = 1.17, c =  − 3.73, R2 = 0.54, p-value = 0.0042. Green: External verification set (N = 4) not included in the regression set

Discussion

Understanding the mechanisms of drug transport across the BBB is critical for the development of therapeutics for CNS disorders. The BBB effectively restricts the entry of most molecules from the bloodstream into the CNS, making it difficult to deliver drugs to the brain [68]. A better understanding of the mechanisms of drug transport across the BBB will allow for the development of more effective therapeutics for the treatment of CNS disorders. As an example, several anti-amyloid antibodies (AAA) therapeutics for Alzheimer’s have failed to cross the BBB, thereby halting advanced stage clinical trials [69].

In this work, we have introduced a computational approach for ranking the permeability of a library of CNS compounds across the BBB. By simulating a library of interest at high temperature (T = 167 °C) and constructing a regression (Eq. 1) from the dataset of Psim,167°C versus Papp it is possible to achieve relative prediction of an arbitrary compound’s permeability Papp,X,37°C from a single simulation (Psim,x,167°C). The method relies on efficient simulations at 167 °C that converge on a standard GPU typically in t ~ 24–72 h.

In the Results, we showed our simulated permeabilities for the compound library of solutes distribute with moderate linearity (Fig. 4), with a constant offset shift with respect to the experimental permeability (Papp; 37 °C). When the temperature is lowered, the distribution shifts closer to the experimental permeability. Our methodology allows for the qualitative ranking of a compound permeability (Papp,X,37°C) at physiological temperature. The error between the predicted permeability (Papp,X,37°C) and the known in-vitro permeability Papp is less than one order of magnitude (0.44–0.55; Table 2), demonstrating an acceptable accuracy of our ranking approach.

In Table 3 we present the final comparison between experimental permeabilities reference values, to those predicted by the regression (Eq. 1) from a single high-T simulation (in [26]), to those from Arrhenius extrapolation from multiple high-T simulations. As can be seen, the regression prediction shows significantly closer agreement with experimental reference value compared to Arrhenius prediction. We had previously described how Arrhenius predictions suffered from non-linear effects, which can be corrected for (as in the work of Wang et al. [24]). In proposing this regression approach, we are arguing that scarce access to computing time should not preclude users from performing a qualitative ranking that is often less than one order of magnitude from the experimental reference. In our previous work, we noted how there can be marked disagreements between experimental sources (Papp from cell lines or P3D from rat brain perfusion measurements), which are notably concentrated for certain groups of molecules (one group displaying statistically significant deviations of over 1.5 orders of magnitude. This should serve as a caution to examine the source of experimental data carefully before using as benchmark.

Table 3 Comparison of room-temperature permeabilities from (i) experiment, (ii) regression prediction and (iii) Arrhenius extrapolation from multiple high-T simulations

Although the R2 values obtained from regression are statistically significant (p = 0.0002 for T = 167 °C; p = 0.0042 for T = 127 °C), sources of error can be ascribed to, in particular, the use of CGenFF force field parameters in a consistent way without reoptimization. Previous attempts at ranking permeability with a fast methodology, such as implicit solvent assumption [43], have found smaller permeability R2 value in agreement when ranked (below 0.2), and described errors of up to 10 orders of magnitude between in silico prediction and experimental permeabilities. Our methodology has a statistically significant (p value = 0.0002) R2 value and lower error and is fast to calculate on consumer-grade GPU cards. This work opens up the possibility for routine ranking of CNS candidates on a large scale.

We now proceed to discuss the inherent methodological limitations from the simulations and experiments. Experimentally, the transwell assay is currently a very popular in vitro experimental method to source for permeability [53, 70]. The main limitation in its ability to accurately determine BBB permeability is to reproducibly obtain sources of hBMEC and other supporting cell types [71], and one solution has been to source the cells from human induced pluripotent stem cells (iPSCs). Our methodology has used the to-date best available experimental data [56, 67], but there are still notable differences in the epithelial cell lines utilized in the transwell assay (MDCK, Caco-2), which are different from brain endothelial cells (hBMECs). The passive permeability across the cell membrane is affected by the native size of cells. In general, epithelial cells are much thicker that brain endothelial cells. Thus, the permeability coefficients measured in epithelial cell-derived cell lines and hBMECs may be inherently different.

Similarly for computational methods such as MD simulations with high performance computing, there are limitations [19]. These reside primarily with their ability to obtain sufficient sampling statistics in order to converge metrics such as permeability estimates. Using either the ISD framework or our flux-based method, convergence at 310 K remains highly challenging, and for flux-based methods, computationally intractable for permeabilities < 10–6 cm/s [26]. Our MD model does not model the paracellular pathway, only the transcellular pathway. Among the other challenges of using MD simulations of CNS penetration is reconciling the choice of model with the experimental setup. To date, no simulation attempt is known to model the paracellular pathway, which is the transport of drugs across the tight junctions of the BBB. This is partly due to the complexities in building a cell–cell system to represent the paracellular pathway. The paracellular pathway is crucial for the overall transport of drugs across the BBB, and further research is needed to accurately model this pathway.

A deep understanding of the molecular mechanisms underlying BBB transport is crucial for the development of effective therapeutics for CNS disorders. Further research is needed to understand how these mechanisms are affected by disease states, such as BBB dysfunction in neurodegenerative and psychiatric disorders [4]. This will provide insights into how the BBB can be targeted to enhance drug delivery to the brain in these disorders, and how future simulation topologies can be developed to better capture these affects.

In addition to the scientific value of the proposed method, it is also important to consider the practicality, scalability and cost-effectiveness for routine use in drug development and discovery [72]. This method is easy to implement and is fast and cost-effective and it should be able to handle large sets of compounds, as is typically required in a drug discovery pipeline. The use of high-T methodology has a cost-saving component that is well documented [73], and the use of GPU technology is a further cost-saving measure that scales well. As a next step, the approach outlined here could be combined with a combinatorial chemistry array of compounds as input for unbiased presynthetic screening for a library of drugs for optimizing BBB penetration via chemical modification.

Conclusion

We describe an advance in the development of the in silico design and optimization of CNS drugs. By using a single high-temperature simulation that requires between 24 and 72 h of user time on a conventional GPU computing node, our method predicts a relative ranking in CNS penetration for novel compounds using a dataset of simulated permeabilities. In the future, this technology can be applied to lower temperature simulations, pending suitable hardware advances, and can easily be integrated into a larger and more efficient platform accelerating drug discovery (e.g., to screen a library of compounds and find candidates that enhance brain exposure). In particular, the current framework is applicable to any atomically parameterizable chemistry, including peptides and other biologicals, which are not currently covered by many in silico techniques.

By simulating a library of interest at high temperature and constructing a regression from the dataset of Psim,167C versus Papp it is possible to do relative permeability prediction rankings for arbitrary compounds compared to a dataset of simulated permeabilities. The user needs to input a single simulation at high-T to perform a ranking of the compound at 37 °C to get a fit with a mean error to experiment below one (0.44–0.55) order of magnitude.

The wealth of detailed information provided by atomic-resolution simulation trajectories can be used to identify lead compounds through high-throughput screening and pre-synthetically guide lead compound design for enhancing CNS exposure. The presented approach provides a rapid and efficient way to determine the transmembrane transport of CNS compounds and could be useful for identifying promising lead compounds early in the drug development process.