1 Introduction

Surrogate modelling is a computationally inexpensive way to model complex, nonlinear biostructures in neuro-musculoskeletal (NMS) simulations of the spine used to non-invasively study generic and subject-specific spinal load sharing amongst active and different passive tissue compartments in predictive simulations (Meszaros-Beller et al. 2023; Mörl et al. 2020; Rupp et al. 2015; Guo et al. 2021). Particularly, the intervertebral disc (IVD) is a critical component, providing support and allowing for movement in all six degrees of freedom (DOFs). However, modelling the multiphasic soft matter physics of an IVD as a sub-model in subject-specific NMS spine models is challenging, especially under the demand of reduced model complexity to map the six-dimensional (6d) force-displacement behaviour. Apart from few attempts to directly model the IVD by mechanical components like springs in a multibody (MB) environment (Gao et al. 2015), the vast majority of spine models rely on data-based IVD models (see Fig. 1 for an overview of current modelling approaches), which are typically implemented as a force-torque element acting on adjacent vertebrae, while depending on the relative translational and rotational displacements of each vertebra. Such models should generally meet the following requirements: first, physical validity, second, exhibit the nonlinear force-displacement relations that are revealed experimentally, e.g., in Berkson et al. (1979), Schultz et al. (1979) and Panjabi et al. (1994), third, account for the coupling of the various DOFs, which has been documented in a few studies such as Patwardhan et al. (2003) and McGlashen et al. (1987), and, fourth, have a high accuracy on the given data set.

Fig. 1
figure 1

Overview of current surrogate modelling approaches for the elastic IVD responses. Linear stiffness matrices (top left), as for example Christophy et al. (2013) and Huynh et al. (2010), conserve energy and principally allow for coupling of all six DOFs. However, they often consider only two coupled DOFs or, when implemented as diagonal matrix, no coupling (coupled DOFs = 1). Similarly, uncoupled nonlinear models (top right) were developed from reported data (Damm et al. 2020; Schmid et al. 2020). Previous nonlinear models that including coupling (middle) were either polynomial approximations (Zhang et al. 2020; Karajan et al. 2013) with two coupled DOFs or kernel models (Wirtz et al. 2015; Haasdonk et al. 2021) with three coupled DOFs but did not conserve mechanical energy during motion. The first nonlinear surrogates that include coupling of all six DOFs, and conservation of energy (bottom right) are presented in this work and base on kernel or polynomial approximations

Disregarding the nonlinear characteristics of this soft cartilaginous tissue, the elastic IVD responses are commonly estimated by (bi-)linear spring-damping elements (Stokes and Gardner-Morse 1995; Huynh et al. 2010; Christophy et al. 2012; Meng et al. 2015; Meszaros-Beller et al. 2023; Senteler et al. 2016; Stokes et al. 2002; Monteiro et al. 2011). Even though these so-called bushing elements do only allow for linear coupling of DOFs, they are currently the most sophisticated approach for modelling elastic IVD responses in MB simulation frameworks in terms of considering fundamental physical principles, specifically the energy-conserving nature inherent to the term ‘elastic’. This fact is surprising, since it has already been elucidated more than one decade ago that the conservation of mechanical energy can be incorporated in MB IVD surrogates by deriving all vectorial forces and torques from a single scalar force-torque potential (see Senan and O’Reilly (2009) and Metzger et al. (2010)). However, existing nonlinear models of IVDs do either neglect mutual influences of DOFs (Damm et al. 2020; Schmid et al. 2020), or include coupling of only two or three DOFs by fitting force-displacement data for single force or torque components separately, e.g., with polynomials (Karajan et al. 2013; Zhang et al. 2020) or kernel approximations (Wirtz et al. 2015). Due to the lack of a common force potential and, by that, a loss of physical information, the latter published models violate energy balance.

Obtaining multidirectional force-displacement relations experimentally is not trivial and data of coupled DOF stiffnesses are still rare. Consequently, data-based models that include nonlinear coupling can hardly be trained on pure experimental data at present. As an alternative, detailed finite element (FE) models, validated against multiple experimental findings, can be used to provide estimates for the nonlinear and coupled elastic responses (Schmidt et al. 2007).

Several detailed, multiphasic FE models exist (Weisse et al. 2012; Ehlers et al. 2008; Karajan 2012; Little et al. 2008), which are often scaled to a subject-specific anatomy and generally capable of computing realistic IVD forces even under large and multidirectional deformations. However, their steep computational costs stand in contrast to the required reduced force law in a MB spine model, and can therefore not directly be implemented in the NMS model. To link detailed FE and NMS MB models, one can make use of hybrid models (Azari et al. 2018; Khoddam-Khorasani et al. 2018), with only slightly reduced computational cost compared to the pure FE models. A popular alternative approach is to use a detailed model to generate a multidimensional database and find a mathematical description, i.e., a surrogate, to map the resulting 6d displacements in an intervertebral joint to the six elastic responses component-wise (Karajan et al. 2013; Wirtz et al. 2015; Haasdonk et al. 2021). However, this procedure is associated with the aforementioned problem of the lack of energy conservation when kinetic coupling is included.

Consequently, the use of an FE IVD model developed to simulate a loading scenario similar to the utility of the NMS MB model, is suitable for generating a training data set. So, in this work, we employ a validated FE model from Little et al. (2008) for the subject-specific geometry of the Visible Male (Spitzer et al. 1996) and record its force-displacement relations. As surrogate models learn to map the model inputs to the desired outputs, and have no knowledge of the underlying physical system, there is a need to incorporate these physical laws directly into the model structure. Here, we modelled elastic response forces and torques, so we were aiming at surrogates that conserve the total energy. In mathematical terms, this is equivalent to requiring that the surrogate is the force field of an unknown potential. To enforce this condition, we define an IVD surrogate of a generic potential energy, and impose constraints on the value of its derivatives based on the measured force-displacement data of the detailed FE model. We apply this idea to define different types of surrogate models: polynomial models of low (second, third or fourth) order, and kernel models based on greedy kernel algorithms (Wendland 2005; Fasshauer and McCourt 2015; Wenzel et al. 2021). The novelty of this work lies in the design of these two IVD surrogate architectures that both combine the property of conservation of mechanical energy with existing fitting algorithms that consider the nonlinearity and kinetic coupling of the six DOFs of an intervertebral joint. Additionally, we quantified the accuracy of all surrogates to demonstrate the huge benefit of using nonlinear IVD models over common stiffness matrices in the context of predictive NMS MB simulations.

2 Energy-conserving surrogate models

A surrogate model itself is no biophysical model in the sense of providing a full understanding of the mechanical responses exerted by microscopic structures. It is rather an approximation of the macroscopic output as a function of input variables. Therefore, desired features need to be anchored in the intrinsic model structure. As we aim for a surrogate model for the elastic response forces and torques, the model must not violate basic physical principles. In particular, it should not produce or annihilate energy during motion (no gain nor loss) because this would contradict the assumption of a purely elastic behaviour. To allow for kinetic coupling, the elastic responses that are exerted onto the adjacent vertebra endplates need to be treated as force-torque pairs rather than as uncoupled vector components.

In Sect. 2.1, we will recall the prerequisites of biomechanical surrogate models that map the 6d inputs of joint displacements to the 6d outputs comprising force and torque and at the same time ensure energy conservation from Senan and O’Reilly (2009) and Metzger et al. (2010) as the latter is regularly not taken into account in current IVD models. Afterwards, we give concrete classes of models, namely polynomial models (Sect. 2.2) and kernel models (Sect. 2.3) to realise the theoretical approach, and represent the force-torque pair by nonlinear, composite functions of coupled inputs. These models will be used and compared in the numerical simulations of Appendix D.

For readers who like to skip the math, the Sects. 2.12.3 can be summarised as follows: When a surrogate for the potential energy stored in an IVD is identified, instead of independently mapping each force and torque component, the model’s mechanical responses are automatically energy conserving. Thus, for an accurate modelling process, it is crucial to derive force and torque components from the first derivatives of the potential energy.

In the following, we use bold letters to denote vector valued quantities, i.e., \(\varvec{x}_i\) for \(i=1, \dots , N\) correspond to vector valued (input) data points. In contrast, non-bold letters correspond to scalar valued quantities, i.e., \(x_\mu \) for \(\mu =1, \dots , d\) correspond to the component of a vector \(\varvec{x}\).

2.1 Conservative force-torque field

The total mechanical energy conservation is a crucial feature of every elastic surrogate model, i.e., kinetic energy and potential energy may vary but their sum will stay constant over time independent of the loads applied. This is equivalent to the demand of zero net work on closed paths, i.e., the independence of mechanical work of the chosen (displacement) path. In other words, we can state that the surrogate model should represent a conservative force field (Senan and O’Reilly 2009; Metzger et al. 2010).

To elaborate this in more detail, we consider the potential energy \(U(\varvec{x})\) as a function of the translational \(\varvec{r}=(x_1,x_2,x_3)\) and rotational displacements \(\varvec{\Phi }=(x_4,x_5,x_6)\), which together form the 6d state vector \(\varvec{x}=(\varvec{r},\varvec{\Phi })\) of the intervertebral joint. Analogously, we define a vector of all six elastic IVD responses \(\varvec{{\mathcal {F}}} =(\varvec{F},\varvec{M})\), containing three force components \(\varvec{F}=({\mathcal {F}}_1,{\mathcal {F}}_2,{\mathcal {F}}_3)\) and three torque components \(\varvec{M}=({\mathcal {F}}_4,{\mathcal {F}}_5,{\mathcal {F}}_6)\).

The existence of such a path- and time-independent potential U is mandatory for a conservative force field. According to the definition of a conservative force, the response forces and torques conserve mechanical energy if they can be derived from U as negative gradients

$$\begin{aligned} \nabla U = -{\mathcal {F}}^\prime \quad , \end{aligned}$$
(1)

with \(\nabla =(\partial _{x_1}, \partial _{x_2}, \partial _{x_3}, \partial _{x_4}, \partial _{x_5}, \partial _{x_6})\) and \(\partial _{x_\mu } \equiv \frac{\partial }{\partial x_\mu }\) for \(\mu =1,2,...,6\), and where \({\mathcal {F}}^\prime\) represents the torque components in a different coordinate system, which can be easily converted in Cartesian coordinatesFootnote 1 to obtain \(\mathcal F\).

To be called conservative, a force field \({\mathcal {F}}\) needs to be curl-free, i.e.,

$$\begin{aligned} \partial _{x_\mu } {\mathcal {F}}^\prime _\nu = \partial _{x_\nu } {\mathcal {F}}^\prime _\mu \quad \forall \ 1\le \nu ,\mu \le 6. \end{aligned}$$
(2)

We remark that curl-free is only a notation motivated by the case of \({\mathbb {R}}^3\), for which conservative vector fields are curl-free as the curl is zero. As the notation curl-free has been established in the literature (see e.g., Drake et al. (2021) and Drake et al. (2022)) also beyond the case of \({\mathbb {R}}^3\), we stick to it here.

Before proceeding with our formulation, we observe that this requirement is obviously fulfilled if the model neglects coupling between DOFs, i.e., if \({\mathcal {F}}^\prime _\mu ={\mathcal {F}}^\prime _\mu (x_\mu )\), as holds for, e.g., Schmid et al. (2020) and Mörl et al. (2020), or for symmetric stiffness matrices (Kövecses and Angeles 2007). On the other hand, the condition of Eq. (2) is automatically satisfied with the existence of a pot ntial U satisfying Eq. (1), as we can then rewrite Eq. (2) as \(-\partial _{x_\nu }\partial _{x_\mu } U = -\partial _{x_\mu }\partial _{x_\nu } U\). The key to developing mechanical energy-conserving surrogates is, thus, to create a model for the force potential U and, afterwards, derive single force and torque components from it. In the following subsections, we present two different realisations of these elastic surrogate models including a coupling of different DOFs: polynomial models (Sect. 2.2) and greedy kernel models (Sect. 2.3).

We remark that from the practical point of view, for the computation of both types of models, we applied common preprocessing steps to the data. These include especially data cleaning (removing doubled points, removing outliers) and data transformation (scaling of the input to \([-1, +1]\)). As scaling of the inputs also affects the outputs due to their relation from Eq. (1), no independent scaling of the outputs was applied. Further, no dimensionality reduction was applied as all features are inherently important for the conservation of energy.

2.2 Energy-conserving polynomial models

In this subsection, we briefly present well-known polynomial models and elaborate how to use them for energy-conserving modelling. For our numerical experiments, we will consider polynomial models of degree two, three and four. For these low degrees, we can write generic polynomials as

$$\begin{aligned} q_2(\varvec{x})&= \sum _{i=1}^{6} p_i x_i + \sum _{i,k=1}^{6} p_{ik} x_i x_k\quad ,\nonumber \\ q_3(\varvec{x})&= q_2(\varvec{x}) + \sum _{i,k,j=1}^{6} p_{ikj} x_i x_k x_j\quad ,\nonumber \\ q_4(\varvec{x})&= q_3(\varvec{x}) + \sum _{i,k,j,l=1}^{6} p_{ikjl} x_i x_k x_j x_l\quad , \end{aligned}$$
(3)

where \(p_i, p_{ik}, p_{ikj}, p_{ikjl}\), \(1\le i,j,k,l\le 6\), are the free parameters of the model. Here, we omit the 0-degree monomial as explained later.

In order to predict the vector valued force-torque pair, simply using vector valued coefficients to map the single vector components independently does not result in energy-conserving surrogates as elaborated in Sect. 2.1. Instead, we predict the underlying (unknown) potential \(U(\varvec{x})\) from Eq. (1) by fitting its derivative values, i.e., determining the coefficients of the polynomial model from Eq. (3) such that \(\nabla U(\varvec{x})~\approx ~\nabla q_\ell (\varvec{x})\) for some \(\ell =2,3,4\). As the values of the potential are unknown, we use the derivative values—which are provided as response forces and torques in force-displacement data sets—for the approximation. As an example, for \(\mu =1, \ldots , 6\) we use in the case of a fourth-order polynomial the equations

$$\begin{aligned} {\mathcal {F}}^\prime_\mu= & {} \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;\;-\frac{\partial q_4(\varvec{x})}{\partial x_\mu } \nonumber \\\!\!\!\!\!\!\!\!\!\!\!\!= & {} -p_\mu -2 \sum _{i=1}^{6} p_{\mu i} x_i -3 \sum _{i,k=1}^{6} p_{\mu ik} x_i x_k\nonumber \\{} & {} \;\;\;\;\;\;\;\;+ 4 \sum _{i,k,j=1}^{6} p_{\mu ikj} x_i x_k x_j \quad . \end{aligned}$$
(4)

The prefactors 2, 3 and 4 were explicitly included as we exploit the symmetry of the tensors \(p_{\mu i}, p_{\mu i k}\), such that we can reduce the number of independent parameters. Note that the coefficient \(p_{\mu }\) in Eq. (4) can in general be nonzero when the IVD is pre-strained and therefore exerts forces in the reference position where all displacements are zero, i.e., \(\varvec{x}=0\).

The surrogate is obtained from imposing Eq. (4) for \(\mu =1, \ldots , 6\) and for all \(\varvec{x}\) in the data set. This results in a linear equation system which is, however, not necessarily square in general, as the number of equations (\(d \cdot N\)) usually does not match the number of (independent) parameters for the polynomial model from Eq. (3). Thus, the linear equation system is solved in a least-square sense. The corresponding calculations were done in Python using PyTorch (Paszke et al. 2017).

We remark that the 0-degree monomial is omitted in Eq. (3) as they are cancelled by differentiation and do not appear in Eq. (4). This corresponds to the fact that a potential is only uniquely defined up to the addition of constant terms.

2.3 Energy-conserving kernel models

In the following, we use kernel models within this theory of conservative force-torque field (Wendland 2005; Fasshauer and McCourt 2015). The reader is referred to Appendix A for background information and an introduction into kernel methods with a focus on sparse kernel models obtained by greedy strategies. In fact, kernel methods generalise the polynomial models from the previous subsection, as those can be obtained using the so-called polynomial kernels. Thus, this broader class of methods allows for more sophisticated and accurate mathematical models.

We use the Gaussian kernel \(k(\varvec{x},\varvec{z}) \in {\mathbb {R}}\) (see Appendix A, Eq. (7)), which has been used to train surrogates of single components of the IVD responses before (Wirtz et al. 2015; Haasdonk et al. 2021). However, here we exactly ensure the energy conservation property of Eq. (2) by using a curl-free variant of the Gaussian kernel (Drake et al. 2022), that is a matrix-valued kernel given by

$$\begin{aligned} \varvec{k}_{\textrm{cf}}(\varvec{x}, \varvec{z}) :=&-\nabla _{\varvec{x}} \nabla _{\varvec{z}}^\top k(\varvec{x},\varvec{z}) \in {\mathbb {R}}^{6 \times 6}, \nonumber \\ \Rightarrow k_{\textrm{cf}}(\varvec{x}, \varvec{z})_{\mu \nu } =&-\partial _{x_\mu } \partial _{z_\nu } k(\varvec{x}, \varvec{z})\quad , \end{aligned}$$
(5)

for \(\mu , \nu = 1,..., 6\). By choosing a dimensionality of \(d=6\), we account for the six DOF in an intervertebral joint. Then, the force-torque vector can be represented by the sparse kernel interpolant (see Eq. (8) for comparison), \({\mathcal {F}}^\prime \approx \varvec{s}_n(\varvec{x})\), with vector-valued coefficients \(\varvec{\alpha }_i \in {\mathbb {R}}^6\) for \(i=1,..., n\) (where the number n of used training points is a small subset of the training data inputs \(\{ \varvec{x}_i\}_{i=1}^N, n\ll N\))

$$\begin{aligned} \varvec{s}_n(\varvec{x}) = \sum _{i=1}^n \varvec{k}_{\textrm{cf}}(\varvec{x}, \varvec{x}_i) \varvec{\alpha }_i \end{aligned}$$
(6)

with a common potential energy of all interpolant components,

$$\begin{aligned} U_{\varvec{s}_n}(\varvec{x}) = \sum _{i=1}^n\sum _{j=1}^d \left( \nabla _{\varvec{z}} k(\varvec{x}, \varvec{z}=\varvec{x}_i) \right) _j (\varvec{\alpha }_i)_j\quad . \end{aligned}$$

The vectorial coefficients \(\varvec{\alpha }_i\) can be determined from the interpolation conditions \(\varvec{s}_n(\varvec{x}_i) = \varvec{y}_i\) (with n of the training data outputs \(\{\varvec{y}_i\}_{i=1}^n\) assigned to the kernels at \(\{\varvec{x}_i\}_{i=1}^n\)).

Note that several selection criteria give rise to meaningful subsets \(X_n \subset X_N\). In this paper, we use a \(\gamma\)-stabilised greedy algorithm, which is a modification of standard greedy algorithms to yield more stable approximants \(\varvec{s}_n\), see Wenzel et al. (2021). An implementation of the stabilised greedy algorithms for various kernels with a recent extension to the curl-free Gaussian kernel mentioned above in arbitrary dimensions \(d \in {\mathbb {N}}\) can be found in the energy-conserving Vectorial Kernel Orthogonal Greedy Algorithm (VKOGA) implementationFootnote 2 (Santin and Haasdonk 2021; Wenzel et al. 2021).

For the actual calculation of the models for all IVDs, we apply this VKOGA algorithm to the preprocessed data with a stabilisation parameter of \(\gamma =0.05\). In particular, we run a 5-fold cross-validation to search for the best hyperparameters, precisely the best kernel width parameter \(\epsilon\). As a search grid we used \(\{\epsilon \}~\in ~\{ 0.5, 0.7, 0.9, 1.2 \}\). The maximal expansion size \(n_\textrm{max}\) of the model was set to \(n_\textrm{max} = 500\).

3 Results

Both approaches from Sect. 2.2 and Sect. 2.3 were applied in the context of a digital human spine model to join the advantages of FE simulations of single tissue models and muscle-driven MB simulations with forward-dynamic capabilities. At the example of a subject-specific anatomy, we built surrogate models of detailed FE IVD representations, and implemented these in a MB thoracolumbar spine model (for generation of artificial training and test data, see Appendix D).

To evaluate the performance and accuracy of the kernel and polynomial modeling approach, we juxtaposed the different surrogate predictions with respect to the reference values gained using the underlying detailed FE IVD model. We started with an analysis of the models’ accuracy on the training data, followed by random six-DOF test displacements and ending with typical whole spine motions. This validation process is conducted exemplary for the L4\(\vert\)5 IVD geometry but could analogously be transferred to any other intervertebral level.

Additionally, we provide a verification test for elastic surrogate models (exemplary on the new kernel approach) in Appendix F.

3.1 Accuracy on training and six-DOF test data

All surrogate models were created using single, two- and random six-DOF force-displacement training data sets. Their performance, in terms of predicted force precision, will here be compared to the training data itself and an additional six-DOF test data set. The reader is referred to Appendix D.1 for details about the data generation. Model predictions are compared to the reference data from the detailed FE model for all polynomial and kernel surrogates of the L4\(\vert\)5 IVD with respect to the average relative errors of force and torque values in Table 1.

For all surrogates, we found comparable relative errors throughout the different parts of the training data set. As expected, the second-order polynomial model showed the largest errors of on average 23.3% for forces and 16.9% for torques, followed by the third-order polynomial with 16.3% for forces and 15.9% for torques. The errors decreased with higher levels of polynomial order down to around 12.2% for forces and 9.6% for torques predicted by the fourth-order polynomial model. The lowest relative errors of, on average, 5.8% for forces and 6.5% for torques occurred using the kernel model.

A closer look at the single-DOF displacement trajectories, see Fig. 2, revealed that all models complied with the data and could be taken into consideration for use in biomechanical models. Only the deviations from the reference data, see Fig. 3, exhibited a clear difference between the IVD surrogate types. All models showed a similar pattern for lateral and anterior-posterior shear as well as for the axial rotation, whereas for lateral and frontal flexion and for compression, the fourth-order polynomial and kernel model were closer to the reference data, see Fig. 3.

Considering isolated force and torque components in the single-DOF simulations, we observed large relative errors in zero-crossing regions, i.e., when absolute values are small, across all surrogates and DOF. These errors decreased rapidly (but not monotonically) with larger excursions.

Polynomial models showed relative errors of 21.5%/ 20.1%, 15.7%/ 15.2% and 11.8%/ 10.5% for the forces/ torques predicted by the second-, third- and fourth-order polynomial on six-DOF test data. This was in the same range as the errors for the training data presented above. In contrast, for the kernel model, we saw a marked jump to 7.5% and 8.1% in the averaged force and torque errors, respectively. Thus, we presume that kernel models tend to be very precise in regions close to the training data but loose accuracy further away. This seems not to be the case for the polynomial models at the example of mapping 6d elastic IVD responses. However, judging from this set of training and random test data, the kernel model outperforms all polynomial models, especially on the full training data set but also markedly for the six-DOF test data where it still exhibits about 30% lower errors as compared to the most accurate polynomial model in this study.

Fig. 2
figure 2

Comparison of single-DOF results for the detailed FE model of the L4\(\vert\)5 disc (black line) and the surrogates derived from it: kernel (blue line) and polynomial approximation (in red: second-order dotted line, third-order dashed line, fourth-order solid line). The six reaction force and torque components are displayed as a function associated with their corresponding displacement DOF and act in the opposite direction, i.e., anterior displacement results in a posteriorly (AP) directed shear force, lateral displacement or rotation to the right (negative lateral displacement values or positive lateral rotation angles) produces left lateral shear forces or bending moments, respectively, an axial compression (negative axial displacement) leads to tensile forces in the IVD, and a rotation about the caudo-cranial axis to a restoring force towards the relaxed position. Note that the pre-load of the IVD in the zero-displacement position causes an asymmetry in the compressive force values for positive and negative axial displacements

Fig. 3
figure 3

Deviations of the force and torque predictions of surrogate models from the reference data for single-DOF trajectories (compare Fig. 2)

Table 1 Average relative errors of predicted forces and torques (in \(\%\)) of the different IVD surrogate models (second-, third- and fourth-order polynomial and kernel approximation) with respect to the FE measurements for single-DOF (1D), two-DOF (2D) and random six-DOF (6D) force-displacement training data as well as unseen random six-DOF data which was not used for model training but as test data (6D test). The errors were computed as the norm of the respective absolute error vector normalised by the reference force or torque amplitude. In simulations with an asterisk, force and torque errors were calculated only for movement trajectories with nonzero translational or rotational displacements, respectively, in order to avoid domination of the zero-displacement state in the relative error estimation

3.2 Accuracy for spinal bending movements

In the second part of the model analysis, we investigated the IVD model precision on specifically relevant movement trajectories for biomechanics exemplary for three bending motions: forward bending, lateral bending to the right, and lateral bending to the left. For this, we created artificial kinematic data for the entire spine stemming from MB simulations of the subject-specific spine with kernel surrogates included, see Appendix D.2. The detected force-displacement data were not only applied to the detailed model to gain reference values but also to the three polynomial models to facilitate a comparison of the surrogate models’ accuracy on the same kinematic data.

For the forward and lateral bending motions, see Fig. 6, we observed a different model behaviour as seen before in Sect. 3.1. The performance of the surrogate models is depicted in Figs. 4, 9 and 10. First of all, the absolute values of forces and bending torques are lower as compared to in vivo measurements since head, neck and arm masses are neglected in the underlying NMS thoracolumbar spine model. However, this does not affect the surrogate model itself. Second, the relative errors were—on average—lower as compared to the six-DOF test data set, which can be attributed to the fact that there were no zero-crossings in the force and torque amplitude.

The mean relative errors varied a lot for the lower-order polynomials ranging from 6.6% to 20.7% for the second and 3.7% to 17.8% for the third-order polynomial with the largest errors occurring for the torque predictions. In contrast, the fourth-order polynomial and kernel model exhibited the most consistent relative errors of on average 6%/ 6.9% and 5.5%/ 7.4% for force/ torque error, respectively, see Table 2. An additional analysis of the errors of single force components and the measured elastic responses for the lateral bending can be found in Appendix G. Nevertheless, fourth-order polynomial and kernel approximation seem to be comparably precise in the spine bending test cases for the L4\(\vert\)5 IVD. Both outperform the commonly used linear stiffness matrix significantly.

Fig. 4
figure 4

Kernel (blue line) and polynomial approximation predictions (in red: second-order dotted line, third-order dashed line, fourth-order solid line) of the three force and torque components for a forward bending movement trajectory in comparison with the reference data from the detailed FE model (black lines) for the same kinematics

Table 2 Average relative errors (in \(\%\)) of predicted forces and torques of all four surrogate models (second-, third- and fourth-order polynomial and kernel approximation) for typical spine movements: forward flexion (FF), left (LLB) and right lateral bending (RLB). Kinematic data were generated using an individualised MB spine model in combination with level-specific IVD kernel models

4 Discussion

To the best of our knowledge, this work is the first that combines the approach of creating a surrogate for a force potential with current sophisticated algorithms for nonlinear and high-dimensional force mapping in the context of biomechanics. By that, we present two novel surrogate architectures that intrinsically include the elastic nature of the IVD tissue: one based on kernel algorithms and one based on polynomial approximations. At the same time, we consider the large mutual impact of the different DOFs of an intervertebral joint onto their corresponding force and torque contributions. The resulting kernel and fourth-order polynomial models showed average relative errors of below 10% for all test and training data sets. Hence, when implemented in a MB simulation model, they can mimic the forces and torques of multiphasic, detailed IVD models onto the adjacent vertebral bodies with high precision compared to the second-order polynomial for the force potential. The latter corresponds, in fact, to the stiffness matrix of commonly used bushing elements. In this way, we are capable of simulating movements of the entire thoracolumbar spine at low computational costs. However, the usefulness of a surrogate depends largely on the appropriateness of its database. Alongside the quality of the different training and test data sets, we will discuss the biomechanical and methodological benefits and challenges of such surrogate models.

4.1 Biomechanical quality of the training data and elastic modelling approach

While it would be valuable to develop the surrogate models using in vitro results for a cadaveric joint, whereby multi DOF loading is applied to a single IVD to provide multi-axis loading data for mathematical modelling, this is not feasible nor realistic due to specimen fatigue, creep and sub-failure micro-damage. As a basis for data acquisition, we therefore decided to use a detailed FE IVD model that has already been used in the clinical environment in the context of scoliosis surgery (Little et al. 2008; Little and Adam 2011). Validation of subject-specific osseoligamentous FE spine models including this IVD model has shown good agreement between experimentally measured stiffness and predicted stiffness for both joint level and overall spinal responses. Similarly, an analysis of the single-DOF force and torque predictions reveals a basic consistency with literature data (see Appendix E).

The generated surrogates shall serve as ideally elastic representations of the IVD responses. Since physiological IVD tissue has poroelastic as well as viscoelastic properties (Costi et al. 2008), the presented models can only approximate the IVD mechanics in a specific use case, i.e., under physiological loading conditions and strain rates. Consequently, the underlying FE model uses material parameters which are stiffer as compared to quasi-static experimental setups (e.g., from Berkson et al. (1979) and Schultz et al. (1979)), where creep-effects reduce the measured forces and torques. Additionally, a physiological pre-load increases linearity of force-displacement relations (Gardner-Morse and Stokes 2003; Zhang et al. 2020) which are nonlinear when starting in a relaxed state. Future representations of the IVD may include velocity dependent effects on stiffness and damping behaviour.

In this work, IVD surrogates were developed for the intervertebral level-, and subject-specific IVD mechanics for the anatomy of the VM (Spitzer et al. 1996)). Although it is known that the elastic responses of an IVD vary considerably interpersonally and with degeneration, these observations have rarely been quantified experimentally. Thus, subject-specific geometric differences, such as IVD conicity, are neglected in most MB spine simulation models, as there exists no scaling law for IVD forces on the adjacent vertebra endplates. Moreover, there is currently no non-invasive approach to prescribe mechanical soft tissue properties of subject-specific intervertebral discs. These facts restrict the IVD representations in NMS MB spine models to population-based and overly simplified models. Integrating surrogates of detailed, individualised IVD models into NMS models can partly overcome this challenge. We expect the level- and subject-specific IVD characteristics to substantially affect the load distribution determined in spine simulations as compared to population-based stiffness matrices averaged over several spinal levels as in Meszaros-Beller et al. (2023).

In our approach, mechanical responses of the detailed IVD model only scale with the subject-specific geometry while keeping the tissue stiffness parameters constant—disregarding possible gender, ageing, or traumatic effects. Ideally, multidimensional force-displacement data comprising material parameters derived from healthy and degenerated tissue would be used to set up a parametrised model that can be adapted to the subject’s conditions. This is, however, not a limitation of the general methodology presented in this work.

The kinematics used for the comparison between different IVD modelling approaches was gained from a predictive simulation of a NMS MB spine model. Even though the generic baseline model, from which the individualised model was derived, forecasted internal loads that were in good agreement with existing data on IVD, muscle and ligament forces during forward bending, the resulting movement trajectory might differ from the subject’s actual kinematics for the same bending task. A different joint angle trajectory could possibly lead to areas where models perform differently, i.e., with higher or lower average relative errors. However, we do not anticipate any fundamental changes in our conclusions when applying other displacements to the models.

4.2 Challenges and benefits of the new methodology

For some detailed IVD models, it was not possible to collect data in the desired range of motion due to convergence issues of the FE numerics in the ABAQUS 6.14 solver. This was especially the case for very thin or conically shaped geometries and is expected to generally occur for degenerated thoracic and pathological IVDs. The different surrogate models for these IVDs were then trained with the same amount of data points but on a smaller displacement space. We initially intended to also model the L5\(\vert\)S1 IVD but since the FE simulations were restricted to a range below the compression levels reached during gravitational settling process, the MB simulations failed when including a L5\(\vert\)S1 kernel surrogate model into the individualised spine. Instead, we kept the bushing element for L5\(\vert\)S1 from Meszaros-Beller et al. (2023).

As described in Appendix D.1, we recorded data along trajectories starting from and returning to the starting point of zero displacements—therewith following a similar protocol as possible experimental data sets on coupled DOFs in the future. Thus, we find a data-rich area around this point. In contrast, only 200 displacement trajectories were used to detect forces in the entire 6d space resulting in data-sparse areas for the highly coupled DOF motions and specifically at the boundaries. The bending simulations, however, started from an equilibrated position under gravitational loads which is not equivalent to the zero-displacement position referred to as the initial state. This explains the slightly larger deviations from the detailed model at the beginning of the movement simulations (see Figs. 4, 9 and 10) as compared to the purely one-dimensional displacement predictions (Fig. 3). A more systematic scan of the 6d space enclosed by the physiological boundaries defined in Appendix D.1, or further data points would presumably improve model performances for unknown trajectories such as the spine bending motions. In particular, the kernel approximation could profit as it shows the only marked jump in relative errors from training to unseen test data probably due to the inhomogeneously distributed locations of the kernel centres.

We observed a systematic decrease in relative errors for an increasing polynomial order of the surrogates. However, the underlying algorithm uses tensor multiplications with tensors of the same order as the polynomial. This complicates the model development when adding terms of higher order to capture the high-dimensional coupling of DOFs. In contrast, kernel approximations can easily handle functions with high-dimensional inputs with calculations being based on common matrix multiplications.

The higher accuracy and improved practicality for 6d fitting comes with an increasing number of model parameters. While the second-order polynomial has only 21 independent values in the stiffness matrix and the 6 forces and torques in the zero-displacement state (in total 27), the third-order polynomial requires additional 56 parameters for the symmetric third-order tensor (83 in total), and a fourth-order polynomial needs additional 126 parameters to describe the symmetric fourth-order tensor (summing up to 209 independent parameters). The actual number of kernel surrogate parameters varies depending on the number of kernels used by the algorithm, which was limited to a maximum 500. For the L4\(\vert\)5 kernel model, this was 399 kernels with 6 parameters for the corresponding vectorial weighting factor plus one for the kernel width (in total 2,394 independent model parameters). In the current work, the focus laid on including a given data set—here, the finite element model mechanics—into a multibody environment with the highest possible precision. Thus, number of model parameters played a minor role. However, a larger model complexity might cause higher computational cost, or be relevant for certain applications. This effort is—for both the polynomial and kernel model—orders of magnitude lower than the underlying finite element model, and reasonably small in our tests.

While kernel models perform excellently on the range of motion defined by the boundaries of the data set (see Appendix D.1) and particularly on the training data, a limitation of the kernel surrogate models is its limited generalisation performance to unseen physical regimes of the data: the used Gaussian kernel is a localised function, and as all the recorded data lies in a bounded 6d box, the kernel surrogate model can only be expected to perform well within this box. Outside the boundaries, its performance will likely deteriorate and approach zero for larger excursions. This can be seen as a benefit of the polynomial models, as they actually can suitably predict growing forces and torques for increasing displacements, even outside the box of collected data points. A future extension of this project would be to combine kernel and polynomial approaches in order to benefit from the extrapolation capabilities of polynomial models and, at the same time, from the precision of kernel models especially in data-rich regions.

Also, it seems worth investigating further strategies to improve the accuracy of the surrogates. Higher-order polynomials or different kernel types, e.g., polynomial kernels, might be an alternative choice for the mostly monotonous force-displacement curve.

In Appendix F, we present a test to verify the elastic soft-tissue models with respect to energy conservation. The results demonstrate that surrogates developed here do, indeed, conserve energy—in contrast to previously presented approaches such as Haasdonk et al. (2021). This analysis also shows that although the training data were gained with an ideally elastic FE IVD model, the surrogates with independently fitted output forces and torques are not fully elastic. Even if computed net work was low for the non-conserving force in the chosen example, errors can add up over time particularly when several of such energy sinks or sources are implemented. This highlights the necessity to incorporate fundamental physical principles in the surrogate development process.

5 Conclusion

With the presented work, we demonstrate a way to design and develop surrogate models for multidimensional data sets of the elastic responses of soft tissues such as IVDs. By deriving all force and torque components from one single force potential, we ensure the conservation of mechanical energy in the modelled material. In combination with greedy kernel and polynomial approximations of detailed FE IVD models, we gained precise models capable of predicting the nonlinear and coupled force-displacement behaviour of IVDs in a MB spine simulation. Using this approach, it is also possible to increase the level of individualisation in subject-specific MB spine models with respect to level- and subject-specific IVD characteristics.