1 Introduction

Tumors are complex diseases characterised by high diversity and incidence, Sung et al. (2021). One of the most crucial and lethal processes is represented by the increased ability of cancer cells to migrate and invade other organs during the so-called metastatic spread, Dillekås et al. (2019). In the primary tumor, cancer cells could acquire useful mutations that allow them to penetrate different barriers and to disseminate themselves into secondary organs. Then, we observe the transition from an in situ stage to an invasive one. It is now well known that metastatic cancer cells typically move in clusters which have greater predisposition of forming metastasis than single cells, Aceto et al. (2014); Bubba et al. (2019); Hong et al. (2016).

Invasion of cells through layers of extracellular matrix is a key step in tumor metastasis, inflammation, and development. The process of invasion involves several stages, including adhesion to the matrix, degradation of proximal matrix molecules, extension and traction of the cell on the newly revealed matrix, and movement of the cell body through the resulting gap in the matrix, Friedl and Wolf (2010). Each of these stages of invasion is executed by a suite of proteins, including proteases, integrins, GTPases, kinases, and cytoskeleton-interacting proteins. One of the most difficult barriers for cells to cross is the basement membrane, also composed by extracellular matrix (ECM). This kind of membrane separates the epithelial tissue from the connective one and, among its functions, we can distinguish a supportive role and an isolating one. Tumor cell invasion is associated with an enhanced capability of tumor cells to degrade ECM. Tumor cells produce several hydrolytic enzymes including matrix metalloproteinases (MMPs). This process is, for instance, observed for breast tumors, as considered in our study.

Many questions concerning invasion details remain unanswered. Over the last decade, the research interest on this process is increasing in order to highlight the main cues with the aim of controlling and treating the phenomenon ( Chaplain et al. (2019); Ciavolella et al. (2023); Franssen et al. (2019); Gallinato et al. (2017); Giverso et al. (2022)).

In particular, we are interested in better describing one of the main biological phenomenon responsible for the invasion one, which is membrane degradation. In fact, in vitro invasion investigations (using the XCELLigence technology, Connolly and Maxwell (2002); Ke et al. (2011); Martinez-Serra et al. (2014); Obr et al. (2013); Türker Şener et al. (2017); Zaoui et al. (2019)) are not able to give details on this process. Consequently, we build here a mathematical model which describes degradation of an ECM-like biological membrane through the production of MMPs by cancer cells. At the same time, we provide numerical simulations with a sensitivity analysis followed by a parameters estimation study. The work is completed by experimental results consisting of cells seeded in wells containing at their bottom fluorescent gelatin. We focus the attention on breast cancer, the most common malignancy among women. The Triple Negative Breast Cancer (TNBC) patients are mainly treated with combinations of chemotherapy with severe side effects and afrequent recurrence of the metastasis. If non-metastatic, it has high chances of healing, but, on the contrary, advanced breast cancer with metastasis are considered incurable with the present therapies, Harbeck et al. (2019); Waks and Winer (2019).

The outline of the paper is as follows. In Sects. 2 and 2.1, we introduce the mathematical model and its dimensionless form. Section 3 is divided in two subsections describing the sensitivity analysis and the parameter estimation problem. In Sect. 3.1, we briefly present the concept of sensitivity analysis and then the results on our model. The same is done in Sect. 3.2, in the case of both synthetic data and experimental ones. The subsection ends with numerical simulations which are compared with the biological experiments. Finally, in Sect. 4, we conclude the paper presenting also some perspectives that both improve our work and develop it. At the end of the paper, the reader can find the Supplementary Materials S5 in which we give more details concerning the numerical method behind the simulations presented.

2 Mathematical Model

We build a mathematical model describing the invasive behaviour of tumor cells on a gelatin coated plate. Their movement is influenced by the gelatin, since their primary role is to degrade it, due to MMPs enzymes. The major hypothesis is that cells degrade gelatin below them and, after, they move to degrade around. Experiments capturing this behaviour are realised using the QCM\(^{\textrm{TM}}\) Gelatin Invadopodia Assay. Tumor cells are plated on a culture surface coated with a thin layer of green fluorescent gelatin and they are filmed over several days. Videos can show both cells movement and the consequent creation of holes in the gelatin.

In accordance with this kind of experimental setting, we consider a domain \(\Omega \) representing a top view of a single plate. We do not include the third dimension because the thickness of the gelatin layer is not significant compared to cells size. Consequently, the movement is only realised on a surface. For simplicity, we take the rectangular two-dimensional domain \(\Omega \). For \(x\in \Omega \), \(t>0\), we consider a three species system for cells density u(xt), MMPs enzymes concentration m(tx), and the damage function \(d(t,x)\in [0,1]\), related to the amount of gelatin \(q(t,x)=1-d(t,x)\in [0,1]\). Equations write as

$$\begin{aligned} \left\{ \begin{array}{ll} \partial _{t}u=div(D(d)\nabla u), &{} \text{ in } \Omega ,\\[1ex] \partial _{t}m=D_m \Delta m + \beta (1-d)u-\alpha m, &{} \text{ in } \Omega ,\\[1ex] \partial _{t}d=\gamma m (1-d), &{} \text{ in } \Omega , \end{array} \right. \end{aligned}$$
(1)

where

$$\begin{aligned} D(d)=D_L d+D_G(1-d)= D_G+(D_L-D_G)d. \end{aligned}$$

We impose no-flux boundary conditions on \(\partial \Omega \) for u and m, i.e.

$$\begin{aligned} \left\{ \begin{array}{ll} {D(d)\nabla u\cdot {\textbf {n}}} =0,\\[1ex] D_m \nabla m \cdot {\textbf {n}}=0, \end{array} \right. \end{aligned}$$

where n is the outward unit normal at the boundary. We complete the system with the following initial conditions

$$\begin{aligned} \left\{ \begin{array}{ll} u(0,x)=u_0(x),\\[1ex] m(0,x)=m_0(x)=0,\\[1ex] d(0,x)=d_0(x)=0, \end{array} \right. \end{aligned}$$

with \(u_0\) a random function on \(\Omega \).

Equation for u. The first Eq. in (1) describes the evolution of the cells density. The diffusion coefficient depends on the diffusion \(D_L>0\) into the liquid and \(D_G>0\) on the gelatin, with \({D_L >D_G}\). We do not consider the case \({D_L = D_G}\), otherwise we would have a standard diffusion equation. We observe that when the gelatin is intact (\(d=0\), then \(q=1\)), cells move randomly on the gelatin, whereas when it is completely destroyed (\(d=1\), then \(q=0\)), cells diffuse into the liquid. The expression of D(d) is not new in applications. In the literature we can find it, for example, in models describing the chemical transformation of calcium carbonate stones under the attach of sulphur dioxide, Aregba-Driollet et al. (2004); Guarguaglini and Natalini (2007).

Equation for m. The second equation is a reaction-diffusion like equation for the MMPs with diffusion coefficient \(D_m>0\), production rate \(\beta >0\) and death rate \(\alpha >0\). In particular, production of MMPs is due to cells and to the fact that they sense the gelatin below them. Moreover, we assume that MMPs diffuse locally and, consequently, we add the condition

$$\begin{aligned} \sqrt{\frac{D_m}{\alpha }} \ll 1, \end{aligned}$$

where \(\sqrt{\frac{D_m}{\alpha }}\) is homogeneous to \(\sqrt{x^2}\) and corresponds to the diffusion length.

Equation for d. Finally, the equation for the damage derives from the damage mechanics, Kachanov (1986). It can be written in terms of the gelatin q as

$$\begin{aligned}\partial _{t}q=-\gamma m q,\end{aligned}$$

which has an exponential decreasing solution. The damage is produced at rate \(\gamma \) by the MMPs m.

Remark 2.1

The choice of a macroscopic model is dictated by the high number of tumor cells in experiments. Moreover, it is more interesting to look at the general behaviour of cells instead of the single individual. Indeed, tracking them, we could observe that over 72 hours they make local movements around the initial position and at very low velocities. No particular interactions between cells could then be taken into account. Finally, a hybrid approach could be introduced, but this would largely increase the number of parameters, adding complexity in their estimation. We deduce the multiple advantages of our simple PDE model that is able to describe degradation phenomena.

2.1 Dimensionless Model

Before analysing the model, we propose a nondimensional form. It has several advantages, as the reduction of the number of parameters and the fact that their units are unimportant, see Murray (2002); Segel (1972). Upon changes of time and space variables

$$\begin{aligned} {\tilde{t}}=\alpha t, \quad {\tilde{x}}= \sqrt{\frac{\alpha }{D_m} }x, \end{aligned}$$

and appropriate scaling for um,  and d, namely

$$\begin{aligned} \begin{array}{l} u(t,x)={\tilde{u}} \left( \alpha t, \sqrt{\frac{\alpha }{D_m}} x\right) ,\\ m(t,x)= \frac{\alpha }{\gamma }{\tilde{m}} \left( \alpha t, \sqrt{\frac{\alpha }{D_m}} x\right) ,\\ d(t,x)= \frac{D_m}{D_L-D_G} {\tilde{d}} \left( \alpha t, \sqrt{\frac{\alpha }{D_m}} x\right) , \end{array} \end{aligned}$$

we find a parametrised version, again with homogeneous Neumann boundary conditions. For simplicity in notation, eliminating the tilde, we find

$$\begin{aligned} \left\{ \begin{array}{ll} \partial _{t}u=div((\theta + d)\nabla u), &{} \text{ in } \Omega ,\\ \partial _{t}m= \Delta m + k_1 (1-p d) u- m, &{} \text{ in } \Omega ,\\ \partial _{t}d=\frac{1}{p} m (1-p d), &{} \text{ in } \Omega , \end{array} \right. \end{aligned}$$
(2)

where \(\theta =\frac{D_G}{D_m}, p=\frac{D_m}{D_L-D_G}, k_1= \frac{\gamma \beta }{\alpha ^2}\). Here, \(D( d)= \theta + d\). We assume all parameters positive, except p that satisfies the restriction \(p\le 1\), since we require \(1-pd\ge 0\).

3 Sensitivity Analysis and Parameter Estimation

We provide a parameters estimation study for the dimensionless Model (2). Indeed, parameters are not accessible by direct measurements from experiments. To test the estimation technique, it is interesting to work at first with synthetic data. The idea is that we create a numerical (synthetic) solution, through our Model (2), using parameters from the literature. Then, we find the best set of parameters that gives reasonable solutions and we can evaluate the error made in the choice of parameters, see Sect. 3.2.

Another useful study that can be done in parallel is the sensitivity analysis, see Sect. 3.1. It provides us with an instrument to examine how the choice of the parameters affects the model dynamics. This is also a key indicator in the case in which the error between one of the optimal parameters and its corresponding initial estimate is too big. In fact, the sensitivity analysis could show us that some parameters do not have important effects on solutions. Thus, a huge error on their estimation is not so important.

At the end of Sect. 3.2, we provide an example of possible experimental data and we reiterate the consolidated procedure performed with synthetic data. We evaluate the optimal parameters describing the biological experiment and we provide numerical simulations of the model solutions. Except for this experiments paragraph, in the following we integrate the theory with our model analysis considering parameters taken from Di Costanzo et al. (2016); Braun (2021),

$$\begin{aligned} \begin{array}{c} {D_L=7\times 10^{-7}\, \textrm{cm}^2 \textrm{s}^{-1}, \quad D_G= 10^{-7}\, \textrm{cm}^2 \textrm{s}^{-1}, \quad D_m= 5\times 10^{-7}\, \textrm{cm}^2 \textrm{s}^{-1},} \end{array} \end{aligned}$$

and from Franssen et al. (2019),

$$\begin{aligned} {\alpha =2.5\times 10^{-6}\,\textrm{s}^{-1}, \quad \beta = 4.9\times 10^{-6}\,M\textrm{s}^{-1}.} \end{aligned}$$

Consequently, the nondimensional parameters are

$$\begin{aligned} \theta = 0.2, \quad p=0.83, \quad k_1= 0.78. \end{aligned}$$
(3)

3.1 Sensitivity Analysis

The lack of parameters availability from experiments imposes an uncertainty in the output of the model. To obtain as reliable results as possible, we have to study the influence of the parameters on the model dynamics through a local sensitivity analysis. Local methods are the simplest and the most common. They are based on a one-factor-at-a-time (OAT) method. It consists in perturbing one parameter at a time to see the effect on the output for each parameter of the model. Sensitivity analysis (SA) is then performed monitoring changes in the output through, for example, a derivative-based approach.

SA determines dependencies of input parameters \(\mathcal {Q}=(\mathcal {Q}_1,...,\mathcal {Q}_n)\) and output of the model \(\mathcal {Y}\). In the following we illustrate the procedure. The measure of the influence of the parameter \(\mathcal {Q}_i\) on the output \(\mathcal {Y}(\mathcal {Q})\) is calculated by the partial derivative with respect to this parameter. Since the model parameters differ by several orders of magnitude, we introduce a normalisation with respect to the mean value. Using the finite difference approach, the sensitivity of the output with respect to the parameter \(\mathcal {Q}_i\) is obtained as

$$\begin{aligned} S= \frac{\mathcal {Y}(\mathcal {Q}_1,...,\mathcal {Q}_i\pm \delta ,...,\mathcal {Q}_n)-\mathcal {Y}(\mathcal {Q}_1,...,\mathcal {Q}_n)}{\mathcal {Y}(\mathcal {Q}_1,...,\mathcal {Q}_n)} \;\frac{\mathcal {Q}_i}{\delta }, \end{aligned}$$

where \(\delta =0.05\, \mathcal {Q}_i\) to consider a \(5\%\) perturbation of the parameters.

For Our Model.

We report here the results obtained for our Model (2), using the dimesionless parameters from the literature in (3). We have evaluated the sensitivity of parameters \(\mathcal {Q}=[\theta , p, k_1]\) taking as output \(\mathcal {Y}\) the maximum value at the final time for cells density and the total mass of enzymes and gelatin at the final time. It is not interesting to consider the cells total mass since this is a conserved quantity. The OAT method does not examine the whole range of values of the input parameters as the global SA. However, this method is easier to define and computationally more efficient than the global sensitivity, for which different techniques are possible, see Saltelli et al. (2008).

In Table 1, we show for each perturbed parameter \(\mathcal {Q}_i\pm \delta \), the sensitivity \(S_\mathcal {Y}\) related to the output \(\mathcal {Y}\). Sensitivity should be less than one or around it but not much bigger in order to have low sensitivity of the model with respect to parameters. Since in our case it is at maximum around 1, we deduce that solutions have not a great influence on big changes in parameters. Of course, the bigger is the sensitivity, the bigger is the influence on solutions. Moreover, from Table 1, we deduce that the diffusion parameter does not play a remarkable role in the behaviour of enzymes m and degradation d, whereas it is the most ’critical’ parameter for cells evolution. In this case, in fact, p and \(k_1\) are of order \(10^{-2}\). More critical is the influence of p on degradation and \(k_1\) on both degradation and enzymes concentration, but still sensitivity is good enough. From SA, we infer that changing the literature parameters do not greatly affect solutions behaviour. Not knowing real solutions of the model, this is a good estimate to know about.

Table 1 We collect sensitivity values \(S_\mathcal {Y}\) per each parameter \(\mathcal {Q}_i\pm \delta \), with \(\delta =0.05\, \mathcal {Q}_i\). \(S_{\text{ max } u}\) is the sensitivity with output the maximum of cell density at the final time (72 hours), \(S_{\text{ mass } m}\) is the sensitivity with output the amount of enzymes again after 72 hours and the same for \(S_{\text{ mass } d}\) with respect to gelatin degradation

3.2 Inverse Problem: Parameter Estimation with Synthetic and Biological Data

With the knowledge that changing our parameters from the literature we do not modify too much solutions (Sect. 3.1), we can now estimate, through an inverse problem, the good parameters useful to recover reasonable solutions compared to synthetic and experimental ones.

A convenient reformulation of the inverse problem is to write it as a minimization problem.

Definition 3.1

Let \(F(\mathcal {Q})=\mathcal {Y}\) be the forward problem with \(\mathcal {Q}=(\mathcal {Q}_1,...,\mathcal {Q}_n)\) the parameters and \(\mathcal {Y}\) the numerical solution found with the finite difference scheme. The inverse problem is defined as a minimisation problem of the form

$$\begin{aligned} \mathcal {Q}_{opt}:= \mathop {\textrm{argmin}}\limits _{\mathcal {Q}\in \mathbb {R}^n} \Vert F(\mathcal {Q})-\mathcal {Y}_{exp}\Vert _{L^2}^2, \end{aligned}$$
(4)

where \(Q_{opt}\) are the estimated parameters and \(\mathcal {Y}_{exp}\) the experimental (or synthetic) data.

Unfortunately, inverse problems are usually ill-posed. Then, initial small perturbations can lead to large ones in the results. In other words, small errors between the solution of the forward problem and the experimental data can lead to arbitrary large errors between the given parameters and the estimated ones. Hence the necessity of a regularisation method to compute a stable approximation of the minimiser, Braun (2021); Braun et al. (2022). Thus, we extend the minimisation in (4) with a Tikhonov regularisation as

$$\begin{aligned} \mathcal {Q}_{opt}:= \mathop {\textrm{argmin}}\limits _{\mathcal {Q}\in \mathbb {R}^n} ( \; \Vert F(\mathcal {Q})-\mathcal {Y}_{exp}\Vert _{L^2}^2 + \lambda \Vert \mathcal {Q}-\mathcal {Q}_0\Vert _{L^2}^2 \; ), \end{aligned}$$

where \(\lambda >0\) is the regularisation parameter and the a priori estimate \(\mathcal {Q}_0\in \mathbb {R}^n\) represents an a priori knowledge about the solution.

From gelatin invadopodia assay, it is possible to extract information mainly concerning tumor cells, since it could be difficult to quantify the gelatin and the holes created by degradation. One of the reasons is that cells have their own fluorescence and this could alter the quantification of gelatin. Up to our knowledge, it is very difficult to have data on enzymes. Thus, we take \(\mathcal {Y}=u\) and \(\mathcal {Y}_{exp}=u_{exp}\). Ignoring \(\mathcal {Q}_0\) due to lack of a priori information on parameters, we infer that

$$\begin{aligned} \mathcal {Q}_{opt}:= \mathop {\textrm{argmin}}\limits _{\mathcal {Q}\in \mathbb {R}^n} ( \; \Vert F(\mathcal {Q})-u_{exp}\Vert _{L^2}^2 + \lambda \Vert \mathcal {Q}\Vert _{L^2}^2 \; ). \end{aligned}$$
(5)

Thus, if an unknown noise is included in the data, there might be different solutions that minimise \(\Vert F(\mathcal {Q})-u_{exp}\Vert ^2_2\), but among these solutions only the one which minimises \(\Vert \mathcal {Q}\Vert ^2_2\) is selected. In our case, the main interest will be in minimising the error among solutions. Thus, in the following we will choose very small values of \(\lambda \) such that the norm of \(\mathcal {Q}\) will not play a significant role.

Parameter Estimation with Synthetic Data for Our Model.

Before working on biological data, it is commonly used to test the estimation on the synthetic data. Therefore, we develop the procedure used to estimate parameters for Model (2) taking the parameters from the literature. Choosing the latter as in (3)

$$\begin{aligned}\mathcal {Q}_{lit} \quad =\quad [\theta =0.2, \qquad p=0.83, \qquad k_1=0.78],\end{aligned}$$

we derive numerical solutions um and \(q=1-d\). Synthetic solutions are created from the numerical solutions of the model with the literature parameters. We choose two different synthetic data: at first, we use as \(u_{exp}\) the exact numerical solution, and in a second step we consider a perturbation of it. Indeed, in both cases, the minimisation of the functional in (5) starts with \(F(\mathcal {Q})\), the numerical solution obtained with the perturbed literature parameters, and thus we aim to recover the best parameters such that \(F(\mathcal {Q})\) is close to \(u_{exp}\). So at first with \(u_{exp}\) the numerical solution of our Model (2), it should be easier to recover them, while adding a small perturbation, the idea is to reproduce an error that we also encounter with experimental data.

Thus, we perturb parameters with \(n =0.1\), 0.2 or 0.4 Gaussian noise, namely the perturbed parameters \(\mathcal {Q}_{pert}\) are composed by

$$\begin{aligned} {\theta _{pert}=\theta +\omega _\theta \, n \; \theta ,\qquad p_{pert}=p+\omega _p \, n \; p,\qquad k_{1_{pert}}=k_1+\omega _{k_1} \, n \; k_1,} \end{aligned}$$

with \(\omega _\theta \) and \(\omega _{k_1}\) random numbers in \([-1,1]\), whereas \(\omega _p\) is again a random number in \([-1,1]\) in the case \(n=0.1, 0.2\) and \([-1,0]\) if \(n=0.4\). Indeed, we recall that \(p\le 1\), so that \(1-pd\ge 0\). Then, we look for the best parameters that solve the inverse Problem (5) in an interval \(I_{lit} = [\mathcal {Q}_{lit}-0.5 \mathcal {Q}_{lit}, \mathcal {Q}_{lit}+0.5 \mathcal {Q}_{lit}]\) and starting with the perturbed parameters above. So, we evaluate the parameters that minimise the error \(E(\mathcal {Q})\) between our synthetic data and the new solution \(\widehat{u}\) obtained with one of the parameters \(\mathcal {Q}\) in the research interval \(I_{lit}\) plus the Tikhonov regularisation with \(\lambda =10^{-6}\), namely

$$\begin{aligned} E(\mathcal {Q})= \Vert \widehat{u}(t)-u_{exp}(t)\Vert _2^2 + \lambda \Vert \mathcal {Q}\Vert _2^2, \quad \text{ for } t=48 \text{ hours }. \end{aligned}$$

The choice of the time \(t=48\) hours is dictated by the dissipative behaviour of diffusion problems. Indeed, we can either loose information on parameters looking at the final time or not having them at all considering the initial time (initial data do not depend on parameters). Finally, we evaluate the relative error between the estimated parameters \(\mathcal {Q}_{opt}\) and the initial ones, i.e.

$$\begin{aligned} E_\theta =\frac{\Vert \theta -\theta _{opt}\Vert _2}{\Vert \theta \Vert _2}, \qquad E_p=\frac{\Vert p-p_{opt}\Vert _2}{\Vert p\Vert _2}, \qquad E_{k_1}=\frac{\Vert k_1-k_{1_{opt}}\Vert _2}{\Vert k_1\Vert _2}. \end{aligned}$$

We also estimate the relative error between the synthetic data \(u_{exp}\) and the solution with the estimated parameters, that is defined as

$$\begin{aligned} E_{\mathcal {Q}_{opt}}(t)=\frac{\Vert u_{exp}(t)-u_{\mathcal {Q}_{opt}}(t)\Vert _2}{\Vert u_{exp}(t)\Vert _2}, \quad \text{ for } t=0,24,48,72 \text{ hours }. \end{aligned}$$
(6)

Synthetic data: numerical solution

At first, we choose as synthetic data the numerical solution u. This means that the numerical solution of our model, found with the parameters from the literature, simulates the experimental data available. Thus, to find the optimal parameters with (5), we substitute \(u_{exp}\) with the numerical solution obtained with parameters from the literature and \(F(\mathcal {Q})\) with the numerical solution obtained with the set of optimal parameters found at each iteration, starting with the perturbed parameters.

Example 3.1

We consider a \(10\%\) perturbation on parameters \(\mathcal {Q}_{lit}=[\theta , p, k_1]=[0.2, 0.83, 0.78]\), randomly extracting \(\omega _\theta =-0.48\), \(\omega _p= -0.49\) and \(\omega _{k_1}=0.51\). We deduce

$$\begin{aligned}\mathcal {Q}_{pert}=[0.19,\quad 0.871,\quad 0.82].\end{aligned}$$

Solving the inverse problem in (5), we infer that

$$\begin{aligned}\mathcal {Q}_{opt}= [0.199,\quad 0.415,\quad 0.39].\end{aligned}$$

The errors on parameters are \({E_\theta =10^{-6}, \, E_p=0.5,\, E_{k_1}=0.5}\), whereas

$$\begin{aligned}E_{\mathcal {Q}_{opt}} < 8 \times 10^{-6}.\end{aligned}$$

Unfortunately, errors on p and \(k_1\) are high. However, as expected from the sensitivity analysis in Sect. 3.1, this does not impact solutions. In Fig. 1 left, we show the error \(E_{\mathcal {Q}_{opt}}\) between the data u and the solution with estimated parameters. The same is provided also for the gelatin q. Both are below \( 8\times 10^{-6}\) indicating a good approximation of solutions. This confirms that a property of our model is stability.

Remark 3.1

The same results are obtained considering higher perturbations on parameters.

We conclude the example with a final observation. The relative error between solutions is of the same order as \(\lambda =10^{-6}\), and also lower. This implies that the minimisation does not consider anymore the norm of solutions, but only the norm of parameters \(\mathcal {Q}\), see Eq. (5). This is why we end up with a big error on parameters: the algorithm is trying to find very small parameters. Then, considering a lower \(\lambda =10^{-12}\), we could force the minimiser to decrease only the norms of the solutions. The results are in the following. We take the same data as before and, solving the inverse problem, we obtain

$$\begin{aligned}\mathcal {Q}_{opt}=[0.2, \quad 0.809, \quad 0.761].\end{aligned}$$

The errors on parameters are significantly smaller, i.e.

$$\begin{aligned}{E_\theta = 10^{-8}, \quad E_p=0.02, \quad E_{k_1}=0.02}.\end{aligned}$$

Again, we have a good estimate of solutions as before, see Fig. 1 right.

Fig. 1
figure 1

On the x-axis, the time in days until 3 days, which represents the duration of the experiments. On the y-axis, the error \(E_{\mathcal {Q}_{opt}}\) in (6), both for cells u and gelatin q (dash line). The graphs correspond to the case in which the synthetic data are the numerical solutions, but the regularisation parameter \(\lambda \) changes. Indeed, on the left, \(\lambda = 10^{-6}\), whereas on the right \(\lambda = 10^{-12}\) (Colour figure online)

Synthetic data: perturbed numerical solution

Experimental (non synthetic) data have always some unknown noise due for instance to instrumentation. This is why in the following we consider a more realistic setting in which the synthetic data are a perturbation of the numerical solution of Model (2). We perturb with \(5\%\) Gaussian noise the numerical solution, i.e. \(u_{pert}=u+\omega _u \, 0.05 \; u,\) where \(\omega _u\) is a random matrix with entries in \([-1,1]\). This Gaussian noise is with zero mean, thus perturbed solutions are on average the exact numerical ones.

Example 3.2

We consider a \(10\%\) perturbation on parameters \(\mathcal {Q}_{lit}=[\theta , p, k_1]=[0.2, 0.83, 0.78]\). Randomly extracting \(\omega _\theta = -0.11\), \(\omega _p= -0.15\) and \(\omega _{k_1}= -0.83\), we deduce

$$\begin{aligned}\mathcal {Q}_{pert}=[0.198, \quad 0.817, \quad 0.715].\end{aligned}$$

Solving the inverse problem in (5), we get

$$\begin{aligned} \mathcal {Q}_{opt}=[0.199, \quad 0.415, \quad 0.539]. \end{aligned}$$

This infers an error on parameters such that \({E_{\theta }= 0.0007,\, E_{p}=0.5,\, E_{k_1}= 0.3}\), whereas

$$\begin{aligned}E_{\mathcal {Q}_{opt}}< 3\times 10^{-3}.\end{aligned}$$

As before, errors on p and \(k_1\) are high. In Fig. 2 left, we show the errors \(E_{\mathcal {Q}_{opt}}\), which are for both u and q below the initial \(5\%\) error between the synthetic data and the numerical solution.

Remark 3.2

We do not get very satisfactory results in terms of parameter estimation. However, the most important characteristic is that solutions obtained with the estimated parameters \(\mathcal {Q}_{opt}\) do not greatly differ from the experimental data (the perturbed solutions in this synthetic case). In contrast, we remark that the functional to be minimised in (5) does not include either q or m. Indeed, having more information also on gelatin and enzymes concentration could allow to include them in the functional, thus bringing us good estimations on the other parameters. In particular, the parameter p is present both in the equation for enzymes concentration m and gelatin degradation d, then we should need both of them to obtain a good estimate on p. The parameter \(k_1\) is only in the equation for the enzymes and, indeed, we can approximate it having experimental data on enzymes.

Remark 3.3

A different situation promises to be the case with higher perturbation on parameters. Indeed, we notice a better convergence rate for \(k_1\) with respect to Example 3.2, probably due to a better formulation of the minimisation problem which well represents the final minima. Otherwise, we always have good final errors on solutions.

So if we look for the optimal parameters describing the synthetic experimental data (chosen as the perturbed numerical solution \(u_{pert}\)), we deduce:

  • a good approximation of \(\theta \) with an error around 0.001;

  • a maximal error on p;

  • a varying error on \(k_1\), which is around 0.04 or 0.3 in the examples presented.

Moreover, we have good error values \(E_{\mathcal {Q}_{opt}}\) between the experimental synthetic solution and the one found with the optimal parameters. Indeed, for both cells density and gelatin concentration we obtain \(E_{\mathcal {Q}_{opt}} \sim 3\%\).

In order to obtain accurate optimal parameters, we should need either more experimental data, to be included in the functional definition, or eventually a model reduction. Concerning \(E(\mathcal {Q})\), we could modify the value of \(\lambda \), maybe a lower one, as in Example 3.1 where the error between solutions is of the same order as \(\lambda \). Regarding a model reduction, even if relatively simple, System (2) could still be too complex to describe experiments. A detailed analysis of it could highlight particular behaviours that we currently ignore. We could also add an a priori knowledge on parameters such as \(\mathcal {Q}_0=\mathcal {Q}_{lit}\). This last addition would be less realistic but, at least, we force the minimisation to consider parameters not too far away from the a priori ones. An example is shown just below.

Example 3.3

Let \(\lambda =10^{-3}, \mathcal {Q}_0=\mathcal {Q}_{lit}\). We consider a \(10\%\) perturbation on parameters \(\mathcal {Q}_{lit}=[\theta , p, k_1]=[0.2, 0.83, 0.78]\), randomly extracting \(\omega _\theta = 0.27\), \(\omega _p= -0.12\) and \(\omega _{k_1}= -0.81\). Solving the inverse problem in (5), we get

$$\begin{aligned} \mathcal {Q}_{opt}=[0.2, \quad 0.806, \quad 0.803]. \end{aligned}$$

This infers an error on parameters \({E_{\theta }= 0.003, \, E_{p}=0.03, \, E_{k_1}= 0.03,}\) whereas

$$\begin{aligned}E_{\mathcal {Q}_{opt}} < 3\times 10^{-3}.\end{aligned}$$

In Fig. 2 right, we show the errors \(E_{\mathcal {Q}_{opt}}\), which are for both u and q below the initial error of \(5\%\).

Fig. 2
figure 2

On the x-axis, the time in days until 3 days, which represents the duration of the experiments. On the y-axis, the error \(E_{\mathcal {Q}_{opt}}\) in (6), both for cells u and gelatin q (dash line). The graphs correspond to the case in which the synthetic data are the perturbed numerical solution, but the a priori information of parameters changes. On the left, we lack in information, whereas on the right \(\mathcal {Q}_0=\mathcal {Q}_{lit}\). There are no important variations in the solutions error. Instead a difference can be seen in the estimated parameters (Colour figure online)

Parameter Estimation with Biological Data for Our Model. We briefly present the experimental setting. The study is on the breast carcinoma cell line called MDA-MB-231 with a triple negative breast cancer (TNBC) phenotype. As introduced before, cells are placed on a thin layer of green fluorescent gelatin. Gelatin means to mimic basal membrane which has a thickness of 10 to \(I{300}\,\textrm{nm}\) which is smaller that the size of a cell, i.e. \({10}\,{\mu }\textrm{m} \). Fluorescence is useful to distinguish areas in which gelatin has been consumed. Such assays have also revealed that invasive cells extend small localised protrusions, called invadopodia, where MMP enzymes are localised and from which membrane degradation starts. Pioneered by Wen-Thien Chen in the 1980’s (Chen et al. 1985, 1984; Chen and Singer 1980), visualisation of invadopodia ECM degradation by fluorescent gelatin has emerged as the most prevalent technique for evaluating cellular invasive potential, Artym et al. (2006); Martin et al. (2012).

Experimental data consist of videos over 72 hours on a portion of the plate, measuring 2530\(\times \)2530 \(\mu \textrm{m}^{2}\). They focus either on cells movement or on gelatin degradation. An example of the results is given in Fig. 4. At the beginning, gelatin is intact (top image right) and every day the consumption increases and this corresponds to the darker regions. On the left column, the configuration of tumor cells starting from the initial one on the top till the third day (on the bottom).

In possession of the biological data, we can apply the parameter estimation tested before considering \(u_{exp}\) as the biological experimental data, instead of the synthetic ones. We have decomposed the experimental video in 18 images (one every four hours). Using the Matlab tool impixel, it is possible to manually identify cells position and to have a discrete representation of them. Thus, we need to transform particles into densities in order to obtain a macroscopic view of cells, that we name \(u_{exp}\). This procedure consists of centring a Gaussian kernel on each cell, see Braun (2021); Braun et al. (2022). Then, discretising \(u_{exp}\), we can compare it with our numerical solution through the inverse problem written in (5), that having reliable data only on cells, can be rewritten as

$$\begin{aligned} \theta _{opt}:= \mathop {\textrm{argmin}}\limits _{\theta } ( \; \Vert F(\theta )(t)-u_{exp}(t)\Vert _{L^2}^2 + \lambda \Vert \theta \Vert _{L^2}^2 \; ), \end{aligned}$$

with t corresponding to the values at which we have evaluated \(u_{exp}\), i.e. every 4 hours in the interval [0, 72] hours. We highlight that we are not looking for optimal parameters p and \(k_1\), since they do not significantly influence the behaviour of cells. Indeed, sensitivity analysis with experimental data shows very similar results to the ones in Table 1, thus p and \(k_1\) play very little role on cells dynamics . Consequently, they are going to be fixed. Furthermore, we realised that parameters from the literature were not appropriate to describe experimental data. The first problem is in the description of cells movement that is clear only with a higher \(\theta \). Moreover, \(k_1\) is too low to allow enzymes production and the consequent gelatin degradation. Thus, we modify them as in the following. We take p and \(k_1\) fixed to \(p=3\times 10^{-2}\) and \(k_1=780\), whereas we set as [10, 150] the minimisation interval for \(\theta \). We test several initial random values for \(\theta \in [30,55]\). Setting the regularisation parameter \(\lambda =10^{-9}\), in Fig. 3 we show the minimisation results.

Fig. 3
figure 3

The optimal parameter \(\theta \) found varying the initial perturbed one in the interval [30, 55]. We have selected 6 representative values and we can observe that the optimal \(\theta \) is well confined in a small interval. Taking the mean value, we have that \(\theta _{opt}=52.23\) (Colour figure online)

The relative error in (6) between the experimental solution and the numerical one with estimated parameter \(\theta \) for fixed p and \(k_1\) is always around 0.4, which is quite good when dealing with biological data. In this case, as for the functional, we consider Expression (6) for the experimental times considered, i.e. every 4 hours in the interval [0, 72] hours.

Fig. 4
figure 4

Example of experimental data. On the left, cells position from the initial time to 72 hours, pictured every day, on the right the corresponding gelatin consumption. Gelatin is in green, whereas darker regions indicate its degradation (Colour figure online)

Fig. 5
figure 5

Comparison between our numerical cell density with optimal parameter \(\theta \) (on the left) and the experimental data after transformation into densities (on the right). The domain dimension is in the corresponding dimensionless coordinates of the experimental one. In the first row, the initial data that is the same for both. The second row represents the density after one day, the third one after two days, whereas the last one after three days. We can observe some similarities between the two columns even if the match is not complete, but the error is below 0.4(Colour figure online)

Fig. 6
figure 6

Gelatin degradation after 72 hours. On the left the results obtained through our simulations, and on the right the experimental data (Colour figure online)

In Fig. 5, we show an example of the numerical solution for the cells density (on the left) compared with the experimental one represented through the particle to density transformation (on the right). We have considered the numerical solution u for Model (2) with the mean optimal parameter \(\theta =52.23\), and with \(p=3\times 10^{-2}\) and \(k_1=780\). In Fig. 6, we also propose a comparison between the numerical solution of the gelatin degradation (on the left) and the experimental data (on the right) at the final time. In this case, the fitting is much different, since apparently not all cells have produced the enzymes that degrade the gelatin and it was not possible to quantify gelatin consumption from experiments.

4 Conclusions and Perspectives

In this work, we set up a preliminary mathematical model to describe gelatin degradation as a result of the action of MMPs enzymes produced by tumor cells. This process is strictly related to cancer cells invasion through a thin membrane. The model presented describes the evolution in time of tumor cells, MMPs enzymes and gelatin degradation through a macroscopic view.

The model is combined with a sensitivity analysis and a parameter estimation. Sensitivity analysis highlights how perturbation on parameters does not affect solutions of the PDE system. Indeed, it confirms the stability of our model. Then, we have performed parameter estimation both on synthetic data and on biological ones. Despite the lack of a complete biological knowledge, we were able to build an appropriate description of degradation experiments. As it can be observed in Fig. 5, a very attractive perspective is to test our model looking at cells in the entire experimental domain and on different set of data. Indeed, we could both better capture the diffusive behaviour given by our model and have more information on enzymes activity and degradation, the two main poorer information that influenced the research of the complete optimal set of parameters. A deeper study on how enzymes are produced by tumor cells could be helpful, since in the experimental data proposed only a small amount of holes has appeared, see Fig. 6.

More complex models considering the action of other cells involved in the invasion process, such as fibroblasts (Malaquin et al. 2013) could be very interesting to analyse. Moreover, as stressed during the paper, the gelatin is a very thin layer. A two-dimensional approach is then required. However, this work could be a first step in studying not only basal membrane degradation, but also ECM one. ECM is in fact a thicker layer in which cells move thanks to degradation. Furthermore, the understanding of the degradation process is crucial in modelling invasion. Indeed, the estimated parameters helps to quantify membrane permeability inside the Kedem-Katchalsky membrane conditions. Finally, a more mathematical analysis on Model (2) can reveal interesting behaviour of solutions.