1 Introduction

We consider the numerical solution of non-linear illposed and inverse problems where the underlying non-linearity F maps from a possibly multi-component version of \(L^\infty \) into a normed space Y. This scenario appears quite naturally in many parameter identification tasks for partial differential equations.

For instance, in electrical impedance tomography (EIT) one wants to identify the conductivity distribution in a body. To this end, electric currents are applied through electrodes and the resulting voltages are measured. In the quasi-static regime, an adapted version of the Laplace equation connects the currents with the voltages, see, e.g., [1]. Another application is full waveform inversion (FWI), the most advanced inversion technique in seismic imaging, see, e.g., [2, 3]. Depending on the used mathematical model for wave propagation (acoustic, elastic, or visco-elastic regime) the searched-for parameters include bulk density, pressure and shear wave velocities, and corresponding relaxation times. Here, a wave field is initiated by a source (explosion or earthquake) and the parts of the wave field that are reflected from the Earth’s internal structure are then picked up by receivers on the Earth’s surface or by hydrophones in the ocean.

From an abstract point of view, in both examples we have to solve an equation

$$\begin{aligned} \text {find } u\in \mathcal {D}(F)\text { such that } F(u)\approx y^\delta \end{aligned}$$
(1)

where the non-linear operator

$$\begin{aligned} F:\mathcal {D}(F)\subset L^\infty (D)^\ell \rightarrow Y \end{aligned}$$
(2)

maps \(\ell \) parameter functions u located on some domain of interest D to the respective measurements. Further, \(y^\delta \) are the (noisy) measurements satisfying \(\Vert y^\delta -F(u^+)\Vert _Y\le \delta \) for one \(u^+\in \mathcal {D}(F)\).

Newton-like regularization schemes are well-established iterations for getting a meaningful approximate solution of non-linear inverse problems. In this work we explore the Newton-like solver REGINN \(^\infty \) which extends REGINN of [4, 5] to a non-linear inverse problem with generic operators F as in (2). Here, F is required to fulfill a few specific properties which are satisfied by EIT and, except for one, also by FWI in all regimes. This not yet verified property of FWI is a structural assumption known as tangential cone condition (TCC), see (3) below and consult [6] for a first promising result. Apart from establishing the non-linearity constraint, the main challenge about regularization in \(L^{\infty }\) is its non-reflexive nature. As this space is further non-separable, convergence of discretization schemes in the strong topology, which is desirable for practical implementations, cannot be achieved, see [7].

In this work we will exploit the weak-\(\star \) topology and semi-discrete approximations to F based on a family \(\{X^n\}_n\) of finite-dimensional nested subspaces of \(L^\infty (D)^\ell \). Such an ansatz is very close to an implementation and addresses discretization errors directly in the theory. Apart from minor mathematical restrictions which will be specified below, the subspaces will also be held quite generic and may reflect features of the reconstructions one wants to focus on, e.g. using piecewise constant functions to model sharp interfaces. By the well-known limit

$$\begin{aligned} \Vert u\Vert _{L^\infty (D)^\ell }=\lim _{q\rightarrow \infty }\Vert u\Vert _{L^q(D)^\ell } \quad \text { for all } u\in L^\infty (D)^\ell , \end{aligned}$$

for bounded D, \(X^n\) can be equivalently furnished with the \(L^{q_n}(D)^\ell \)-topology for properly chosen \(q_n<\infty \) such that \(q_n\rightarrow \infty \) as \(n\rightarrow \infty \). This approach then implies that the Newton update for the n-th iterate can be obtained as an approximate minimizer of a smooth (provided Y is smooth) and convex Tikhonov functional over \(X^n\). As a result, the underlying minimizing procedure can be implemented for general \(\{X^n\}_n\) with classical techniques from smooth and unconstrained optimization.

There is a wealth of literature on the analysis of Newton-like regularization schemes, mainly in a Hilbert space but meanwhile also in a Banach space setting; we refer only to the monographs [8, 9] for a first reading. However, most of the Banach space methods are formulated in abstract spaces requiring smoothness and reflexivity at least as they rely heavily on duality mappings to mimic the Riesz isomorphism. To the best of our knowledge, regularization schemes applicable to non-reflexive spaces are only considered in [7, 10,11,12,13,14,15]. The first six of these publications consider iterative schemes on the basis of proximal point methods, Morozov, Ivanov, or Tikhonov regularization, respectively. We will compare our scheme with the somewhat similar IRGNM-Tikhonov method of [12] in Remark 2.6 below.

Our material is organized as follows: In Sect. 2 we introduce and analyze two versions of REGINN \(^\infty \) which differ in what information about the smoothness of the ground truth is available a priori. Under reasonable assumptions both algorithms are well defined and terminate with a regularized solution \(u_{M_\delta }\in \mathcal {D}(F)\) of (1). Further, we prove the regularization property, that is, weak-\(\star \) convergence of \(\{u_{M_\delta }\}_{\delta >0}\) to an exact solution of \(F(\cdot )=F(u^+)\) as the noise level \(\delta \) tends to zero. Our hypotheses are reasonable in fact as they are met by EIT as well as FWI with exception of the TCC (Sect. 3) as mentioned above. For FWI in the acoustic regime we are even able to validate the stronger assumption about the ground truth which enters the second version of REGINN \(^\infty \). Finally, Sect. 4 contains some experiments to identify two parameters (density and pressure wave velocity) in the acoustic wave equation. Here, all our assumptions including TCC in a semi-discrete setting actually hold. Some technical details that would otherwise interrupt the flow of reading have been moved to three appendices.

2 REGINN \(^{\infty }\)

For some bounded domain \(D\subset \mathbb {R}^d\) let \(F:\mathcal {D}(F)\subset L^{\infty }(D)^\ell \rightarrow Y\) be Fréchet-differentiable and satisfy the tangential cone condition (TCC) at \(u^+\), i.e., there are a positive constant \(\omega <1\) and a ball \(B_{r}(u^+)\subset \mathcal {D}(F)\) with radius \(r>0\) such that

$$\begin{aligned}{} & {} \Vert F(u)-F(\widetilde{u})-F'(\widetilde{u})(u-\widetilde{u})\Vert _{Y}\le \omega \Vert F(u)-F(\widetilde{u})\Vert _{Y}\nonumber \\{} & {} \quad \text { for all } u,\widetilde{u}\in B_{r}(u^+). \end{aligned}$$
(3)

Here, \(F':\mathcal {D}(F)\subset L^{\infty }(D)^\ell \rightarrow \mathcal {L}\big (L^\infty (D)^\ell ,Y\big )\) denotes the Fréchet-derivative of F and \(L^\infty (D)^\ell \) is endowed with \(\Vert \cdot \Vert ^2_{L^{\infty }(D)^\ell }\) where

$$\begin{aligned} \Vert u\Vert ^2_{L^{q}(D)^\ell }=\Vert (u_1,\ldots ,u_{\ell })\Vert ^2_{L^{q}(D)^\ell }:=\sum _{j=1}^{\ell }\Vert u_i\Vert ^2_{L^{q}(D)}, \quad q\in [1,\infty ]. \end{aligned}$$

The TCC, introduced in [16], is a well-established and widely used assumption in the convergence analysis of a variety of iterative regularization schemes for non-linear illposed problems; we only refer to the monographs [8, 9]. Nevertheless, only a few academic examples are reported in the infinite-dimensional setting where it holds, see, e.g., [8, 17, 18].

For \(\omega <1/2\) we can equivalently restate the TCC as

$$\begin{aligned} \Vert F(u)-F(\widetilde{u})-F'(\widetilde{u})(u-\widetilde{u})\Vert _{Y}\le L \Vert F'(\widetilde{u})(u-\widetilde{u})\Vert _{Y}, \end{aligned}$$
(4)

with \(L=\omega /(1-\omega )<1\). Since our method will be explicitly based on discretizing \(L^\infty (D)^\ell \), we impose the following assumptions on corresponding spaces \(X^n\):

  1. (S1)

    \(\{X^n\}_{n\in \mathbb {N}}\) is a sequence of nested subspaces of \(L^\infty (D)^\ell \), i.e., \(X^n\subset X^{n+1}\subset L^\infty (D)^\ell \) for all \(n\in \mathbb {N}\).

  2. (S2)

    For each \(X^n\) there exists a linear projection operator \(\mathcal {P}^n:L^\infty (D)^\ell \rightarrow X^n\), that is, \(\mathcal {P}^nu=u\) for all \(u\in X^n\), satisfying \(\Vert \mathcal {P}^nu\Vert _{L^\infty (D)^\ell }\le C_{\mathcal {P}}\Vert u\Vert _{L^\infty (D)^\ell }\) where the constant \(C_{\mathcal {P}}\ge 1\) is independent of n.

  3. (S3)

    For any fixed \(C_{\infty }>1\) we can find a positive increasing sequence \(\{q_n\}_{n\in \mathbb {N}}\) such that

    $$\begin{aligned}\Vert u\Vert _{L^\infty (D)^\ell }\le C_{\infty }\Vert u\Vert _{L^{q_n}(D)^\ell }\quad \text {for all }u\in X^n. \end{aligned}$$

Note that the \(L^{\infty }(D)^\ell \)-norm is always stronger than the \(L^{q_n}(D)^\ell \)-norm, hence the magnitude of \(C_{\infty }>1\) in (S3) determines how tight the norm equivalence of \(\Vert \cdot \Vert _{L^\infty (D)^\ell }\) and \(\Vert \cdot \Vert _{L^{q_n}(D)^\ell }\) is, restricted to \(X^n\). Thus, having set \(\{q_n\}_{n\in \mathbb {N}}\) once for a fixed feasible subspace sequence \(\{X^n\}_{n\in \mathbb {N}}\) and some \(C_\infty >1\), we can easily switch between non-smooth and smooth norms. We emphasize that \(C_\infty \) is independent of n. A family \(\{X^n\}_{n\in \mathbb {N}}\) enjoying (S1)–(S3) is constructed in Appendix A.1 on the basis of tensor product B-splines (see Appendix A.2 for a possible other construction, which is, however, not considered further in this work).

To ensure that our discretizations do indeed approximate the actual inverse problem in \(L^\infty (D)^\ell \) sufficiently well, we require a compatibility condition for F of the form

$$\begin{aligned} \liminf _{n\rightarrow \infty }\Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\Vert _{Y}=0 \end{aligned}$$
(5)

for all \(u\in \mathcal {D}(F)\) and all \(\widehat{u}\in L^\infty (D)^\ell \). This extra condition is necessary since, in general, one cannot expect \(\lim _{n\rightarrow \infty }\mathcal {P}^n\widehat{u}= \widehat{u}\) in \(L^\infty (D)^\ell \) strongly (which would yield (5) by continuity of \(F'\)) because the union of \(X^n\) is countable while \(L^\infty (D)^\ell \) is not separable. However, as a consequence, the closure of the range of \(F'(u)\) must be separable, in contrast to Y itself, for all \(u\in \mathcal {D}(F)\).

Additionally, to prove the regularization property of our scheme (Algorithm 1), we will require either the weak-\(\star \) closedness of F or the range inclusion

$$\begin{aligned} \mathcal {R}(F'(u^+)^{*})\subset L^1(D)^\ell \end{aligned}$$
(6)

where \(F'(u^+)^{*}:Y^*\rightarrow \big (L^\infty (D)^*\big )^\ell \) denotes the adjoint operator. In Sects. 3.1 and 3.2 below we will verify (5) as well as (6) for the forward maps belonging to EIT and FWI, respectively.

As motivated in the introduction, the guideline for designing our regularization algorithm is to generate uniformly bounded iterates \(u_m\) in \(L^\infty (D)^\ell \) which give sufficiently small residuals

$$\begin{aligned} b^{\delta }_m:=y^\delta -F(u_m), \end{aligned}$$
(7)

where \(y^\delta \in Y\) is the perturbed datum at hand. We build on an inexact-Newton framework and find the updates from the linearization approximately via Tikhonov regularization linked to \(X^n\) as follows: Given an initial guess \(u_0\in \mathcal {D}(F)\) and an initial space \(X^{n_0}\), we iterate

$$\begin{aligned} u_{m+1}=u_m+ s_m,\quad m=0,1,2,\ldots , \end{aligned}$$
(8)

where \(n_m\in \mathbb {N}\) with \(n_m\ge n_{m-1}\) and \(s_m\in X^{n_m}\) are chosen such that

$$\begin{aligned} J_{n_m,m}(s_m)\le \mu _m^2 \Vert b^{\delta }_m\Vert ^2_{Y}. \end{aligned}$$
(9)

Here, the Tikhonov functional \(J_{n,m}:X^n\rightarrow [0,\infty )\) is

$$\begin{aligned} J_{n,m}(s):=\Vert F'(u_m)s -b^{\delta }_m\Vert ^2_{Y}+\alpha _m\Vert s+(u_m-u_0)\Vert _{L^{q_n}(D)^\ell }^2 \end{aligned}$$
(10)

with penalty parameters set a posteriori by

$$\begin{aligned} \alpha _m=\frac{\Vert b^\delta _m\Vert ^2_Y}{\gamma ^2}. \end{aligned}$$
(11)

We recall from assumption (S3) that an approximate minimizer of (10) is also one of the corresponding Tikhonov functional with \(\Vert \cdot \Vert _{L^\infty (D)^\ell }\)-penalty term and vice versa. However, given the sequence \(\{q_n\}_{n\in \mathbb {N}}\) for some \(C_\infty >1\) and a subspace sequence \(\{X^n\}_{n\in \mathbb {N}}\), we have more flexibility in choosing a numerical method to find \(s_m\) in (9) via the smoothed version.

So far, the parameters \(\gamma ,\{\mu _m\}_{m\in \mathbb {N}}\) are restricted to fulfill \(0<\mu _m<1\) and \(\gamma \ne 0\). While \(\mu _m\) serves as a stopping criterion in the spirit of an inexact Newton condition to set the m-th update \(s_m\), \(\gamma \) will be responsible for keeping the resulting iterate \(u_{m+1}\) sufficiently close to \(u^+\): the larger \(\gamma \) is chosen, the better the initial guess has to be. The effect of the initial discretization level \(n_0\) on the iterates is demonstrated in Section 4 by numerical experiments. We stop the Newton iteration (8) by a discrepancy principle with constant \(\tau >1\). The resulting inversion scheme REGINN \(^{\infty }\) is summarized in Algorithm 1. It is well defined under reasonable assumptions according to the following theorem. Its regularization property is then specified in Theorem 2.3.

figure a

Theorem 2.1

(Well-definedness and termination of REGINN \(^{\infty }\)) Let \(F:\mathcal {D}(F)\subset L^\infty (D)^\ell \rightarrow Y\) be as above satisfying (3) with \(\omega <1/3\) in \(B_{r}(u^+)\subset {\text {int}}(\mathcal {D}(F))\) and (5), where \(\{X^n\}_{n\in \mathbb {N}}\) and \(\{\mathcal {P}^n\}_{n\in \mathbb {N}}\) fulfill assumptions (S1)–(S3). Let \(y^\delta \) be given such that \(\Vert F(u^+)-y^\delta \Vert _{Y}\le \delta \) for one \(\delta >0\). Let \(\Lambda \in (\frac{2\omega }{1-\omega },1)\) and set

$$\begin{aligned} \mu _{\max }:=(1-\omega )\Lambda -\omega . \end{aligned}$$

For

$$\begin{aligned} \gamma \in \left( 0,\frac{r}{\sqrt{C_{\infty }}\mu _{\max }}\right) \end{aligned}$$

and

$$\begin{aligned} r_0\in \left( 0,\min \left\{ r-\sqrt{C_{\infty }}\mu _{\max }\gamma ,\frac{\gamma }{C_{\mathcal {P}}}\sqrt{\mu ^2_{\max }-\omega ^2}\right\} \right) \end{aligned}$$

choose

$$\begin{aligned} \tau >\frac{1+\omega }{\sqrt{\mu ^2_{\max }-C_{\mathcal {P}}^2r_0^2/\gamma ^2}-\omega }. \end{aligned}$$

Further, define

$$\begin{aligned} \mu _{\min }:=\sqrt{\left( \omega +\frac{1+\omega }{\tau }\right) ^2+C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}}. \end{aligned}$$

Restrict all tolerances \(\{\mu _m\}\) to \((\mu _{\min },\mu _{\max })\) and start with some arbitrary \(n_0\in \mathbb {N}\) and \(u_0\in B_{r_0}(u^+)\). Then, there exists an \(M_{\delta }\in \mathbb {N}\) such that all iterates \(\{u_1,\ldots ,u_{M_{\delta }}\}\) of REGINN \(^{\infty }\) are well-defined and stay in \(B_r(u^+)\). Moreover, \(\Vert b_{m+1}^{\delta }\Vert _{Y}\le \Lambda \Vert b_m^{\delta }\Vert _{Y}\) for \(m=0,\ldots ,M_{\delta }-1\), \(\Vert b^\delta _{M_{\delta }}\Vert _{Y}\le \tau \delta \), and \(M_{\delta }=\mathcal {O} (\vert \log \delta \vert )\) as \(\delta \searrow 0\).

Proof

Before we begin with the proof we discuss our assumptions on the parameters. First, observe that the open interval for choosing \(\Lambda \) is non-empty by \(\omega <1/3\). The lower bound for \(\Lambda \) guarantees that \(\mu _{\max }>\omega \). Together with the upper bound on \(\gamma \) this yields a positive upper bound for \(r_0\). Further, the radicand and the denominator of the lower bound for \(\tau \) are positive. Finally, \(\mu _{\min } <\mu _{\max }\).

We use an inductive argument and assume therefore that \(\Vert b_m^{\delta }\Vert _{Y}\le \Lambda ^m \Vert b_0^{\delta }\Vert _{Y}\) as well as \(\Vert u_i-u^+\Vert _{L^\infty (D)^\ell }< r\) for \(i\le m\), which holds in particular for \(m=0\) because of \(\Vert u_0-u^+\Vert _{L^\infty (D)^\ell }<r_0<r\). If \(\Vert b^{\delta }_m\Vert _{Y}\le \tau \delta \), REGINN \(^{\infty }\) stops with \(u_{M_{\delta }}:=u_m\) and nothing else needs to be shown. Otherwise, \(\Vert b^{\delta }_m\Vert _{Y}> \tau \delta \) and we next show that a Newton update is well defined by (9). Let \(s_{n,m}:=\arg \min _{s\in X^n} J_{n,m}(s)\) which exists as the unique minimizer of a strictly convex functional over a finite dimensional space. Then,

$$\begin{aligned} J_{n,m}(s_{n,m})&\le \ J_{n,m}\big (\mathcal {P}^{n}(u^+-u_m)\big ) \\&=\Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-b_m^{\delta }\Vert ^2_{Y}\\&\qquad +\alpha _m\Vert \mathcal {P}^{n}(u^+-u_m)+(u_m-u_0)\Vert ^2_{L^{q_n}(D)^\ell }. \end{aligned}$$

Recursively, we get \(u_m-u_0=s_0+s_1+\cdots + s_{m-1}\) from which we deduce that \(u_m-u_0\) is in \(X^{n_{m-1}}\) as the spaces are nested by (S1). Hence, by (S2),

$$\begin{aligned} \mathcal {P}^{n}(u_m-u_0)=u_m-u_0 \end{aligned}$$
(12)

for \(n\ge n_{m-1}\) and by linearity of \(\mathcal {P}^{n}\) we may simplify

$$\begin{aligned} J_{n,m}(s_{n,m})&\le \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-b_m^{\delta }\Vert ^2_{Y}+\alpha _m\Vert \mathcal {P}^{n}(u^+-u_0)\Vert ^2_{L^{q_n}(D)^\ell } \\&\le \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-b_m^{\delta }\Vert ^2_{Y}+ \text {vol}_\text {d}(D)^{2/q_n}C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\Vert b^\delta _m\Vert ^2_Y. \end{aligned}$$

In the last step we additionally used (11), Hölder’s inequality and \(\Vert \mathcal {P}^n(u^+-u_0)\Vert _{L^\infty (D)^\ell }\le C_{\mathcal {P}}\Vert u^+-u_0\Vert _{L^\infty (D)^\ell }<C_{\mathcal {P}}r_0\), see (S2). We continue by splitting the residual term according to

$$\begin{aligned} \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-b_m^{\delta }\Vert _{Y}&\le \Vert F'(u_m)(u^+-u_m)-F(u^+)-F(u_m)\Vert _{Y} \\&\qquad +\Vert F(u^+)-y^\delta \Vert _{Y} \\&\qquad \qquad +\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_m)-(u^+-u_m)\big )\Vert _{Y} \\&\le \Vert F'(u_m)(u^+-u_m)-F(u^+)-F(u_m)\Vert _{Y} +\delta \\&\qquad \qquad +\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y}, \end{aligned}$$

employing again (12) to get the bottom line. Since \(\Vert u_{m}-u^+\Vert _{L^\infty (D)^\ell }< r\) by induction, TCC (3) yields

$$\begin{aligned}{} & {} \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-\big (y^\delta -F(u_m)\big )\Vert _{Y} \\{} & {} \quad \le \omega \Vert F(u^+)-F(u_m)\Vert _{Y} +\delta +\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y} \end{aligned}$$

and with \(\Vert F(u^+)-F(u_m)\Vert _{Y}\le \Vert F(u^+)-y^\delta \Vert _{Y}+\Vert b_m^{\delta }\Vert _{Y}\le \delta + \Vert b_m^{\delta }\Vert _{Y}\) we deduce further

$$\begin{aligned}{} & {} \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-\big (y^\delta -F(u_m)\big )\Vert _{Y} \\{} & {} \quad \le \omega \big (\delta +\Vert b_m^{\delta }\Vert _{Y}\big ) +\delta +\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y}. \end{aligned}$$

Taking into account that \(\Vert b^{\delta }_m\Vert _{Y}> \tau \delta \), we get

$$\begin{aligned}{} & {} \Vert F'(u_m)\mathcal {P}^{n}(u^+-u_m)-\big (y^\delta -F(u_m)\big )\Vert _{Y} \\{} & {} \quad \le \Vert b_m^{\delta }\Vert _{Y}\Big (\omega +\frac{1+\omega }{\tau }\Big )+\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y} \end{aligned}$$

and finally

$$\begin{aligned}{} & {} J_{n,m}(s_{n,m}) \le \bigg (\Vert b_m^{\delta }\Vert _{Y}\Big (\omega +\frac{1+\omega }{\tau }\Big )+\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y}\bigg )^2\nonumber \\{} & {} \qquad \qquad \qquad \quad +{\text {vol}}_d(D)^{2/q_n}C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\Vert b^\delta _m\Vert ^2_{L^2(D)}. \end{aligned}$$
(13)

Since \(\text {vol}_\text {d}(D)^{2/q_n}\rightarrow 1\) as \(n\rightarrow \infty \) and in view of (5), we find that

$$\begin{aligned} \liminf _{n\rightarrow \infty }J_{n,m}(s_{n,m})\le \Vert b_m^{\delta }\Vert ^2_{Y}\bigg (\Big (\omega +\frac{1+\omega }{\tau }\Big )^2+C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\bigg )= \mu _{\min }^2 \Vert b_m^{\delta }\Vert ^2_{Y}. \end{aligned}$$
(14)

Consequently, condition (9) with \(\mu _m>\mu _{\min }\) is feasible for \(n_m\) large and appropriate \(s_m\in X^{n_m}\backslash \{0\}\), where \(s_m\ne 0\) follows by \(J_{n,m}(0)\ge \Vert b_m^{\delta }\Vert ^2_{Y}\). Hence, \(u_{m+1}=u_m+s_m\) is well defined and, relying on (S3) as well as (11), we proceed with

$$\begin{aligned} \Vert u_{m+1}-u_0\Vert _{L^\infty (D)^\ell }^2 =\Vert s_{m}+(u_m-u_0)\Vert ^2_{L^\infty (D)^\ell }&\le C_{\infty } \frac{J_{n_m,m}(s_m)}{\alpha _m}\\&\le C_{\infty }\frac{\mu _m^2 \Vert b_m^{\delta }\Vert ^2_{Y}}{\alpha _m} < C_{\infty } \mu _m^2 \gamma ^2\ . \end{aligned}$$

Hence,

$$\begin{aligned} \Vert u_{m+1}-u^+\Vert _{L^\infty (D)^\ell }&\le \Vert u_{m+1}-u_0\Vert _{L^\infty (D)^\ell }+\Vert u^+-u_0\Vert _{L^\infty (D)^\ell } \\&< \sqrt{C_{\infty }}\mu _{\max } \gamma +r_0<r \end{aligned}$$

by the upper bound of \(r_0\), yielding \(u_{m+1}\in B_r(u^+)\subset \text {int}(\mathcal {D}(F))\). Finally, we estimate on the basis of (4) and (9)

$$\begin{aligned} \Vert b^{\delta }_{m+1}\Vert _{Y}&=\Vert \big (b^{\delta }_m-F'(u_m)s_m\big )-\big (F(u_{m+1})-F(u_m)-F'(u_m)s_m\big )\Vert _{Y} \nonumber \\&\le \Vert b^{\delta }_m-F'(u_m)s_m\Vert _{Y}+\frac{\omega }{1-\omega }\Vert F'(u_m)s_m\Vert _{Y}\nonumber \\&\le \sqrt{J_{n,m}(s_{m})} + \frac{\omega }{1-\omega }\big (\Vert b^{\delta }_{m}\Vert _{Y}+\Vert F'(u_m)s_m-b^{\delta }_{m}\Vert _{Y}\big )\nonumber \\&\le \mu _m \Vert b^{\delta }_m\Vert _{Y} + \frac{\omega }{1-\omega }(1+\mu _m)\Vert b^{\delta }_m\Vert _{Y}\\&=\left( \mu _m+\frac{\omega }{1-\omega }(1+\mu _m)\right) \Vert b^{\delta }_m\Vert _{Y} \nonumber \\&<\left( \mu _{\max }+\frac{\omega }{1-\omega }(1+\mu _{\max })\right) \Vert b^{\delta }_m\Vert _{Y} < \Lambda \Vert b^{\delta }_m\Vert _{Y}.\nonumber \end{aligned}$$
(15)

Having thus proven the induction part, REGINN \(^{\infty }\) is forced to terminate for any \(\delta >0\) due to \(\Vert b_m^{\delta }\Vert _{Y}\le \Lambda ^m \Vert b_0^{\delta }\Vert _{Y} \le \tau \delta \) for m sufficiently large. From this estimate, we may even deduce \(M_{\delta }=\mathcal {O}(\vert \log \delta \vert )\) as \(\delta \searrow 0\). \(\square \)

Remark 2.2

(a) The name REGINN \(^{\infty }\) for Algorithm 1 is justified by the stopping condition (9) for determining the Newton update which is, in view of (10) and (11), equivalent to

$$\begin{aligned} \frac{\Vert F'(u_m)s_m -b^{\delta }_m\Vert ^2_{Y}}{\Vert b_m^{\delta }\Vert ^2_{Y}}+\frac{\Vert s_m+(u_m-u_0)\Vert _{L^{q_n}(D)^\ell }^2}{\gamma ^2} \le \mu _m^2. \end{aligned}$$

In particular, \(s_m\) satisfies the stopping condition of REGINN [4], i.e., the above condition without penalty term.

(b) Recall that REGINN fulfills in the Hilbert space setting (and likewise for smooth reflexive Banach spaces) the error reducing property for the iterates of many inner linear solvers, keeping thus \(u_m\in B_{r}(u^+)\) if the initial guess was chosen so. However, this does not hold any longer for our \(L^{\infty }\)-tailored REGINN \(^{\infty }\) in general. Therefore our parameters need to be controlled in terms of both \(\omega \) and r, whereas standard REGINN only requires the knowledge of \(\omega \) for defining admissible tolerances \(\mu \) and stopping constants \(\tau \), see Theorem 3.1 in [19].

In case that F is linear, i.e., \(F(u)=Au\) for some \(A\in \mathcal {L}(L^\infty (D)^\ell ,Y)\), the TCC holds with \(\omega =0\) and \(r=\infty \). Some observations are in order:

  • Although \(r_0\) can be arbitrarily large now, to still ensure a finite and uniform \(L^{\infty }\)-bound on the iterates, \(r_0<\gamma <\infty \) needs to be chosen compatibly.

  • Because of

    $$\begin{aligned} \Vert F'(u_m)s_m -b^{\delta }_m\Vert ^2_{Y} =\Vert As_m -(y^\delta -Au_m)\Vert ^2_{Y} =\Vert Au_{m+1} -y^\delta \Vert ^2_{Y} \end{aligned}$$

    the iterate \(u_{m+1}\) satisfies

    $$\begin{aligned} \Vert Au_{m+1} -y^\delta \Vert ^2_{Y}+\alpha _m\Vert u_{m+1}-u_0\Vert _{L^{q_{n_m}}(D)}^2\le \mu _m^2 \Vert b_m^\delta \Vert _Y^2. \end{aligned}$$

    Hence, \(u_{m+1}\) can be considered an approximate minimizer of the Tikhonov functional \( u\mapsto \Vert Au-y^\delta \Vert ^2_{Y}+\alpha _m\Vert u-u_0\Vert _{L^{q_{n_m}}(D)}^2\) in the set \(u_0+X^{n_{m}}\). Put differently: in the linear case, REGINN \(^{\infty }\) can be viewed as a cascading Tikhonov regularization iterating over nested finite-dimensional spaces where the penalty term is determined a posteriori by the previous iterate.

Theorem 2.3

(Regularization property of REGINN \(^{\infty }\)) Adopt all assumptions and notations from Theorem 2.1 with \(\overline{B_r(u^+)}\subset \mathcal {D}(F)\) and set \(F(u^+)=y\). Additionally, assume that \(F'(u^+)\) fulfills (6) or that F is weakly-\(\star \) sequentially closed, that is, \(w_n{\mathop {\rightharpoonup }\limits ^{\star }}w\) in \(L^\infty (D)^\ell \) and \(F(w_n)\rightharpoonup z\) imply that \(F(w)=z\). Let \(\{\delta _i\}_{i\in \mathbb {N}}\) be a positive zero sequence and let \(u_{M_{\delta _i}}\) be the output of Algorithm 1 with respect to perturbed data \(y^{\delta _i}\). Then the set of weak-\(\star \) accumulation points of the sequence \(\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}\) is non-empty and consists of solutions to \(F(\cdot )=y\). If \(u^+\) is the only solution in \(\overline{B_r(u^+)}\), then even the whole sequence \(\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}\) converges weakly-\(\star \) to \(u^+\) in \(L^\infty (D)^\ell \).

Proof

By construction in Theorem 2.1 we know that \(\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}\) yields

$$\begin{aligned} \Vert y-F(u_{M_{\delta _i}})\Vert _{Y}\le \Vert y-y^{\delta _i}\Vert _{Y}+\Vert b_{M_{\delta _i}}^{\delta _i}\Vert _{Y}\le (1+\tau )\delta _i\rightarrow 0 \end{aligned}$$
(16)

and is uniformly bounded in \(L^\infty (D)^\ell \), so there exists weak-\(\star \) accumulation points in \(\overline{B_r(u^+)}\) by sequential weak-\(\star \)-compactness. Take representatively \(u_{M_{\delta _{i_k}}}{\mathop {\rightharpoonup }\limits ^{\star }}\widetilde{u}\). In case that F is weakly-\(\star \) sequentially closed, we can directly deduce \(F(\widetilde{u})=y\), hence any weak-\(\star \) accumulation point solves the equation. In case that (6) holds, we first note that the TCC (3) implies by the reverse triangle inequality for any \(u\in B_r(u^+)\) that

$$\begin{aligned} (1-\omega )\Vert F(u^+)-F(u)\Vert _{Y}{} & {} \le \Vert F'(u^+)(u^+-u)\Vert _{Y}\nonumber \\{} & {} \le (1+\omega )\Vert F(u^+)-F(u)\Vert _{Y}. \end{aligned}$$
(17)

With (6) we then obtain for any \(g\in Y^{*}\)

$$\begin{aligned} \langle F'(u^+)\widetilde{u},g\rangle _{Y,Y^{*}}&=\langle \widetilde{u},F'(u^+)^{*}g\rangle _{L^\infty (D)^\ell ,L^1(D)^\ell }\\&=\lim _{k\rightarrow \infty }\langle u_{M_{\delta _{i_k}}},F'(u^+)^{*}g\rangle _{L^\infty (D)^\ell ,L^1(D)^\ell }\\&=\lim _{k\rightarrow \infty }\langle F'(u^+)u_{M_{\delta _{i_k}}},g\rangle _{Y,Y^{*}} \\&=\langle F'(u^+)u^+,g\rangle _{Y,Y^{*}}, \end{aligned}$$

where the last equality above follows by the second inequality in (17) with \(u=u_{M_{\delta _{i_k}}}\) and \(F(u_{M_{\delta _{i_k}}})\rightarrow y\) in \(Y\). We deduce that \(F'(u^+)(u^+-\widetilde{u})=0\) and combining this relation now with the first inequality in (17) using \(\widetilde{u}=u\), we may again conclude \(F(\widetilde{u})=y\). Finally, if \(u^+\) is the only solution to \(F(\cdot )=y\) in \(\overline{B_r(u^+)}\), the weak-\(\star \) convergence of the whole sequence \(\{u_{M_{\delta _i}}\}_{i\in \mathbb {N}}\) follows by a standard subsequence argument, see [20, Prop. 10.13(2)]. \(\square \)

Remark 2.4

If \(F'(u^+)\) is injective, \(\Vert u\Vert _\star :=\Vert F'(u^+)u\Vert \) constitutes a norm on \(L^\infty (D)^\ell \) with respect to which \(\{u_{M_{\delta }}\}_{\delta >0}\) then converges strongly to \(u^+\) according to (17) and (16) at the rate \(\Vert u^+ -u_{M_{\delta }}\Vert _\star =\mathcal {O}(\delta )\) as \(\delta \rightarrow 0\). However, this norm is generally weaker than \(\Vert \cdot \Vert _{L^\infty (D)^\ell }\) with equivalence if and only if \(F'(u^+)\) is boundedly invertible. However, for locally illposed problems \(F(\cdot )=y\) we expect its linearization to be illposed as well.

Remark 2.5

We discuss how the statements of the theorems from above carry over to a semi-discrete situation as it appears under an implementation of Algorithm 1. Typically, one \(X^{n_{\max }}\) represents the finest possible or finest chosen resolution for the sought-for quantity \(u^+\in X^{n_{\max }}\) and models the implementation from a mathematical point of view.Footnote 1 Here, \(X^{n_{\max }}\) is equipped with the \(L^\infty \)-topology. Now, Theorems 2.1 and 2.3 apply to

$$\begin{aligned} F_{n_{\max }}:\mathcal {D}(F)\cap X^{n_{\max }}\subset L^\infty (D)^\ell \rightarrow Y,\quad u\mapsto F(u), \end{aligned}$$

where (5) can be omitted due to

$$\begin{aligned} F_{n_{\max }}'(u)\big ((u^+-u_0)-\mathcal {P}^{n_{\max }}(u^+-u_0)\big )=0 \end{aligned}$$

since both \(u_0\) and \(u^+\) are assumed to be in \(X^{n_{\max }}\). Further, (14) then reads

$$\begin{aligned} J_{n_{\max },m}(s_{n_{\max },m})\le \Vert b_m^{\delta }\Vert ^2_{Y}\bigg (\Big (\omega +\frac{1+\omega }{\tau } \Big )^2+\text {vol}_\text {d}(D)C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\bigg ) \end{aligned}$$

and as the only consequence the constant \(C_{\mathcal {P}}\) needs to be replaced by \(\text {vol}_d(D)^{1/2}C_{\mathcal {P}}\) in the definition of corresponding REGINN \(^\infty \) parameters.

We emphasize that the underlying semi-discrete inverse problem is: given \(y^\delta \in Y\) find \(u^\delta \in X^{n_{\max }}\) such that \(F_{n_{\max }}(u^\delta )\approx y^\delta \) where \(y^\delta \) now incorporates measurement noise and discretization error.

Remark 2.6

At first glance the IRGNM-Tikhonov method of [12] and REGINN \(^\infty \) seem to be quite similar, but they are separated by significant structural differences: In IRGNM-Tikhonov the penalty parameter is determined a priori and is assigend to a fixed regularization term, whereas for REGINN \(^\infty \) the regularization term is explicitly n-dependent and relies on an the posteriori parameter choice (11). Further, the Newton update for the IRGNM-Tikhonov method has to be an exact minimizer of the Tikhonov functional. In contrast, the Newton update for REGINN \(^\infty \) only requires an approximate minimizer, see (9), which is especially convenient when it comes to practical realizations. Note also that the discrete spaces \(X^n\) are an integral part of REGINN \(^\infty \) and its theory, so numerical effects are included in a natural way. As part of our results, the discrete regularizers \(u_{M_{\delta _i}} \in X^{n_{M_{\delta _i}}}\) of Theorem 2.3 converge to the continuous limit \(u^+\in L^\infty (D)^\ell \). A similar connection of the discrete with the continuous world is (to our knowledge) still pending for the method in  [12].

The previous version of REGINN \(^{\infty }\) requires the determination of successive discretization levels \(n_m\ge n_{m-1}\) for possibly strongly increasing \(n_m\) so that many intermediate (almost) minimizers \(s\in X^n\) of (10) for \( n_{m-1}\le n\le n_m\) need to be computed before meeting the given \(\mu _m\)-criterion in (9). As this adaptive discretization strategy can become numerically expensive, we want to present an alternative version which directly links n to m. A closer look at the proof of Theorem 2.1 reveals that \(n_m\) actually depends on the decay of \(\Vert F'(u_m)(\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0))\Vert _{Y}\). Hence, if we have a concrete upper bound for this discretization residual in terms of n, feasible choices of \(n_m\) can be found by simple algebraic manipulation. Such upper bounds can be deduced on the basis of better initial guesses which are governed by some stronger norm. For this purpose we state the following refined version of assumption (5):

If \(X\subset L^\infty (D)^\ell \) is an embedded normed space such that

$$\begin{aligned} \Vert \widehat{u}\Vert _{L^\infty (D)^\ell }\le C_{X} \Vert \widehat{u}\Vert _X\ \text { for all } \widehat{u}\in X, \end{aligned}$$
(18)

then for any \(u^+\) such that \(B_{\widetilde{r}}(u^+)\subset \text {int}(\mathcal {D}(F))\), \(u\in B_{\widetilde{r}}(u^+)\) and \(\widehat{u}\in X\) we assume that

$$\begin{aligned} \Vert F'(u)\big (\widehat{u}-\mathcal {P}^{n}\widehat{u}\big )\Vert _{Y}\le C^+ \Vert \widehat{u}\Vert _{X}\beta (n), \end{aligned}$$
(19)

where \(C^+>0\) and \(\beta \) fulfills \(\beta (n)\searrow 0\) with \(\beta (0)=1\). We think of \(\beta \) as being rather independent of \(u\in \mathcal {D}(F)\) once \(X\) and \(\{X^n\}_{n\in \mathbb {N}}\) are set while the magnitude of \(C^+\) is strongly \(B_{\widetilde{r}}(u^+)\)-dependent. We will verify in Sect. 3.3 below that (19) is satisfied, for instance, by the modeling assumptions of FWI in the acoustic regime. The next theorem shows that we can indeed determine \(n_m\) conveniently for successive Newton steps of REGINN \(^{\infty }\) when replacing condition (5) by (19).

Theorem 2.7

Adopt all assumptions, notations and parameters from Theorems 2.1 and 2.3, and assume that (19) holds—without loss of generality with \(\widetilde{r}=r\) by shrinking one of the radii otherwise. Start with \(u_0\in L^\infty (D)^\ell \) such that \(\Vert u^+-u_0\Vert _X<\min \{r_0/C_{X},1/C^+\}\) and restrict \(\{\mu _m\}\) to \((\mu ^{\varepsilon }_{\min },\mu _{\max })\), where

$$\begin{aligned} \mu ^{\varepsilon }_{\min }:=\sqrt{\left( \omega +\frac{1+\omega }{\tau }+\varepsilon \right) ^2+\max \big \{\text {vol}_d(D)^{2/q_{n_0}},1\big \}C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}}<\mu _{\max } \end{aligned}$$
(20)

for some \(\varepsilon >0\) sufficiently small and \(q_{n_0}\) large with \(n_0\in \mathbb {N}\). Further, let \(n_m\) be successively defined by

$$\begin{aligned} \begin{aligned} n_m:=\min \bigg \{n\ge n_0:\ \beta (n)\le \varepsilon \Vert b^{\delta }_m\Vert _{Y}\bigg \} . \end{aligned} \end{aligned}$$
(21)

Then we can find \(s_m\in X^{n_m}\) satisfying (9). In particular, REGINN \(^{\infty }\) also terminates in this case and the regularization property still holds.

Proof

First, \(n_m\) according to (21) is well defined since \(\lim _{n\rightarrow \infty }\beta (n)=0\). Besides, since \(\Vert b^{\delta }_m\Vert _{Y}\) is monotonously decreasing in m, we get that \(n_m\) is non-decreasing, too. Using \(s_m:=\arg \min _{s\in X^{n_m}} J_{n_m,m}(s)\), we may compute with (13) and by (19)

$$\begin{aligned} J_{n_m,m}(s_m)&\le \bigg (\Vert b_m^{\delta }\Vert _{Y}\Big (\omega +\frac{1+\omega }{\tau }\Big )+\Vert F'(u_m)\big (\mathcal {P}^{n}(u^+-u_0)-(u^+-u_0)\big )\Vert _{Y}\bigg )^2\\&\qquad +\text {vol}_d(D)^{2/q_{n_m}}\frac{r_0^2}{\gamma ^2}C_{\mathcal {P}}^2\Vert b^\delta _m\Vert ^2_{L^2(D)} \\&\le \bigg (\Vert b_m^{\delta }\Vert _{Y}\Big (\omega +\frac{1+\omega }{\tau }\Big )+\underbrace{C^+ \Vert u^+-u_0\Vert _{X}}_{<1}\underbrace{\beta (n_m)}_{\le \varepsilon \Vert b^{\delta }_m\Vert _{Y}}\bigg )^2\\&\qquad +\max \{\text {vol}_d(D)^{2/q_{n_0}},1\}C_{\mathcal {P}}^2\frac{r_0^2}{\gamma ^2}\Vert b^\delta _m\Vert ^2_{L^2(D)} \\&\le (\mu ^{\varepsilon }_{\min })^2\Vert b^\delta _m\Vert ^2_{L^2(D)}. \end{aligned}$$

The fact that REGINN \(^{\infty }\) still terminates and also admits the regularization property follows by Theorems 2.1, 2.3 and \(\Vert u_0-u^+\Vert _{L^\infty (D)^\ell }\le C_{X}\Vert u_0-u^+\Vert _{X}< r_0\). \(\square \)

For convenience, we restate REGINN \(^{\infty }\) in Algorithm 2 subject to \(u^+-u_0\in X\) for which we need to provide \(\varepsilon \) and \(\beta \) as additional input. This version is especially of interest if the regularity \(u^+\in X\) is known a priori so that \(u_0\in X\) ensures \(u^+-u_0\in X\), as desired.

figure b

3 Applications: parameter identification tasks in PDEs

In this section we will verify our abstract assumptions (5) and (6) in the concrete settings of electrical impedance tomography and time domain full waveform inversion (FWI) in the visco-elastic regime which is the state-of-the-art imaging modality in exploration geophysics, see, e.g., [2, 3]. Both inverse problems are locally illposed. We will even show that (19) is satisfied in the acoustic regime. Moreover, our results for FWI carry over to parameter identification tasks for other hyperbolic evolution equations, such as an inverse problem in electromagnetic scattering.

For all applications we rely on the discrete B-spline spaces constructed in Appendix A.1.

3.1 Electrical impedance tomography under the complete electrode model

Electrical impedance tomography (EIT) entails the determination of the electric conductivity distribution of an object by applying electric currents at the boundary through electrodes and measuring the resulting voltages at the boundary as well. Potential applications are, for instance, medical imaging and non-destructive testing.

Let \(\sigma :D\rightarrow [\sigma _{\min },\infty )\), \(\sigma _{\min }>0\), be the searched-for conductivity distribution in the simply connected Lipschitz-domain \(D \subset \mathbb {R}^2\). Further, the p electrodes are denoted by \(E_1,\ldots ,E_p\) and are assumed to be open subsets of \(\partial D\), the boundary of D, having positive surface measure: \(\vert E_j\vert >0\), \(j=1,\ldots ,p\). Moreover, let the electrodes be connected and separated: \(\overline{E}_i\cap \overline{E}_j=\emptyset \), \(i\ne j\). To this electrode configuration we associate the electrode space

$$\begin{aligned} \mathscr {E}_p:=\textrm{span}\{\chi _{E_1},\ldots ,\chi _{E_p}\}\cap L^2_{\scriptscriptstyle \diamondsuit }( \partial D)\subset L^2_{\scriptscriptstyle \diamondsuit }( \partial D) \end{aligned}$$

where \(\chi _{E_i}\) is the indicator function of the ith electrode and \(L^2_{\scriptscriptstyle \diamondsuit }(\partial D)\) is the space of \(L^2\)-functions on the boundary of D having vanishing mean.

The forward problem of impedance tomography under the complete electrode model (CEM) in the weak formulation is based on the bilinear form \(a:Y_p\times Y_p\rightarrow \mathbb {R}\), \(Y_p:= H^1(D)\oplus \mathscr {E}_p\),

$$\begin{aligned} a\big ((v,V),(w,W)\big ):=\int _D\sigma \nabla v\cdot \nabla w\,\textrm{d} x+ \frac{1}{z}\int _{\partial D} (v-V) (w-W)\,\textrm{d}s \end{aligned}$$

Given an electrode (mean) current \(I\in \mathscr {E}_p\) and a contact impedance \(z>0\), find a voltage potential \(u \in H^1(D)\) and an electrode voltage \(U \in \mathscr {E}_p\) such that

$$\begin{aligned} a\big ((u,U),(w,W)\big )=\int _{\partial D}I\,W\textrm{d}s \quad \text { for all }(w,W)\in Y_p. \end{aligned}$$
(22)

The vanishing mean of I and U can be interpreted as conservation of charge and grounding the potential, respectively. CEM (22) is the most accurate mathematical model for EIT currently in use and has a unique solution for \(\sigma \in L^\infty _+(D)\), where

$$\begin{aligned} L^\infty _+(D):=\{\sigma \in L^\infty (D):\sigma \ge \sigma _{\min } >0 \ \text { a.e. in }D\}. \end{aligned}$$

The latter follows from the Lax-Milgram theorem, see [21].

The nonlinear forward operator F describing CEM is

$$\begin{aligned} F:L^\infty _+(D)\subset L^\infty (D)\rightarrow \mathcal {L}(\mathscr {E}_p), \quad \sigma \mapsto \{I\mapsto U\}. \end{aligned}$$

In other words: \(F(\sigma )\) maps the current I to the voltage U of the solution of (22). Its Frechét derivative \(F'(\sigma )\in \mathcal {L}\big (L^\infty (D),\mathcal {L}(\mathscr {E}_p)\big )\) is given by

$$\begin{aligned} F'(\sigma )[h]I=U' \end{aligned}$$

where \((u',U')\in Y_p\) uniquely solves

$$\begin{aligned} a\big ((u',U'),(w,W)\big )=-\int _{D} h \nabla u(I)\cdot \nabla w\textrm{d}x \quad \text {for all }(w,W)\in Y_p \end{aligned}$$
(23)

with \(u=u(I)\) being the first component of the solution of (22) with respect to the current I, see, e.g., [22]. We will similarly use U(I) for the second component of the solution of (22) and \(u'(I)\) for the first component of (23).

We equip \(\mathcal {L}(\mathscr {E}_p)\) with the Hilbert-Schmidt inner product: \(\langle \Lambda , \Gamma \rangle _\text { HS}=\sum _{j=1}^{p-1} \langle \Lambda I_j,\Gamma I_j\rangle _{L^2(\partial D)}\) where \(\{I_1,\ldots ,I_{p-1}\}\) is an orthonormal basis of \(\mathscr {E}_p\). This inner product is independent of the chosen orthonormal basis.

Lemma 3.1

The adjoint operator \(F'(\sigma )^*:\mathcal {L}(\mathscr {E}_p)^*\rightarrow L^\infty (D)^*\) is given by

$$\begin{aligned} F'(\sigma )^*\Gamma =-\sum _{j=1}^{p-1} \nabla u( \Gamma I_j)\cdot \nabla u( I_j) \end{aligned}$$

where \(u(\Gamma I_j)\) and \(u(I_j)\) denote the solutions of (22) with respect to \(I=\Gamma I_j\) and \(I=I_j\), respectively. As a sum of products of two \(L^2\)-functions, \(F'(\sigma )^*\Gamma \) is even in \(L^1(D)\subset L^\infty (D)^*\), so that (6) holds.

Proof

Set \(\Lambda = F'(\sigma )[h]\), \(\Lambda _j=\Lambda I_j\), and \(\Gamma _j=\Gamma I_j\) for one \(\Gamma \in \mathcal {L}(\mathscr {E}_p)\). Then,

$$\begin{aligned}{} & {} \langle \Lambda , \Gamma \rangle _\text { HS}= \sum _{j=1}^{p-1} \langle \Lambda _j,\Gamma _j\rangle _{L^2(\partial D)} {\mathop {=}\limits ^{(22)}} \sum _{j=1}^{p-1} a\big ((u(\Gamma _j),U(\Gamma _j)),(u'(I_j), \Lambda _j)\big )\\{} & {} \quad {\mathop {=}\limits ^{(23)}} - \int _{\partial D} h \sum _{j=1}^{p-1} \nabla u( \Gamma _j)\cdot \nabla u(I_j)\,\textrm{d}s =\langle h, F'(\sigma )^*\Gamma \rangle _{L^\infty (D)\times L^\infty (D)^*} \end{aligned}$$

which settles the argument. \(\square \)

Lemma 3.2

Let \(X^n=X_N^n({D})\), \(n\in \mathbb {N}\), be the tensor product spline space of Appendix A.1 with projector \(\mathcal {P}^n\) defined in (A5). Then, for any \(\sigma \in L_+^\infty (D)\),

$$\begin{aligned} \lim _{n\rightarrow \infty }\big \Vert F'(\sigma )[h-\mathcal {P}^nh]\big \Vert _{\mathcal {L}(\mathscr {E}_p)}=0\quad \text { for any }h\in L^\infty (D). \end{aligned}$$

Proof

We start with

$$\begin{aligned} \big \Vert F'(\sigma )[h-\mathcal {P}^nh]\big \Vert _{\mathcal {L}(\mathscr {E}_p)}^2 =\sum _{j=1}^{p-1} \big \Vert F'(\sigma )[h-\mathcal {P}^nh]I_j\big \Vert _{L^2(\partial D)}^2. \end{aligned}$$

The bilinear form a is bounded and elliptic on \(Y_p\). Hence, it follows from (23) that

$$\begin{aligned} \big \Vert F'(\sigma )[h-\mathcal {P}^nh]I_j\big \Vert _{L^2(\partial D)}\lesssim \big \Vert (h-\mathcal {P}^nh) \vert \nabla u(I_j)\vert \big \Vert _{L^2(D)}. \end{aligned}$$

Since \(\mathcal {P}^nh\xrightarrow {n\rightarrow \infty } h\) pointwise a.e. (Appendix B), the norm on the right tends to zero as \(n\rightarrow \infty \) by the dominated convergence theorem. \(\square \)

So we have validated (5) and (6) for the forward operator of EIT. Moreover, TCC (3) holds for the semi-discrete version \(F_{n_{\max }}\) of F (Remark 2.5) provided the number of electrodes is sufficiently large [22, Theorem 4.5].

3.2 Full waveform inversion

Wave propagation in realistic media can be modeled by a visco-elastic wave equation which accounts for dispersion and attenuation [2, Chapter 5]. Here, we consider the formulation introduced in [23]: Let \(D\subset \mathbb {R}^3\) be a Lipschitz domain. Using \(L\in \mathbb {N}\) damping tensors \({\varvec{\sigma }}_l:[0,T]\times D\rightarrow \mathbb {R}^{3\times 3}_\text {sym}\), \(l=1,\ldots ,L\), the evolution of the particle velocity field \(\textbf{v}:[0,T]\times D\rightarrow \mathbb {R}^3\) and stress tensor \({\varvec{\sigma }}_0:[0,T]\times D\rightarrow \mathbb {R}^{3\times 3}_\text {sym}\) reads

$$\begin{aligned} \rho \, \partial _t \textbf{v}&= \mathrm{\,div\,}\Big (\sum _{l=0}^{L}\varvec{\sigma }_l\Big )+{\varvec{f}}{} & {} \text {in }(0,T)\times D, \end{aligned}$$
(24a)
$$\begin{aligned} \partial _t {\varvec{\sigma }}_0&= C\big (\mu ,\pi \big )\,\varepsilon (\textbf{v}){} & {} \text {in }(0,T)\times D , \end{aligned}$$
(24b)
$$\begin{aligned} \tau _{{\varvec{\sigma }},l}\, \partial _t {\varvec{\sigma }}_l&= C\big (\tau _\text {\tiny S}\mu , \tau _\text {\tiny P}\pi \big )\,\varepsilon (\textbf{v}) -{\varvec{\sigma }}_l,\qquad l=1,\dots ,L,{} & {} \text {in }(0,T)\times D, \end{aligned}$$
(24c)

with zero initial conditionsFootnote 2

$$\begin{aligned} \textbf{v}(0)=\textbf{0}\ \text { and }\ {\varvec{\sigma }}_l(0)=\textbf{0},\ l=0,\ldots ,L. \end{aligned}$$
(24d)

Above, \({\varvec{f}}:(0,T)\times D\rightarrow \mathbb {R}^3\) is the volume force, which initiates the wave, and the functions \(\tau _{\text {\tiny P}},\tau _{\text {\tiny S}}:D\rightarrow \mathbb {R}\) are scaling factors for the unrelaxed bulk modulus \(\pi :D\rightarrow \mathbb {R}\) and shear modulus \(\mu :D\rightarrow \mathbb {R}\), respectively. Further, \(\rho :D\rightarrow \mathbb {R}\) is the mass density. The linear map C(mp), \(m,p\in \mathbb {R}\), is Hooke’s tensor

$$\begin{aligned} C(m,p):\mathbb {R}^{3\times 3}\rightarrow \mathbb {R}^{3\times 3},\qquad C(m,p){\textbf{M}} =2m\textbf{M} + (p-2m) {\text {tr}}(\textbf{M})\textbf{I}, \end{aligned}$$

where \(\textbf{I}\in \mathbb {R}^{3\times 3}\) is the identity matrix and \({\text {tr}}(\textbf{M})\) denotes the trace of \(\textbf{M}\in \mathbb {R}^{3\times 3}\). Finally,

$$\begin{aligned} \varvec{\varepsilon }( \textbf{v}) = \frac{1}{2}\big [(\nabla _x \textbf{v})^\top +\nabla _x \textbf{v}\big ] \end{aligned}$$

is the linearized strain rate.

Wave propagation is frequency-dependent and the numbers \(\tau _{{\varvec{\sigma }},l}>0\), \(l=1,\ldots ,L\), model this dependency about the center frequency \(\omega _0\), see [24, 25]. Introducing the frequency-dependent phase velocities of P- and S-waves,

$$\begin{aligned} v^{2}_{\text {\tiny P}} =\frac{\pi }{\rho }(1+ \tau _{\text {\tiny P}} \alpha ) \ \text { and }\ v^{2}_{\text {\tiny S}} =\frac{\mu }{\rho }(1+ \tau _{\text {\tiny S}} \alpha )\ \text { with } \alpha =\sum _{l=1}^{L} \frac{\omega _0^{2}\tau _{{\varvec{\sigma }},l}^{2}}{1+\omega _0^{2}\tau _{{\varvec{\sigma }},l}^{2}}, \end{aligned}$$

full waveform inversion entails the identification of the five spatially dependent parameters

$$\begin{aligned} u=(\rho ,v_{\text {\tiny S}}, \tau _{\text {\tiny S}}, v_{\text {\tiny P}},\tau _{\text {\tiny P}}) \end{aligned}$$

from partial wave field measurements. For a physically meaningful open subsetFootnote 3\(\mathcal {D}(F)\subset L^\infty (D)^5\) the full waveform forward operator

$$\begin{aligned} F:\mathcal {D}(F)\subset L^\infty (D)^5\rightarrow L^2((0,T),H),\quad u \mapsto y:=(\textbf{v},{\varvec{\sigma }}_0,\dots ,{\varvec{\sigma }}_L), \end{aligned}$$

is well defined with

$$\begin{aligned}H=L^2(D,\mathbb {R}^3)\times L^2(D,\mathbb {R}^{3\times 3}_{\textrm{sym}})^{1+L} \end{aligned}$$

where y \(\in C^{1}([0,T],H)\) is the unique classical solution of (24) with source \({\varvec{f}}\in W^{1,1}((0,T),L^2(D,\mathbb {R}^3))\), see [26] (of course, y satisfies boundary conditions, whose concrete formulations we have omitted for simplicity).

Remark 3.3

Actually, the operator modeling seismic imaging is \(\Phi =M\circ F\) where \(M:L^2((0,T),H) \rightarrow S \) denotes the measurement operator (S is the space of seismograms). We can reasonably assume that M is linear and bounded. If F satisfies (5) and (6), so does \(\Phi \).

For suitable \(w=(\textbf{w},\varvec{\psi }_0,\ldots ,\varvec{\psi }_L)\in H\) we define operators A, B, and Q mapping into H by

$$\begin{aligned} Aw=-\begin{pmatrix} \mathrm{\,div\,}\big (\sum _{l=0}^{L}\varvec{\psi }_l\big ) \\ \varepsilon (\textbf{w})\\ \vdots \\ \varepsilon (\textbf{w})\end{pmatrix}\!,\ B^{-1}w=\begin{pmatrix} \frac{1}{\rho }\,\textbf{w}\\ C\big (\mu ,\pi \big )\varvec{\psi }_0\\ C\big (\tau _\text {\tiny S}\mu , \tau _\text {\tiny P}\pi \big )\varvec{\psi }_1\\ \vdots \\ C\big (\tau _\text {\tiny S}\mu , \tau _\text {\tiny P}\pi \big )\varvec{\psi }_L \end{pmatrix}\!,\ Qw=\begin{pmatrix} \textbf{0}\\ \textbf{0}\\ \frac{1}{\tau _{{\varvec{\sigma }},1}}\,\varvec{\psi }_1\\ \vdots \\ \frac{1}{\tau _{{\varvec{\sigma }},L}}\,\varvec{\psi }_L \end{pmatrix}\!, \end{aligned}$$

so that (24) collapses to

$$\begin{aligned} \begin{aligned} B\partial _ty(t)+Ay(t)+BQy(t)&=f(t)&\text {in }(0,T),\\ y(0)&=0. \end{aligned}\ \end{aligned}$$
(25)

Note that only \(B:\mathcal {D}(F)\subset L^\infty (D)^5\rightarrow \mathcal {L}(H)\) depends on the parameters to be identified: \(B=B(u)\). It is Fréchet-differentiable with

$$\begin{aligned} B'(u)[\widehat{u}]\! \begin{pmatrix}\! \textbf{w}\\ \varvec{\psi }_0\\ \vdots \\ \varvec{\psi }_L \!\end{pmatrix} = \begin{pmatrix} \widehat{\rho } \,\textbf{w}\\ -\frac{\widehat{\rho }}{\rho }\widetilde{C}(\mu ,\pi )\varvec{\psi }_0 +{\rho }\, \widetilde{C}'(\mu ,\pi )\! \begin{bmatrix} \widetilde{\mu }\\ \widetilde{\pi } \end{bmatrix}\! \varvec{\psi }_0\\ -\frac{\widehat{\rho }}{\rho }\widetilde{C}(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\varvec{\psi }_1 +{\rho }\, \widetilde{C}'(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\! \begin{bmatrix} \widehat{\mu }\\ \widehat{\pi } \end{bmatrix}\! \varvec{\psi }_1\\ \vdots \\ -\frac{\widehat{\rho }}{\rho }\widetilde{C}(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\varvec{\psi }_L +{\rho }\, \widetilde{C}'(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\! \begin{bmatrix} \widehat{\mu }\\ \widehat{\pi } \end{bmatrix}\! \varvec{\psi }_L \end{pmatrix} \end{aligned}$$
(26)

where \(\widehat{u}=(\widehat{\rho },\widehat{v}_{\text {\tiny S}}, \widehat{\tau }_{\text {\tiny S}}, \widehat{v}_{\text {\tiny P}},\widehat{\tau }_{\text {\tiny P}})\) and

$$\begin{aligned} \widetilde{\mu }&= \frac{2v_{\text {\tiny S}}}{1+\tau _{\text {\tiny S}}\alpha }\, \widehat{v}_{\text {\tiny S}} - \frac{\alpha \,v_{\text {\tiny S}}^2}{(1+\tau _{\text {\tiny S}}\alpha )^2}\, \widehat{\tau }_{\text {\tiny S}},&\qquad \widetilde{\pi }&=\frac{2v_{\text {\tiny P}}}{1+\tau _{\text {\tiny P}}\alpha }\, \widehat{v}_{\text {\tiny P}} - \frac{\alpha \,v_{\text {\tiny P}}^2}{(1+\tau _{\text {\tiny P}}\alpha )^2}\, \widehat{\tau }_{\text {\tiny P}},\\ \widehat{\mu }&= \frac{2\tau _\text {\tiny S}\,v_{\text {\tiny S}}}{1+\tau _{\text {\tiny S}}\alpha }\, \widehat{v}_{\text {\tiny S}} + \frac{v_{\text {\tiny S}}^2}{(1+\tau _{\text {\tiny S}}\alpha )^2}\, \widehat{\tau }_{\text {\tiny S}},&\qquad \widehat{\pi }&=\frac{2\tau _\text {\tiny P}\,v_{\text {\tiny P}}}{1+\tau _{\text {\tiny P}}\alpha }\, \widehat{v}_{\text {\tiny P}} +\frac{v_{\text {\tiny P}}^2}{(1+\tau _{\text {\tiny P}}\alpha )^2}\, \widehat{\tau }_{\text {\tiny P}}. \end{aligned}$$

Further,

$$\begin{aligned} \widetilde{C}(m,p):=C(m,p)^{-1}=C\left( \frac{1}{4m},\frac{p-m}{m(3p-4m)}\right) . \end{aligned}$$

and

$$\begin{aligned} \widetilde{C}'(m,p)\begin{bmatrix} \widehat{m}\\ \widehat{p} \end{bmatrix} =-\widetilde{C}(m,p)\circ C(\widehat{m},\widehat{p})\circ \widetilde{C}(m,p). \end{aligned}$$
(27)

Both, \(\widetilde{C}(\mu ,\pi )\) and \(\widetilde{C}(\tau _\text {\tiny S}\mu ,\tau _\text {\tiny P}\pi )\) are well defined for \(u\in \mathcal {D}(F)\), see [26, Section 4.1] for the involved definition of \(\mathcal {D}(F)\).

Lemma 3.4

Let \(\{\widehat{u}_n\}_{n\in \mathbb {N}}\subset L^\infty (D)^5\) be bounded and be convergent to \(\widehat{u}\in L^\infty (D)^5\) pointwise a.e. Then, for any \(u\in \mathcal {D}(F)\),

$$\begin{aligned} \lim _{n\rightarrow \infty } \big \Vert B'(u)[\widehat{u}_n-\widehat{u}]h \big \Vert _H=0 \quad \text {for any } h\in H. \end{aligned}$$
(28)

Proof

The \(L^2\)-space H has inner product

$$\begin{aligned} \big \langle (\textbf{v}, {\varvec{\sigma }}_0,\ldots , {\varvec{\sigma }}_L), (\textbf{w}, \varvec{\psi }_0,\ldots , \varvec{\psi }_l)\big \rangle _H= \int _D\Big (\textbf{v}\cdot \textbf{w}+\sum _{l=0}^L\varvec{\sigma }_l:\varvec{\psi }_l\Big )\,\textrm{d}x \end{aligned}$$

where the colon indicates the Frobenius inner product on \(\mathbb {R}^{3\times 3}\). Now, in view of (26) and (27) we see that the integrand of \(\Vert B'(u)[\widehat{u}_n-u]h \Vert _H^2\) converges to zero pointwise a.e. Furthermore, as \(\{\widehat{u}_n\}_{n\in \mathbb {N}}\) is bounded in \(L^\infty (D)^5\), the absolute value of the integrand is bounded by an integrable function (as a matter of fact, the integrand is the sum of products of two \(L^2\)-functions with an \(L^\infty \)-function). The assertions follows from the dominated convergence theorem. \(\square \)

For completion, we quote a result from [26, Theorem 4.4] which we will need below.

Lemma 3.5

Under the assumptions from above, F is Fréchet-differentiable at any \(u\in \mathcal {D}(F)\). For \(\widehat{u}\in L^{\infty }(D)^{5}\) we have \(\overline{y}:=F'(u)\widehat{u}\in C([0,T],H)\) given as the unique weak solution of

$$\begin{aligned} B\partial _t\overline{y}(t)+A\overline{y}(t)+BQ\overline{y}(t)&=-B'(u)[\widehat{u}]\left( \partial _ty(t)+Qy(t)\right) , \\ \overline{y}(0)&=0, \end{aligned}$$

with \(y=F(u)\). Further, for any \(t\in [0,T]\),

$$\begin{aligned} \Vert \overline{y}\Vert _{C([0,t],H)}\le C\Vert B'(u)[\widehat{u}](\partial _ty+Qy)\Vert _{L^1((0,t),H)} , \end{aligned}$$
(29)

where C depends continuously on the operator norms of B, \(B^{-1}\), Q, and on T.

We define

$$\begin{aligned} X^n =X_N^n \times \dots \times X_N^n\quad \text {(5 factors)} \end{aligned}$$
(30)

as well as

$$\begin{aligned} \mathcal {P}^n:L^{\infty }(D)^{5}\rightarrow X^n,\quad \mathcal {P}^nu=(\mathcal {P}_N^nu_1,\ldots ,\mathcal {P}_N^nu_{5}), \end{aligned}$$

where \(X_N^n\) and \(\mathcal {P}_N^n\) are given by the cardinal B-spline spaces in (A3) and (A5), respectively. We point out that also heterogeneous choices for the factors in \(X^n\) are possible as long as \(X^n\) keeps nested with respect to n. The convergence properties of \(\{\mathcal {P}^n\}_n\), see Appendix B, then guarantee (5) according to the next lemma.

Lemma 3.6

Let \(X^n\) and \(\mathcal {P}^n\) be as above. For the full waveform forward operator F with \({\varvec{f}}\in W^{1,1}((0,T),L^2(D,\mathbb {R}^3))\) we have that

$$\begin{aligned} \lim _{n\rightarrow \infty }\Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\Vert _{L^2((0,T),H)}=0 \end{aligned}$$

for any \(u\in \mathcal {D}(F)\) and all \(\widehat{u}\in L^{\infty }(D)^{5}\).

Proof

The function \(\overline{y}_n:=F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\) is the unique weak solution of

$$\begin{aligned} \begin{aligned} B\partial _t\overline{y}_n(t)+A\overline{y}_n(t)+BQ\overline{y}_n(t)&=-B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ]\left( \partial _ty(t)+Qy(t)\right) , \\ \overline{y}_n(0)&=0, \end{aligned} \end{aligned}$$

which according to (29) satisfies

$$\begin{aligned} \Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\Vert _{C([0,T],H)}\le C\Vert [B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ](\partial _ty+Qy)\Vert _{L^1((0,T),H)} . \end{aligned}$$

Since \(C([0,T],H)\hookrightarrow L^2((0,T),H)\) is continuous, the assertion of the theorem follows if we can show that the right-hand side of the above stability estimate goes to zero as \(n\rightarrow \infty \). Applying Proposition B.1 componentwise, we deduce that \(\mathcal {P}^{n_k}\widehat{u}\rightarrow \widehat{u}\) pointwise a.e. for a subsequence \(\{n_k\}_{k\in \mathbb {N}}\). Using \(\Vert \mathcal {P}^{n_k}\widehat{u}\Vert _{L^{\infty }(D)^{5}}\le C_\mathcal {P}\Vert \widehat{u}\Vert _{L^{\infty }(D)^{5}}\) and \(\partial _ty+Qy\in C([0,T],H)\), we see that (26) corresponds to linear combinations of uniformly bounded functions in t for all k. In particular, we can find \(g\in L^1(0,T)\) such that

$$\begin{aligned} \sup _{k}\Vert [B'(u)\big [\widehat{u}-\mathcal {P}^{n_k}\widehat{u}\big ](\partial _ty+Qy)(t)\Vert _H\le g(t). \end{aligned}$$

By (28) we can then apply the dominated convergence theorem for integration in time which yields \(\Vert B'(u)\big [\widehat{u}-\mathcal {P}^{n_k}\widehat{u}\big ](\partial _ty+Qy)\Vert _{L^1((0,T),H)}\rightarrow 0\) as \(k\rightarrow \infty \). By uniqueness of the pointwise limit \(\widehat{u}\) the latter convergence even holds for the whole sequence, see [20, Prop. 10.13(2)]. \(\square \)

Condition (6), which guarantees the regularization property (Theorem 2.3), was proven for the visco-elastic case (24) in [26, Theorem 4.5 and Remark 4.6]. Unfortunately, the TCC, which is the remaining condition for the rigorous applicability of REGINN \(^{\infty }\), is subject of current research in the context of FWI and only special cases are known to hold. For example, in [6] the TCC has recently been shown for a semi-discrete setting in the acoustic regime, cf. (31) below.

Remark 3.7

The findings for the full waveform forward operator in this subsection, that is, (5) and (6) hold, carry over to parameter identification tasks for other hyperbolic evolution equations of the form (25). Examples are the parameter-to-state maps with respect to the acoustic and elastic wave equations (which are in fact simplifications of the visco-elastic model). A further example is the Maxwell system where the components of \(y=(\textbf{E}, \textbf{H})\) are the electric and magnetic fields, respectively. The involved operators are

$$\begin{aligned} A=\begin{pmatrix} \sigma \text {Id} &{} -\text {curl}\\ \text {curl} &{} 0 \end{pmatrix},\ \ B=\begin{pmatrix} \varepsilon \text {Id}&{}0 \\ 0 &{} \mu \text {Id} \end{pmatrix},\quad Q=0,\ \text { and }\ f= \begin{pmatrix}-\textbf{J}_{\text {e}} \\ \textbf{J}_{\text {m}} \end{pmatrix} \end{aligned}$$

where \(\varepsilon ,\mu ,\sigma :D\rightarrow \mathbb {R}\) are the (electric) permittivity, the (magnetic) permeability, and the conductivity. Further, \(\textbf{J}_{\text { e}}, \textbf{J}_{\text { m}}:[0,\infty )\times D\rightarrow \mathbb {R}^3\) are the current and magnetic densities. See [27, Section 5] for the details to verify (5) and (6) for the map \(F:(\varepsilon , \mu )\mapsto y\).

3.3 A stronger result for FWI in the acoustic regime

Setting \(\mu =0\), \(\tau _{{\varvec{\sigma }},l}=0\), \(\tau _{\text {\tiny P}}=\tau _{\text {\tiny S}}=0\) in (24) and introducing the hydrostatic pressure \(p={\text {tr}}(\varvec{\sigma }_0)/3\), we can formally derive the acoustic wave equation as first order system,

$$\begin{aligned} \rho \,\partial _t \textbf{v}- \nabla p&= {\varvec{f}}_1 \qquad \text {in }(0,T)\times D, \end{aligned}$$
(31a)
$$\begin{aligned} \frac{1}{\rho \nu ^2} \,\partial _t p - \mathrm{\,div\,}\textbf{v}&= f_2 \qquad \text {in } (0,T)\times D, \end{aligned}$$
(31b)

where we allow two source terms and where \(\nu =v_\text {\tiny P}\) is the compression wave speed. In our abstract formulation (25), it is represented by \(y=(\textbf{v},p)\in L^2((0,T),H)\), \(H=L^2(D,\mathbb {R}^d)\times L^2(D)\), and the operators

$$\begin{aligned} A=- \begin{pmatrix} 0&{} \nabla \\ \mathrm{\,div\,}&{} 0 \end{pmatrix}, \quad B= \begin{pmatrix} \rho \text {Id} &{} 0\\ 0 &{}\frac{1}{\rho \,\nu ^2} \text {Id} \end{pmatrix}, \quad Q=0.\end{aligned}$$
(32)

Further, A is defined on

$$\begin{aligned} \mathcal {D}(A)=\big \{(\textbf{v},p)\in L^2(D,\mathbb {R}^d) \times H_0^1(D):\ \mathrm{\,div\,}\textbf{v}\in L^2(D)\big \}. \end{aligned}$$
(33)

If \({\varvec{f}}=({\varvec{f}}_1,f_2)\in W^{1,1}((0,T),H)\) then (31) has a unique classical solution in \(C^1([0,T],H) \cap C([0,T],\mathcal {D}(A))\), see [27] (recall that we have zero initial conditions (24d)).

The parameter-to-state map here is

$$\begin{aligned} F:\mathcal {D}(F) \subset L^\infty (D)^2\rightarrow L^2((0,T),H),\quad u=(\rho ,\nu )\mapsto y=(\textbf{v},p), \end{aligned}$$
(34)

with domain of definition

$$\begin{aligned} \mathcal {D}(F)=\big \{(\rho ,\nu )\in L^{\infty }(D)^{2}:\&0<\rho _{\min }<\rho<\rho _{\max }<\infty \ \text { and } \nonumber \\&0<\nu _{\min }< \nu< \nu _{\max }<\infty \ \text { a.e. in }D\big \}. \end{aligned}$$
(35)

As explained in Remark 3.7, F satisfies (5) and (6). In this part we even verify the stronger compatibility condition (19), that is, for a special choice of \(X^n\) and X, see (18), we will specify the decay function \(\beta \) in Theorem 3.8 below. For this purpose, we restrict ourselves to the discretization space \(X^n=X^n_1\times X^n_1\) with \(X^n_1=X^n_1(D)\) from (A3) with \(N=1\). Thus, \(X^n\) consists of piecewise constant functions in the sequel. The associated projector onto \(X^n\) then reads

$$\begin{aligned} \mathcal {P}^n:L^{\infty }(D)^{2}\rightarrow X^n,\quad \mathcal {P}^nu=(\mathcal {P}_1^nu_1,\mathcal {P}_1^nu_2), \end{aligned}$$
(36)

whose components are given by

$$\begin{aligned} \mathcal {P}_1^nu_j=\sum _{k\in \mathcal {I}_n}2^{nd}\left( \int _{\square ^n_{k}}u_j(x)\;\textrm{d}x\right) \mathbb {1}_{\square ^n_{k}} \end{aligned}$$

for \(j\in \{1,2\}\) according to (A5). Here,

$$\begin{aligned} \square ^n_{k}: =2^{-n}([0,1]^d+k) \end{aligned}$$
(37)

is the translated and dilated unit cube. Further, \(\mathcal {I}_n = \{k\in \mathbb {Z}^d: \square ^n_{k}\subset \overline{D}\} \), see (A4).

We are left to determine \(X\subset L^{\infty }(D)^2\) where the subspace X is governed by a stronger topology measuring some kind of smoothness. Intuitively, X should be large enough to still contain a wide class of discontinuous profiles, on the other hand we need some minimal a priori regularity such that its \(X^n\)-projections facilitate a common decay rate in (19). For \(s>0\) fixed we set

$$\begin{aligned}X:=L^{\infty }_s(D)^2 \end{aligned}$$

whose component spaces are characterized by

$$\begin{aligned} L^{\infty }_s(D):=\left\{ w\in L^{\infty }(D):\ \sup _{h\ne 0}\frac{\Vert w(\cdot - h)-w\Vert _{L^2(D^{h})}}{\vert h\vert ^s}<\infty \right\} \end{aligned}$$
(38)

with \(D^{h}:=\{x\in D:\ x-h\in D\}\) for any \(h\in \mathbb {R}^d\). We assign the norm

$$\begin{aligned} \Vert \cdot \Vert _{L^{\infty }_s(D)}:=\Vert \cdot \Vert _{L^{\infty }(D)}+[\cdot ]_{B^{s}_{2,\infty }(D)}, \end{aligned}$$

where \([\cdot ]_{B^{s}_{2,\infty }(D)}\) is a semi-norm given by the numerical value of the supremum in (38). Originally, \([\cdot ]_{B^{s}_{2,\infty }(D)}\) emerges from the definition of Hilbertian Besov-Nikolskii spaces

$$\begin{aligned} B^{s}_{2,\infty }(D)=\left\{ w\in L^2(D):\ \sup _{h\ne 0}\frac{\Vert w(\cdot - h)-w\Vert _{L^2(D^{h})}}{\vert h\vert ^s}<\infty \right\} \end{aligned}$$

with \(\Vert \cdot \Vert _{B^{s}_{2,\infty }(D)}:=\Vert \cdot \Vert _{L^2(D)}+[\cdot ]_{B^{s}_{2,\infty }(D)}\), see [28]. We set

$$\begin{aligned} \Vert u\Vert _{X}^2:=\Vert \rho \Vert _{L^{\infty }_s(D)}^2+\Vert \nu \Vert _{L^{\infty }_s(D)}^2 \end{aligned}$$
(39)

and obviously have that \(\Vert u\Vert _{L^{\infty }(D)^2}\le \Vert u\Vert _{X}\) for all \(u\in X\).

Theorem 3.8

Let \(D\subset \mathbb {R}^d\), \(d\in \{1,2,3\}\), be a domain with \(C^1\) boundary and assume that \({\varvec{f}}=({\varvec{f}}_1,f_2)\in W^{2,1}((0,T),H)\) with \({\varvec{f}}(0)=\partial _t {\varvec{f}}(0)=\textbf{0}\) and \({\varvec{f}}_1\in C([0,T],L^{q}(D,\mathbb {R}^d))\) for some \(q\in (2,q_{\max })\), where \(q_{\max }\) only depends on D and the ratio \(\rho _{\max }/\rho _{\min }<\infty \) of the parameters from the definition of \(\mathcal {D}(F)\). Let \(\mathcal {P}^n\) be as in (36) and let \(X=L^\infty _s(D)^2\) be as above for some \(0<s\le 1/2\). Then, there is a neighborhood U of \(u^+\in \mathcal {D}(F)\) such that

$$\begin{aligned} \Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\Vert _{L^2((0,T),H)}\le C^+ \Vert \widehat{u}\Vert _{X}\Big (2^{-s(q-2)/(4d(q-1))}\Big )^n \end{aligned}$$
(40)

for all \(u\in U\) and any \(\widehat{u}\in X\).

To prove the theorem, some preparation is required. We start with the observation that \(L^{\infty }_s(D)\subset B^{s}_{2,\infty }(D)\) and \(X_1^n\subset L^{\infty }_s(D)\) if \(s\le 1/2\). The latter inclusion can be seen as follows: Let \(h=(h_1,\dots ,h_d)\) with \(\vert h\vert <2^{-n}\). Without loss of generality we may assume that \(h_i\ge 0\). Now, thanks to the symmetry of the cubes, we estimate

$$\begin{aligned}&\Vert \mathbb {1}_{-h+\square ^n_{k}}-\mathbb {1}_{\square ^n_{k}}\Vert _{L^2(D^{h})}^2\\&\qquad \qquad \le 2\text {vol}_d\Big (\big [0,2^{-n}\big ]^d\backslash \big [0,2^{-n}-h_1\big ]\times \dots \times \big [0,2^{-n}-h_d\big ]\Big )\\&\qquad \qquad = 2\Big (2^{-nd}-\prod _{i=1}^d\big (2^{-n}-h_i\big )\Big )\\&\qquad \qquad \le 2\Big (2^{-nd}-\big (2^{-n}-\vert h\vert \big )^d\Big ) \le d\,2^{-n(d-1)+1}\vert h\vert \end{aligned}$$

for all \(k\in \mathcal {I}_n\), where we used the mean value theorem in the last step.

Lemma 3.9

Let D be a bounded Lipschitz domain and let \(\mathcal {P}^n\) be as in (36). Then, for any \(u\in X\) and \(0<s\le 1/2\),

$$\begin{aligned} \Vert u-\mathcal {P}^nu\Vert _{L^2(D)^2}\le C\Vert u\Vert _{X}2^{-ns/d}, \end{aligned}$$

where C only depends on D and the dimension d.

Proof

In view of (39) it suffices to prove the assertion for each component \(\mathcal {P}^n_1\) of \(\mathcal {P}^n\) and for any \(w\in L^{\infty }_s(D)\). Let \(\square \subset \mathbb {R}^d\) be a sufficiently large rectangle containing D. According to [29] there exists \(\widetilde{w}\in B^s_{2,\infty }(\square )\) such that \(\widetilde{w}\vert _D=w\) and \(\Vert \widetilde{w}\Vert _{B^s_{2,\infty }(\square )}\le \widetilde{C}\Vert w\Vert _{B^s_{2,\infty }(D)}\), where \(\Vert w\Vert _{B^s_{2,\infty }(D)}\) can be replaced by the stronger norm \(\Vert w\Vert _{L^{\infty }_s(D)}\). Using a dyadic partition of \(\square \) at level n based on the cubes \(\{\square _{k}^n\}_{k}\) (for which we restrict \(\square \) to have integer side length), we have

$$\begin{aligned} \Vert \widetilde{w}-\mathcal {\widetilde{P}}^n_1\widetilde{w}\Vert _{L^2(\square )}\le [\widetilde{w}]_{B^s_{2,\infty }(\square )}2^{-ns/d}, \end{aligned}$$

see [30], where \(\mathcal {\widetilde{P}}^n_1\) is defined as in (36) but with respect to the larger index set \(\mathcal {I}_n(\square )= \{k\in \mathbb {Z}^d:\square ^n_{k}\subset \overline{\square }\}\). Setting

$$\begin{aligned} D_n:=\bigcup _{k\in \mathcal {I}_n(D)}\square _{k}^n, \end{aligned}$$

we conclude that \(\mathcal {P}^n_1w\vert _{D_n}=\mathcal {\widetilde{P}}^n_1\widetilde{w}\vert _{D_n}\) and \(\mathcal {P}^n_1w\vert _{D\backslash D_n}=0\). The latter implies

$$\begin{aligned} \Vert w-\mathcal {P}^n_1w\Vert _{L^2(D\backslash D_n)}=\Vert w\Vert _{L^2(D\backslash D_n)}&\le \sqrt{\text {vol}_d(D\backslash D_n)}\,\Vert w\Vert _{L^\infty (D)}\\&\le \sqrt{\text {vol}_d(D\backslash D_n)}\,\Vert w\Vert _{L^{\infty }_s(D)} \end{aligned}$$

for which we can estimate for some \(C_D<\infty \)

$$\begin{aligned} \text {vol}(D\backslash D_n) \le C_D\mathcal {H}^{d-1}(\partial D)\, 2^{-n} \end{aligned}$$

according to the inclusion \(D\backslash D_n\subset \cup _{x\in \partial D} \big (x+2^{-n}[-1,1]^d]\big )\), where \(\mathcal {H}^{d-1}(\partial D)\) denotes the \(d-1\)-dimensional Hausdorff measure of \(\partial D\). Altogether, we obtain for \(s\le 1/2\) that

$$\begin{aligned} \Vert w-\mathcal {P}^n_1w\Vert _{L^2(D)}&\le \Vert w-\mathcal {P}^n_1w\Vert _{L^2(D_n)} + \Vert w-\mathcal {P}^n_1w\Vert _{L^2(D\backslash D_n)} \\&\le \big (\widetilde{C}2^{-ns/d} + \sqrt{C_D\mathcal {H}^{d-1}(\partial D)}2^{-n/2}\big ) \Vert w\Vert _{L^{\infty }_s(D)} \\&\le C \Vert w\Vert _{L^{\infty }_s(D)} 2^{-ns/d} \end{aligned}$$

which proves the lemma. \(\square \)

Lemma 3.10

Under the assumptions of Theorem 3.8, there is a \(q_{\max }>2\) such that for any \(u^+=(\rho ^+,\nu ^+)\in \mathcal {D}(F)\) there exists a neighborhood U of \(u^+\) such that

$$\begin{aligned} \sup \left\{ \Vert \partial _ty\Vert _{L^1((0,T),H^{q})}:y=(p,\textbf{v})=F(u),\ u= (\rho ,\nu )\in U \right\} <\infty \end{aligned}$$

for any \(q\in (2,q_{\max })\) where

$$\begin{aligned} H^q:=L^q(D)\times L^q(D,\mathbb {R}^d). \end{aligned}$$
(41)

Here, \(q_{\max }\) only depends on D and the ratio \(\rho _{\max }/\rho _{\min }<\infty \) of the parameters from the definition of \(\mathcal {D}(F)\).

Proof

The proof makes use of converting higher time regularity to higher spatial integrability. By Theorem 2.6 of [27], we know that for \(u^+\in \mathcal {D}(F)\) we have \(\partial _ty=(\partial _t p,\partial _t\textbf{v})\in C([0,T],\mathcal {D}(A))\cap C^1([0,T],H)\), in particular \(\partial _t p\in L^1((0,T),H^1_0(D))\). By Sobolev embedding, see [31], we obtain at least \(\partial _t p\in L^1((0,T),L^6(D))\) for \(d\in \{1,2,3\}\). Again by Theorem 2.6 of [27], since the constant of the stability estimate depends continuously on B and thus on \(u\), we can actually conclude \(\Vert \partial _t p\Vert _{L^1((0,T),L^6(D))}<\infty \) uniformly in a neighborhood of \(u^+\). Concerning \(\partial _t\textbf{v}\), we only have integrability information about its divergence. Therefore, we first note that by (31) we have

$$\begin{aligned} -\mathrm{\,div\,}\left( \frac{1}{\rho }\nabla p\right) =\mathrm{\,div\,}\left( \frac{{\varvec{f}}_1}{\rho }\right) -\mathrm{\,div\,}\partial _t\textbf{v}=\mathrm{\,div\,}\left( \frac{{\varvec{f}}_1}{\rho }\right) + \left( \partial _t f_2-\frac{\partial _t^2 p}{\rho \nu ^2}\right) \end{aligned}$$

in the sense of distributions. As \((\rho , \nu )\in \mathcal {D}(F)\) is bounded away from zero uniformly, the parameters do not affect the integrability of the right-hand side terms. Now Meyers’ estimate, see [32, Thm. 1], implies for fixed \(t\in [0,T]\) and any \(q\in (2,q_{\max })\) that

$$\begin{aligned} \Vert \nabla p\Vert _{L^q(D,\mathbb {R}^d)}\le C \left( \frac{\Vert {\varvec{f}}_1\Vert _{L^q(D,\mathbb {R}^d)}}{\rho _{\min }}+\Vert \partial _t f_2\Vert _{L^2(D)}+\frac{\Vert \partial _t^2 p\Vert _{L^2(D)}}{\rho _{\min } \nu _{\min }^2}\right) , \end{aligned}$$

where \(q_{\max }\) depends on D and \(\rho _{\max }/\rho _{\min }<\infty \), while C additionally depends on q. Since \(\Vert \partial _t^2 p\Vert _{L^2(D)}\) can be bounded in terms of \(\Vert \partial _t^2 {\varvec{f}}\Vert _{L^1((0,T),H)}\) and a constant which depends continuously on B, that is, on u, see Theorem 2.6 of [27], we concude that \(\Vert \nabla p\Vert _{L^1((0,T),L^q(D,\mathbb {R}^d))}<\infty \) uniformly in a neighborhood of \(u^+\). Comparing with (31), we finally get that also \(\Vert \partial _t\textbf{v}\Vert _{L^1((0,T),L^q(D,\mathbb {R}^d))}<\infty \) locally uniform in \(u\). This completes the proof. \(\square \)

Finally, we can prove the main result.

Proof of Theorem 3.8

The proof uses a more elaborate analysis of the stability estimate

$$\begin{aligned} \Vert F'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )(t)\Vert _{L^2((0,T),H)}\le C\Vert B'(u)\big (\widehat{u}-\mathcal {P}^n\widehat{u}\big )\partial _ty\Vert _{L^1((0,T),H)} , \end{aligned}$$

compared to Lemma 3.6. Recall that the constant C depends continuously on the operator norms of \(B(u)\), \(B^{-1}(u)\) and on T according to (29). Since also \(B\mapsto B^{-1}\) is (locally) continuous, we can assume without loss of generality that the above inequality holds for some fixed \(C=C_{r',u^+}>0\) for all \(\Vert u-u^+\Vert _{L^{\infty }(D)^{2}}<r'\) by shrinking the original \(r'>0\) otherwise. Then, for any \(\delta >0\),

$$\begin{aligned}&\Vert B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ]\partial _ty\Vert _{L^1((0,T),H)} \\&=\left\| B'(u)\left[ \frac{\widehat{u}-\mathcal {P}^n\widehat{u}}{\Vert \widehat{u}\Vert _{X}}\right] \big (\partial _ty(\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}+\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert< \delta \Vert \widehat{u}\Vert _{X}\}})\big )\right\| _{L^1((0,T),H)} \\&\qquad \times \Vert \widehat{u}\Vert _{X} \\&\le \Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\frac{\Vert \widehat{u}-\mathcal {P}^n\widehat{u}\Vert _{L^{\infty }(D)^{2}}}{\Vert \widehat{u}\Vert _{X}}\, \left\| \partial _ty\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}\right\| _{L^1((0,T),H)} \\&\qquad \times \Vert \widehat{u}\Vert _{X} \\&\qquad +\Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\delta \left\| \partial _ty\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert < \delta \Vert \widehat{u}\Vert _{X}\}}\right\| _{L^1((0,T),H)}\Vert \widehat{u}\Vert _{X}. \end{aligned}$$

Due to \(\Vert \widehat{u}-\mathcal {P}^n\widehat{u}\Vert _{L^{\infty }(D)^{2}}\le (1+C_{\mathcal {P}})\Vert \widehat{u}\Vert _{L^{\infty }(D)^{2}}\le (1+C_{\mathcal {P}})\Vert \widehat{u}\Vert _{X}\) by (S2) and \(C_X=1\) in (18), we obtain, by dropping the complementary indicator function in the bottom line above, that

$$\begin{aligned}{} & {} \Vert B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ]\partial _ty\Vert _{L^1((0,T),H)} \\{} & {} \quad \le (1+C_{\mathcal {P}})\Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\left\| \partial _ty\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}\right\| _{L^1((0,T),H)}\Vert \widehat{u}\Vert _{X} \\{} & {} \qquad +\Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\delta \left\| \partial _ty\right\| _{L^1((0,T),H)}\Vert \widehat{u}\Vert _{X}. \end{aligned}$$

The middle line here can also be directly expressed in terms of \(\delta \). To this end, we apply Hölder’s inequality with exponent \(q/2>1\) in view of (41) to get

$$\begin{aligned}&\left\| \partial _ty(\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}\right\| _{L^1((0,T),H)}\\&\qquad \qquad = \int _0^T \left( \int _D \big \vert \partial _ty(t)\big \vert ^2(x)\mathbb {1}_{\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}}(x)\;\textrm{d}x\right) ^{1/2}\textrm{d}t \\&\qquad \qquad \le \int _0^T \left( \big \Vert \partial _ty(t)\big \Vert _{H^q}^2 \text {vol}\big (\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}\big )^{(q-2)/q} \right) ^{1/2}\textrm{d}t \\&\qquad \qquad =\Vert \partial _ty\Vert _{L^1((0,T),H^q)}\text {vol}\big (\{\vert \widehat{u}-\mathcal {P}^n\widehat{u}\vert \ge \delta \Vert \widehat{u}\Vert _{X}\}\big )^{(q-2)/(2q)} \\&\qquad \qquad \le \Vert \partial _ty\Vert _{L^1((0,T),H^q)}\left( \frac{\Vert \widehat{u}-\mathcal {P}^n\widehat{u}\Vert _{L^2(D)^{2}}^2}{\delta ^2\Vert \widehat{u}\Vert _{X}^2}\right) ^{(q-2)/(2q)}, \end{aligned}$$

where we additionally employed Tschebyscheff’s inequality in the bottom line, see e.g., [33]. With Lemma 3.9 we can further estimate

$$\begin{aligned} \frac{\Vert \widehat{u}-\mathcal {P}^n\widehat{u}\Vert _{L^2(D)^{2}}^2}{\delta ^2\Vert \widehat{u}\Vert _{X}^2}\le \frac{C\Vert \widehat{u}\Vert _{X}^2\big (2^{-(s/d)}\big )^n}{\delta ^2\Vert \widehat{u}\Vert _{X}^2}= \frac{C\big (2^{-(s/d)}\big )^n}{\delta ^2}. \end{aligned}$$

Altogether, we get with a similar Hölder-inequality argument for

$$\begin{aligned} \Vert \partial _ty\Vert _{L^1((0,T),H)}\le \Vert \partial _ty\Vert _{L^1((0,T),H^q)}\text {vol}_d(D)^{(q-2)/(2q)} \end{aligned}$$

that

$$\begin{aligned}{} & {} \Vert B'(u)\big [\widehat{u}-\mathcal {P}^n\widehat{u}\big ]\partial _ty\Vert _{L^1((0,T),H)} \le \Vert \widehat{u}\Vert _{X}\Vert B'(u)\Vert _{\mathcal {L}(L^{\infty }(D)^{2},\mathcal {L}(H))}\Vert \partial _ty\Vert _{L^1((0,T),H^q)}\\{} & {} \quad \times \Bigg ((1+C_{\mathcal {P}})\left( \frac{C\big (2^{-(s/d)}\big )^n}{\delta ^2}\right) ^{(q-2)/(2q)}+\delta \text {vol}_d(D)^{(q-2)/(2q)}\Bigg ) \end{aligned}$$

which holds for all \(\delta >0\). Optimization in \(\delta \) then yields \(\delta \propto \big (2^{-s(q-2)/(4d(q-1))}\big )^n\). By Lemma 3.10 and the continuity of \(B'\), we can indeed find \(C^+<\infty \) such that (40) holds. \(\square \)

4 Numerical Results

We present numerical experimentsFootnote 4 on a two-parameter reconstruction to demonstrate the operation of REGINN \(^\infty \) in a test scenario where all assumptions required for our analysis in the previous sections are satisfied.

Recall from Theorem 2.3 that, in general, the regularization property holds only in the weak-\(\star \) topology permitting a kind of strange convergence behavior. Therefore, we test Algorithm 1 as the noise level approaches zero and also how it behaves under different initial spaces \(X^{n_0}\). We will start with a rather low dimensional \(X^{n_0}\) such that \(n_m\) increases successively in the course of the Newton iteration (while-loop) and in contrast also with some large dimensional \(X^{n_0}\) which corresponds to a more static use of Tikhonov regularization throughout all iterations.

Our experiments rely on the acoustic wave equation in one spatial dimension, \(d=1\), where \(D=(0,1)\) and \(T=1\):

$$\begin{aligned} \begin{aligned} \frac{1}{\rho \nu ^2} \,\partial _t p - \partial _{x} v&= f_1{} & {} \text {in }(0,1)\times (0,1),\\ \rho \, \partial _t v - \partial _{x} p&= f_2{} & {} \text {in }(0,1)\times (0,1),\\ v(0,\cdot )=p(0,\cdot )&= 0{} & {} \text {on } (0,1),\\ p(\cdot ,0)=p(\cdot ,1)&=0{} & {} \text {on } (0,1). \end{aligned} \end{aligned}$$
(42)

The source components \(f_1,f_2:[0,1]\times [0,1]\rightarrow \mathbb {R}\) are

$$\begin{aligned} \begin{aligned} f_1(t,x)&=f_1(x)=100\Big (x(x-1)\frac{1}{\rho (x)\nu (x)^2}-\frac{\pi }{2}\cos \left( \frac{\pi }{2}x\right) \Big ),\\ f_2(t,x)&=100\Big (-t(2x-1)+\sin \left( \frac{\pi }{2}x\right) \rho (x)\Big ), \end{aligned} \end{aligned}$$
(43)

where

$$\begin{aligned} \rho (x)=1+\frac{1}{5}\,\mathbb {1}_{[7/30,17/30]}(x) \quad \text {and} \quad \nu (x)=1-\frac{1}{10}\,\mathbb {1}_{[13/30,23/30]}(x).\end{aligned}$$
(44)

The corresponding exact data, that is, the solution of (42) and (43), are given by

$$\begin{aligned} p(t,x)=100tx(x-1)\quad \text {and}\quad v(t,x)=100t\sin \left( \frac{\pi }{2}x\right) . \end{aligned}$$
(45)

We solve the appearing wave equations during inversion for the parameters by the FEM-based MATLAB (R2021a) command pdepe with 300 spatial and 100 temporal grid points. Both sets of points are distributed equidistantly in [0, 1].

Our discrete parameter spaces \(X^n=X^n_1\times X^n_1\) are generated by the piecewise constant cardinal B-spline as explained in Appendix A.1. So, conditions (S1)–(S3) are fulfilled. Note that the dimension of \(X^n_1\) is \(2^n\). In view of Remark 2.5 we set \(n_{\max }=8\) yielding the semi-discrete parameter-to-state map

$$\begin{aligned} F_{n_{\max }}:\mathcal {D}(F_{n_{\max }}) \subset L^\infty (D)^2 \rightarrow L^2([0,1],H),\quad (\rho ,\nu )\mapsto (p,v), \end{aligned}$$
(46)

where \((p,v)\in L^2([0,1],H)\), \(H= L^2(0,1)^2\), solves (42) with (43) and \((\rho ,\nu )\in \mathcal {D}(F_{n_{\max }})= X^{n_{\max }} \cap \mathcal {D}(F)\), see (35) for \(\mathcal {D}(F)\). Within our computations, \(\Vert \cdot \Vert _{ L^2([0,1],H)}\) is discretized by the corresponding space-time Euclidean norm and denoted by \(\Vert \cdot \Vert \). Since \(F_{n_{\max }}\) satisfies the TCC (3), see Appendix C, Theorem 2.1 guarantees termination of REGINN \(^\infty \) applied to the inverse problem

$$\begin{aligned} \text {find } (\rho ,\nu )\in X^{n_{\max }}:\quad F_{n_{\max }}(\rho ,\nu )\approx (p^\delta ,v^\delta ). \end{aligned}$$
(47)

To simplify notation we use the same symbols for the continuous and the discrete versions of functions such as p, v, \(\rho \), \(\nu \), etc.

We apply REGINN \(^\infty \) (Algorithm 1) to (47) where we choose \(q_n=n/\log _2 C_\infty \) for \(J_{n,m}\) from (10) in accordance with the lower bound in (A6) below. For each m the computation of Newton update candidates \(s_m\in X^n\) is realized—benefiting greatly from the smoothness of the Tikhonov functionals—by a steepest descent routine with Armijo stepsize rule in a loop over n until (9) is met. We adapt \(\mu _m\) during iteration according to the rule proposed in [4]: we start with \(\mu _1=\mu _0\) and set

$$\begin{aligned} \mu _m={\left\{ \begin{array}{ll} \min \{1-\frac{j_{m-2}}{j_{m-1}}(1-\mu _{m-1}), 0.999\}, &{} j_{m-1}\ge j_{m-2},\\ 0.9\mu _{m-1}, &{} \text {otherwise}, \end{array}\right. } \qquad m\ge 2, \end{aligned}$$

where \(\mu _0\in (0,1)\) is user-supplied and \(j_m\) denotes the number of gradient decent steps needed to compute the update \(s_m\). Complementary, the underlying discretization level n will be increased if the gradient descent loop stagnates on \(X^n\), which we consider to occur if the ratio of two successive gradient step evaluations does not exceed a fixed threshold close to 1, say 0.99999. We stop the algorithm either by the discrepancy principle or if \(n\ge n_{\max }\) happens, that is, if the discretization of \(X^n\) would become finer than the computational grid used in the pdepe-routine for solving the wave equation. In the latter case, we still perform \(u_{m+1}=u_m+\widetilde{s}_m\) with the last update candidate \(\widetilde{s}_m\in X^{n_{\max }}\) before abortion. We emphasize that \(\widetilde{s}_m\) is not a Newton update in the sense of (9), but the corresponding \(u_{m+1}\) might still fulfill the discrepancy principle unlike \(u_m\).

In our experiments we especially want to detect the jump regions [7/30, 17/30] and [13/30, 23/30], where the parameters differ from the homogeneous background material \((\rho _0,\nu _0)=(1,1)\in X^{n_{\max }}\), respectively, that we take as initial guess. Note that no grid point of \(X^n\) coincides with either of the jump discontinuity points for all n so that the error of any reconstruction of \(\rho \) and \(\nu \) will always be at least \((\max {\rho }-\min {\rho })/2=1/10\) and \((\max {\nu }-\min {\nu })/2=1/20\) with respect to the \(L^{\infty }\)-norm, respectively.

Fig. 1
figure 1

Approximate solutions \(\rho _M\) (blue, left column) and \(\nu _M\) (blue, right column) by Algorithm 1 with initial spaces \(X^2\), \(X^5,\) and \(X^8\) (top to bottom) (color figure online)

Fig. 2
figure 2

Left: Convergence history for exact data case \(n_0=2\) from Fig. 1. Peaks for \(j_m\) arise whenever the discretization level \(n_m\) is increased as cumulative contribution. Right: Graphical presentation of the values \(j_m\) (blue) and \(\mu _{m-1}\) (black dashed) as functions of \(m\in \{11,\ldots ,17\}\) where \(n_m=4\). Moreover, we have included the quotient \(\Vert b_m^0\Vert /\Vert b_{m-1}^0\Vert \) (red) which is always below \(\mu _{m-1}\). This illustrates that (15) holds for a tiny \(\omega \) (color figure online)

Fig. 3
figure 3

Regularized solutions \(\rho _M\) (blue, left column) and \(\rho _M\) (blue, right column) by Algorithm 1 for relative noise levels of \(5\%\), \(2\%\) and \(1\%\) (top to bottom) (color figure online)

First, we investigate the case of ‘exact’ data, that is, \((p^\delta ,v^\delta )= (p,v)\) with (pv) from (45). Despite of \(\delta =0\) our data might still be contaminated by some discretization error with respect to \(F_{n_{\max }}\) since the corresponding analytical solution \((\rho ,\nu )\) given by (44) is not contained in \(X^{n_{\max }}\). Choosing \(\mu _0=0.7\), \(\gamma =0.8\), \(C_\infty =1.1\), we run Algorithm 1 for different \(n_0\) to observe how its choice affects the outcome. Note that setting \(\tau \) is redundant here because termination is solely forced by n exceeding \(n_{\max }\). Figure 1 displays the exact parameter functions \(\rho \) and \(\nu \) (red) and the corresponding outputs \(\rho _M\) and \(\nu _M\) (blue) of Algorithm 1 when starting with \(n_0\in \{2,5,8\}\), respectively. We see that the larger \(n_0\) is, the smoother the output becomes, while the points of discontinuity are more sharply located for smaller \(n_0\). Hence, for the reconstruction of jump discontinuities, \(n_0\) shall be chosen large enough to locate discontinuity points sufficiently precise while at the same time it should not be too large to prevent oversmoothing. Figure 2 shows a more detailed convergence history in the case \(n_0=2\) and confirms that the majority of Newton steps is indeed undertaken with \(n_m\le 4\).

Next, we study the case of noisy data. For this purpose we generate noise vectors  \(\zeta \) as random samples from a centered Gaussian distribution and scale it such that \(\Vert \zeta \Vert =\delta \Vert (p,v)\Vert \). Since \(\delta \) is a relative perturbation here, the discrepancy principle must be adjusted accordingly. As before, we employ Algorithm 1 with \(\mu _0=0.7\), \(\gamma =0.8\), \(C_\infty =1.1\), \(\tau =1.1\). Using the insights from our exact data case we set \(n_0=5\) as initial value to balance the aforementioned effects of globally smooth and locally oscillating reconstructions. The corresponding results for \(\rho _M\) and \(\nu _M\) are shown in Fig. 3 for \(\delta =5\%\), \(\delta =2\%\), and \(1\%\). We see that the reconstructions’ profiles approach the correct jump height of the exact solution as \(\delta \) becomes smaller. In all three cases, termination occurs by reaching the discretization limit, however, each last update fulfills the discrepancy principle afterwards. Altogether, the plots are in agreement with the weak-\(\star \) regularization property of REGINN \(^\infty \) (Theorem 2.3).

5 Conclusion

We have investigated a novel iterative regularization algorithm tailored for non-linear illposed problems between \(L^\infty (D)^\ell \) and a normed space Y. The main focus was on generating uniformly bounded iterates relying on a Tikhonov-like regularization term. Due to the non-smooth structure of \(L^\infty (D)^\ell \), a straightforward implementation would require non-smooth or box-constrained calculus which we could circumvent by using discretization in combination with equivalent \(L^p(D)^\ell \)-norms for \(p<\infty \). Under reasonable assumptions on the input parameters, our algorithm REGINN \(^\infty \) terminates after finitely many steps. Further, it fulfills the regularization property in the weak-\(\star \) topology as the noise level of the Y-data tends to zero. Depending on the non-linearity, this convergence can be reformulated as convergence with respect to a norm. Numerical experiments with a model problem illustrate the theoretical findings.

Future research may include a convergence rate analysis under higher regularity assumptions as in (19) or under more general variational source conditions with respect to a Bregman distance [13]. We could even incorporate a metric to overcome that \(L^\infty (D)^\ell \) is non-separable; an approach proposed in [7]. Concerning the data space, especially the task of finding proper measures for the misfit in seismograms, the Kantorovich-Rubinstein (KR) norm has recently proven advantageous in exploration geophysics, see, e.g., [34, 35]. This fact suggests an implementation of our method under the KR-norm on Y. In fact, our theory also allows more general distance functions on Y. For instance, any distance concept is admissible which is convex in one of its two arguments (e.g. Bregman distances). Such a concept can in particular be learned within a predefined set of admissible distance functions in practice and is subject of ongoing research, see [36].