
Consistency of the Bayes method for the inverse scattering problem


Published 18 March 2024 © 2024 The Author(s). Published by IOP Publishing Ltd
Citation: Takashi Furuya et al 2024 Inverse Problems 40 055001. DOI: 10.1088/1361-6420/ad3089


Abstract

In this work, we consider the inverse scattering problem of determining an unknown refractive index from far-field measurements using the nonparametric Bayesian approach. We use a large collection of 'samples', which are noisy discrete measurements taken from the scattering amplitude. We study the frequentist properties of the posterior distribution as the sample size tends to infinity. Our aim is to establish the consistency of the posterior distribution with an explicit contraction rate in terms of the sample size. We consider two different priors on the space of parameters. The proof relies on stability estimates for the forward and inverse problems. Due to the ill-posedness of the inverse scattering problem, the contraction rate is of logarithmic type. We also show that this contraction rate is optimal in the statistical minimax sense.


Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

In this work, we study the Bayes method for solving the inverse medium scattering problem. Our aim is to prove the consistency property of the posterior distribution. Let the refractive index satisfy $n\geqslant 0$ and let $1-n$ be a compactly supported function in $\mathbb{R}^3$ with $\mathrm{supp}(1-n)\subset D$, where D is an open bounded smooth domain; the regularity of n will be specified later. Let $u = u^\textrm{inc}+u_n^\textrm{sca}$ satisfy

Equation (1.1a)

and

Equation (1.1b)

Assume that $u^\textrm{inc}$ is the plane incident field, i.e. $u^\textrm{inc} = e^{\mathbf{i} \kappa x\cdot\theta}$ with $\theta\in\mathcal{S}^{2}$, where $\mathcal{S}^{2} = \left\{\theta \in \mathbb{R}^{3} : \lvert \theta \rvert = 1 \right\}$. Then the scattered field $u_n^\textrm{sca}$ satisfies

Equation (1.1c)

where $\theta^{\prime} = x/|x|$; see, for example, [Ser17, p 232]. The inverse scattering problem is to determine the medium perturbation $1-n$ from the knowledge of the scattering amplitude $u_n^\infty(\theta^{\prime},\theta)$ for all $\theta^{\prime},\theta\in\mathcal{S}^{2}$ at one fixed energy $\kappa^2$.
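For orientation, the standard formulation behind (1.1a)–(1.1c) reads as follows (recalled here under the usual conventions; the labels indicate the corresponding displayed equations, which may differ in notation):
$$\Delta u + \kappa^{2}\, n(x)\, u = 0 \quad \text{in } \mathbb{R}^{3}, \qquad \text{(cf. (1.1a))}$$
$$\lim_{|x|\to\infty} |x|\Bigl(\frac{\partial u_n^\textrm{sca}}{\partial |x|} - \mathbf{i}\kappa\, u_n^\textrm{sca}\Bigr) = 0 \quad \text{uniformly in } x/|x|, \qquad \text{(Sommerfeld radiation condition, cf. (1.1b))}$$
$$u_n^\textrm{sca}(x,\theta) = \frac{e^{\mathbf{i}\kappa |x|}}{|x|}\, u_n^\infty(\theta^{\prime},\theta) + O\bigl(|x|^{-2}\bigr) \quad \text{as } |x|\to\infty,\ \theta^{\prime} = x/|x|. \qquad \text{(far-field expansion, cf. (1.1c))}$$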

It is known that the scattering amplitude $u_n^\infty(\theta^{\prime},\theta)$ uniquely determines the near-field data of (1.1a) on $\partial D$, which in turn determines the Dirichlet-to-Neumann map of (1.1a) provided $\kappa^2$ is not a Dirichlet eigenvalue of $-\Delta$ on D; see, for example, [Nac88]. Combining this fact with the uniqueness results proved in [SU87], one can show that the scattering amplitude $u_n^\infty(\theta^{\prime},\theta)$ for all $\theta^{\prime},\theta\in\mathcal{S}^{2}$ uniquely determines n, at least when n is essentially bounded. As for stability, a log-type estimate was derived in [HH01].

In this paper, we apply the Bayesian approach to the inverse scattering problem. The study of this method is motivated by practical considerations. In practice, it is impossible to obtain full knowledge of the scattering amplitude. Instead, the following experiment is more realistic. We first randomly choose an incident direction and send the corresponding incident field towards the probed region. Then we measure the scattering amplitude at another randomly chosen direction. The experiment can be repeated as many times as we wish. The goal is then to make an inference on the refractive index based on such observation data. To describe the method more precisely, we first introduce the measurement model. Let µ be the uniform distribution on $\mathcal{S}^{2}\times\mathcal{S}^{2}$, i.e. $\mu = \mathrm{d}\omega/|\mathcal{S}^{2}|^2$, where $\mathrm{d}\omega$ is the product measure on $\mathcal{S}^{2}\times\mathcal{S}^{2}$, that is, $\int_{\mathcal{S}^{2}\times\mathcal{S}^{2}}\mathrm{d}\omega = |\mathcal{S}^{2}|^2$. We also write $\mu = \mathrm{d}\xi$ and hence $\int_{\mathcal{S}^2\times\mathcal{S}^2}\mathrm{d}\xi = 1$. Consider the iid random variables $X_i\sim\mu$, $i = 1,2,\ldots,N$, with $N\in\mathbb{N}$. In other words, $\{X_i\}_{i = 1}^N$ is a sequence of independent samples of µ on $\mathcal{S}^{2}\times\mathcal{S}^{2}$. Denote

Equation (1.2)

where $(\theta^{\prime},\theta)$ is a realization of $X_i$. The observation of the scattering amplitude $G(n)(X_i)$ is polluted by measurement noise, which is assumed to be a Gaussian random variable. Since $G(n)(X_i)$ is complex-valued, we treat it as an $\mathbb{R}^2$-valued function. For convenience, we slightly abuse the notation by writing

Consequently, the statistical model of the scattering problem is given as

Equation (1.3)

where σ > 0 is the noise level and $I_2$ is the $2\times 2$ identity matrix. We also assume that $W^{(N)}:= \{W_i\}_{i = 1}^N$ and $X^{(N)}:= \{X_i\}_{i = 1}^N$ are independent.
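The design and noise model (1.3) are straightforward to simulate. The sketch below is a minimal illustration: `forward_map` is a hypothetical placeholder (a genuine implementation requires a Helmholtz solver), and the finite parametrization `n_coeffs` of the refractive index is assumed only for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_uniform_sphere(N, rng):
    """Draw N points uniformly distributed on the unit sphere S^2."""
    v = rng.standard_normal((N, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

def forward_map(n_coeffs, theta_out, theta_in):
    """Placeholder for the scattering amplitude G(n)(theta', theta).
    Purely illustrative surrogate; NOT the physical far-field map."""
    return np.exp(1j * (theta_out @ theta_in) * np.sum(n_coeffs))

N, sigma = 500, 0.05
theta_out = sample_uniform_sphere(N, rng)   # observation directions theta'
theta_in = sample_uniform_sphere(N, rng)    # incident directions theta
n_coeffs = np.array([0.1, -0.2, 0.05])      # illustrative parametrization of n

G_vals = np.array([forward_map(n_coeffs, theta_out[i], theta_in[i]) for i in range(N)])
# identify C with R^2 and add iid N(0, sigma^2 I_2) noise, as in (1.3)
Y = np.stack([G_vals.real, G_vals.imag], axis=1) + sigma * rng.standard_normal((N, 2))
```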

The theme of this paper is the inference of n from the observational data $(Y^{(N)},X^{(N)})$ with $Y^{(N)} = \{Y_i\}_{i = 1}^N$ by the Bayes method. In particular, we are interested in the asymptotic behavior of the posterior distribution induced by a large class of Gaussian process priors on n as $N\to\infty$. The aim is to establish a statistical consistency theory for recovering n in (1.1c) with an explicit convergence rate as the number of measurements N increases, i.e. the contraction rate of the posterior distribution to the 'ground truth' n0 when the observation data are indeed generated by n0. Gaussian process priors are often used in applications in which efficient numerical simulations can be carried out based on modern MCMC algorithms such as the pCN (preconditioned Crank–Nicolson) method [CRSW13].
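For readers unfamiliar with pCN, the following is a minimal sketch of one pCN step for a posterior proportional to $e^{\ell_N(F)}\,\Pi(\mathrm{d}F)$ with a centered Gaussian prior; the finite-dimensional coefficient representation and the toy log-likelihood are assumptions made only for illustration.

```python
import numpy as np

def pcn_step(F, log_lik, prior_cov_chol, beta, rng):
    """One preconditioned Crank-Nicolson move for a Gaussian prior N(0, C).

    F              : current state (coefficient vector)
    log_lik        : callable returning the log-likelihood ell_N(F)
    prior_cov_chol : Cholesky factor L of the prior covariance C = L L^T
    beta           : step size in (0, 1]
    """
    xi = prior_cov_chol @ rng.standard_normal(F.shape)    # draw from the prior
    proposal = np.sqrt(1.0 - beta**2) * F + beta * xi     # pCN proposal
    log_accept = log_lik(proposal) - log_lik(F)           # prior terms cancel exactly
    if np.log(rng.uniform()) < log_accept:
        return proposal, True
    return F, False

# toy usage with a hypothetical quadratic log-likelihood
rng = np.random.default_rng(1)
dim = 10
L_chol = np.linalg.cholesky(np.diag(1.0 / (1.0 + np.arange(dim)) ** 2))
log_lik = lambda F: -0.5 * float(np.sum(F ** 2))
F = np.zeros(dim)
for _ in range(1000):
    F, accepted = pcn_step(F, log_lik, L_chol, beta=0.2, rng=rng)
```

The pCN proposal leaves the Gaussian prior invariant, which is why only the likelihood ratio enters the acceptance step; this is what makes the method robust when the discretization of F is refined.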

The study of inverse problems in the Bayesian inversion framework has recently attracted much attention since Stuart's seminal article [Stu10] (see also [DS17]). The setting of the problem considered in this paper is closely related to the ones studied in [GN20, Kek22]. In [GN20], Gaussian process priors were used in the Bayesian approach to study the recovery of the diffusion coefficient in an elliptic equation by measuring the solution at randomly chosen interior points with uniform distribution. It was shown that the posterior distribution concentrates around the true parameter at a rate $N^{-\lambda}$ for some λ > 0 as $N\to\infty$, where N is the number of measurements (the sample size). Previously, (frequentist) consistency of Bayesian inversion for the elliptic PDE considered in [GN20] was derived in [Vol13]. However, the contraction rates obtained in [Vol13] were only given implicitly. Based on the method in [GN20], similar results were proved in [Kek22] for the parabolic equation, where the aim is to recover the absorption coefficient from interior measurements of the solution. For further results on Bayesian inverse problems in non-linear settings, we refer the reader to other interesting papers [Abr19, NS17, NS19, MNP21, Nic20, NP21, NW20]. On the other hand, for linear inverse problems, the statistical guarantees of nonparametric Bayesian methods with Gaussian priors have been extensively studied and are well understood; see for example [ALS13, KLS16, KvdVvZ11, MNP19, Ray13] and references therein.

The ideas used in proving the consistency of Bayesian inversion for the inverse scattering problem studied here are similar to those used in [GN20, Kek22], whose main ideas are from [MNP21]. Unlike the polynomial contraction rates derived in [GN20, Kek22], the posterior distribution $\Pi(\cdot|Y^{(N)},X^{(N)})$ of $n|(Y^{(N)},X^{(N)})$ contracts around the true refractive index n0 as $N\to\infty$ at a logarithmic rate. The logarithmic rate is due to the ill-posedness of the inverse medium scattering problem with knowledge of the scattering amplitude at one fixed energy; see [HH01].

This paper is organized as follows. In section 2, we describe the statistical model arising from the scattering problem. We state the main consistency theorems with contraction rates for rescaled Gaussian process priors and rescaled Gaussian sieve priors. The proofs of the theorems are given in section 3. In appendix A, we derive some estimates for the forward scattering problem, and in appendix B, we prove the optimality of the logarithmic stability estimate for the inverse medium scattering problem based on Mandache's idea. A similar instability estimate was also derived in [Isa13]. To make the paper self-contained, we present our own proof and slightly refine the estimate obtained in [Isa13].

2. The statistical inverse scattering problem

2.1. Some function spaces and notations

Throughout this paper, we shall use the symbols $\lesssim$ and $\gtrsim$ for inequalities holding up to a universal constant. For two real sequences $(a_{N})$ and $(b_{N})$, we write $a_N \simeq b_N$ if both $a_{N} \lesssim b_{N}$ and $b_{N}\lesssim a_{N}$ for all sufficiently large N. For a sequence of random variables $Z_N$ and a real sequence $(a_{N})$, we write $Z_N = O_{\mathbb{P}}(a_N)$ if for all ε > 0 there exists $1 \leqslant M_\varepsilon\lt\infty$ such that for all N large enough, $\mathbb{P}(|Z_N|\geqslant M_\varepsilon a_N)\lt\varepsilon$. Denote by ${\mathcal L}(Z)$ the law of a random variable Z. Let $C_c^t({\mathcal O})$ with $t\geqslant 0$ denote the Hölder space of order t of functions with compact support in the bounded smooth domain ${\mathcal O}$.

Let D be a bounded smooth domain in $\mathbb{R}^{3}$, let $s \geqslant 0$ be an integer, and consider the Hilbert space

For non-integer s, $H^s(D)$ is defined by interpolation, see [LM72]. It is known that the restriction operator to D is a continuous linear map from $H^s(\mathbb{R}^3)$ to $H^s(D)$ [LM72, (8.6)]. The space $H_{0}^{s}(D)$ is the completion of $C_c^\infty(D)$ with respect to the $H^s(D)$-norm. Also, for $s\gt1/2$ with $s\notin \mathbb{Z} +1/2$, the zero extension of $f\in H_{0}^s(D)$ (extension of f by 0 outside of D) defines a continuous map $H_{0}^s(D)\to H^s(\mathbb{R}^3)$ [LM72, theorem 11.4]. Let ${\mathcal K}\subset\mathbb{R}^3$ be a compact set and define $H^s_{\mathcal K} = \{f\in H^s(\mathbb{R}^3): \mathrm{supp}(f)\subseteq {\mathcal K}\}$. In fact, we have

see [McL00, theorems 3.29 and 3.33].

We now define the space of parameters. For ${M}_{0}\gt1$ and $s \gt \frac{3}{2}$, let

where the traces are defined in the sense of [LM72, theorem 8.3]. For each $n\in\mathcal{F}^s_{M_0}$ we extend n by setting $n\equiv 1$ in $\mathbb{R}^3\setminus D$, and still denote the extension by n. Then it is clear that $\mathrm{supp}(1-n)\subset D$. Note that for $n\in\mathcal{F}^s_{M_0}$, we only put a restriction on the size of n, but not on the $H^s(D)$-norm of n. As in [NvdGW20, GN20, AN19, Kek22], we consider a re-parametrization of $\mathcal{F}^s_{M_0}$. We consider a link function Φ satisfying

  • (i)  
    $\Phi:(-\infty,\infty)\to(0,M_0)$, $\Phi(0) = 1$, $\Phi^{\prime}(z)\gt0$ for all z;
  • (ii)  
    for any $k\in{\mathbb{N}}$

One example satisfying (i) and (ii) is the logistic function

As pointed out in [NvdGW20, section 3], by utilizing a characterization of the space $H_{0}^{s}(D)$ (see e.g. [LM72, theorem 11.5]), one can show that the parameter space can be realized as

Equation (2.1)

We end this subsection by emphasizing that our link function is different from those in [NvdGW20, GN20, AN19, Kek22]; see (i).
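To make condition (i) concrete, one admissible choice (our own illustration, which is a shifted and rescaled logistic function; the specific example used above may differ) is
$$\Phi(z) = \frac{M_0\, e^{z}}{e^{z}+M_0-1}, \qquad \Phi(0) = 1, \qquad \Phi^{\prime}(z) = \frac{M_0(M_0-1)\,e^{z}}{\bigl(e^{z}+M_0-1\bigr)^{2}} \gt 0,$$
which maps $(-\infty,\infty)$ onto $(0,M_0)$ and has bounded derivatives of all orders on $\mathbb{R}$.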

2.2. An abstract statistical model

For each forward map $G : {\mathcal F}_{M_0}^s\rightarrow L^2({\mathcal S}^2\times{\mathcal S}^2)$, we define the reparametrized forward map by

Equation (2.2a)

and consider the following random design regression model

Equation (2.2b)

Assume that $\mathcal{G}$ satisfies

Equation (2.2c)

and for each $F_{1},F_{2} \in (H^{1}(D))^{*}$ one has

Equation (2.2d)

for some constant $S_2\gt0$ and $t\unicode{x2A7E} 1$.

Remark 2.1. We implicitly assume that taking pointwise values of ${\mathcal G}(F)$ on ${\mathcal S}^2\times{\mathcal S}^2$ is legitimate. For instance, in the inverse scattering problem here, ${\mathcal G}(F)$ (corresponding to the scattering amplitude $u^\infty_n$) is, in fact, analytic on ${\mathcal S}^2\times{\mathcal S}^2$.

The statistical model (2.2b) with conditions (2.2c) and (2.2d) falls into the general framework described in [GN20]. We remark that the uniform boundedness of the forward map ${\mathcal G}$ (condition (2.2c)) in [GN20, NvdGW20] (elliptic boundary value problem) or in [Kek22] (parabolic initial-boundary value problem) is ensured by the positivity assumption on the coefficient, and the bound $S_1$ is determined by the fixed boundary value or the fixed initial and boundary values. Due to these facts, the ranges of the link functions used in [GN20, NvdGW20] and [Kek22] are not required to have finite upper bounds. In the scattering problem, the boundedness requirement on the forward map (2.2c) cannot be guaranteed by a sign restriction on the potential. In this case, we choose a link function with finite range (like Φ given above) to ensure (2.2c).

Remark 2.2. In view of (1.2) and (2.1), one notes that the statistical model (1.3) fits into the framework of (2.2b). For the inverse scattering problem studied here, if the forward map G(n) is defined by the far-field pattern (1.2) with refractive index $n \in \mathcal{F}^s_{M_{0}}$, $s \gt \frac{3}{2}$, then from (A.9) we see that (2.2c) is satisfied with $S_{1} = S_{1}(D,\kappa,M_{0})$. From (A.14) we have

where $C = C(D,\kappa,M_0)$. By using (ii) and [NvdGW20, lemma 29], we have

and

therefore we see that (2.2d) is satisfied with $S_{2} = S_{2}(D,\kappa,M_{0})$ and t = 1.

Observe that the random vectors $(Y_i,X_i)$ are iid with laws $\mathbb{P}_{F}^{i}$. It turns out that the Radon–Nikodym derivative of $\mathbb{P}_{F}^{i}$ is given by

Equation (2.3)

By slightly abusing the notation, we now define $\mathbb{P}_F^N = \otimes_{i = 1}^N \mathbb{P}_F^i$, the joint law of the random vectors $(Y_i,X_i)_{i = 1}^N$. Moreover, $\mathbb{E}_F^i$ and $\mathbb{E}_F^N$ denote the expectation operators with respect to the laws $\mathbb{P}_F^i$ and $\mathbb{P}_F^N$, respectively.

In the Bayesian approach, let Π be a Borel probability measure on the parameter space $H_{0}^{s}(D)$ supported in the Banach space C(D). From the continuity property of $(F,(y,\xi))\to p_F(y,\xi)$, the posterior distribution $\Pi(\cdot|Y^{(N)},X^{(N)})$ of $F|(Y^{(N)},X^{(N)})$ is given by

Equation (2.4a)

where the log-likelihood function is written as

Equation (2.4b)

Finally, we end this subsection by referring to the monograph [Nic23] for a nice introduction to the above preliminaries.
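Concretely, under the Gaussian noise model the log-likelihood (2.4b) is, up to an additive constant not depending on F, a weighted residual sum of squares. A minimal sketch (the forward evaluation `G_of_F` is an assumed placeholder to be supplied by a solver):

```python
import numpy as np

def log_likelihood(F, Y, X, G_of_F, sigma):
    """Log-likelihood ell_N(F) of the regression model (2.2b)/(2.4b).

    Y      : (N, 2) array of observations (real and imaginary parts)
    X      : length-N array-like of design points on S^2 x S^2
    G_of_F : callable returning the (N, 2) array of values G(F)(X_i)
             (placeholder; in practice a Helmholtz solver)
    sigma  : noise standard deviation
    """
    residual = Y - G_of_F(F, X)
    # the Gaussian normalizing constant is dropped: it does not depend on F
    # and cancels in the posterior (2.4a)
    return -0.5 * np.sum(residual ** 2) / sigma ** 2
```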

2.3. Statistical convergence rates

In this work we would like to show that the posterior distribution arising from certain priors concentrates near a sufficiently regular ground truth $\Phi(F_{0})$, and to derive a bound on the rate of contraction, assuming that the observation data $(Y^{(N)},X^{(N)})$ are generated through the model (2.2a)–(2.2d) under the law $\mathbb{P}_{F_{0}}^N$.

2.3.1. Rescaled Gaussian priors.

We now describe explicitly the Gaussian priors introduced in [GN20] (see also [Kek22]).

Assumption 2.3. Let $s\gt t+3/2$, $t\geqslant 1$, let ${\mathcal H}$ be a Hilbert space continuously embedded into $H_{0}^s(D)$, and let $\Pi^{\prime}$ be a centered Gaussian Borel probability measure on the Banach space $C^0_c(D)$ that is supported on a separable measurable linear subspace of $C_c^t(D)$. Assume that the reproducing kernel Hilbert space (RKHS) of $\Pi^{\prime}$ equals ${\mathcal H}$.

Here, we refer to [GN21, definition 2.6.4] for the definition of the RKHS, and to [GN20, example 25] for an example satisfying assumption 2.3. For s given in assumption 2.3, let $\Pi^{\prime}$ be as in assumption 2.3 and let $F^{\prime}\sim\Pi^{\prime}$; we consider the rescaled prior

Equation (2.5)

Again, $\Pi_N$ defines a centered Gaussian prior on C(D) and its RKHS ${\mathcal H}_N$ is still ${\mathcal H}$ but with the norm

Equation (2.6)

for all $F\in{\mathcal H}$. We now introduce the main device of our proof, concerning posterior contraction with an explicit rate around F0; its proof can be found in [GN20, theorem 14].

Theorem 2.4. Let $(\mathcal{H},\Pi^{\prime})$ satisfy assumption 2.3 with integer s, let $\Pi_{N}\equiv\Pi_{N}[s]$ be the rescaled prior given in (2.5), and let $\Pi_N(\cdot|Y^{(N)},X^{(N)})$ be the posterior distribution given in (2.4a) with $\Pi = \Pi_{N}$. Assume that $F_0\in{\mathcal H}$ and that the observation $(Y^{(N)},X^{(N)})$ is generated through the model (2.2b)–(2.2d) under the law $\mathbb{P}_{F_{0}}^{N}$. If we denote $\delta_N = N^{-(s+1)/(2s+5)}$, then for any K > 0, there exists a large L > 0, depending on $\sigma, F_0, K, s, t, D, S_1,S_2$, such that

Equation (2.7a)

In addition, there exists a large $L^{^{\prime}}\gt0$, depending on $\sigma, K, s, t$, such that

Equation (2.7b)

We now apply theorem 2.4 to the inverse scattering problem considered here. Relying on the contraction rate (2.7a) and the regularization property (2.7b), and taking into account the stability estimate of $G^{-1}$ (see theorem A.1), we can show that the posterior distribution arising from the statistical inverse scattering model (1.3) contracts around n0 in the $L^\infty$-risk, using ideas from [MNP21]. In light of the link function, we define the push-forward posterior on the refractive index n by

By the push-forward map, we can rewrite (2.7a) and (2.7b) in terms of n. That is, as $N\to\infty$ we have

Equation (2.8a)

and

Equation (2.8b)

where the second estimate can be derived from the estimates proved in [NvdGW20, lemma 29] (see also [GN20, (27)]). Here L depends on $\sigma, n_0, K, s, t, D, \kappa, M_0$ and Lʹ depends on $\sigma, K, s, t, \kappa, M_0$.

Theorem 2.5. Let $t \geqslant 2$ and $s\gt t+3/2$ be integers, and fix a real parameter $M_0\gt1$. We further assume that $\epsilon$ is any constant satisfying $0\lt\epsilon\lt\frac{2t-3}{2t+3}$. Let $\Pi_N(\cdot|Y^{(N)},X^{(N)})$, $F_{0} \in \mathcal{H}$, and $\delta_N = N^{-(s+1)/(2s+5)}$ be as in theorem 2.4. The ground truth refractive index is $n_0 = \Phi\circ F_0\in{\mathcal F}^s_{M_0}$. Then for any K > 0, there exists a constant $C = C(\sigma, n_0,K,s,t,D,\kappa,\epsilon,M_0)\gt0$ such that

Equation (2.9)

as $N\rightarrow \infty$.

It is clear that we can replace $\|n-n_0\|_{L^\infty(D)}$ by $\|n-n_0\|_{L^2(D)}$ in (2.9). Unlike the polynomial rate proved in [GN20, theorem 5], we obtain a logarithmic contraction rate in (2.9), which is due to a log-type stability estimate of $G^{-1}$. To obtain an estimator of the unknown coefficient n, in view of the link function Φ, it is often convenient to first derive an estimator of F. The posterior mean $\bar F_N := \mathbb{E}^{\Pi}[F|Y^{(N)},X^{(N)}]$ of $\Pi_N(\cdot|Y^{(N)},X^{(N)})$, which can be approximated numerically by an MCMC algorithm, is the most natural choice of estimator. In light of theorem 2.5, we can also prove a rate for the convergence of $\bar F_N$ to F0.
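In practice $\bar F_N$ is approximated by averaging MCMC draws (for instance from a pCN chain), and the refractive index is then estimated by the plug-in $\hat n = \Phi\circ\bar F_N$. A minimal sketch, assuming the draws are stored as point evaluations of F on a spatial grid and using an illustrative link function with range $(0,M_0)$ and $\Phi(0)=1$ (assumptions made only for the example):

```python
import numpy as np

M0 = 2.0  # illustrative upper bound of the link function's range

def Phi(z):
    """Illustrative link function with range (0, M0) and Phi(0) = 1."""
    return M0 * np.exp(z) / (np.exp(z) + M0 - 1.0)

# samples[k, j] stands for the k-th MCMC draw of F evaluated at the j-th grid
# point (after burn-in); a toy chain is used here to keep the snippet self-contained
rng = np.random.default_rng(2)
samples = 0.1 * rng.standard_normal((5000, 64))

F_bar = samples.mean(axis=0)   # pointwise posterior mean, approximating \bar F_N
n_hat = Phi(F_bar)             # plug-in estimator of the refractive index on the grid
```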

Theorem 2.6. Assume that the hypotheses of theorem 2.5 hold. Then there exists a constant $\tilde{C} = \tilde{C}(\sigma, n_0,K,s,t,D,\kappa,\epsilon,M_0)\gt0$ such that

Equation (2.10)

Corollary 2.7. Assume that the hypotheses of theorem 2.5 hold. Then there exists a sufficiently large $\tilde{C}^{^{\prime}} = \tilde{C}^{^{\prime}}(\sigma, n_0,K,s,t,D,\kappa,\epsilon,M_0)\gt0$ such that

Equation (2.11)

The logarithmic contraction rates obtained in theorems 2.5 and 2.6 and corollary 2.7 are inherited from the log-type stability estimate of the inverse scattering problem. Nonetheless, in the next theorem, we show that this contraction rate for the estimator $\hat n := \Phi\circ\bar F_N$ is optimal in the statistical minimax sense, at least up to the exponent of $\ln N$. We first define a parameter space. Let $s\gt3/2$ be an integer, β > 0, and define

Theorem 2.8. For integer $s\gt3/2$, there exists $\beta = \beta(s)\gt0$ such that for any $\delta\gt\frac{5s}{3}$ and $\varepsilon\in (0,1)$, we have

Equation (2.12)

for all N large enough, where the infimum is taken over all measurable functions $\tilde n = \tilde n(Y,X)$ of the data $(Y,X)\sim{\mathbb{P}}_n^N$.

2.3.2. High-dimensional Gaussian sieve priors.

From a computational perspective, it is useful to consider sieve priors that are supported on finite-dimensional approximations of the function space supporting the prior. Here we use a truncated Karhunen–Loève type expansion in terms of Daubechies wavelets, as considered in [GN20, appendix B] and [GN21, chapter 4]. Let $\left\{\Psi_{\ell r} : \ell\geqslant -1, r\in\mathbb{Z}^3 \right\}$ be the (3-dimensional) compactly supported Daubechies wavelets, which form an orthonormal basis of $L^{2}(\mathbb{R}^{3})$. Let $\mathcal{K}$ be a compact subset of D and let $R_{\ell} = \left\{r \in \mathbb{Z}^{3} : \mathrm{supp}\,(\Psi_{\ell r}) \cap \mathcal{K} \neq \emptyset \right\}$. Let $\mathcal{K}^{\prime}$ be another compact subset of D such that $\mathcal{K}^{\prime} \subsetneq \mathcal{K}$, and let $\chi\in C_c^\infty({D})$ be a cut-off function with χ = 1 on ${\mathcal K}^{\prime}$. Let $s\gt1+3/2$ and consider the prior

Equation (2.13)

where $J\in\mathbb{N}$ is the truncation level. In fact $\Pi^{^{\prime}}_J$ defines a centered Gaussian prior that is supported on the finite-dimensional space

with RKHS norm satisfying [GN20, (B2)]. As above, we consider the 're-scaled' prior $\Pi_N$ defined in (2.5) with $F^{^{\prime}}\sim\Pi^{^{\prime}}_J$.
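Sampling from such a sieve prior amounts to drawing finitely many independent Gaussian coefficients. The sketch below only shows the generic structure of a truncated expansion: the level weights $2^{-\ell(s+3/2)}$ are an assumed choice producing draws of smoothness roughly s (the exact scaling in (2.13) may differ), and the evaluation against the Daubechies wavelets and the multiplication by the cut-off χ are omitted.

```python
import numpy as np

def draw_sieve_coefficients(J, s, counts_per_level, rng):
    """Draw the Gaussian coefficients of a truncated wavelet-series prior.

    J                : truncation level
    s                : smoothness parameter
    counts_per_level : callable l -> |R_l|, the number of active wavelet
                       indices at resolution level l (assumed known)
    Returns a dict {level: array of weighted N(0,1) coefficients}.
    """
    coeffs = {}
    for level in range(-1, J + 1):
        n_r = counts_per_level(level)
        weight = 2.0 ** (-max(level, 0) * (s + 1.5))   # assumed level weights
        coeffs[level] = weight * rng.standard_normal(n_r)
    return coeffs

# toy usage: roughly 8^l wavelets per level meet a fixed compact set in 3D
rng = np.random.default_rng(3)
coeffs = draw_sieve_coefficients(J=4, s=3, counts_per_level=lambda l: 8 ** max(l, 0), rng=rng)
```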

In analogy with theorem 2.4, we derive a contraction rate for the statistical model (2.2b) with the prior $\Pi_N$ defined above. As in theorem 2.4, we obtain the same contraction rate in the $L^2$-prediction risk of the regression function.

Theorem 2.9. Let $t \geqslant 1$ and $s\gt t+3/2$ be integers, and let $\Pi_N\equiv \Pi_{N}[s]$ be the 're-scaled' prior defined in (2.5) with $F^{\prime}\sim\Pi^{\prime}_{J_{N}} \equiv \Pi^{\prime}_{J_{N}}[s]$, where $2^{J_N}\simeq N^{1/(2s+5)}$. Denote by $\Pi_N(\cdot|Y^{(N)},X^{(N)})$ the posterior distribution arising from the noisy discrete measurements $(Y^{(N)},X^{(N)})$ of (2.2b). Let $F_0\in H^{s}_{\mathcal K}(D)$ and $\delta_N = N^{-(s+1)/(2s+5)}$. Then for any K > 0, there exists a large L > 0, depending on $\sigma, F_0, K, s, D, S_1,S_2$, such that

Equation (2.14)

and for sufficiently large $L^{^{\prime}}\gt0$, depending on $\sigma, K, s, t$, we have

Equation (2.15)

The proof of theorem 2.9 requires only minor modifications of the proof of theorem 2.4, and all necessary modifications are listed in [GN20, section 3.2]. Therefore we omit the details. Having established theorem 2.9, similarly to [GN20, proposition 7], theorems 2.5 and 2.6 can be directly extended to the case of Gaussian sieve priors.

Remark 2.10. As in [GN20], it is likely that the results for Gaussian sieve priors with a deterministic truncation level can be extended to randomly truncated ones, where the truncation level J itself is an appropriate random variable. However, due to the log-type stability estimate of the inverse scattering problem, such an extension is highly nontrivial. We will discuss the Bayes method with randomly truncated sieve priors for the inverse scattering problem in another paper.

3. Proofs of theorems

The main theme of this section is to prove theorems 2.5, 2.6 and 2.8.

Proof of theorem 2.5. For each M > 0 satisfying $\lVert 1-n \rVert_{H^{t}(D)} \vee \lVert 1-n_{0} \rVert_{H^{t}(D)} \leqslant M$, one has the stability estimate of $G^{-1}$ from theorem A.1:

Equation (3.1)

where $\alpha = \frac{2t-3}{2t+3}-\epsilon\gt0$ and $C = C(D,t,\kappa,M,\epsilon)$.

If $\lVert n \rVert_{C^{t}(D)} \leqslant M^{\prime}$ for some $M^{\prime}\geqslant M_0$, then, since t is an integer, $\|1-n\|_{H^t(D)}\leqslant CM^{\prime}$ with $C = C(t,D)$. We now set $M = (CM^{\prime})\vee\|1-n_0\|_{H^t(D)}$. In view of (2.8a), (2.8b), and (3.1), for any K > 0, there exist large constants $L, M^{\prime}$ and $L^{\prime}$ ($L^{\prime}$ is determined by L and $M^{\prime}$) such that

which concludes the proof. □
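For orientation, the way the logarithmic rate emerges can be summarized by a heuristic computation (not a substitute for the argument above): on the bulk of the posterior, (2.8a) gives $\|G(n)-G(n_0)\|_{L^2(\mathcal S^2\times\mathcal S^2)}\lesssim \delta_N$, so a stability estimate of the form (3.1) yields, schematically,
$$\|n-n_0\|_{L^\infty(D)} \;\lesssim\; \bigl(\ln\|G(n)-G(n_0)\|^{-1}\bigr)^{-\alpha} \;\lesssim\; \bigl(\ln \delta_N^{-1}\bigr)^{-\alpha} \;=\; \Bigl(\tfrac{s+1}{2s+5}\,\ln N\Bigr)^{-\alpha} \;\simeq\; (\ln N)^{-\frac{2t-3}{2t+3}+\epsilon},$$
which is the rate appearing in (2.9).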

Having proved theorem 2.5, we then establish theorem 2.6 using the contraction rate in theorem 2.5 and the link function Φ.

Proof of theorem 2.6. We prove the theorem by modifying some ideas in [GN20, theorem 6]. By Jensen's inequality, it suffices to prove that there exists a large $\tilde{C}\gt0$ such that

For a large M > 0 to be chosen later, we write

Equation (3.2)

Part I: Estimating $\mathbb{E}^\Pi[\|F-F_0\|_{L^\infty}{\unicode{x1D7D9}}_{\|F\|_{C^t}\gt M}|Y^{(N)},X^{(N)}]$. By the Cauchy–Schwarz inequality, it is easy to see that

Equation (3.3)

Let B > 0 be a constant to be determined later. By using (2.7b ), one can choose a sufficiently large $M = M(\sigma,B,s,t)\gt0$ such that

We now define ${\mathcal B}_N$ by

Equation (3.4)

By using [GN20, lemmas 16 and 23], we have

Let $\nu(\cdot) = \Pi_N\left(\cdot\cap{\mathcal B}_N\right)/\Pi_N\left({\mathcal B}_N\right)$ and set the event

Equation (3.5)

By [GN21, lemma 7.3.2], we can show that

Equation (3.6)

By using the properties (3.6) of $\mathcal{C}_{N}$, we obtain

which is bounded from above, using Markov's inequality and Fubini's theorem, by

Equation (3.7)

By using Fernique's theorem (see e.g. [GN21, exercises 2.1.1, 2.1.2 and 2.1.5]) one has $\mathbb{E}^{\Pi_{N}}\lVert F \rVert_{L^{\infty}}^{2} \lt \infty$. Taking $B \gt A+2$, from (3.7) we conclude

Equation (3.8)

Part II: Estimating $\mathbb{E}^\Pi[\|F-F_0\|_{L^\infty}{\unicode{x1D7D9}}_{\|F\|_{C^t} \leqslant M}|Y^{(N)},X^{(N)}]$. The above discussion remains valid if we replace M by a (possibly larger) $M = M(\sigma,B,s,t,F_{0})\gt0$ with $\lVert F_{0} \rVert_{L^{\infty}} \leqslant M$. Since $n = \Phi\circ F$ and $n_0 = \Phi\circ F_0$, by (i), the mean value theorem and the inverse function theorem, there exists η lying between $n_{0}(x)$ and n(x) such that

for all $x\in D$. Since $F,F_{0} \in [-M,M]$, and hence $n,n_{0} \in \left[\Phi(-M),\Phi(M)\right]$, by (i) we reach

Therefore we see that

and

Equation (3.9)

Let C > 0 be a constant to be determined later. From (3.9), we see that

We can modify the arguments in Part I above by replacing the event $\left\{F : \lVert F \rVert_{C^{t}}\gt M \right\}$ (resp. (2.7b)) by the event $\left\{n : \lVert n-n_{0} \rVert_{L^{\infty}}\gt C(\ln N)^{-\frac{2t-3}{2t+3}+\epsilon} \right\}$ (resp. (2.9)), with $C = C(\sigma, n_0,K,s,t,D,\kappa,\epsilon,M_0)\gt0$ given in theorem 2.5, to show that

Equation (3.10)

Finally, putting together (3.2), (3.8), and (3.10) yields theorem 2.6 with $\tilde{C} = \max\{C,1\}$. □

Next, we prove the optimality of the contraction rate in corollary 2.7 in the minimax sense.

Proof of theorem 2.8. We apply the method of the proof of the lower bound in [AN19, theorem 2] to our case. The idea is to find $n_0, n_1\in\hat{\mathcal{F}}^s_{\beta}$ (both are allowed to depend on N) such that, for some sufficiently small ζ,

  • (a)  
    $\|n_0-n_1\|_{L^{\infty}} \geqslant \theta_{N,\delta}:= (\ln N)^{-\delta}$;
  • (b)  
    $\textrm{KL}(p_{n_0}^{\otimes N},p_{n_1}^{\otimes N})\unicode{x2A7D}\zeta$,

where $p_{n}^{\otimes N}$ is the Radon–Nikodym derivative of the joint law $\mathbb{P}_{n}^{N}$ and the Kullback–Leibler divergence $\textrm{KL}(p_{n_0}^{\otimes N},p_{n_1}^{\otimes N})$ is defined by

Equation (3.11)

By independence, we note that

and [GN20, lemma 23] implies

Equation (3.12)

then using the standard arguments as in [GN21, section 6.3.1] (see also [Tsy09, chapter 2]), we conclude the theorem.
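The Kullback–Leibler bound used here is the standard one for the Gaussian regression model; a sketch of the computation (consistent with [GN20, lemma 23] and with the normalization $\int_{\mathcal S^2\times\mathcal S^2}\mathrm d\xi = 1$): from (2.3), for a single observation,
$$\mathrm{KL}\bigl(\mathbb{P}^i_{n_0},\mathbb{P}^i_{n_1}\bigr) = \mathbb{E}^i_{n_0}\log\frac{p_{n_0}}{p_{n_1}}(Y_i,X_i) = \frac{1}{2\sigma^2}\int_{\mathcal S^2\times\mathcal S^2}\bigl|G(n_0)(\xi)-G(n_1)(\xi)\bigr|^2\,\mathrm d\xi,$$
and hence, by independence, $\mathrm{KL}(p_{n_0}^{\otimes N},p_{n_1}^{\otimes N}) = \frac{N}{2\sigma^2}\,\|G(n_0)-G(n_1)\|_{L^2(\mathcal S^2\times\mathcal S^2)}^2$.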

For the sake of completeness, here we present the details. From condition (a), we see that $\psi = {\unicode{x1D7D9}}_{\{\|\tilde n-n_1\|_{L^{\infty}}\lt\|\tilde n-n_0\|_{L^{\infty}}\}}$ yields a test of

Equation (3.13)

It follows from a general reduction principle that

where the second infimum is over all tests ψ of (3.13). Similar to the proof of [GN21, theorem 6.3.2], we introduce the event

Note that

Let $p_1 = \mathbb{P}_{n_1}^N(\psi = 1)$, then

It is clear that the infimum above is attained when $\frac 12(p-\mathbb{P}_{n_1}^N(\Omega^c)) = 1-p$ and has the value $\frac 13\mathbb{P}_{n_1}^N(\Omega)$. Hence,

Equation (3.14)

Next, let us estimate

Using the second Pinsker inequality [GN21, proposition 6.1.7b] and condition (b), we have

We now choose ζ sufficiently small such that the last term above is bounded below by $1-\varepsilon$. Then the estimate (2.12) follows in view of (3.14).

The remaining task is to find $n_0, n_1$ satisfying conditions (a) and (b). For $\delta\gt\frac{5s}{3}$, setting $\theta = \theta_{N,\delta} = (\ln N)^{-\delta}$ in theorem B.1, there exist $n_0, n_1 \in\hat{\mathcal F}^s_{\beta}$ satisfying $\|n_0-n_1\|_{L^\infty}\gt\theta_{N,\delta}$ and

To verify condition (b), we use (3.12) to conclude that

since $\frac{3\delta}{5s}\gt1$. Therefore, we can make ζ as small as we wish by taking N sufficiently large. □

Data availability statement

No new data were created or analysed in this study.

Appendix A: Inverse scattering problem

In what follows, assume that

Equation (A.1)

We first discuss the existence and uniqueness of the scattered field $u_n^\textrm{sca}$. It is known that $w: = u_n^\textrm{sca}$ satisfies the following boundary value problem [CCH23, (1.51)–(1.52)]:

Equation (A.2)

where $B_R$ is a ball of radius R such that $\bar D\subset B_R$. Here $S_R: H^{1/2}(\partial B_R)\to H^{-1/2}(\partial B_R)$ is the Dirichlet-to-Neumann map, defined for $g\in H^{1/2}(\partial B_R)$ by $S_Rg = (\partial u_g/\partial r)|_{\partial B_R}$, where $u_g$ is the solution of the Helmholtz equation satisfying the Sommerfeld radiation condition in $\mathbb{R}^3\setminus \overline{B_R}$ and the Dirichlet condition $u_g = g$ on $\partial B_R$. It has been shown that

Equation (A.3)

where $\langle\cdot,\cdot\rangle$ denotes the duality pairing between $H^{-1/2}(\partial B_R)$ and $H^{1/2}(\partial B_R)$, see, for example, [CCH23, definition 1.36 and (1.50)].
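For concreteness, $S_R$ admits an explicit representation in spherical harmonics (recalled here under the usual conventions; see [CCH23] or [CK19] for precise statements). Writing $g = \sum_{m\geqslant 0}\sum_{|j|\leqslant m} g_m^j\, Y_m^j$ on $\partial B_R$, the radiating solution is $u_g(x) = \sum_{m,j} g_m^j\, \frac{h_m^{(1)}(\kappa |x|)}{h_m^{(1)}(\kappa R)}\, Y_m^j(x/|x|)$, and hence
$$S_R\, g = \kappa \sum_{m\geqslant 0}\sum_{|j|\leqslant m} \frac{{h_m^{(1)}}^{\prime}(\kappa R)}{h_m^{(1)}(\kappa R)}\, g_m^j\, Y_m^j ,$$
where $h_m^{(1)}$ denotes the spherical Hankel function of the first kind of order m.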

To proceed further, let us replace the right-hand side of the first equation in (A.2) by a general source term f with $\mathrm{supp}(f)\,\subset B_R$, i.e.

Equation (A.4)

In view of integration by parts, (A.4) is equivalent to the following variational formulation: find $w\in H^1(B_R)$ such that for all $v\in H^1(B_R)$,

where

and

We can see that

In other words, $a_1(\cdot,\cdot)$ is strictly coercive. By the Riesz representation theorem and the Lax–Milgram theorem, there exists an invertible operator ${\mathcal A}:H^1(B_R)\to (H^{1}(B_R))^\ast$, where $(H^1(B_R))^\ast$ is the dual space of $H^1(B_R)$, such that

Similarly, define the bounded linear operator ${\mathcal B}:H^1(B_R)\to (H^1(B_R))^\ast$ by $a_2(w,v) = ({\mathcal B}w,v)_{H^1(B_R)}$. It is not difficult to see that ${\mathcal B}$ is compact. Consequently, ${\mathcal A}+{\mathcal B}$ is a Fredholm operator. By the Fredholm alternative, ${\mathcal A}+{\mathcal B}:H^1(B_R)\to (H^1(B_R))^\ast$ is boundedly invertible provided the kernel of ${\mathcal A}+{\mathcal B}$ is trivial, which follows from the uniqueness of the scattered solution (by a combination of the Rellich lemma and the unique continuation property). Furthermore, we have the following estimate:

Equation (A.5)

where $C = C(D,\kappa,M_0)$. Let $f = \kappa^2(n-1)u^\textrm{inc}$ and $w = u_n^\textrm{sca}$, then (A.5) implies

Equation (A.6)

uniformly in $\theta\in\mathcal{S}^2$. We now choose $R^{^{\prime}}\lt R$ such that $\overline{D} \subset B_{R^{^{\prime}}}$. By the interior estimate [GT01, theorem 8.8], we further have

which, by the Sobolev imbedding theorem, implies

Equation (A.7)

The scattering amplitude $u_n^\infty(\theta^{^{\prime}},\theta)$ can be expressed explicitly by

Equation (A.8)

where $u(y,\theta) = u^\textrm{inc}(y,\theta)+u_n^\textrm{sca}(y,\theta)$ is the total field with $u^\textrm{inc}(y,\theta) = e^{\mathbf{i}\kappa y\cdot\theta}$, see [CCH23, (1.22)] or [CK19, (8.28)] or [Ser17, p 232]. From (A.7) and (A.8), we have

Equation (A.9)

with $S = S(D,\kappa,M_0) = C(1+M_0)$. Since

applying the interior estimate and the Sobolev imbedding theorem again, we have that

Equation (A.10)

with $C = C(D,\kappa,M_0)$.

Next, assume that $n_1, n_2$ satisfy (A.1). For each open set Ω in Euclidean space, we observe that

Equation (A.11)

Let $w = u^\textrm{sca}_{n_2}-u^\textrm{sca}_{n_1}$ and $f = \kappa^2(n_2-n_1)u^\textrm{sca}_{n_1}+\kappa^2(n_2-n_1)u^\textrm{inc}$; then combining (A.5), (A.7), (A.10) and (A.11) yields

Equation (A.12)

uniformly in θ, with $C = C(D,\kappa,M_0)$. It then follows from (A.8) that

Equation (A.13)

where the inequality above is due to the integral form of Minkowski's inequality. Plugging (A.6), (A.10) and (A.12) into (A.13) gives

Equation (A.14)

where $C = C(D,\kappa,M_0)$.

Next, we recall the following stability estimate for the determination of the potential from the scattering amplitude.

Theorem A.1 ([HH01, theorem 1.2]). Let $t\gt3/2$, M > 0, and $0\lt\epsilon\lt\frac{2t-3}{2t+3}$ be given constants. Assume that $1-n_j\in H^t(\mathbb{R}^3)$ satisfy $\|1-n_j\|_{H^t(\mathbb{R}^3)}\leqslant M$ and $\mathrm{supp}(1-n_j)\subset D$, $j = 1,2$. Then

Equation (A.15)

where $C = C(D,t,\kappa,M,\epsilon)$ and

Remark A.2. Here the constant M may be different from the constant $M_0$ given above.

Appendix B: Optimality of the stability estimate

The purpose of this section is to show that the logarithmic estimate obtained in theorem A.1 is optimal, by deriving an instability estimate. A similar instability estimate was already proved in [Isa13]. To make the paper self-contained, we present our own proof here and slightly refine the estimate obtained in [Isa13]. Throughout this section, we denote $q(x) = n(x)-1$.

Theorem B.1. Consider the inverse scattering problem (1.1a)–(1.1c) with frequency κ > 0. Let the integer s > 0 be a given regularity parameter. Then there exist constants $\beta = \beta(s,\kappa) \gt 0$ and $\vartheta_{0} = \vartheta_{0}(s,\kappa)\gt0$ such that for each $0 \lt \vartheta \lt \vartheta_{0}$ there exist non-negative $q_{1},q_{2} \in C^{\infty}(\mathbb{R}^{3})$ with $\mathrm{supp}\,(q_{j}) \subset B_{\frac{1}{2}}$ satisfying the a priori bound $\lVert q_{j} \rVert_{C^{s}(\mathbb{R}^{3})} \leqslant \beta$ and

Remark B.2. From the properties of the Hilbert-Schmidt norm [Con90, exercise IX.2.19(h)], one has

Equation (B.1)

Recall $G(1+q)$ is the far-field operator given in (1.2).

Our main strategy is to modify the ideas in [DCR03] (see also [KRS21] for more details about the mechanism). Given any ϑ > 0, $s \unicode{x2A7E} 0$ and β > 0, we consider the following set:

where $B_r$ denotes the ball of radius r centered at the origin. The following lemma verifies assumption (a) of [DCR03, theorem 2.2]; it can be proved as in [Man01, lemma 2] (we omit the proof), see also [Isa13, KUW21, KW22, ZZ19]. We refer to [KT61] for a version in a more abstract form.

Lemma B.3. Fix $d \in \mathbb{N}$ and $s \geqslant 0$. There exists a constant $\mu = \mu(d,s) \gt 0$ (see footnote 6) such that the following statement holds for all β > 0 and for all $\vartheta \in (0,\mu\beta)$: there exists a ϑ-discrete subset $\mathcal{Z}_{\vartheta}$ (see footnote 7) of $\left( \mathcal{N}_{s \beta}^{\vartheta} , \lVert \cdot \rVert_{L^{\infty}(\mathbb{R}^{d})} \right)$ such that

Equation (B.2)

In addition, all elements in $\mathcal{Z}_{\vartheta}$ are in $C^{\infty}(\mathbb{R}^{d})$.

As in [DCR03], the proof of theorem B.1 is quite delicate and is not an obvious consequence of the abstract theorem [DCR03, theorem 2.2]. From the asymptotic expansion (1.1c), it is easy to see that

A crucial point is to bound $\lvert x \rvert \lvert u_{1+q}^\textrm{sca}(x,\theta) \rvert$ for all $\lvert x \rvert \geqslant 2$ by some constant independent of θ and q, as in [DCR03, (4.21)]. From now on, for simplicity, we restrict ourselves to the case d = 3. We now prove the following lemma.

Lemma B.4. Let κ > 0 and let $\lVert q \rVert_{L^{\infty}(\mathbb{R}^{3})} \unicode{x2A7D} 1$ with $\mathrm{supp}\,(q) \subset B_{1/2}$. Then there exists a constant $C = C(\kappa)\gt0$ such that the following uniform decay estimate holds:

Proof. This lemma is an easy consequence of (A.6) and [Ron03, lemma 3.2] (with R = 1). □

As in [HH01], we introduce the following index set

Let $\{Y_{m}^{j}: m \in \mathbb{Z}_{\geqslant 0},\;\lvert j \rvert \leqslant m\}$ be the set of spherical harmonics. For each $F \in \textrm{HS}(L^{2}(\mathcal{S}^{2}),L^{2}(\mathcal{S}^{2}))$, we denote $F_{mn}^{jk} := \left( F Y_{m}^{j} , Y_{n}^{k} \right)_{L^{2}(\mathcal{S}^{2})}$. Accordingly, we write

Lemma B.5. Let κ > 0 and let $\lVert q \rVert_{L^{\infty}(\mathbb{R}^{3})} \unicode{x2A7D} 1$ with $\mathrm{supp}\,(q) \subset B_{1/2}$. Then there exist constants $c_\textrm{abs} = c_\textrm{abs}(\kappa)\gt0$ and $C_\textrm{abs} = C_\textrm{abs}(\kappa)\gt e$ such that

Proof. By [CK19, theorem 2.15], we can write

where $h_{m}^{(1)}$ is the spherical Hankel function of the first kind of order m. In fact,

see the proof of [CK19, theorem 2.16]. For each $\theta \in \mathcal{S}^{2}$, also from [CK19, theorem 2.16], we have the following expansion

which converges uniformly in $\theta^{^{\prime}} \in \mathcal{S}^{2}$, and hence

Equation (B.3)

for all r > 1. We now combine (B.3) with lemma B.4 to obtain

uniformly in $\theta^{^{\prime}}\in\mathcal{S}^2$. We now choose $\kappa r = 2$ and reach

Equation (B.4)

From [KW22, (48)], we see that

Equation (B.5)

In view of a quantitative form of Stirling's formula [Rob55], we obtain

Equation (B.6)

From (B.5), it is clear that

Combining (B.4) and (B.6) implies

Finally, by the first reciprocity relation [CK19, theorem 8.8], we have

and the lemma is proved. □

We also need a technical lemma.

Lemma B.6. Consider the normed space

where

Then

Proof. Since

Equation (B.7)

we can estimate

which implies the lemma. □

We now construct a δ-net by following the procedure in [DCR03, lemma 2.3].

Lemma B.7. Let $\mathcal{N}^{1} := \left\{q\geqslant 0 : \mathrm{supp}\,(q) \subset B_{1/2}, \quad \lVert q \rVert_{L^{\infty}(\mathbb{R}^{3})} \leqslant 1 \right\}$. Then there exists a constant $C_\textrm{abs}^{\prime} = C_\textrm{abs}^{\prime}(\kappa) \gt e$ such that for each $0 \lt \delta \lt 1/e$ we can find a δ-net (see footnote 8) $\mathcal{Y}_{\delta}$ for $\left( G(1+\mathcal{N}^{1}) , \lVert \cdot \rVert_{\textrm{HS}(L^{2}(\mathcal{S}^{2}),L^{2}(\mathcal{S}^{2}))} \right)$ such that

Equation (B.8)

Proof. Fix $\delta \in (0 , 1/e)$. Let $\tilde{\ell}$ be the smallest positive integer such that

Equation (B.9)

where $C_\textrm{abs} \gt e$ and $c_\textrm{abs} \gt 0$ are the absolute constants appearing in lemma B.5. One sees that there exists an absolute constant C > 0 such that

Equation (B.10)

We define a finite subset of the complex plane $\mathbb{C}$ by

It is easy to see that

Equation (B.11)

We now define the set

Let $\ell_{*}$ be the number of $(m,j,n,k)\in\mathcal{M}$ such that $\max\{m,n\} \leqslant \tilde{\ell}$; then we have

Equation (B.12)

Plugging (B.10) and (B.11) into (B.12) yields

and, furthermore, from the trivial estimate $\log(1+t)\leqslant t$ for all $t \geqslant 0$, we see that

which gives (B.8).

The remaining task is to verify that the set $\mathcal{Y}_{\delta}$ constructed above is a δ-net for $\left( G(1+\mathcal{N}^{1}) , \lVert \cdot \rVert_{\textrm{HS}(L^{2}(\mathcal{S}^{2}),L^{2}(\mathcal{S}^{2}))} \right)$. Fix $q \in \mathcal{N}^{1}$ and consider the far-field pattern $G(1+q)$. For each $(m,j,n,k)\in\mathcal{M}$ with $\max\{m,n\}\leqslant\tilde{\ell}$, we choose $F_{mn}^{jk} \in \Psi_{\delta}$ to be the element closest to $G(1+q)_{mn}^{jk}$; for those $(m,j,n,k)\in\mathcal{M}$ with $\max\{m,n\} \gt \tilde{\ell}$, we simply choose $F_{mn}^{jk} = 0$. We define the operator $F \in \mathcal{Y}_{\delta}$ by

By lemma B.5, we see that

Therefore, if $\max\{m,n\} \leqslant \tilde{\ell}$, we have that

Otherwise, when $t := \max\{m,n\} \gt \tilde{\ell}$, from lemma B.5 and (B.9), we estimate

Consequently, by lemma B.6, we finally obtain

 □

We are now ready to prove the main instability estimate by combining lemmas B.3 and B.7.

Proof of theorem B.1. Let $s \gt \frac{3}{2}$ and let $C_\textrm{abs}^{\prime} \gt e$ be the constant obtained in lemma B.7. We choose $\beta = \beta(s) \gt 0$ such that

Equation (B.13)

where $\mu = \mu(s)$ is the constant given in lemma B.3 (with d = 3). Let us pick ϑ satisfying

Assume that the set $\mathcal{Z}_{\vartheta}$ is given in lemma B.3 (also take d = 3). Next we choose

and construct the $\mathcal{Y}_{\delta}$ described in lemma B.7. Since $\mathcal{Z}_{\vartheta} \subset \mathcal{N}_{s \beta}^{\vartheta} \subset \mathcal{N}_{s \beta}^{1}$, it is clear that $\mathcal{Y}_{\delta}$ is also a δ-net for $\left( G(1+\mathcal{Z}_{\vartheta}) , \lVert \cdot \rVert_{\textrm{HS}(L^{2}(\mathcal{S}^{2}),L^{2}(\mathcal{S}^{2}))} \right)$. Moreover, we can check that

Combining (B.2), (B.8) and (B.13) implies

This enables us to choose two distinct $q_{1},q_{2} \in \mathcal{Z}_{\vartheta}$ (by the definition of $\mathcal{Z}_{\vartheta}$, it holds that $\lVert q_{1}-q_{2} \rVert_{L^{\infty}(\mathbb{R}^{3})} \geqslant \vartheta$) and an $F \in \mathcal{Y}_{\delta}$ such that

which proves the theorem by using (B.1). □

Footnotes

  • 6. In fact, one can choose $\mu = d^{-\frac{s}{2}}\lVert \psi \rVert_{C^{s}(\mathbb{R}^{d})}^{-1}$ for some $\psi\in C_{c}^{\infty}((-1/2,1/2)^{d})$ with $\lVert \psi \rVert_{L^{\infty}(\mathbb{R}^{d})} = 1$.

  • 7. This means that $\lVert q_{1}-q_{2} \rVert_{L^{\infty}(\mathbb{R}^{d})} \geqslant \vartheta$ for all distinct $q_{1},q_{2} \in \mathcal{Z}_{\vartheta} \subset \mathcal{N}_{s \beta}^{\vartheta}$.

  • 8. This means that for each operator $F_{0} \in G(1+\mathcal{N}^{1})$ there exists $F \in \mathcal{Y}_{\delta}$ which approximates $F_{0}$ in the sense that $\lVert F_{0}-F \rVert_{\textrm{HS}(L^{2}(\mathcal{S}^{2}),L^{2}(\mathcal{S}^{2}))} \leqslant \delta$.
