Restoring the discontinuous heat equation source using sparse boundary data and dynamic sensors

Guang Lin; Na Ou; Zecheng Zhang; Zhidong Zhang

doi:10.1088/1361-6420/ad2904

1. Introduction

1.1. Mathematical model

We first state the inverse problem of interest. The heat equation is given as follows:

$\begin{align} \begin{cases} \begin{aligned} \left(\partial_t-\Delta\right)u\left(x,t\right)& = \chi_{{}_{D}}, &&\left(x,t\right)\in\Omega\times\left(0,T\right),\\ u\left(x,t\right)& = 0,&&\left(x,t\right)\in\partial\Omega\times\left(0,T\right)\cup \Omega\times\left\{0\right\}. \end{aligned} \end{cases} \end{align} \tag{ 1.1 }$

Here $\Omega\subset \mathbb{R}^2$ is the two-dimensional unit disc. In equation (1.1), the discontinuous source $\chi_{{}_{D}}$ is the characteristic function of D, and the support $D\subset \Omega$ is unknown. More precisely, $\chi_{{}_{D}}(x)$ equals one if $x\in D$ and zero otherwise. We assume that $\partial D$ is sufficiently smooth. In this work, we use the boundary flux data observed by dynamic moving sensors to recover the unknown support D. The measurements can be written as

$\begin{align*} \frac{\partial u}{\partial \overset{\,}{\overrightarrow{\textbf{n}}} }\left(z\left(t\right),t\right),\ z\left(t\right)\in \Gamma\left(t\right)\subset\partial\Omega,\ t\in\left(0,T\right). \end{align*}$

Here z(t) is the observation point, located on the boundary $\partial\Omega$ , and $\Gamma(t)$ is the observed area. We employ the notations z(t) and $\Gamma(t)$ to emphasize that the sensor locations are time-dependent. Our analysis centers on sparse boundary data, where Γ represents a very small subset of the boundary. The limited availability of observations on this sparse boundary presents substantial challenges in recovering χ_D. To address these challenges and verify the proposed algorithm 1, we rely solely on flux information obtained from two sensors, which makes the problem harder still.


Algorithm 1. Pseudo-algorithm for dynamical locations determination.
1. Initialization: locations p₁ and p₂ of the two observation points (sensors), observation time stamps $t^{\prime} = [T_1,{\ldots},T_{n}]^{\intercal}$ , current observation time $\tilde{t} = T_1$ , initial PDE simulation time t₀ = 0, the total sampling iteration limit K, and the number of samples N;
2. for k = 1 to K − 1 do
3. Observe the flux at p₁ and p₂;
4. Simulate the PDE solution until $\tilde{t}$ from t₀ for the computation of $\Phi(\cdot)$ ;
5. Obtain posterior samples $\{\boldsymbol \xi_i\}_{i = 1}^{N}$ (the target) and $\{U_i\}_{i = 1}^{N}$ (solution at $\tilde{t}$ ) according to the sampling schemes presented in algorithm 2;
6. Update the two sensors p₁ and p₂ based on posterior samples (please refer to sections 3.2.1 and 4.1.1 for details);
7. Reset initial time $t_0 = \tilde{t}$ ;
8. Reset the mean of $\{U_i\}_{i = 1}^{N}$ to be the initial value for the next simulation ;
9. Pick up and set $\tilde{t}$ to be the next observation and the end-of-simulation time $T_{k+1}$ from tʹ.
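The control flow of algorithm 1 can be sketched in a few lines of Python. This is only an illustrative skeleton, not the authors' implementation: the callables `observe`, `sample` and `relocate`, and the toy stand-ins used to exercise the loop, are hypothetical placeholders for the flux measurement, algorithm 2 and the sensor migration rule, respectively.

```python
import numpy as np

def dynamic_sensor_loop(obs_times, sensors, u0, observe, sample, relocate):
    """Outer loop of algorithm 1 (sketch). All callables are assumptions."""
    t0 = 0.0                       # initial PDE simulation time
    history = []
    for t_obs in obs_times:
        data = observe(sensors, t_obs)                  # step 3: flux at p1, p2
        xi_s, U_s = sample(data, sensors, t0, t_obs, u0)  # steps 4-5: posterior samples
        history.append((t_obs, sensors, xi_s))
        sensors = relocate(xi_s, U_s)                   # step 6: move the sensors
        t0 = t_obs                                      # step 7: restart time
        u0 = U_s.mean(axis=0)                           # step 8: mean solution as new initial value
    return history, sensors, u0

# Toy stand-ins (hypothetical) that only exercise the control flow.
windows = []

def toy_observe(sensors, t):
    return np.zeros(2)

def toy_sample(data, sensors, t0, t_end, u0):
    windows.append((t0, t_end))    # record the simulated time window
    return np.zeros((4, 2)), np.full((4, 3), t_end)

def toy_relocate(xi_samples, U_samples):
    return (1, 2)

history, final_sensors, final_u0 = dynamic_sensor_loop(
    [0.5, 1.0, 1.5], (0, 5), np.zeros(3), toy_observe, toy_sample, toy_relocate
)
```

With these stand-ins, each sampling call only covers the window between two consecutive observation times, which is the cost saving that steps 7 and 8 are meant to provide.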

1.2. Background and literature

In practical applications of inverse problems, sparse boundary data holds great importance. Let us take an example in atmospheric science [11]. The equation (1.1) can be used to describe the diffusion of pollutants [11], where the solution u represents pollutant concentration. In such scenarios, the region of the pollutant source, denoted as D, should be heavily contaminated. To safeguard the health of engineers, the utilization of boundary measurements becomes imperative.

Furthermore, in practical inverse problems, the project's cost cannot be overlooked. This includes expenses related to equipment, labor, computation, and more. This cost-conscious perspective has motivated numerous researchers to seek solutions to inverse problems using sparse data.

However, opting for sparse boundary data raises a pertinent question: how do we determine the sensors' locations? Although one can establish the well-posedness of the inverse partial differential equation (PDE) problem, the sensors' locations significantly influence the quality of the numerical simulation. Unfortunately, in the context of inverse problems, we often have to select observation areas randomly. In practice, it is entirely possible to initially choose suboptimal observation points, which can pose challenges.

To address this dilemma and ensure the quality of our approximations, we propose the concept of using moving observation points. Specifically, if the initial observation points are situated in less favorable locations, we can relocate the facilities to more suitable positions. Consequently, the data obtained from these improved locations will compensate for the adverse effects of the suboptimal points and enhance the accuracy of our approximations.

In line with this idea, the fundamental question we must address is how to determine whether a location is advantageous or not; that is, how to decide when and where to move the observation points.

The inverse source problem of the heat equation $(\partial_t-\Delta)u = F$ is a classical topic in the inverse problems literature and has produced a rich body of results; see [5, 14, 16–19, 25, 31, 32] and the references therein. Owing to the significant practical value of sparse data, more and more researchers have turned their attention to this field. Here we list several references on inverse problems with sparse data. In [14], the authors consider the inverse source problem of the heat equation on the two-dimensional unit disc. They recover the space-dependent source f(x) from the flux data at finitely many points of the boundary. The result in [14] is extended by [32], in which the variable separable source $f_1(x)f_2(t)$ can be uniquely determined by the sparse boundary data. Note that in [32], the spatial term $f_1(x)$ and temporal term $f_2(t)$ are both unknown. In [22, 25], the authors further recover the semi-discrete unknown source, which is investigated in this work, and the considered equation is the parabolic equation on a general domain. Also, for geometric inverse problems, the authors in [13, 20] recover the manifold from the image of a single specific function. We should note that for such geometric inverse problems, one usually takes a whole operator as the measurement.

1.3. Main result and outline

We summarize the contributions as follows.

1. We study inverse problems with sparse measurements and moving observation points.
2. We introduce a sampling-based approach and a dynamic sensor migration algorithm designed for source tracing. Moreover, we present two migration strategies and validate their effectiveness through two challenging numerical examples.

The rest of this article is organized as follows. In section 2, we provide an overview of the existing problems. We then introduce the Bayesian methodology and the Langevin-based algorithm in section 3, in which the uniqueness of this inverse problem is also discussed. In particular, we discuss the proposed sampling methods and the proposed dynamic sensor migration algorithm in sections 3.1 and 3.2. Finally, we verify the performance and introduce the sensor migration strategies in section 4.

2. Preliminaries

2.1. The inverse problem settings

This article is an extension of [31, 32], in which the authors prove the uniqueness theorem of the inverse source problem under sparse data. More precisely, under suitable conditions, the flux data generated from two points of the boundary can uniquely determine the unknown source. However, as mentioned in the introduction, solving this inverse problem numerically raises a natural question: how should the locations of the observation points be chosen? This question is of great importance in computational work because the accuracy of the numerical approximation is strongly influenced by the placement of the observation points. To substantiate this claim, we run a numerical experiment with the algorithm proposed in [32] and obtain the comparison shown in figure 1.

In figure 1, the dotted blue line represents the support of the exact source, while the dashed red line corresponds to the support of the approximated source. The black points located along the boundary represent the observation points. It is evident that the support of the exact source in both figures is the same; the only difference lies in the placement of the observation points. Consequently, the two approximations exhibit significant disparities.

Evidently, in the left figure, the observation points are positioned unfavorably, earning them the label of 'bad' points, while those in the right figure are considered 'good'. However, it is important to recognize that in practical applications of inverse problems, obtaining the exact solution for the unknown source is often infeasible. Therefore, we are left with the task of selecting observation points largely based on chance.

If, by chance, we initially select 'bad' points and do not alter their positions, it results in the production of poor numerical outcomes, as demonstrated in the left part of figure 1. This underscores the reason why we endeavor to leverage data from moving observation points to offset the approximations adversely affected by these 'bad' observation points.

Following [14, 32], the unknown support D of the source can be uniquely determined when the observed area Γ is fixed and only consists of two chosen points on the boundary. Hence, in this work, we will suppose that $\Gamma(t)$ consists of two points of $\partial \Omega$ for each $t\in(0,T)$ . With the idea of moving observation points, the measurements we used in this work will be represented as

$\begin{align} \frac{\partial u}{\partial{\overset{\,}{\overrightarrow{\textbf{n}}}}}\left(z_j\left(t\right),t\right),\ z_j\left(t\right)\in \partial\Omega,\ j = 1,2,\ t\in \left(0,T\right). \end{align} \tag{ 2.2 }$

The notation $z_j(t)$ is the path of each observation point, and the essential difficulty of this work is how to determine the path z_j . We will provide some rules for determining $z_j(t)$ and give the corresponding numerical experiments. These will be discussed in the next sections.

2.2. Bayesian framework

An effective approach for tackling the inverse problem is the Bayesian inference framework [6, 10, 23, 33, 35]. In this context, we denote the measured flux data by $\textbf{d} \in {\mathbb{R}}^{n_d}$ and the prior distribution of ξ by $p_0(\boldsymbol \xi)$ , where ξ collects the unknowns used to parameterize the shape of the source and the inverse quantities of interest (QoIs). By Bayes' formula, the posterior satisfies

$\begin{align*} p\left(\boldsymbol \xi|\textbf{d}\right) \propto p_0\left(\boldsymbol \xi\right) p\left(\textbf{d}|\boldsymbol \xi\right), \end{align*}$

where $p(\textbf{d}|\boldsymbol \xi)$ represents the likelihood function given the parameter ξ .

We denote the mapping from the space of parameter ξ to the observations as the forward model, labeled as g. Given that observation data inherently contain noise and that noise is also introduced during the forward evaluation, we make the assumption that given ξ , the data d follows a normal distribution with mean $g(\boldsymbol \xi)$ and variance σ². Specifically, the likelihood function takes the following form:

$\begin{align*} p\left(\textbf{d}|\boldsymbol \xi\right) = \left(2\pi\sigma^2\right)^{-\frac{n_d}{2}}\text{exp}\left(-\frac{\|\textbf{d}-g\left(\boldsymbol \xi\right)\|_2^2}{2\sigma^2}\right). \end{align*}$
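As an illustration, the logarithm of this Gaussian likelihood can be evaluated as follows. This is a minimal sketch: the function name `gaussian_log_likelihood` is ours, and `g_xi` stands for the forward-model output $g(\boldsymbol \xi)$, which is assumed to be computed elsewhere.

```python
import numpy as np

def gaussian_log_likelihood(d, g_xi, sigma):
    """log p(d | xi) for i.i.d. Gaussian noise with standard deviation sigma."""
    d = np.asarray(d, dtype=float)
    g_xi = np.asarray(g_xi, dtype=float)
    n_d = d.size
    misfit = np.sum((d - g_xi) ** 2)          # squared 2-norm of the residual
    return -0.5 * n_d * np.log(2.0 * np.pi * sigma**2) - misfit / (2.0 * sigma**2)

# Two flux readings; a perfect fit gives the largest attainable value.
perfect = gaussian_log_likelihood([1.0, 2.0], [1.0, 2.0], sigma=0.05)
worse = gaussian_log_likelihood([1.0, 2.0], [1.1, 2.0], sigma=0.05)
```

Working with the logarithm avoids underflow when σ is small and the residual is large, which is the typical situation early in a sampling run.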

One of the key advantages of employing the Bayesian inference method is our ability to obtain the distribution of the unknown parameters. This capability allows us to quantitatively assess the uncertainty associated with QoIs, such as the outputs of the forward model. The trajectory of the observation points is subsequently determined in a sequential manner using posterior samples of the QoI.

3. Langevin diffusion and Bayesian sampling

In this study, we will employ the Bayesian approach to ascertain the trajectory of observation points and iteratively address the inverse source problem. To attain the desired posterior distribution $p(\boldsymbol \xi|\textbf{d})$ , various Markov chain Monte Carlo (MCMC) methods can be employed for sampling. Among these methods, the Langevin diffusion (LD)-based MCMC [4, 8, 9, 28] stands out as a computationally efficient choice, particularly when dealing with high-dimensional and multimodal distributions [21, 28].

Under certain regularity conditions (see, e.g. [2, 30]), the following stochastic differential equation (SDE)

$\begin{align} \mathrm{d}\boldsymbol \xi_t = -L\nabla U \left(\boldsymbol \xi_t\right) \mathrm{d}t + \sqrt{2 L \tau} \mathrm{d}W_t, \end{align} \tag{ 3.3 }$

known as the preconditioned LD [26, 29], has a stationary distribution

$\begin{align*} p_{\tau}\left(\boldsymbol \xi\right) \propto \text{exp}\left(-\frac{U\left(\boldsymbol \xi\right)}{\tau}\right). \end{align*}$

In equation (3.3), L is an arbitrary symmetric, positive-definite matrix, $U(\cdot)$ is the energy function, W_t is the p-dimensional standard Brownian motion, and τ is the temperature parameter. Specifically, if we define the energy function $U(\boldsymbol \xi)$ as the negative log-posterior (up to an additive constant),

$\begin{align*} U\left(\boldsymbol \xi\right) = -\text{log} p_0\left(\boldsymbol \xi\right)-\text{log} p\left(\textbf{d}|\boldsymbol \xi\right), \end{align*}$

then, for τ = 1, the target posterior distribution $p(\boldsymbol \xi|\textbf{d})$ can be sampled by the LD-based MCMC algorithms, where the Markov chain is constructed by discretizing equation (3.3) without adjustment or with the Metropolis–Hastings (MH) acceptance step [3, 34]. The choice of the temperature parameter τ has a crucial influence on the sampling. Replica-exchange methods [23, 24, 27] extend the preconditioned LD-based methods to multiple chains with different temperature parameters to accelerate distribution simulation.
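To make the discretization of (3.3) concrete, the following sketch applies an Euler–Maruyama step without MH adjustment to a toy target. It assumes L = I and takes U as the negative log-density, so that $\text{exp}(-U/\tau)$ is the target; the function name `langevin_samples` and the standard normal toy target are our illustrative choices, not part of the method used later in this work.

```python
import numpy as np

def langevin_samples(grad_U, xi0, step, n_steps, tau=1.0, rng=None):
    """Unadjusted Euler-Maruyama discretization of (3.3) with L = I."""
    rng = np.random.default_rng() if rng is None else rng
    xi = np.array(xi0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(xi.shape)
        xi = xi - step * grad_U(xi) + np.sqrt(2.0 * step * tau) * noise
    return xi

# Toy target: standard normal, U(xi) = xi^2 / 2, so grad U(xi) = xi.
# Each of the 2000 entries evolves as an independent one-dimensional chain.
rng = np.random.default_rng(0)
chains = langevin_samples(lambda x: x, np.zeros(2000), step=0.05, n_steps=500, rng=rng)
```

Without the MH correction the chain carries a discretization bias that shrinks with the step size, so the empirical variance here is close to, but not exactly, one.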

In practical sampling scenarios, computing the gradient of the energy function can be computationally expensive, especially when dealing with PDE-based inverse problems that involve time-consuming forward model simulations. Moreover, in this work, the source term exhibits discontinuities concerning the unknown parameters. Therefore, we opt for the gradient-free MH MCMC method [12].

To address the inverse problem, we will employ the adaptive preconditioned Crank–Nicolson MCMC (pCN-MCMC) method within a Bayesian inference framework.

3.1. Adaptive pCN

As the sampling acceptance rate is independent of the dimension of parameters, the method of pCN-MCMC [7] stands as an efficient strategy for addressing large-scale inference challenges. Specifically, it is based on the SDE

$\begin{align} \mathrm{d}\boldsymbol \xi_t = -LB^{-1} \boldsymbol \xi_t \mathrm{d}t + \sqrt{2 L} \mathrm{d}W_t, \end{align} \tag{ 3.4 }$

which is a special case of equation (3.3), where the prior is the Gaussian distribution ${\cal N}(0, B)$ , the gradient of the log-likelihood vanishes in the drift term, and the temperature is τ = 1. Applying the Crank–Nicolson schemes [1, 7] to the SDE (3.4), it follows that,

$\begin{align} \boldsymbol \xi^* = \boldsymbol \xi-\frac{1}{2}\delta LB^{-1}\left(\boldsymbol \xi+\boldsymbol \xi^*\right)+\sqrt{2\delta L}w, \end{align} \tag{ 3.5 }$

where $w \sim \mathcal{N}(0, I_p)$ is a sample from the standard multivariate normal distribution, and δ denotes the step size. Choosing L = B yields the pCN proposal,

$\begin{align} \boldsymbol \xi^* = \sqrt{1-\beta_1^2}\boldsymbol \xi+\beta_1 {w_1}, \ w_1\sim \mathcal{N}\left(0, B\right), \end{align} \tag{ 3.6 }$

where

$\begin{align*} \beta_1 = \frac{2\sqrt{2 \delta}}{2+ \delta}.\end{align*}$
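For completeness, the passage from (3.5) to (3.6) can be checked directly. The short verification below assumes L = B and $0\lt\delta\lt2$, and writes $w_1 = B^{1/2}w$:

$\begin{align*} \left(1+\tfrac{\delta}{2}\right)\boldsymbol \xi^* & = \left(1-\tfrac{\delta}{2}\right)\boldsymbol \xi+\sqrt{2\delta}\,B^{1/2}w, \\ \boldsymbol \xi^* & = \frac{2-\delta}{2+\delta}\,\boldsymbol \xi+\frac{2\sqrt{2\delta}}{2+\delta}\,w_1,\ \ w_1\sim \mathcal{N}\left(0, B\right), \\ 1-\beta_1^2 & = \frac{\left(2+\delta\right)^2-8\delta}{\left(2+\delta\right)^2} = \left(\frac{2-\delta}{2+\delta}\right)^2. \end{align*}$

Hence the coefficient of ξ is exactly $\sqrt{1-\beta_1^2}$ , which is what (3.6) states.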

Moreover, the equation (3.5) can also be rewritten as [15]

$\begin{align} \boldsymbol \xi^* = \sqrt{1-\beta_2^2CB^{-1}}\boldsymbol \xi+\beta_2 { w_2}, \ w_2\sim \mathcal{N}\left(0, C\right), \end{align} \tag{ 3.7 }$

where

$\begin{align*} \beta_2\sqrt{C} = \frac{\sqrt{2 \delta}}{I+ \frac{1}{2}\delta L B^{-1}}, \end{align*}$

which yields the adaptive pCN scheme, used to accelerate the convergence of the pCN MCMC method. One of the optimal choices of C is the posterior covariance matrix, which is not directly available but can be estimated empirically from the historical samples. For both the pCN and adaptive pCN schemes, the acceptance rate has the form [15]

$\begin{align} \alpha\left(\boldsymbol \xi^*, \boldsymbol \xi\right) = \text{min}\left(1, \text{exp}\left(\Phi\left(\boldsymbol \xi\right)-\Phi\left(\boldsymbol \xi^*\right)\right)\right) \end{align} \tag{ 3.8 }$

where Φ denotes the negative log-likelihood,

$\begin{align*} \Phi\left(\boldsymbol \xi\right) = -\text{log} p\left(\textbf{d}|\boldsymbol \xi\right). \end{align*}$

Please refer to algorithm 2 for more details on the adaptive pCN sampling method.

Algorithm 2. The adaptive pCN MCMC.
Input: the covariance matrix B of the prior, parameters β₁, β₂, noise level σ of the observed data, the initial guess $\boldsymbol \xi_0$ , current observation time $\tilde{t}$ , the numbers of samples N₁ and N, the update frequency k₀.
Output: posterior samples $\{\boldsymbol \xi_i\}_{i = 1}^{N}$ and the PDE solution $\{U_i\}_{i = 1}^{N}$ at time $\tilde{t}$ .
1. for $k = 1:N_1$ do
2. Generate the candidate $\boldsymbol \xi^*$ by the pCN scheme (3.6)
3. Accept $\boldsymbol \xi^*$ as $\boldsymbol \xi_{k}$ with the probability $\alpha(\boldsymbol \xi^*, \boldsymbol \xi_{k-1})$ ; otherwise set $\boldsymbol \xi_{k} = \boldsymbol \xi_{k-1}$
4. Collect the posterior samples of the parameters $\{\boldsymbol \xi_i\}_{i = 1}^{N_1}$ and the PDE solution $\{U_i\}_{i = 1}^{N_1}$ at time $\tilde{t}$ .
5. Compute the empirical covariance matrix of the $\{\boldsymbol \xi_i\}_{i = 1}^{N_1}$ as C.
6. for $k = N_1+1:N$ do
7. if $mod(k,k_0+1) = 0$ then
8. Update the empirical covariance matrix of $\{\boldsymbol \xi_i\}_{i = 1}^{k}$ as C
9. Propose the candidate $\boldsymbol \xi^*$ by the adaptive pCN scheme (3.7)
10. Accept $\boldsymbol \xi^*$ as $\boldsymbol \xi_{k}$ with the probability $\alpha(\boldsymbol \xi^*, \boldsymbol \xi_{k-1})$ ; otherwise set $\boldsymbol \xi_{k} = \boldsymbol \xi_{k-1}$
11. Store the posterior samples of the parameters and PDE solutions at time $\tilde{t}$ .
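The first phase of algorithm 2 (steps 1 to 3, the plain pCN proposal (3.6) with acceptance rule (3.8)) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the full adaptive scheme: the covariance update of steps 6 to 10 is omitted, `phi` is taken as the negative log-likelihood (the data misfit), which is the convention under which (3.8) favors better-fitting candidates, and the scalar linear forward model g(ξ) = ξ is a toy stand-in for the PDE solve.

```python
import numpy as np

def pcn_sampler(phi, xi0, beta, n_samples, prior_chol=None, rng=None):
    """Plain pCN MCMC for a Gaussian prior N(0, B), B = prior_chol @ prior_chol.T."""
    rng = np.random.default_rng() if rng is None else rng
    xi = np.atleast_1d(np.array(xi0, dtype=float))
    p = xi.size
    chol = np.eye(p) if prior_chol is None else prior_chol
    phi_xi = phi(xi)
    samples = np.empty((n_samples, p))
    n_accept = 0
    for k in range(n_samples):
        w = chol @ rng.standard_normal(p)               # w ~ N(0, B)
        cand = np.sqrt(1.0 - beta**2) * xi + beta * w   # proposal (3.6)
        phi_cand = phi(cand)
        # accept with probability min(1, exp(phi(xi) - phi(cand))), cf. (3.8)
        if np.log(rng.uniform()) < phi_xi - phi_cand:
            xi, phi_xi = cand, phi_cand
            n_accept += 1
        samples[k] = xi                                 # repeated if rejected
    return samples, n_accept / n_samples

# Toy problem: prior N(0, 1), one observation d = xi + noise with sigma = 0.5.
d_obs, sigma = 1.0, 0.5
phi = lambda xi: 0.5 * np.sum((d_obs - xi) ** 2) / sigma**2
samples, rate = pcn_sampler(phi, [0.0], beta=0.5, n_samples=20000,
                            rng=np.random.default_rng(0))
```

For this conjugate toy problem the posterior mean is $d/(1+\sigma^2) = 0.8$ , which gives a simple check on the chain. Note that the prior never appears in the acceptance ratio; it enters only through the proposal.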

3.2. Methodology

As mentioned earlier, the uniqueness of this inverse problem has been studied comprehensively in prior works [31, 32]. In this study, we follow the principles outlined in [31, 32] to state the uniqueness theorem.

Theorem 1. Given two unknown supports D₁ and D₂ which have sufficiently smooth boundaries, we denote the solutions corresponding to D_j by u_j , $j = 1,2$ . Recalling the data (2.2), we assume that there exists $\epsilon_0\gt0$ such that

$\begin{align*} z_j\left(t\right)=\theta_j\text{ for } t\in\left(0,\epsilon_0\right),\ j=1,2, \text{ and } {\left(\theta_1-\theta_2\right)/\pi\notin \mathbb Q}, \end{align*}$

where $\mathbb Q$ is the set of rational numbers. If

$\begin{align*} \frac{\partial u_1}{\partial\overset{\,}{\overrightarrow{\mathbf{n}}}}\left(z_j\left(t\right),t\right) = \frac{\partial u_2}{\partial\overset{\,}{\overrightarrow{\mathbf{n}}}} \left(z_j\left(t\right),t\right),\ j = 1,2,\ t\in \left(0,T\right), \end{align*}$

then $D_1 = D_2$ in the sense of $L^2(\Omega)$ .

Proof. The proof of theorem 1 follows from [31, theorem 3.1] and the analyticity of the natural exponential function straightforwardly. □

Theorem 1 establishes that, under appropriate conditions, we can ascertain the precise support of the source using the data (2.2). However, extensive experimental findings, as presented in [25, 32], suggest that relying solely on fixed-sensor flux observations may not yield accurate source predictions. Furthermore, in both sampling methods [25, 33] and the classical Levenberg–Marquardt iteration, the accuracy of source capture is greatly contingent on sensor placement, particularly when only using a limited number of sensors.

Inspired by theorem 1, we introduce an algorithm that varies the locations of the observation points (sensors). First, two observed positions (sensors) along the boundary at time T₁ are selected randomly; the positions are denoted as $p_1 = z_1(T_1)$ and $p_2 = z_2(T_1)$ , respectively. We then estimate the shape of the source based on the measured flux data collected at p₁ and p₂ in the framework of Bayesian inference; in particular, the adaptive pCN method displayed in algorithm 2 is used to draw the posterior samples. We then move the observation points at T₂ according to the sensor migration rules detailed in sections 3.2.1 and 4.1.1. Moreover, to save computational cost, we collect the posterior samples of the QoI U (the PDE solution) at time T₁ as $\{U_i\}_{i = 1}^{N}$ , where N is the number of samples, calculate the mean of $\{U_i\}_{i = 1}^{N}$ , and use it as the initial condition of the PDE on the interval $[T_1, T_2]$ . That is, once the observed locations at time T₂ are determined, we only simulate the forward model over the time interval $[T_1, T_2]$ , which saves the cost of re-simulating the PDE solution over $[T_0, T_1]$ . The process is repeated until the predicted locations are close to the current locations. For the full procedure, see algorithms 1 and 2. Selecting new sensor locations at random is also possible; however, in sections 3.2.1 and 4.1.1 we describe two distinct rules for altering the sensor locations.

3.2.1. A sensors migration rule.

Theorem 1 provides us with the possibility of using time-dependent sensors, and we also observe the improved numerical results in section 4. Notably, we even observe that the random sensors relocation strategy (moving the sensors at random at the next observation time) demonstrates superior performance compared to the fixed-sensor approach. However, it remains challenging for us to determine the optimal time to relocate the sensors. To address this issue, we introduce an effective approach that can help us identify when it is appropriate to cease relocating the sensors.

Our approach revolves around relocating the two sensors to locations characterized by the highest flux variance observed across all sampling iterations. To be precise, let the collection of flux at position ${\cal I}$ for all different sampling steps be denoted as $\mathcal{F}_{\cal I} = \{f_{\cal I} \}$ . We then proceed to shift one sensor to:

$\begin{align} \text{New sensor location} = \mathrm{argmax}_{\cal I} \text{var}\left(\mathcal{F}_{\cal I}\right). \end{align} \tag{ 3.9 }$

The second sensor is subsequently moved to the position exhibiting the second largest variance. It is important to emphasize that this strategy incurs no additional costs. Precisely, to compute the flux at the sensor locations, we must evaluate the equation's solution using the sampled source, guided by the acceptance rate (3.8). This enables us to compute the flux variance across all sampling iterations at any given point along the domain boundary. Notably, higher variance corresponds to a larger confidence interval and a less confident prediction. Consequently, our strategy involves relocating the sensors to positions characterized by the greatest variance.
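Rule (3.9) amounts to a column-wise variance followed by a sort. The sketch below assumes the flux of every posterior sample has already been stored at every boundary grid point in an array of shape (number of samples, number of boundary points); the function name `top_variance_sensors` and the small example array are hypothetical.

```python
import numpy as np

def top_variance_sensors(flux_samples, n_sensors=2):
    """Return the boundary indices with the largest flux variance across samples."""
    flux_samples = np.asarray(flux_samples, dtype=float)
    variances = flux_samples.var(axis=0)        # one variance per boundary point
    order = np.argsort(variances)[::-1]         # indices sorted by decreasing variance
    return tuple(int(i) for i in order[:n_sensors]), variances

# Three samples of the flux at four boundary points (made-up numbers).
flux = np.array([[0.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 3.0, 0.0],
                 [0.0, 2.0, 6.0, 0.0]])
new_sensors, variances = top_variance_sensors(flux)   # → (2, 1)
```

Because the flux values are a by-product of the forward solves already performed during sampling, this step needs no further PDE evaluations.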

4. Numerical experiments

In this section, we will conduct numerical experiments to test the performance of the algorithm. We shall delve into scenarios involving two distinct unknown sources, each characterized by disparate geometries and dimensions pertaining to the unknown parameters. Across both sets of problems, the task involves measuring boundary observation flux at two distinct points along the boundary—an endeavor that has proven to be particularly challenging for conventional implementation methodologies [31, 32]. Employing a comparative approach alongside the fixed observation points method, our analysis reveals a consistent trend: the methods we propose adeptly capture the source.

Consider the problem

$\begin{align*} \frac{\partial u}{\partial t}-\Delta u = b\chi_D\left(x\right), \ \ \ t\in \left[0, T\right], \end{align*}$

where the physical domain is $\Omega: = \{(x, y)|x^2+y^2\lt1\}$ , b represents the predefined intensity of the source term, and χ_D denotes the characteristic function of the domain D, the unknown region to be reconstructed. We will specify D in the experiments later. The observations are the flux $\frac{\partial u}{\partial \nu}$ collected at selected times and locations along $\partial \Omega$ .

For the convenience of solving the forward model, we transform the Cartesian coordinates to polar coordinates, i.e.

$\begin{align*} r = \sqrt{x^2+y^2},\ \ {\text{tan}{\theta} = \frac{y}{x}}, \end{align*}$

so that the domain becomes $\tilde \Omega: = \{(r, \theta)|0\lt r\lt1, 0 \leqslant \theta\leqslant 2\pi\}$ . The equation becomes

$\begin{align} \frac{\partial u}{\partial t}-\left(\frac{1}{r}\frac{\partial }{\partial r} \left(r \frac{\partial u}{\partial r}\right)+\frac{1}{r^2}\frac{\partial^2 u}{\partial \theta^2}\right) = b\chi_D\left(r, \theta\right), t\in \left[0, T\right]. \end{align} \tag{ 4.10 }$

The spatial and temporal discretization of the forward model (4.10) is conducted with the finite difference and backward Euler method, respectively. The synthetic measurement data is generated by solving the PDE with the reference source term with the spatial grid $33 \times 36$ and the discretized time step $\Delta t = 0.01$ . The discretized time step is set as $\Delta t = 0.02$ when simulating the forward model to avoid the 'inverse crime'. The measurement locations are sequentially determined during the inference following algorithm 1.
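To illustrate the time stepping only, the sketch below applies backward Euler with central finite differences to a one-dimensional Cartesian analogue, $u_t-u_{xx} = f$ on (0, 1) with zero Dirichlet data. This is a deliberate simplification and not the polar-coordinate solver for (4.10) used in the experiments; the function name and the grid of 33 points are our illustrative choices.

```python
import numpy as np

def backward_euler_heat_1d(source, n_interior, dt, n_steps):
    """Backward Euler for u_t - u_xx = source on (0, 1), u = 0 at both ends."""
    h = 1.0 / (n_interior + 1)
    main = 2.0 * np.ones(n_interior)
    off = -1.0 * np.ones(n_interior - 1)
    # A approximates -d^2/dx^2 with central differences
    A = (np.diag(main) + np.diag(off, 1) + np.diag(off, -1)) / h**2
    system = np.eye(n_interior) + dt * A        # (I + dt*A) u_new = u_old + dt*f
    u = np.zeros(n_interior)                    # zero initial condition
    for _ in range(n_steps):
        u = np.linalg.solve(system, u + dt * source)
    return u

x = np.linspace(0.0, 1.0, 33)[1:-1]             # 31 interior grid points
u_final = backward_euler_heat_1d(np.ones(31), 31, dt=0.02, n_steps=100)
```

With a unit source the solution relaxes to the steady state $x(1-x)/2$ , which central differences reproduce exactly for a quadratic, so the result at t = 2 gives a quick sanity check. The implicit step is unconditionally stable, which is why the time step can be changed between data generation and inference without a stability constraint.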

4.1. Circle-shape source

In this experiment, we consider the following unknown domain D,

$\begin{align*} \left(x-\eta_1\right)^2+\left(y-\eta_2\right)^2\lt0.2^2, \end{align*}$

where ${\boldsymbol \eta} = [\eta_1, \eta_2]^{\intercal}$ are the unknowns. The corresponding expression under the polar coordinates is

$\begin{align*} \left(r\text{cos}{\theta}-\rho\text{cos}{\omega}\right)^2+\left(r\text{sin}{\theta}-\rho\text{sin}{\omega}\right)^2\lt0.2^2, \end{align*}$

where $(\rho, \omega)$ are the unknown parameters. As the range of the samples is $(-\infty, +\infty)$ for this algorithm, while the range of our parameters is finite, we use the bijection map

$\begin{eqnarray*} \rho\left(\xi_1\right) & = & \frac{1}{\pi}\text{arctan}{\xi_1}+\frac{1}{2}, \\ \omega\left( \xi_2\right) & = & 2\text{arctan}{\xi_2}+\pi \end{eqnarray*}$

to map ${\boldsymbol \xi} = (\xi_1, \xi_2)^{\intercal}$ to the intervals (0, 1) and $(0,2\pi)$ , respectively. The ξ become the unknowns endowed with the prior $\mathcal{N}(0, I_p)$ to be inferred. The relationship between ξ and the location of the center is

$\begin{eqnarray*} \eta_1 & = & \rho\left(\xi_1\right)\text{cos}{\omega\left(\xi_2\right)}, \\ \eta_2 & = & \rho\left(\xi_1\right)\text{sin}{\omega\left(\xi_2\right)}. \end{eqnarray*}$

The statistical properties of η can be estimated through the posterior samples of ξ .

The predefined strength of the source term is b = 50, the ground truth parameter is ${\boldsymbol \eta} = [0, 0.5]^{\intercal}$ , implying that the true parameter vector ξ is given by $(\xi_1, \xi_2)^{\intercal} = (0, -1)^{\intercal}$ . The measurement error variance is $\sigma^2 = 0.05^2$ . We perform a total of N₁ = 0 initial iterations, followed by $N = 10^4$ iterations, with updates occurring every $k_0 = 2.5\times 10^3$ iterations. In both the pCN and adaptive pCN schemes, the tunable parameters β₁ and β₂ are set to be equal to ensure an acceptance rate of 30%–40%.
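The change of variables from ξ to the center η is easy to check numerically. The following sketch implements the two maps above; the function names `xi_to_polar` and `xi_to_center` are ours.

```python
import numpy as np

def xi_to_polar(xi):
    """Map unconstrained xi = (xi1, xi2) to (rho, omega) in (0, 1) x (0, 2*pi)."""
    xi1, xi2 = xi
    rho = np.arctan(xi1) / np.pi + 0.5
    omega = 2.0 * np.arctan(xi2) + np.pi
    return rho, omega

def xi_to_center(xi):
    """Cartesian center eta = (rho*cos(omega), rho*sin(omega)) of the circle."""
    rho, omega = xi_to_polar(xi)
    return np.array([rho * np.cos(omega), rho * np.sin(omega)])

rho_true, omega_true = xi_to_polar((0.0, -1.0))   # rho = 0.5, omega = pi/2
eta_true = xi_to_center((0.0, -1.0))              # approximately [0, 0.5]
```

Applying `xi_to_center` to each posterior sample of ξ and averaging gives the posterior mean of η; this is one way to obtain the kind of center estimate discussed in this section.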

4.1.1. The second sensors migration rule.

In section 3.2.1, we introduce a general sensor relocation strategy that is based on an evaluation of the variance of the flux for all different sampling steps at different points along the boundary. In this section, we introduce the second rule which is specific to this problem but can be easily generalized to problems represented in spherical or polar coordinates.

The parameters ξ₁ and ξ₂ parameterize the radial and angular coordinates $(\rho, \omega)$ of the center of the target circle. As the observation locations approach the target area, the gathered data becomes more informative. Since the target area is circular, the proximity of the observation locations to the domain can be quantified by measuring the distance between the center and the observation locations. Hence, the moving observation locations are determined based on the samples of ξ₂, specifically,

$\begin{align} \text{New sensor location} = \left(\lfloor {\bar \omega}/h_\theta\rfloor, \lceil {\bar \omega}/h_\theta \rceil\right) \end{align} \tag{ 4.11 }$

where $\bar \omega$ is the posterior mean of $\omega(\xi_2)$ and h_θ is the grid spacing in the θ direction, so that $\lfloor {\bar \omega}/h_\theta\rfloor$ and $\lceil {\bar \omega}/h_\theta \rceil$ are the indices of two locations along the boundary.
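Rule (4.11) can be written in a couple of lines. The sketch below assumes an angular spacing of $h_\theta = \pi/18$ (36 angular grid points, consistent with the sensor angles reported as multiples of $\pi/18$ ); the function name and the value of $\bar \omega$ are illustrative.

```python
import math

def adjacent_sensor_indices(omega_bar, h_theta):
    """Boundary grid indices on either side of the posterior-mean angle."""
    ratio = omega_bar / h_theta
    return math.floor(ratio), math.ceil(ratio)

h_theta = math.pi / 18                                   # assumed angular spacing
indices = adjacent_sensor_indices(1.6, h_theta)          # → (9, 10)
angles = tuple(i * h_theta for i in indices)             # sensor angles in radians
```

If $\bar \omega$ happens to fall exactly on a grid point, the floor and ceiling coincide and the two sensors collapse onto one location; a practical implementation would need a tie-breaking convention for that case, which is not specified here.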

Firstly, we take two measured locations along the edge randomly at time $T_1 = 0.5$ , and draw samples from the posterior density. We then calculate the posterior mean of $\omega(\xi_2)$ as $\bar \omega$ , and determine the new sensors location index as $\lfloor {\bar \omega}/h_\theta\rfloor$ and $\lceil {\bar \omega}/h_\theta \rceil$ at the next time layer T₂. Obviously, the selected observation points p₁ and p₂ are adjacent. The process is repeated until the observation time is $\tilde t = T_3$ , and the forward model simulation and Bayesian inference are then stopped. The results of the dynamic procedure are displayed in figure 2.

**Figure 2.** Estimates of η using posterior sample mean. The center of the blue dashed circle is the true η , while the center of the red dashed circle is the average of all prediction samples. Flux measurements are obtained through two sensors (black dots) situated along the simulation boundary, with their positions dynamically adjusted over time according to algorithm 1. Specifically, the simulation is first performed to compute the flux at $T_1 = 0.5$ , with sensors located at $(\frac{22}{18}\pi, \ \frac{30}{18}\pi)$ . Subsequently, flux calculations are performed at T₂ = 1, with sensors positioned at $(\frac{10}{18}\pi, \ \frac{11}{18}\pi)$ , and at $T_3 = 1.5$ with sensor placement at $(\frac{10}{18}\pi, \ \frac{9}{18}\pi)$ . For further details on the sensor migration strategy, refer to section 3.2.1 .

For comparison purposes, we maintain the sensor location fixed at $(\frac{22}{18}\pi, \frac{30}{18}\pi)$ , and flux measurements are taken at time instances $t^{\prime} = [0.5, 1, 1.5]^{\intercal}$ , respectively. The measurement error remains consistent for both the fixed location strategy and the strategy involving sequentially determined locations. In other words, for each observed time T_i , the noise introduced to the flux at the fixed locations $(\frac{22}{18}\pi, \frac{30}{18}\pi)$ is identical to the noise introduced to the flux at the sequentially determined locations. In figure 3, the posterior mean of the samples is represented by the red circle, offering an accurate estimate of the target region's center. However, figure 4 reveals a distinct pattern: the variance of the samples acquired through sequentially determined sensor placements is significantly reduced compared to those obtained using the fixed measurement strategy. This outcome is a result of the fact that sequentially determined measurements provide more informative data, leading to a substantial reduction in uncertainty.

**Figure 3.** Posterior mean estimations of η under static sensor positions. The sensors are initially positioned at $(\frac{22}{18}\pi, \ \frac{30}{18}\pi)$ . In contrast to the dynamic sensor adjustment recommended by algorithm 1, we maintain a fixed sensor location and capture flux measurements at time instances $t^{\prime} = [0.5, 1, 1.5]^{\intercal}$ .

**Figure 4.** Trace plots illustrating posterior samples across distinct measurement strategies. The blue curve represents samples for ξ₁, while the green curve corresponds to ξ₂. (a) Sequentially determined locations as recommended by algorithm 1. (b) Fixed sensors. The plots indicate that fixed sensor positions result in a broader confidence interval, whereas the proposed strategy yields predictions with significantly reduced variance.

We also compared our proposed method with a random selection strategy for the measurement locations. To account for the randomness of this strategy, we ran it twice with different measurement locations. The trace plots of the posterior samples for ξ are presented in figure 5. Both dynamic placement strategies, the proposed rule detailed in section 4.1.1 and the random relocation strategy, successfully capture the inverse QoIs. The random strategy, however, results in a broader confidence interval.

**Figure 5.** Trace plots illustrating posterior samples across distinct measurement strategies. The blue curve represents samples for ξ₁, while the green curve corresponds to ξ₂. (a) Sequentially determined locations as recommended by algorithm 1, with the dynamical sensor placement strategy detailed in section 4.1.1. (b) and (c) Algorithm 1 with the random sensor placement strategy; the experiment is repeated twice to study the performance more thoroughly. The dynamic sensor placement strategy in (a) leads to a narrower confidence band, signifying decreased variance among the samples. Nevertheless, even randomly varying sensors, as illustrated in (b) and (c), exhibit considerably lower variance than the fixed sensor approach shown in figure 4(b).
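
One way to make the visual comparison of trace plots quantitative is to report the per-component sample mean and standard deviation of each chain after discarding an initial transient. The sketch below assumes the chains are stored as arrays of shape (iterations, parameters); the function name `chain_summary`, the burn-in length, and the synthetic chains in the usage lines are illustrative assumptions and are not the samples from the experiments.

```python
import numpy as np

def chain_summary(chain, burn_in=0):
    """Return per-component mean and sample standard deviation of a chain.

    chain   : array of shape (n_iterations, n_parameters)
    burn_in : number of leading iterations to discard (an assumed choice)
    """
    kept = np.asarray(chain, dtype=float)[burn_in:]
    return kept.mean(axis=0), kept.std(axis=0, ddof=1)

if __name__ == "__main__":
    # Synthetic stand-ins for two trace plots: same center, different spread.
    rng = np.random.default_rng(0)
    center = np.array([0.3, -0.2])
    narrow_chain = center + 0.02 * rng.standard_normal((5000, 2))
    wide_chain = center + 0.10 * rng.standard_normal((5000, 2))

    for name, chain in [("narrow", narrow_chain), ("wide", wide_chain)]:
        mean, std = chain_summary(chain, burn_in=1000)
        print(name, "mean:", mean, "std:", std)
```

A smaller standard deviation for every component corresponds to the narrower band seen in a trace plot.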

4.2. High dimensional peanut-shape source

In this example, we examine a problem of higher dimensions with asymmetrical characteristics [31]. Specifically, the unknown domain has the boundary $\partial D$ that can be parameterized as

$\begin{align*} \partial D = \left\{q\left(\theta; \boldsymbol \xi\right)\left(\text{cos}\left(\theta\right), \text{sin}\left(\theta\right)\right)^\mathrm{T}: \theta \in \left[0, 2\pi\right] \right\}, \end{align*}$

with a smooth, periodic function $q(\theta; \boldsymbol \xi)\in (0, 1)$ , which has the form

$\begin{align} q\left(\theta; \boldsymbol \xi\right) = \frac{1}{2}\xi_1+\sum_{i = 1}^M\left( \xi_{i+1}\cos\left(i\theta\right)+\xi_{i+M+1}\sin\left(i\theta\right)\right). \end{align} \tag{ 4.12 }$

To ensure the smoothness of the approximation, we follow the paper [31] and set the penalty term to be the H² norm of $q(\theta; \boldsymbol \xi)$ , which implies the Gaussian prior density ${\cal N}(0, B)$ for ξ , where the covariance matrix $B\in {\mathbb{R}}^{(2M+1)\times (2M+1)}$ is diagonal, with entries

$\begin{align*} B_{1,1} = 1, \quad B_{i+1,i+1} = B_{i+M+1,i+M+1} = \frac{1}{i^2}, \quad i = 1, \cdots, M.\end{align*}$
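
As a concrete illustration, the radius function and the diagonal prior covariance can be evaluated as in the sketch below. It assumes the coefficient ordering suggested by the indices of B: ξ₁ is the constant term, $\xi_{i+1}$ multiplies $\cos(i\theta)$ and $\xi_{i+M+1}$ multiplies $\sin(i\theta)$. The function names are hypothetical, and the coefficient vector in the usage lines is the true parameter quoted for this example.

```python
import numpy as np

def radius(theta, xi):
    """Evaluate q(theta; xi) for a coefficient vector xi of length 2M + 1.

    Assumed ordering (0-based): xi[0] constant, xi[1..M] cosine
    coefficients, xi[M+1..2M] sine coefficients.
    """
    xi = np.asarray(xi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    M = (xi.size - 1) // 2
    q = 0.5 * xi[0] * np.ones_like(theta)
    for i in range(1, M + 1):
        q = q + xi[i] * np.cos(i * theta) + xi[i + M] * np.sin(i * theta)
    return q

def boundary_points(theta, xi):
    """Points q(theta) * (cos(theta), sin(theta)) on the boundary of D."""
    theta = np.asarray(theta, dtype=float)
    q = radius(theta, xi)
    return np.stack([q * np.cos(theta), q * np.sin(theta)], axis=-1)

def prior_covariance(M):
    """Diagonal matrix B with B_11 = 1 and 1 / i^2 for both modes of order i."""
    i = np.arange(1, M + 1)
    diag = np.concatenate(([1.0], 1.0 / i**2, 1.0 / i**2))
    return np.diag(diag)

if __name__ == "__main__":
    xi_true = [1.0, 0.0, 0.0, 0.0, 0.3]      # 2M + 1 = 5, so M = 2
    theta = np.linspace(0.0, 2.0 * np.pi, 200)
    q = radius(theta, xi_true)
    print("min/max radius:", q.min(), q.max())
    print("prior variances:", np.diag(prior_covariance(2)))
```

With this ordering the true parameter gives $q(\theta) = 0.5 + 0.3\sin(2\theta)$ , which stays inside (0, 1) and produces the two-lobed, peanut-like boundary.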

The true parameter is $\boldsymbol \xi = [1, 0, 0, 0, 0.3]^{\intercal}$ (so that M = 2), the strength is preset as b = 10, and the variance of the measurement error is $\sigma^2 = 0.01^2$ . The iteration numbers are $N_1 = 1\times 10^3$ and $N = 1.5\times10^4$ , and the update frequency is $k_0 = 2.5\times 10^3$ . The parameters of the pCN and adaptive pCN schemes are tuned separately; both lead to acceptance rates of 30%–40%.
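
For orientation, a minimal sketch of the standard (non-adaptive) pCN step for a Gaussian prior ${\cal N}(0, B)$ with diagonal B is given below; it is not the adaptive scheme of section 3.1. The quadratic `toy_potential` is only a placeholder for the data misfit, which in this problem requires solving (1.1), and the function names, the step size `beta` and the toy target are assumptions made for illustration.

```python
import numpy as np

def pcn_sample(potential, prior_std, n_steps, beta, rng):
    """Standard pCN sampler for a N(0, diag(prior_std**2)) prior.

    potential : callable returning the negative log-likelihood Phi(xi)
    beta      : step size in (0, 1]; tuned to reach a target acceptance rate
    Returns the chain (n_steps, dim) and the empirical acceptance rate.
    """
    prior_std = np.asarray(prior_std, dtype=float)
    dim = prior_std.size
    xi = np.zeros(dim)
    phi = potential(xi)
    chain = np.empty((n_steps, dim))
    accepted = 0
    for k in range(n_steps):
        w = prior_std * rng.standard_normal(dim)          # draw from the prior
        proposal = np.sqrt(1.0 - beta**2) * xi + beta * w
        phi_prop = potential(proposal)
        # Accept with probability min(1, exp(Phi(xi) - Phi(proposal))).
        if np.log(rng.uniform()) < phi - phi_prop:
            xi, phi = proposal, phi_prop
            accepted += 1
        chain[k] = xi
    return chain, accepted / n_steps

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    prior_std = np.array([1.0, 1.0, 0.5, 1.0, 0.5])      # sqrt of diag(B), M = 2
    target = np.array([1.0, 0.0, 0.0, 0.0, 0.3])

    def toy_potential(xi):
        # Placeholder misfit, NOT the heat-equation forward model.
        return 0.5 * np.sum((xi - target) ** 2) / 0.1**2

    chain, rate = pcn_sample(toy_potential, prior_std, 5000, 0.05, rng)
    print("acceptance rate:", rate)
    print("mean of second half:", chain[2500:].mean(axis=0))
```

In practice `beta` would be adjusted until the printed acceptance rate falls in the desired range, such as the 30%–40% quoted above.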

We first assess the random sensor relocation method and show the corresponding outcomes in figure 6(b). We compare it with the fixed-sensor approach; as shown in figure 6(a), the limited data available to the fixed-location strategy hinder its ability to capture the shape of the target source D accurately. Despite some overlap in measurement locations across different time points, the data collected by randomly relocating sensors prove sufficiently informative to capture the intricate shape of the high-dimensional source. As depicted in figure 7, the samples obtained with the randomly moving sensors cluster around the true values of the inverse QoIs ξ , whereas the fixed-sensor approach fails to identify most of the QoIs, with the exception of ξ₁.

**Figure 6.** Posterior mean estimates of $q(\theta; \boldsymbol \xi)$ at time $t = T_5$ with (a) fixed sensor locations $(\frac{11}{18}\pi, \ \frac{5}{18}\pi)$ and (b) randomly moving sensors at times $T_1: (\frac{11}{18}\pi, \ \frac{5}{18}\pi)$ , $T_2: (\frac{29}{18}\pi, \ \frac{5}{18}\pi)$ , $T_3: (\frac{16}{18}\pi, \ \frac{33}{18}\pi)$ , $T_4: (\frac{29}{18}\pi, \ \frac{34}{18}\pi)$ , $T_5: (\frac{24}{18}\pi, \ \frac{2}{18}\pi)$ . The shape of the high-dimensional source is accurately captured by the moving-sensor strategy, while the fixed-location strategy performs poorly.

**Figure 7.** Trace plots illustrating posterior samples at all iteration steps. Different colors represent different unknown parameters, with true parameters $\boldsymbol \xi = [1, 0, 0, 0, 0.3]^{\intercal}$ . Left (a): the sensor locations are fixed at $(\frac{11}{18}\pi, \ \frac{5}{18}\pi)$ . Right (b): the sensors are moved randomly along the boundary. For the moving-sensor strategy, the samples of ξ₁ and ξ₅ concentrate around the true values 1 and 0.3, and the components ξ₂, ξ₃, and ξ₄ stay close to the true value 0.

Following the rule detailed in section 3.2.1, we determine the measurement locations sequentially and present the posterior mean of the prediction samples in figure 8. Notably, the locations at time T₅ coincide with those at time T₃. Specifically, the suggested locations at time T₅ are $(\frac{21}{18}\pi, \ \frac{20}{18}\pi)$ . The location $\frac{20}{18}\pi$ also appears at T₄, which indicates repeated cycling between the locations $(\frac{4}{18}\pi, \ \frac{5}{18}\pi)$ and $(\frac{19}{18}\pi, \ \frac{20}{18}\pi)$ . Consequently, we conclude the sequential process at time T₅. This provides a criterion for deciding when to stop taking measurements and gives some insight into the richness of the collected data. Each subfigure includes the 1.5σ confidence band of the posterior samples for $q(\theta; \boldsymbol \xi)$ , which contains the true $q(\theta; \boldsymbol \xi)$ .

**Figure 8.** Posterior mean estimates for prediction samples by the proposed algorithm 1. The light blue curves represent the target source location defined in equation (4.12), while the red dashed line depicts the sample mean. Black dots situated along the boundary denote the sensor positions, and the detailed determination of sensor locations at different time layers is presented sequentially below. (a) $T_1 = 0.5$ , $(\frac{11}{18}\pi, \ \frac{5}{18}\pi)$ . (b) T₂ = 1, $(\frac{28}{18}\pi, \ \frac{27}{18}\pi)$ . (c) $T_3 = 1.5$ , $(\frac{4}{18}\pi, \ \frac{5}{18}\pi)$ . (d) T₄ = 2, $(\frac{19}{18}\pi, \ \frac{20}{18}\pi)$ . (e) $T_5 = 2.5$ , $(\frac{4}{18}\pi, \ \frac{5}{18}\pi)$ . The shaded band is plotted with $\mu\pm 1.5\sigma$ .
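
A band of the form $\mu\pm 1.5\sigma$ can be obtained from posterior samples of ξ by evaluating $q(\theta; \boldsymbol \xi)$ for every sample and taking the pointwise mean and standard deviation over the samples. The sketch below shows one way to do this; it assumes the coefficient ordering suggested by the indices of B (constant term first, then the cosine coefficients, then the sine coefficients), and the function names are hypothetical.

```python
import numpy as np

def radius_samples(theta, xi_samples):
    """Evaluate q(theta; xi) for each row of xi_samples.

    theta      : array of angles, shape (n_theta,)
    xi_samples : array of shape (n_samples, 2M + 1)
    Returns an array of shape (n_samples, n_theta).
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    xi_samples = np.atleast_2d(np.asarray(xi_samples, dtype=float))
    M = (xi_samples.shape[1] - 1) // 2
    orders = np.arange(1, M + 1)
    cos_terms = np.cos(np.outer(orders, theta))   # shape (M, n_theta)
    sin_terms = np.sin(np.outer(orders, theta))   # shape (M, n_theta)
    return (0.5 * xi_samples[:, :1]
            + xi_samples[:, 1:M + 1] @ cos_terms
            + xi_samples[:, M + 1:] @ sin_terms)

def pointwise_band(theta, xi_samples, width=1.5):
    """Return (lower, mean, upper) of the band mean +/- width * std."""
    q = radius_samples(theta, xi_samples)
    mu = q.mean(axis=0)
    sigma = q.std(axis=0)
    return mu - width * sigma, mu, mu + width * sigma

if __name__ == "__main__":
    # Synthetic samples scattered around a reference vector (illustration only).
    rng = np.random.default_rng(2)
    reference = np.array([1.0, 0.0, 0.0, 0.0, 0.3])
    samples = reference + 0.02 * rng.standard_normal((1000, 5))
    theta = np.linspace(0.0, 2.0 * np.pi, 100)
    lower, mu, upper = pointwise_band(theta, samples)
    print("largest band half-width:", np.max(upper - mu))
```

The band is computed on the radius q rather than on the coefficients, so it reflects the combined uncertainty of all components of ξ at each angle.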

5. Concluding remarks

In this work, we consider the inverse source problem for the heat equation, using boundary measurements taken by observation sensors that can be moved. We adopt a Bayesian approach and test it on several numerical examples. The numerical results illustrate that the Bayesian approach is feasible for solving such an inverse problem with data from moving observation points.

We outline several directions for future research. First, it is worth exploring the application of the PDE (1.1) in a more general domain. In this study, we confined our attention to the unit disc in $\mathbb{R}^2$ , which possesses advantageous geometric properties. Expanding our analysis to encompass general domains in $\mathbb{R}^2$ or $\mathbb{R}^3$ would undoubtedly introduce additional complexities to the uniqueness argument and reconstruction algorithms.

Second, a significant open question pertains to the optimal strategy for relocating observation sensors. While we proposed a migration rule (3.9) in this work, we acknowledge that it lacks rigorous mathematical proofs, rendering it somewhat empirical in nature. Consequently, an essential avenue for future research involves a thorough investigation of this migration rule to establish its validity and optimize its performance.

Acknowledgments

Na Ou acknowledges the support of Chinese NSF 11901060, Hunan Provincial NSF 2021JJ40557 and Scientific Research Foundation of Hunan Provincial Education Department 22B0333. Zhidong Zhang is supported by National Natural Science Foundation of China (Grant No. 12101627). Guang Lin acknowledges the support of the National Science Foundation (DMS-2053746, DMS-2134209, ECCS-2328241, and OAC-2311848), and US Department of Energy (DOE) Office of Science Advanced Scientific Computing Research program DE-SC0023161, and the Uncertainty Quantification for Multifidelity Operator Learning project (Project No. 81739), and DOE-Fusion Energy Science, under Grant Number: DE-SC0024583.

Data availability statement

The data cannot be made publicly available upon publication because the cost of preparing, depositing and hosting the data would be prohibitive within the terms of this research project. The data that support the findings of this study are available upon reasonable request from the authors.
