
Extremal Points and Sparse Optimization for Generalized Kantorovich–Rubinstein Norms

Foundations of Computational Mathematics

Abstract

A precise characterization of the extremal points of sublevel sets of nonsmooth penalties provides both detailed information about minimizers, and optimality conditions in general classes of minimization problems involving them. Moreover, it enables the application of fully corrective generalized conditional gradient methods for their efficient solution. In this manuscript, this program is adapted to the minimization of a smooth convex fidelity term which is augmented with an unbalanced transport regularization term given in the form of a generalized Kantorovich–Rubinstein norm for Radon measures. More precisely, we show that the extremal points associated to the latter are given by all Dirac delta functionals supported in the spatial domain as well as certain dipoles, i.e., pairs of Diracs with the same mass but with different signs. Subsequently, this characterization is used to derive precise first-order optimality conditions as well as an efficient solution algorithm for which linear convergence is proved under natural assumptions. This behavior is also reflected in numerical examples for a model problem.


References

  1. F. Angrisani, G. Ascione, L. D’Onofrio, and G. Manzo, Duality and distance formulas in Lipschitz-Hölder spaces, Atti Accad. Naz. Lincei Rend. Lincei Mat. Appl. 31 (2020), no. 2, 401–419.

  2. F. Angrisani, G. Ascione, and G. Manzo, Atomic decomposition of finite signed measures on compacts of \(\mathbb{R}^n\), Ann. Fenn. Math. 46 (2021), no. 2, 643–654.

  3. N. Boyd, G. Schiebinger, and B. Recht, The alternating descent conditional gradient method for sparse inverse problems, SIAM J. Optim. 27 (2017), no. 2, 616–639.

  4. C. Boyer, A. Chambolle, Y. De Castro, V. Duval, F. De Gournay, and P. Weiss, On representer theorems and convex regularization, SIAM J. Optim. 29 (2019), no. 2, 1260–1281.

  5. K. Bredies and M. Carioni, Sparsity of solutions for variational inverse problems with finite-dimensional data, Calc. Var. Partial Differential Equations 59 (2020), no. 1, 1–26.

  6. K. Bredies, M. Carioni, S. Fanzon, and F. Romero, On the extremal points of the ball of the Benamou–Brenier energy, Bull. Lond. Math. Soc. 53 (2021), no. 5, 1436–1452.

  7. K. Bredies, M. Carioni, S. Fanzon, and F. Romero, A generalized conditional gradient method for dynamic inverse problems with optimal transport regularization, Found. Comput. Math. 23 (2023), no. 3, 833–898.

  8. K. Bredies, M. Carioni, S. Fanzon, and D. Walter, Asymptotic linear convergence of fully-corrective generalized conditional gradient methods, Math. Program. (2023), https://doi.org/10.1007/s10107-023-01975-z.

  9. K. Bredies and H. K. Pikkarainen, Inverse problems in spaces of measures, ESAIM Control Optim. Calc. Var. 19 (2013), no. 1, 190–218.

  10. H. Brezis, Functional analysis, Sobolev spaces and partial differential equations, Universitext, New York, NY: Springer, 2011.

  11. E. J. Candès and C. Fernandez-Granda, Towards a mathematical theory of super-resolution, Comm. Pure Appl. Math. 67 (2014), no. 6, 906–956.

  12. E. Casas, C. Clason, and K. Kunisch, Parabolic control problems in measure spaces with sparse solutions, SIAM J. Control Optim. 51 (2013), no. 1, 28–63.

  13. J. C. Dunn, Convergence rates for conditional gradient sequences generated by implicit step length rules, SIAM J. Control Optim. 18 (1980), no. 5, 473–487.

  14. J. C. Dunn and S. Harshbarger, Conditional gradient algorithms with open loop step size rules, J. Math. Anal. Appl. 62 (1978), no. 2, 432–444.

  15. V. Duval and G. Peyré, Exact support recovery for sparse spikes deconvolution, Found. Comput. Math. 15 (2015), no. 5, 1315–1355.

  16. V. Duval and R. Tovey, Dynamical programming for off-the-grid dynamic inverse problems, Preprint arXiv:2112.11378 [math.OC], 2021.

  17. I. Ekeland and R. Témam, Convex analysis and variational problems, Classics in Applied Mathematics, vol. 28, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1999.

  18. M. Frank and P. Wolfe, An algorithm for quadratic programming, Naval Research Logistics Quarterly 3 (1956), no. 1-2, 95–110.

  19. L. G. Hanin, Kantorovich-Rubinstein norm and its application in the theory of Lipschitz spaces, Proc. Amer. Math. Soc. 115 (1992), no. 2, 345–352.

  20. J. A. Iglesias and D. Walter, Extremal points of total generalized variation balls in 1D: characterization and applications, J. Convex Anal. 29 (2022), no. 4, 1251–1290.

  21. L. V. Kantorovich and G. P. Akilov, Functional analysis, Second ed., Pergamon Press, Oxford-Elmsford, N.Y., 1982.

  22. P.-J. Laurent, Approximation et optimisation, Collection Enseignement des Sciences, No. 13, Hermann, Paris, 1972.

  23. J. Lellmann, D. A. Lorenz, C. Schönlieb, and T. Valkonen, Imaging with Kantorovich-Rubinstein discrepancy, SIAM J. Imaging Sci. 7 (2014), no. 4, 2833–2859.

  24. L. Métivier, R. Brossier, Q. Mérigot, E. Oudet, and J. Virieux, An optimal transport approach for seismic tomography: application to 3D full waveform inversion, Inverse Problems 32 (2016), no. 11, 115008, 36 pp.

  25. P. Pegon, F. Santambrogio, and D. Piazzoli, Full characterization of optimal transport plans for concave costs, Discrete Contin. Dyn. Syst. 35 (2015), no. 12, 6113–6132.

  26. F. Santambrogio, Optimal transport for applied mathematicians, Progress in Nonlinear Differential Equations and their Applications, vol. 87, Birkhäuser/Springer, Cham, 2015.

  27. T. Strömberg, The operation of infimal convolution, Dissertationes Math. (Rozprawy Mat.) 352 (1996), 58 pp.

  28. M. Unser, J. Fageot, and J. P. Ward, Splines are universal solutions of linear inverse problems with generalized TV regularization, SIAM Rev. 59 (2017), no. 4, 769–793.

  29. D. J. Wales and J. P. K. Doye, Global optimization by basin-hopping and the lowest energy structures of Lennard-Jones clusters containing up to 110 atoms, J. Phys. Chem. A 101 (1997), no. 28, 5111–5116.

  30. N. Weaver, Lipschitz algebras, World Scientific Publishing Co. Pte. Ltd., Hackensack, NJ, 2018.

  31. Y. Yu, X. Zhang, and D. Schuurmans, Generalized conditional gradient for sparse estimation, J. Mach. Learn. Res. 18 (2017), Paper No. 144, 46 pp.

  32. C. Zălinescu, Convex analysis in general vector spaces, World Scientific Publishing Co., Inc., River Edge, NJ, 2002.

Author information

Corresponding author

Correspondence to José A. Iglesias.

Additional information

Communicated by Martin Burger.

Appendix A: Proofs for Sect. 3.3.2

In this section, we collect the necessary auxiliary results for the proof of Theorem 3.15 by applying the results of [8]. For this purpose, we keep using the notation \(B:=\{\mu \,|\, \Vert \mu \Vert _{{\text {KR}}_p^{\alpha ,\beta }}\le 1\}\) and further introduce \(\mathcal {B}:=\overline{{\text {Ext}}(B)}^*\). Since the predual space \(\mathcal {C}(\Omega )\) is separable, \(\mathcal {B}\) is weak* compact and there exists a metric \(d_\mathcal {B}\) which metrizes the weak* topology on \(\mathcal {B}\), see [10, Theorem 3.29].

Lemma A.1

We have

$$\begin{aligned} \mathcal {B}= \{\,(\sigma /\alpha )\delta _z\;|&\;\sigma \in \{-1,+1\},~z \in \Omega \,\} \\&\cup \{\,\mathcal {D}_\beta (x,y)\;|\;(x,y) \in \Omega \times \Omega ,~0\le |x-y|^p\le 2 \alpha -\beta \,\}. \end{aligned}$$

Proof

By the characterization of \({\text {Ext}}(B)\), we first observe that

$$\begin{aligned} \mathcal {B}={}&\overline{\{\,(\sigma /\alpha )\delta _z\;|\;\sigma \in \{-1,+1\},~z \in \Omega \,\}}^*\\&\cup \overline{\left\{ \,\mathcal {D}_\beta (x,y)\;|\;(x,y) \in \Omega \times \Omega ,~0<|x-y|^p< 2 \alpha -\beta \,\right\} }^*. \end{aligned}$$

Now, let \(\mu _k=(\sigma _k/\alpha ) \delta _{z_k}\), \(\sigma _k\in \{-1,1\}\), \(z_k\in \Omega \), \(k \in \mathbb {N}\), denote a weak* convergent sequence with limit \(\bar{\mu }\). Then, due to the compactness of \(\Omega \), there exists a subsequence, denoted by the same symbol, with

$$\begin{aligned} (\sigma _k, z_k) \rightarrow (\bar{\sigma }, \bar{z}) \quad \text {for some}~(\bar{\sigma }, \bar{z}) \in \{-1,1\} \times \Omega . \end{aligned}$$

Setting \(\tilde{\mu }=(\bar{\sigma }/\alpha )\delta _{\bar{z}}\), the associated sequence of measures satisfies

$$\begin{aligned} \langle q,\mu _k \rangle = (\sigma _k/\alpha ) q(z_k) \rightarrow ({\bar{\sigma }}/\alpha ) q({\bar{z}})=\langle q,{\tilde{\mu }} \rangle \quad \text {for all}~q \in \mathcal {C}(\Omega ). \end{aligned}$$

Since weak* limits are unique, \(\bar{\mu }={\tilde{\mu }}\) follows.

Similarly, we see that any weak* convergent sequence \(\mu _k=\mathcal {D}_\beta (x_k,y_k)\) with

$$\begin{aligned} (x_k,y_k)\in \Omega \times \Omega ,~0<|x_k-y_k|^p< 2\alpha -\beta \end{aligned}$$

necessarily converges weakly* to \(\mathcal {D}_\beta (\bar{x},\bar{y})\) for some \((\bar{x},\bar{y})\in \Omega \times \Omega \) with \(0\le |\bar{x}-\bar{y}|^p\le 2\alpha -\beta \). This finishes the proof. \(\square \)
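
The closed parameter range \(0\le |x-y|^p\le 2\alpha -\beta \) in Lemma A.1, as opposed to the open one for \({\text {Ext}}(B)\), reflects that dipoles whose endpoints merge converge weakly* to the zero measure, so that \(\mathcal {D}_\beta (\bar{z},\bar{z})\) acts as the zero measure with the pairing used in the proof of Lemma A.4 below. A minimal numerical illustration of this effect, writing the pairing of a dipole with a continuous test function as \((\varphi (x)-\varphi (y))/(\beta +|x-y|^p)\); the test function and the parameter values are hypothetical choices:

```python
import numpy as np

# Toy check: dipoles whose endpoints merge pair to zero against continuous functions.
# beta, p, phi and the sample points are hypothetical choices for illustration only.
beta, p = 0.5, 1.0
phi = lambda s: np.sin(2 * np.pi * s)                    # continuous test function on [0, 1]
pair = lambda x, y: (phi(x) - phi(y)) / (beta + abs(x - y) ** p)

for eps in [1e-1, 1e-2, 1e-3, 1e-4]:
    print(eps, pair(0.4 + eps, 0.4 - eps))               # tends to <phi, 0> = 0 as eps -> 0
```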

In order to apply the abstract convergence result of [8], we have to check some structural assumptions. First, we show that, due to Assumption \((\textbf{B2})\), the linear problem

$$\begin{aligned} \max _{\mu \in \mathcal {B}} \langle \bar{q}, \mu \rangle \end{aligned}$$

admits finitely many maximizers and all of them are extremal points.

Lemma A.2

Let Assumption \((\textbf{B2})\) hold. Then, we have

$$\begin{aligned} {{\,\mathrm{arg\,max}\,}}_{\mu \in \mathcal {B}} \langle \bar{q}, \mu \rangle = \left\{ ({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha )\delta _{\bar{z}_i}\right\} ^{\bar{N}_1}_{i=1} \cup \left\{ \mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)\right\} ^{\bar{N}_2}_{j=1}. \end{aligned}$$

Proof

Define

$$\begin{aligned} D:=\big \{({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha )\delta _{\bar{z}_i}\big \}^{\bar{N}_1}_{i=1} \cup \left\{ \mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)\right\} ^{\bar{N}_2}_{j=1}. \end{aligned}$$

By assumption, D is nonempty and there holds \(\langle \bar{q}, \mu \rangle =1\) for all \(\mu \in D\). Moreover, since \(\bar{q}\) is the unique dual variable for Problem (\(\mathcal {P}\)) and \( \Vert \cdot \Vert _{{\text {KR}}_p^{\alpha , \beta }}\) is positively one-homogeneous, we conclude

$$\begin{aligned} \max _{\mu \in \mathcal {B}}\langle \bar{q},\mu \rangle =1 \quad \text {and thus}~D \subset {{\,\mathrm{arg\,max}\,}}_{\mu \in \mathcal {B}} \langle \bar{q}, \mu \rangle . \end{aligned}$$

The reverse inclusion follows immediately from Assumption \((\textbf{B2})\), which gives

$$\begin{aligned} \max _{z}|\bar{q}(z)|\le \alpha ,~\max _{(x,y)}\Psi _{\bar{q}}(x,y)\le 1, \end{aligned}$$

as well as noting that

$$\begin{aligned} \langle \bar{q},\ (\sigma /\alpha )\delta _z \rangle =1\text { for }\sigma \in \{-1,1\}\text { and }z \in \Omega&\text { implies}~|\bar{q}(z)|=\alpha ,\text { and}\\ \langle \bar{q},\ \mathcal {D}_\beta (x,y) \rangle =1 \text { with } 0\le |x-y|^p\le 2 \alpha -\beta&\text { is equivalent to }\Psi _{\bar{q}}(x,y)=1. \end{aligned}$$

\(\square \)

For abbreviation, set

$$\begin{aligned} \bar{\mu }^1_i= ({{\,\textrm{sign}\,}}(\bar{q}(\bar{z}_i))/\alpha ) \delta _{\bar{z}_i},~\bar{\mu }^2_j= \mathcal {D}_\beta (\bar{x}_j,\bar{y}_j) \quad \text {for all}~i=1,\dots , \bar{N}_1,~j=1,\dots , \bar{N}_2. \end{aligned}$$

Second, we have to show the existence of \(d_{\mathcal {B}}\)-neighborhoods \(U^1_i\) of \(\bar{\mu }^1_i\) and \(U^2_j\) of \(\bar{\mu }^2_j\) in \(\mathcal {B}\), respectively, as well as of a mapping \(g :{\text {Ext}}(B) \times {\text {Ext}}(B) \rightarrow \mathbb {R}\) and \(\theta ,~C_K >0\) with

$$\begin{aligned} \Vert K(\mu -\bar{\mu }^k_j)\Vert _Y \le C_K \, g(\mu , \bar{\mu }^k_j)\ \text { and }\ 1-\langle \bar{q},\mu \rangle \ge \theta \, g(\mu ,\bar{\mu }^k_j)^2 \end{aligned}$$
(40)

for all \(j=1, \dots , \bar{N}_k\), \(k=1,2\), and all \(\mu \in U^k_j \cap {\text {Ext}}(B)\). We claim that this is satisfied for

$$\begin{aligned} g(\mu _1,\mu _2) := {\left\{ \begin{array}{ll} |z_1-z_2|+|\sigma _1-\sigma _2| &{} \mu _1= (\sigma _1/\alpha ) \delta _{z_1},~\mu _2= (\sigma _2/\alpha ) \delta _{z_2},\\ &{} z_1,z_2 \in \Omega ,~\sigma _1,\sigma _2 \in \{-1,1\}, \\ \left| \begin{pmatrix} x_1-x_2 \\ y_1-y_2\end{pmatrix} \right| &{} \mu _1=\mathcal {D}_\beta (x_1,y_1),~\mu _2=\mathcal {D}_\beta (x_2,y_2),\\ &{} (x_1,y_1), (x_2,y_2) \in \Omega \times \Omega , \\ 0 &{} \text {else.} \end{array}\right. } \end{aligned}$$
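
For orientation, the case distinction defining \(g\) translates directly into code. The following sketch uses a hypothetical atom representation (Diracs stored as a sign and a position, dipoles as a pair of positions, all positions being one-dimensional NumPy arrays); it is only meant to mirror the formula above.

```python
import numpy as np

# Hypothetical atom encoding: ("dirac", sigma, z) or ("dipole", x, y), with
# z, x, y given as 1-D NumPy arrays. Mirrors the case distinction defining g.
def g(mu1, mu2):
    if mu1[0] == "dirac" and mu2[0] == "dirac":
        (_, s1, z1), (_, s2, z2) = mu1, mu2
        return np.linalg.norm(z1 - z2) + abs(s1 - s2)
    if mu1[0] == "dipole" and mu2[0] == "dipole":
        (_, x1, y1), (_, x2, y2) = mu1, mu2
        return np.linalg.norm(np.concatenate([x1 - x2, y1 - y2]))
    return 0.0  # atoms of different type fall into the "else" branch
```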

The proof is split into two parts. First, we characterize open \(d_{\mathcal {B}}\)-neighborhoods around the associated extremal points.

Lemma A.3

For \(0< R\) define the sets

$$\begin{aligned} U^1_i(R) :=\left\{ \,({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha )\delta _{z}\;|\;z \in B_R(\bar{z}_i)\,\right\} \quad \text {for all}~ i=1, \dots , \bar{N}_1, \end{aligned}$$

as well as

$$\begin{aligned} U^2_j(R) :=\left\{ \,\mathcal {D}_\beta (x,y)\;|\;(x,y) \in B_R(\bar{x}_j)\times B_R(\bar{y}_j)\,\right\} \quad \text {for all}~ j=1, \dots , \bar{N}_2. \end{aligned}$$

Then, \({U}^1_i(R)\) is a \(d_{\mathcal {B}}\)-neighborhood of \(({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha )\delta _{\bar{z}_i}\), \(i=1,\dots ,\bar{N}_1\), and \({U}^2_j(R)\) is a \(d_{\mathcal {B}}\)-neighborhood of \(\mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)\), \(j=1,\dots ,\bar{N}_2\). Moreover, for every \(R>0\) small enough, there holds \({U}^1_i(R),~{U}^2_j(R) \subset {\text {Ext}}(B)\).

Proof

Let indices \(i\in \{1, \dots , \bar{N}_1\}\) and \(j\in \{1, \dots , \bar{N}_2\}\) be arbitrary but fixed. We first show the claimed statement for \(U^2_j(R)\). Noting that \((\mathcal {B},d_{\mathcal {B}})\) is a metric space, it suffices to show that any sequence \(\{\mu _k\}_k \subset \mathcal {B}\) with \(d_{\mathcal {B}}(\mu _k, \mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)) \rightarrow 0\) lies in \(U^2_j(R)\) for all \(k \in \mathbb {N}\) large enough. For this purpose, assume that \(\{\mu _k\}_{k}\) admits a subsequence, denoted by the same symbol, of the form \(\mu _k=(\sigma _k/\alpha ) \delta _{z_k}\) for some \(\sigma _k \in \{-1,1\}\), \(z_k \in \Omega \). Then, by possibly selecting another subsequence, we get \((\sigma _k, z_k) \rightarrow (\bar{\sigma }, \bar{z})\) for some \(\bar{\sigma } \in \{-1,1\}\), \(\bar{z} \in \Omega \). Noting that weak* limits are unique and \((\bar{\sigma }/\alpha )\delta _{\bar{z}} \ne \mathcal {D}_\beta (\bar{x}_j, \bar{y}_j)\) yields a contradiction. In the same way, we exclude the existence of a subsequence with \(\mu _k=0\) for all k. Hence, for all \(k\in \mathbb {N}\) large enough, we have \(\mu _k=\mathcal {D}_\beta (x_k,y_k)\) for some \((x_k,y_k ) \in \Omega \times \Omega \) with \(0< |x_k-y_k|^p\le 2\alpha -\beta \). By a similar contradiction argument, \((x_k,y_k) \rightarrow (\bar{x}_j,\bar{y}_j)\) has to hold. Thus, for every \(k \in \mathbb {N}\) large enough, we have \((x_k,y_k) \in B_{R}(\bar{x}_j)\times B_R(\bar{y}_j)\) and thus \(\mu _k \in U^2_j(R)\), finishing this part of the proof. The statement for \(U^1_i(R)\) follows by a similar argument. In fact, if \(\{\mu _k\}_k \subset \mathcal {B} \) satisfies \(d_{\mathcal {B}}(\mu _k, \bar{\mu }^1_i) \rightarrow 0\), then \(\mu _k=(\sigma _k/\alpha ) \delta _{z_k}\), \(\sigma _k \in \{-1,1\}\), \(z_k\in \Omega \), for all k large enough since \(\bar{\mu }^1_i \ne \mathcal {D}_\beta (x,y)\) for every \((x,y)\in \Omega \times \Omega \). Moreover, from [8, Lemma 3.16], we get \(\sigma _k= {\text {sign}}(\bar{q}(\bar{z}_i))\) for all \(k \in \mathbb {N}\) large enough. Finally, if there is a subsequence of \(\{z_k\}_k\), denoted by the same symbol, with \(z_k \rightarrow \bar{z}\) for some \(\bar{z}\ne \bar{z}_i\), then we can choose \(\varphi \in \mathcal {C}(\Omega )\) satisfying \(\varphi (\bar{z})=0\) and \(\varphi (\bar{z}_i)=1\). For the corresponding subsequence of measures \(\mu _k\), we then obtain

$$\begin{aligned} \langle \varphi , \mu _k \rangle = (\sigma _k/\alpha ) \varphi (z_k) \rightarrow ({{\,\textrm{sign}\,}}(\bar{q}(\bar{z}_i))/\alpha ) \varphi (\bar{z})=0 \ne \langle \varphi ,\bar{\mu }^1_i \rangle \end{aligned}$$

yielding a contradiction and thus \(\bar{z}=\bar{z}_i\). \(\square \)

Next we prove the Lipschitz and quadratic growth properties from (40).

Lemma A.4

There are \(R_1,C_K>0\) with

$$\begin{aligned} \Vert K(\mu -\bar{\mu }^\ell _j) \Vert _Y \le C_K \, g(\mu , \bar{\mu }^\ell _j) \end{aligned}$$

for all \(\mu \in U^\ell _j(R_1)\)\(j=1,\dots ,\bar{N}_\ell \)\(\ell =1,2\).

Proof

By assumption, \(K_* :Y \rightarrow {\text {Lip}}(\Omega )\) is continuous. As a consequence, we immediately get

$$\begin{aligned} \Vert K(\delta _z-\delta _{\bar{z}_i}) \Vert _Y&= \sup _{\Vert v\Vert _Y \le 1} \langle K_* v, \delta _z-\delta _{\bar{z}_i}\rangle = \sup _{\Vert v\Vert _Y \le 1} \left[[K_* v ](z)-[K_* v ](\bar{z}_i)\right]\\&\le \Vert K_*\Vert _{Y, {\text {Lip}}} |z-\bar{z}_i| \end{aligned}$$

for all \(z\in \Omega \). For \(\mathcal {D}_\beta (\bar{x}_j, \bar{y}_j)\) we can argue similarly. For this purpose, if \(R_1>0\) is small enough, we have

$$\begin{aligned} \left| |\bar{x}_j-\bar{y}_j|^p-|x-y|^p \right| \le c \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| \end{aligned}$$

for all \((x,y) \in B_{R_1}(\bar{x}_j) \times B_{R_1}(\bar{y}_j)\) since \(|\bar{x}_j-\bar{y}_j|>0\). As a consequence, we get

$$\begin{aligned} \Vert K(\mathcal {D}_\beta (x,y)-\mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)) \Vert _Y&= \sup _{\Vert v\Vert _Y \le 1} \langle K_* v, \mathcal {D}_\beta (x,y)-\mathcal {D}_\beta (\bar{x}_j,\bar{y}_j)\rangle \\&= \sup _{\Vert v\Vert _Y \le 1} \left[ \frac{[K_*v ](x)-[K_*v ](y)}{\beta + |x-y|^p}-\frac{[K_*v ](\bar{x}_j)-[K_*v ](\bar{y}_j)}{\beta + |\bar{x}_j-\bar{y}_j|^p} \right]\\&\le D_1+ D_2, \end{aligned}$$

where we abbreviate

$$\begin{aligned} D_1&:=\frac{\Vert K_*\Vert _{Y,{\text {Lip}}}(|x-\bar{x}_j|+|y-\bar{y}_j|)}{\beta + |\bar{x}_j-\bar{y}_j|^p} \le c \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| \end{aligned}$$

as well as

$$\begin{aligned} D_2&:=\left( \frac{1}{(\beta + |x-y|^p)}-\frac{1}{(\beta + |\bar{x}_j-\bar{y}_j|^p)} \right) \left( [K_*v ](x)-[K_*v ](y)\right) \\&\le 2 \Vert K_*\Vert _{Y, \mathcal {C}} \left( \frac{|\bar{x}_j-\bar{y}_j|^p-|x-y|^p}{(\beta + |x-y|^p)(\beta + |\bar{x}_j-\bar{y}_j|^p)} \right) \\&\le \frac{2c \Vert K_*\Vert _{Y, \mathcal {C}}}{\beta ^2} \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| . \end{aligned}$$

The claimed statement then follows by definition of \(U^1_i(R_1)\) and \(U^2_j(R_1)\) from Lemma A.3 and noting that

$$\begin{aligned} g(\mu , \bar{\mu }^1_i)=|z-\bar{z}_i| \quad \text {for all}~\mu = ({{\,\textrm{sign}\,}}(\bar{q}(\bar{z}_i))/\alpha )\delta _z \in U^1_i(R_1) \end{aligned}$$

as well as

$$\begin{aligned} g(\mu , \bar{\mu }^2_j)=\left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| \quad \text {for all}~\mu = \mathcal {D}_\beta (x,y) \in U^2_j(R_1). \end{aligned}$$

Since all involved constants are independent of i and j, respectively, we conclude. \(\square \)
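
As a sanity check, the Lipschitz estimate of Lemma A.4 for dipoles can be probed numerically with a toy forward operator. In the sketch below, \(K\) maps a measure to point evaluations of a smooth Gaussian kernel integrated against it; the kernel, measurement points and all parameter values are hypothetical choices, serving only to illustrate that the quotient in the estimate stays bounded near a reference dipole.

```python
import numpy as np

# Toy numerical check of the dipole Lipschitz estimate from Lemma A.4. The forward
# operator K (kernel measurements of the measure) and all parameters are hypothetical.
beta, p = 0.5, 1.0
t = np.linspace(0.0, 1.0, 50)                      # measurement locations defining K

def K_dipole(x, y):
    # K applied to D_beta(x, y) = (delta_x - delta_y) / (beta + |x - y|^p)
    k = lambda s: np.exp(-0.5 * ((t - s) / 0.1) ** 2)
    return (k(x) - k(y)) / (beta + abs(x - y) ** p)

xb, yb = 0.3, 0.7                                  # reference dipole with |xb - yb| > 0
rng = np.random.default_rng(0)
ratios = []
for _ in range(1000):
    dx, dy = 1e-3 * rng.standard_normal(2)         # small perturbations of (xb, yb)
    diff = np.linalg.norm(K_dipole(xb + dx, yb + dy) - K_dipole(xb, yb))
    ratios.append(diff / np.hypot(dx, dy))
print("max ratio ||K(D(x,y) - D(xb,yb))|| / |(x,y) - (xb,yb)|:", max(ratios))  # stays bounded
```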

Proposition A.5

Let Assumption \((\textbf{B3})\) hold. Then, there are \(\theta >0 \) and a radius \(0<R_2\) with

$$\begin{aligned} 1-\langle \bar{q}, \mu \rangle \ge \theta \, g(\mu ,\bar{\mu }^\ell _j)^2 \quad \text {for all}~\mu \in U^\ell _j(R_2), \end{aligned}$$

and \(j=1,\dots ,\bar{N}_\ell \)\(\ell =1,2\).

Proof

Since \(\bar{z}_i \in {\text {int}}\Omega \) is a global extremum of \(\bar{q}\) and \((\bar{x}_j,\bar{y}_j) \in {\text {int}}\Omega \times {\text {int}}\Omega \) is a global maximum of \(\Psi _{\bar{q}}\), we have \(\nabla \bar{q}(\bar{z}_i)=0\) and \(\nabla \Psi _{\bar{q}}(\bar{x}_j, \bar{y}_j)=0\), respectively. Using the non-degeneracy of the associated Hessians, see Assumption \((\textbf{B3})\), and the continuity of \(\bar{q}\), we conclude the existence of \(R_2>0\) as well as of \(\theta >0\) with

$$\begin{aligned} {{\,\textrm{sign}\,}}(\bar{q}(z))= {{\,\textrm{sign}\,}}(\bar{q}(\bar{z}_i)),~1-|\bar{q}(z)|/\alpha \ge \theta \, |z-\bar{z}_i|^2 \quad \text {for all}~z \in B_{R_2}(\bar{z}_i), \end{aligned}$$

as well as

$$\begin{aligned} 1-\Psi _{\bar{q}}(x,y) \ge \theta \, \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| ^2 \quad \text {for all}~(x,y) \in B_{R_2}(\bar{x}_j,\bar{y}_j), \end{aligned}$$

by Taylor’s expansion. This implies

$$\begin{aligned} 1-\langle \bar{q}, \mu _1 \rangle =1-{{\,\textrm{sign}\,}}(\bar{q}(z)) \bar{q}(z)/\alpha =1-|\bar{q}(z)|/\alpha \ge \theta \, |z-\bar{z}_i|^2= \theta \,g(\mu _1, \bar{\mu }^1_i)^2, \end{aligned}$$

as well as

$$\begin{aligned} 1-\langle \bar{q}, \mu _2 \rangle =1-\Psi _{\bar{q}}(x,y) \ge \theta \, \left| \begin{pmatrix} x-\bar{x}_j \\ y-\bar{y}_j\end{pmatrix} \right| ^2 = \theta \, g(\mu _2,\bar{\mu }^2_j)^2 \end{aligned}$$

for all

$$\begin{aligned} \mu _1= ({\text {sign}}(\bar{q}(\bar{z}_i))/\alpha ) \delta _z \in U^1_i(R_2)~\quad \text {and} \quad \mu _2= \mathcal {D}_\beta (x,y) \in U^2_j(R_2). \end{aligned}$$
(41)

By Lemma A.3, all elements of \(U^1_i(R_2)\) and \(U^2_j(R_2)\), respectively, are of the form (41), thus finishing the proof. \(\square \)
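
To make the role of the non-degenerate Hessians in Assumption \((\textbf{B3})\) explicit, consider a maximizer with \(\bar{q}(\bar{z}_i)=\alpha \) (the case \(\bar{q}(\bar{z}_i)=-\alpha \) is symmetric). One way to spell out the Taylor step, under the assumption that \(-\nabla ^2\bar{q}(\bar{z}_i)\) is positive definite with smallest eigenvalue \(\lambda >0\), is

$$\begin{aligned} \bar{q}(z) = \bar{q}(\bar{z}_i) + \tfrac{1}{2} (z-\bar{z}_i)^\top \nabla ^2 \bar{q}(\bar{z}_i)(z-\bar{z}_i) + o(|z-\bar{z}_i|^2) \le \alpha - \tfrac{\lambda }{2}\,|z-\bar{z}_i|^2 + o(|z-\bar{z}_i|^2), \end{aligned}$$

so that \(1-|\bar{q}(z)|/\alpha \ge (\lambda /(4\alpha ))\,|z-\bar{z}_i|^2\) for all \(z\) sufficiently close to \(\bar{z}_i\); the analogous expansion of \(\Psi _{\bar{q}}\) around \((\bar{x}_j,\bar{y}_j)\) yields the second estimate, and \(\theta \) can then be taken as the minimum of the resulting constants over \(i\) and \(j\).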

Summarizing the previous observations, we conclude Theorem 3.15 using the results from [8]:

Proof of Theorem 3.15

Summarizing our previous observations, we have that:

  • The function F is strongly convex around the optimal observation \(\bar{y}\), see Assumption \((\textbf{B2})\).

  • According to Lemma A.2, there exists \(\{\bar{\mu }_j\}^{\bar{N}}_{j=1} \subset {\text {Ext}}(B)\) with \({{\,\mathrm{arg\,max}\,}}_{\mu \in \mathcal {B}}\langle \bar{q}, \mu \rangle =\{\bar{\mu }_j\}^{\bar{N}}_{j=1}\).

  • The set \(\{\bar{\mu }_j\}^{\bar{N}}_{j=1}\) is linearly independent, see Assumption \((\textbf{B4})\).

  • The unique solution \(\bar{u}=\sum ^{\bar{N}}_{j=1} \bar{\gamma }_j \bar{\mu }_j\) satisfies \(\bar{\gamma }_j>0\), see Assumption \((\textbf{B5})\).

  • There are \(d_{\mathcal {B}}\)-neighborhoods \(U_j\) of \(\bar{\mu }_j\) for \(j=1,\dots ,\bar{N}\), a function \(g :{\text {Ext}}(B) \times {\text {Ext}}(B) \rightarrow \mathbb {R}\) and \(C_K,\theta >0\) with

    $$\begin{aligned} \Vert K(\mu -\bar{\mu }_j)\Vert _Y \le C_K \, g(\mu , \bar{\mu }_j) \quad \text {and} \quad 1-\langle \bar{q}, \mu \rangle \ge \theta \, g(\mu ,\bar{\mu }_j)^2 \quad \text {for all}~\mu \in U_j \cap {\text {Ext}}(B). \end{aligned}$$

Consequently, the assumptions of [8, Theorem 3.8] are satisfied, and applying it we conclude the linear convergence of Theorem 3.15. \(\square \)
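
For readers interested in what the fully corrective generalized conditional gradient iteration behind Theorem 3.15 can look like in practice, the following is a minimal, self-contained sketch on a one-dimensional grid. It is emphatically not the authors' implementation: the forward operator (point evaluations of a Gaussian kernel), the synthetic data, the parameter values and the projected-gradient solver used as a surrogate for the fully corrective coefficient step are all hypothetical choices made only for illustration.

```python
import numpy as np

# Minimal toy sketch of a fully corrective generalized conditional gradient loop
# over Dirac and dipole atoms. Grid, kernel, data, parameters and the coefficient
# solver are hypothetical illustrations, not the setup used in the paper.
alpha, beta, p = 1.0, 0.5, 1.0
grid = np.linspace(0.0, 1.0, 201)                      # discretization of Omega = [0, 1]
t = np.linspace(0.0, 1.0, 30)                          # measurement points defining K
kern = lambda s: np.exp(-0.5 * ((t[:, None] - np.atleast_1d(s)[None, :]) / 0.08) ** 2)

def K_atom(atom):
    """Toy forward operator applied to one extremal atom of the KR unit ball."""
    kind, a, b = atom
    if kind == "dirac":                                # (sign/alpha) * delta_z
        return (a / alpha) * kern(b)[:, 0]
    return (kern(a)[:, 0] - kern(b)[:, 0]) / (beta + abs(a - b) ** p)   # dipole D_beta(a, b)

yd = 0.8 * kern(0.25)[:, 0] - 0.5 * kern(0.65)[:, 0]   # synthetic data from two spikes

atoms, gamma = [], np.zeros(0)
for it in range(10):
    residual = sum(c * K_atom(a) for c, a in zip(gamma, atoms)) - yd if atoms else -yd
    q = -kern(grid).T @ residual                       # q = -K*(Ku - yd) sampled on the grid
    # insertion step: best Dirac and best dipole candidate (cf. Lemmas A.1 and A.2)
    i = int(np.argmax(np.abs(q)))
    cand_dirac = (("dirac", np.sign(q[i]), grid[i]), np.abs(q[i]) / alpha)
    psi = (q[:, None] - q[None, :]) / (beta + np.abs(grid[:, None] - grid[None, :]) ** p)
    psi[np.abs(grid[:, None] - grid[None, :]) ** p > 2 * alpha - beta] = -np.inf
    j, k = np.unravel_index(np.argmax(psi), psi.shape)
    cand_dipole = (("dipole", grid[j], grid[k]), psi[j, k])
    new_atom, val = max(cand_dirac, cand_dipole, key=lambda c: c[1])
    if val <= 1.0 + 1e-9:                              # dual certificate satisfied: stop
        break
    atoms.append(new_atom)
    # surrogate for the fully corrective step: min_{gamma >= 0} 0.5*||A gamma - yd||^2 + sum(gamma)
    A = np.column_stack([K_atom(a) for a in atoms])
    gamma, step = np.zeros(len(atoms)), 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(2000):                              # projected gradient on the coefficients
        gamma = np.maximum(gamma + step * (A.T @ (yd - A @ gamma) - 1.0), 0.0)
    print(f"iter {it}: {len(atoms)} atoms, misfit {0.5 * np.linalg.norm(A @ gamma - yd) ** 2:.3e}")
```

The insertion step scans the Dirac and dipole candidates of Lemma A.1, in line with the structure of the maximizers established in Lemma A.2, and the loop terminates once the dual certificate \(\max _{\mu \in \mathcal {B}}\langle q,\mu \rangle \le 1\) holds on the grid.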

About this article

Cite this article

Carioni, M., Iglesias, J.A. & Walter, D. Extremal Points and Sparse Optimization for Generalized Kantorovich–Rubinstein Norms. Found Comput Math (2023). https://doi.org/10.1007/s10208-023-09634-7
