Fast Optimistic Gradient Descent Ascent (OGDA) Method in Continuous and Discrete Time

Boţ, Radu Ioan; Csetnek, Ernö Robert; Nguyen, Dang-Khoa

doi:10.1007/s10208-023-09636-5

Fast Optimistic Gradient Descent Ascent (OGDA) Method in Continuous and Discrete Time

Open access
Published: 29 November 2023

(2023)
Cite this article

Download PDF

You have full access to this open access article

Foundations of Computational Mathematics Aims and scope Submit manuscript

Fast Optimistic Gradient Descent Ascent (OGDA) Method in Continuous and Discrete Time

Download PDF

Radu Ioan Boţ¹,
Ernö Robert Csetnek¹ &
Dang-Khoa Nguyen^1,2

797 Accesses
1 Citation
Explore all metrics

Abstract

In the framework of real Hilbert spaces, we study continuous in time dynamics as well as numerical algorithms for the problem of approaching the set of zeros of a single-valued monotone and continuous operator V. The starting point of our investigations is a second-order dynamical system that combines a vanishing damping term with the time derivative of V along the trajectory, which can be seen as an analogous of the Hessian-driven damping in case the operator is originating from a potential. Our method exhibits fast convergence rates of order $o \left( \frac{1}{t\beta (t)} \right) $ for $\Vert V(z(t))\Vert $, where $z(\cdot )$ denotes the generated trajectory and $\beta (\cdot )$ is a positive nondecreasing function satisfying a growth condition, and also for the restricted gap function, which is a measure of optimality for variational inequalities. We also prove the weak convergence of the trajectory to a zero of V. Temporal discretizations of the dynamical system generate implicit and explicit numerical algorithms, which can be both seen as accelerated versions of the Optimistic Gradient Descent Ascent (OGDA) method for monotone operators, for which we prove that the generated sequence of iterates $(z_k)_{k \ge 0}$ shares the asymptotic features of the continuous dynamics. In particular we show for the implicit numerical algorithm convergence rates of order $o \left( \frac{1}{k\beta _k} \right) $ for $\Vert V(z^k)\Vert $ and the restricted gap function, where $(\beta _k)_{k \ge 0}$ is a positive nondecreasing sequence satisfying a growth condition. For the explicit numerical algorithm, we show by additionally assuming that the operator V is Lipschitz continuous convergence rates of order $o \left( \frac{1}{k} \right) $ for $\Vert V(z^k)\Vert $ and the restricted gap function. All convergence rate statements are last iterate convergence results; in addition to these, we prove for both algorithms the convergence of the iterates to a zero of V. To our knowledge, our study exhibits the best-known convergence rate results for monotone equations. Numerical experiments indicate the overwhelming superiority of our explicit numerical algorithm over other methods designed to solve monotone equations governed by monotone and Lipschitz continuous operators.

Relaxed Inertial Method for Solving Split Monotone Variational Inclusion Problem with Multiple Output Sets Without Co-coerciveness and Lipschitz Continuity

Article 15 April 2024

Tseng’s extragradient method with double projection for solving pseudomonotone variational inequality problems in Hilbert spaces

Article 10 April 2024

Random Gradient-Free Minimization of Convex Functions

Article 30 November 2015

1 Introduction

Let ${\mathcal {H}}$ be a real Hilbert space and $V :{\mathcal {H}}\rightarrow {\mathcal {H}}$ a monotone and continuous operator. We are interested in developing fast converging methods aimed to find a zero of V, or in other words, to solve the monotone equation

$$\begin{aligned} V \left( z \right) = 0, \end{aligned}$$

(1)

for which assume that it has a nonempty solution set ${\mathcal {Z}}$. The monotonicity and the continuity of V imply that $z_{*}$ is a solution of 1 if and only if it is a solution of the following variational inequality

$$\begin{aligned} \left\langle z - z_{*} , V \left( z \right) \right\rangle \ge 0 \quad \forall z \in {\mathcal {H}}. \end{aligned}$$

(2)

One of the main motivations to study 1 comes from minimax problems. More precisely, consider the problem

$$\begin{aligned} \min \limits _{x \in {\mathcal {X}}} \max \limits _{y \in {\mathcal {Y}}} \Phi \left( x , y \right) , \end{aligned}$$

(3)

where ${\mathcal {X}}$ and ${\mathcal {Y}}$ are real Hilbert spaces and $\Phi :{\mathcal {X}}\times {\mathcal {Y}}\rightarrow {\mathbb {R}}$ is a continuously differentiable and convex–concave function, i.e., $\Phi \left( \cdot , y \right) $ is convex for every $y \in {\mathcal {Y}}$ and $\Phi \left( x, \cdot \right) $ is convex for every $x \in {\mathcal {X}}$. A solution of 3 is a saddle point $\left( x_{*}, y_{*} \right) \in {\mathcal {X}}\times {\mathcal {Y}}$ of $\Phi $, which means that it fulfills

$$\begin{aligned} \Phi \left( x_{*} , y \right) \le \Phi \left( x_{*} , y_{*} \right) \le \Phi \left( x , y_{*} \right) \quad \forall \left( x , y \right) \in {\mathcal {X}}\times {\mathcal {Y}}\end{aligned}$$

or, equivalently,

$$\begin{aligned} {\left\{ \begin{array}{ll} \nabla _{x} \Phi \left( x_{*} , y_{*} \right) &{} = 0 \\ - \nabla _{y} \Phi \left( x_{*} , y_{*} \right) &{} = 0. \end{array}\right. } \end{aligned}$$

(4)

Taking into account that the mapping

$$\begin{aligned} \left( x , y \right) \mapsto \Bigl ( \nabla _{x} \Phi \left( x , y \right) , - \nabla _{y} \Phi \left( x , y \right) \Bigr ) \end{aligned}$$

(5)

is monotone [43], it means that the problem of finding a saddle point of $\Phi $ eventually brings us back to the problem 1.

Both 1 and 3 are fundamental models in various fields such as optimization, economics, game theory and partial differential equations. They have recently regained significant attention, in particular in the machine learning and data science community, due to the fundamental role they play, for instance, in multi-agent reinforcement learning [37], robust adversarial learning [32] and generative adversarial networks (GANs) [18, 24].

In this paper, we develop fast continuous in time dynamics as well as numerical algorithms for solving 1 and investigate their asymptotic/convergence properties. First we formulate a second-order dynamical system that combines a vanishing damping term with the time derivative of V along the trajectory, which can be seen as an analogous of the Hessian-driven damping in case the operator is originating from a potential. A continuously differentiable and nondecreasing function $\beta :\left[ t_{0}, + \infty \right) \rightarrow \left( 0, + \infty \right) $, which appears in the system, plays an important role in the analysis. If $\beta $ satisfies a specific growth condition, which is for instance satisfied by polynomials including constant functions, then the method exhibits convergence rates of order $o \left( \frac{1}{t\beta (t)} \right) $ for $\Vert V(z(t))\Vert $, where z(t) denotes the generated trajectory, and for the restricted gap function associated with 2. In addition, z(t) converges asymptotically weakly to a solution of 1.

By considering a temporal discretization of the dynamical system, we obtain an implicit numerical algorithm which exhibits convergence rates of order $o \left( \frac{1}{k \beta _{k}} \right) $ for $\Vert V(z^k)\Vert $ and the restricted gap function associated with 2, where $(\beta _k)_{k \ge 0}$ is a nondecreasing sequence and $(z_k)_{k \ge 0}$ is the generated sequence of iterates. For the latter, we also prove that it converges weakly to a solution of 1.

By a further more involved discretization of the dynamical system, we obtain an explicit numerical algorithm, which, under the additional assumption that V is Lipschitz continuous, exhibits convergence rates of order $o \left( \frac{1}{k} \right) $ for $\Vert V(z^k)\Vert $ and the restricted gap function associated with 2, where $(z_k)_{k \ge 0}$ is the generated sequence of iterates, which is also to converge weakly to a solution of 1.

The resulting numerical schemes can be seen as accelerated versions of the Optimistic Gradient Descent Ascent (OGDA) method [33, 42] formulated in terms of a general monotone operator V. It should be also emphasized that the convergence rate statements for both the implicit and the explicit numerical algorithm are last iterate convergence results and are, to our knowledge, the best-known convergence rate results for monotone equations.

1.1 Related Works

In the following, we discuss some discrete and continuous methods from the literature designed to solve equations governed by monotone and (Lipschitz) continuous, and not necessarily cocoercive operators. It has been recognized that the simplest scheme one can think of, namely the forward algorithm, which, for a starting point $z^0 \in {\mathcal {H}} $ and a given step size $s >0$, reads for $k \ge 0$

$$\begin{aligned} z^{k+1} := z^{k} - sV \left( z^{k} \right) , \end{aligned}$$

and mimics the classical gradient descent algorithm, does not converge. Unless for the trivial case, the operator in 5, which arises in connection with minimax problems, is only monotone and Lipschitz continuous but not cocoercive. Therefore, it was early recognized that explicit numerical methods for monotone equations require an operator corrector term.

In case V is monotone and L-Lipschitz continuous, for $L >0$, Korpelevich [30] and Antipin [2] proposed to solve 1 the nowadays very popular Extragradient (EG) method, which reads for $k \ge 0$

$$\begin{aligned} \begin{aligned} \bar{z}^{k}&:= z^{k} - sV \left( z^{k} \right) \\ z^{k+1}&:= z^{k} - sV \left( \bar{z}^{k} \right) , \end{aligned} \end{aligned}$$

(6)

and converges for a starting point $z^0 \in {{\mathcal {H}}} $ and $0< s < \frac{1}{L}$ to a zero of V. The last iterate convergence rate for the Extragradient method was only recently derived by Gorbuno-Loizou-Gidel in [25]. For $\bar{z}\in {\mathcal {H}}$ and $\delta > 0$, we denote ${\mathbb {B}}\left( \bar{z}; \delta \right) := \left\{ u \in {\mathcal {H}}:\left\Vert \bar{z}- u \right\Vert \le \delta \right\} $. For $z_{*} \in {\mathcal {Z}}\text { and } \delta \left( z^{0} \right) := \left\Vert z_{*} - z^{0} \right\Vert $, the restricted gap function associated with the variational inequality 2 is defined as (see [36])

$$\begin{aligned} \texttt{Gap}\left( z \right) := \sup \limits _{u \in {\mathbb {B}}\left( z_{*} ; \delta \left( z^{0} \right) \right) } \left\langle z - u , V \left( u \right) \right\rangle \ge 0. \end{aligned}$$

In [25], it was shown that

$$\begin{aligned} \left\Vert V \left( z^{k} \right) \right\Vert = {\mathcal {O}}\left( \dfrac{1}{\sqrt{k}} \right) \text{ and } \texttt{Gap}\left( z^{k} \right) = {\mathcal {O}}\left( \dfrac{1}{\sqrt{k}} \right) \quad \text {as} \ k \rightarrow + \infty . \end{aligned}$$

In the same setting, Popov introduced in [42] for minmax problems and the operator in 5 the following algorithm which, when formulated for 1, reads for $k \ge 1$

$$\begin{aligned} z^{k+1} := z^{k} - 2s V \left( z^{k} \right) + s V \left( z^{k-1} \right) , \end{aligned}$$

(7)

and converges for starting points $z^0, z^1 \in {{\mathcal {H}}} $ and step size $0< s < \frac{1}{2L}$ to a zero of V. This algorithm is usually known as the Optimistic Gradient Descent Ascent (OGDA) method, a name which we adopt also for the general formulation in 7. Recently, Chavdarova–Jordan–Zampetakis proved in [19] that for $0< s < \frac{1}{16L}$ the scheme exhibits the following best-iterate convergence rate

$$\begin{aligned} \min \limits _{1 \le i \le k} \left\Vert V \left( z^{i} \right) \right\Vert = {\mathcal {O}}\left( \dfrac{1}{\sqrt{k}} \right) \quad \text { as } k \rightarrow + \infty . \end{aligned}$$

We notice also that, according to Golowich–Pattathil–Daskalakis–Ozdaglar (see [22, 23]), the lower bound for the restricted gap function for the algorithms 6 and 7 is of ${\mathcal {O}}\left( 1 / \sqrt{k} \right) $ as $k \rightarrow +\infty $.

The solving of equation 1 can be also addressed in the general framework of continuous and discrete-time methods for finding the zeros of a maximally monotone operator. Attouch–Svaiter introduced in [14] (see also [20]) a first-order evolution equation linked to the Newton and the Levenberg–Marquardt methods, which when applied to 1 reads

$$\begin{aligned} {\dot{z}} \left( t \right) + \lambda (t) \dfrac{d}{dt}V(z(t)) + \lambda (t) V(z(t)) = 0, \end{aligned}$$

(8)

where $t \mapsto \lambda (t)$ is a continuous mapping, and for which they proved that its trajectories converge weakly to a zero of V. Attouch–Peypouquet studied in [12] the following second-order differential equation with vanishing damping

$$\begin{aligned} \ddot{z} \left( t \right) + \dfrac{\alpha }{t} {\dot{z}} \left( t \right) + A_{\gamma \left( t \right) } \left( z \left( t \right) \right) = 0, \end{aligned}$$

(9)

where $A :{\mathcal {H}}\rightrightarrows {\mathcal {H}}$ is a possibly set-valued maximally monotone operator,

$$\begin{aligned} A_{\gamma } := \dfrac{1}{\gamma } \left( \text {Id}- J_{\gamma A} \right) \end{aligned}$$

stands for the Yosida approximation of A of index $\gamma > 0$, and $J_{\gamma A} = (\text {Id}+ \gamma A)^{-1}: {{\mathcal {H}}} \rightarrow {{\mathcal {H}}}$ stands for the resolvent of $\gamma A$. The dynamical system 9 gives rise via implicit discretization to the following so-called Regularized Inertial Proximal Algorithm, which for every $k \ge 1$ reads

$$\begin{aligned} \begin{aligned} \bar{z}^{k}&:= z^{k} + \left( 1 - \dfrac{\alpha }{k} \right) \left( z^{k} - z^{k-1} \right) \\ z^{k+1}&:= \dfrac{\gamma _{k}}{\gamma _{k} + s} \bar{z}^{k} + \dfrac{s}{\gamma _{k} + s} J_{\left( \gamma _{k} + s \right) A} \left( \bar{z}^{k} \right) , \end{aligned} \end{aligned}$$

$z^0, z^1 \in {{\mathcal {H}}}$ are the starting points, $\alpha >2$, $s >0$ and $\gamma _k = (1+\varepsilon )\frac{s}{\alpha ^2}k^2$ for every $k\ge 1$, with $\varepsilon >0$ fixed. In [12], it was shown that the discrete velocity $z^{k+1} -z^k$ vanishes with a rate of convergence of ${\mathcal {O}}\left( 1 / k \right) $ as $k \rightarrow +\infty $ and that the sequence of iterates converges weakly to a zero of A. The continuous time approach in 9 has been extended by Attouch–László in [9] by adding a Newton-like correction term $\xi \frac{d}{dt} \left( A_{\gamma \left( t \right) } \left( z \left( t \right) \right) \right) $, with $\xi \ge 0$, whereas the discrete counterpart of this scheme was proposed and investigated in [10].

For an inertial evolution equation with asymptotically vanishing damping terms approaching the set of primal-dual solutions of a smooth convex optimization problem with linear equality constraints, that can also be seen as the solution set of a monotone operator equation, and exhibiting fast convergence rates expressed in terms of the value functions, the feasibility measure and the primal-dual gap, we refer to the recent works [6, 17].

We also want to mention the implicit method for finding the zeros of a maximally monotone operator proposed by Kim in [29], which relies on the performance estimation problem approach and makes use of computer-assisted tools.

In the case when V is monotone and L-Lipschitz continuous, for $L >0$, Yoon-Ryu recently proposed in [49] an accelerated algorithm for solving 1, called Extra Anchored Gradient (EAG) algorithm, designed by using anchor variables, a technique that can be traced back to Halpern’s algorithm (see [28]). The iterative scheme of the EAG algorithm reads for every $k \ge 0$

$$\begin{aligned} \begin{aligned} \bar{z}^{k}&:= z^{k} + \dfrac{1}{k+2} \left( z^{0} - z^{k} \right) - s_kV \left( z^{k} \right) \\ z^{k+1}&:= z^{k} + \dfrac{1}{k+2} \left( z^{0} - z^{k} \right) - s_kV \left( \bar{z}^{k} \right) , \end{aligned} \end{aligned}$$

(10)

where $z^0 \in {{\mathcal {H}}}$ is the starting point and the sequence of step sizes $(s_k)_{k \ge 0}$ is either chosen to be equal to a constant in the interval $\left( 0,\frac{1}{8L}\right] $ or such that

$$\begin{aligned} s_{k+1} := s_{k} \left( 1 - \dfrac{1}{\left( k+1 \right) \left( k+3 \right) } \dfrac{s_{k}^2 L^{2}}{1 - s_{k}^2 L^{2}} \right) \quad \forall k \ge 0, \end{aligned}$$

(11)

where $s_0 \in \left( 0,\frac{3}{4}{L}\right) $. This iterative scheme exhibits in both cases the convergence rate of

$$\begin{aligned} \left\Vert V \left( z^{k} \right) \right\Vert = {\mathcal {O}}\left( \dfrac{1}{k} \right) \quad \text { as } k \rightarrow + \infty . \end{aligned}$$

Later, Lee-Kim proposed in [31] an algorithm formulated in the same spirit for the problem of finding the saddle points of a smooth nonconvex-nonconcave function.

Further variants of the anchoring-based method have been proposed by Tran-Dinh in [47] and together with Luo in [48], which all exhibit the same convergence rate for $\Vert V(z^k)\Vert $ as EAG. Tran-Dinh in [47] and Park-Ryu in [40] pointed out the existence of some connections between the anchoring approach and Nesterov’s acceleration technique used for the minimization of smooth and convex functions [34, 35].

1.2 Our Contributions

The starting point of our investigations is a second-order evolution equation associated with problem 1 that combines a vanishing damping term with the time derivative of V along the trajectory, which will then lead via temporal discretizations to the implicit and the explicit algorithms. In [19], several dynamical systems of EG and OGDA type were proposed, mainly in the spirit of the heavy ball method, that is, with a constant damping term, exhibiting a convergence rate of $\Vert V(z(t))\Vert = {\mathcal {O}}\left( 1/\sqrt{t} \right) $ as $t \rightarrow + \infty $ and, in case V is bilinear, weak convergence of the trajectory z(t) to a zero of the operator.

One of the main discoveries of the last decade was that asymptotically vanishing damping terms (see [4, 13, 46]) lead to the acceleration of the convergence of the value functions along the trajectories of a inertial gradient systems. Moreover, when enhancing the evolution equations also with Hessian-driven damping terms, the rate of convergence of the gradient along the trajectories can be accelerated, too [13, 45]. It is natural to ask whether asymptomatically vanishing damping terms have the same accelerating impact on the values of the norm of the governing operator along the trajectories of inertial dynamical systems associated with monotone (not necessarily potential) operators.

For $z_{*}\in {\mathcal {Z}}$ and the dynamics generated by this dynamical system, we will prove that

$$\left\langle z \left( t \right) - z_{*}, V \left( z \left( t \right) \right) \right\rangle = {\mathcal {O}}\left( \dfrac{1}{t \beta \left( t \right) } \right) \ \text{ and } \ \left\Vert V \left( z \left( t \right) \right) \right\Vert = {\mathcal {O}}\left( \dfrac{1}{t \beta \left( t \right) } \right) .$$

Further, by assuming that

$$\begin{aligned} 0 \le \sup \limits _{t \ge t_{0}} \dfrac{t {\dot{\beta }} \left( t \right) }{\beta \left( t \right) } < \alpha - 2, \end{aligned}$$

we will prove that the trajectory $z \left( t \right) $ converges weakly to a solution of 1 as $t \rightarrow + \infty $ and it holds

$$\begin{aligned} \left\Vert {\dot{z}} \left( t \right) \right\Vert = o \left( \dfrac{1}{t} \right) \quad \text { as } t \rightarrow + \infty , \end{aligned}$$

and

$$\begin{aligned} \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle = o \left( \dfrac{1}{t \beta \left( t \right) } \right) \text{ and } \left\Vert V \left( z \left( t \right) \right) \right\Vert = o \left( \dfrac{1}{t \beta \left( t \right) } \right) \quad \text { as } t \rightarrow + \infty . \end{aligned}$$

Polynomial parameter functions $\beta (t) = \beta _0 t^\rho $, for $\beta _0>0$ and $\rho \ge 0$, satisfy the two growth conditions for $\alpha \ge \rho + 2$ and $\alpha > \rho + 2$, respectively.

To the main contributions of this work belongs not only the improvement of the convergence rates in [19] in both continuous and discrete time, but in particular the surprising discovery that this can be achieved by means of asymptotically vanishing damping, respectively, as we will see below, of Nesterov momentum. This shows that the accelerating effect of inertial methods with asymptotically vanishing damping/Nesterov momentum goes beyond convex optimization and opens the gate toward new unexpected research perspectives.

Remark 1

(restricted gap function) The convergence rates for $t \mapsto \left\langle z \left( t \right) - z_{*}, V \left( z \left( t \right) \right) \right\rangle $ and $t \mapsto \left\Vert V \left( z \left( t \right) \right) \right\Vert $ can be easily transferred to the restricted gap function associated with the variational inequality 2. Indeed, for $z_* \in {\mathcal {Z}}$, let $\delta (z^0):= \left\Vert z^0 - z_{*} \right\Vert $, $u \in {\mathbb {B}}\left( z_{*}; \delta (z^0) \right) $ and $t \ge t_{0}$. It holds

$$\begin{aligned}&0 \le \left\langle z \left( t \right) - u , V \left( u \right) \right\rangle \le \left\langle z \left( t \right) - u , V \left( z \left( t \right) \right) \right\rangle \\&\quad = \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle + \left\langle z_{*} - u , V \left( z \left( t \right) \right) \right\rangle \\&\quad \le \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle + \left\Vert u - z_{*} \right\Vert \left\Vert V \left( z \left( t \right) \right) \right\Vert , \end{aligned}$$

which implies that for every $t \ge t_{0}$

$$\begin{aligned} 0 \le \texttt{Gap}\left( z \left( t \right) \right)= & {} \sup _{u \in {\mathbb {B}}\left( z_{*}; \delta (z^0) \right) } \left\langle z \left( t \right) - u, V \left( u \right) \right\rangle \\\le & {} \left\langle z \left( t \right) - z_{*}, V \left( z \left( t \right) \right) \right\rangle + \delta (z^0)\left\Vert V \left( z \left( t \right) \right) \right\Vert , \end{aligned}$$

which proofs our claim. The same remark can be obviously made in the discrete case.

Further we provide two temporal discretizations of the dynamical system, one of implicit and one of explicit type.

We will prove that, for $z_{*} \in {\mathcal {Z}}$, it holds

$$\begin{aligned} \left\Vert z^{k} - z^{k-1} \right\Vert = o \left( \dfrac{1}{k} \right) \text { as } k \rightarrow + \infty , \end{aligned}$$

and

$$\begin{aligned} \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle = o \left( \dfrac{1}{k \beta _{k}} \right) \text{ and } \left\Vert V \left( z^{k} \right) \right\Vert = o \left( \dfrac{1}{k \beta _{k}} \right) \text { as } k \rightarrow + \infty , \end{aligned}$$

and that the sequence $\left( z^{k} \right) _{k \ge 0}$ converges weakly to a solution in ${\mathcal {Z}}$.

The constant sequence $\beta _{k} \equiv 1$ obviously satisfies the growth condition required in the implicit numerical scheme and for this choice the generated sequence $\left( z^{k} \right) _{k \ge 0}$ fulfills for every $k \ge 1$

$$\begin{aligned} z^{k+1}&= z^{k} + \left( 1 - \dfrac{\alpha }{k + \alpha } \right) \left( z^{k} - z^{k-1} \right) - \dfrac{s \alpha }{2 \left( k + \alpha \right) } V \left( z^{k+1} \right) \\&\quad - \dfrac{sk }{k + \alpha } \left( V \left( z^{k+1} \right) - V \left( z^{k} \right) \right) . \end{aligned}$$

From the general statement, we have that

$$\begin{aligned} \left\Vert z^{k} - z^{k-1} \right\Vert= & {} o \left( \dfrac{1}{k} \right) , \ \left\langle z^{k} - z_{*}, V \left( z^{k} \right) \right\rangle = o \left( \dfrac{1}{k} \right) \ \text{ and } \ \left\Vert V \left( z^{k} \right) \right\Vert = o \left( \dfrac{1}{k} \right) \\{} & {} \text { as } k \rightarrow + \infty , \end{aligned}$$

and $\left( z^{k} \right) _{k \ge 0}$ converges weakly to a solution in ${\mathcal {Z}}$.

A further contribution of this work is therefore this numerical algorithm with Nesterov momentum for solving 1, obtained by implicit temporal discretization of the inertial evolution equation and which reproduces all its convergence properties in discrete time.

Only for the explicit discrete scheme, we will additionally assume that the operator V is L-Lipschitz continuous, with $L>0$.

When taking a closer look at its equivalent formulation, which reads for every $k \ge 1$

$$\begin{aligned} \bar{z}^{k}&:= z^{k} + \left( 1 - \dfrac{\alpha }{k + \alpha } \right) \left( z^{k} - z^{k-1} \right) - \dfrac{\alpha s}{2 \left( k + \alpha \right) } V \left( \bar{z}^{k-1} \right) \\ z^{k+1}&:= z^{k} + \left( 1 - \dfrac{\alpha }{k + \alpha } \right) \left( z^{k} - z^{k-1} \right) - \dfrac{\alpha s}{2 \left( k + \alpha \right) } V \left( \bar{z}^{k} \right) \\&\quad - \dfrac{sk}{k + \alpha } \left( V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right) , \end{aligned}$$

one can notice that the iterative scheme can be seen as an accelerated version of the OGDA method. An important feature of the explicit Fast OGDA method is that it requires the evaluation of V only at the elements of the sequence $\left( \bar{z}^{k} \right) _{k \ge 0}$, while the Extragradient method 6 and the Extra Anchored Gradient method 10 require the evaluation of V at both sequences $\left( z^{k} \right) _{k \ge 0}$ and $\left( \bar{z}^{k} \right) _{k \ge 0}$.

We will show that, for $z_{*} \in {\mathcal {Z}}$, it holds

$$\begin{aligned}&\left\Vert z^{k} - z^{k-1} \right\Vert = o \left( \dfrac{1}{k} \right) , \quad \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle = o \left( \dfrac{1}{k} \right) ,\\&\left\Vert V \left( z^{k} \right) \right\Vert = o \left( \dfrac{1}{k} \right) \ \text{ and } \ \left\Vert V \left( \bar{z}^{k} \right) \right\Vert = o \left( \dfrac{1}{k} \right) \text { as } k \rightarrow + \infty , \end{aligned}$$

and that also for this algorithm the generated sequence $\left( z^{k} \right) _{k \ge 0}$ converges weakly to a solution in ${\mathcal {Z}}$.

Another main contribution of this work is the explicit Fast OGDA method with Nesterov momentum and operator correction terms, for which we show the best convergence rate results known in the literature of explicit algorithms for monotone inclusions and the convergence of the iterates to a zero of the operator. We illustrate the theoretical findings with numerical experiments, which show the overwhelming superiority of our method over other numerical algorithms designed to solve monotone equations governed by monotone and Lipschitz continuous operators. These include the algorithms designed by using “anchoring” techniques, for which the tracing of the iterates back to the starting value seems to have a slowing effect on the convergence performances.

Remark 2

(the role of the time scaling parameter function $\beta $) The function $\beta $ which appears in the formulation of the dynamical system can be seen as a time scaling parameter function in the spirit of recent investigations on this topic (see, for instance, [5, 7]) in the context of the minimization of a smooth convex function. It was shown that, when used in combination with vanishing damping (and also with Hessian-driven damping) terms, time scaling functions improve the convergence rates of the function values and of the gradient. The positive effect of the time scaling on the convergence rates can be transferred to the numerical schemes obtained via implicit discretization, as it was recently pointed out by Attouch–Chbani–Riahi in [8], and long time ago by Güler in [26, 27] for the proximal point algorithm, which may exhibit convergence rates for the objective function values of $o \left( 1/k^{\rho } \right) $ rate, for arbitrary $\rho > 0$. On the other hand, this does not hold for numerical schemes obtained via explicit discretization, as it is the gradient method for which it is known that the convergence rate of $o \left( 1/k^{2} \right) $ for the objective function values (see [11]) cannot be improved in general [34, 35].

This explains why the discretization of the parameter function $\beta $ appears only in the implicit numerical scheme and in the corresponding convergence rates, and not in the explicit numerical scheme.

2 The Continuous Time Approach

In this section, we will analyze the continuous time scheme proposed for 1, and which we recall for convenience in the following.

Let $z_{*} \in {\mathcal {Z}}$ and $0 \le \lambda \le \alpha - 1$. We consider the following energy function ${\mathcal {E}}_{\lambda } :\left[ t_{0}, + \infty \right) \rightarrow \left[ 0, + \infty \right) $,

$$\begin{aligned} {\mathcal {E}}_{\lambda } \left( t \right)&{:=} \ \dfrac{1}{2} \left\Vert 2 \lambda \left( z \left( t \right) - z_{*} \right) + t \Bigl ( 2 {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \Bigr ) \right\Vert ^{2} \nonumber \\&\quad + 2 \lambda \left( \alpha - 1 - \lambda \right) \left\Vert z \left( t \right) - z_{*} \right\Vert ^{2} \nonumber \\&\quad + 2 \lambda t \beta \left( t \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle + \dfrac{1}{2} t^{2} \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2}, \end{aligned}$$

(14)

which will play a fundamental role in our analysis. By taking into consideration 2, for every $0 \le \lambda \le \alpha - 1$ we have

$$\begin{aligned} {\mathcal {E}}_{\lambda } \left( t \right) \ge 0 \quad \forall t \ge t_{0} . \end{aligned}$$

Denote

$$\begin{aligned} w: \left[ t_{0} , + \infty \right) \rightarrow {\mathbb {R}}, \quad w \left( t \right) := \dfrac{1}{2} \left( \left( \alpha - 2 \right) \dfrac{\beta \left( t \right) }{t} - {\dot{\beta }} \left( t \right) \right) . \end{aligned}$$

(15)

The growth condition 13 guarantees that $w(t) \ge 0$ for every $t \ge t_0$.

First we will show that the energy dissipates with time.

Lemma 3

Let $z :\left[ t_{0}, + \infty \right) \rightarrow {\mathcal {H}}$ be a solution of 12, $z_* \in {\mathcal {Z}}$ and $0 \le \lambda \le \alpha - 1$. Then for every $t \ge t_{0}$, it holds

$$\begin{aligned} \dfrac{d}{dt} {\mathcal {E}}_{\lambda } \left( t \right)&\le - 2 \lambda t w \left( t \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle \nonumber \\&\quad + t \beta \left( t \right) \Bigl ( \left( \alpha - 1 - \lambda \right) \beta \left( t \right) - 2t w \left( t \right) \Bigr ) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} \nonumber \\&\quad - \left( \alpha - 1 - \lambda \right) t \left\Vert 2 {\dot{z}} \left( t \right) - \beta \left( t \right) V \left( z \left( t \right) \right) \right\Vert ^{2} . \end{aligned}$$

(16)

Proof

Let $t \ge t_{0}$ be fixed. From the definition of the dynamical system 12, we have

$$\begin{aligned} 2 t \ddot{z} \left( t \right) + t \beta \left( t \right) \dfrac{d}{dt} V \left( z \left( t \right) \right){} & {} = - 2 \alpha {\dot{z}} \left( t \right) - t \beta \left( t \right) \dfrac{d}{dt} V \left( z \left( t \right) \right) \\{} & {} \quad - \left( t {\dot{\beta }} \left( t \right) + \alpha \beta \left( t \right) \right) V \left( z \left( t \right) \right) . \end{aligned}$$

Therefore,

$$\begin{aligned}&\dfrac{d}{dt} \left( \dfrac{1}{2} \left\Vert 2 \lambda \left( z \left( t \right) - z_{*} \right) + t \Bigl ( 2 {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \Bigr ) \right\Vert ^{2} \right) \nonumber \\&\quad = \ \left\langle 2 \lambda \left( z \left( t \right) - z_{*} \right) + t \Bigl ( 2 {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \Bigr ) , \right. \nonumber \\&\left. 2 \left( \lambda + 1 \right) {\dot{z}} \left( t \right) + \left( t {\dot{\beta }} \left( t \right) + \beta \left( t \right) \right) V \left( z \left( t \right) \right) + 2t \ddot{z} \left( t \right) + t \beta \left( t \right) \dfrac{d}{dt} V \left( z \left( t \right) \right) \right\rangle \nonumber \\&\quad = \ \left\langle 2 \lambda \left( z \left( t \right) - z_{*} \right) + 2t {\dot{z}} \left( t \right) + t \beta \left( t \right) V \left( z \left( t \right) \right) , \right. \nonumber \\&\left. 2 \left( \lambda + 1 - \alpha \right) {\dot{z}} \left( t \right) + \left( 1 - \alpha \right) \beta \left( t \right) V \left( z \left( t \right) \right) - t \beta \left( t \right) \dfrac{d}{dt} V \left( z \left( t \right) \right) \right\rangle \nonumber \\&\quad = 4 \lambda \left( \lambda + 1 - \alpha \right) \left\langle z \left( t \right) - z_{*} , {\dot{z}} \left( t \right) \right\rangle + 2 \lambda \left( 1 - \alpha \right) \beta \left( t \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle \nonumber \\&\quad \quad - 2 \lambda t \beta \left( t \right) \left\langle z \left( t \right) - z_{*} , \dfrac{d}{dt} V \left( z \left( t \right) \right) \right\rangle + 4 \left( \lambda + 1 - \alpha \right) t \left\Vert {\dot{z}} \left( t \right) \right\Vert ^{2} \nonumber \\&\quad \quad + 2 t \left( \lambda + 2 - 2 \alpha \right) \beta \left( t \right) \left\langle {\dot{z}} \left( t \right) , V \left( z \left( t \right) \right) \right\rangle - 2t^{2} \beta \left( t \right) \left\langle {\dot{z}} \left( t \right) , \dfrac{d}{dt} V \left( z \left( t \right) \right) \right\rangle \nonumber \\&\quad \quad + \left( 1 - \alpha \right) t \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} - t^{2} \beta ^{2} \left( t \right) \left\langle V \left( z \left( t \right) \right) , \dfrac{d}{dt} V \left( z \left( t \right) \right) \right\rangle . \end{aligned}$$

(17)

By differentiating the other terms of the energy function, it yields

$$\begin{aligned}&\dfrac{d}{dt} \left( 2 \lambda \left( \alpha - 1 - \lambda \right) \left\Vert z \left( t \right) - z_{*} \right\Vert ^{2} + 2 \lambda t \beta \left( t \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle \nonumber \right. \\&\left. \qquad + \dfrac{1}{2} t^{2} \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} \right) \nonumber \\&\quad = \ 4 \lambda \left( \alpha - 1 - \lambda \right) \left\langle z \left( t \right) - z_{*} , {\dot{z}} \left( t \right) \right\rangle + 2 \lambda \left( \beta \left( t \right) + t {\dot{\beta }} \left( t \right) \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle \nonumber \\&\qquad + 2 \lambda t \beta \left( t \right) \left\langle {\dot{z}} \left( t \right) , V \left( z \left( t \right) \right) \right\rangle + 2 \lambda t \beta \left( t \right) \left\langle z \left( t \right) - z_{*} , \dfrac{d}{dt} V \left( z \left( t \right) \right) \right\rangle \nonumber \\&\qquad + t \beta \left( t \right) \left( \beta \left( t \right) + t {\dot{\beta }} \left( t \right) \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} + t^{2} \beta ^{2} \left( t \right) \left\langle V \left( z \left( t \right) \right) , \dfrac{d}{dt} V \left( z \left( t \right) \right) \right\rangle . \end{aligned}$$

(18)

By summing up 17 and 18, and then using the definition of w in 15, we conclude that

$$\begin{aligned} \dfrac{d}{dt} {\mathcal {E}}_{\lambda } \left( t \right)&= - 2 \lambda t w \left( t \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle + 4 \left( \lambda + 1 - \alpha \right) t \left\Vert {\dot{z}} \left( t \right) \right\Vert ^{2} \\&\quad + 4 \left( \lambda + 1 - \alpha \right) t \beta \left( t \right) \left\langle {\dot{z}} \left( t \right) , V \left( z \left( t \right) \right) \right\rangle - 2t^{2} \beta \left( t \right) \left\langle {\dot{z}} \left( t \right) , \dfrac{d}{dt} V \left( z \left( t \right) \right) \right\rangle \\&\quad - 2t^{2} \beta \left( t \right) w \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} . \end{aligned}$$

Finally, we observe that

$$\begin{aligned}&4 \left( \lambda + 1 - \alpha \right) t \left\Vert {\dot{z}} \left( t \right) \right\Vert ^{2} + 4 \left( \lambda + 1 - \alpha \right) t \beta \left( t \right) \left\langle {\dot{z}} \left( t \right) , V \left( z \left( t \right) \right) \right\rangle \\&\quad = \ t \beta \left( t \right) \Bigl ( \left( \alpha - 1 - \lambda \right) \beta \left( t \right) - 2t w \left( t \right) \Bigr ) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} \\&\qquad - \left( \alpha - 1 - \lambda \right) t \left\Vert 2 {\dot{z}} \left( t \right) - \beta \left( t \right) V \left( z \left( t \right) \right) \right\Vert ^{2} . \end{aligned}$$

This, in combination with $\left\langle {\dot{z}} \left( t \right) , \frac{d}{dt} V \left( z \left( t \right) \right) \right\rangle \ge 0$ for every $t \ge t_{0}$, which is a consequence of the monotonicity of V, leads to 16. $\square $

The following theorem provides first convergence rates which follow as a direct consequence of the previous lemma. Since $\beta $ is positive and nondecreasing, we have $\lim _{t \rightarrow + \infty } t \beta \left( t \right) = + \infty $.

Theorem 4

Let $z :\left[ t_{0}, + \infty \right) \rightarrow {\mathcal {H}}$ be a solution of 12 and $z_* \in {\mathcal {Z}}$. For every $t \ge t_{0}$, it holds

$$\begin{aligned} 0 \le \left\Vert V \left( z \left( t \right) \right) \right\Vert&\le \sqrt{2 {\mathcal {E}}_{\alpha - 1} \left( t_{0} \right) } \cdot \dfrac{1}{t \beta \left( t \right) } , \end{aligned}$$

(19a)

$$\begin{aligned} 0 \le \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle&\le \dfrac{{\mathcal {E}}_{\alpha - 1} \left( t_{0} \right) }{2 \left( \alpha - 1 \right) } \cdot \dfrac{1}{t \beta \left( t \right) } , \end{aligned}$$

(19b)

and the following statements are true

$$\begin{aligned} \int _{t_{0}}^{+ \infty } t w \left( t \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle dt&< + \infty , \end{aligned}$$

(20a)

$$\begin{aligned} \int _{t_{0}}^{+ \infty } t^{2} \beta \left( t \right) w \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} dt&< + \infty . \end{aligned}$$

(20b)

If we assume in addition that

$$\begin{aligned} 0 \le \sup \limits _{t \ge t_{0}} \dfrac{t {\dot{\beta }} \left( t \right) }{\beta \left( t \right) } < \alpha - 2 , \end{aligned}$$

(21)

then the trajectory $t \mapsto z \left( t \right) $ is bounded, it holds

$$\begin{aligned} \int _{t_{0}}^{+ \infty } t \left\Vert {\dot{z}} \left( t \right) \right\Vert ^{2} < + \infty , \end{aligned}$$

(22)

and the limit $\lim _{t \rightarrow + \infty } {\mathcal {E}}_{\lambda } \left( t \right) \in {\mathbb {R}}$ exists for every $\lambda $ satisfying $0 \le \lambda \le \alpha - 1$.

Proof

First we choose $\lambda := \alpha - 1$. Then, inequality 16 reduces to

$$\begin{aligned}{} & {} \dfrac{d}{dt} {\mathcal {E}}_{\alpha - 1} \left( t \right) \le - 2 \left( \alpha - 1 \right) t w \left( t \right) \left\langle z \left( t \right) - z_{*}, \nonumber \right. \\{} & {} \left. \quad \quad V \left( z \left( t \right) \right) \right\rangle - 2t^{2} \beta \left( t \right) w \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} \le 0 \quad \forall t \ge t_0. \end{aligned}$$

(23)

This means that $t \mapsto {\mathcal {E}}_{\alpha - 1} \left( t \right) $ is nonincreasing on $[t_0,+\infty )$ and, thus, the inequalities (19) follow from the definition of the energy function. In addition, after integration of 23, we obtain the statements in (20).

Now we suppose that 21 holds. Then, there exists $0 \le \varepsilon < \alpha - 2$ such that

$$\begin{aligned} \sup \limits _{t \ge t_{0}} \dfrac{t {\dot{\beta }} \left( t \right) }{\beta \left( t \right) } = \alpha - 2 - \varepsilon < \alpha - 2 . \end{aligned}$$

This means that

$$\begin{aligned} w \left( t \right) = - \dfrac{1}{2} {\dot{\beta }} \left( t \right) + \dfrac{1}{2} \left( \alpha - 2 \right) \dfrac{\beta \left( t \right) }{t} \ge \dfrac{\varepsilon }{2} \dfrac{\beta \left( t \right) }{t} > 0 \quad \forall t \ge t_{0}. \end{aligned}$$

(24)

Hence,

$$\begin{aligned} t^{2} \beta \left( t \right) w \left( t \right) \ge \dfrac{\varepsilon }{2} t \beta ^{2} \left( t \right) \quad \forall t \ge t_{0}, \end{aligned}$$

(25)

which, due to 20b, gives

$$\begin{aligned} \int _{t_{0}}^{+ \infty } t \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} dt < + \infty . \end{aligned}$$

(26)

In order to prove the last statements of the theorem, we notice that the estimate 16 gives for every $0 \le \lambda \le \alpha -1$ and every $t \ge t_0$

$$\begin{aligned} \dfrac{d}{dt} {\mathcal {E}}_{\lambda } \left( t \right)&\le \left( \alpha - 1 - \lambda \right) t \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} - \left( \alpha - 1 - \lambda \right) t \left\Vert 2 {\dot{z}} \left( t \right) - \beta \left( t \right) V \left( z \left( t \right) \right) \right\Vert ^{2} \nonumber \\&\le 2 \left( \alpha - 1 - \lambda \right) t \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} - 2 \left( \alpha - 1 - \lambda \right) t \left\Vert {\dot{z}} \left( t \right) \right\Vert ^{2} \end{aligned}$$

(27a)

$$\begin{aligned}&\le 2 \left( \alpha - 1 - \lambda \right) t \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} . \end{aligned}$$

(27b)

The assertion 22 follows by integration of 27a for $\lambda := 0$ and by using then 26. Finally, as $t \mapsto t \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} \in {\mathbb {L}}^{1} \left( \left[ t_{0}, + \infty \right) \right) $, we can apply Lemma A.1 to 27b in order to obtain the existence of the limit $\lim _{t \rightarrow + \infty } {\mathcal {E}}_{\lambda } \left( t \right) \in {\mathbb {R}}$ for every $0 \le \lambda \le \alpha - 1$. $\square $

The existence and uniqueness of solutions for 12 can be guaranteed in a very general setting, which includes the one of continuously differentiable operators defined on finite-dimensional spaces, that are obviously Lipschitz continuous in bounded sets. The proof of Theorem 5 is provided in the Appendix and it relies on showing that the maximal solution given by the Cauchy–Lipschitz theorem is a global solution.

Theorem 5

Let $\alpha >2$ and assume that $V: {\mathcal {H}}\rightarrow {\mathcal {H}}$ is continuously differentiable, $\beta : [t_0,+\infty ) \rightarrow (0,+\infty )$ is a continuously differentiable and nondecreasing function which satisfies condition 21 and that V and ${\dot{\beta }}$ are Lipschitz continuous on bounded sets. Then for every initial condition $z \left( t_{0} \right) = z^{0} \in {\mathcal {H}}\text { and } {\dot{z}} \left( t_{0} \right) = \dot{z}^{0} \in {\mathcal {H}}$, the dynamical system 12 has a unique global twice continuously differentiable solution $z :\left[ t_{0}, + \infty \right) \rightarrow {\mathcal {H}}$.

Further we prove that, under the slightly stronger growth condition 21, the trajectories of the dynamical system 12 converge to a zero of V. This phenomenon is also present at inertial gradient systems with asymptotically vanishing damping terms, where it concerns the coefficient $\alpha $, too.

Theorem 6

Let $\alpha >2$ and $z :\left[ t_{0}, + \infty \right) \rightarrow {\mathcal {H}}$ be a solution of 12 and assume that $\beta :\left[ t_{0}, + \infty \right) \rightarrow \left( 0, + \infty \right) $ satisfies the growth condition 21, in other words

$$\begin{aligned} 0 \le \sup \limits _{t \ge t_{0}} \dfrac{t {\dot{\beta }} \left( t \right) }{\beta \left( t \right) } < \alpha - 2 . \end{aligned}$$

Then, $z \left( t \right) $ converges weakly to a solution of 1 as $t \rightarrow + \infty $.

Proof

Let $z_* \in {{\mathcal {Z}}}$ and $0 \le \lambda _{1} < \lambda _{2} \le \alpha - 1$ be fixed. Then by the definition of the energy function in 14, we have for every $t \ge t_{0}$

$$\begin{aligned} {\mathcal {E}}_{\lambda _{2}} \left( t \right) - {\mathcal {E}}_{\lambda _{1}} \left( t \right) = \ {}&2 \left( \lambda _{2} - \lambda _{1} \right) t \left\langle z \left( t \right) - z_{*} , 2 {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \right\rangle \\&\quad + 2 \left( \lambda _{2} - \lambda _{1} \right) \lambda \beta \left( t \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle \\&\quad + 2 \left( \lambda _{2} - \lambda _{1} \right) \left( \alpha - 1 \right) \left\Vert z \left( t \right) - z_{*} \right\Vert ^{2} \\ = \ {}&4 \left( \lambda _{2} - \lambda _{1} \right) \left( t \left\langle z \left( t \right) - z_{*} , {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \right\rangle \right. \\&\left. + \dfrac{1}{2} \left( \alpha - 1 \right) \left\Vert z \left( t \right) - z_{*} \right\Vert ^{2} \right) . \end{aligned}$$

For every $t \ge t_{0}$, we define

$$\begin{aligned} p \left( t \right)&:= t \left\langle z \left( t \right) - z_{*} , {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \right\rangle + \dfrac{1}{2} \left( \alpha - 1 \right) \left\Vert z \left( t \right) - z_{*} \right\Vert ^{2} , \end{aligned}$$

(28)

$$\begin{aligned} q \left( t \right)&:= \dfrac{1}{2} \left\Vert z \left( t \right) - z_{*} \right\Vert ^{2} + \int _{t_{0}}^{t} \beta \left( s \right) \left\langle z \left( s \right) - z_{*} , V \left( z \left( s \right) \right) \right\rangle ds . \end{aligned}$$

(29)

One can easily see that for every $t \ge t_{0}$

$$\begin{aligned} {\dot{q}} \left( t \right)= & {} \left\langle z \left( t \right) - z_{*}, {\dot{z}} \left( t \right) + \beta \left( t \right) \left\langle z \left( t \right) - z_{*}, V \left( z \left( t \right) \right) \right\rangle \right\rangle \\= & {} \left\langle z \left( t \right) - z_{*}, {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \right\rangle , \end{aligned}$$

and thus

$$\begin{aligned} \left( \alpha - 1 \right) q \left( t \right) + t {\dot{q}} \left( t \right) = p \left( t \right) + \left( \alpha - 1 \right) \int _{t_{0}}^{t} \beta \left( s \right) \left\langle z \left( s \right) - z_{*} , V \left( z \left( s \right) \right) \right\rangle ds . \end{aligned}$$

Since $0 \le \lambda _{1} < \lambda _{2} \le \alpha - 1$, Theorem 4 guarantees that $\lim _{t \rightarrow + \infty } \left\{ {\mathcal {E}}_{\lambda _2} \left( t \right) - {\mathcal {E}}_{\lambda _1} \left( t \right) \right\} \in {\mathbb {R}}$ exists, hence, by 28,

$$\begin{aligned} \lim \limits _{t \rightarrow + \infty } p \left( t \right) \in {\mathbb {R}}\text { exists}. \end{aligned}$$

(30)

Furthermore, the quantity $\int _{t_{0}}^{t} \beta \left( s \right) \left\langle z \left( s \right) - z_{*}, V \left( z \left( s \right) \right) \right\rangle ds$ is nondecreasing with respect to t, and according to 24 for every $t \ge t_{0}$ it holds

$$\begin{aligned} \dfrac{\varepsilon }{2} \int _{t_{0}}^{t} \beta \left( s \right) \left\langle z \left( s \right) - z_{*} , V \left( z \left( s \right) \right) \right\rangle ds \le \int _{t_{0}}^{t} s w \left( s \right) \left\langle z \left( s \right) - z_{*} , V \left( z \left( s \right) \right) \right\rangle ds . \end{aligned}$$

As a consequence, we conclude from 20a that

$$\begin{aligned} \lim \limits _{t \rightarrow + \infty } \int _{t_{0}}^{t} \beta \left( s \right) \left\langle z \left( s \right) - z_{*} , V \left( z \left( s \right) \right) \right\rangle ds \in {\mathbb {R}}. \end{aligned}$$

(31)

Combining 30 and 31, it yields that the limit $\lim _{t \rightarrow + \infty } \left\{ \left( \alpha - 1 \right) q \left( t \right) + t {\dot{q}} \left( t \right) \right\} \in {\mathbb {R}}$ exists, which, according to Lemma A.4, guarantees that $\lim _{t \rightarrow + \infty } q \left( t \right) \in {\mathbb {R}}$. Using the definition of q in 29 and once again the statement 31, we see that $\lim _{t \rightarrow + \infty } \left\Vert z \left( t \right) - z_{*} \right\Vert \in {\mathbb {R}}$. This proves the hypothesis (i) of Opial’s Lemma (see Lemma A.2).

Finally, let $\bar{z}$ be a weak sequential cluster point of the trajectory $z \left( t \right) $ as $t \rightarrow + \infty $. This means that there exists a sequence $\left( z \left( t_{n} \right) \right) _{n \ge 0}$ such that

$$\begin{aligned} z \left( t_{n} \right) \rightharpoonup \bar{z} \text { as } n \rightarrow + \infty , \end{aligned}$$

where $\rightharpoonup $ denotes weak convergence. On the other hand, Theorem 4 ensures that

$$\begin{aligned} V \left( z \left( t_{n} \right) \right) \rightarrow 0 \text { as } n \rightarrow + \infty . \end{aligned}$$

Since V is monotone and continuous, it is maximally monotone (see, for instance, [16, Corollary 20.28]). Therefore, the graph of V is sequentially closed in ${\mathcal {H}}^{\text {weak}} \times {\mathcal {H}}^{\text {strong}}$, which means that $V(\bar{z})=0$. In other words, the hypothesis (ii) of Opial’s Lemma also holds, and the proof is complete. $\square $

Next we will see that under the growth condition 21 the convergence rates obtained in Theorem 4 can be improved from ${\mathcal {O}}$ to o, which is also a phenomenon known for inertial gradient systems with asymptotically vanishing damping terms.

Theorem 7

Let $\alpha >2$ and $z :\left[ t_{0}, + \infty \right) \rightarrow {\mathcal {H}}$ be a solution of 12, $z_* \in {{\mathcal {Z}}}$, and assume that $\beta : [t_0,+\infty ) \rightarrow (0,+\infty )$ satisfies the growth condition 21, in other words

$$\begin{aligned} 0 \le \sup \limits _{t \ge t_{0}} \dfrac{t {\dot{\beta }} \left( t \right) }{\beta \left( t \right) } < \alpha - 2 . \end{aligned}$$

Then, it holds

$$\begin{aligned} \left\Vert {\dot{z}} \left( t \right) \right\Vert = o \left( \dfrac{1}{t} \right) \quad \text { as } t \rightarrow + \infty , \end{aligned}$$

and

$$\begin{aligned} \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle = o \left( \dfrac{1}{t \beta \left( t \right) } \right) \text { and } \left\Vert V \left( z \left( t \right) \right) \right\Vert = o \left( \dfrac{1}{t \beta \left( t \right) } \right) \quad \text { as } t \rightarrow + \infty . \end{aligned}$$

Proof

For every $0 \le \lambda \le \alpha - 1$, the energy function of the system can be written as

$$\begin{aligned} {\mathcal {E}}_{\lambda } \left( t \right)&= \ \dfrac{1}{2} \left\Vert 2 \lambda \left( z \left( t \right) - z_{*} \right) + t \Bigl ( 2 {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \Bigr ) \right\Vert ^{2}\\&\quad + 2 \lambda \left( \alpha - 1 - \lambda \right) \left\Vert z \left( t \right) - z_{*} \right\Vert ^{2} \\&\quad + 2 \lambda t \beta \left( t \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle + \dfrac{1}{2} t^{2} \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} \\&= \ 2 \lambda \left( \alpha - 1 \right) \left\Vert z \left( t \right) - z_{*} \right\Vert ^{2} + 4 \lambda t \left\langle z \left( t \right) - z_{*} , {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \right\rangle \\&\quad + \dfrac{1}{2} t^{2} \left\Vert 2 {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \right\Vert ^{2} + \dfrac{1}{2} t^{2} \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} \\&= \ 4 \lambda p \left( t \right) + t^{2} \left\Vert {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \right\Vert ^{2} + t^{2} \left\Vert {\dot{z}} \left( t \right) \right\Vert ^{2} , \end{aligned}$$

where the last equation comes from the definition of $p \left( t \right) $ in 28 and the formula

$$\begin{aligned} \left\Vert x \right\Vert ^{2} + \left\Vert y \right\Vert ^{2} = \frac{1}{2} \left( \left\Vert x + y \right\Vert ^{2} + \left\Vert x - y \right\Vert ^{2} \right) \quad \forall x, y \in {\mathcal {H}}. \end{aligned}$$

(32)

Recalling that as both limits $\lim _{t \rightarrow + \infty } {\mathcal {E}}_{\lambda } \left( t \right) \in {\mathbb {R}}$ and $\lim _{t \rightarrow + \infty } p \left( t \right) \in {\mathbb {R}}$ exist (see Theorem 4 and 30), we conclude that for $h:[t_0,+\infty )\rightarrow {{\mathbb {R}}}, h(t) = t^{2} \left\Vert {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \right\Vert ^{2} + t^{2} \left\Vert {\dot{z}} \left( t \right) \right\Vert ^{2}$,

$$\begin{aligned} \lim \limits _{t \rightarrow + \infty } h \left( t \right) \in [0,+\infty ) \text { exists} . \end{aligned}$$

(33)

Moreover, from 22 and 26, we see that

$$\begin{aligned} \int _{t_{0}}^{+ \infty } \dfrac{1}{t} h \left( t \right) dt \le 3 \int _{t_{0}}^{+ \infty } t \left\Vert {\dot{z}} \left( t \right) \right\Vert ^{2} dt + 2 \int _{t_{0}}^{+ \infty } t \beta ^{2} \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert ^{2} dt < + \infty , \end{aligned}$$

which in combination with 33 leads to $\lim _{t\rightarrow +\infty }h(t)=0$. Thus,

$$\begin{aligned} \lim \limits _{t \rightarrow + \infty } t \left\Vert {\dot{z}} \left( t \right) + \beta \left( t \right) V \left( z \left( t \right) \right) \right\Vert = \lim \limits _{t \rightarrow + \infty } t \left\Vert {\dot{z}} \left( t \right) \right\Vert = 0 , \end{aligned}$$

and, consequently,

$$\begin{aligned} \lim \limits _{t \rightarrow + \infty } t \beta \left( t \right) \left\Vert V \left( z \left( t \right) \right) \right\Vert = 0 . \end{aligned}$$

Finally, by Cauchy–Schwarz inequality and the fact that the trajectory $t \mapsto z(t)$ is bounded, we deduce that

$$\begin{aligned} 0 \le t \beta \left( t \right) \left\langle z \left( t \right) - z_{*} , V \left( z \left( t \right) \right) \right\rangle \le t \beta \left( t \right) \left\Vert z \left( t \right) - z_{*} \right\Vert \left\Vert V \left( z \left( t \right) \right) \right\Vert \quad \forall t \ge t_{0} , \end{aligned}$$

which finishes the proof. $\square $

Remark 8

One of the anonymous referees made an excellent observation regarding the asymptotic behavior of the trajectories on which we will elaborate in the following. For the first-order system attached to 1

$$\begin{aligned} {\dot{u}} \left( t \right) + V \left( u \left( t \right) \right) = 0, \end{aligned}$$

(34)

it is known that the solution trajectories converge weakly in ergodic (averaged) sense toward a zero of V. In other words, there exists $ z_{*} \in {\mathcal {Z}}$ such that $z \left( t \right) := \frac{1}{t} \int _{0}^{t} u \left( s \right) ds \rightharpoonup z_{*} \in {\mathcal {Z}}$ as $t \rightarrow + \infty $ (see, for instance, [15, 41]).

This leads to the natural idea of considering the averaging trajectory z, that fulfills

$$\begin{aligned} {\dot{z}} \left( t \right) + \dfrac{1}{t} \left( z \left( t \right) - u \left( t \right) \right) = 0, \end{aligned}$$

(35)

and to drive the equation of its dynamics from 34. For more details on this very powerful approach, we refer the reader to [3].

From 35, we deduce that ${\dot{u}} \left( t \right) = t \ddot{z} \left( t \right) + 2 {\dot{z}} \left( t \right) $, and hence, equation 34 becomes

$$\begin{aligned} t \ddot{z} \left( t \right) + 2 {\dot{z}} \left( t \right) + V \left( z \left( t \right) + t {\dot{z}} \left( t \right) \right) = 0. \end{aligned}$$

Taking the Taylor expansion

$$\begin{aligned} V \left( z \left( t \right) + t {\dot{z}} \left( t \right) \right) \approx V \left( z \left( t \right) \right) + t \nabla V \left( z \left( t \right) \right) \dot{z} \left( t \right) = V \left( z \left( t \right) \right) + t \dfrac{d}{dt} V \left( z \left( t \right) \right) , \end{aligned}$$

it leads to the second-order dynamical system with correction term $\frac{d}{dt} V \left( z \left( t \right) \right) $

$$\begin{aligned} \ddot{z} \left( t \right) + \dfrac{2}{t} {\dot{z}} \left( t \right) + \dfrac{d}{dt} \left( V \left( z \left( t \right) \right) \right) + \dfrac{1}{t} V \left( z \left( t \right) \right) = 0, \end{aligned}$$

which is of the same type as 12. This approach suggests that one can expect the non-ergodic convergence of the solution trajectory of 12 to a zero of V.

The function $\beta $ can be “inserted” into the system through time scaling approaches aimed to speed up its convergence behavior (see also [3, 5, 7, 8] for related ideas).

3 An Implicit Numerical Algorithm

In this section, we formulate and investigate an implicit type numerical algorithm which follows from a temporal discretization of the dynamical system 12. We recall that the latter can be equivalently written as (see the proof of Theorem 5)

$$\begin{aligned} {\left\{ \begin{array}{ll} {\dot{u}} \left( t \right) &{} = \Bigl ( t {\dot{\beta }} \left( t \right) + \left( 2 - \alpha \right) \beta \left( t \right) \Bigr ) V \left( z \left( t \right) \right) \\ u \left( t \right) &{} = 2 \left( \alpha - 1 \right) z \left( t \right) + 2t {\dot{z}} \left( t \right) + 2t \beta \left( t \right) V \left( z \left( t \right) \right) \end{array}\right. } , \end{aligned}$$

(36)

with the initializations $z \left( t_{0} \right) = z^{0} \text { and } {\dot{z}} \left( t_{0} \right) = \dot{z}^{0}$.

We fix a time step $s > 0$, set $\tau _{k}:= s \left( k + 1 \right) $ and $\sigma _{k}:= sk$ for every $k \ge 1$, and approximate $z \left( \tau _{k} \right) \approx z^{k+1}$, $u \left( \tau _{k} \right) \approx u^{k+1}$, and $\beta \left( \sigma _{k} \right) \approx \beta _{k}$. The implicit finite-difference scheme for 36 at time $t:= \tau _{k}$ for $\left( z, u \right) $ and at time $t:= \sigma _{k}$ for $\beta $ gives for every $k \ge 1$

$$\begin{aligned} {\left\{ \begin{array}{ll} \dfrac{u^{k+1} - u^{k}}{s} &{} = \Bigl ( k \left( \beta _{k} - \beta _{k-1} \right) + \left( 2 - \alpha \right) \beta _{k} \Bigr ) V \left( z^{k+1} \right) \\ u^{k+1} &{} = 2 \left( \alpha - 1 \right) z^{k+1} + 2 \left( k + 1 \right) \left( z^{k+1} - z^{k} \right) + 2s \left( k + 1 \right) \beta _{k} V \left( z^{k+1} \right) \end{array}\right. } , \end{aligned}$$

(37)

with the initialization $u^{1}:= z^{0}$ and $u^{0}:= z^{0} - s \dot{z}^{0}$. Therefore, we have for every $k \ge 1$

$$\begin{aligned} u^{k} = 2 \left( \alpha - 1 \right) z^{k} + 2k \left( z^{k} - z^{k-1} \right) + 2sk \beta _{k-1} V \left( z^{k} \right) , \end{aligned}$$

and after substraction, we get

$$\begin{aligned} u^{k+1} - u^{k}&= \ 2 \left( k + \alpha \right) \left( z^{k+1} - z^{k} \right) - 2k \left( z^{k} - z^{k-1} \right) \nonumber \\&\quad + 2s \Bigl ( \left( k + 1 \right) \beta _{k} - k \beta _{k-1} \Bigr ) V \left( z^{k+1} \right) \nonumber \\&\quad + 2s k \beta _{k-1} \left( V \left( z^{k+1} \right) - V \left( z^{k} \right) \right) \nonumber \\&= \ s \Bigl ( k \left( \beta _{k} - \beta _{k-1} \right) + \left( 2 - \alpha \right) \beta _{k} \Bigr ) V \left( z^{k+1} \right) , \end{aligned}$$

(38)

where the last relation comes from the first equation in 37. From here, we deduce that for every $k \ge 1$

$$\begin{aligned} z^{k+1}&= \ z^{k} + \left( 1 - \dfrac{\alpha }{k + \alpha } \right) \left( z^{k} - z^{k-1} \right) - \dfrac{s \left( \alpha \beta _{k} + k \left( \beta _{k} - \beta _{k-1} \right) \right) }{2 \left( k + \alpha \right) } V \left( z^{k+1} \right) \\&\quad - \dfrac{sk \beta _{k-1}}{k + \alpha } \left( V \left( z^{k+1} \right) - V \left( z^{k} \right) \right) . \end{aligned}$$

For

$$s_{k}:= \dfrac{s \left( \alpha \beta _{k} + k \left( \beta _{k} - \beta _{k-1} \right) \right) }{2 \left( k + \alpha \right) } \quad \text{ and } \quad t_{k}:= \dfrac{sk \beta _{k-1}}{k + \alpha },$$

the algorithm can be further equivalently written as

$$\begin{aligned} z^{k+1} := (\text {Id}+ ( s_{k} + t_{k}) V)^{-1} \left( z^{k} + \left( 1 - \dfrac{\alpha }{k + \alpha } \right) \left( z^{k} - z^{k-1} \right) + t_{k} V \left( z^{k} \right) \right) \quad \forall k \ge 1, \end{aligned}$$

and is therefore well defined due to the maximal monotonicity of V.

We also want to point out that the discrete version of the growth condition 21 reads

$$\begin{aligned} 0 \le \sup _{k \ge 1} \dfrac{k \left( \beta _{k} - \beta _{k-1} \right) }{\beta _{k}} < \alpha - 2, \end{aligned}$$

where $\left( \beta _{k} \right) _{k \ge 0}$ is a positive and nondecreasing sequence. This means that there exists some $0 \le \varepsilon < \alpha - 2$ such that

$$\begin{aligned} \dfrac{k \left( \beta _{k} - \beta _{k-1} \right) }{\beta _{k}} \le \alpha - 2 - \varepsilon \ \text{ or, } \text{ equivalently, } \ k \left( \beta _{k} - \beta _{k-1} \right) \le \left( \alpha - 2 - \varepsilon \right) \beta _{k} \quad \forall k \ge 1. \end{aligned}$$

(39)

In addition, for every $k \ge \lceil \alpha \rceil $, it holds

$$\begin{aligned} \beta _{k} \le \dfrac{k}{k + 2 + \varepsilon - \alpha } \beta _{k-1} \le \dfrac{\alpha }{2 + \varepsilon } \beta _{k-1} . \end{aligned}$$

(40)

To sum up, the implicit algorithm we propose for solving 1 is formulated below.

Inspired by the continuous setting, we consider for $0 \le \lambda \le \alpha -1$ the following sequence defined for every $k \ge 1$

$$\begin{aligned} {\mathcal {E}}_{\lambda }^{k}&{:=} \dfrac{1}{2} \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + 2k \left( z^{k} - z^{k-1} \right) + sk \beta _{k-1} V \left( z^{k} \right) \right\Vert ^{2}\\&\quad + 2 \lambda \left( \alpha - 1 - \lambda \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} \\&\quad + 2 \lambda sk \beta _{k-1} \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle + \dfrac{1}{2} s^{2} \left( k + \alpha \right) k \beta _{k} \beta _{k-1} \left\Vert V \left( z^{k} \right) \right\Vert ^{2} \ge 0, \end{aligned}$$

which is the discrete version of the energy function considered in the previous section. We have for every $k \ge 1$

$$\begin{aligned} {\mathcal {E}}_{\lambda }^{k}&= \ 2 \lambda \left( \alpha - 1 \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} + 4 \lambda k \left\langle z^{k} - z_{*} , z^{k} - z^{k-1} + s \beta _{k-1} V \left( z^{k} \right) \right\rangle \nonumber \\&\quad + \dfrac{1}{2} k^{2} \left\Vert 2 \left( z^{k} - z^{k-1} \right) + s \beta _{k-1} V \left( z^{k} \right) \right\Vert ^{2} + \dfrac{1}{2} s^{2} \left( k + \alpha \right) k \beta _{k} \beta _{k-1} \left\Vert V \left( z^{k} \right) \right\Vert ^{2} . \end{aligned}$$

(42)

The following lemma shows that the discrete energy dissipates with every iteration of the algorithm. Its proof can be found in the Appendix. Lemma 9 is the essential ingredient for the derivation of the convergence rates in Theorem 10.

Lemma 9

Let $z_* \in {{\mathcal {Z}}}$ and $\left( z^{k} \right) _{k \ge 0}$ be the sequence generated by Algorithm 1 for $\left( \beta _{k} \right) _{k \ge 0}$ a positive and nondecreasing sequence which satisfies (41). Then for every $0 \le \lambda \le \alpha -1$ and every $k \ge \lceil \alpha \rceil $ it holds

$$\begin{aligned} {\mathcal {E}}_{\lambda }^{k+1} - {\mathcal {E}}_{\lambda }^{k}&\le 2 \lambda s \Bigl ( \left( k + 2 - \alpha \right) \beta _{k} - k \beta _{k-1} \Bigr ) \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\quad + 2 \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 \right) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\quad + 2s \biggl ( \Bigl ( \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 \right) - \lambda \Bigr ) \beta _{k} \nonumber \\&\quad - \lambda k \left( \beta _{k} - \beta _{k-1} \right) \biggr ) \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\quad - 2sk \left( k + \alpha \right) \beta _{k-1} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\quad + \dfrac{1}{2} \Bigl ( C- \varepsilon \left( 2k + \alpha + 1 \right) \Bigr ) s^{2} \beta _{k}^{2} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \nonumber \\&\quad - \dfrac{1}{2} s^{2} k \Bigl ( \left( k + \alpha \right) \beta _{k} + k \beta _{k-1} \Bigr ) \beta _{k-1} \left\Vert V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\Vert ^{2} , \end{aligned}$$

(43)

where

$$\begin{aligned} C:= \dfrac{\alpha }{2 + \varepsilon } \left( \alpha - 2 - \varepsilon \right) \left( 2 \alpha - 2 - \varepsilon \right) > 0 \end{aligned}$$

(44)

and $\varepsilon $ is chosen to fulfill 39.

Theorem 10

Let $z_* \in {{\mathcal {Z}}}$ and $\left( z^{k} \right) _{k \ge 0}$ be the sequence generated by Algorithm 1 for $\left( \beta _{k} \right) _{k \ge 0}$ a positive and nondecreasing sequence which satisfies (41), and $0 \le \varepsilon < \alpha -2$ be such that 39 is satisfied. Then, it holds

$$\begin{aligned} \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle = {\mathcal {O}}\left( \dfrac{1}{k \beta _{k}} \right) \text { and } \left\Vert V \left( z^{k} \right) \right\Vert = {\mathcal {O}}\left( \dfrac{1}{k \beta _{k}} \right) \text { as } k \rightarrow + \infty . \end{aligned}$$

In addition, for every $\alpha - 1 - \frac{\varepsilon }{4}< \lambda < \alpha - 1$, the sequence $\left( {\mathcal {E}}_{\lambda }^{k} \right) _{k \ge 1}$ converges, $\left( z^{k} \right) _{k \ge 0}$ is bounded and

$$\begin{aligned} \displaystyle \sum \limits _{k \ge 1} \beta _{k} \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle < + \infty , \end{aligned}$$

(45a)

$$\begin{aligned} \displaystyle \sum \limits _{k \ge 1} k \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} < + \infty , \end{aligned}$$

(45b)

$$\begin{aligned} \displaystyle \sum \limits _{k \ge 1} k \beta _{k}^{2} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} < + \infty . \end{aligned}$$

(45c)

Proof

Let $0< \alpha - 1 - \frac{\varepsilon }{4}< \lambda < \alpha - 1$. First we show that for sufficiently large k, it holds

$$\begin{aligned} R_{k}&{:=} \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 \right) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\quad + 2s \biggl ( \Bigl ( \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 \right) - \lambda \Bigr ) \beta _{k} - \lambda k \left( \beta _{k} - \beta _{k-1} \right) \biggr ) \nonumber \\&\quad \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\quad + \dfrac{1}{4} \Bigl ( C- \varepsilon \left( 2k + \alpha + 1 \right) \Bigr ) s^{2} \beta _{k}^{2} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \le 0, \end{aligned}$$

(46)

where $C>0$ is given by 44. By setting $K_{\alpha }:= 2k + \alpha + 1 \ge 1$, for every $k \ge 0$ we have

$$\begin{aligned} R_{k}&= \left( \lambda + 1 - \alpha \right) K_{\alpha } \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} + \dfrac{1}{4} s^{2} \left( C- \varepsilon K_{\alpha } \right) \beta _{k}^{2} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \\&\quad + 2s \biggl ( \Bigl ( \left( \lambda + 1 - \alpha \right) K_{\alpha } - \lambda \Bigr ) \beta _{k} - \lambda k \left( \beta _{k} - \beta _{k-1} \right) \biggr ) \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) \right\rangle . \end{aligned}$$

To guarantee that $R_{k} \le 0$ for sufficiently large k, we show that

$$\begin{aligned} \dfrac{\Delta _{k}}{s^{2}}:= & {} 4 \biggl ( \Bigl ( \left( \lambda + 1 - \alpha \right) K_{\alpha } - \lambda \Bigr ) \beta _{k} - \lambda k \left( \beta _{k} - \beta _{k-1} \right) \biggr ) ^{2} \\{} & {} - \left( \lambda + 1 - \alpha \right) \left( C- \varepsilon K_{\alpha } \right) K_{\alpha } \beta _{k}^{2} \le 0 \end{aligned}$$

sufficiently large k. Since $\left( \beta _{k} \right) _{k \ge 0}$ is nondecreasing and $\lambda < \alpha - 1$, it follows from 39 that for every $k \ge 1$

$$\begin{aligned} 0 \ge&\Bigl ( \left( \lambda + 1 - \alpha \right) K_{\alpha } - \lambda \Bigr ) \beta _{k} - \lambda k \left( \beta _{k} - \beta _{k-1} \right) \\ \ge&\Bigl ( \left( \lambda + 1 - \alpha \right) K_{\alpha } - \lambda \Bigr ) \beta _{k} - \lambda \left( \alpha - 2 - \varepsilon \right) \beta _{k} \\ =&\Bigl ( \left( \lambda + 1 - \alpha \right) K_{\alpha } - \lambda \left( \alpha - 1 - \varepsilon \right) \Bigr ) \beta _{k} , \end{aligned}$$

and thus

$$\begin{aligned} \dfrac{\Delta _{k}}{s^{2} \beta _{k}^{2}}&{:=} \dfrac{4}{\beta _{k}^{2}} \Bigl ( \Bigl ( \left( \lambda + 1 - \alpha \right) K_{\alpha } - \lambda \Bigr ) \beta _{k} - \lambda k \left( \beta _{k} - \beta _{k-1} \right) \Bigr ) ^{2} \\&\quad - \left( \lambda + 1 - \alpha \right) \left( C- \varepsilon K_{\alpha } \right) K_{\alpha } \\&\le \ 4 \Bigl ( \left( \lambda + 1 - \alpha \right) K_{\alpha } - \lambda \left( \alpha - 1 - \varepsilon \right) \Bigr ) ^{2} - \left( \lambda + 1 - \alpha \right) \left( C- \varepsilon K_{\alpha } \right) K_{\alpha } \\&= \ 4 \left( \lambda + 1 - \alpha \right) ^{2} K_{\alpha }^{2} - 8 \lambda \left( \lambda + 1 - \alpha \right) \left( \alpha - 1 - \varepsilon \right) K_{\alpha } + 4 \lambda ^{2} \left( \alpha - 1 - \varepsilon \right) ^{2} \\&\quad - \left( \lambda + 1 - \alpha \right) CK_{\alpha } + \varepsilon \left( \lambda + 1 - \alpha \right) K_{\alpha }^{2} \\&= \left( \lambda + 1 - \alpha \right) \!\Bigl ( 4 \left( \lambda + 1 - \alpha \right) + \varepsilon \Bigr ) K_{\alpha }^{2} \\&\quad - \left( \lambda + 1 - \alpha \right) \Bigl ( 8 \lambda \left( \alpha - 1 - \varepsilon \right) + C\Bigr ) K_{\alpha } + 4 \lambda ^{2} \left( \alpha - 1 - \varepsilon \right) ^{2} . \end{aligned}$$

Since $\alpha - 1 - \frac{\varepsilon }{4}< \lambda < \alpha - 1$, we have $\left( \lambda + 1 - \alpha \right) \Bigl ( 4 \left( \lambda + 1 - \alpha \right) + \varepsilon \Bigr ) < 0$, hence for sufficiently large $k \ge 0$ it holds $\Delta _k \le 0$ and, consequently, $R_k \ge 0$.

From 39, we deduce that $\left( k + 2 - \alpha \right) \beta _{k} - k \beta _{k-1} \le - \varepsilon \beta _{k}$ for every $k \ge 1$. Hence, for every $\alpha - 1 - \frac{\varepsilon }{4}< \lambda < \alpha - 1$, from Lemma 9 and 46 we have that for sufficiently large k it holds

$$\begin{aligned}&{\mathcal {E}}_{\lambda }^{k+1} - {\mathcal {E}}_{\lambda }^{k}\\&\quad \le - \varepsilon 2 \lambda s \beta _{k} \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle + \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 \right) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \\&\quad \quad - 2sk \left( k + \alpha \right) \beta _{k-1} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \\&\quad \quad + \dfrac{1}{4} \Bigl ( C- \varepsilon \left( 2k + \alpha + 1 \right) \Bigr ) s^{2} \beta _{k}^{2} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \\&\quad \quad - \dfrac{1}{2} s^{2} k \Bigl ( \left( k + \alpha \right) \beta _{k} + k \beta _{k-1} \Bigr ) \beta _{k-1} \left\Vert V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\Vert ^{2}, \end{aligned}$$

which means the sequence $\left\{ {\mathcal {E}}_{\lambda }^{k} \right\} _{k \ge 1}$ is nonincreasing for sufficiently large k, thus it is convergent and the boundedness of $\left( z^{k} \right) _{k \ge 0}$ and the convergence rates follow from the definition of ${\mathcal {E}}_{\lambda }^{k}$ and 40. The remaining assertions follow from Lemma A.6. $\square $

Next we prove the weak convergence of the generated sequence of iterates.

Theorem 11

Let $z_* \in {{\mathcal {Z}}}$ and $\left( z^{k} \right) _{k \ge 0}$ be the sequence generated by Algorithm 1 for $\left( \beta _{k} \right) _{k \ge 0}$ a positive and nondecreasing sequence which satisfies (41). Then, the sequence $\left( z^{k} \right) _{k \ge 0}$ converges weakly to a solution of 1.

Proof

Let $0 \le \varepsilon < \alpha -2$ such that 39 is satisfied and $0< \alpha - 1 - \frac{\varepsilon }{4}< \lambda _{1}< \lambda _{2} < \alpha - 1$. For every $k \ge 1$, we set

$$\begin{aligned} p_{k}&:= \dfrac{1}{2} \left( \alpha - 1 \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} + k \left\langle z^{k} - z_{*} , z^{k} - z^{k-1} + s \beta _{k-1} V \left( z^{k} \right) \right\rangle , \end{aligned}$$

(47)

$$\begin{aligned} q_{k}&:= \dfrac{1}{2} \left\Vert z^{k} - z_{*} \right\Vert ^{2} + s \displaystyle \sum \limits _{i = 1}^{k} \beta _{i-1} \left\langle z^{i} - z_{*} , V \left( z^{i} \right) \right\rangle , \end{aligned}$$

(48)

and notice that

$$\begin{aligned} \left( \alpha - 1 \right) q_{k} + k \left( q_{k} - q_{k-1} \right)= & {} p_{k} + \left( \alpha - 1 \right) s \displaystyle \sum \limits _{i = 1}^{k+1} \beta _{i-1} \left\langle z^{i} - z_{*}, V \left( z^{i} \right) \right\rangle \\{} & {} - \dfrac{k}{2} \left\Vert z^{k} - z^{k-1} \right\Vert ^{2}. \end{aligned}$$

We have that

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } p_{k} = \lim _{k \rightarrow + \infty } \frac{1}{4(\lambda _2-\lambda _1)}\left( {\mathcal {E}}_{\lambda _{2}}^{k} - {\mathcal {E}}_{\lambda _{1}}^{k} \right) \in {\mathbb {R}}\text { exists} \end{aligned}$$

(49)

and, thanks to (45), that the limit $\lim _{k \rightarrow + \infty } \sum _{i = 1}^{k+1} \beta _{i-1} \left\langle z^{i} - z_{*}, V \left( z^{i} \right) \right\rangle \in {\mathbb {R}}$ exists and

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } k \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} = 0 . \end{aligned}$$

Consequently,

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } \left( \left( \alpha - 1 \right) q_{k} + k \left( q_{k} - q_{k-1} \right) \right) \in {\mathbb {R}}\text { exists} . \end{aligned}$$

From Theorem 10, we deduce that $\left( q_{k} \right) _{k \ge 1}$ is bounded. This allows us to apply Lemma A.5 and to conclude from here that $\lim _{k \rightarrow + \infty } q_{k} \in {\mathbb {R}}$ also exists. Once again, by the definition of $q_{k}$ and the fact that the sequence $\left( \sum _{i = 1}^{k} \beta _{i-1} \left\langle z^{i} - z_{*}, V \left( z^{i} \right) \right\rangle \right) _{k \ge 1}$ converges, it follows that $\lim _{k \rightarrow + \infty } \left\Vert z_{k} - z_{*} \right\Vert \in {\mathbb {R}}$ exists. In other words, the hypothesis (i) in Opial’s Lemma (see Lemma A.3) is fulfilled.

Now let $\bar{z}$ be a weak sequential cluster point of $\left( z^{k} \right) _{k \ge 0}$, meaning that there exists a subsequence $\left( z^{k_{n}} \right) _{n \ge 0}$ such that

$$\begin{aligned} z^{k_{n}} \rightharpoonup \bar{z} \text { as } n \rightarrow + \infty . \end{aligned}$$

From Theorem 10, we have

$$\begin{aligned} V \left( z^{k_{n}} \right) \rightarrow 0 \text { as } n \rightarrow + \infty . \end{aligned}$$

Since V monotone and continuous, it is maximally monotone [16, Corollary 20.28]. Therefore, the graph of V is sequentially closed in ${\mathcal {H}}^{\text {weak}} \times {\mathcal {H}}^{\text {strong}}$, which gives that $V(\bar{z}) =0$, thus $\bar{z} \in {\mathcal {Z}}$. This shows that hypothesis (ii) of Opial’s Lemma is also fulfilled, and completes the proof. $\square $

We close the section with a result which improves the convergence rates derived in Theorem 10 for the implicit algorithm.

Theorem 12

Let $z_* \in {{\mathcal {Z}}}$ and $\left( z^{k} \right) _{k \ge 0}$ be the sequence generated by Algorithm 1 for $\left( \beta _{k} \right) _{k \ge 0}$ a positive and nondecreasing sequence which satisfies (41). Then, it holds

$$\begin{aligned} \left\Vert z^{k} - z^{k-1} \right\Vert = o \left( \dfrac{1}{k} \right) \text { as } k \rightarrow + \infty \end{aligned}$$

and

$$\begin{aligned} \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle = o \left( \dfrac{1}{k \beta _{k}} \right) \ \text{ and } \ \left\Vert V \left( z^{k} \right) \right\Vert = o \left( \dfrac{1}{k \beta _{k}} \right) \text { as } k \rightarrow + \infty . \end{aligned}$$

Proof

Let $0 \le \varepsilon < \alpha -2$ such that 39 is satisfied and $0< \alpha - 1 - \frac{\varepsilon }{4}< \lambda < \alpha - 1$. In the view of 47, the discrete energy sequence can be written as

$$\begin{aligned} {\mathcal {E}}_{\lambda }^{k}= & {} 4 \lambda p_{k} + \dfrac{1}{2} k^{2} \left\Vert 2 \left( z^{k} - z^{k-1} \right) + s \beta _{k-1} V \left( z^{k} \right) \right\Vert ^{2} \\{} & {} \quad + \dfrac{1}{2} s^{2} \left( k + \alpha \right) k \beta _{k} \beta _{k-1} \left\Vert V \left( z^{k} \right) \right\Vert ^{2} \quad \forall k \ge 1. \end{aligned}$$

According to Theorem 10, we have

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } k \beta _{k} \beta _{k-1} \left\Vert V \left( z^{k} \right) \right\Vert ^{2} = 0 . \end{aligned}$$

This statement together with the fact that the limits $\lim _{k \rightarrow + \infty } {\mathcal {E}}_{\lambda }^{k} \in {\mathbb {R}}$ and $\lim _{k \rightarrow + \infty } p_{k} \in {\mathbb {R}}$ (according to 49) exist, allows us to deduce that for the sequence

$$h_k:= \dfrac{k^{2}}{2} \left( \left\Vert 2 \left( z^{k} - z^{k-1} \right) + s \beta _{k-1} V \left( z^{k} \right) \right\Vert ^{2} + s^{2} \beta _{k} \beta _{k-1} \left\Vert V \left( z^{k} \right) \right\Vert ^{2} \right) \quad \forall k \ge 1,$$

the limit

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } h_{k} \in [0,+\infty ) \text { exists}. \end{aligned}$$

Furthermore, by taking into consideration the relation 40, Theorem 10 also guarantees that

$$\begin{aligned} \displaystyle \sum \limits _{k \ge \lceil \alpha \rceil } \dfrac{1}{k} h_{k}&\le 2 \displaystyle \sum \limits _{k \ge \lceil \alpha \rceil } k \left\Vert z^{k} - z^{k-1} \right\Vert ^{2} + s^{2} \displaystyle \sum \limits _{k \ge \lceil \alpha \rceil } k \left( \beta _{k-1} + \dfrac{\beta _{k}}{2} \right) \beta _{k-1} \left\Vert V \left( z^{k} \right) \right\Vert ^{2} \\&\le 2 \displaystyle \sum \limits _{k \ge \lceil \alpha \rceil } k \left\Vert z^{k} - z^{k-1} \right\Vert ^{2} \\&\quad + s^{2} \left( 1 + \dfrac{\alpha }{2 \left( 2 + \varepsilon \right) } \right) \displaystyle \sum \limits _{k \ge 1} k \beta _{k-1}^{2} \left\Vert V \left( z^{k} \right) \right\Vert ^{2} < + \infty . \end{aligned}$$

From here we conclude that $\lim _{k \rightarrow + \infty } h_{k} = 0$, and since $h_{k}$ is a sum of two nonnegative terms and, since $\left( \beta _{k} \right) _{k \ge 0}$ is nondecreasing, we further deduce

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } k \left\Vert 2 \left( z^{k} - z^{k-1} \right) + s \beta _{k-1} V \left( z^{k} \right) \right\Vert= & {} \lim \limits _{k \rightarrow + \infty } k \sqrt{\beta _{k} \beta _{k-1}} \left\Vert V \left( z^{k} \right) \right\Vert \\= & {} \lim \limits _{k \rightarrow + \infty } k \beta _{k-1} \left\Vert V \left( z^{k} \right) \right\Vert = 0. \end{aligned}$$

Using once again 40, we obtain

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } k \beta _{k} \left\Vert V \left( z^{k} \right) \right\Vert = 0 . \end{aligned}$$

Since $\left( z_{k} \right) _{k \ge 0}$ is bounded, we use the Cauchy–Schwarz inequality to derive

$$\begin{aligned} 0 \le \lim \limits _{k \rightarrow + \infty } k \beta _{k} \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle \le \lim \limits _{k \rightarrow + \infty } k \beta _{k} \left\Vert z^{k} - z_{*} \right\Vert \left\Vert V \left( z^{k} \right) \right\Vert = 0 , \end{aligned}$$

and the proof is complete. $\square $

4 An Explicit Algorithm

In this section, additional to its monotonicity, we will assume that the operator V is L-Lipschitz continuous, with $L>0$. We propose and investigate an explicit numerical algorithm for solving 1, which follows from a temporal discretization of the dynamical system 12.

The starting point is again its reformulation 36. We fix a time step $s > 0$, set $\tau _{k}:= s \left( k + 1 \right) $ for every $k \ge 1$, and approximate $z \left( \tau _{k} \right) \approx z^{k+1}$ and $u \left( \tau _{k} \right) \approx u^{k+1}$. In addition, we choose $\beta \left( \tau _{k} \right) = 1$ for every $k \ge 1$ and refer to Remark 2 for the explanation of why the time scaling parameter function $\beta $ is discretized via a constant sequence. The finite-difference scheme for 36 at time $t:=\tau _k$ gives for every $k \ge 0$

$$\begin{aligned} {\left\{ \begin{array}{ll} \dfrac{u^{k+1} - u^{k}}{s} &{} = \left( 2 - \alpha \right) V \left( \bar{z}^{k} \right) \\ u^{k+1} &{} = 2 \left( \alpha - 1 \right) z^{k+1} + 2 \left( k + 1 \right) \left( z^{k+1} - z^{k} \right) + 2s \left( k + 1 \right) V \left( \bar{z}^{k} \right) \end{array}\right. }. \end{aligned}$$

(50)

Therefore, we have for every $k \ge 1$

$$\begin{aligned} u^{k} = 2 \left( \alpha - 1 \right) z^{k} + 2k \left( z^{k} - z^{k-1} \right) + 2sk V \left( \bar{z}^{k-1} \right) , \end{aligned}$$

(51)

and after substraction we get

$$\begin{aligned} u^{k+1} - u^{k}&= 2 \left( k + \alpha \right) \left( z^{k+1} - z^{k} \right) - 2k \left( z^{k} - z^{k-1} \right) + 2s V \left( \bar{z}^{k} \right) \nonumber \\&\quad + 2sk \left( V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right) \nonumber \\&= \left( 2 - \alpha \right) s V \left( \bar{z}^{k} \right) , \end{aligned}$$

(52)

where the last relation comes from the first equation in 50.

On the other hand, the second equation in 50 can be rewritten for every $k \ge 0$ as

$$\begin{aligned} z^{k+1} = \dfrac{1}{2 \left( k + \alpha \right) } u^{k+1} + \dfrac{k+1}{k + \alpha } \left( z^{k} - s V \left( \bar{z}^{k} \right) \right) . \end{aligned}$$

(53)

To get an explicit choice for $\bar{z}^{k}$, we opt for

$$\begin{aligned} \bar{z}^{k} := \dfrac{1}{2 \left( k + \alpha \right) } u^{k} + \dfrac{k+1}{k + \alpha } \left( z^{k} - \dfrac{sk}{k+1} V \left( \bar{z}^{k-1} \right) \right) - \dfrac{\alpha s}{2 \left( k + \alpha \right) } V \left( \bar{z}^{k-1} \right) \quad \forall k \ge 1. \end{aligned}$$

(54)

From here, 51 gives for all $k \ge 1$

$$\begin{aligned} \bar{z}^{k}&= z^{k} + \dfrac{k}{k + \alpha } \left( z^{k} - z^{k-1} \right) - \dfrac{\alpha s}{2 \left( k + \alpha \right) } V \left( \bar{z}^{k-1} \right) \\&= z^{k} + \left( 1 - \dfrac{\alpha }{k + \alpha } \right) \left( z^{k} - z^{k-1} \right) - \dfrac{\alpha s}{2 \left( k + \alpha \right) } V \left( \bar{z}^{k-1} \right) , \end{aligned}$$

thus, by subtracting 54 from 53, we obtain

$$\begin{aligned} z^{k+1} - \bar{z}^{k}&= \dfrac{1}{2 \left( k + \alpha \right) } \left( u^{k+1} - u^{k} \right) - \dfrac{s \left( k+1 \right) }{k + \alpha } V \left( \bar{z}^{k} \right) + \dfrac{sk}{k + \alpha } V \left( \bar{z}^{k-1} \right) \nonumber \\&\quad - \dfrac{\alpha s}{2 \left( k + \alpha \right) } V \left( \bar{z}^{k-1} \right) \nonumber \\&= - \dfrac{\alpha s}{2 \left( k + \alpha \right) } V \left( \bar{z}^{k} \right) - \dfrac{sk}{k + \alpha } \left( V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right) \nonumber \\&\quad + \dfrac{\alpha s}{2 \left( k + \alpha \right) } V \left( \bar{z}^{k-1} \right) \nonumber \\&= - \dfrac{s}{2} \left( 1 + \dfrac{k}{k + \alpha } \right) \left( V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right) . \end{aligned}$$

(55)

This gives the following important estimate, which holds for every $s >0$ such that $sL\le 1 $ and every $k \ge 1$

$$\begin{aligned} \left\Vert V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\Vert\le & {} L \left\Vert z^{k+1} - \bar{z}^{k} \right\Vert \le sL \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert \nonumber \\\le & {} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert . \end{aligned}$$

(56)

Now we can formally state our explicit numerical algorithm.

For $z_* \in {{\mathcal {Z}}}$, $0 \le \lambda \le \alpha - 1$ and $z_* \in {{\mathcal {Z}}}$ $0< \gamma < 2$, we define first in analogy to the implicit case the discrete energy function for every $k \ge 1$ by

$$\begin{aligned} {\mathcal {E}}_{\lambda }^{k}&{:=} \ \dfrac{1}{2} \left\Vert u^{k}_{\lambda } \right\Vert ^{2} + 2 \lambda \left( \alpha - 1 - \lambda \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} + 2 \left( 2 - \gamma \right) \lambda sk \left\langle z^{k} - z_{*} , V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad + \dfrac{1}{2} \left( 2 - \gamma \right) s^{2} k \left( \gamma k + \alpha \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}, \end{aligned}$$

(57)

where

$$\begin{aligned} u^{k}_{\lambda } := 2 \lambda \left( z^{k} - z_{*} \right) + 2k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) . \end{aligned}$$

(58)

In strong contrast to the implicit case, the discrete energy sequence $({\mathcal {E}}_{\lambda }^{k})_{k \ge 1}$ might not dissipate with every iteration of the algorithm and be even negative. This is the reason why we consider instead the following regularized sequence of the energy function, defined for every $k \ge 2$ as

$$\begin{aligned} {\mathcal {F}}_{\lambda }^{k}&{:=} \ {\mathcal {E}}_{\lambda }^{k} - 2 \left( 2 - \gamma \right) sk^{2} \left\langle z^{k} - z^{k-1} , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad + \dfrac{1}{2} \left( 2 - \gamma \right) s^{2} k \sqrt{k} \left( 2sL \sqrt{k} + \alpha \right) \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} \nonumber \\&\quad - \dfrac{1}{2} \lambda \left( \alpha - 2 \right) s^{2} \left( 2 - \frac{\alpha }{k + \alpha } \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} . \end{aligned}$$

(59)

Its properties are collected in the following lemma, the proof of which is deferred to the Appendix.

Lemma 13

Let $z_* \in {{\mathcal {Z}}}$ and $\left( z^{k} \right) _{k \ge 0}$ be the sequence generated by Algorithm 2 for $0< \gamma < 2$ and $0 \le \lambda \le \alpha - 1$. Then, the following statements are true:

(i)
for every $k \ge k_{0}:= \max \left\{ 2, \lceil \frac{1}{\alpha - 2} \rceil \right\} $ it holds
$$\begin{aligned} {\mathcal {F}}_{\lambda }^{k+1} - {\mathcal {F}}_{\lambda }^{k}&\le \ 2 \lambda \left( 2 - \alpha \right) s \left\langle \bar{z}^{k} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle - \dfrac{1}{2} s^{2} \mu _{k} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&\quad + 2 \Bigl ( \omega _{2} k + \omega _{3} \sqrt{k} \Bigr ) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2}\\&\quad + 2s \Bigl ( \omega _{0} k + \omega _{1} \Bigr ) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle \\&\quad + \dfrac{1}{2} s^{2} \Bigl ( \omega _{4} k + \omega _{5} \sqrt{k} \Bigr ) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2}, \end{aligned}$$
where
$$\begin{aligned} \mu _{k}&:= \left( 2 - \gamma \right) \left( 2 \left( 1 - 2sL \right) \left( k + 1 \right) + \alpha ^{2} \sqrt{k+1} + \alpha - 4 \right) \left( k+1 \right) \nonumber \\&\quad - \left( 2 - \gamma \right) \left( \alpha - 2 \right) - 2 \lambda \left( \alpha - 2 \right) , \end{aligned}$$
(60a)
$$\begin{aligned} \omega _{0}&:= \left( 2 - \gamma \right) \lambda + \gamma - \alpha + \gamma \left( \lambda + 1 - \alpha \right) , \end{aligned}$$
(60b)
$$\begin{aligned} \omega _{1}&:= \gamma - \alpha + \alpha \left( \lambda + 1 - \alpha \right) < 0 , \end{aligned}$$
(60c)
$$\begin{aligned} \omega _{2}&:= 2 \left( \lambda + 1 - \alpha \right) \le 0 , \end{aligned}$$
(60d)
$$\begin{aligned} \omega _{3}&:= \left( 2 - \gamma \right) \sqrt{\alpha - 2} > 0 , \end{aligned}$$
(60e)
$$\begin{aligned} \omega _{4}&:= 2 \gamma \left( 2 - \alpha \right) < 0 , \end{aligned}$$
(60f)
$$\begin{aligned} \omega _{5}&:= \left( 2 - \gamma \right) \alpha > 0 . \end{aligned}$$
(60g)
(ii)
if $1< \gamma < 2$, then for every $k \ge k_{1}:= \lceil \frac{2 \lambda \left( \alpha - 2 \right) }{\left( 2 - \gamma \right) \alpha } \rceil $ it holds
$$\begin{aligned} {\mathcal {F}}_{\lambda }^{k}&\ge \ \dfrac{2 - \gamma }{\gamma } \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\quad + \dfrac{\left( 2 - \gamma \right) ^{2}}{2 \gamma } k^{2} \left\Vert z^{k} - z^{k-1} \right\Vert ^{2} + 2 \lambda \left( \alpha - 1 - \dfrac{2 \lambda }{\gamma } \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} . \end{aligned}$$
(61)

In Lemma 13, we have two degrees of freedom in the choice of the parameters $\gamma $ and $\lambda $. The next result proves that there are choices for these parameters for which the discrete energy starts to dissipate with every iteration after a finite number of iterations and in the same time it is bounded from below by a nonnegative term. These two statements are fundamental in the derivation of the convergence rates and finally in the proof of the convergence of the iterates. The proof of Lemma 14 can be found in the Appendix.

Lemma 14

The following statements are true:

(i)
if $\gamma $ and $\delta $ are such that
$$\begin{aligned} 1 + \dfrac{1}{\alpha - 1}< \gamma < 2 , \end{aligned}$$
(62)
and
$$\begin{aligned} \max \left\{ \sqrt{2 \left( 1 - \dfrac{1}{\gamma } \right) } , \sqrt{\dfrac{\left( 2 - \gamma \right) \left( \alpha - 1 \right) + \left( \gamma - 1 \right) \left( \alpha - 2 \right) }{\gamma \left( \alpha - 2 \right) }} \right\}< \delta < 1, \end{aligned}$$
(63)
then there exist two parameters
$$\begin{aligned} 0 \le {\underline{\lambda }} \left( \alpha , \gamma \right) < {\overline{\lambda }} \left( \alpha , \gamma \right) \le \dfrac{\gamma }{2} \left( \alpha - 1 \right) , \end{aligned}$$
(64)
such that for every $\lambda $ satisfying ${\underline{\lambda }} \left( \alpha , \gamma \right)< \lambda < {\overline{\lambda }} \left( \alpha , \gamma \right) $ one can find an integer $k_{2} \left( \lambda \right) \ge 1$ with the property that the following inequality holds for every $k \ge k_{2} \left( \lambda \right) $
$$\begin{aligned} R_{k}&:= 2 \delta \Bigl ( \omega _{2} k + \omega _{3} \sqrt{k} \Bigr ) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} + 2s \Bigl ( \omega _{0} k + \omega _{1} \Bigr ) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad + \dfrac{\delta }{2} s^{2} \Bigl ( \omega _{4} k + \omega _{5} \sqrt{k} \Bigr ) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} \le 0 ; \end{aligned}$$
(65)
(ii)
there exists a positive integer $k_3$ such that for every $k \ge k_{3}$ it holds
$$\begin{aligned} \mu _{k} \ge \left( 2 - \gamma \right) \left( 1 - 2sL \right) \left( k+1 \right) ^{2} . \end{aligned}$$
(66)

Now we are in position to provide first convergence rates statements for Algorithm 2.

Theorem 15

Let $z_* \in {{\mathcal {Z}}}$ and $\left( z^{k} \right) _{k \ge 0}$ be the sequence generated by Algorithm 2. Then, the following statements are true:

(i)
it holds
$$\begin{aligned} \displaystyle \sum \limits _{k \ge 1} \left\langle \bar{z}^{k} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle&< + \infty , \end{aligned}$$
(67a)
$$\begin{aligned} \displaystyle \sum \limits _{k \ge 1} k^{2} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}&< + \infty , \end{aligned}$$
(67b)
$$\begin{aligned} \displaystyle \sum \limits _{k \ge 1} k \left\Vert z^{k+1} - z^{k} \right\Vert ^{2}&< + \infty , \end{aligned}$$
(67c)
$$\begin{aligned} \displaystyle \sum \limits _{k \ge 1} k \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2}&< + \infty ; \end{aligned}$$
(67d)
(ii)
the sequence $\left( z^{k} \right) _{k \ge 0}$ is bounded and it holds
$$\begin{aligned}&\left\Vert z^{k} - z^{k-1} \right\Vert = {\mathcal {O}}\left( \dfrac{1}{k} \right) , \quad \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle = {\mathcal {O}}\left( \dfrac{1}{k} \right) , \\&\left\Vert V \left( z^{k} \right) \right\Vert = {\mathcal {O}}\left( \dfrac{1}{k} \right) ,\quad \left\Vert V \left( \bar{z}^{k} \right) \right\Vert = {\mathcal {O}}\left( \dfrac{1}{k} \right) \text { as } k \rightarrow + \infty ; \end{aligned}$$
(iii)
if $1 + \dfrac{1}{\alpha - 1}< \gamma < 2$, then there exist $0 \le {\underline{\lambda }} \left( \alpha , \gamma \right) < {\overline{\lambda }} \left( \alpha , \gamma \right) \le \frac{\gamma }{2} \left( \alpha - 1 \right) $ such that for every ${\underline{\lambda }} \left( \alpha , \gamma \right)< \lambda < {\overline{\lambda }} \left( \alpha , \gamma \right) $ both sequences $\left( {\mathcal {E}}_{\lambda }^{k} \right) _{k \ge 1}$ and $\left( {\mathcal {F}}_{\lambda }^{k} \right) _{k \ge 2}$ converge.

Proof

Let $1 + \dfrac{1}{\alpha - 1}< \gamma < 2$ and $0< \delta < 1$ such that 63 holds. According to Lemma 14 there exist ${\underline{\lambda }} \left( \alpha , \gamma \right) < {\overline{\lambda }} \left( \alpha , \gamma \right) $ such that 64 holds. We choose ${\underline{\lambda }} \left( \alpha , \gamma \right)< \lambda < {\overline{\lambda }} \left( \alpha , \gamma \right) $ and get, according to the same result, an integer $k_2(\lambda ) \ge 1$ such that for every $k \ge k_2(\lambda )$ the inequality 65 holds. In addition, according to Lemma 14(ii), we get a positive integer $k_3$ such that 66 holds for every $k \ge k_3$.

This means that for every $k \ge k_{4} \left( \lambda \right) := \max \left\{ k_{0}, k_{2} \left( \lambda \right) , k_{3} \right\} $, where $k_0$ is the positive integer provided by Lemma 13(i), we have

$$\begin{aligned} {\mathcal {F}}_{\lambda }^{k+1} - {\mathcal {F}}_{\lambda }^{k}&\le \ 2 \lambda \left( 2 - \alpha \right) s \left\langle \bar{z}^{k} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \\&\quad - \dfrac{1}{2} \left( 2 - \gamma \right) \left( 1 - 2sL \right) s^{2} \left( k+1 \right) ^{2} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&\quad + 2 \left( 1 - \delta \right) \Bigl ( \omega _{2} k + \omega _{3} \sqrt{k} \Bigr ) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \\&\quad + \dfrac{1}{2} \left( 1 - \delta \right) s^{2} \Bigl ( \omega _{4} k + \omega _{5} \sqrt{k} \Bigr ) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} . \end{aligned}$$

Since $\omega _{2}< 0, \omega _{4} < 0$ and $\omega _{3}, \omega _{5} \ge 0$, we can choose $k_{5}:= \lceil \max \left\{ - \frac{2 \omega _{3}}{\omega _{2}}, - \frac{2 \omega _{5}}{\omega _{4}} \right\} \rceil > 0$, which then means that for every $k \ge k_6:=\max \left\{ k_{4} \left( \lambda \right) , k_{5} \right\} $

$$\begin{aligned} {\mathcal {F}}_{\lambda }^{k+1} - {\mathcal {F}}_{\lambda }^{k}&\le 2 \lambda \left( 2 - \alpha \right) s \left\langle \bar{z}^{k} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad - \dfrac{1}{2} \left( 2 - \gamma \right) s^{2} \left( 1 - 2sL \right) \left( k+1 \right) ^{2} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\quad + \left( 1 - \delta \right) \omega _{2} k \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} + \dfrac{1}{4} \left( 1 - \delta \right) s^{2} \omega _{4} k \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} . \end{aligned}$$

(68)

In view of 61 and by taking into account that $\lambda < \frac{\gamma }{2} \left( \alpha - 1 \right) $, we get that ${\mathcal {F}}_{\lambda }^{k} \ge 0$ starting from the index $k_{1}$, thus the sequence $\left( {\mathcal {F}}_{\lambda }^{k} \right) _{k \ge 2}$ is bounded from below. Under these premises, we can apply Lemma A.6 to 68, and obtain (i) as well as that the sequence $\left( {\mathcal {F}}_{\lambda }^{k} \right) _{k \ge 1}$ converges.

According to 68, we also have that $\left( {\mathcal {F}}_{\lambda }^{k} \right) _{k \ge k_{6}}$ is nonincreasing, which, according to 61, implies that following estimate holds for every $k \ge k_{6}$

$$\begin{aligned}&\dfrac{2 - \gamma }{\gamma } \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&\qquad + \dfrac{\left( 2 - \gamma \right) ^{2}}{2 \gamma } k^{2} \left\Vert z^{k} - z^{k-1} \right\Vert ^{2} + 2 \lambda \left( \alpha - 1 - \dfrac{2 \lambda }{\gamma } \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} \\&\quad \le {\mathcal {F}}_{\lambda }^{k} \le {\mathcal {F}}_{\lambda }^{k_{6}} < + \infty . \end{aligned}$$

This yields that the sequences $\left( 2 \lambda \left( z^{k} - z_{*} \right) + k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right) _{k \ge 1}$, $\left( k \left( z^{k} - z^{k-1} \right) \right) _{k \ge 1}$, and $\left( z^{k} \right) _{k \ge 0}$ are bounded. In particular, for every $k \ge k_{6}$ we have

$$\begin{aligned}&\left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert \le C_{0} := \sqrt{\dfrac{\gamma {\mathcal {F}}_{\lambda }^{k_{6}}}{2 - \gamma }}< + \infty , \\&k \left\Vert z^{k} - z^{k-1} \right\Vert \le C_{1} := \dfrac{\sqrt{2 \gamma {\mathcal {F}}_{\lambda }^{k_{6}}}}{2 - \gamma }< + \infty , \\&\left\Vert z^{k} - z_{*} \right\Vert \le C_{2} := \sqrt{\dfrac{\gamma {\mathcal {F}}_{\lambda }^{k_{6}}}{2 \lambda \left( \gamma \left( \alpha - 1 \right) - 2 \lambda \right) }} < + \infty . \end{aligned}$$

Using the triangle inequality, we deduce from here that for every $k \ge k_{6}$

(69)

where

$$\begin{aligned} C_{3} := \dfrac{1}{\gamma s} \left( C_{0} + C_{1} + 2 \lambda \left( \alpha , \gamma \right) C_{2} \right) > 0 . \end{aligned}$$

The statement 67b yields

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } k \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert = 0 \quad \Rightarrow \quad C_{4} := \sup _{k \ge 1} \left\{ k \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert \right\} < + \infty , \end{aligned}$$

(70)

which, together with 56 implies that for every $k \ge k_{6}$

$$\begin{aligned} \left\Vert V \left( z^{k+1} \right) \right\Vert&\le \left\Vert V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\Vert + \left\Vert V \left( \bar{z}^{k} \right) \right\Vert \nonumber \\&\le \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert + \left\Vert V \left( \bar{z}^{k} \right) \right\Vert \le \dfrac{C_{5}}{k} , \end{aligned}$$

(71)

where

$$\begin{aligned} C_{5} := C_{3} + C_{4} > 0 . \end{aligned}$$

The last assertion in (ii) follows from the Cauchy–Schwarz inequality and the boundedness of $\left( z^{k} \right) _{k \ge 0}$, namely, for for every $k \ge k_{6}$ it holds

$$\begin{aligned} 0 \le \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle \le \left\Vert z^{k} - z_{*} \right\Vert \left\Vert V \left( z^{k} \right) \right\Vert \le \dfrac{C_{2} C_{5}}{k-1} . \end{aligned}$$

To complete the proof of (iii), we are going to show that in fact

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } {\mathcal {E}}_{\lambda }^{k} = \lim \limits _{k \rightarrow + \infty } {\mathcal {F}}_{\lambda }^{k} \in {\mathbb {R}}. \end{aligned}$$

Indeed, we already have seen that

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } \left( k + 1 \right) \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert = \lim \limits _{k \rightarrow + \infty } \left\Vert V \left( \bar{z}^{k} \right) \right\Vert = 0 , \end{aligned}$$

which, by the Cauchy–Schwarz inequality and 56, yields

$$\begin{aligned} 0&\le \lim \limits _{k \rightarrow + \infty } k^{2} \left|\left\langle z^{k} - z^{k-1} , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \right|\\&\le C_{1} \lim \limits _{k \rightarrow + \infty } k \left\Vert V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert \\&\le C_{1} \lim \limits _{k \rightarrow + \infty } k \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert = 0 . \end{aligned}$$

From here we obtain the desired statement. $\square $

The following theorem addresses the convergence of the sequence of iterates to an element in ${{\mathcal {Z}}}$.

Theorem 16

Let $z_* \in {{\mathcal {Z}}}$ and $\left( z^{k} \right) _{k \ge 0}$ be the sequence generated by Algorithm 2. Then, the sequence $\left( z^{k} \right) _{k \ge 0}$ converges weakly to a solution of 1.

Proof

Let $1 + \dfrac{1}{\alpha - 1}< \gamma < 2$ and ${\underline{\lambda }} \left( \alpha , \gamma \right) < {\overline{\lambda }} \left( \alpha , \gamma \right) $ be the parameters provided by Lemma 14 such that 64 holds and with the property that for every ${\underline{\lambda }} \left( \alpha , \gamma \right)< \lambda < {\overline{\lambda }} \left( \alpha , \gamma \right) $ there exists an integer $k_2(\lambda ) \ge 1$ such that for every $k \ge k_2(\lambda )$ the inequality 65 holds. The proof relies on Opial’s Lemma and follows the line of the proof of Theorem 11, by defining this time for every $k \ge 1$

$$\begin{aligned} p_{k}&:= \dfrac{1}{2} \left( \alpha - 1 \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} + k \left\langle z^{k} - z_{*} , z^{k} - z^{k-1} + sV \left( \bar{z}^{k-1} \right) \right\rangle , \end{aligned}$$

(72)

$$\begin{aligned} q_{k}&:= \dfrac{1}{2} \left\Vert z^{k} - z_{*} \right\Vert ^{2} + s \displaystyle \sum \limits _{i = 1}^{k} \left\langle z^{i} - z_{*} , V \left( \bar{z}^{i-1} \right) \right\rangle . \end{aligned}$$

(73)

One can notice that the limit

$$\begin{aligned} \lim _{k \rightarrow + \infty } \sum _{i = 1}^{k} \left\langle z^{i} - z_{*}, V \left( \bar{z}^{i-1} \right) \right\rangle= & {} \lim _{k \rightarrow + \infty } \sum _{i = 1}^{k} \left\langle z^{i} - \bar{z}^{i-1}, V \left( \bar{z}^{i-1} \right) \right\rangle \\{} & {} \quad + \lim _{k \rightarrow + \infty } \sum _{i = 1}^{k} \left\langle \bar{z}^{i-1} - z_{*}, V \left( \bar{z}^{i-1} \right) \right\rangle \in {\mathbb {R}}\end{aligned}$$

exists due to 67a and the fact that the series $\sum _{k \ge 2} \left\langle z^{k} - \bar{z}^{k-1}, V \left( \bar{z}^{k-1} \right) \right\rangle $ is absolutely convergent, which follows from

$$\begin{aligned} \displaystyle \sum \limits _{i = 2}^{k} \left|\left\langle z^{i} - \bar{z}^{i-1} , V \left( \bar{z}^{i-1} \right) \right\rangle \right|&\le \displaystyle \sum \limits _{i = 2}^{k} \left\Vert V \left( \bar{z}^{i-1} \right) \right\Vert \left\Vert z^{i} - \bar{z}^{i-1} \right\Vert \\&\le sC_{3} C_{4} \displaystyle \sum \limits _{i = 2}^{\infty } \dfrac{1}{i \left( i-1 \right) } < + \infty \quad \forall k \ge 2, \end{aligned}$$

where we make use of 56, 70, and 69, and of the constants $C_3, C_4$ defined in the proof of Theorem 15. $\square $

As for the implicit algorithm, we can improve also for the explicit algorithm the convergence rates.

Theorem 17

Let $z_* \in {{\mathcal {Z}}}$ and $\left( z^{k} \right) _{k \ge 0}$ be the sequence generated by Algorithm 2. Then, it holds

$$\begin{aligned}&\left\Vert z^{k} - z^{k-1} \right\Vert = o \left( \dfrac{1}{k} \right) , \quad \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle = o \left( \dfrac{1}{k} \right) , \\&\left\Vert V \left( z^{k} \right) \right\Vert = o \left( \dfrac{1}{k} \right) ,\quad \left\Vert V \left( \bar{z}^{k} \right) \right\Vert = o \left( \dfrac{1}{k} \right) \text { as } k \rightarrow + \infty . \end{aligned}$$

Proof

Let $1 + \dfrac{1}{\alpha - 1}< \gamma < 2$ and ${\underline{\lambda }} \left( \alpha , \gamma \right) < {\overline{\lambda }} \left( \alpha , \gamma \right) $ be the parameters provided by Lemma 14 such that 64 holds and with the property that for every ${\underline{\lambda }} \left( \alpha , \gamma \right)< \lambda < {\overline{\lambda }} \left( \alpha , \gamma \right) $ there exists an integer $k_2(\lambda ) \ge 1$ such that for every $k \ge k_2(\lambda )$ the inequality 65 holds.

We fix ${\underline{\lambda }} \left( \alpha , \gamma \right)< \lambda < {\overline{\lambda }} \left( \alpha , \gamma \right) $ and recall that according to Theorem 15(iii) the sequence $({\mathcal {E}}_{\lambda }^{k})_{k \ge 1}$ converges.

From 58 and 57, we have that for every $k \ge 1$

$$\begin{aligned} {\mathcal {E}}_{\lambda }^{k}&= \ \dfrac{1}{2} \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + 2k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&\quad + 2 \lambda \left( \alpha - 1 - \lambda \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} \\&\quad + 2 \left( 2 - \gamma \right) \lambda sk \left\langle z^{k} - z_{*} , V \left( \bar{z}^{k-1} \right) \right\rangle + \dfrac{1}{2} \left( 2 - \gamma \right) s^{2} k \left( \gamma k + \alpha \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&= \ 2 \lambda \left( \alpha - 1 \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} + 4 \lambda k \left\langle z^{k} - z_{*} , z^{k} - z^{k-1} + sV \left( \bar{z}^{k-1} \right) \right\rangle \\&\quad + \dfrac{1}{2} \left( 2 - \gamma \right) \alpha s^{2} k \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&\quad + \dfrac{k^{2}}{2} \left( \left\Vert 2 \left( z^{k} - z^{k-1} \right) + \gamma s V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} + \left( 2 - \gamma \right) \gamma s^{2} \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \right) . \end{aligned}$$

We set for every $k \ge 1$

$$\begin{aligned} h_{k} := \dfrac{k^{2}}{2} \left( \left\Vert 2 \left( z^{k} - z^{k-1} \right) + \gamma s V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} + \left( 2 - \gamma \right) \gamma s^{2} \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \right) , \end{aligned}$$

and notice that, in view of 72, we have

$$\begin{aligned} {\mathcal {E}}_{\lambda }^{k} = 4 \lambda p_{k} + \dfrac{1}{2} \left( 2 - \gamma \right) \alpha s^{2} k \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} + h_{k} . \end{aligned}$$

Theorem 15 asserts that

$$\begin{aligned} \lim \limits _{k \rightarrow \infty } k \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} = 0 , \end{aligned}$$

which, together with $\lim _{k \rightarrow + \infty } {\mathcal {E}}_{\lambda }^{k} \in {\mathbb {R}}$ and $\lim _{k \rightarrow + \infty } p_{k} \in {\mathbb {R}}$, yields

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } h_{k} \in {\mathbb {R}}\text { exists}. \end{aligned}$$

In addition, 67c and 67d in Theorem 15 guarantee that

$$\begin{aligned} \displaystyle \sum \limits _{k \ge 1} \dfrac{1}{k} h_{k} \le 4 \displaystyle \sum \limits _{k \ge 1} k \left\Vert z^{k} - z^{k-1} \right\Vert ^{2} + \dfrac{1}{2} \left( 2 + \gamma \right) \gamma s^{2} \displaystyle \sum \limits _{k \ge 1} k \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} < + \infty . \end{aligned}$$

Consequently, $\lim _{k \rightarrow + \infty } h_{k} = 0$, which yields

$$\begin{aligned} \lim \limits _{k \rightarrow \infty } k \left\Vert 2 \left( z^{k} - z^{k-1} \right) + \gamma s V \left( \bar{z}^{k-1} \right) \right\Vert = \lim \limits _{k \rightarrow \infty } k \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert = 0 . \end{aligned}$$

This immediately implies $\lim _{k \rightarrow + \infty } k\left\Vert z^{k} - z^{k-1} \right\Vert = 0$. The fact that $\lim _{k \rightarrow + \infty } k \left\Vert V \left( z^{k} \right) \right\Vert = 0$ follows from 70 and 71, since

$$\begin{aligned} 0 \le \lim \limits _{k \rightarrow + \infty } k \left\Vert V \left( z^{k} \right) \right\Vert \le \lim \limits _{k \rightarrow + \infty } k\left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert + \lim \limits _{k \rightarrow + \infty } k\left\Vert V \left( \bar{z}^{k} \right) \right\Vert = 0 . \end{aligned}$$

Finally, using the Cauchy–Schwarz inequality and the fact that $\left( z^{k} \right) _{k \ge 0}$ is bounded, we obtain that $\lim _{k \rightarrow + \infty } k \left\langle z^{k} - z_{*}, V \left( z^{k} \right) \right\rangle = 0$. $\square $

5 Numerical Experiments

In this section, we perform numerical experiments to illustrate the convergence rates derived for the explicit Fast OGDA method and to compare our algorithm with other numerical schemes from the literature designed to solve equations governed by a monotone and Lipschitz continuous operator. To this end, we consider a minmax problem studied in [39], which has then been used in [49] to illustrate the performances of anchoring-based numerical methods. This reads

$$\begin{aligned} \min \limits _{x \in {\mathbb {R}}^{n}} \max \limits _{y \in {\mathbb {R}}^{n}} {\mathcal {L}}\left( x , y \right) := \dfrac{1}{2} \left\langle x , Hx \right\rangle - \left\langle x , h \right\rangle - \left\langle y , Ax - b \right\rangle , \end{aligned}$$

(74)

where

We notice that ${\mathcal {L}}$ is nothing else than the Lagrangian of a linearly constrained quadratic minimization problem. It has been shown in [39] that $\left\Vert A \right\Vert \le \frac{1}{2}$, thus $\left\Vert H \right\Vert \le \frac{1}{2}$, and, consequently, for the monotone mapping $\left( x, y \right) \mapsto \Bigl ( \nabla _{x} {\mathcal {L}}\left( x, y \right) , - \nabla _{y} {\mathcal {L}}\left( x, y \right) \Bigr )$ we can take $L=1$ as Lipschitz constant.

In the following, we summarize all the algorithms we use in the numerical experiments and the corresponding step sizes:

(i)
OGDA: Optimistic Gradient Descent Ascent method 7 (see [42]) with $s:= \frac{0.48}{L}$;
(ii)
EG: Extragradient method 6 (see [2, 30]) with $s:= \frac{0.96}{L}$;
(iii)
EAG-V: Extra Anchored Gradient method 10 (see [49]) with variable step sizes $\left( s_{k} \right) _{k \ge 0}$ satisfying 11;
(iv)
Nesterov-EAG: Nesterov’s accelerated variant of the Extra Anchored Gradient method, which has been proposed in [47] and can be obtained from 10 be taking in the first update line the sequence $\left( \frac{k+1}{L \left( k+2 \right) } \right) _{k \ge 0}$ as step sizes, and in the second one the constant step size $\frac{1}{L}$ (see [47, Theorem 5.1, Lemma 5.1, Theorem 5.2]);
(v)
Halpern-OGDA: OGDA mixed with the Halpern anchoring scheme, which has been proposed in [48] and can be obtained from the variant of 10 with variable step sizes by replacing in the first update line $V \left( z^{k} \right) $ by $V \left( \bar{z}^{k-1} \right) $;
(vi)
Fast OGDA: our explicit algorithm with $s:= \frac{0.48}{L}$ and various choices of $\alpha $.

For the first numerical experiments, we consider the same setting as in [49], namely, we take $n=200$, which means that the underlying space is ${{\mathbb {R}}}^{400}$, and allow a maximum number of iterations of $5 \times 10^{5}$. Figure 1 presents the convergence behavior of the different methods when solving 74 in logarithmic scale. One can see that the anchoring-based methods perform better than the classical algorithms EG and OGDA, and that Nesterov-EAG performs better than Halpern-OGDA, which reconfirms a finding of [47] and is not surprising when one takes into account that the first allows for larger step sizes than EAG-V (and Halpnern-OGDA). On the other hand, Fast OGDA outperforms all the other methods in spite of the fact that the step size is restricted to $\left( 0, \frac{1}{2L}\right) $.

Figure 2 shows that the parameter $\alpha >2$ influences significantly the convergence behavior of the explicit Fast OGDA method. For this numerical experiment, we take $n=1000$, which means that the underlying space is ${{\mathbb {R}}}^{2000}$, and allow a maximum number of iterations of $5 \times 10^{5}$. The speed of convergence increases with increase in $\alpha $ and seems to be much better than $o \left( 1/k \right) $. Let us mention that the minimax problem 74 was constructed to show lower complexity bounds of first-order methods for convex–concave saddle point problems.

For Nesterov’s dynamical systems with $\frac{\alpha }{t}$ as damping coefficient and the corresponding numerical algorithms approaching the minimization of a smooth and convex function, it is known that $\alpha $ influences in the same way the convergence rates of the objective function values. Another intriguing similarity with Nesterov’s continuous and discrete schemes is the evident oscillatory behavior of the trajectories, however, there for the objective function values, while for explicit Fast OGDA for the norm of the operator along the trajectory/sequence of generated iterates. This suggests that Nesterov’s acceleration approach can improve the convergence behavior of continuous and discrete-time approaches beyond the optimization setting.

In the following we complement the comparative study of the above numerical methods by following the performance profile developed by Dolan and Moré [21]. We denote by ${\textsf {S}}$ the set of the algorithms/solvers (i)–(vi) from above, where for the Fast OGDA method, we take $\alpha := 3$. We solve minmax problems of the form 74 with $\mathcal{L}: {\mathbb {R}}^n \times {\mathbb {R}}^m \rightarrow {\mathbb {R}}$, for 10 different pairs $\left( n, m \right) $ such that $20 \le m \le n \le 200$ and, for each such pair, for 100 randomly generated sparse matrices $A \in {\mathbb {R}}^{m \times n}$ and vectors $b \in {\mathbb {R}}^{m}$ and $h \in {\mathbb {R}}^{n}$, and $H:=2A^TA$. For each pair $\left( n, m \right) $, we also take 10 initial points with normal distribution, which leads to a set of problems ${\textsf {P}}$ with ${\textsf {N}}_{{\textsf {p}}} = 10 \times 100 \times 10= 10000$ instances.

For each problem ${\textsf {p}} \in {\textsf {P}}$ and each solver ${\textsf {s}} \in {\textsf {S}}$, we denote by ${\textsf {t}}_{\textsf {p,s}}$ the number of iterations needed by solver ${\textsf {s}}$ to solve the problem ${\textsf {p}}$ successfully, i.e., by satsifying the following stopping criteria before $\texttt{k}_{\max }:= 10^{5}$ iterations

$$\begin{aligned} \dfrac{\left\Vert V \left( x_{k} , y_{k} \right) \right\Vert }{\left\Vert V \left( x_{0} , y_{0} \right) \right\Vert } \le \texttt{Tol}_{\texttt{op}}=10^{-6} \quad \text { and } \quad \dfrac{\left\Vert \left( x_{k} , y_{k} \right) - \left( x_{k-1} , y_{k-1} \right) \right\Vert }{\left\Vert \left( x_{k} , y_{k} \right) \right\Vert + 1} \le \texttt {Tol}_{\texttt {vec}} =10^{-5}. \end{aligned}$$

The two stopping criteria quantify the relative errors measured for the operator norm and the discrete velocity. We define the performance ratio as

$$\begin{aligned} {\textsf {r}}_{\textsf {p,s}} := {\left\{ \begin{array}{ll} \dfrac{{\textsf {t}}_{\textsf {p,s}}}{\min \left\{ {\textsf {t}}_{\textsf {p,s}} :{\textsf {s}} \in {\textsf {S}} \right\} } &{} \text { if } {\textsf {t}}_{\textsf {p,s}} < \texttt{k}_{\max }, \\ 0 &{} \text { otherwise, } \end{array}\right. } \end{aligned}$$

and the performance of the solver ${\textsf {s}}$ as

$$\begin{aligned} \rho _{{\textsf {s}}} \left( \tau \right) := \dfrac{1}{{\textsf {N}}_{{\textsf {p}}}} \text {size} \left\{ {\textsf {p}} \in {\textsf {P}} :0 < {\textsf {r}}_{\textsf {p,s}} \le \tau \right\} , \end{aligned}$$

where $\tau $ is a real factor. The performance $\rho _{{\textsf {s}}} \left( \tau \right) $ for solver ${\textsf {s}}$ gives the probability that the performance ratio ${\textsf {r}}_{\textsf {p,s}}$ is within a factor $\tau \in {\mathbb {R}}$ of the best possible ratio. Therefore, the value of $\rho _{{\textsf {s}}} \left( 1 \right) $ gives the probability that the solver ${\textsf {s}}$ gives the best numerical performance when compared to the others, while $\rho _{{\textsf {s}}} \left( \tau \right) $ for large values of $\tau $ measures its robustness.

Figure 3 represents the performance profiles of the six solvers. We observe that the Fast OGDA method is the most efficient, followed by EAG-V and Halpern-OGDA. We note that for $\tau \ge 3$ these three solvers are robust and solve $90\%$ of the problems, while Nesterov-EGA and EG solve for $\tau \ge 4$ $80\%$ of the problems.

References

B. Abbas, H. Attouch and B.F. Svaiter. Newton-like dynamics and forward–backward methods for structured monotone inclusions in Hilbert spaces. Journal of Optimization Theory and Applications 161(2):331–360, 2014
Article MathSciNet MATH Google Scholar
A. S. Antipin. On a method for convex programs using a symmetrical modification of the Lagrange function. Ekonomika i Matematicheskie Metody 12:1164–1173, 1976
Google Scholar
H. Attouch, R. I. Boţ and D.-K. Nguyen. Fast convex optimization via time scale and averaging of the steepest descent. arXiv:2208.08260, 2022
H. Attouch and A. Cabot. Convergence of a relaxed inertial forward–backward algorithm for structured monotone inclusions. Applied Mathematics & Optimization 80(3):547–598, 2019
Article MathSciNet MATH Google Scholar
H. Attouch, Z. Chbani, J. Fadili and H. Riahi. First-order optimization algorithms via inertial systems with Hessian driven damping. Mathematical Programming 193:113–155, 2022
Article MathSciNet MATH Google Scholar
H. Attouch, Z. Chbani, J. Fadili and H. Riahi. Fast convergence of dynamical ADMM via time scaling of damped inertial dynamics. Journal of Optimization Theory and Applications 193:704–736, 2022
Article MathSciNet MATH Google Scholar
H. Attouch, Z. Chbani, J. Fadili and H. Riahi. Convergence of iterates for first-order optimization algorithms with inertia and Hessian driven damping. Optimization 72(5):1199–1238, 2023
Article MathSciNet MATH Google Scholar
H. Attouch, Z. Chbani and H. Riahi. Fast proximal methods via time scaling of damped inertial dynamics. SIAM Journal on Optimization 29(3):2227–2256, 2019
Article MathSciNet MATH Google Scholar
H. Attouch and S. C. László. Continuous Newton-like inertial dynamics for monotone inclusions. Set-Valued and Variational Analysis 29(3):555–581, 2021
Article MathSciNet MATH Google Scholar
H. Attouch and S. C. László. Newton-like inertial dynamics and proximal algorithms governed by maximally monotone operators. SIAM Journal on Optimization 30(4):3252–3283, 2021
Article MathSciNet MATH Google Scholar
H. Attouch and J. Peypouquet. The rate of convergence of Nesterov’s accelerated forward-backward method is actually faster than $1/k^2$. SIAM Journal on Optimization 26(3), 1824–1834, 2016
Article MathSciNet MATH Google Scholar
H. Attouch and J. Peypouquet. Convergence of inertial dynamics and proximal algorithms governed by maximally monotone operators. Mathematical Programming 174(1-2):391–432, 2019
Article MathSciNet MATH Google Scholar
H. Attouch, J. Peypouquet and P. Redont. Fast convex optimization via inertial dynamics with Hessian driven damping. Journal of Differential Equations 261(10), 5734–5783, 2016
Article MathSciNet MATH Google Scholar
H. Attouch and B. F. Svaiter. A continuous dynamical Newton-like approach to solving monotone inclusions. SIAM Journal on Control and Optimization 49(2):574–598, 2011
Article MathSciNet MATH Google Scholar
J. B. Baillon and H. Brézis. Une remarque sur le comportement asymptotique des semigroupes non linyéaires. Houston Journal of Mathematics 2:5–7, 1976
MathSciNet MATH Google Scholar
H.H. Bauschke and P.L. Combettes. Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics, Springer, New York, 2017
R. I. Boţ and D.-K. Nguyen. Improved convergence rates and trajectory convergence for primal-dual dynamical systems with vanishing damping. Journal of Differential Equations 303: 369–406, 2021
Article MathSciNet MATH Google Scholar
A. Böhm, M. Sedlmayer, E. R. Csetnek and R. I. Boţ. Two steps at a time–taking GAN training in stride with Tseng’s method. SIAM Journal on Mathematics of Data Science 4(2):750–771, 2022
T. Chavdarova, M. I. Jordan and M. Zampetakis. Last-iterate convergence of saddle point optimizers via high-resolution differential equations. OPT2021: The 13th Annual Workshop on Optimization for Machine Learning paper 37, 2021
E. R. Csetnek, Y. Malitsky and M. K. Tam. Shadow Douglas–Rachford splitting for monotone inclusions. Applied Mathematics & Optimization 80:665–678, 2019
Article MathSciNet MATH Google Scholar
E. D. Dolan and J. J. Moré. Benchmarking optimization software with performance profiles. Mathematical Programming 91:201–213, 2002
Article MathSciNet MATH Google Scholar
N. Golowich, S. Pattathil and C. Daskalakis. Tight last-iterate convergence rates for no-regret learning in multi-player games. NeurIPS 2020: The 34th Conference on Neural Information Processing Systems, 2020
N. Golowich, S. Pattathil, C. Daskalakis and A. Ozdaglar. Last iterate is slower than averaged iterate in smooth convex-concave saddle point problems. COLT2020: The 33rd Conference on Learning Theory, 1758–1784, 2020
I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville and Y. Bengio. Generative adversarial networks. NeurIPS 2014: Advances in Neural Information Processing Systems 27:2672–2680, 2014
E. Gorbunov. N. Loizou and G. Gidel. Extragradient method: ${{\cal O\it }} ( 1/k )$ last-iterate convergence for monotone variational inequalities and connections with cocoercivity. AISTATS 2022: The 25th International Conference on Artificial Intelligence and Statistics, 2022
O. Güler. On the convergence of the proximal point algorithm for convex minimization. SIAM Journal on Control and Optimization 29(2):403–419, 1991
Article MathSciNet MATH Google Scholar
O. Güler. New proximal point algorithms for convex minimization. SIAM Journal on Optimization 2(4):649–664, 1992
Article MathSciNet MATH Google Scholar
B. Halpern. Fixed points of nonexpanding maps. Bulletin of the American Mathematical Society 73(6): 957–961, 1967
Article MathSciNet MATH Google Scholar
D. Kim. Accelerated proximal point method for maximally monotone operators. Mathematical Programming 190:57–87, 2021
Article MathSciNet MATH Google Scholar
G. M. Korpelevich. An extragradient method for finding saddle points and for other problems. Ekonomika i Matematicheskie Metody 12(4):747–756,1976
MathSciNet MATH Google Scholar
S. Lee and D. Kim. Fast extra gradient methods for smooth structured nonconvex-nonconcave minimax problems. NeurIPS 2021: Advances in Neural Information Processing Systems 34, 2021
A. Madry, A. Makelov, L. Schmidt, D. Tsipras and A. Vladu. Towards deep learning models resistant to adversarial attacks. ICLR 2018: International Conference on Learning Representations, 2018
Y. Malitsky and M. K. Tam. A forward–backward splitting method for monotone inclusions without cocoercivity. SIAM Journal on Optimization 30(2):1451–1472, 2020
Article MathSciNet MATH Google Scholar
Y. Nesterov. A method of solving a convex programming problem with convergence rate ${{\cal{O}}} ( 1 / k^{2} )$. Soviet Mathematics Doklady 27:372–376, 1983
Y. Nesterov. Introductory Lectures on Convex Optimization. Springer, New York, 2004
Book MATH Google Scholar
Y. Nesterov. Dual extrapolation and its applications to solving variational inequalities and related problems. Mathematical Programming 109:319–344, 2007
Article MathSciNet MATH Google Scholar
S. Omidshafiei, J. Pazis, C. Amato, J. P. How and J. Vian. Deep decentralized multi-task multi-agent reinforcement learning under partial observability. The 34th International Conference on Machine Learning 70:2681–2690, 2017
Z. Opial. Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bulletin of the American Mathematical Society 73:591–597, 1967
Article MathSciNet MATH Google Scholar
Y. Ouyang and Y. Xu. Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems. Mathematical Programming 185:1–35, 2021
Article MathSciNet MATH Google Scholar
J. Park and E. K. Ryu. Exact optimal accelerated complexity for fixed-point iterations. The 39th International Conference on Machine Learning 162, 2022
J. Peypouquet and S. Sorin. Evolution equations for maximal monotone operators: asymptotic analysis in continuous and discrete time. Journal of Convex Analysis 17(3-4):1113–1163, 2010
MathSciNet MATH Google Scholar
L. D. Popov. A modification of the Arrow–Hurwicz method for search of saddle points. Mathematical Notes of the Academy of Sciences of the USSR 28(5):845–848, 1980
MATH Google Scholar
R. T. Rockafellar. Monotone operators associated with saddle-functions and minimax problems. In: F. E. Browder (ed.), Nonlinear Functional Analysis, Proceedings of Symposia in Pure Mathematics 18: 241–250, American Mathematical Society, 1970
G. R. Sell and Y. You. Dynamics of Evolutionary Equations. Springer, New York, 2002
Book MATH Google Scholar
B. Shi, S. Du, M. I. Jordan and W.J. Su. Understanding the acceleration phenomenon via high-resolution differential equations. Mathematical Programming 195:79–148, 2022
Article MathSciNet MATH Google Scholar
W. Su, S. Boyd and E. Candès. A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. Journal of Machine Learning Research 17(153):1–43, 2016
MathSciNet MATH Google Scholar
Q. Tran-Dinh. The connection between Nesterov’s accelerated methods and Halpern fixed-point iterations. arXiv:2203.04869, 2022
Q. Tran-Dinh and Y. Luo. Halpern-type accelerated and splitting algorithms for monotone inclusions. arXiv:2110.08150, 2021
T. H. Yoon and E. K. Ryu. Accelerated algorithms for smooth convex-concave minimax problems with ${{\cal{O}}} ( 1/k^{2})$ rate on squared gradient norm. The 38th International Conference on Machine Learning 139: 12098–12109, 2021

Download references

Acknowledgements

The authors are thankful to the handling editor and two anonymous reviewers for comments and remarks which improved the quality of the manuscript, in particular for the observation on which we elaborate in Remark 8 and the suggestion to use the performance profiles in the numerical experiments.

Funding

Open access funding provided by University of Vienna.

Author information

Authors and Affiliations

Faculty of Mathematics, University of Vienna, Oskar-Morgenstern-Platz 1, 1090, Vienna, Austria
Radu Ioan Boţ, Ernö Robert Csetnek & Dang-Khoa Nguyen
Faculty of Mathematics and Computer Science, University of Science, Vietnam National University, Ho Chi Minh City, 700000, Vietnam
Dang-Khoa Nguyen

Authors

Radu Ioan Boţ
View author publications
You can also search for this author in PubMed Google Scholar
Ernö Robert Csetnek
View author publications
You can also search for this author in PubMed Google Scholar
Dang-Khoa Nguyen
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Radu Ioan Boţ.

Additional information

Communicated by Jérôme Bolte.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Radu Ioan Boţ: Research partially supported by FWF (Austrian Science Fund), Projects W 1260 and P 34922-N. Ernö Robert Csetnek: Research partially supported by FWF (Austrian Science Fund), Project P 29809-N32. Dang-Khoa Nguyen: Research supported by FWF (Austrian Science Fund), Project P 34922-N.

Appendix

In the first subsection of the appendix, we collect some fundamental auxiliary results for the analysis carried out in the paper. Further, we present the proof of the existence and uniqueness theorem for 12 and also the proofs of technical lemmas used in the convergence analysis of the numerical algorithms.

1.1 Auxiliary Results

The following result can be found in [1, Lemma 5.1].

Lemma A.1

Let $\delta > 0$. Suppose that $f :\left[ \delta , + \infty \right) \rightarrow {\mathbb {R}}$ is locally absolutely continuous, bounded from below, and there exists $g \in {\mathbb {L}}^{1} \left( \left[ \delta , + \infty \right) \right) $ such that for almost every $t \ge \delta $

$$\begin{aligned} \dfrac{d}{dt} f \left( t \right) \le g \left( t \right) . \end{aligned}$$

Then, the limit $\lim \limits _{t \rightarrow + \infty } f \left( t \right) \in {\mathbb {R}}$ exists.

Opial’s Lemma [38] in continuous form is used in the proof of the weak convergence of the trajectory of the dynamical system 12.

Lemma A.2

Let ${\mathcal {S}}$ be a nonempty subset of ${\mathcal {H}}$ and $z :\left[ t_{0}, + \infty \right) \rightarrow {\mathcal {H}}$. Assume that

(i)
for every $z_{*} \in {\mathcal {S}}$, $\lim \limits _{t \rightarrow + \infty } \left\Vert z \left( t \right) - z_{*} \right\Vert $ exists;
(ii)
every weak sequential cluster point of the trajectory $z \left( t \right) $ as $t \rightarrow + \infty $ belongs to ${\mathcal {S}}$.

Then, z(t) converges weakly to a point in ${\mathcal {S}}$ as $t \rightarrow + \infty $.

For the convergence proof of the iterates generated by the two numerical algorithms, we use the discrete counterpart of Opial’s Lemma.

Lemma A.3

Let ${\mathcal {S}}$ be a nonempty subset of ${\mathcal {H}}$ and $\left( z_{k} \right) _{k \ge 1}$ be a sequence in ${\mathcal {H}}$. Assume that

(i)
for every $z_{*} \in {\mathcal {S}}$, $\lim \limits _{k \rightarrow + \infty } \left\Vert z_{k} - z_{*} \right\Vert $ exists;
(ii)
every weak sequential cluster point of the sequence $\left( z_{k} \right) _{k \ge 1}$ as $k \rightarrow + \infty $ belongs to ${\mathcal {S}}$.

Then, $\left( z_{k} \right) _{k \ge 1}$ converges weakly to a point in ${\mathcal {S}}$ as $k \rightarrow + \infty $.

The following result can be found in [13, Lemma A.2].

Lemma A.4

Let $a > 0$ and $q :\left[ t_{0}, + \infty \right) \rightarrow {\mathcal {H}}$ be a continuously differentiable function such that

$$\begin{aligned} \lim \limits _{t \rightarrow + \infty } \left( q \left( t \right) + \dfrac{t}{a} {\dot{q}} \left( t \right) \right) = l \in {\mathcal {H}}. \end{aligned}$$

Then, it holds $\lim \limits _{t \rightarrow + \infty } q \left( t \right) = l$.

The discrete counterpart of this result is stated below. We provide a proof for it, as we could not find any reference for this result in the literature.

Lemma A.5

Let $a \ge 1$ and $\left( q_{k} \right) _{k \ge 0}$ be a bounded sequence in ${\mathcal {H}}$ such that

$$\begin{aligned} \lim \limits _{k \rightarrow + \infty } \left( q_{k+1} + \dfrac{k}{a} \left( q_{k+1} - q_{k} \right) \right) = l \in {\mathcal {H}}. \end{aligned}$$

Then, it holds $\lim \limits _{k \rightarrow + \infty } q_{k} = l$.

Proof

For every $k \ge 0$, we set $r_{k}:= q_{k} - l$. We fix $\varepsilon > 0$. Then, there exists $k_{0} \ge 1$ such that for every $k \ge k_{0}$

$$\begin{aligned} \left\Vert r_{k+1} + \dfrac{k}{a} \left( r_{k+1} - r_{k} \right) \right\Vert \le \varepsilon . \end{aligned}$$

Multiplying both side by $ak^{a-1}$, we obtain for every $k \ge k_{0}$

$$\begin{aligned} \left\Vert \left( ak^{a-1} + k^{a} \right) r_{k+1} - k^{a} r_{k} \right\Vert \le \varepsilon ak^{a-1} . \end{aligned}$$

Then by applying the triangle inequality and using the fact that ${\overline{r}}:= \sup _{k \ge 0} \left\Vert r_{k} \right\Vert < + \infty $, we deduce that for every $k \ge k_{0}$

$$\begin{aligned} \left\Vert \left( k + 1 \right) ^{a} r_{k+1} - k^{a} r_{k} \right\Vert \le \varepsilon ak^{a-1} + \left|\left( k + 1 \right) ^{a} - k^{a} - ak^{a-1} \right|{\overline{r}} . \end{aligned}$$

(75)

The Lagrange error bound of a Taylor series says that for every $k \ge k_{0}$, there exists $m_k \in \left( k, k+1 \right) $ such that

$$\begin{aligned} \left|\left( k + 1 \right) ^{a} - k^{a} - ak^{a-1} \right|\le \dfrac{1}{2} a \left|a - 1 \right|m_k^{a-2} . \end{aligned}$$

From here we consider two cases.

${\underline{\hbox {The case }1 \le a < 2.}}$ Then for every $k \ge k_{0}$ and every $m \in \left( k, k+1 \right) $, we have $m^{a-2} \le 1$ and thus 75 leads to

$$\begin{aligned} \left\Vert \left( k + 1 \right) ^{a} r_{k+1} - k^{a} r_{k} \right\Vert \le \varepsilon ak^{a-1} + \dfrac{1}{2} a \left|a - 1 \right|{\overline{r}} . \end{aligned}$$

We choose $K \ge k_{0}$ and use a telescoping sum argument to get

$$\begin{aligned} \left\Vert \left( K + 1 \right) ^{a} r_{K+1} - k_{0}^{a} r_{k_{0}} \right\Vert&= \left\Vert \displaystyle \sum \limits _{k = k_{0}}^{K} \Bigl ( \left( k + 1 \right) ^{a} r_{k+1} - k^{a} r_{k} \Bigr ) \right\Vert \\&\le \displaystyle \sum \limits _{k = k_{0}}^{K} \left\Vert \left( k + 1 \right) ^{a} r_{k+1} - k^{a} r_{k} \right\Vert \\&\le \varepsilon a \displaystyle \sum \limits _{k = k_{0}}^{K} k^{a-1} + \dfrac{1}{2} a \left|a - 1 \right|{\overline{r}} \displaystyle \sum \limits _{k = k_{0}}^{K} 1\\&\le \varepsilon a \left( K + 1 \right) ^{a} + \dfrac{1}{2} a \left|a - 1 \right|{\overline{r}} \left( K + 1 \right) . \end{aligned}$$

Once again, using the triangle inequality, we conclude that

$$\begin{aligned} \left\Vert r_{K+1} \right\Vert&\le \dfrac{1}{\left( K + 1 \right) ^{a}} \left\Vert \left( K + 1 \right) ^{a} r_{K+1} - k_{0}^{a} r_{k_{0}} \right\Vert + \dfrac{k_{0}^{a}}{\left( K + 1 \right) ^{a}} \left\Vert r_{k_{0}} \right\Vert \\&\le \varepsilon a + \dfrac{a \left|a - 1 \right|{\overline{r}}}{2 \left( K+1 \right) ^{a-1}} + \dfrac{k_{0}^{a}{\overline{r}}}{\left( K + 1 \right) ^{a}} . \end{aligned}$$

${\underline{\hbox {The case }a \ge 2.}}$ For for every $k \ge k_{0}$ and every $m \in \left( k, k+1 \right) $, we have $m^{a - 2} \le \left( k + 1 \right) ^{a-2}$, hence 75 leads to

$$\begin{aligned} \left\Vert \left( k + 1 \right) ^{a} r_{k+1} - k^{a} r_{k} \right\Vert \le \varepsilon ak^{a-1} + \dfrac{1}{2} a \left( a - 1 \right) {\overline{r}} \left( k + 1 \right) ^{a - 2} . \end{aligned}$$

We choose also in this case $K \ge k_{0}$ and by a similar argument as above we have that

$$\begin{aligned} \left\Vert \left( K + 1 \right) ^{a} r_{K+1} - k_{0}^{a} r_{k_{0}} \right\Vert&= \left\Vert \displaystyle \sum \limits _{k = k_{0}}^{K} \Bigl ( \left( k + 1 \right) ^{a} r_{k+1} - k^{a} r_{k} \Bigr ) \right\Vert \\&\le \displaystyle \sum \limits _{k = k_{0}}^{K} \left\Vert \left( k + 1 \right) ^{a} r_{k+1} - k^{a} r_{k} \right\Vert \\&\le \varepsilon a \displaystyle \sum \limits _{k = k_{0}}^{K} k^{a-1} + \dfrac{1}{2} a \left( a - 1 \right) {\overline{r}} \displaystyle \sum \limits _{k = k_{0}}^{K} \left( k+1 \right) ^{a - 2} \\&\le \varepsilon a \left( K + 1 \right) ^{a-1} \displaystyle \sum \limits _{k = 0}^{K} 1 + \dfrac{1}{2} a \left( a - 1 \right) {\overline{r}} \left( K + 1 \right) ^{a - 2} \displaystyle \sum \limits _{k = 0}^{K} 1 \\&\le \varepsilon a \left( K + 1 \right) ^{a} + \dfrac{1}{2} a \left( a - 1 \right) {\overline{r}} \left( K + 1 \right) ^{a - 1} . \end{aligned}$$

This leads to

$$\begin{aligned} \left\Vert r_{K+1} \right\Vert \le \varepsilon a + \dfrac{a \left( a - 1 \right) {\overline{r}}}{2 \left( K+1 \right) } + \dfrac{k_{0}^{a} {\overline{r}}}{\left( K + 1 \right) ^{a}} . \end{aligned}$$

Therefore, in both scenarios we obtain

$$\begin{aligned} \limsup _{k \rightarrow + \infty } \left\Vert r_{k} \right\Vert \le \varepsilon a , \end{aligned}$$

which leads to the desired conclusion, as $\varepsilon > 0$ was arbitrarily chosen. $\square $

The following result is a particular instance of [16, Lemma 5.31].

Lemma A.6

Let $\left( a_{k} \right) _{k \ge 1}$, $\left( b_{k} \right) _{k \ge 1}$ and $\left( d_{k} \right) _{k \ge 1}$ be sequences of real numbers. Assume that $\left( a_{k} \right) _{k \ge 1}$ is bounded from below, and $\left( b_{k} \right) _{k \ge 1}$ and $\left( d_{k} \right) _{k \ge 1}$ are nonnegative sequences such that $\sum _{k \ge 1} d_{k} < + \infty $. If

$$\begin{aligned} a_{k+1} \le a_{k} - b_{k} + d_{k} \quad \forall k \ge 1, \end{aligned}$$

then the following statements are true:

(i)
the sequence $\left( b_{k} \right) _{k \ge 1}$ is summable, namely $\sum _{k \ge 1} b_{k} < + \infty $;
(ii)
the sequence $\left( a_{k} \right) _{k \ge 1}$ is convergent.

The following elementary result is used several times in the paper.

Lemma A.7

Let $a, b, c \in {\mathbb {R}}$ be such that $a \ne 0$ and $b^{2} - ac \le 0$. The following statements are true:

(i)
if $a > 0$, then it holds
$$\begin{aligned} a \left\Vert x \right\Vert ^{2} + 2b \left\langle x , y \right\rangle + c \left\Vert y \right\Vert ^{2} \ge 0 \quad \forall x, y \in {\mathcal {H}}; \end{aligned}$$
(ii)
if $a < 0$, then it holds
$$\begin{aligned} a \left\Vert x \right\Vert ^{2} + 2b \left\langle x , y \right\rangle + c \left\Vert y \right\Vert ^{2} \le 0 \quad \forall x, y \in {\mathcal {H}}. \end{aligned}$$

1.2 Proof of the Existence and Uniqueness Theorem for the Evolution Equation

In this subsection, we provide the proof of the existence and uniqueness of the trajectories of 12.

Proof of Theorem 5

The system 12 can be rewritten as a first-order ordinary differential equation

$$\begin{aligned} {\left\{ \begin{array}{ll} {\dot{z}} \left( t \right) &{} = \dfrac{1}{2t} u \left( t \right) - \dfrac{1}{t} \left( \alpha - 1 \right) z \left( t \right) - \beta \left( t \right) V \left( z \left( t \right) \right) \\ {\dot{u}} \left( t \right) &{} = \Bigl ( t {\dot{\beta }} \left( t \right) + \left( 2 - \alpha \right) \beta \left( t \right) \Bigr ) V \left( z \left( t \right) \right) \\ \Bigl ( z \left( t_{0} \right) , u \left( t_{0} \right) \Bigr ) &{} = \Bigl ( z^{0} , 2 \left( \alpha - 1 \right) z^{0} + 2t_{0} \dot{z}^{0} + 2t_{0} \beta \left( t_{0} \right) V \left( z^{0} \right) \Bigr ) \end{array}\right. }, \end{aligned}$$

(76)

where for every $t \ge t_{0}$ we define

$$\begin{aligned} u \left( t \right) := 2 \left( \alpha - 1 \right) z \left( t \right) + 2t {\dot{z}} \left( t \right) + 2t \beta \left( t \right) V \left( z \left( t \right) \right) . \end{aligned}$$

We define $G :\left[ t_{0}, + \infty \right) \times {\mathcal {H}}\times {\mathcal {H}}\rightarrow {\mathcal {H}}\times {\mathcal {H}}$ by

$$\begin{aligned} G \left( t , \zeta , \mu \right) := \biggl ( \Bigl ( t {\dot{\beta }} \left( t \right) + \left( 2 - \alpha \right) \beta \left( t \right) \Bigr ) V \left( \zeta \right) , \dfrac{1}{2t} \mu - \dfrac{1}{t} \left( \alpha - 1 \right) \zeta - \beta \left( t \right) V \left( \zeta \right) \biggr ) , \end{aligned}$$

so that 76 becomes

$$\begin{aligned} {\left\{ \begin{array}{ll} \Bigl ( {\dot{u}} \left( t \right) , {\dot{z}} \left( t \right) \Bigr ) &{} = G \left( t , z \left( t \right) , u \left( t \right) \right) \\ \Bigl ( z \left( t_{0} \right) , u \left( t_{0} \right) \Bigr ) &{} = \Bigl ( z^{0} , 2 \left( \alpha - 1 \right) z^{0} + 2t_{0} \dot{z}^{0} + 2t_{0} \beta \left( t_{0} \right) V \left( z^{0} \right) \Bigr ) \end{array}\right. } . \end{aligned}$$

Since G is Lipschitz continuous on bounded sets, the local existence and uniqueness theorem (see, for instance, [44, Theorems 46.2 and 46.3]) allows to conclude that there exists a unique continuous differentiable solution $\left( z, u \right) \in {\mathcal {H}}\times {\mathcal {H}}$ of 76 defined on a maximally interval $\left[ t_{0}, T_{\max }\right) $ where $0< t_{0} < T_{\max }\le + \infty $. Furthermore, either

$$\begin{aligned} T_{\max }= + \infty \qquad \text { or } \qquad \lim \limits _{t \rightarrow T_{\max }} \left\Vert \left( z \left( t \right) , u \left( t \right) \right) \right\Vert = + \infty . \end{aligned}$$

In the following, we will show that indeed $T_{\max }= + \infty $.

According to 23, for $z_{*} \in {\mathcal {Z}}$ fixed, for every $t_{0} \le t < T_{\max }$ it holds

$$\begin{aligned} {\mathcal {E}}_{\alpha - 1} \left( t \right) + 2 \int _{t_{0}}^{t} \tau ^{2} \beta \left( \tau \right) w \left( \tau \right) \left\Vert V \left( z \left( \tau \right) \right) \right\Vert ^{2} d \tau \le {\mathcal {E}}_{\alpha - 1} \left( t_{0} \right) < +\infty , \end{aligned}$$

which implies that

$$\begin{aligned} t \mapsto u(t) \text{ is } \text{ bounded } \text{ on } [t_0, T_{\max }). \end{aligned}$$

(77)

On the other hand, inequality 25 implies that

$$\begin{aligned} \int _{t_{0}}^{T_{\max }} \tau \beta ^{2} \left( \tau \right) \left\Vert V \left( z \left( \tau \right) \right) \right\Vert ^{2} d \tau \le \dfrac{{\mathcal {E}}_{\alpha - 1} \left( t_{0} \right) }{\varepsilon } < + \infty , \end{aligned}$$

for some $\varepsilon > 0$. Now for $0< \lambda < \alpha - 1$, we have according to 27b that for every $t_{0} \le t < T_{\max }$

$$\begin{aligned} 2 \lambda \left( \alpha - 1 - \lambda \right) \left\Vert z \left( t \right) - z_{*} \right\Vert ^{2} \le {\mathcal {E}}_{\lambda } \left( t \right)&\le {\mathcal {E}}_{\lambda } \left( t_{0} \right) + \dfrac{2}{\varepsilon } \left( \alpha - 1 - \lambda \right) {\mathcal {E}}_{\alpha - 1} \left( t_{0} \right) < + \infty . \end{aligned}$$

(78)

From 77 and 78, we have that $\lim _{t \rightarrow T_{\max }} \left\Vert \left( z \left( t \right) , u \left( t \right) \right) \right\Vert < + \infty $, therefore $T_{\max }= + \infty $. $\square $

1.3 Proof of the Technical Lemma Used in the Analysis of the Implicit Algorithm

In this subsection, we provide the proof of Lemma 9 which shows that the discrete energy 42 dissipates with every iteration of the implicit Fast OGDA method.

Proof of Lemma 9

Let $0 \le \lambda \le \alpha -1$. For brevity, we denote for every $k \ge 0$

$$\begin{aligned} u^{k+1}_{\lambda } := 2 \lambda \left( z^{k+1} - z_{*} \right) + 2 \left( k + 1 \right) \left( z^{k+1} - z^{k} \right) + s \left( k + 1 \right) \beta _{k} V \left( z^{k+1} \right) . \end{aligned}$$

(79)

This means that for every $k \ge 1$ it holds

$$\begin{aligned} u^{k}_{\lambda } = 2 \lambda \left( z^{k} - z_{*} \right) + 2k \left( z^{k} - z^{k-1} \right) + sk \beta _{k-1} V \left( z^{k} \right) , \end{aligned}$$

therefore taking the difference and using 38 we deduce that

$$\begin{aligned} u^{k+1}_{\lambda } - u^{k}_{\lambda }&= \ 2 \left( \lambda + 1 - \alpha \right) \left( z^{k+1} - z^{k} \right) + 2 \left( k + \alpha \right) \left( z^{k+1} - z^{k} \right) - 2k \left( z^{k} - z^{k-1} \right) \nonumber \\&\quad + s \Bigl ( \left( k + 1 \right) \beta _{k} - k \beta _{k-1} \Bigr ) V \left( z^{k+1} \right) + sk \beta _{k-1} \left( V \left( z^{k+1} \right) - V \left( z^{k} \right) \right) \nonumber \\&= \ 2 \left( \lambda + 1 - \alpha \right) \left( z^{k+1} - z^{k} \right) + \left( 1 - \alpha \right) s \beta _{k} V \left( z^{k+1} \right) \nonumber \\&\quad - sk \beta _{k-1} \left( V \left( z^{k+1} \right) - V \left( z^{k} \right) \right) . \end{aligned}$$

(80)

In the following, we want to use the following identity

$$\begin{aligned} \dfrac{1}{2} \left( \left\Vert u^{k+1}_{\lambda } \right\Vert ^{2} - \left\Vert u^{k}_{\lambda } \right\Vert ^{2} \right) = \left\langle u^{k+1}_{\lambda } , u^{k+1}_{\lambda } - u^{k}_{\lambda } \right\rangle - \dfrac{1}{2} \left\Vert u^{k+1}_{\lambda } - u^{k}_{\lambda } \right\Vert ^{2} \quad \forall k \ge 1. \end{aligned}$$

(81)

Using the relations 79 and 80, for every $k \ge 1$ we derive that

$$\begin{aligned}&\left\langle u^{k+1}_{\lambda } , u^{k+1}_{\lambda } - u^{k}_{\lambda } \right\rangle \nonumber \\&\quad = \ 4 \lambda \left( \lambda + 1 - \alpha \right) \left\langle z^{k+1} - z_{*} , z^{k+1} - z^{k} \right\rangle \nonumber \\&\qquad + 2 \lambda \left( 1 - \alpha \right) s \beta _{k} \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\qquad - 2 \lambda sk \beta _{k-1} \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\qquad + 4 \left( \lambda + 1 - \alpha \right) \left( k + 1 \right) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\qquad + 2s \left( \lambda + 2 - 2 \alpha \right) \left( k + 1 \right) \beta _{k} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\qquad - 2s \left( k + 1 \right) k \beta _{k-1} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\qquad + \left( 1 - \alpha \right) s^{2} \left( k + 1 \right) \beta _{k}^{2} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \nonumber \\&\qquad - s^{2} \left( k + 1 \right) k \beta _{k} \beta _{k-1} \left\langle V \left( z^{k+1} \right) , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle , \end{aligned}$$

(82)

and

$$\begin{aligned} - \dfrac{1}{2} \left\Vert u^{k+1}_{\lambda } - u^{k}_{\lambda } \right\Vert ^{2}&= - 2 \left( \lambda + 1 - \alpha \right) ^{2} \left\Vert z^{k+1} - z^{k} \right\Vert ^{2}\nonumber \\&\quad - \dfrac{1}{2} \left( 1 - \alpha \right) ^{2} s^{2} \beta _{k}^{2} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \nonumber \\&\quad - \dfrac{1}{2} s^{2} k^{2} \beta _{k-1}^{2} \left\Vert V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\Vert ^{2} \nonumber \\&\quad - 2 \left( \lambda + 1 - \alpha \right) \left( 1 - \alpha \right) s \beta _{k} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\quad + 2 \left( \lambda + 1 - \alpha \right) sk \beta _{k-1} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\quad + \left( 1 - \alpha \right) s^{2} k \beta _{k} \beta _{k-1} \left\langle V \left( z^{k+1} \right) , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle . \end{aligned}$$

(83)

A direct computation shows that

$$\begin{aligned}&\Bigl ( \left( \lambda + 2 - 2 \alpha \right) \left( k + 1 \right) - \left( \lambda + 1 - \alpha \right) \left( 1 - \alpha \right) \Bigr ) \beta _{k} \nonumber \\&\quad = \ \Bigl ( \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 \right) - \lambda \left( k + 1 \right) \Bigr ) \beta _{k} \nonumber \\&\quad = \ \Bigl ( \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 \right) - \lambda \Bigr ) \beta _{k} - \lambda k \left( \beta _{k} - \beta _{k-1} \right) + \lambda k \beta _{k-1} . \end{aligned}$$

(84)

By plugging 82 and 83 into 81, we get for every $k \ge 1$

$$\begin{aligned}&\dfrac{1}{2} \left( \left\Vert u^{k+1}_{\lambda } \right\Vert ^{2} - \left\Vert u^{k}_{\lambda } \right\Vert ^{2} \right) \nonumber \\&\quad = \ 4 \lambda \left( \lambda + 1 - \alpha \right) \left\langle z^{k+1} - z_{*} , z^{k+1} - z^{k} \right\rangle \nonumber \\&\qquad + 2 \lambda \left( 1 - \alpha \right) s \beta _{k} \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\qquad - 2 \lambda sk \beta _{k-1} \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\qquad + 2 \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 - \lambda \right) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\qquad + 2s \Bigl ( \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 \right) - \lambda \left( k + 1 \right) \Bigr ) \beta _{k} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\qquad - 2sk \left( k + \alpha - \lambda \right) \beta _{k-1} \! \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\qquad + \!\dfrac{1}{2} \left( 1 - \alpha \right) s^{2} \beta _{k}^{2} \left( 2k + \alpha + 1 \right) \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \nonumber \\&\qquad - s^{2} \left( k + \alpha \right) k \beta _{k} \beta _{k-1} \left\langle V \left( z^{k+1} \right) , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\qquad - \dfrac{1}{2} s^{2} k^{2} \beta _{k-1}^{2} \left\Vert V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\Vert ^{2} . \end{aligned}$$

(85)

Next we are going to consider the remaining terms in the difference of the discrete energy functions. First we observe that for every $k \ge 0$

$$\begin{aligned}&2 \lambda \left( \alpha - 1 - \lambda \right) \left( \left\Vert z^{k+1} - z_{*} \right\Vert ^{2} - \left\Vert z^{k} - z_{*} \right\Vert ^{2} \right) \nonumber \\&\quad = \ 2 \lambda \left( \alpha - 1 - \lambda \right) \left( 2 \left\langle z^{k+1} - z_{*} , z^{k+1} - z^{k} \right\rangle - \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \right) . \end{aligned}$$

(86)

Some algebra shows that for every $k \ge 1$

$$\begin{aligned}&2 \lambda s \left( k + 1 \right) \beta _{k} \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle - 2 \lambda s k \beta _{k-1} \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle \nonumber \\&\quad = \ 2 \lambda s \Bigl ( \left( k + 1 \right) \beta _{k} - k \beta _{k-1} \Bigr ) \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\quad \quad + 2 \lambda sk \beta _{k-1} \left( \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle - \left\langle z^{k} - z_{*} , V \left( z^{k} \right) \right\rangle \right) \nonumber \\&\quad = \ 2 \lambda s \Bigl ( \left( k + 1 \right) \beta _{k} - k \beta _{k-1} \Bigr ) \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\quad \quad + 2 \lambda sk \beta _{k-1} \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\qquad + 2 \lambda sk \beta _{k-1} \left\langle z^{k+1} - z^{k} , V \left( z^{k} \right) \right\rangle \nonumber \\&\quad = \ 2 \lambda s \Bigl ( \left( k + 1 \right) \beta _{k} - k \beta _{k-1} \Bigr ) \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) \right\rangle \nonumber \\&\qquad - 2 \lambda sk \beta _{k-1} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\quad \quad + 2 \lambda sk \beta _{k-1} \left\langle z^{k+1} - z_{*} , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\qquad + 2 \lambda sk \beta _{k-1} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) \right\rangle . \end{aligned}$$

(87)

Finally, according to 39 and 40, we have for every $k \ge \lceil \alpha \rceil $

$$\begin{aligned}&\left( k + \alpha + 1 \right) \left( k + 1 \right) \beta _{k+1} - \left( k + \alpha \right) k \beta _{k-1} \\&\quad = \ \left( k + \alpha + 1 \right) \left( k + 1 \right) \bigl ( \beta _{k+1} - \beta _{k} \bigr ) + \left( k + \alpha \right) k \bigl ( \beta _{k} - \beta _{k-1} \bigr ) + \left( 2k + \alpha + 1 \right) \beta _{k} \\&\quad \le \ \left( \alpha - 2 - \varepsilon \right) \Bigl ( \left( k + \alpha + 1 \right) \beta _{k+1} + \left( k + \alpha \right) \beta _{k} \Bigr ) + \left( 2k + \alpha + 1 \right) \beta _{k} \\&\quad = \ \left( \alpha - 2 - \varepsilon \right) \Bigl ( \left( k + 1 \right) \left( \beta _{k+1} - \beta _{k} \right) + \alpha \beta _{k+1} + \left( 2k + \alpha + 1 \right) \beta _{k} \Bigr ) \\&\qquad + \left( 2k + \alpha + 1 \right) \beta _{k} \\&\quad \le \ \left( \alpha - 2 - \varepsilon \right) \left( 2 \alpha - 2 - \varepsilon \right) \beta _{k+1} + \left( \alpha - 1 - \varepsilon \right) \left( 2k + \alpha + 1 \right) \beta _{k} \\&\quad \le \ \dfrac{\alpha }{2 + \varepsilon } \left( \alpha - 2 - \varepsilon \right) \left( 2 \alpha - 2 - \varepsilon \right) \beta _{k} + \left( \alpha - 1 - \varepsilon \right) \left( 2k + \alpha + 1 \right) \beta _{k} \end{aligned}$$

and thus it holds

$$\begin{aligned}&\ \dfrac{1}{2} s^{2} \left( k + \alpha + 1 \right) \left( k + 1 \right) \beta _{k+1} \beta _{k} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \nonumber \\&\qquad - \dfrac{1}{2} s^{2} \left( k + \alpha \right) k \beta _{k} \beta _{k-1} \left\Vert V \left( z^{k} \right) \right\Vert ^{2} \nonumber \\&\quad = \ \dfrac{1}{2} s^{2} \Bigl ( \left( k + \alpha + 1 \right) \left( k + 1 \right) \beta _{k+1} - \left( k + \alpha \right) k \beta _{k-1} \Bigr ) \beta _{k} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \nonumber \\&\qquad + \dfrac{1}{2} s^{2} \left( k + \alpha \right) k \beta _{k} \beta _{k-1} \left( \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} - \left\Vert V \left( z^{k} \right) \right\Vert ^{2} \right) \nonumber \\&\quad \le \ \dfrac{1}{2} \Bigl ( \dfrac{\alpha }{2 + \varepsilon } \left( \alpha - 2 - \varepsilon \right) \left( 2 \alpha - 2 - \varepsilon \right) \nonumber \\&\qquad + \left( \alpha - 1 - \varepsilon \right) \left( 2k + \alpha + 1 \right) \Bigr ) s^{2} \beta _{k}^{2} \left\Vert V \left( z^{k+1} \right) \right\Vert ^{2} \nonumber \\&\qquad + s^{2} \left( k + \alpha \right) k \beta _{k} \beta _{k-1} \left\langle V \left( z^{k+1} \right) , V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\rangle \nonumber \\&\qquad - \dfrac{1}{2} s^{2} \left( k + \alpha \right) k \beta _{k} \beta _{k-1} \left\Vert V \left( z^{k+1} \right) - V \left( z^{k} \right) \right\Vert ^{2} . \end{aligned}$$

(88)

After adding the relations 85–88 and by taking into consideration 84, we obtain 43.

$\square $

1.4 Proofs of the Technical Lemmas Used in the Analysis of the Explicit Algorithm

In this subsection, we provide the proofs of the two main technical lemmas used in the analysis of the explicit Fast OGDA method.

Proof of Lemma 13

Let $z_* \in {{\mathcal {Z}}}$, $0< \gamma < 2$ and $0 \le \lambda \le \alpha - 1$. First we prove that for every $k \ge 1$

$$\begin{aligned} {\mathcal {E}}_{\lambda }^{k+1} - {\mathcal {E}}_{\lambda }^{k}&= 2 \lambda \left( 2 - \alpha \right) s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad + 2 \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 \right) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\quad + 2s \Bigl ( \!\!\bigl ( \left( 2 - \gamma \right) \lambda + \gamma - \alpha + \gamma \left( \lambda + 1 - \alpha \right) \bigr ) k + \gamma \nonumber \\&\quad - \alpha + \alpha \left( \lambda + 1 - \alpha \right) \Bigr ) \! \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad - 2 \left( 2 - \gamma \right) sk \left( k + \alpha \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad + \dfrac{1}{2} \left( 2 - \alpha \right) s^{2} \left( 2 \gamma k + \alpha + \gamma \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad - \dfrac{1}{2} \left( 2 - \gamma \right) s^{2} k \left( 2k + \alpha \right) \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}. \end{aligned}$$

(89)

For every $k \ge 1$, we have

$$\begin{aligned} u^{k+1}_{\lambda } := 2 \lambda \left( z^{k+1} - z_{*} \right) + 2 \left( k + 1 \right) \left( z^{k+1} - z^{k} \right) + \gamma s \left( k + 1 \right) V \left( \bar{z}^{k} \right) , \end{aligned}$$

(90)

and after substraction we deduce from 52 that

$$\begin{aligned} u^{k+1}_{\lambda } - u^{k}_{\lambda }&= 2 \left( \lambda + 1 - \alpha \right) \left( z^{k+1} - z^{k} \right) + 2 \left( k + \alpha \right) \left( z^{k+1} - z^{k} \right) - 2k \left( z^{k} - z^{k-1} \right) \nonumber \\&\quad + \gamma s V \left( \bar{z}^{k} \right) + \gamma sk \left( V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right) \nonumber \\&= 2 \left( \lambda + 1 - \alpha \right) \left( z^{k+1} - z^{k} \right) + \left( \gamma - \alpha \right) s V \left( \bar{z}^{k} \right) \nonumber \\&\quad + \left( \gamma - 2 \right) sk \left( V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right) . \end{aligned}$$

(91)

Next we recall the identities in 81 and 86

$$\begin{aligned}&\dfrac{1}{2} \left( \left\Vert u^{k+1}_{\lambda } \right\Vert ^{2} - \left\Vert u^{k}_{\lambda } \right\Vert ^{2} \right) \nonumber \\&\quad = \ \left\langle u^{k+1}_{\lambda } , u^{k+1}_{\lambda } - u^{k}_{\lambda } \right\rangle - \dfrac{1}{2} \left\Vert u^{k+1}_{\lambda } - u^{k}_{\lambda } \right\Vert ^{2} \quad \forall k \ge 1, \end{aligned}$$

(92)

$$\begin{aligned}&2 \lambda \left( \alpha - 1 - \lambda \right) \left( \left\Vert z^{k+1} - z_{*} \right\Vert ^{2} - \left\Vert z^{k} - z_{*} \right\Vert ^{2} \right) \nonumber \\&\quad = \ 4 \lambda \left( \alpha - 1 - \lambda \right) \left\langle z^{k+1} - z_{*} , z^{k+1} - z^{k} \right\rangle \nonumber \\&\quad \quad - 2 \lambda \left( \alpha - 1 - \lambda \right) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \quad \forall k \ge 0, \end{aligned}$$

(93)

respectively, as they are required also in the analysis of the explicit algorithm.

We first use the relations 90 and 91 to derive for every $k \ge 1$ that

$$\begin{aligned}&\left\langle u^{k+1}_{\lambda } , u^{k+1}_{\lambda } - u^{k}_{\lambda } \right\rangle \nonumber \\&\quad = \ 4 \lambda \left( \lambda + 1 - \alpha \right) \left\langle z^{k+1} - z_{*} , z^{k+1} - z^{k} \right\rangle + 2 \lambda \left( \gamma - \alpha \right) s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad + 2 \lambda \left( \gamma - 2 \right) sk \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\qquad + 4 \left( \lambda + 1 - \alpha \right) \left( k + 1 \right) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\quad \quad + 2 \Bigl ( \gamma - \alpha + \gamma \left( \lambda + 1 - \alpha \right) \Bigr ) s \left( k + 1 \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad + 2 \left( \gamma - 2 \right) s \left( k + 1 \right) k \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\qquad + \gamma \left( \gamma - \alpha \right) s^{2} \left( k + 1 \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad \quad + \gamma \left( \gamma - 2 \right) s^{2} \left( k + 1 \right) k \left\langle V \left( \bar{z}^{k} \right) , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle , \end{aligned}$$

(94)

and

$$\begin{aligned}&- \dfrac{1}{2} \left\Vert u^{k+1}_{\lambda } - u^{k}_{\lambda } \right\Vert ^{2}\nonumber \\&\quad = - 2 \left( \lambda + 1 - \alpha \right) ^{2} \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} - \dfrac{1}{2} \left( \gamma - \alpha \right) ^{2} s^{2} \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad \quad - \dfrac{1}{2} \left( \gamma - 2 \right) ^{2} s^{2} k^{2} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\quad \quad - 2 \left( \lambda + 1 - \alpha \right) \left( \gamma - \alpha \right) s \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad - 2 \left( \lambda + 1 - \alpha \right) \left( \gamma - 2 \right) sk \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad - \left( \gamma - 2 \right) \left( \gamma - \alpha \right) s^{2} k \left\langle V \left( \bar{z}^{k} \right) , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle . \end{aligned}$$

(95)

A direct computation shows that

$$\begin{aligned}&\bigl ( \gamma - \alpha + \gamma \left( \lambda + 1 - \alpha \right) \bigr ) \left( k + 1 \right) - \left( \lambda + 1 - \alpha \right) \left( \gamma - \alpha \right) \\&\quad = \ \bigl ( \gamma - \alpha + \gamma \left( \lambda + 1 - \alpha \right) \bigr ) k + \gamma - \alpha + \alpha \left( \lambda + 1 - \alpha \right) , \end{aligned}$$

therefore, by replacing 94 and 95 into 92, we get for every $k \ge 1$

$$\begin{aligned}&\ \dfrac{1}{2} \left( \left\Vert u^{k+1}_{\lambda } \right\Vert ^{2} - \left\Vert u^{k}_{\lambda } \right\Vert ^{2} \right) \nonumber \\&\quad = \ 4 \lambda \left( \lambda + 1 - \alpha \right) \left\langle z^{k+1} - z_{*} , z^{k+1} - z^{k} \right\rangle + 2 \lambda \left( \gamma - \alpha \right) s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad + 2 \lambda \left( \gamma - 2 \right) sk \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad + 2 \left( \lambda + 1 - \alpha \right) \left( 2k + \alpha + 1 - \lambda \right) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\quad \quad + 2s \Bigl ( \bigl ( \gamma - \alpha + \gamma \left( \lambda + 1 - \alpha \right) \bigr ) k + \gamma - \alpha + \alpha \left( \lambda + 1 - \alpha \right) \Bigr ) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad + 2 \left( \gamma - 2 \right) sk \left( k + \alpha - \lambda \right) \! \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad + \dfrac{1}{2} \left( \gamma - \alpha \right) s^{2} \left( 2 \gamma k + \alpha + \gamma \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad \quad + \left( \gamma - 2 \right) s^{2} k \left( \gamma k + \alpha \right) \left\langle V \left( \bar{z}^{k} \right) , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad - \dfrac{1}{2} \left( \gamma - 2 \right) ^{2} s^{2} k^{2} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} . \end{aligned}$$

(96)

Furthermore, one can show that for every $k \ge 1$ it holds

$$\begin{aligned}&\ 2 \lambda s \left( k + 1 \right) \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle - 2 \lambda sk \left\langle z^{k} - z_{*} , V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad = \ 2 \lambda s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle + 2 \lambda sk \left( \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle - \left\langle z^{k} - z_{*} , V \left( \bar{z}^{k-1} \right) \right\rangle \right) \nonumber \\&\quad = \ 2 \lambda s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle + 2 \lambda sk \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\qquad - 2 \lambda sk \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle + 2 \lambda sk \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle . \end{aligned}$$

(97)

and

$$\begin{aligned}&\dfrac{1}{2} s^{2} \left( k + 1 \right) \bigl ( \gamma \left( k + 1 \right) + \alpha \bigr ) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} - \dfrac{1}{2} s^{2} k \left( \gamma k + \alpha \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\quad = \ \dfrac{1}{2} s^{2} \left( 2 \gamma k + \alpha + \gamma \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2}\nonumber \\&\quad \quad + \dfrac{1}{2} s^{2} k \left( \gamma k + \alpha \right) \left( \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} - \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \right) \nonumber \\&\quad = \ \dfrac{1}{2} s^{2} \left( 2 \gamma k + \alpha + \gamma \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} + s^{2} k \left( \gamma k + \alpha \right) \left\langle V \left( \bar{z}^{k} \right) , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad - \dfrac{1}{2} s^{2} k \left( \gamma k + \alpha \right) \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} . \end{aligned}$$

(98)

Hence, multiplying 97 and 98 by $2 - \gamma > 0$, and summing up the resulting identities with 93 and 96, we obtain 89.

(i) Let $k \ge 2$ be fixed. By the definition of ${\mathcal {F}}_{\lambda }^{k}$ in 59, we have for every $k \ge 2$

$$\begin{aligned}&{\mathcal {F}}_{\lambda }^{k+1} - {\mathcal {F}}_{\lambda }^{k} \nonumber \\&\quad = \ {\mathcal {E}}_{\lambda }^{k+1} - {\mathcal {E}}_{\lambda }^{k} - \dfrac{1}{2} \lambda \left( \alpha - 2 \right) s^{2} \left[ \left( 2 - \frac{\alpha }{k + \alpha + 1} \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2}\right. \nonumber \\&\left. \quad \quad - \left( 2 - \frac{\alpha }{k + \alpha } \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \right] \nonumber \\&\quad \quad - 2s \left( 2 - \gamma \right) \left[ \left( k + 1 \right) ^{2} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\rangle \right. \nonumber \\&\left. \quad \quad - k^{2} \left\langle z^{k} - z^{k-1} , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \right] \nonumber \\&\quad \quad + \dfrac{1}{2} \left( 2 - \gamma \right) \alpha s^{2} \left[ \left( k+1 \right) \sqrt{k+1} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}\right. \nonumber \\&\left. \quad \quad - k \sqrt{k} \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} \right] \nonumber \\&\quad \quad + \left( 2 - \gamma \right) s^{3} L \left[ \left( k+1 \right) ^{2} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}\right. \nonumber \\&\left. \quad \quad - k^{2} \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} \right] . \end{aligned}$$

(99)

By using the definition of $\omega _{0}, \omega _{1}, \omega _{2}$ and $\omega _{4}$ in (60) the fact that $0 \le \lambda \le \alpha - 1$ and $0< \gamma < 2$, from 89 we obtain that for every $k \ge 1$ it holds

$$\begin{aligned}&{\mathcal {E}}_{\lambda }^{k+1} - {\mathcal {E}}_{\lambda }^{k} \nonumber \\&\quad = \ 2 \lambda \left( 2 - \alpha \right) s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\qquad - 2 \left( 2 - \gamma \right) sk \left( k + \alpha \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad + 2s \left( \omega _{0} k + \omega _{1} \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad + 2 \bigl ( \omega _{2} k + \left( \lambda + 1 - \alpha \right) \left( \alpha + 1 \right) \bigr ) \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\quad \quad + \dfrac{1}{2} s^{2} \bigl ( \omega _{4} k + \left( 2 - \alpha \right) \left( \alpha + \gamma \right) \bigr ) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2}\nonumber \\&\quad \quad - \dfrac{1}{2} \left( 2 - \gamma \right) s^{2} k \left( 2k + \alpha \right) \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\le \ 2 \lambda \left( 2 - \alpha \right) s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad - 2 \left( 2 - \gamma \right) sk \left( k + \alpha \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad + 2s \left( \omega _{0} k + \omega _{1} \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle + 2 \omega _{2} k \left\Vert z^{k+1} - z^{k} \right\Vert ^{2}\nonumber \\&\quad \quad + \dfrac{1}{2} s^{2} \omega _{4} k \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad \quad - \dfrac{1}{2} \left( 2 - \gamma \right) s^{2} k \left( 2k + \alpha \right) \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}. \end{aligned}$$

(100)

Plugging 100 into 99, it yields for every $k \ge 2$

$$\begin{aligned}&{\mathcal {F}}_{\lambda }^{k+1} - {\mathcal {F}}_{\lambda }^{k} \nonumber \\&\quad \le \ 2 \lambda \left( 2 - \alpha \right) s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad - 2 \left( 2 - \gamma \right) sk \left( k + \alpha \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad - \dfrac{1}{2} \lambda \left( \alpha - 2 \right) s^{2} \left[ \left( 2 - \frac{\alpha }{k + \alpha + 1} \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2}\nonumber \right. \\&\left. \quad \quad - \left( 2 - \frac{\alpha }{k + \alpha } \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \right] \nonumber \\&\quad \quad - 2s \left( 2 - \gamma \right) \left[ \left( k + 1 \right) ^{2} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\rangle \nonumber \right. \\&\left. \quad \quad - k^{2} \left\langle z^{k} - z^{k-1} , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \right] \nonumber \\&\quad \quad + \dfrac{1}{2} \left( 2 - \gamma \right) \alpha s^{2} \left[ \left( k+1 \right) \sqrt{k+1} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}\nonumber \right. \\&\left. \quad \quad - k \sqrt{k} \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} \right] \nonumber \\&\quad \quad + \left( 2 - \gamma \right) s^{3} L \left[ \left( k+1 \right) ^{2} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}\nonumber \right. \\&\left. \quad \quad - k^{2} \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} \right] \nonumber \\&\quad \quad + 2s \left( \omega _{0} k + \omega _{1} \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) \right\rangle + 2 \omega _{2} k \left\Vert z^{k+1} - z^{k} \right\Vert ^{2}\nonumber \\&\quad \quad + \dfrac{1}{2} s^{2} \omega _{4} k \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad \quad - \dfrac{1}{2} \left( 2 - \gamma \right) s^{2} \left( 2k^{2} + \alpha k \right) \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} . \end{aligned}$$

(101)

Our next aim is to derive upper estimates for the first two terms on the right-hand side of 101, which will eventually simplify the subsequent four terms. First we observe that from 55 we have for every $k \ge 1$

$$\begin{aligned}&2 \lambda \left( 2 - \alpha \right) s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad = \ 2 \lambda \left( 2 - \alpha \right) s \left\langle z^{k+1} - \bar{z}^{k} , V \left( \bar{z}^{k} \right) \right\rangle + 2 \lambda \left( 2 - \alpha \right) s \left\langle \bar{z}^{k} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad = \ \lambda \left( \alpha - 2 \right) s^{2} \left( 1 + \frac{k}{k + \alpha } \right) \left\langle V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad + 2 \lambda \left( 2 - \alpha \right) s \left\langle \bar{z}^{k} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad = \ \dfrac{1}{2} \lambda \left( \alpha - 2 \right) s^{2} \left( 2 - \frac{\alpha }{k + \alpha } \right) \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\quad \quad + \dfrac{1}{2} \lambda \left( \alpha - 2 \right) s^{2} \left( 2 - \frac{\alpha }{k + \alpha } \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad \quad - \dfrac{1}{2} \lambda \left( \alpha - 2 \right) s^{2} \left( 2 - \frac{\alpha }{k + \alpha } \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\quad \quad + 2 \lambda \left( 2 - \alpha \right) s \left\langle \bar{z}^{k} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \le \ \lambda \left( \alpha - 2 \right) s^{2} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\qquad + \dfrac{1}{2} \lambda \left( \alpha - 2 \right) s^{2} \left( 2 - \frac{\alpha }{k + \alpha + 1} \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad \quad - \dfrac{1}{2} \lambda \left( \alpha - 2 \right) s^{2} \left( 2 - \frac{\alpha }{k + \alpha } \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\qquad + 2 \lambda \left( 2 - \alpha \right) s \left\langle \bar{z}^{k} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle . \end{aligned}$$

(102)

The monotonicity of V and relation 52 yield for every $k \ge 1$

$$\begin{aligned}&- 2sk \left( k + \alpha \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \le \ 2 sk \left( k + \alpha \right) \left\langle z^{k+1} - z^{k} , \Bigl ( V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \Bigr ) - \Bigl ( V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \Bigr ) \right\rangle \nonumber \\&\quad = \ 2 sk \left( k + \alpha \right) \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad - 2 sk \left( k + \alpha \right) \left\langle z^{k+1} - z^{k} , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad = \ 2s \left( k + 1 \right) ^{2} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad - 2 sk^{2} \left\langle z^{k} - z^{k-1} , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad + 2s \bigl ( \left( \alpha - 2 \right) k - 1 \bigr ) \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad + \alpha s^{2} k \left\langle V \left( \bar{z}^{k} \right) , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \quad + 2s^{2} k^{2} \left\langle V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle . \end{aligned}$$

(103)

Young’s inequality together with 56 show that for every $k \ge \lceil \frac{1}{\alpha - 2} \rceil $ it holds

$$\begin{aligned}&2s \bigl ( \left( \alpha - 2 \right) k - 1 \bigr ) \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \le \ 2 \sqrt{\left( \alpha - 2 \right) k - 1} \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\quad \quad + \dfrac{1}{2} s^{2} \bigl ( \left( \alpha - 2 \right) k - 1 \bigr ) \sqrt{\left( \alpha - 2 \right) k - 1} \left\Vert V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad \le \ 2 \sqrt{\left( \alpha - 2 \right) k} \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\quad \quad + \dfrac{1}{2} \left( \alpha - 1 \right) \sqrt{\alpha - 1} s^{2} \left( k+1 \right) \sqrt{k+1} \left\Vert V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\Vert ^{2} \nonumber \\&\quad \le \ 2 \sqrt{\left( \alpha - 2 \right) k} \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\quad \quad + \dfrac{1}{2} \left( \alpha - 1 \right) \sqrt{\alpha - 1} s^{4} L^{2} \left( k+1 \right) \sqrt{k+1} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\quad \le \ 2 \sqrt{\left( \alpha - 2 \right) k} \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} \nonumber \\&\qquad + \dfrac{1}{2} \left( \alpha - 1 \right) \alpha s^{2} \left( k+1 \right) \sqrt{k+1} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} , \end{aligned}$$

(104)

where in the second estimate we use the fact that $\left( \alpha - 2 \right) k -1 \le \left( \alpha - 1 \right) \left( k + 1 \right) $, while in the last one we combine $\sqrt{\alpha - 1} \le \alpha $ and $sL< 1/2 < 1$.

In addition, for every $k \ge 2$ it holds

$$\begin{aligned}&\alpha s^{2} k \left\langle V \left( \bar{z}^{k} \right) , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \le \ \dfrac{1}{2} \alpha s^{2} \sqrt{k} \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} + \dfrac{1}{2} \alpha s^{2} k \sqrt{k} \left\Vert V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\quad \le \ \dfrac{1}{2} \alpha s^{2} \sqrt{k} \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} + \dfrac{1}{2} \alpha s^{2} k \sqrt{k} \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} , \end{aligned}$$

(105)

and, by using the Cauchy–Schwarz inequality and 56,

$$\begin{aligned}&2s^{2} k^{2} \left\langle V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \le \ s^{3} Lk^{2} \left( \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} + \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} \right) . \end{aligned}$$

(106)

By plugging 104 - 106 into 103 and adding then the result to 102, we get after rearranging the terms for every $k \ge k_0$

$$\begin{aligned}&2 \lambda \left( 2 - \alpha \right) s \left\langle z^{k+1} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad - 2 \left( 2 - \gamma \right) sk \left( k + \alpha \right) \left\langle z^{k+1} - z^{k} , V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \nonumber \\&\quad \le \ \dfrac{1}{2} \lambda \left( \alpha - 2 \right) s^{2} \left[ \left( 2 - \frac{\alpha }{k + \alpha + 1} \right) \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2}\nonumber \right. \\&\left. \quad \quad - \left( 2 - \frac{\alpha }{k + \alpha } \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \right] \nonumber \\&\quad \quad + 2s \left( 2 - \gamma \right) \left[ \left( k + 1 \right) ^{2} \left\langle z^{k+1} - z^{k} , V \left( z^{k+1} \right) - V \left( \bar{z}^{k} \right) \right\rangle \nonumber \right. \\&\left. \quad \quad - k^{2} \left\langle z^{k} - z^{k-1} , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \right] \nonumber \\&\quad \quad - \dfrac{1}{2} \left( 2 - \gamma \right) \alpha s^{2} \left[ \left( k+1 \right) \sqrt{k+1} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}\nonumber \right. \\&\left. \quad \quad - k \sqrt{k} \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} \right] \nonumber \\&\quad \quad - \left( 2 - \gamma \right) s^{3} L \left[ \left( k+1 \right) ^{2} \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}\nonumber \right. \\&\left. \quad \quad - k^{2} \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} \right] \nonumber \\&\quad \quad + 2 \lambda \left( 2 - \alpha \right) s \left\langle \bar{z}^{k} - z_{*} , V \left( \bar{z}^{k} \right) \right\rangle \nonumber \\&\quad \quad - \dfrac{1}{2} s^{2} \Bigl ( \mu _{k} - \left( 2 - \gamma \right) \left( 2k^{2} + \alpha k \right) \Bigr ) \left\Vert V \left( \bar{z}^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&\quad \quad + 2 \left( 2 - \gamma \right) \sqrt{\left( \alpha - 2 \right) k} \left\Vert z^{k+1} - z^{k} \right\Vert ^{2} + \dfrac{1}{2} \left( 2 - \gamma \right) \alpha s^{2} \sqrt{k} \left\Vert V \left( \bar{z}^{k} \right) \right\Vert ^{2} , \end{aligned}$$

(107)

where we set

$$\begin{aligned} \mu _{k}&{:=} - 2 \lambda \left( \alpha - 2 \right) - \left( 2 - \gamma \right) \left( \left( \alpha - 1 \right) \alpha \left( k+1 \right) \sqrt{k+1} \right. \\&\quad \left. + \alpha \left( k+1 \right) \sqrt{k+1} + 4sL \left( k + 1 \right) ^{2} \right) \\&\quad + \left( 2 - \gamma \right) \left( 2k^{2} + \alpha k \right) \\&= \left( 2 - \gamma \right) \left( \left( 2 - 4sL \right) \left( k + 1 \right) ^{2} + \left( \alpha - 4 \right) k \right. \\&\quad \left. - 2 - \alpha ^{2} \left( k+1 \right) \sqrt{k+1} \right) - 2 \lambda \left( \alpha - 2 \right) \\&= \left( 2 - \gamma \right) \left( 2 \left( 1 - 2sL \right) \left( k + 1 \right) + \alpha ^{2} \sqrt{k+1} + \alpha - 4 \right) \left( k+1 \right) \\&\quad - \left( 2 - \gamma \right) \left( \alpha - 2 \right) - 2 \lambda \left( \alpha - 2 \right) . \end{aligned}$$

Finally, by summing up the relations 101 and 107, we obtain the desired estimate.

(ii) By the definition of $u_{\lambda }^{k}$ in 58 and by using the identity 32, for every $k \ge 1$ it holds

$$\begin{aligned} {\mathcal {E}}_{\lambda }^{k}&= \ \dfrac{1}{2} \left\Vert u^{k}_{\lambda } \right\Vert ^{2} + 2 \lambda \left( \alpha - 1 - \lambda \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} + 2 \left( 2 - \gamma \right) \lambda sk \left\langle z^{k} - z_{*} , V \left( \bar{z}^{k-1} \right) \right\rangle \\&\quad + \dfrac{1}{2} \left( 2 - \gamma \right) s^{2} k \left( \gamma k + \alpha \right) \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&= \ \dfrac{1}{2} \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + 2k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2}\nonumber \\&\quad + 2 \lambda \left( \alpha - 1 - \dfrac{2 \lambda }{\gamma } \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} \nonumber \\&\quad + \dfrac{1}{2} \left( 2 - \gamma \right) \alpha s^{2} k \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} + \dfrac{2 - \gamma }{2 \gamma } \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \nonumber \\&= \ \dfrac{2 - \gamma }{2 \gamma } \left( \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + 2k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \right. \\&\quad \left. + \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \right) \\&\quad + 2 \lambda \left( \alpha - 1 - \dfrac{2 \lambda }{\gamma } \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} + \dfrac{1}{2} \left( 2 - \gamma \right) \alpha s^{2} k \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&\quad + \dfrac{\gamma - 1}{\gamma } \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + 2k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&= \ \dfrac{2 - \gamma }{\gamma } \left( \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} + k^{2} \left\Vert z^{k} - z^{k-1} \right\Vert ^{2} \right) \\&\quad + 2 \lambda \left( \alpha - 1 - \dfrac{2 \lambda }{\gamma } \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} + \dfrac{1}{2} \left( 2 - \gamma \right) \alpha s^{2} k \left\Vert V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&\quad + \dfrac{\gamma - 1}{\gamma } \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + 2k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} . \end{aligned}$$

Consequently, as $1< \gamma < 2$, for every $k \ge k_{1} = \lceil \frac{2 \lambda \left( \alpha - 2 \right) }{\left( 2 - \gamma \right) \alpha } \rceil $ we have

Now we use relation 56 and apply Lemma A.7 with $\left( a, b, c \right) := \Bigl ( \frac{1}{2}, -s, \frac{s}{L} \Bigr )$ to verify that for every $k \ge 1$

$$\begin{aligned}&\dfrac{1}{2} k^{2} \left\Vert z^{k} - z^{k-1} \right\Vert ^{2} - 2 sk^{2} \left\langle z^{k} - z^{k-1} , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \\&\qquad + s^{3} L k^{2} \left\Vert V \left( \bar{z}^{k-1} \right) - V \left( \bar{z}^{k-2} \right) \right\Vert ^{2} \\&\quad \ge \ k^{2} \left( \dfrac{1}{2} \left\Vert z^{k} - z^{k-1} \right\Vert ^{2} - 2s \left\langle z^{k} - z^{k-1} , V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\rangle \right. \\&\left. \qquad + \dfrac{s}{L} \left\Vert V \left( z^{k} \right) - V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \right) \\&\quad \ge \ 0. \end{aligned}$$

Combining the last two estimates, one can easily conclude that for every $k \ge k_{1}$ it holds

$$\begin{aligned} {\mathcal {F}}_{\lambda }^{k}&\ge \ \dfrac{2 - \gamma }{\gamma } \left\Vert 2 \lambda \left( z^{k} - z_{*} \right) + k \left( z^{k} - z^{k-1} \right) + \gamma sk V \left( \bar{z}^{k-1} \right) \right\Vert ^{2} \\&\quad + \dfrac{\left( 2 - \gamma \right) ^{2}}{2 \gamma } k^{2} \left\Vert z^{k} - z^{k-1} \right\Vert ^{2} + 2 \lambda \left( \alpha - 1 - \dfrac{2 \lambda }{\gamma } \right) \left\Vert z^{k} - z_{*} \right\Vert ^{2} , \end{aligned}$$

which is the desired inequality. $\square $

Proof of Lemma 14

(i) First we notice that $2 \left( 1 - \frac{1}{\gamma } \right) = 2 - \frac{2}{\gamma } < 1$ and

$$\begin{aligned} \dfrac{1}{\gamma \left( \alpha - 2 \right) } \bigl ( \left( 2 - \gamma \right) \left( \alpha - 1 \right) + \left( \gamma - 1 \right) \left( \alpha - 2 \right) \bigr )< 1 \Leftrightarrow 1 + \dfrac{1}{\alpha - 1}< \gamma < 2 . \end{aligned}$$

This means, if $\gamma $ satisfies 62, it holds

$$\begin{aligned} \max \left\{ \sqrt{2 \left( 1 - \dfrac{1}{\gamma } \right) } , \sqrt{\dfrac{\left( 2 - \gamma \right) \left( \alpha - 1 \right) + \left( \gamma - 1 \right) \left( \alpha - 2 \right) }{\gamma \left( \alpha - 2 \right) }} \right\} < 1 , \end{aligned}$$

and thus one can choose $\delta $ to fulfill 63.

For the quadratic expression in $R_k$, we calculate

$$\begin{aligned} \dfrac{\Delta _{k}'}{s^{2}}&:= \left( \omega _{0} k + \omega _{1} \right) ^{2} - \delta ^{2} k \Bigl ( \omega _{2} \sqrt{k} + \omega _{3} \Bigr ) \Bigl ( \omega _{4} \sqrt{k} + \omega _{5} \Bigr ) \\&= \left( \omega _{0}^{2} - \delta ^{2} \omega _{2} \omega _{4} \right) k^{2} - \delta ^{2} \left( \omega _{2} \omega _{5} + \omega _{3} \omega _{4} \right) k \sqrt{k} + \left( 2 \omega _{0} \omega _{1} - \delta ^{2} \omega _{3} \omega _{5} \right) k + \omega _{1}^{2} . \end{aligned}$$

Since $\left( \omega _{0}^{2} - \delta ^{2} \omega _{2} \omega _{4} \right) k^{2}$ is the dominant term in the above polynomial, it suffices to guarantee that $\omega _{0}^{2} - \delta ^{2} \omega _{2} \omega _{4} < 0$ in order to be sure that there exits some integer $k_{2} \left( \lambda \right) \ge 1$ such that $\Delta _{k}' \le 0$ for every $k \ge k_{2} \left( \lambda \right) $ and to obtain from here, due to Lemma A.7 (ii), that $R_{k} \le 0$ for every $k \ge k_{2} \left( \lambda \right) $.

It remains to show that there exists a choice of $\lambda $ for which $\omega _{0}^{2} - \delta ^{2} \omega _{2} \omega _{4} < 0$ holds. We set $\xi := \lambda + 1 - \alpha \le 0$ and get

$$\begin{aligned} \omega _{0}&= 2 \lambda + \gamma - \alpha + \gamma \left( 1 - \alpha \right) = 2 \lambda - \alpha + \gamma \left( 2 - \alpha \right) = 2 \xi + \left( \gamma - 1 \right) \left( 2 - \alpha \right) , \\ \omega _{2} \omega _{4}&= - 4 \gamma \left( \alpha - 2 \right) \xi . \end{aligned}$$

This means that we have to guarantee that there exists a choice for $\xi $ for which

$$\begin{aligned} \omega _{0}^{2} - \delta ^{2} \omega _{2} \omega _{4}&= \Bigl ( 2 \xi - \left( \gamma - 1 \right) \left( \alpha - 2 \right) \Bigr ) ^{2} + 4 \delta ^{2} \gamma \left( \alpha - 2 \right) \xi \nonumber \\&= 4 \xi ^{2} + 4 \left( \alpha - 2 \right) \left( \delta ^{2} \gamma - \gamma + 1 \right) \xi + \left( \gamma - 1 \right) ^{2} \left( \alpha - 2 \right) ^{2} < 0 . \end{aligned}$$

(108)

A direct computation shows that, according to 63,

$$\begin{aligned} \Delta _{\xi }':= & {} 4 \left( \alpha - 2 \right) ^{2} \left[ \left( \delta ^{2} \gamma - \gamma + 1 \right) ^{2} - \left( \gamma - 1 \right) ^{2} \right] \\= & {} 4 \left( \alpha - 2 \right) ^{2} \delta ^{2} \gamma \left( \delta ^{2} \gamma - 2 \left( \gamma - 1 \right) \right) > 0. \end{aligned}$$

Hence, in order to get 108, we have to choose $\xi $ between the two roots of the quadratic function arising in this formula, in other words

$$\begin{aligned} \xi _{1} \left( \alpha , \gamma \right)&:= - \dfrac{1}{2} \left( \alpha - 2 \right) \left( \delta ^{2} \gamma - \gamma + 1 \right) - \dfrac{\sqrt{\Delta _{\xi }'}}{4} \\&< \xi = \lambda + 1 - \alpha < \xi _{2} \left( \alpha , \gamma \right) := - \dfrac{1}{2} \left( \alpha - 2 \right) \left( \delta ^{2} \gamma - \gamma + 1 \right) + \dfrac{\sqrt{\Delta _{\xi }'}}{4} . \end{aligned}$$

Obviously $\xi _{1} \left( \alpha , \gamma \right) < 0$ and from Viète’s formula $\xi _{1} \left( \alpha , \gamma \right) \cdot \xi _{2} \left( \alpha , \gamma \right) = \frac{\left( \gamma - 1 \right) ^{2} \left( \alpha - 2 \right) ^{2}}{4}$, it follows that $\xi _{2} \left( \alpha , \gamma \right) < 0$ as well.

Therefore, going back to $\lambda $, in order to be sure that $\omega _{0}^{2} - \delta ^{2} \omega _{2} \omega _{4} < 0$ this must be chosen such that

$$\begin{aligned} \alpha - 1 + \xi _{1} \left( \alpha , \gamma \right)< \lambda < \alpha - 1 + \xi _{2} \left( \alpha , \gamma \right) . \end{aligned}$$

Next we will show that

$$\begin{aligned} 0< \alpha - 1 - \dfrac{1}{2} \left( \alpha - 2 \right) \left( \delta ^{2} \gamma - \gamma + 1 \right) < \dfrac{\gamma }{2} \left( \alpha - 1 \right) . \end{aligned}$$

(109)

Indeed, the left-hand side inequality 109 is straightforward since

$$\begin{aligned} 0< \alpha - 1 - \dfrac{1}{2} \left( \alpha - 2 \right) \left( \delta ^{2} \gamma - \gamma + 1 \right) \Leftrightarrow \delta ^{2} < 1 + \dfrac{1}{\gamma } \left( 1 + \dfrac{2}{\alpha - 2} \right) . \end{aligned}$$

The right-hand side inequality 109 is equivalent to

$$\begin{aligned} \alpha - 1 - \dfrac{1}{2} \left( \alpha - 2 \right) \left( \delta ^{2} \gamma - \gamma + 1 \right)< & {} \dfrac{\gamma }{2} \left( \alpha - 1 \right) \Leftrightarrow \delta ^{2} > \dfrac{1}{\gamma \left( \alpha - 2 \right) } \bigl ( \left( 2 - \gamma \right) \left( \alpha - 1 \right) \\{} & {} \quad + \left( \gamma - 1 \right) \left( \alpha - 2 \right) \bigr ), \end{aligned}$$

which is true according to 63.

From 109, we immediately deduce that

$$\begin{aligned} 0< \alpha - 1 + \xi _{2} \left( \alpha , \gamma \right) \quad \text { and } \quad \alpha - 1 + \xi _{1} \left( \alpha , \gamma \right) < \dfrac{\gamma }{2} \left( \alpha - 1 \right) , \end{aligned}$$

which allow us to choose

$$\begin{aligned} {\underline{\lambda }} \left( \alpha , \gamma \right):= & {} \max \left\{ 0, \alpha - 1 + \xi _{1} \left( \alpha , \gamma \right) \right\} < {\overline{\lambda }} \left( \alpha , \gamma \right) \\:= & {} \min \left\{ \dfrac{\gamma }{2} \left( \alpha - 1 \right) , \alpha - 1 + \xi _{2} \left( \alpha , \gamma \right) \right\} . \end{aligned}$$

In conclusion, choosing $\lambda $ to satisfy ${\underline{\lambda }} \left( \alpha , \gamma \right)< \lambda < {\overline{\lambda }} \left( \alpha , \gamma \right) $, we have $\omega _{0}^{2} - \delta ^{2} \omega _{2} \omega _{4} < 0$ and therefore there exists some integer $k_{2} \left( \lambda \right) \ge 1$ such that $R_k \le 0$ for every $k \ge k_{2} \left( \lambda \right) $.

(ii) For every $k \ge 1$, we have

$$\begin{aligned}&\mu _{k} - \left( 2 - \gamma \right) \left( 1 - 2sL \right) \left( k+1 \right) ^{2}\\&\quad = \left( 2 - \gamma \right) \left( 1 - 2sL \right) \left( k+1 \right) ^{2} + \left( 2 - \gamma \right) \alpha ^{2} \left( k+1 \right) \sqrt{k+1}\\&\qquad + \left( 2 - \gamma \right) \left( \alpha - 4 \right) \left( k+1 \right) - \left( 2 - \gamma \right) \left( \alpha - 2 \right) - 2 \lambda \left( \alpha - 2 \right) , \end{aligned}$$

and the conclusion is obvious since $\gamma <2$ and $s < \dfrac{1}{2\,L}$. $\square $

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Boţ, R.I., Csetnek, E.R. & Nguyen, DK. Fast Optimistic Gradient Descent Ascent (OGDA) Method in Continuous and Discrete Time. Found Comput Math (2023). https://doi.org/10.1007/s10208-023-09636-5

Download citation

Received: 25 March 2022
Revised: 16 June 2023
Accepted: 11 September 2023
Published: 29 November 2023
DOI: https://doi.org/10.1007/s10208-023-09636-5

Keywords

Mathematics Subject Classification

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Fast Optimistic Gradient Descent Ascent (OGDA) Method in Continuous and Discrete Time

Abstract

Similar content being viewed by others

Relaxed Inertial Method for Solving Split Monotone Variational Inclusion Problem with Multiple Output Sets Without Co-coerciveness and Lipschitz Continuity

Tseng’s extragradient method with double projection for solving pseudomonotone variational inequality problems in Hilbert spaces

Random Gradient-Free Minimization of Convex Functions

1 Introduction

1.1 Related Works

1.2 Our Contributions

Remark 1

Remark 2

2 The Continuous Time Approach

Lemma 3

Proof

Theorem 4

Proof

Theorem 5

Theorem 6

Proof

Theorem 7

Proof

Remark 8

3 An Implicit Numerical Algorithm

Lemma 9

Theorem 10

Proof

Theorem 11

Proof

Theorem 12

Proof

4 An Explicit Algorithm

Lemma 13

Lemma 14

Theorem 15

Proof

Theorem 16

Proof

Theorem 17

Proof

5 Numerical Experiments

References

Acknowledgements

Funding

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Appendix

Appendix

1.1 Auxiliary Results

Lemma A.1

Lemma A.2

Lemma A.3

Lemma A.4

Lemma A.5

Proof

Lemma A.6

Lemma A.7

1.2 Proof of the Existence and Uniqueness Theorem for the Evolution Equation

Proof of Theorem 5

1.3 Proof of the Technical Lemma Used in the Analysis of the Implicit Algorithm

Proof of Lemma 9

1.4 Proofs of the Technical Lemmas Used in the Analysis of the Explicit Algorithm

Proof of Lemma 13

Proof of Lemma 14

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation