1 Introduction

The mathematical model The parabolic variational inequality of this paper describes the flow of a Bingham fluid, which behaves as a rigid material, if the stress is below a certain threshold. If the flow is sufficiently slow (i.e., if the convective term can be omitted for small Reynolds numbers), the inertial terms can be neglected, and, in the case of a uni-directional flow, i.e., the flow in a pipe of cross-section \(\Omega \subset {\mathbb {R}}^2\), the problem reads as follows. Given a viscosity \(\mu >0\), a yield limit \(g\ge 0\), some right-hand side \(f\in L^2(0,T;L^2(\Omega ))\) and initial data \(u_0\in L^2(\Omega )\) in a bounded polygonal Lipschitz domain \(\Omega \subset {\mathbb {R}}^2\), the time-dependent Mosolov or Bingham problem for the flow in a pipe seeks \(u\in L^2(0,T;H^1_0(\Omega ))\) with a time derivative \({\dot{u}}\in L^2(0,T;H^{-1}(\Omega ))\) such that \(u(0)=u_0\) and, at a.e. \(t\in [0,T]\), any \(v\in H^1_0(\Omega )\) satisfies

figure a

Here and throughout the paper, \(\nabla \) denotes the gradient with respect to the space variable, \({\dot{v}}=(d/dt) v\) denotes the time-derivative of a function \(v:[0,T]\rightarrow X\), and \(\langle a,b\rangle _{H^{-1}(\Omega )\times H^1_0(\Omega )}\) denotes the dual pairing of \(a\in H^{-1}(\Omega )\) and \(b\in H^1_0(\Omega )\). Although this paper mainly considers uni-directional flows, all of the results are also valid for the general flow in a three-dimensional domain as outlined in Sect. 5.

Discretisations in space Conforming finite element methods (FEM) for the discretisation in space suffer from the non-differentiable \(L^1\) norm, which results in a suboptimal predicted convergence \(O(h^{1/2})\) for general smooth solutions and in a convergence \(O(h\sqrt{\log (1/h)})\) for a particular prototype example of a circular domain [16]. The traditional analysis for linear problems via the Strang-Fix lemma for non-conforming schemes relies qualitatively on the smoothness of the exact solution. This has been overcome recently by a medius analysis, which employs techniques from a posteriori analysis to prove a priori results [7, 9, 12, 18]. Quasi-optimal error estimates for the stationary Bingham problem were proved for a non-conforming FEM [11] based on the finite element space of Crouzeix and Raviart [10] whenever a stress-like variable belongs to \(H^1(\Omega )\). This results in the optimal convergence O(h), whenever the solution and a corresponding stress-like variable are smooth. The only other discretisations in space with proved optimal convergence rates were those of [14] (which is equivalent to the discretisation of [11]) and the mixed discretisations of the very recent paper [15]. The generalisation to the three-dimensional case is still open [15] and the combination of it with a time-discretisation is left for future research.

Time discretisation This paper combines the discretisation of [11] in space with two time-discretisations of (tB): A generalised midpoint rule and a discontinuous Galerkin discretisation.

The mathematical analysis of this paper is twofold: The dG(0) scheme is studied under the relatively weak assumption of consistent initial data and a source term \(f\in H^1(0,T;L^2(\Omega ))\) that guarantees a weak solution \(u\in H^1(0,T;L^2(\Omega ))\cap L^2(0,T;H^1_0(\Omega ))\). This results in a discrete approximation \(u_\textrm{DG}\in P_0({\mathcal {K}};\textrm{CR}^1_0({\mathcal {T}}))\) that is piecewise constant in time with minimal time-step \({\underline{k}}\) and maximal time-step k and a Crouzeix-Raviart function in space based on a shape-regular triangulation \({\mathcal {T}}\) with maximal mesh-size h with the plain convergence

$$\begin{aligned} \Vert \nabla _{\textrm{NC}}(u-u_\textrm{DG})\Vert _{L^2(Q)} + \Vert u-u_\textrm{DG}\Vert _{L^\infty (L^2)} \rightarrow 0 \qquad \text {as }(h,k)\rightarrow 0 \end{aligned}$$
(1.1)

under the (mild) constraint \(h\lesssim \sqrt{{\underline{k}}}\). In fact, according to the best knowledge of the authors (and seemingly that of our anonymous referees) this is the first paper that proves convergence under realistic regularity assumptions on the weak solution. In the absence of further regularity of the weak solution u, any convergence rate in (1.1) remains unclear. This paper identifies certain best-approximation terms of \(\nabla u\) and a stress-type variable \(\sigma \in \partial W(\nabla u)\) (derived in Theorem 2.3) plus data oscillation terms of f as an upper bound for the errors in (1.1). The backward Euler scheme is a modified dG(0) scheme with a slightly different treatment of the source term. In this low regularity regime, there are no other convergence results for the generalised midpoint scheme. In all numerical examples below f is piecewise constant and so \(u_\textrm{DG}=u_h\) (for \(\theta _\ell =1\) in the generalised mid-point rule with \(\tau _\ell =t_\ell \) for the backward Euler scheme). The second look at the class of time-discretisations hence solely concerns the generalised mid-point rules under the (possibly) unrealistic regularity assumptions \(u\in W^{m,1}(0,T;L^2(\Omega )) \cap C(0,T;H^1_0(\Omega ))\) and \(\sigma \in C(0,T;L^2(\Omega ;{\mathbb {R}}^2))\) that allow for a pointwise evaluation of the variational inequality at the prescribed times \(\tau _\ell \) for \(m=2,3\). In the second analysis for smooth solutions, the convergence rates are \(O(h+k)\) and for a uniform mid-point rule \(O(h+k^2)\). This improves the known result for a conforming FEM of [21].

Although discontinuous Galerkin methods for parabolic equations are already included in textbooks [19], to the best of the authors knowledge only [2, 8] define a discontinuous Galerkin FEM for a time-dependent variational inequality.

An overview over some time-stepping schemes for problem (tB) can be found in [13, 17]; see also the references therein. However, proofs of convergence rates are rare and only concern the combination of continuous linear finite elements in space with backward differences in time, which result in a convergence \(O(h^{1/2}+k)\) in [21] under the much stronger regularity assumptions \(u\in L^\infty (0,T;H^2(\Omega ))\), \({\dot{u}}\in L^2(0,T;H^1(\Omega ))\cap L^\infty (0,T;L^2(\Omega ))\), and \(\ddot{u}\in L^2(0,T;H^{-1}(\Omega ))\). This suboptimal convergence in space should be contrasted with the optimal convergence rate obtained in this paper, which is a consequence of the non-conforming discretisation in space.

Structure of the paper The paper is organised as follows. Section 2 specifies notation and states the main results in Sect. 2.4. The proofs of the error estimates for the generalised midpoint rule follow in Sect. 3, while the error estimate for the discontinuous Galerkin FEM is proved in Sect. 4. Section 5 generalises the results to the Bingham flow in a three-dimensional domain. Section 6 concludes the paper with numerical experiments.

General notation Standard notation on Lebesgue and Sobolev spaces applies throughout the paper and the space of \(L^2\) functions with values in \({\mathbb {R}}^m\) reads \(L^2(\Omega ;{\mathbb {R}}^m)\) and \(\int _\Omega \bullet \,dx\) denotes an integral over a domain \(\Omega \subset {\mathbb {R}}^2\) in space, while \(\int _{\widetilde{Q}} \bullet \,dQ\) denotes an integral over a domain \(\widetilde{Q}=(t_1,t_2)\times \Omega \) in space and time and \(\left\| \bullet \right\| _{L^2(\Omega )}\) and \(\left\| \bullet \right\| _{L^2(\widetilde{Q})}\) denote the respective \(L^2\) norms. The notation \(\bullet \) abbreviates the identity mapping. The space \(L^2(t_0,t_1;X)\) denotes the space of \(L^2\) functions \(v:[t_0,t_1]\rightarrow X\) for any linear space X, the space \(W^{m,p}(t_0,t_1;X)\) denotes the space of m-times (weakly) differentiable \(L^p(t_0,t_1;X)\) functions, whose derivatives belong to \(L^p(t_0,t_1;X)\), while \(C(t_0,t_1;X)\) denotes the space of continuous functions from \([t_0,t_1]\) to X. We abbreviate \(\Vert \bullet \Vert _{L^p(L^q)}:=\Vert \bullet \Vert _{L^p(0,T;L^q(\Omega ))}\). The formula \(A\lesssim B\) abbreviates that there exists a positive generic mesh-size independent constant \(C>0\) such that \(A\le C B\), \(A\approx B\) abbreviates \(A\lesssim B\lesssim A\).

2 Notation and main results

This section defines some notation in Sect. 2.1 and recalls two operators for the \(P_1\) non-conforming finite element space in Sect. 2.2, before it defines the discrete problems in Sect. 2.3 and states the main results in Sect. 2.4.

2.1 Notation

A shape-regular triangulation \({\mathcal {T}}\) of a polygonal bounded Lipschitz domain \(\Omega \subset {\mathbb {R}}^n\) is a set of simplices such that \({\overline{\Omega }}=\bigcup {\mathcal {T}}\) and any two distinct simplices are either disjoint or share exactly one common face, edge or vertex. Let \({\mathcal {E}}\) denote the set of sides in \({\mathcal {T}}\) and \({\mathcal {N}}\) the set of vertices. Let \(P_p(K;{\mathbb {R}}^m)\) denote the set of polynomials on K of total degree \(\le p\) and \(P_p({\mathcal {T}};{\mathbb {R}}^m)\) the set of piecewise polynomials, i.e.,

$$\begin{aligned} \begin{array}{rl} P_p(K;{\mathbb {R}}^m) &{}:= \{v:K\rightarrow {\mathbb {R}}^m\;|\;\forall \ell =1,\ldots ,m,\text { the component}\\ &{}\qquad \qquad \;v_\ell \text { of }v \text { is a polynomial of total degree}\le p\},\\ P_p({\mathcal {T}};{\mathbb {R}}^m) &{}:= \{v:\Omega \rightarrow {\mathbb {R}}^m\;| \; \forall K\in {\mathcal {T}},v|_K \in P_p(K;{\mathbb {R}}^m)\} \end{array} \end{aligned}$$

and let \(P_p({\mathcal {T}}):=P_p({\mathcal {T}};{\mathbb {R}})\). Define the space of continuous piecewise affine functions with homogeneous boundary conditions by

$$\begin{aligned} S^1_0({\mathcal {T}}):=P_1({\mathcal {T}})\cap C_0(\Omega ), \end{aligned}$$

where \(C_0(\Omega )\) denotes the space of continuous functions with homogeneous boundary conditions on \(\partial \Omega \).

For an edge \(E\in {\mathcal {E}}\), \(\textrm{mid}(E)\) denotes the midpoint of E. The integral mean over a domain \(\omega \subseteq \Omega \) or \(\omega \subseteq [0,T]\) is defined as , where \(|\omega |\) denotes the volume of \(\omega \) or the length of \(\omega \), if it is an edge or an interval. The \(L^2\)-projection onto \({\mathcal {T}}\)-piecewise constant functions or vectors \({\Pi _0:L^2(\Omega ;{\mathbb {R}}^m)\rightarrow P_0({\mathcal {T}};{\mathbb {R}}^m)}\) is defined by for all \(K\in {\mathcal {T}}\) with area \(\vert K\vert \) and all \(f\in L^2(\Omega ;{\mathbb {R}}^m)\). Let \(h_{\mathcal {T}}\in P_0({\mathcal {T}})\) denote the piecewise constant mesh-size with \(h_{\mathcal {T}}\vert _K:={\textrm{diam}}(K)\) for all \(K\in {\mathcal {T}}\). The square of the oscillations of a function \(f\in L^2(\Omega )\) is defined by \(\textrm{osc}^2(f,{\mathcal {T}}):=\Vert h_{\mathcal {T}}(f-\Pi _0 f)\Vert _{L^2(\Omega )}^2\). For piecewise affine functions \(v_h\in P_1({\mathcal {T}})\) the \({\mathcal {T}}\)-piecewise gradient \(\nabla _{\textrm{NC}}v_h\) with \((\nabla _{\textrm{NC}}v_h)\vert _K=\nabla (v_h\vert _K)\) for all \(K\in {\mathcal {T}}\) exists and \(\nabla _{\textrm{NC}}v_h\in P_0({\mathcal {T}};{\mathbb {R}}^{n})\).

The \(P_1\) non-conforming finite element space [10], sometimes named after Crouzeix and Raviart, reads

$$\begin{aligned} \textrm{CR}^1_0({\mathcal {T}}) :=\{v_\textrm{CR}\in P_1({\mathcal {T}})\;\vert \;v_\textrm{CR}\text { is continuous at midpoints of interior}\\ \text {edges and vanishes at midpoints of boundary edges}\}. \end{aligned}$$

2.2 Interpolation and companion operators

This subsection recalls the non-conforming interpolation operator and a companion operator for Crouzeix-Raviart functions in 2D.

Non-conforming interpolation operator The non-conforming interpolation operator \(I_{\textrm{NC}}:H^1_0(\Omega )\rightarrow \textrm{CR}^1_0({\mathcal {T}})\) is defined by

and satisfies the integral mean property

$$\begin{aligned} \nabla _{\textrm{NC}}I_{\textrm{NC}}v= \Pi _0\nabla v \end{aligned}$$

and the approximation and stability estimates

$$\begin{aligned} \begin{aligned} \Vert h_{\mathcal {T}}^{-1}(v-I_{\textrm{NC}}v)\Vert _{L^2(\Omega )}&\le C_{\textrm{NC}}\Vert \nabla _{\textrm{NC}}(v-I_{\textrm{NC}}v)\Vert _{L^2(\Omega )}\\&= C_{\textrm{NC}}\Vert \nabla _{\textrm{NC}}v-\Pi _0\nabla _{\textrm{NC}}v\Vert _{L^2(\Omega )} \end{aligned} \end{aligned}$$
(2.1)

with constant \(C_{\textrm{NC}}=0.4396\) [5, Theorem 2.1].

Companion operators The design of three conforming companions to any \(v_h\in \textrm{CR}^1_0({\mathcal {T}})\) begins with the map \(J_1: \textrm{CR}^1_0({\mathcal {T}})\rightarrow S^1_0({\mathcal {T}})\). Given a vertex \(z\in {\mathcal {N}}\), let \({\mathcal {T}}(z):=\{T\in {\mathcal {T}}\mid z\in {\mathcal {T}}\}\) denote the set of triangles that contain z. Then \(J_1\) is defined by

$$\begin{aligned} J_1 v_h:=\sum _{z\in {\mathcal {N}}(\Omega )}\textrm{card}({\mathcal {T}}(z))^{-1} \sum _{T\in {\mathcal {T}}(z)} v_h|_T(z)\,\varphi _z \qquad \text {for all }v_h\in \textrm{CR}^1_0({\mathcal {T}}) \end{aligned}$$

with the conforming nodal basis function \(\varphi _z\in S^1_0({\mathcal {T}})\). This operator is also known as enriching operator in the context of fast solvers [3]. For a given edge \(E:={\textrm{conv}}\{a,b\}\in {\mathcal {E}}\) let \(b_E:=6\varphi _a\varphi _b\) denote the edge bubble function. Then the operator \(J_2:\textrm{CR}^1_0({\mathcal {T}})\rightarrow P_2({\mathcal {T}})\cap C_0(\Omega )\) is given by

For any triangle \(T\in {\mathcal {T}}\) with \(T={\textrm{conv}}\{a,b,c\}\) define the element bubble function \(b_T:=60\varphi _a\varphi _b\varphi _c\). The operator \(J_3:\textrm{CR}^1_0({\mathcal {T}})\rightarrow P_3({\mathcal {T}})\cap C_0(\Omega )\) is given by

(2.2)

Lemma 2.1

( [6]) The operators \(J_m:\textrm{CR}^1_0({\mathcal {T}})\rightarrow (P_m({\mathcal {T}})\cap C_0(\Omega ))\), \(m=1,2,3\), defined above satisfy the approximation and stability properties

$$\begin{aligned} \begin{aligned}&\Vert h_\mathcal {T}^{-1}({v_{h}}-{J_{m}} {v_{h}})\Vert _{L^2(\Omega )} +\Vert \nabla _{{\textrm{NC}}}({v_{h}}-{J_{m}} {v_{h}})\Vert _{L^2(\Omega )}\\&\qquad \qquad \le C_{\textrm{comp}} \min _{\varphi \in H^1_0(\Omega )} \Vert \nabla _{{\textrm{NC}}}({v_{h}}-\varphi )\Vert _{L^{2}(\Omega )} \le C_{\textrm{comp}}\Vert \nabla _\textrm{NC} v_h\Vert _{L^{2}(\Omega )} \end{aligned} \end{aligned}$$
(2.3)

and the conservation properties

$$\begin{aligned} \int \limits _T\nabla _{\textrm{NC}}(v_h - J_m v_h)\,dx&= 0{} & {} \quad \text {for all } T\in {\mathcal {T}}\text { and } m=2,3, \end{aligned}$$
(2.4)
$$\begin{aligned} \int \limits _T(v_h-J_3 v_h)\,dx&=0\quad{} & {} \quad \text {for all }T\in {\mathcal {T}}. \end{aligned}$$
(2.5)

2.3 Problem formulation

Given a yield limit \(g>0\) and \(f\in L^2(0,T;L^2(\Omega ))\), define the functionals

$$\begin{aligned} j(q)&:=g\int \limits _\Omega |q|\,dx{} & {} \quad \text {for all }q\in L^2(\Omega ;{\mathbb {R}}^2)\,, \\ F(t,v)&:=\int \limits _\Omega f(t) v\,dx{} & {} \quad \text {for all }v\in H^1_0(\Omega )\text { and a.e. }\,t\in (0,T). \end{aligned}$$

Then j is convex and lower semi continuous.

The following problem admits a unique solution [17, Chapter III, Theorem 6.1].

Definition 2.2

(Problem (tB)) Given initial data \(u_0\in L^2(\Omega )\), the time-dependent Mosolov problem in two dimensions seeks \(u\in L^2(0,T;H^1_0(\Omega ))\cap H^{1}(0,T;H^{-1}(\Omega ))\) such that \(u(0) = u_0\) and, for a.e. \(t\in (0,T)\) and all \(v\in H^1_0(\Omega )\), it holds that

figure b

The following theorem proves the existence of some stress-like variable \(\sigma \). Furthermore, it proves regularity of the solution u of (tB) under smoothness assumptions on f and \(u_0\). In the stationary case, the existence of the stress-like variable \(\sigma \) is proved in [17, Chapter II, Theorem 6.3].

Theorem 2.3

(Regularity result) There exists a stress-like variable \(\sigma \in L^2(0,T;\) \(L^2(\Omega ;{\mathbb {R}}^2))\) that, at a.e. \(t\in (0,T)\), satisfies

$$\begin{aligned} \sigma (t)\in \partial W(\nabla u(t)) \quad \text { and }\quad F(t,\bullet ) - {\dot{u}}+{\text {div}}\sigma (t)=0\in H^{-1}(\Omega ), \end{aligned}$$
(2.6)

where \(W(A):=(\mu /2)|A|^2 + g |A|\) for all \(A\in {\mathbb {R}}^2\).

Suppose \(f\in H^1(0,T;L^2(\Omega ))\) and consistent initial values in the sense that \(u_0\in H^1_0(\Omega )\) and there exists some \(w\in L^2(\Omega )\) such that, for all \(v\in H^1_0(\Omega )\),

$$\begin{aligned} \int \limits _\Omega w (v-u_0)\,dx \le \int \limits _\Omega \nabla u_0\cdot \nabla (v-u_0)\,dx + j(\nabla v)-j(\nabla u_0). \end{aligned}$$
(2.7)

Then the solution u to (tB) satisfies \(u\in H^1(0,T;L^2(\Omega ))\).

Proof

The proof of the existence of the stress-like variable \(\sigma \) follows similar to the stationary case with a regularisation argument as in [17, Chapter II, Theorem 6.3]: The regularisation

$$\begin{aligned} W_\varepsilon (A):= (\mu /2) |A|^2 + g \sqrt{\varepsilon ^2 + |A|^2} \end{aligned}$$

of W leads to a stress \(\sigma _\varepsilon :=D W_\varepsilon (\nabla u_\varepsilon )\in L^2(0,T;L^2(\Omega ))\) with the solution \(u_\varepsilon \in L^2(0,T;H^1_0(\Omega ))\) of (tB) with \(j(\nabla v) - j(\nabla u(t))\) replaced by

$$\begin{aligned} \int \limits _\Omega \big (\sqrt{\varepsilon ^2 + |\nabla v|^2} - \sqrt{\varepsilon ^2 + |\nabla u(t)|^2}\big )\,dx. \end{aligned}$$

Since \(|\sigma _\varepsilon (t) - \mu \nabla u_\varepsilon (t)|\le 1\) for a.e. \(t\in (0,T)\) almost everywhere in \(\Omega \), there exists a weak limit \(\sigma \in L^2(0,T;L^2(\Omega ))\) of a subsequence. The claimed properties of \(\sigma \) then follow as in the proof of [17, Chapter II, Theorem 6.3].

The proof of the second part follows from [20, Proposition 55.5] with \(V=H^1_0(\Omega )\), \(H=L^2(\Omega )\), \(a(\bullet ,\bullet )=\mu \int _\Omega \nabla \bullet \cdot \nabla \bullet \,dx\), \(\varphi =j\) and \(b=f\). \(\square \)

For the discretisation in time, consider the partition \({\mathcal {K}}=(t_0,\dots ,t_N)\) of the time-interval (0, T) with subordinated evaluation points \(\tau _1,\dots ,\tau _N\) for the generalised midpoint rule with

$$\begin{aligned} 0=t_0<\tau _1\le t_1<\tau _2\le t_2<\dots \le t_{N-1}<\tau _N\le t_N=T. \end{aligned}$$

Let \(k_\ell :=t_\ell -t_{\ell -1}\) for all \(\ell =1,\dots ,N\) and \(0<\theta _\ell \le 1\) such that \(\tau _\ell =\theta _\ell t_\ell +(1-\theta _\ell )t_{\ell -1}\). For \(0\le s_0< s_1\le T\), let \(P_p(s_0,s_1;X)\) denote the space of polynomials from the interval \([s_0,s_1]\) with values in X of degree \(\le p\) and define the discrete spaces

$$\begin{aligned} P_p({\mathcal {K}};X)&:=\left\{ v_h\in L^2(0,T;X)\left| \begin{array}{l} \forall \ell =1,\dots ,N,\\ v_h\vert _{[t_{\ell -1},t_\ell ]}\in P_p(t_{\ell -1},t_\ell ;X) \end{array} \right. \right\} ,\\ S^1({\mathcal {K}};X)&:=P_1({\mathcal {K}};X)\cap C(0,T;X). \end{aligned}$$

Let \(U_0\in \textrm{CR}^1_0({\mathcal {T}})\) be an approximation to \(u_0\).

Definition 2.4

(Generalised midpoint rule) The generalised midpoint rule seeks \(u_h\in S^1({\mathcal {K}};\textrm{CR}^1_0({\mathcal {T}}))\) such that \(u_h(0)=U_0\) and, for all \(\ell =1,\dots ,N\) and for all \(v_h\in \textrm{CR}^1_0({\mathcal {T}})\), it holds

figure c

The generalised midpoint rule requires the evaluation of the right-hand side \(F(\tau _\ell ,\bullet )\) at \(\tau _\ell \) and, hence, it requires \(f\in C(0,T;L^2(\Omega ))\), while the subsequent discontinuous Galerkin discretisation merely involves an integrability \(f\in L^2(0,T;L^2(\Omega ))\). For a piecewise smooth (in time) function \(v_\textrm{DG}\), define the left and right limit and the jump at \(t_\ell \) by

$$\begin{aligned} (v_\textrm{DG})_{\ell }^+&:=\lim _{s\searrow t_\ell } v_\textrm{DG}(s); \quad (v_\textrm{DG})_{\ell }^-:=\lim _{s\nearrow t_\ell } v_\textrm{DG}(s); \quad [v_\textrm{DG}]_{\ell }&:= (v_\textrm{DG})_\ell ^+ - (v_\textrm{DG})_\ell ^-. \end{aligned}$$

Furthermore, define the time-space cylinders \(Q_\ell :=I_\ell \times \Omega \), where \(I_\ell :=(t_{\ell -1},t_\ell )\).

Definition 2.5

(dG(0) FEM) The dG(0) FEM seeks \(u_\textrm{DG}\in P_0({\mathcal {K}};\textrm{CR}^1_0({\mathcal {T}}))\) such that \(u_\textrm{DG}(0)=U_0\) and, for all \(\ell =1,\dots , N\) and for all \(v_\textrm{DG}\in P_0({\mathcal {K}};\textrm{CR}^1_0({\mathcal {T}}))\), it holds

figure d

Proposition 2.6

There exist unique solutions to problems (tB\(_h\)) and (dG(0)).

Proof

The existence of solutions to problems (tB\(_h\)) and (dG(0)) follows in every time step from the equivalence to the minimization problems

$$\begin{aligned} u_\textrm{DG}\vert _{I_\ell }&=\mathop {\text {argmin}}\limits _{v_\textrm{CR}\in \textrm{CR}^1_0({\mathcal {T}})} E_\textrm{DG}(v_\textrm{CR},I_\ell ),\\ u_h(\tau _\ell )&= \mathop {\text {argmin}}\limits _{v_\textrm{CR}\in \textrm{CR}^1_0({\mathcal {T}})} E_{\textrm{tB}}(v_\textrm{CR},I_\ell ,\theta _\ell ) \end{aligned}$$

with discrete energies

$$\begin{aligned} E_\textrm{DG}(v_\textrm{CR},I_\ell )&:= (\mu k_\ell /2)\Vert \nabla _{\textrm{NC}}v_\textrm{CR}\Vert _{L^2(\Omega )}^2 + (1/2) \Vert v_\textrm{CR}\Vert _{L^2(\Omega )}^2 + k_\ell j(\nabla _{\textrm{NC}}v_\textrm{CR})\\&\quad - \int \limits _\Omega (k_\ell {\overline{f}}\vert _{I_\ell } +u_\textrm{CR}\vert _{I_{\ell -1}}) v_\textrm{CR}\,dx,\\ E_\textrm{tB}(v_\textrm{CR},I_\ell ,\theta _\ell )&:= \frac{\mu k_\ell }{2}\Vert \nabla _{\textrm{NC}}v_\textrm{CR}\Vert _{L^2(\Omega )}^2 + \frac{1}{2\theta _\ell } \Vert v_\textrm{CR}\Vert _{L^2(\Omega )}^2 + k_\ell j(\nabla _{\textrm{NC}}v_\textrm{CR})\\&\quad - \int \limits _\Omega \left( k_\ell f(\tau _\ell ) +\frac{1}{\theta _\ell }u_\textrm{CR}(t_{\ell -1})\right) v_\textrm{CR}\,dx, \end{aligned}$$

where . The uniqueness follows from the strong convexity of the energies. \(\square \)

2.4 Main results

This section states the error estimates for the generalised midpoint rule and the discontinuous Galerkin scheme. The proofs follow in Sects. 34.

The first two theorems state error estimates for the generalised midpoint rule (tB\(_h\)). The first theorem proves an error estimate for the case that \(\theta _\ell >1/2\) for all \(\ell =1,\dots ,N\). Let \(e_\ell :=u(t_\ell )-u_h(t_\ell )\) and \(e(\tau _\ell ):=u(\tau _\ell )-u_h(\tau _\ell )\) and recall the definition of \(\sigma \) from Theorem 2.3.

Theorem 2.7

Assume that \(f\in C(0,T;L^2(\Omega ))\), \(u_0\in H^1(\Omega )\), \(u\in W^{2,1}(0,T;\) \(L^2(\Omega ))\cap C(0,T;H^1_0(\Omega ))\), and \(\sigma \in C(0,T;L^2(\Omega ;{\mathbb {R}}^2))\) and define \(\sigma _\ell :=\sigma (\tau _\ell )\). Suppose that there exists \(\alpha \le 1\) and \(\beta <1\) with \((1-\theta _{\ell -1}) k_{\ell -1}\le \alpha \theta _\ell k_\ell \) and \((1-\theta _\ell )\le \beta \theta _\ell \) (i.e., \(\theta _\ell >1/2\)), and let \(\delta :=(2-\alpha -\beta )/2>0\). Then it holds that

$$\begin{aligned}&\max _{\ell =0,\dots ,N} \Vert e_\ell \Vert _{L^2(\Omega )}^2 + \sum _{\ell =1}^N \Big (\theta _\ell - \frac{1}{2}\Big ) \Vert e_\ell -e_{\ell -1}\Vert _{L^2(\Omega )}^2 + \mu \sum _{\ell =1}^N k_\ell \Vert \nabla _{\textrm{NC}}e(\tau _\ell )\Vert _{L^2(\Omega )}^2\\&\; \lesssim \Vert e_0\Vert _{L^2(\Omega )}^2 + \Vert k\ddot{u}\Vert _{L^1(L^2)}^2 \\&\qquad + \sum _{\ell =1}^N \frac{k_\ell }{\mu } \Big (\Vert \sigma _\ell -\Pi _0\sigma _\ell \Vert _{L^2(\Omega )}^2 + \textrm{osc}^2(f(\tau _\ell ),{\mathcal {T}}) + \Vert h_{\mathcal {T}}{\dot{u}}(\tau _\ell )\Vert _{L^2(\Omega )}^2\Big )\\&\qquad + \frac{\Vert h_{\mathcal {T}}\Vert _{L^\infty (\Omega )}^2}{\delta \mu } \bigg (g\Vert \nabla _{\textrm{NC}}u_h(0)\Vert _{L^1(\Omega )} + (\mu /2) \Vert \nabla _{\textrm{NC}}u_h(0)\Vert _{L^2(\Omega )}^2 \\&\qquad + \frac{1}{\delta } \sum _{\ell =1}^N \theta _\ell k_\ell \Vert f(\tau _\ell )\Vert _{L^2(\Omega )}^2\bigg ). \end{aligned}$$

The following theorem asserts quadratic convergence in time for the uniform midpoint rule.

Theorem 2.8

(Error estimate for uniform midpoint rule) Assume that \(f\in C(0,T;L^2(\Omega ))\), \(u\in W^{3,1}(0,T;L^2(\Omega ))\cap C(0,T;H^1_0(\Omega ))\) and \(\sigma \in C(0,T;L^2(\Omega ;{\mathbb {R}}^2))\) and define \(\sigma _\ell :=\sigma (\tau _\ell )\). Suppose \(\theta =\theta _\ell =1/2\) and \(k_\ell =k=T/N\) and \(\Vert h_{\mathcal {T}}\Vert _{L^\infty (\Omega )}< k\sqrt{\mu }/(8\sqrt{T} C_{\textrm{NC}})\) with \(C_{\textrm{NC}}=0.4396\) from (2.1). Then it holds

$$\begin{aligned}&\max _{\ell =0,\dots ,N} \Vert e_\ell \Vert _{L^2(\Omega )}^2 + \mu \sum _{\ell =1}^N k \Vert \nabla _{\textrm{NC}}e(\tau _\ell )\Vert _{L^2(\Omega )}^2\\&\qquad \lesssim \Vert e_0\Vert _{L^2(\Omega )}^2 + k^4 (\Vert \dddot{u}\Vert _{L^1(L^2)}^2 + \Vert \ddot{u}\Vert _{L^\infty (L^2)}^2) + \Vert h_{\mathcal {T}}{\dot{u}}\Vert ^2_{L^2(L^2)} \\&\qquad \qquad + \sum _{\ell =1}^N \frac{k}{\mu } \Big (\Vert \sigma _\ell -\Pi _0\sigma _\ell \Vert _{L^2(\Omega )}^2 + \textrm{osc}^2(f(\tau _\ell ),{\mathcal {T}}) + \Vert h_{\mathcal {T}}{\dot{u}}(\tau _\ell )\Vert _{L^2(\Omega )}^2\Big ). \end{aligned}$$

The following theorem states an error estimate for the discontinuous Galerkin scheme. Let \(e:=u-u_\textrm{DG}\in L^2(0,T;L^2(\Omega ))\) and let \(e(t_\ell ^-):=\lim _{s\nearrow t_\ell } e(s)\) denote the right limit of e in \(t_\ell \) and \(u(t_{\ell -1}^+)\) and \(u(t_\ell ^-)\) denote the left and right limits of u in \(t_{\ell -1}\) and \(t_\ell \). Recall the non-conforming interpolation operator \(I_{\textrm{NC}}\) from Sect. 2.2.

Theorem 2.9

(Error estimate for dG(0)) Assume that \(u\in C(0,T;L^2(\Omega ))\) and let \({\overline{f}}\in L^2(0,T;L^2(\Omega ))\) and \({\overline{u}}\in L^2(0,T;L^2(\Omega ))\) be defined by

It holds for any \(\ell =1,\dots ,N\)

$$\begin{aligned}{} & {} \mu \Vert \nabla _{\textrm{NC}}e \Vert _{L^2(Q_\ell )}^2 + \frac{1}{4}\Vert [e]_{\ell -1}\Vert _{L^2(\Omega )}^2 + \frac{1}{2}\Vert e(t_\ell ^-)\Vert _{L^2(\Omega )}^2 -\frac{1}{2}\Vert e(t_{\ell -1}^-)\Vert _{L^2(\Omega )}^2 \nonumber \\{} & {} \quad \lesssim \Vert \sigma -\Pi _0\sigma \Vert _{L^2(Q_\ell )}^2 + \Vert h_{\mathcal {T}}(f-\Pi _0 f)\Vert _{L^2(Q_\ell )}^2\nonumber \\{} & {} \quad \quad + \Vert h_{\mathcal {T}}(u(t_{\ell -1}^+)-u(t_\ell ^-))/k_\ell \Vert _{L^2(Q_\ell )}^2 +\Vert f-{\overline{f}}\Vert _{L^2(Q_\ell )} \Vert u-{\overline{u}}\Vert _{L^2(Q_\ell )}\nonumber \\{} & {} \quad \quad + \Vert h_{\mathcal {T}}f\Vert _{L^2(Q_\ell )} \Vert \nabla _{\textrm{NC}}(u-I_{\textrm{NC}}u)\Vert _{L^2(Q_\ell )} + \Vert (u-{\overline{u}})_{\ell -1}^+\Vert _{L^2(\Omega )}^2\nonumber \\{} & {} \quad \quad + k_\ell ^{-1}\Vert h_{\mathcal {T}}\nabla _{\textrm{NC}}(u-I_{\textrm{NC}}u)\Vert _{L^2(Q_\ell )}^2. \end{aligned}$$
(2.8)

Remark 2.10

A piecewise Poincaré-Friedrichs inequality [4] proves, that \(\left\| \nabla _{\textrm{NC}}\bullet \right\| _{L^2(Q_\ell )}\) defines in fact a norm on \(L^2([t_{\ell -1},t_\ell ];H^1_0(\Omega )+\textrm{CR}^1_0({\mathcal {T}}))\).

The following lemma proves a trace inequality with respect to a time interval. This is employed in Corollary 2.12 below, which states a convergence rate for dG(0).

Lemma 2.11

(Trace inequality) Let X be a normed linear space. Any \(w\in H^{1}(a,b;X)\) satisfies

$$\begin{aligned} \Vert w(a) \Vert _X \le |b-a|^{-1/2} \Vert w\Vert _{L^2(a,b;X)} + (2/3) |b-a|^{1/2} \Vert {\dot{w}}\Vert _{L^2(a,b;X)}. \end{aligned}$$
(2.9)

Proof

The fundamental theorem of calculus implies for a.e. \(s\in [a,b]\)

$$\begin{aligned} w(a) = w(s) - \int \limits _{a}^s {\dot{w}}(t) \,dt. \end{aligned}$$

An integration over \(s\in [a,b]\), a triangle inequality and two Hölder inequalities lead to

$$\begin{aligned} |b-a\vert \Vert w(a) \Vert _X&\le \int \limits _{a}^{b} \Vert w(s)\Vert _X\,ds + \int \limits _{a}^{b} \int \limits _{a}^s \Vert {\dot{w}}(t)\Vert _X \,dt\,ds\\&\le |b-a\vert ^{1/2} \Vert w\Vert _{L^2(a,b;X)} + \Vert {\dot{w}}\Vert _{L^2(a,b;X)} \int \limits _{a}^{b} \sqrt{s-a}\,ds \end{aligned}$$

This eventually proves

$$\begin{aligned} |b-a\vert \Vert w(a) \Vert _X \le |b-a\vert ^{1/2} \Vert w\Vert _{L^2(a,b;X)} + (2/3) |b-a\vert ^{3/2} \Vert {\dot{w}}\Vert _{L^2(a,b;X)}. \end{aligned}$$

\(\square \)

Corollary 2.12

(Convergence of dG(0)) Let the conditions of Theorem 2.3 be fulfilled, i.e., let \(f\in H^1(0,T;L^2(\Omega ))\) and \(u_0\in H^1_0(\Omega )\) satisfy (2.7). Let \(h:=\Vert h_{\mathcal {T}}\Vert _{L^\infty (\Omega )}\) and \(k:=\max \{k_\ell \vert \ell =1,\dots ,N\}\) and \(\Vert u_0-U_0\Vert _{L^2(\Omega )}\rightarrow 0\) for \(h\rightarrow 0\). If \(h^2/\min \{k_\ell \vert \ell =1,\dots ,N\}\le C\) for some constant \(C<\infty \), then

$$\begin{aligned} \left\| \nabla _{\textrm{NC}}(u-u_\textrm{DG})\right\| _{L^2(0,T;L^2(\Omega ))} \rightarrow 0 \qquad \text {as }(h,k)\rightarrow 0. \end{aligned}$$

If furthermore \(\Vert u_0-U_0\Vert _{L^2(\Omega )}=O(h)\), \(\sigma \in L^2(0,T;H^1(\Omega ))\), and \(u\in L^2(0,T;H^2(\Omega ))\), then \(\left\| \nabla _{\textrm{NC}}(u-u_\textrm{DG})\right\| _{L^2(0,T;L^2(\Omega ))}=O(h+k^{1/2})\) as \((h,k)\rightarrow 0\).

Proof

A summation over \(\ell =1,\dots ,N\) in (2.8) together with two applications of Young’s inequality shows for \(Q=(0,T)\times \Omega \)

$$\begin{aligned}{} & {} \mu \left\| \nabla _{\textrm{NC}}e\right\| _{L^2(Q)}^2 + \sum _{\ell =1}^N \Vert [e]_{\ell -1}\Vert _{L^2(\Omega )}^2 + \Vert e(t_N^-)\Vert _{L^2(\Omega )}^2 \nonumber \\{} & {} \quad \lesssim \Vert u_0-U_0\Vert _{L^2(\Omega )}^2 + \Vert h_{\mathcal {T}}(f-\Pi _0 f)\Vert _{L^2(Q)}^2 +\Vert f-{\overline{f}}\Vert _{L^2(Q)}^2 + \Vert h_{\mathcal {T}}f\Vert _{L^2(Q)}^2 \nonumber \\{} & {} \quad \quad +\Vert \nabla _{\textrm{NC}}(u-I_{\textrm{NC}}u)\Vert _{L^2(Q)}^2 + \sum _{\ell =1}^N k_\ell ^{-1}\Vert h_{\mathcal {T}}\nabla _{\textrm{NC}}(u-I_{\textrm{NC}}u)\Vert _{L^2(Q_\ell )}^2 \nonumber \\{} & {} \quad \quad + \Vert \sigma -\Pi _0\sigma \Vert _{L^2(Q)}^2 +\Vert u-{\overline{u}}\Vert _{L^2(Q)}^2\nonumber \\{} & {} \quad \quad + \Vert h_{\mathcal {T}}(u(t_{\ell -1}^+)-u(t_\ell ^-))/k_\ell \Vert _{L^2(Q)}^2 + \sum _{\ell =1}^N \Vert (u-{\overline{u}})_{\ell -1}^+\Vert _{L^2(\Omega )}^2. \end{aligned}$$
(2.10)

By assumption \(\Vert u_0-U_0\Vert _{L^2(\Omega )}\rightarrow 0\) and, since \(f\in H^1(0,T;L^2(\Omega ))\),

$$\begin{aligned} \Vert h_{\mathcal {T}}(f-\Pi _0 f)\Vert _{L^2(Q)} +\Vert f-{\overline{f}}\Vert _{L^2(Q)} + \Vert h_{\mathcal {T}}f\Vert _{L^2(Q)} = O(h+k). \end{aligned}$$

Theorem 2.3 guarantees \(\sigma \in L^2(0,T;L^2(\Omega ;{\mathbb {R}}^2))\) and \(u\in H^1(0,T;L^2(\Omega ))\), and the assumption \(h^2/\min \{k_\ell \vert \ell =1,\dots ,N\}\le C\) proves convergence to zero of

$$\begin{aligned} \Vert u-{\overline{u}}\Vert _{L^2(Q)}^2 +\Vert \nabla _{\textrm{NC}}(u-I_{\textrm{NC}}u)\Vert _{L^2(Q)}^2 + \sum _{\ell =1}^N k_\ell ^{-1}\Vert h_{\mathcal {T}}\nabla _{\textrm{NC}}(u-I_{\textrm{NC}}u)\Vert _{L^2(Q_\ell )}^2. \end{aligned}$$

Moreover, \(\Vert \sigma -\Pi _0\sigma \Vert _{L^2(Q)}\rightarrow 0\) and the trace inequality of Lemma 2.11 and a Poincaré inequality imply

$$\begin{aligned} \Vert (u-{\overline{u}})_{\ell -1}^+\Vert _{L^2(\Omega )}^2 \le k_\ell \left( \pi +\frac{2}{3}\right) \Vert {\dot{u}}\Vert _{L^2(Q_\ell )}^2. \end{aligned}$$

The \(L^2\) projection of \({\dot{u}}\) onto piecewise constants (in time),

satisfies

$$\begin{aligned} \Vert h_{\mathcal {T}}(u(t_{\ell -1}^+)-u(t_\ell ^-))/k_\ell \Vert _{L^2(Q_\ell )} \le \Vert h_{\mathcal {T}}{\dot{u}}\Vert _{L^2(Q_\ell )}. \end{aligned}$$

This leads to plain convergence of dG(0).

Under the additional assumptions \(\sigma \in L^2(0,T;H^1(\Omega ))\), \(u\in L^2(0,T;H^2(\Omega ))\), and \(\Vert u_0-U_0\Vert _{L^2(\Omega )}=O(h)\), the convergence rate \(O(h+k^{1/2})\) follows from the above estimates. \(\square \)

Remark 2.13

(comparison of midpoint rule and dG(0)) The convergence \(h+k^{1/2}\) for dG(0) (by Corollary 2.12) is suboptimal in comparison to the generalised midpoint rule (for a smooth solution u). But the convergence of dG(0) is guaranteed for more general and realistic regularity of the solution \(u\in H^1(0,T;L^2(\Omega ))\cap L^2(0,T;H^1(\Omega ))\) and \(\sigma \in L^2(0,T;L^2(\Omega ;{\mathbb {R}}^2))\).

3 Proofs of theorems 2.72.8

This section starts with a stability result for the generalised midpoint rule for the two cases \(\theta _\ell >1/2\) as well as \(\theta _\ell =1/2\) for all \(\ell =1,\dots ,n\). Sect. 3.2 proves the error estimates from Theorems 2.7 and 2.8.

3.1 Stability

The stability of the generalised midpoint rule will be employed in the error estimate of Theorem 2.7. The stability of the uniform midpoint rule with \(\theta =\theta _\ell =1/2\), will not enter the proof of the error estimate of Theorem 2.8 below, and is merely given for completeness.

Theorem 3.1

(Stability) Assume that \(f\in C(0,T;L^2(\Omega ))\) and let \(n\in \{1,\dots ,N\}\). If \(\theta =\theta _\ell =1/2\) and \(k=k_\ell \) for all \(\ell =1,\dots ,n\), then

$$\begin{aligned}&\frac{k}{4} \Vert {\dot{u}}_h(\tau _n)\Vert _{L^2(\Omega )}^2 + g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(\tau _n)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(\tau _n)\Vert _{L^2(\Omega )}^2\\&\quad + \frac{1}{8}\sum _{\ell =1}^n k \Vert {\dot{u}}_h(\tau _\ell )+{\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2 + \frac{\mu }{2} \sum _{\ell =1}^n \Vert \nabla _{\textrm{NC}}(u_h(\tau _\ell )-u_h(\tau _{\ell -1}))\Vert _{L^2(\Omega )}^2\\&\le g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(0)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(0)\Vert _{L^2(\Omega )}^2 + \frac{1}{2}\sum _{\ell =1}^n k \Vert f(\tau _\ell )\Vert _{L^2(\Omega )}^2. \end{aligned}$$

If \(\theta _\ell >1/2\), assume that there exists \(\alpha \le 1\) and \(\beta <1\) with \((1-\theta _{\ell -1}) k_{\ell -1}\le \alpha \theta _\ell k_\ell \) and \((1-\theta _\ell )\le \beta \theta _\ell \) (i.e., \(\theta _\ell >1/2\)) for all \(\ell =1,\dots ,n\). Then \(\delta :=(2-\alpha -\beta )/2>0\) satisfies

$$\begin{aligned}{} & {} \frac{\delta }{2}\sum _{\ell =1}^n \theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 + g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(\tau _n)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(\tau _n)\Vert _{L^2(\Omega )}^2\\{} & {} \quad + \left( 1+\frac{\delta }{1+\beta }\right) \frac{\beta \theta _n k_n}{2} \Vert {\dot{u}}_h(\tau _n)\Vert _{L^2(\Omega )}^2 \\{} & {} \quad + \frac{1}{2}\sum _{\ell =1}^n (1-\theta _{\ell -1}) k_{\ell -1} \Vert {\dot{u}}_h(\tau _\ell )+{\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2\\{} & {} \quad + \frac{\mu }{2} \sum _{\ell =1}^n \Vert \nabla _{\textrm{NC}}(u_h(\tau _\ell )-u_h(\tau _{\ell -1}))\Vert _{L^2(\Omega )}^2\\{} & {} \le g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(0)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(0)\Vert _{L^2(\Omega )}^2 \\{} & {} \quad + \frac{(1+\alpha )(1+\beta )}{2\delta } \sum _{\ell =1}^n \theta _\ell k_\ell \Vert f(\tau _\ell )\Vert _{L^2(\Omega )}^2. \end{aligned}$$

Proof

Let \(\ell \in \{1,\dots ,n\}\) and choose \(v_h=u_h(\tau _{\ell -1})\) in (tB\(_h\)) (with \(\tau _0:=t_0=0\)) to conclude that

$$\begin{aligned} \begin{aligned}&\int \limits _\Omega {\dot{u}}_h(\tau _\ell ) (u_h(\tau _{\ell })-u_h(\tau _{\ell -1}))\,dx \le F(\tau _\ell ,u_h(\tau _\ell )-u_h(\tau _{\ell -1})) \\&\qquad \qquad \qquad \qquad + \mu \int \limits _\Omega \nabla _{\textrm{NC}}u_h(\tau _\ell )\cdot \nabla _{\textrm{NC}}(u_h(\tau _{\ell -1})-u_h(\tau _\ell ))\,dx\\&\qquad \qquad \qquad \qquad + g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(\tau _{\ell -1})|\,dx - g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(\tau _\ell )|\, dx. \end{aligned} \end{aligned}$$
(3.1)

Since \({\dot{u}}_h\) is piecewise constant with respect to the time-intervals, it follows

$$\begin{aligned} {\dot{u}}_h(\tau _\ell ) =\frac{u_h(\tau _\ell )-u_h(t_{\ell -1})}{\theta _\ell k_\ell } \quad \text {and}\quad {\dot{u}}_h(\tau _{\ell -1}) =\frac{u_h(t_{\ell -1})-u_h(\tau _{\ell -1})}{(1-\theta _{\ell -1})k_{\ell -1}}. \end{aligned}$$
(3.2)

Therefore, the left-hand side of (3.1) reads with \(\theta _0:=k_0:=0\) and \({\dot{u}}_h(0):=0\) as

$$\begin{aligned}&\int \limits _\Omega {\dot{u}}_h(\tau _\ell ) (u_h(\tau _{\ell })-u_h(\tau _{\ell -1}))\,dx \\&\qquad = \theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 +(1-\theta _{\ell -1})k_{\ell -1} \int \limits _\Omega {\dot{u}}_h(\tau _\ell ) {\dot{u}}_h(\tau _{\ell -1})\,dx \\&\qquad = \theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 + (1/2) (1-\theta _{\ell -1}) k_{\ell -1} \Big (\Vert {\dot{u}}_h(\tau _\ell )+{\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2\\&\qquad \qquad - \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 - \Vert {\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2\Big ). \end{aligned}$$

The identity \(u_h(\tau _\ell )=(u_h(\tau _\ell )-u_h(\tau _{\ell -1}))/2 + (u_h(\tau _\ell )+u_h(\tau _{\ell -1}))/2\) reveals

$$\begin{aligned}&\sum _{\ell =1}^n \int \limits _\Omega \nabla _{\textrm{NC}}u_h(\tau _\ell ) \cdot \nabla _{\textrm{NC}}(u_h(\tau _{\ell -1})-u_h(\tau _\ell ))\,dx\\&\qquad \qquad = \frac{1}{2} \Vert \nabla _{\textrm{NC}}u_h(0)\Vert _{L^2(\Omega )}^2 - \frac{1}{2}\Vert \nabla _{\textrm{NC}}u_h(\tau _n)\Vert _{L^2(\Omega )}^2\\&\qquad \qquad \quad -\frac{1}{2}\sum _{\ell =1}^n \Vert \nabla _{\textrm{NC}}(u_h(\tau _\ell )-u_h(\tau _{\ell -1}))\Vert _{L^2(\Omega )}^2. \end{aligned}$$

The combination of the previous inequalities with the sum of (3.1) over \(\ell =1,\dots ,n\) reads

$$\begin{aligned} \begin{aligned}&\sum _{\ell =1}^n \theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 + g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(\tau _n)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(\tau _n)\Vert _{L^2(\Omega )}^2\\&\quad + \frac{1}{2}\sum _{\ell =1}^n (1-\theta _{\ell -1}) k_{\ell -1} \Vert {\dot{u}}_h(\tau _\ell )+{\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2\\&\quad + \frac{\mu }{2} \sum _{\ell =1}^n \Vert \nabla _{\textrm{NC}}(u_h(\tau _\ell )-u_h(\tau _{\ell -1}))\Vert _{L^2(\Omega )}^2\\&\le \frac{1}{2}\sum _{\ell =1}^n (1-\theta _{\ell -1}) k_{\ell -1} \big (\Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \Vert {\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2\big )\\&\quad + g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(0)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(0)\Vert _{L^2(\Omega )}^2 \\&\quad +\sum _{\ell =1}^n F(\tau _\ell ,u_h(\tau _\ell )-u_h(\tau _{\ell -1})). \end{aligned} \end{aligned}$$
(3.3)

Case 1 (Uniform midpoint rule): \(\theta =\theta _\ell =1/2\) and \(k=k_\ell \).

The Cauchy inequality and the Young inequality with weight 1/2 for each time step lead to

$$\begin{aligned}&\sum _{\ell =1}^n F(\tau _\ell ,u_h(\tau _\ell )-u_h(\tau _{\ell -1}))\\&\qquad = \frac{k}{2}\sum _{\ell =1}^n F(\tau _\ell ,{\dot{u}}_h(\tau _\ell )+{\dot{u}}_h(\tau _{\ell -1}))\\&\qquad \le \frac{k}{2}\sum _{\ell =1}^n \Vert f(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \frac{k}{8} \sum _{\ell =1}^n \Vert {\dot{u}}_h(\tau _\ell ) + {\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2. \end{aligned}$$

The combination with (3.3) results in

$$\begin{aligned}&\frac{k}{4} \Vert {\dot{u}}_h(\tau _n)\Vert _{L^2(\Omega )}^2 + g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(\tau _n)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(\tau _n)\Vert _{L^2(\Omega )}^2\\&\quad + \frac{k}{8}\sum _{\ell =1}^n \Vert {\dot{u}}_h(\tau _\ell )+{\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2\\&\quad + \frac{\mu }{2} \sum _{\ell =1}^n \Vert \nabla _{\textrm{NC}}(u_h(\tau _\ell )-u_h(\tau _{\ell -1}))\Vert _{L^2(\Omega )}^2\\&\le g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(0)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(0)\Vert _{L^2(\Omega )}^2 + \frac{k}{2}\sum _{\ell =1}^n \Vert f(\tau _\ell )\Vert _{L^2(\Omega )}^2. \end{aligned}$$

Case 2: There exists \(\alpha \le 1\) with \(\theta _\ell > 1/2\) and \((1-\theta _{\ell -1}) k_{\ell -1}\le \alpha \theta _\ell k_\ell \).

Let \(\beta <1\) such that \((1-\theta _\ell )\le \beta \theta _\ell \) for all \(\ell =1,\dots ,n\) and set \(\delta :=(2-\alpha -\beta )/2>0\). This implies

$$\begin{aligned}&\frac{1}{2}\sum _{\ell =1}^n (1-\theta _{\ell -1}) k_{\ell -1} \big (\Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \Vert {\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2\big )\\&\qquad \le \frac{1}{2} \sum _{\ell =1}^n \alpha \theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \frac{1}{2}\sum _{\ell =1}^{n-1} \beta \theta _{\ell } k_{\ell } \Vert {\dot{u}}_h(\tau _{\ell })\Vert _{L^2(\Omega )}^2\\&\qquad = \frac{\alpha +\beta }{2} \sum _{\ell =1}^n\theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 - \frac{\beta \theta _n k_n}{2} \Vert {\dot{u}}_h(\tau _n)\Vert _{L^2(\Omega )}^2. \end{aligned}$$

The combination with (3.3) leads to

$$\begin{aligned} \begin{aligned}&\delta \sum _{\ell =1}^n \theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 + g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(\tau _n)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(\tau _n)\Vert _{L^2(\Omega )}^2\\&\quad + \frac{\beta \theta _n k_n}{2} \Vert {\dot{u}}_h(\tau _n)\Vert _{L^2(\Omega )}^2 + \frac{1}{2}\sum _{\ell =1}^n (1-\theta _{\ell -1}) k_{\ell -1} \Vert {\dot{u}}_h(\tau _\ell )+{\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2\\&\quad + \frac{\mu }{2} \sum _{\ell =1}^n \Vert \nabla _{\textrm{NC}}(u_h(\tau _\ell )-u_h(\tau _{\ell -1}))\Vert _{L^2(\Omega )}^2\\&\le g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(0)|\,dx + (\mu /2) \Vert \nabla _{\textrm{NC}}u_h(0)\Vert _{L^2(\Omega )}^2 \\&\quad +\sum _{\ell =1}^n F(\tau _\ell ,u_h(\tau _\ell )-u_h(\tau _{\ell -1})). \end{aligned} \end{aligned}$$
(3.4)

The identities of (3.2) plus Cauchy and Young inequalities with weight \(\gamma >0\) imply

$$\begin{aligned}&\sum _{\ell =1}^n F(\tau _\ell ,u_h(\tau _\ell )-u_h(\tau _{\ell -1}))\\&\qquad = \sum _{\ell =1}^n F(\tau _\ell , \theta _\ell k_\ell {\dot{u}}_h(\tau _\ell )+(1-\theta _{\ell -1}) k_{\ell -1} {\dot{u}}_h(\tau _{\ell -1}))\\&\qquad \le \frac{1}{2\gamma }\sum _{\ell =1}^n (\theta _\ell k_\ell + (1-\theta _{\ell -1})k_{\ell -1})) \Vert f(\tau _\ell )\Vert _{L^2(\Omega )}^2\\&\qquad \qquad + \frac{\gamma }{2} \sum _{\ell =1}^n \theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \frac{\gamma }{2}\sum _{\ell =1}^n (1-\theta _{\ell -1}) k_{\ell -1} \Vert {\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2. \end{aligned}$$

Since \((1-\theta _\ell )\le \beta \theta _\ell \) for all \(\ell =1,\dots ,n\) and \({\dot{u}}_h(0)=0\) by definition, it follows

$$\begin{aligned} \frac{\gamma }{2}\sum _{\ell =1}^n (1-\theta _{\ell -1}) k_{\ell -1} \Vert {\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2 \le \frac{\beta \gamma }{2}\sum _{\ell =1}^{n-1} \theta _{\ell } k_{\ell } \Vert {\dot{u}}_h(\tau _{\ell })\Vert _{L^2(\Omega )}^2. \end{aligned}$$

Therefore, \((1-\theta _{\ell -1})k_{\ell -1}\le \alpha \theta _\ell k_\ell \) implies

$$\begin{aligned} \sum _{\ell =1}^n F(\tau _\ell ,u_h(\tau _\ell )-u_h(\tau _{\ell -1}))&\le \frac{1+\alpha }{2\gamma }\sum _{\ell =1}^n \theta _\ell k_\ell \Vert f(\tau _\ell )\Vert _{L^2(\Omega )}^2\\&\quad + \frac{\gamma (1+\beta )}{2} \sum _{\ell =1}^n \theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2\\&\quad - \frac{\beta \gamma }{2} \theta _n k_n\Vert {\dot{u}}_h(\tau _n)\Vert _{L^2(\Omega )}^2. \end{aligned}$$

For \(\gamma =\delta /(1+\beta )\), this and the combination with (3.4) lead to

$$\begin{aligned} \begin{aligned}&\frac{\delta }{2}\sum _{\ell =1}^n \theta _\ell k_\ell \Vert {\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 + g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(\tau _n)|\,dx + \frac{\mu }{2} \Vert \nabla _{\textrm{NC}}u_h(\tau _n)\Vert _{L^2(\Omega )}^2\\&\quad + (1+\gamma )\frac{\beta \theta _n k_n}{2} \Vert {\dot{u}}_h(\tau _n)\Vert _{L^2(\Omega )}^2 + \frac{1}{2}\sum _{\ell =1}^n (1-\theta _{\ell -1}) k_{\ell -1} \Vert {\dot{u}}_h(\tau _\ell )+{\dot{u}}_h(\tau _{\ell -1})\Vert _{L^2(\Omega )}^2\\&\quad + \frac{\mu }{2} \sum _{\ell =1}^n \Vert \nabla _{\textrm{NC}}(u_h(\tau _\ell )-u_h(\tau _{\ell -1}))\Vert _{L^2(\Omega )}^2\\&\le g\int \limits _\Omega |\nabla _{\textrm{NC}}u_h(0)|\,dx + (\mu /2) \Vert \nabla _{\textrm{NC}}u_h(0)\Vert _{L^2(\Omega )}^2 \\&\quad + \frac{(1+\alpha )(1+\beta )}{2\delta } \sum _{\ell =1}^n \theta _\ell k_\ell \Vert f(\tau _\ell )\Vert _{L^2(\Omega )}^2. \end{aligned} \end{aligned}$$

This concludes the proof. \(\square \)

3.2 Error estimates

The following theorem controls the error in space for a fixed time \(\tau _\ell \). The proof follows the arguments of [11] and considers \(\sigma _\ell :=\sigma (\tau _\ell )\) with \(\sigma \) from (2.6). Recall the constants \(C_\textrm{comp}\) and \(C_{\textrm{NC}}\) from Sect. 2.2.

Theorem 3.2

(Error estimate in space) Assume that \(f\in C(0,T;L^2(\Omega ))\), \(\sigma \in C(0,T;L^2(\Omega ;{\mathbb {R}}^2))\) and \({\dot{u}}\in C(0,T;L^2(\Omega ))\). The error \(e(\tau _\ell ):=u(\tau _\ell )-u_h(\tau _\ell )\) satisfies

$$\begin{aligned}&\frac{\mu ^2}{4}\Vert \nabla _{\textrm{NC}}e(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \mu \int \limits _\Omega {\dot{e}}(\tau _\ell ) e(\tau _\ell )\,dx\\&\quad \le C_\textrm{comp}^2\Vert \sigma _\ell -\Pi _0\sigma _\ell \Vert _{L^2(\Omega )}^2 + 2 (C_\textrm{comp}^2+C_{\textrm{NC}}^2)\, \textrm{osc}^2(f(\tau _\ell ),{\mathcal {T}})\\&\quad \quad + 2C_\textrm{comp}^2\Vert h_{\mathcal {T}}{\dot{u}}(\tau _\ell )\Vert _{L^2(\Omega )}^2 + 2C_{\textrm{NC}}^2\Vert h_{\mathcal {T}}{\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2. \end{aligned}$$

Proof

Theorem 3.2 of [11] proves the existence of \(\sigma _{h,\ell }\in P_0({\mathcal {T}};{\mathbb {R}}^2)\cap \partial W(\nabla _{\textrm{NC}}u_h(\tau _\ell ))\) with W as in Theorem 2.3 such that for all \(v_h\in \textrm{CR}^1_0({\mathcal {T}})\)

$$\begin{aligned} \int \limits _\Omega \sigma _{h,\ell } \cdot \nabla _{\textrm{NC}}v_h\,dx = \int \limits _\Omega f(\tau _\ell ) \,v_h\,dx - \int \limits _\Omega {\dot{u}}_h(\tau _\ell )\, v_h\,dx. \end{aligned}$$
(3.5)

The definition of the subgradients \(\sigma _{h,\ell }\in \partial W(\nabla _{\textrm{NC}}u_h(\tau _\ell ))\) and \(\sigma _\ell \in \partial W(\nabla u(\tau _\ell ))\) results in

$$\begin{aligned} \int \limits _\Omega (\sigma _\ell&- \mu \nabla u(\tau _\ell )) \cdot \nabla _{\textrm{NC}}(u_h(\tau _\ell )- u(\tau _\ell ))\,dx\\&\quad + \int \limits _\Omega (\sigma _{h,\ell } - \mu \nabla _{\textrm{NC}}u_h(\tau _\ell )) \cdot \nabla _{\textrm{NC}}(I_{\textrm{NC}}u(\tau _\ell ) - u_h(\tau _\ell ))\,dx\\&\le g\int \limits _\Omega \big (|\nabla _{\textrm{NC}}I_{\textrm{NC}}u(\tau _\ell ))|- |\nabla u(\tau _\ell )|\big )\,dx. \end{aligned}$$

Since \(\nabla _{\textrm{NC}}I_{\textrm{NC}}u(\tau _\ell )=\Pi _0\nabla u(\tau _\ell )\), Jensen’s inequality proves

$$\begin{aligned} \int \limits _\Omega \big (|\nabla _{\textrm{NC}}I_{\textrm{NC}}u(\tau _\ell ))|- |\nabla u(\tau _\ell )|\big ) \,dx\le 0. \end{aligned}$$

This, \(F(\tau _\ell ,\bullet )-{\dot{u}}(\tau _\ell ) + {\text {div}}\sigma _\ell = 0\) in \(H^{-1}(\Omega )\), and its discrete analogue (3.5) lead to

$$\begin{aligned} \mu \Vert \nabla _{\textrm{NC}}e(\tau _\ell )\Vert _{L^2(\Omega )}^2&\le \int \limits _\Omega \sigma _\ell \cdot \nabla _{\textrm{NC}}(J_3 u_h(\tau _\ell )-u_h(\tau _\ell ))\,dx\\&\quad + F(\tau _\ell ,u(\tau _\ell )-I_{\textrm{NC}}u(\tau _\ell ) + u_h(\tau _\ell )-J_3 u_h(\tau _\ell ))\\&\quad + \int \limits _\Omega {\dot{u}}(\tau _\ell ) (J_3 u_h(\tau _\ell )-u(\tau _\ell ))\,dx\\&\quad + \int \limits _\Omega {\dot{u}}_h(\tau _\ell ) (I_{\textrm{NC}}u(\tau _\ell )-u_h(\tau _\ell ))\,dx\\&\quad +\mu \int \limits _\Omega \nabla _{\textrm{NC}}u_h(\tau _\ell ) \cdot \nabla _{\textrm{NC}}(I_{\textrm{NC}}u(\tau _\ell )- u(\tau _\ell ))\,dx. \end{aligned}$$

The integral mean property of \(I_{\textrm{NC}}\) implies that the last term vanishes. The conservation property (2.4) and the stability property (2.3) prove

$$\begin{aligned} \int \limits _\Omega \sigma _\ell&\cdot \nabla _{\textrm{NC}}(J_3 u_h(\tau _\ell )-u_h(\tau _\ell ))\,dx\\&= \int \limits _\Omega (\sigma _\ell -\Pi _0\sigma _\ell ) \cdot \nabla _{\textrm{NC}}(J_3 u_h(\tau _\ell )- u_h(\tau _\ell ))\,dx\\&\le C_\textrm{comp}\Vert \sigma _\ell - \Pi _0\sigma _\ell \Vert _{L^2(\Omega )} \; \Vert \nabla _{\textrm{NC}}(u(\tau _\ell ) - u_h(\tau _\ell ))\Vert _{L^2(\Omega )}. \end{aligned}$$

The conservation property (2.5) and the approximation property (2.3) imply

$$\begin{aligned}&F(\tau _\ell ,u_h(\tau _\ell )-J_3 u_h(\tau _\ell )) \\ {}&= \int \limits _\Omega (f(\tau _\ell )-\Pi _0 f(\tau _\ell )) (u_h(\tau _\ell )-J_3 u_h(\tau _\ell ))\,dx\\&\le C_\textrm{comp}\Vert h_{\mathcal {T}}(f(\tau _\ell )-\Pi _0 f(\tau _\ell ))\Vert _{L^2(\Omega )} \; \Vert \nabla _{\textrm{NC}}(u(\tau _\ell )-u_h(\tau _\ell ))\Vert _{L^2(\Omega )}. \end{aligned}$$

The integral mean property and the approximation property of \(I_{\textrm{NC}}\) yield

$$\begin{aligned}&F(\tau _\ell ,u(\tau _\ell )-I_{\textrm{NC}}u(\tau _\ell ))\\&= \int \limits _\Omega (f(\tau _\ell )-\Pi _0 f(\tau _\ell )) (u(\tau _\ell ) - I_{\textrm{NC}}u(\tau _\ell ))\,dx\\&\le C_{\textrm{NC}}\Vert h_{\mathcal {T}}(f(\tau _\ell )-\Pi _0 f(\tau _\ell ))\Vert _{L^2(\Omega )} \; \Vert \nabla _{\textrm{NC}}(u(\tau _\ell )-u_h(\tau _\ell ))\Vert _{L^2(\Omega )}. \end{aligned}$$

A calculation reveals

$$\begin{aligned} \int \limits _\Omega {\dot{u}}(\tau _\ell )&(J_3 u_h(\tau _\ell )-u(\tau _\ell ))\,dx + \int \limits _\Omega {\dot{u}}_h(\tau _\ell ) (I_{\textrm{NC}}u(\tau _\ell )-u_h(\tau _\ell ))\,dx\\&= \int \limits _\Omega {\dot{u}}(\tau _\ell ) (J_3 u_h(\tau _\ell ) - u_h(\tau _\ell ))\,dx + \int \limits _\Omega {\dot{u}}_h(\tau _\ell ) (I_{\textrm{NC}}u(\tau _\ell ) - u(\tau _\ell ))\,dx\\&\quad - \int \limits _\Omega {\dot{e}}(\tau _\ell ) e(\tau _\ell )\,dx. \end{aligned}$$

The combination of the previously displayed formulas with two applications of Young’s inequality with weight \(2/\mu \) lead to

$$\begin{aligned}&\frac{\mu }{2} \Vert \nabla _{\textrm{NC}}e(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \int \limits _\Omega {\dot{e}}(\tau _\ell ) e(\tau _\ell )\,dx\\&\qquad \le \frac{C_\textrm{comp}^2}{\mu } \Vert \sigma _\ell -\Pi _0\sigma _\ell \Vert _{L^2(\Omega )}^2 + \frac{2\, (C_\textrm{comp}^2+C_{\textrm{NC}}^2)}{\mu }\,\textrm{osc}^2(f(\tau _\ell ),{\mathcal {T}})\\&\qquad \qquad + \int \limits _\Omega {\dot{u}}(\tau _\ell ) (J_3 u_h(\tau _\ell ) - u_h(\tau _\ell ))\,dx + \int \limits _\Omega {\dot{u}}_h(\tau _\ell ) (I_{\textrm{NC}}u(\tau _\ell )-u(\tau _\ell ))\,dx. \end{aligned}$$

The approximation properties of \(J_3\) and \(I_{\textrm{NC}}\), two applications of the Cauchy inequality, and two applications of a weighted Young’s inequality with weight \(4/\mu \) lead to

$$\begin{aligned}&\int \limits _\Omega {\dot{u}}(\tau _\ell ) (J_3 u_h(\tau _\ell ) - u_h(\tau _\ell ))\,dx + \int \limits _\Omega {\dot{u}}_h(\tau _\ell ) (I_{\textrm{NC}}u(\tau _\ell )-u(\tau _\ell ))\,dx\\&\quad \qquad \le \frac{\mu }{4} \Vert \nabla _{\textrm{NC}}e(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \frac{2\,C_\textrm{comp}^2}{\mu } \Vert h_{\mathcal {T}}{\dot{u}}(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \frac{2\,C_{\textrm{NC}}^2}{\mu } \Vert h_{\mathcal {T}}{\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2. \end{aligned}$$

The combination of the two previously displayed inequalities concludes the proof. \(\square \)

The following theorem proves a first error estimate for the discretisation (tB\(_h\)).

Theorem 3.3

(A priori error estimate for the full discretisation) Assume that \(f\in C(0,T;L^2(\Omega ))\), \(u\in W^{2,1}(0,T;L^2(\Omega ))\cap C(0,T;H^1_0(\Omega ))\) and \(\sigma \in C(0,T;L^2(\Omega ;{\mathbb {R}}^2))\). If \(\theta _\ell > 1/2\), then it holds

$$\begin{aligned} \max _{\ell =0,\dots ,N}&\Vert e_\ell \Vert _{L^2(\Omega )}^2 + \sum _{\ell =1}^N \Big (\theta _\ell - \frac{1}{2}\Big ) \Vert e_\ell -e_{\ell -1}\Vert _{L^2(\Omega )}^2 + \mu \sum _{\ell =1}^N k_\ell \Vert \nabla _{\textrm{NC}}e(\tau _\ell )\Vert _{L^2(\Omega )}^2\\&\lesssim \Vert e_0\Vert _{L^2(\Omega )}^2 + \Vert k\ddot{u}\Vert _{L^1(L^2)}^2 + \sum _{\ell =1}^N \frac{k_\ell }{\mu } \Big (\Vert \sigma _\ell -\Pi _0\sigma _\ell \Vert _{L^2(\Omega )}^2 + \textrm{osc}^2(f(\tau _\ell ),{\mathcal {T}})\\&\quad + \Vert h_{\mathcal {T}}{\dot{u}}(\tau _\ell )\Vert _{L^2(\Omega )}^2 + \Vert h_{\mathcal {T}}{\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2\Big ). \end{aligned}$$

If \(k=k_\ell =T/N\), \(\theta =\theta _\ell =1/2\), and, in addition to the above regularity assumptions \(u\in W^{3,1}(0,T;L^2(\Omega ))\), then

$$\begin{aligned}{} & {} \frac{1}{2}\max _{\ell =1,\dots ,N} \Vert e_\ell \Vert _{L^2(\Omega )}^2 + \frac{\mu k}{2}\sum _{\ell =1}^N \Vert \nabla _{\textrm{NC}}e(\tau _\ell )\Vert _{L^2(\Omega )}^2\nonumber \\{} & {} \quad \le \Vert e_0\Vert _{L^2(\Omega )}^2 + \frac{k^4}{2} \Big (\Vert \dddot{u}\Vert _{L^1(L^2)}^2 + \frac{7}{8}\Vert \ddot{u}\Vert _{L^\infty (L^2)}^2\Big )\nonumber \\{} & {} \quad \quad + \frac{2 k}{\mu }\sum _{\ell =1}^N \Big (C_\textrm{comp}^2\Vert \sigma _\ell -\Pi _0\sigma _\ell \Vert _{L^2(\Omega )}^2 + 2 (C_\textrm{comp}^2+C_{\textrm{NC}}^2)\, \textrm{osc}^2(f(\tau _\ell ),{\mathcal {T}})\nonumber \\{} & {} \quad \quad + 2C_\textrm{comp}^2\Vert h_{\mathcal {T}}{\dot{u}}(\tau _\ell )\Vert _{L^2(\Omega )}^2 + 2C_{\textrm{NC}}^2\Vert h_{\mathcal {T}}{\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2\Big ). \end{aligned}$$
(3.6)

Proof

The error analysis with respect to the time discretisation follows as in [1, Proposition 5.1] for elastoplasticity with \(|||\bullet |||\) replaced by the \(L^2\) norm \(\Vert \bullet \Vert _{L^2(\Omega )}\) and the scalar product \(a(\bullet ,\bullet )\) replaced by the \(L^2\) scalar product. Those arguments lead to

$$\begin{aligned}&\max _{\ell =0,\dots ,N} \Vert e_\ell \Vert _{L^2(\Omega )}^2 + \sum _{\ell =1}^N \Big (\theta _\ell - \frac{1}{2}\Big ) \Vert e_\ell -e_{\ell -1}\Vert _{L^2(\Omega )}^2\\&\qquad \qquad \qquad \lesssim \Vert e_0\Vert _{L^2(\Omega )}^2 + \Vert k\ddot{u}\Vert _{L^1(L^2)}^2 + \sum _{\ell =1}^N k_\ell \int \limits _\Omega {\dot{e}}(\tau _\ell ) e(\tau _\ell )\,dx \end{aligned}$$

for the generalised midpoint rule with \(\theta >1/2\) and

$$\begin{aligned} \frac{1}{2}\max _{\ell =1,\dots ,N} \Vert e_\ell \Vert _{L^2(\Omega )}^2&\le \Vert e_0\Vert _{L^2(\Omega )}^2 + \frac{k^4}{2} \Vert \dddot{u}\Vert _{L^1(L^2)}^2 + \frac{7k^4}{16}\Vert \ddot{u}\Vert _{L^\infty (L^2)}^2\\&\quad + 2 k\sum _{\ell =1}^N \int \limits _\Omega {\dot{e}}(\tau _\ell ) e(\tau _\ell )\,dx \end{aligned}$$

for the uniform midpoint rule. The combination with the error estimate in space of \(\int _\Omega {\dot{e}}(\tau _\ell ) e(\tau _\ell )\,dx\) from Theorem 3.2 then proves the assertion. \(\square \)

Proof of Theorem 2.7

The combination of Theorem 3.3 with the estimation of \(\Vert h_{\mathcal {T}}{\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2\) from Theorem 3.1 leads to Theorem 2.7. \(\square \)

Proof of Theorem 2.8

For \(\theta =1/2\), a Young inequality implies

$$\begin{aligned} \sum _{\ell =1}^N k\Vert h_{\mathcal {T}}{\dot{u}}_h(\tau _\ell )\Vert _{L^2(\Omega )}^2 \le 2\sum _{\ell =1}^N k\Vert h_{\mathcal {T}}\dot{\widetilde{e}}(\tau _\ell )\Vert _{L^2(\Omega )}^2 + 2\sum _{\ell =1}^N k\Vert h_{\mathcal {T}}\dot{\widetilde{u}}(\tau _\ell )\Vert _{L^2(\Omega )}^2 \end{aligned}$$
(3.7)

with \(\widetilde{e}:=\widetilde{u} -u_h\) for the nodal interpolation \(\widetilde{u}=I_h u\in S^1({\mathcal {K}};H^1_0(\Omega ))\) of u in time with \(I_h u(t_\ell )=u(t_\ell )\) for all \(\ell =0,\dots ,N\). Recall \(Q_\ell =[t_{\ell -1},t_\ell ]\times \Omega \). Since , the second term on the right-hand side in (3.7) is bounded by

$$\begin{aligned} \sum _{\ell =1}^N k\Vert h_{\mathcal {T}}\dot{\widetilde{u}}(\tau _\ell )\Vert _{L^2(\Omega )}^2 \le \sum _{\ell =1}^N \Vert h_{\mathcal {T}}{\dot{u}}\Vert _{L^2(Q_\ell )}^2 =\Vert h_{\mathcal {T}}{\dot{u}}\Vert _{L^2(L^2)}^2. \end{aligned}$$

Let \(H:=\Vert h_{\mathcal {T}}\Vert _{L^\infty (\Omega )}\) abbreviate the maximal (spatial) mesh-size. Since \(\widetilde{u}(t_\ell )=u(t_\ell )\) by the definition of \(\widetilde{u}\), it holds \(\widetilde{e}(t_\ell )=e_\ell \). For the first term on the right-hand side of (3.7), the representation

$$\begin{aligned} \dot{\widetilde{e}}(\tau _\ell )=(\widetilde{e}(t_\ell )-\widetilde{e}(t_{\ell -1}))/k = (e_\ell -e_{\ell -1})/k \end{aligned}$$

and a triangle inequality lead to

$$\begin{aligned} \begin{aligned} \sum _{\ell =1}^N k \Vert h_{\mathcal {T}}\dot{\widetilde{e}}(\tau _\ell )\Vert _{L^2(\Omega )}^2&\le 2 \sum _{\ell =1}^N \frac{H^2}{k} \big (\Vert e_{\ell }\Vert _{L^2(\Omega )}^2 + \Vert e_{\ell -1}\Vert _{L^2(\Omega )}^2\big )\\&\le 4\max _{\ell =0,\dots ,N} \Vert e_\ell \Vert _{L^2(\Omega )}^2 \,N H^2/k. \end{aligned} \end{aligned}$$
(3.8)

If \(H< k\sqrt{\mu }/(8\sqrt{T}C_{\textrm{NC}})\), then \(N=T/k\) leads to

$$\begin{aligned} \frac{8 N H^2}{k} <\frac{N k \mu }{8 T C_{\textrm{NC}}^2} = \frac{\mu }{8 C_{\textrm{NC}}^2}. \end{aligned}$$

Eventually, the right-hand side of (3.8) can be absorbed on the right-hand side of (3.6). The combination of this with Theorem 3.3 concludes the proof of Theorem 2.8. \(\square \)

4 Proof of theorem 2.9

This section proves the error estimate for the discontinuous Galerkin discretisation (dG(0)) from Theorem 2.9.

4.1 Error estimate in space

Since \(\sigma (t)\in \partial W(\nabla u(t))\) for a.e. \(t\in [0,T]\), the sum rule of the subderivative implies \(\sigma (t)-\mu \nabla u(t)\in \partial j(\nabla u(t))\). The definition of the subderivative implies for all \(q\in L^2(\Omega ;{\mathbb {R}}^2)\)

$$\begin{aligned} \int \limits _\Omega (\sigma (t)-\mu \nabla u(t))\,(q-\nabla u(t))\,dx \le j(q)-j(\nabla u(t)). \end{aligned}$$
(4.1)

Since \(\Pi _0 \nabla J_3 u_\textrm{DG}(t) = \nabla _{\textrm{NC}}u_\textrm{DG}(t)\), for \(q=\nabla _{\textrm{NC}}u_\textrm{DG}(t)\) and \(e:=u-u_\textrm{DG}\) the left-hand side reads

$$\begin{aligned}&\int \limits _\Omega (\sigma (t)-\mu \nabla u(t))\,(q-\nabla u(t))\,dx\\&\qquad = \int \limits _\Omega (\sigma (t)-\Pi _0\sigma (t))(\nabla _{\textrm{NC}}(u_\textrm{DG}(t)-J_3 u_\textrm{DG}(t))\,dx\\&\qquad \qquad + \int \limits _\Omega \sigma (t)\cdot \nabla (J_3 u_\textrm{DG}(t)-u(t))\,dx +\mu \int \limits _\Omega \nabla u(t)\cdot \nabla _{\textrm{NC}}e(t)\,dx. \end{aligned}$$

The application of \(-{\text {div}}\sigma (t) = (f(t)-{\dot{u}}(t))\in H^{-1}(\Omega )\) leads for the second term to

$$\begin{aligned} \begin{aligned}&\int \limits _\Omega \sigma (t)\cdot \nabla (J_3 u_\textrm{DG}(t)-u(t))\,dx = \int \limits _\Omega f(t) (J_3 u_\textrm{DG}(t) -u_\textrm{DG}(t))\,dx\\&\qquad \quad +\int \limits _\Omega f(t) (u_\textrm{DG}(t)-u(t))\,dx - \langle {\dot{u}}, J_3 u_\textrm{DG}(t)-u(t)\rangle _{H^{-1}(\Omega )\times H^1_0(\Omega )}. \end{aligned} \end{aligned}$$
(4.2)

The integral mean property of \(J_3\) from (2.5) and the combination of (4.1)–(4.2) with an integration over \(t\in I_j\) results in

$$\begin{aligned} \begin{aligned}&\mu \int \limits _{Q_j} \nabla u\cdot \nabla _{\textrm{NC}}(u-u_\textrm{DG})\,dQ \le \int \limits _{Q_j}(\sigma -\Pi _0\sigma )\cdot \nabla _{\textrm{NC}}(J_3 u_\textrm{DG}- u_\textrm{DG})\,dQ\\&\qquad +\int \limits _{Q_j} (f-\Pi _0 f)(u_\textrm{DG}- J_3 u_\textrm{DG})\,dQ +\int \limits _{Q_j} f (u - u_\textrm{DG})\,dQ\\&\qquad +\int \limits _{I_j} \langle {\dot{u}},J_3 u_\textrm{DG}-u\rangle _{H^{-1}(\Omega )\times H^1_0(\Omega )}\,dt + \int \limits _{I_j} j(\nabla _{\textrm{NC}}u_\textrm{DG}) - j(\nabla u)\,dt. \end{aligned} \end{aligned}$$
(4.3)

4.2 Combination of variational inequalities

The sum of (dG(0)) with the new notation \(v_\textrm{DG}=u_\textrm{DG}^\star \) (to highlight that this will be chosen below to approximate the exact solution) and (4.3) provides

$$\begin{aligned} \begin{aligned}&\mu \int \limits _{Q_j} \nabla _{\textrm{NC}}u_\textrm{DG}\cdot \nabla _{\textrm{NC}}(u_\textrm{DG}-u_\textrm{DG}^\star )\,dQ + \mu \int \limits _{Q_j} \nabla u\cdot \nabla _{\textrm{NC}}(u-u_\textrm{DG})\,dQ\\&\le \int \limits _{Q_j} (\sigma -\Pi _0\sigma ) \nabla _{\textrm{NC}}(J_3 u_\textrm{DG}- u_\textrm{DG})\,dQ\\&\quad +\int \limits _{Q_j} (f-\Pi _0 f) (u_\textrm{DG}- J_3 u_\textrm{DG})\,dQ +\int \limits _{Q_j} f \,(u-u_\textrm{DG}^\star )\,dQ\\&\quad +\int \limits _{I_j} \langle {\dot{u}}, J_3 u_\textrm{DG}-u\rangle _{H^{-1}(\Omega )\times H^1_0(\Omega )}\,dt\\&\quad +\int \limits _\Omega [u_\textrm{DG}]_{\ell -1}(u_\textrm{DG}^\star -u_\textrm{DG})^+_{\ell -1}\,dx + \int \limits _{I_j} j(\nabla _{\textrm{NC}}u_\textrm{DG}^\star )-j(\nabla u)\,dt. \end{aligned}\nonumber \\ \end{aligned}$$
(4.4)

Since \((d/dt) J_3 u_\textrm{DG}=0\) in \(I_\ell \), the fourth term on the right-hand side of (4.4) reads

$$\begin{aligned}&\int \limits _{I_\ell } \langle {\dot{u}}, J_3 u_\textrm{DG}-u\rangle _{H^{-1}(\Omega )\times H^1_0(\Omega )}\,dt\\&\qquad \qquad \quad = \int \limits _{I_\ell } \left\langle \frac{d}{dt}(u - J_3 u_\textrm{DG}), J_3 u_\textrm{DG}-u\right\rangle _{H^{-1}(\Omega )\times H^1_0(\Omega )}\,dt\\&\qquad \qquad \quad =-\frac{1}{2}\int \limits _{I_\ell } \frac{d}{dt} \Vert (u - J_3 u_\textrm{DG})(t)\Vert _{L^2(\Omega )}^2\,dt\\&\qquad \qquad \quad =\Vert (u - J_3 u_\textrm{DG})(t_{\ell -1}^+)\Vert _{L^2(\Omega )}^2/2 - \Vert (u - J_3 u_\textrm{DG})(t_{\ell }^-)\Vert _{L^2(\Omega )}^2/2. \end{aligned}$$

Since \(u_\textrm{DG}\) is constant on \(I_\ell \), the binomial formulas show

$$\begin{aligned}{} & {} \int \limits _{I_\ell } \langle {\dot{u}}, J_3 u_\textrm{DG}-u\rangle _{H^{-1}(\Omega )\times H^1_0(\Omega )}\,dt = \Vert e(t_{\ell -1}^+)\Vert _{L^2(\Omega )}^2/2 - \Vert e(t_\ell ^-)\Vert _{L^2(\Omega )}^2/2\nonumber \\{} & {} \quad + \int \limits _\Omega (u (t_{\ell -1}^+) - u (t_\ell ^-)) (u_\textrm{DG}- J_3 u_\textrm{DG})(t_\ell ^-)\,dx. \end{aligned}$$
(4.5)

The approximation properties (2.3) of \(J_3\) and a Cauchy inequality imply

$$\begin{aligned}{} & {} \int \limits _\Omega (u (t_{\ell -1}^+) - u (t_\ell ^-)) (u_\textrm{DG}- J_3 u_\textrm{DG})(t_\ell ^-)\,dx\nonumber \\{} & {} \qquad \qquad \lesssim \Vert h_{\mathcal {T}}(u (t_{\ell -1}^+) - u (t_\ell ^-))/k_\ell \Vert _{L^2(Q_\ell )} \Vert \nabla _{\textrm{NC}}(u_\textrm{DG}- J_3 u_\textrm{DG})\Vert _{L^2(Q_\ell )}\nonumber \\{} & {} \qquad \qquad \lesssim \Vert h_{\mathcal {T}}(u (t_{\ell -1}^+) - u (t_\ell ^-))/k_\ell \Vert _{L^2(Q_\ell )} \Vert \nabla _{\textrm{NC}}e\Vert _{L^2(Q_\ell )}. \end{aligned}$$
(4.6)

Since by assumption \(u\in C(0,T;L^2(\Omega ))\) is continuous, and \(e=u-u_\textrm{DG}\), the fifth term on the right-hand side of (4.4) reads

$$\begin{aligned} \begin{aligned}&\int \limits _\Omega [u_\textrm{DG}]_{\ell -1}(u_\textrm{DG}^\star -u_\textrm{DG})^+_{\ell -1}\,dx\\&\; = \int \limits _\Omega [u_\textrm{DG}]_{\ell -1}(u_\textrm{DG}^\star -u)^+_{\ell -1}\,dx - \int \limits _\Omega [e]_{\ell -1}(e)^+_{\ell -1}\,dx. \end{aligned} \end{aligned}$$
(4.7)

The definition of the average \(\langle v \rangle _{\ell -1}=(v(t_{\ell -1}^+) + v(t_{\ell -1}^-))/2\) implies \((e)_{\ell -1}^+ = \langle e \rangle _{\ell -1} + [e]_{\ell -1}/2\). Consequently,

$$\begin{aligned} -\int \limits _\Omega [e]_{\ell -1}(e)^+_{\ell -1}\,dx = -\int \limits _\Omega [e]_{\ell -1} \langle e \rangle _{\ell -1}\,dx - \int \limits _\Omega [e]_{\ell -1}^2/2\,dx . \end{aligned}$$

The product rule \([ab]_{\ell -1}=[a]_{\ell -1}\langle b\rangle _{\ell -1}+\langle a\rangle _{\ell -1}[b]_{\ell -1}\) for any piecewise continuous ab implies

$$\begin{aligned} 2\int \limits _\Omega [e]_{\ell -1} \langle e \rangle _{\ell -1} \,dx = \int \limits _\Omega [e^2]_{\ell -1}\,dx = \Vert e(t_{\ell -1}^+)\Vert _{L^2(\Omega )}^2 - \Vert e(t_{\ell -1}^-)\Vert _{L^2(\Omega )}^2. \end{aligned}$$

Therefore, (4.7) results in

$$\begin{aligned} \begin{aligned}&\int \limits _\Omega [u_\textrm{DG}]_{\ell -1}(u_\textrm{DG}^\star -u_\textrm{DG})^+_{\ell -1}\,dx = \int \limits _\Omega [u_\textrm{DG}]_{\ell -1}(u_\textrm{DG}^\star -u)^+_{\ell -1}\,dx\\&\qquad \qquad - \Vert [e]_{\ell -1}\Vert _{L^2(\Omega )}^2/2 +\Vert e(t_{\ell -1}^-)\Vert _{L^2(\Omega )}^2/2 -\Vert e(t_{\ell -1}^+)\Vert _{L^2(\Omega )}^2/2. \end{aligned} \end{aligned}$$
(4.8)

The approximation and stability properties (2.3) of \(J_{3}\) prove

$$\begin{aligned} \begin{aligned} \int \limits _{Q_\ell } (\sigma -\Pi _0\sigma ) \nabla _{\textrm{NC}}(J_3 u_\textrm{DG}- u_\textrm{DG})\,dQ&\lesssim \Vert \sigma -\Pi _0\sigma \Vert _{L^2(Q_\ell )} \Vert \nabla _{\textrm{NC}}e\Vert _{L^2(Q_\ell )}, \\ \int \limits _{Q_\ell } (f-\Pi _0 f) (u_\textrm{DG}- J_3 u_\textrm{DG})\,dQ&\lesssim \Vert h_{\mathcal {T}}(f-\Pi _0 f)\Vert _{L^2(Q_\ell )} \Vert \nabla _{\textrm{NC}}e\Vert _{L^2(Q_\ell )}. \end{aligned}\nonumber \\ \end{aligned}$$
(4.9)

Since \(\Vert \nabla _{\textrm{NC}}e\Vert _{L^2(Q_\ell )}\) can be absorbed, the combination of (4.4) with (4.5)–(4.6), (4.8)–(4.9) and

$$\begin{aligned}&\int \limits _{Q_\ell } \nabla _{\textrm{NC}}u_\textrm{DG}\cdot \nabla _{\textrm{NC}}(u_\textrm{DG}-u_\textrm{DG}^\star )\,dQ + \int \limits _{Q_\ell } \nabla u\cdot \nabla _{\textrm{NC}}(u-u_\textrm{DG})\,dQ\\&\qquad \qquad \qquad \qquad = \Vert \nabla _{\textrm{NC}}e\Vert _{L^2(Q_\ell )}^2 +\int \limits _{Q_\ell } \nabla _{\textrm{NC}}u_\textrm{DG}\cdot \nabla _{\textrm{NC}}(u-u_\textrm{DG}^\star )\,dQ \end{aligned}$$

leads to

$$\begin{aligned} \mu \Vert \nabla _{\textrm{NC}}e \Vert _{L^2(Q_\ell )}^2&+ \Vert [e]_{\ell -1}\Vert _{L^2(\Omega )}^2/2 + \Vert e(t_\ell ^-)\Vert _{L^2(\Omega )}^2/2 -\Vert e(t_{\ell -1}^-)\Vert _{L^2(\Omega )}^2/2 \nonumber \\&\lesssim \frac{1}{\mu }\bigg ( \Vert \sigma -\Pi _0\sigma \Vert _{L^2(Q_\ell )}^2 + \Vert h_{\mathcal {T}}(f-\Pi _0 f)\Vert _{L^2(Q_\ell )}^2 \nonumber \\&\quad + \Vert h_{\mathcal {T}}(u(t_{\ell -1}^+)-u(t_\ell ^-))/k_\ell \Vert _{L^2(Q_\ell )}^2\bigg ) \nonumber \\&\quad +\int \limits _{Q_\ell } f \,(u-u_\textrm{DG}^\star )\,dQ \end{aligned}$$
(4.10)
$$\begin{aligned}&\quad +\int \limits _\Omega [u_\textrm{DG}]_{\ell -1}(u_\textrm{DG}^\star -u)^+_{\ell -1}\,dx \end{aligned}$$
(4.11)
$$\begin{aligned}&\quad + \int \limits _{I_\ell } j(\nabla _{\textrm{NC}}u_\textrm{DG}^\star )-j(\nabla u)\,dt \end{aligned}$$
(4.12)
$$\begin{aligned}&\quad +\mu \int \limits _{Q_\ell } \nabla _{\textrm{NC}}u_\textrm{DG}\cdot \nabla _{\textrm{NC}}(u_\textrm{DG}^\star -u)\,dQ. \end{aligned}$$
(4.13)

4.3 Choosing \(u_\textrm{DG}^\star \)

Define \({\overline{u}}\in L^2(\Omega )\) by for a.e. \(x\in \Omega \). Note that \({\overline{u}}\in H^1_0(\Omega )\) and (from Fubini’s theorem). Define \(u_\textrm{DG}^\star \vert _{I_\ell }:=I_{\textrm{NC}}{\overline{u}}\). The integral mean property \(\nabla _{\textrm{NC}}I_{\textrm{NC}}\bullet =\Pi _0\nabla \bullet \) implies . This and \(\nabla _{\textrm{NC}}u_\textrm{DG}\in P_0(Q_\ell ;{\mathbb {R}}^2)\) lead to

$$\begin{aligned} \int \limits _{Q_\ell } \nabla u_\textrm{DG}\cdot \nabla (u_\textrm{DG}^\star -u)\,dQ = 0. \end{aligned}$$

Jensen’s inequality yields

$$\begin{aligned} \int \limits _{I_\ell } j(\nabla _{\textrm{NC}}u_\textrm{DG}^\star )-j(\nabla u)\,dt \le 0. \end{aligned}$$

It remains to estimate (4.10) and (4.11). Define \({\overline{f}}\in L^2(\Omega )\) by for a.e. \(x\in \Omega \). Then

$$\begin{aligned} \int \limits _{Q_\ell } f (u-u_\textrm{DG}^\star )\,dQ = \int \limits _{Q_\ell } \big (f-{\overline{f}}\big )\big (u-{\overline{u}}\big )\,dQ + \int \limits _{Q_\ell } f \big ({\overline{u}}-u_\textrm{DG}^\star \big )\,dQ. \end{aligned}$$

A Cauchy inequality implies for the first term

$$\begin{aligned} \int \limits _{Q_\ell } \big (f-{\overline{f}}\big )\big (u-{\overline{u}}\big )\,dQ&\le \Vert f-{\overline{f}}\Vert _{L^2(Q_\ell )} \Vert u-{\overline{u}}\Vert _{L^2(Q_\ell )}. \end{aligned}$$

The definition of \( u_\textrm{DG}^\star = I_{\textrm{NC}}{\overline{u}}\) and the approximation property of \(I_{\textrm{NC}}\) imply

$$\begin{aligned} \int \limits _{Q_\ell } f \big ({\overline{u}}-u_\textrm{DG}^\star \big )\,dQ&\lesssim \int \limits _{I_\ell } \Vert h_{\mathcal {T}}f\Vert _{L^2(\Omega )} \Vert \nabla _{\textrm{NC}}({\overline{u}}-I_{\textrm{NC}}{\overline{u}})\Vert _{L^2(\Omega )}\,dt\\&\le \Vert h_{\mathcal {T}}f\Vert _{L^2(Q_\ell )} \Vert \nabla _{\textrm{NC}}({\overline{u}}-I_{\textrm{NC}}{\overline{u}})\Vert _{L^2(Q_\ell )}. \end{aligned}$$

Since and , it follows for a.e. \(x\in \Omega \). Hence, Jensen’s inequality proves

$$\begin{aligned} \Vert \nabla _{\textrm{NC}}({\overline{u}}-I_{\textrm{NC}}{\overline{u}})\Vert _{L^2(Q_\ell )} \le \Vert \nabla _{\textrm{NC}}(u-I_{\textrm{NC}}u)\Vert _{L^2(Q_\ell )} . \end{aligned}$$

The combination of the previous inequalities leads to

$$\begin{aligned} \int \limits _{Q_\ell } f (u-u_\textrm{DG}^\star )\,dQ&\lesssim \Vert f-{\overline{f}}\Vert _{L^2(Q_\ell )} \Vert u-{\overline{u}}\Vert _{L^2(Q_\ell )}\\&\quad + \Vert h_{\mathcal {T}}f\Vert _{L^2(Q_\ell )} \Vert \nabla _{\textrm{NC}}(u - I_{\textrm{NC}}u)\Vert _{L^2(Q_\ell )}. \end{aligned}$$

Since u is continuous,

$$\begin{aligned} \int \limits _\Omega [u_\textrm{DG}]_{\ell -1}&(u_\textrm{DG}^\star -u)^+_{\ell -1}\,dx\\&\le \Vert [e]_{\ell -1}\Vert _{L^2(\Omega )} \big (\Vert (u-{\overline{u}})_{\ell -1}^+\Vert _{L^2(\Omega )} + \Vert ({\overline{u}} - u_\textrm{DG}^\star )_{\ell -1}^+\Vert _{L^2(\Omega )}\big ) . \end{aligned}$$

Since \(u_\textrm{DG}^\star = I_{\textrm{NC}}{\overline{u}}\) and \({\overline{u}}\vert _{I_\ell }\in H^1(\Omega )\), (2.1) implies

(4.14)

Altogether,

$$\begin{aligned}&\mu \Vert \nabla _{\textrm{NC}}e \Vert _{L^2(Q_\ell )}^2 + \Vert [e]_{\ell -1}\Vert _{L^2(\Omega )}^2/4 + \Vert e(t_\ell ^-)\Vert _{L^2(\Omega )}^2/2 -\Vert e(t_{\ell -1}^-)\Vert _{L^2(\Omega )}^2/2\\&\qquad \lesssim \frac{1}{\mu }\bigg (\Vert \sigma -\Pi _0\sigma \Vert _{L^2(Q_\ell )}^2 + \Vert h_{\mathcal {T}}(f-\Pi _0 f)\Vert _{L^2(Q_\ell )}^2 \\&\qquad \qquad + \Vert h_{\mathcal {T}}(u(t_{\ell -1}^+)-u(t_\ell ^-))/k_\ell \Vert _{L^2(Q_\ell )}^2\bigg ) +\Vert f-{\overline{f}}\Vert _{L^2(Q_\ell )} \Vert u-{\overline{u}}\Vert _{L^2(Q_\ell )}\\&\qquad \qquad + \Vert h_{\mathcal {T}}f\Vert _{L^2(Q_\ell )} \Vert \nabla _{\textrm{NC}}(u-I_{\textrm{NC}}u)\Vert _{L^2(Q_\ell )} + \Vert (u-{\overline{u}})_{\ell -1}^+\Vert _{L^2(\Omega )}^2\\&\qquad \qquad + k^{-1}\Vert h_{\mathcal {T}}\nabla _{\textrm{NC}}(u-I_{\textrm{NC}}u)\Vert _{L^2(Q_\ell )}^2. \end{aligned}$$

This concludes the proof of Theorem 2.9.

5 Generalization to 3D

In this section, let \(\Omega \subseteq {\mathbb {R}}^3\) be a three-dimensional bounded polyhedral Lipschitz domain. Define the vector-valued spaces

$$\begin{aligned} H^1_0(\Omega ;{\mathbb {R}}^3)&:=\{w\in L^2(\Omega ;{\mathbb {R}}^3)\mid \forall m=1,2,3,\;w_m\in H^1_0(\Omega )\},\\ Z&:=\{w\in H^1_0(\Omega ;{\mathbb {R}}^3)\mid {\text {div}}w=0\} \end{aligned}$$

and let \(H^{-1}(\Omega ;{\mathbb {R}}^3)\) denote the dual space of \(H^1_0(\Omega ;{\mathbb {R}}^3)\). Furthermore, the linearised Green strain tensor \(\varepsilon (u)=(\nabla u+\nabla u^\top )/2\) denotes the symmetric part of the gradient of a velocity function \(u\in H^1_0(\Omega ;{\mathbb {R}}^3)\) and  :  denotes the scalar product between two matrices \(A,B\in {\mathbb {R}}^{3\times 3}\), i.e., \(A:B:=\sum _{\ell =1}^3 \sum _{m=1}^3 A_{\ell m}B_{\ell m}\).

Given \(f\in L^2(0,T;L^2(\Omega ;{\mathbb {R}}^3))\) and initial data \(u_0\in H({\text {div}},\Omega )\) with \({\text {div}}u_0=0\), the time-dependent Bingham flow problem in 3D seeks \(u\in L^2(0,T;Z)\) with \({\dot{u}}\in L^2(0,T;H^{-1}(\Omega ;{\mathbb {R}}^3))\) such that \(u(0)=u_0\) and, at a.e. \(t\in [0,T]\), any \(v\in Z\) satisfies

$$\begin{aligned} \int \limits _\Omega f(t)\cdot (v-u(t))\,dx\le & {} \langle {\dot{u}}(t),v-u(t)\rangle _{H^{-1}(\Omega ;{\mathbb {R}}^3)\times H^1_0(\Omega ;{\mathbb {R}}^3)}\nonumber \\{} & {} +\mu \int \limits _\Omega \varepsilon (u(t)):\varepsilon (v-u(t))\,dx\nonumber \\{} & {} + g\Vert \varepsilon (v)\Vert _{L^1(\Omega )} - g\Vert \varepsilon (u(t))\Vert _{L^1(\Omega )}. \end{aligned}$$
(5.1)

A direct discretisation of the variational inequality with \(P_1\) non-conforming finite elements is not possible, since \(\int _\Omega \varepsilon _{\textrm{NC}}(\bullet ):\varepsilon _{\textrm{NC}}(\bullet )\,dx\) is not positive definite on the \(P_1\) non-conforming finite element space: Korn’s inequality fails for \(\textrm{CR}^1_0({\mathcal {T}};{\mathbb {R}}^3)\) in general. However, for homogeneous Dirichlet boundary conditions, a straightforward calculation reveals, for all \(v,w\in H^1_0(\Omega ;{\mathbb {R}}^3)\), that

$$\begin{aligned} 2 \int \limits _\Omega \varepsilon (v):\varepsilon (w)\,dx = \int \limits _\Omega (\nabla v:\nabla w + {\text {div}}v{\text {div}}w)\,dx. \end{aligned}$$

Since \(u(t)\in Z\), the following problem is equivalent to (5.1): Seek \(u\in L^2(0,T;Z)\) with time derivative \({\dot{u}}\in L^2(0,T;H^{-1}(\Omega ;{\mathbb {R}}^3))\) such that \(u(0)=u_0\) and, at a.e. \(t\in [0,T]\), any \(v\in Z\) satisfies

$$\begin{aligned} \int \limits _\Omega f(t)\cdot (v-u(t))\,dx\le & {} \langle {\dot{u}}(t),v-u(t)\rangle _{H^{-1}(\Omega )\times H^1_0(\Omega )}\nonumber \\{} & {} +\frac{\mu }{2}\int \limits _\Omega \nabla u(t):\nabla (v-u(t))\,dx\nonumber \\{} & {} + g\Vert \varepsilon (v)\Vert _{L^1(\Omega )} - g\Vert \varepsilon (u(t))\Vert _{L^1(\Omega )}. \end{aligned}$$
(5.2)

The \(P_1\) non-conforming finite element space in this three-dimensional context consists of piecewise affine functions, which are continuous in the midpoints of interior faces and vanish in midpoints of boundary faces. Let \({\text {div}}_{\textrm{NC}}\) and \(\varepsilon _{\textrm{NC}}\) denote the piecewise versions of \({\text {div}}\) and \(\varepsilon \) and define the vector-valued spaces

$$\begin{aligned} \textrm{CR}^1_0({\mathcal {T}};{\mathbb {R}}^3)&:=\left\{ v_\textrm{CR}=(v_{\textrm{CR},1},v_{\textrm{CR},2},v_{\textrm{CR},3}):\Omega \rightarrow {\mathbb {R}}^3 \left| \begin{array}{l} \forall m=1,2,3,\\ v_{\textrm{CR},m}\in \textrm{CR}^1_0({\mathcal {T}}) \end{array} \right\} \right. , \\ Z_\textrm{CR}&:=\{v_\textrm{CR}\in \textrm{CR}^1_0({\mathcal {T}};{\mathbb {R}}^3)\mid {\text {div}}_{\textrm{NC}}v_\textrm{CR}=0\}. \end{aligned}$$

Let \(U_0\in \textrm{CR}^1_0({\mathcal {T}};{\mathbb {R}}^3)\) be some approximation of \(u_0\). With the notation for the partition of the time interval \({\mathcal {K}}\) from Sect. 2, the generalised midpoint rule seeks \(u_h\in S^1({\mathcal {K}};Z_\textrm{CR})\) such that \(u_h(0)=U_0\) and, for all \(\ell =1,\dots ,N\) and for all \(v_h\in Z_\textrm{CR}\), it holds

$$\begin{aligned} \int \limits _\Omega f(\tau _\ell )\cdot (v_h-u_h(\tau _\ell ))\,dx\le & {} \int \limits _\Omega {\dot{u}}_h(\tau _\ell )\cdot (v_h-u_h(\tau _\ell ))\,dx \nonumber \\{} & {} +\mu \int \limits _\Omega \nabla _{\textrm{NC}}u_h(\tau _\ell ): \nabla _{\textrm{NC}}(v_h-u_h(\tau _\ell ))\,dx\nonumber \\{} & {} + j(\varepsilon _{\textrm{NC}}(v_h)) - j(\varepsilon _{\textrm{NC}}(u_h(\tau _\ell ))). \end{aligned}$$
(5.3)

Recall the notation of the jump \([\bullet ]_\ell \) and the left limit \((\bullet )^+_\ell \) from Sect. 2. The discontinuous Galerkin FEM in 3D seeks \(u_\textrm{DG}\in P_0({\mathcal {K}};Z_\textrm{CR})\) such that, for all \(\ell =1,\dots , N\) and for all \(v_\textrm{DG}\in P_0({\mathcal {K}};Z_\textrm{CR})\), it holds

$$\begin{aligned} \begin{aligned} \int \limits _{Q_\ell } f \cdot (v_\textrm{DG}-u_\textrm{DG})\,dQ&\le \int \limits _\Omega [u_\textrm{DG}]_{\ell -1}\cdot (v_\textrm{DG}-u_\textrm{DG})^+_{\ell -1}\,dx\\&\quad +\mu \int \limits _{Q_\ell } \nabla _{\textrm{NC}}u_\textrm{DG}: \nabla _{\textrm{NC}}(v_\textrm{DG}-u_\textrm{DG})\,dQ\\&\quad + \int \limits _{I_\ell } \Big (j(\varepsilon _{\textrm{NC}}(v_\textrm{DG}\vert _{I_\ell })) -j(\varepsilon _{\textrm{NC}}(u_\textrm{DG}\vert _{I_\ell }))\Big )\,dt. \end{aligned}\nonumber \\ \end{aligned}$$
(5.4)

The divergence-free condition \(u_h(t)\in Z_\textrm{CR}\) and \(u_\textrm{DG}\vert _{I_\ell }\in Z_\textrm{CR}\) can be implemented using a piecewise constant Lagrange multiplier as it is common for the Stokes equations. Define

$$\begin{aligned} H({\text {div}},\Omega ;{\mathbb {R}}^{3\times 3})&:= \{\tau =(\tau _1;\tau _2;\tau _3):\Omega \rightarrow {\mathbb {R}}^{3\times 3}\mid \forall m=1,2,3,\; \tau _m\in H({\text {div}},\Omega )\},\\ L^2_0(\Omega )&:=\left\{ v\in L^2(\Omega )\left| \int _\Omega v\,dx=0 \right\} \right. . \end{aligned}$$

Let \(t\in [0,T]\) be arbitrary such that (5.2) holds and \(f(t)-{\dot{u}}(t)\in H^{-1}(\Omega )\) and \(u(t)\in H^1_0(\Omega )\). Define \(W_{3D}(A):=(\mu /4) |A|^2 + g|{\textrm{sym}} A|\). The Euler-Lagrange equations of [11, Theorem 5.1] prove the existence of \(\sigma (t)\in H({\text {div}},\Omega ;{\mathbb {R}}^{3\times 3})\) and \(\xi (t)\in L^2_0(\Omega )\) with

$$\begin{aligned} \sigma (t) - \xi (t) I_{3\times 3}\in \partial W_{3D}(\nabla u(t)) \quad \text {and}\quad f(t)+{\text {div}}\sigma (t)-{\dot{u}}(t)=0 \text { a.e. in }\Omega . \end{aligned}$$

With this \(\sigma \), the error estimates of Theorems 2.72.9 equally hold in this three-dimensional situation. The proof follows as in Sects. 34, see also [11, Theorem 5.2] for the analysis of the error estimate in space. We want to highlight that it is essential for the proof that the functions in \(Z_\textrm{CR}\) are pointwise divergence-free.

6 Numerical experiments

This section is devoted to two numerical experiments for the two discretisations (tB\(_h\)) and (dG(0)) in Sects. 6.26.3. Sect. 6.1 describes how the solution to the discrete nonlinear problems are approximated, while Sect. 6.4 draws some conclusions of the numerical experiments.

6.1 Numerical realisation

The discrete solutions of (tB\(_h\)) and (dG(0)) are approximated by the following Uzawa algorithm from [13]. It is recalled here for the sake of completeness.

figure e

The numerical experiments in Subsects. 6.26.3 employ \(\rho =\mu /g^2\approx 44.4\) and \(\texttt {tol}=10^{-6}\) for the discrete solutions (resp. \(\texttt {tol}=10^{-6}\) for the computation of the reference solutions) and we neglect the influence of the error in the Usawa algorithm in the discussion.

Fig. 1
figure 1

Initial triangulations for the numerical examples from Sect. 6

Fig. 2
figure 2

Red-refined triangle

Fig. 3
figure 3

Plot of rigid regions of \(u_{\textrm{DG},m,j}\) at times 0.0333, 0.1 and 0.2 for different values of m and j for the numerical experiment from Sect. 6.2

Fig. 4
figure 4

Plot of rigid regions of \(u_{0.5,m,j}\) at times 0.0333, 0.1 and 0.2 for different values of m and j for the numerical experiment from Sect. 6.2

Fig. 5
figure 5

Convergence history plot for the numerical experiment from Sect. 6.2. The errors from (6.1) are plotted against the degrees of freedom in \({\mathcal {T}}_m\)

Fig. 6
figure 6

Convergence history plot for the numerical experiment from Sect. 6.2. The errors (6.2) for \(\theta =1\) and \(\theta =0.5\) are plotted against the degrees of freedom in \({\mathcal {T}}_m\)

Fig. 7
figure 7

Convergence history plot for the numerical experiment from Sect. 6.2. The errors (6.1) are plotted against the number of time steps

Fig. 8
figure 8

Convergence history plot for the numerical experiment from Sect. 6.2. The errors (6.2) for \(\theta =1\) and \(\theta =0.5\) are plotted against the number of time steps

6.2 Numerical experiment on the square

This example approximates the solution to (tB) on the unit square \(\Omega =(0,1)^2\) and the time intervall [0, T] with \(T=0.2\) and initial value \(u(0)=0\), right-hand side \(f=5\), viscosity \(\mu =1\), and yield limit \(g=0.15\). The initial triangulation \({\mathcal {T}}_0\) is depicted in Fig. 1a. The approximations \(u_{\textrm{DG},m,j}\) defined by (dG(0)) and \(u_{\theta ,m,j}\) defined by (tB\(_h\)) for \(\theta =\theta _\ell =0.5\) and \(\theta =\theta _\ell =1\) for all time steps \(\ell \) are computed on a sequence \(({\mathcal {T}}_m)_{m=0,\dots ,5}\) of red-refined triangulations (see Fig. 2 for a red-refined triangle). The time step sizes are chosen as \(3^{-j}T/2\) for \(j=0,1,\dots ,4\) and the parameter \(\texttt {tol}\) in the Uzawa algorithm as \(10^{-6}\). A discrete rigid region of \(u_{\theta ,m,j}\) or \(u_{\textrm{DG},m,j}\) at some time t is defined as the union of all triangles where the norm of its (spatial) gradient is less than 1 of the maximal norm of the gradient at that time. These rigid regions are plotted in Fig. 3 (resp. Fig. 4) for \(u_{\textrm{DG},m,j}\) (resp. \(u_{\theta ,m,j}\)) for three different times and for different mesh-sizes and time-step sizes to illustrate the behaviour of the discrete solutions. A reference solution \(u_\textrm{ref}\) is computed on the triangulation \({\mathcal {T}}_6\) with time step size \(3^{-5}T/2\) and \(\texttt {tol}=10^{-7}\). Note that the refinement in time is chosen in such a way that the evaluation points \(\tau _\ell \) with respect to a coarse time step are also evaluation points on a finer time step. The errors \(e_{\theta ,m,j}\) (resp. \(e_{\textrm{DG},m,j}\)) read

$$\begin{aligned} \sqrt{\int \limits _{[0,T]\times \Omega } |\nabla _{\textrm{NC}}(u_\textrm{ref}-u_h)|^2\,dQ} \end{aligned}$$
(6.1)

for \(u_h=u_{\textrm{DG},m,j}\) (resp. \(u_h=u_{\theta ,m,j}\)) and are plotted in Fig. 5 against the numbers of degrees of freedom (ndof) in \({\mathcal {T}}_m\), while the errors

$$\begin{aligned} e_{\tau ;\theta ,m,j}:= \sqrt{\sum _{\ell =1}^N k \Vert \nabla _{\textrm{NC}}(u_\textrm{ref}-u_{\theta ,m,j})(\tau _\ell )\Vert _{L^2(\Omega )}^2} \end{aligned}$$
(6.2)

for the number of time steps N are plotted in Fig. 6.

For a time step size of \(k=3^{-5}T/2\), the errors from (6.1) and (6.2) for all three methods show a convergence rate of \(\texttt {ndof}^{-1/2} = O(h)\). For \(\theta =0.5\) this convergence rate is also achieved for a time step size of \(k=3^{-4}T/2\). In Figs. 7 and 8, the same errors are plotted against the number of time steps. The errors (6.1) show a convergence rate of O(k) on the finest triangulation \({\mathcal {T}}_5\) for all considered methods for the first time steps. In particular, the observed convergence rate is O(k) instead of the predicted \(O(k^{1/2})\) for the discretisation (dG(0)). For larger time steps, the error in space already seems to dominate the error, so that the convergence is diminished. The errors (6.2) converge with rate between O(k) and \(O(k^{1/2})\) for \(\theta =1\) on \({\mathcal {T}}_5\) for the first time steps. This is a slower convergence compared to the predicted convergence rate of Theorem 2.7 and it seems that preasymptotic effects diminish the convergence rate. For \(\theta =0.5\), the convergence between O(k) and the predicted \(O(k^2)\) is visible. It seems that preasymptotic effects or the error in space slow down the convergence.

Fig. 9
figure 9

Plot of rigid regions of \(u_{\textrm{DG},m,j}\) at times 0.0333, 0.1 and 0.2 for different values of m and j for the numerical experiment from Sect. 6.3

Fig. 10
figure 10

Plot of rigid regions of \(u_{0.5,m,j}\) at times 0.0333, 0.1 and 0.2 for different values of m and j for the numerical experiment from Sect. 6.3

Fig. 11
figure 11

Convergence history plot for the numerical experiment from Sect. 6.3. The errors from (6.1) are plotted against the degrees of freedom in \({\mathcal {T}}_m\)

Fig. 12
figure 12

Convergence history plot for the numerical experiment from Sect. 6.3. The errors (6.2) for \(\theta =1\) and \(\theta =0.5\) are plotted against the degrees of freedom in \({\mathcal {T}}_m\)

Fig. 13
figure 13

Convergence history plot for the numerical experiment from Sect. 6.3. The errors (6.1) are plotted against the number of time steps

Fig. 14
figure 14

Convergence history plot for the numerical experiment from Sect. 6.3. The errors (6.2) for \(\theta =1\) and \(\theta =0.5\) are plotted against the number of time steps

6.3 Numerical experiment on the L-shaped domain

This example approximates the solution of (tB) on the L-shaped domain \(\Omega =(-1,1)^2\setminus ([0,1]\times [0,1])\) and the time interval [0, T] with \(T=0.2\) and with initial value \(u(0)=0\), right-hand side \(f=5\), viscosity \(\mu =1\), and yield limit \(g=0.15\). As in Sect. 6.2, the approximations \(u_{\textrm{DG},m,j}\) defined by (dG(0)) and \(u_{\theta ,m,j}\) defined by (tB\(_h\)) for \(\theta =\theta _\ell =0.5\) and \(\theta =\theta _\ell =1\) for all time steps \(\ell \) are computed on a sequence \(({\mathcal {T}}_m)_{m=0,\dots ,5}\) of red-refined triangulations (see Fig. 2 for a red-refined triangle). The initial triangulation \({\mathcal {T}}_0\) is depicted in Fig. 1b. The time step sizes are chosen as \(3^{-j}T/2\) for \(j=0,1,\dots ,4\) and the parameter \(\texttt {tol}\) in the Uzawa algorithm as \(10^{-6}\). As in Sect. 6.2, the rigid regions are plotted in Fig. 9 (resp. Fig. 10) for \(u_{\textrm{DG},m,j}\) (resp. \(u_{\theta ,m,j}\)) for three different times and for different mesh-sizes and time-step sizes to illustrate the behaviour of the discrete solutions.

A reference solution \(u_\textrm{ref}\) is computed on the triangulation \({\mathcal {T}}_6\) with time step size \(3^{-5}T/2\) and \(\texttt {tol}=10^{-7}\). The errors from (6.1) are plotted in Fig. 11 against the number of degrees of freedom (ndof) in \({\mathcal {T}}_m\), while Fig. 12 displays the errors from (6.2). For coarse time steps, the convergence is limited by the error in time, while for small time steps, the errors for all methods show a convergence rate slightly smaller than O(h). In the stationary case, the experiments of [11] show a convergence rate slightly larger than \(O(h^{2/3})\) due to the re-entrant corner and the resulting singularity of the solution on an L-shaped domain. However, for the time-dependent situation considered here, it seems that (at least preasymptotic) the singularity has merely a marginal influence. Figures 13 and 14 display the same errors plotted against the number of time steps. The observed convergence rates are O(k) for the first refinement in time on the fines triangulation for all three methods and the two errors. The predicted convergence rate of \(O(k^2)\) for the uniform midpoint rule cannot be observed. The error in space seems to dominate the approximation error as in Sect. 6.2 on coarser triangulations.

6.4 Conclusions

Section 2.4 states error estimates for the discretisations (tB\(_h\)) and (dG(0)). The numerical experiments of this section illustrate those results. Moreover, they indicate the following results beyond the theoretical statements from Sect. 2.4.

  • The convergence history plots show no preasymptotic effect with respect to h.

  • The empirical convergence for the discontinuous Galerkin discretisation is O(k) rather than \(O(k^{1/2})\) as \(k\rightarrow 0\) predicted by Theorem 2.9.

  • The factor \(h^2/k^{1/2}\) in the upper bound for the error of the discontinuous Galerkin discretisation cannot be seen in the numerical experiments, i.e., the error do not increase, when k becomes small for large h.

  • The condition \(\Vert h_{\mathcal {T}}\Vert _{L^\infty (\Omega )}<k\sqrt{\mu }/(8\sqrt{T}C_{\textrm{NC}})\) that was needed in the error estimate for the uniform midpoint rule in Theorem 2.8 is satisfied only for the pairs \((m,j)\in \{(3,0),(4,0),(5,0),(5,1)\}\) for both experiments in Subsects. 6.26.3. Although Theorem 2.8 only proves a parameter conditional convergence, the numerical experiments indicate that the convergence is independent of that parameter.