1 Introduction

In this paper, we solve numerically the following Stokes problem: Find unknown velocity \(\textbf{u}\) and pressure p such that

$$\begin{aligned} -\mu \Delta \textbf{u}+\nabla p= & {} \textbf{f}\quad \text{ in }\;\Omega , \end{aligned}$$
(1.1)
$$\begin{aligned} \nabla \cdot \textbf{u}= & {} 0\quad \text{ in }\;\Omega , \end{aligned}$$
(1.2)
$$\begin{aligned} \textbf{u}= & {} 0\quad \text{ on }\; \partial \Omega , \end{aligned}$$
(1.3)

where \(\mu \) denotes the fluid viscosity and \(\Omega \) is a 2D polygonal domain, or a 3D polyhedral domain.

The Stokes equations have many applications in fluid dynamics, especially for linearizing the famous Navier-Stokes equations. Finite element methods for the Stokes equations have been studied extensively. Finite element methods in the velocity–pressure formulation have been studied in Crouzeix and Raviart (1973); Girault and Raviart (1986) by \(H^1\) conforming and nonconforming finite element methods, in Schotzau et al. (2003) by discontinuous Galerkin finite element methods, and in Chen et al. (2016); Liu et al. (2020); Mu (2020); Wang et al. (2016) by weak Galerkin finite element methods. In particular, \(H^1\) divergence-free finite element methods are pressure-robust and produce exactly divergence free solutions, cf. (Arnold and Qin 1992; Fabien et al. 2022; Falk and Neilan 2013; Guzman and Neilan 2014a, b, 2018; Huang and Zhang 2012, 2011, 2013; Kean et al. 2022; Neilan 2015; Neilan and Otus 2021; Neilan and Sap 2016; Qin and Zhang 2007; Scott and Vogelius 1985; Xu and Zhang 2010; Zhang 2005, 2008, 2009, 2011a, b, 2017, 2024). In Wang et al. (2009); Wang and Ye (2007), an \(H({\text {div}})\) finite element method on triangular meshes is proposed for the Stokes equations, i.e. the velocity is approximated by \(H({\text {div}})\) finite elements. Therefore the resulting solutions are also exactly divergence free, and consequently the method is pressure robust (John et al. 2017).

Since the finite element solutions in \(H({\text {div}})\) space are discontinuous in tangential directions, a stabilizer is introduced in Wang et al. (2009); Wang and Ye (2007), enforcing some weak continuity of velocity approximation in tangential directions. One would remove such stabilizers in discontinuous finite element methods if possible, for simplifying the methods and reducing the computational cost. The stabilizer-free weak Galerkin and the stabilizer-free discontinuous Galerkin methods on polytopal meshes were first introduced in Ye and Zhang (2020a, 2020c) respectively, for second order elliptic problems. The method has been improved in Al-Taweel and Wang (2020) and applied to the Stokes equations (Feng et al. 2022; Mu et al. 2021; Peng et al. 2021; Ye and Zhang 2021a) and the biharmonic equation (Ye and Zhang 2020e, 2023). Recently, we developed a stabilizer free \(H({\text {div}})\) finite element method in Ye and Zhang (2021b) for the Stokes problem on triangular and tetrahedral meshes. The method is extended to tensor-product meshes in Mu et al. (2023). The method is improved in 2D by finding the divergence-free subspace of \(H({\text {div}})\) finite element space on triangular meshes (Ye and Zhang 2021c). To obtain a stabilizer-free \(H({\text {div}})\) finite element method in Ye and Zhang (2021b), a technique is adopted from the conforming discontinuous Galerkin method (Ye and Zhang 2020b, c, d, 2022a, b). This work is to extend the stabilizer-free \(H({\text {div}})\) finite element method (Ye and Zhang 2021b) on triangular/tetrahedral meshes to polytopal meshes.

So far most existing \(H({\text {div}})\) finite elements (Brezzi and Fortin 1991) are defined on triangles or rectangles in 2D and tetrahedra or cuboids in 3D, respectively. It is far from trivial to construct \(H({\text {div}})\) finite elements on polygons/polyhedra, for the required normal-continuity on too many edges/face-polygons. Wachspress coordinates are used in Chen and Wang (2017) to construct a lowest order \(H({\text {div}})\) element on convex polygons and polyhedra, which are not polynomials, but rational functions. RT-type \(H({\text {div}})\) finite elements on polytopal meshes are constructed in Lin et al. (2022), and are applied to a one-order superconvergent discontinuous Galerkin finite element method (Ye and Zhang 2020c). But for the Stokes equations, we need BDM-type \(H({\text {div}})\) finite elements on polytopal meshes in order to have an optimal-order discretization. BDM-type \(H({\text {div}})\) finite elements on polytopal meshes are constructed and applied to a two-order superconvergent discontinuous Galerkin finite element method in Ye and Zhang (2020d). More BDM-type \(H({\text {div}})\) finite elements on polytopal meshes are constructed in Xu et al. (2024), covering more polyhedra in 3D.

In this work, we adopt the BDM-type \(H({\text {div}})\) finite elements on polytopal meshes (Ye and Zhang 2020d; Xu et al. 2024) to approximate the velocity and the discontinuous \(P_k\) polynomials to approximate the pressure, in solving the Stokes equations. Again, this method is stabilizer free and pressure robust. In addition, the \(H({\text {div}})\) velocity solution is divergence free. The optimal order error estimates are established for both velocity and pressure finite element approximations. The pressure-robustness of the method is proved in the theory, and confirmed in numerical tests.

2 Polytopal BDM-type H-div finite elements

2.1 2D polygons

Let \(T\in \mathcal {T}_h\) be a n-polygon, cf. Fig. 1. We assume T can be subdivided into \(n-2\) shape regular triangles by connecting one vertex \(\textbf{x}_1\) with vertices \(\textbf{x}_3\), ..., \(\textbf{x}_{n-1}\). We define \(\textbf{H}_k(T)\) for the BDM \(H({\text {div}})\) finite element space on T as

$$\begin{aligned} \textbf{H}_{k}(T)&=\left\{ \textbf{v}\in \prod _{i=1}^{n-2} [P_{k}(T_i)]^2 \; : \; \textbf{v} {\text { \ satisfies (2.3) below}} \right\} . \end{aligned}$$
(2.1)

We construct here a subspace of \(H({\text {div}}; T)\) which has a one piece divergence,

$$\begin{aligned} \textbf{H}_{k}(T)&\subset \{\textbf{v}\in H({\text {div}};T):\ \textbf{v}|_{T_i}\in [P_{k}(T_i)]^2,\;\;\nabla \cdot \textbf{v}\in P_{k-1}(T)\}. \end{aligned}$$
Fig. 1
figure 1

A polygon is subdivided in to triangles by connecting one vertex with only other vertices

We specify the degrees of freedom, i.e., a basis \(\{F_i\}\) for the linear functional space on \(\textbf{H}_k(T)\), as follows.

$$\begin{aligned} {\begin{aligned}{ F_i(\textbf{u}) = {\left\{ \begin{array}{ll} \int _{e_{j,j+1}} \textbf{u}\cdot \textbf{n} p {\text {d}}s, &{} j=1,\dots ,n, \; p \in P_k(e_{j,j+1}), \\ \int _{e_{1j}} \textbf{u} \cdot \textbf{n} p {\text {d}}s, &{} j=3,\dots ,n-1, \; p \in P_k(e_{1j})\setminus P_0, \\ \int _{T} \textbf{u}\cdot \nabla p {\text {d}} \textbf{x}, &{} p \in P_{k-1}(T)\setminus P_0, \\ \int _{T_j} \textbf{u}\cdot \nabla ^\perp (p B) {\text {d}}\textbf{x}, &{} j=1,\dots ,n-2, \; p \in P_{k-2}(T_j), \\ \end{array}\right. } } \end{aligned} } \end{aligned}$$
(2.2)

where edge \(e_{j,j+1}=\textbf{x}_j\textbf{x}_{j+1}\), \(\textbf{x}_{n+1}=\textbf{x}_1\), \(\textbf{n}\) is a fixed unit normal vector on the edge, \(T_j=\textbf{x}_1\textbf{x}_{j+1}\textbf{x}_{j+2}\), \(\nabla ^\perp f = \langle -\partial _y f, \partial _x f\rangle \), and \(B=\lambda _1\lambda _2\lambda _3\) with the barycentric coordinates (i.e., the linear function \(\lambda _i=0\) on one edge of the triangle \(T_j\) and \(\lambda _i=1\) at the opposite vertex.)

We specify next the continuity constraints on the functions of \(\textbf{H}_{k}(T)\) to eliminate the extra degrees of freedom,

$$\begin{aligned} \left\{ \begin{array}{ll} \int _{e_{1j}} [\textbf{u}] \cdot \textbf{n} p {\text {d}}s=0, &{} j=3,\dots ,n-1, \; p \in P_0(e_{1j}), \\ {\text {div}}\textbf{u}|_{T_j}-{\text {div}}\textbf{u}|_{T_1}=0 &{}{\text {on }} \ T_1, \ j=2,\dots , n-2,\\ \end{array}\right. \end{aligned}$$
(2.3)

where \([\cdot ]\) denotes the jump on the edge, and \(\textbf{u}|_{T_j}\in {[}P_k(T_j)]^3\) is viewed as \(\textbf{u}|_{T_j}\in [P_k(T_1\cup T_j)]^3\). We note that the number of constraints plus the number of degrees of freedom is exactly equal to \(2 n \dim P_k\).

2.2 Type-one polyhedra

We consider one type of polyhedron which can be subdivided in to tetrahedra without introducing any internal edge.

Let \(T\in \mathcal {T}_h\) be an m-polygon-face polyhedron, cf. Fig. 2. Let \(F_i\) be a face polygon of T. We subdivide it into \(f_i\) (may have \(f_i=1\) triangles \(\{ e_{i,1},\dots ,e_{i,f_i}\}\). Together we get \(M=4n-2(n-1)\) face-triangles, where n is the number of tetrahedra after the subdivision. With these M triangles and no internal edge introduced, we assume T is subdivided into n shape regular tetrahedra, \(\{T_1,\dots ,T_n\}\). Between \(T_i\) and \(T_{i+1}\), we have one internal triangle \(t_i\). There are \((n-1)\) internal triangles \(t_i\) separating the n tetrahedra.

Fig. 2
figure 2

An m-face (\(m=5\) in the figure) polyhedron is subdivided in to n (\(n=3\)) tetrahedra, where there is no internal edge, there are exactly \((n-1)\) internal triangles, a face polygon \(F_i\) is subdivided into \(f_i\) (\(f_i=1\) or 2 in the figure) triangles \(\{e_{i1},\dots ,e_{i,f_i}\}\), and there are M outside face triangles

We define the 3D BDM \(H({\text {div}})\) macro-element space on T as

$$\begin{aligned} \textbf{H}_{k}(T)&=\left\{ \textbf{v}\in \prod _{i=1}^{n} [P_{k}(T_i)]^3 \; : \; \textbf{v} {\text { \ satisfies (2.7) below}} \right\} . \end{aligned}$$
(2.4)

We specify the degrees of freedom, the set \(\{F_l(\textbf{u}) \}\) of the linear functionals, as follows.

$$\begin{aligned} \begin{aligned} F_l(\textbf{u})&= {\left\{ \begin{array}{ll} \int _{e_{i1}} \textbf{u}\cdot \textbf{n} p {\text {d}}s, &{} i=1,\dots ,m, \; p \in P_k(e_{i1}), \\ \int _{t_{i}} (\textbf{u}|_{T_i}) \cdot \textbf{n} p {\text {d}}s, &{} i=1,\dots ,n-1, \; p \in P_k(t_{i})\setminus P_0(t_i) , \\ \int _{T_i} \textbf{u}\cdot \textbf{p} {\text {d}}\textbf{x}, &{} i=1,\dots ,n, \; \textbf{p} \in \textbf{CP}_{k}(T_j), \\ \int _{T} \textbf{u}\cdot \nabla p {\text {d}}\textbf{x}, &{} p \in P_{k-1}(T)\setminus P_0(T), \\ \end{array}\right. } \end{aligned} \end{aligned}$$
(2.5)

where the curl space

$$\begin{aligned} \textbf{CP}_k(T_j)&=\{ \textbf{v}\in P_k(T_j)^3 \; : \; \textbf{v}\cdot \textbf{n}|_{\partial T_j} =0, \nonumber \\&\qquad \textbf{v}\cdot \nabla p=0 \ \forall p\in P_{k-1}(T_j)\setminus P_0(T_j) \}. \end{aligned}$$
(2.6)

We specify the constraints on the base polynomial space \(\prod _{i=1}^n P_k(T_i)^3\) as follows.

$$\begin{aligned} { {\left\{ \begin{array}{ll} \int _{e_{i1}} (\textbf{u}-\textbf{u}|_{e_{ij}})\cdot \textbf{n} p {\text {d}}s=0, &{} i=1,\dots ,m, \; p \in P_k(e_{i1}), \\ {} &{} j=2,\dots ,i_f, \\ \int _{t_{i}} [\textbf{u}] \cdot \textbf{n} p {\text {d}}s=0, &{} i=1,\dots ,n-1, \; p \in P_k(t_{i}), \\ \int _{T_1} {\text {div}}(\textbf{u}-\textbf{u}|_{T_i}) p {\text {d}}\textbf{x}=0, &{} i=2,\dots ,n, \; p\in P_{k-1}(T_1), \end{array}\right. } } \end{aligned}$$
(2.7)

where \(\textbf{u}|_{e_{ij}}\in [P_k(e_{ij})]^3\) is viewed as \(\textbf{u}|_{e_{ij}}\in [P_k(e_{i1}\cup e_{ij})]^3\), \([\textbf{u}]\) is the jump of \(\textbf{u}\) at triangle \(t_i\), and \(\textbf{u}|_{T_i}\in [P_k(T_i)]^3\) is viewed as \(\textbf{u}|_{T_i}\in [P_k(T_1\cup T_i)]^3\).

2.3 Type-two polyhedra

Let \(T\in \mathcal {T}_h\) be an m-polygon-face polyhedron, cf. Fig. 3. Let \(F_i\) be a face polygon of T. We subdivide it into \(f_i\) triangles \(\{ e_{i,1},\dots ,e_{i,f_i}\}\). Together we get \(M=2n\) face-triangles. With these M triangles and exactly one internal edge (connecting two vertices), we assume T is subdivided into n shape regular tetrahedra, \(\{T_1,\dots ,T_n\}\). Between \(T_i\) and \(T_{i+1}\), we have one internal triangle \(t_i\). \(t_n\) is the triangle between \(T_n\) and \(T_1\).

Fig. 3
figure 3

An m-face (\(m=6\) in the figure) polyhedron is subdivided in to n (\(n=6\)) tetrahedra, where there is exactly one internal edge, there are exactly n internal triangles, a face polygon \(F_i\) is subdivided into \(f_i\) (\(f_i=2\)) triangles \(\{e_{i1},\dots ,e_{i,f_i}\}\), and there are M face triangles

We define the 3D BDM \(H({\text {div}})\) macro-element space on T as

$$\begin{aligned} \textbf{H}_{k}(T)&=\left\{ \textbf{v}\in \prod _{i=1}^{n} [P_{k}(T_i)]^3 \; : \; \textbf{v} {\text { \ satisfies (2.7) below}} \right\} . \end{aligned}$$
(2.8)

We specify the degrees of freedom, the set \(\{F_l(\textbf{u}) \}\) of the linear functionals, as follows.

$$\begin{aligned} \begin{aligned} F_l(\textbf{u})&= {\left\{ \begin{array}{ll} \int _{e_{i1}} \textbf{u}\cdot \textbf{n} p {\text {d}}s, &{} i=1,\dots ,m, \; p \in P_k(e_{i1}), \\ \int _{t_{i}} (\textbf{u}|_{T_i}) \cdot \textbf{n} p {\text {d}}s, &{} i=1,\dots ,n, \; p \in P_k(t_{i})\setminus P_0(t_i) , \\ \int _{t_{1}} (\textbf{u}|_{T_1}) \cdot \textbf{n} p {\text {d}}s, &{} p \in P_0(t_1) , \\ \int _{T_i} \textbf{u}\cdot \textbf{p} d \textbf{x}, &{} i=1,\dots ,n, \; \textbf{p} \in \textbf{CP}_{k}(T_j), \\ \int _{T} \textbf{u}\cdot \nabla p {\text {d}}\textbf{x}, &{} p \in P_{k-1}(T)\setminus P_0(T), \\ \end{array}\right. } \end{aligned} \end{aligned}$$
(2.9)

where the curl space \(\textbf{CP}_k(T_j)\) is defined in (2.6), and \(t_1\) is an internal triangle which is on a tetrahedron having two other face-triangles on \(\partial T\).

We specify the constraints on the base polynomial space \(\prod _{i=1}^n P_k(T_i)^3\) as follows.

$$\begin{aligned} {\left\{ \begin{array}{ll} \int _{e_{i1}} (\textbf{u}-\textbf{u}|_{e_{ij}})\cdot \textbf{n} p {\text {d}}s=0, &{} i=1,\dots ,m, \; p \in P_k(e_{i1}), \\ {} &{} j=2,\dots ,i_f, \\ \int _{t_{i}} [\textbf{u}] \cdot \textbf{n} p {\text {d}}s=0, &{} i=1,\dots ,n, \; p \in P_k(t_{i}), \\ \int _{T_1} {\text {div}}(\textbf{u}-\textbf{u}|_{T_i}) p {\text {d}}\textbf{x}=0, &{} i=2,\dots ,n, \; p\in P_{k-1}(T_1), \end{array}\right. } \end{aligned}$$
(2.10)

where \(\textbf{u}|_{e_{ij}}\in [P_k(e_{ij})]^3\) is viewed as \(\textbf{u}|_{e_{ij}}\in [P_k(e_{i1}\cup e_{ij})]^3\), \([\textbf{u}]\) is the jump of \(\textbf{u}\) at triangle \(t_i\), and \(\textbf{u}|_{T_i}\in [P_k(T_i)]^3\) is viewed as \(\textbf{u}|_{T_i}\in [P_k(T_1\cup T_i)]^3\).

3 The finite element

Let \(\mathcal {T}_h\) be a partition of the domain \(\Omega \) consisting of polygons in two dimensions or polyhedra in three dimensions satisfying subdivision conditions specified in last section. For every element \(T\in \mathcal {T}_h\), we denote by \(h_T\) its diameter. We let mesh size \(h=\max _{T\in \mathcal {T}_h} h_T\) for \({\mathcal T}_h\).

The space \(H({\text {div}};\Omega )\) is defined as the set of vector-valued functions on \(\Omega \) which, together with their divergence, are square integrable; i.e.,

$$\begin{aligned} H({\text {div}}; \Omega )=\left\{ \textbf{v}\in [L^2(\Omega )]^d:\; \nabla \cdot \textbf{v}\in L^2(\Omega )\right\} . \end{aligned}$$

For any \(T\in \mathcal {T}_h\), it can be divided in to a set of disjoint triangles/tetrahedrons \(T_i\) with \(T=\cup T_i\), specified in the last section. We define the \(H({\text {div}})\) finite element space \(V_h\) for velocity approximation as, for \(k\ge 1\),

$$\begin{aligned} { V_h=\{\textbf{v}\in H({\text {div}};\Omega ):\ \textbf{v}|_{T}\in \textbf{H}_k(T)\}, } \end{aligned}$$
(3.1)

where \(\textbf{H}_k(T)\) is defined in (2.1), or (2.4), or (2.8). As we define the degrees of freedom locally, not on a reference element, the dual (i.e., computational) basis functions in (3.1) are local too, not mapped from a set of reference functions.

The following lemma is proved in Ye and Zhang (2021a).

Lemma 3.1

The interpolation operator \(\Pi _h: [H^1_0(\Omega )]^d \rightarrow V_h\), defined by the degrees of freedom (2.2), (2.5) or (2.9), satisfies

$$\begin{aligned} (\nabla \cdot \tau ,\;w)_T= & {} (\nabla \cdot \Pi _h\tau ,\;w)_T \quad \forall w\in P_{k-1}(T), \end{aligned}$$
(3.2)
$$\begin{aligned} \Vert \Pi _h\tau -\tau \Vert\le & {} Ch^{k+1}|\tau |_{k+1}. \end{aligned}$$
(3.3)

Let \(T_1\) and \(T_2\) be two triangles/tetrahedrons in \(T\in \mathcal {T}_h\) sharing e. For \(\textbf{v}\in V_h+[H_0^1(\Omega )]^d \), we define jump \([\textbf{v}]\) as

$$\begin{aligned}{ [\textbf{v}]=\textbf{v}\quad \textrm{if} \;e\subset \partial \Omega ,\quad [\textbf{v}]=\textbf{v}|_{T_1}-\textbf{v}|_{T_2}\;\; \textrm{if} \;e\not \subset \partial \Omega , } \end{aligned}$$

and the average \(\{v\}\) as

$$\begin{aligned}{ \{\textbf{v}\}=\textbf{0}\quad \textrm{if} \;e\subset \partial \Omega ,\quad \{\textbf{v}\}=\frac{1}{2}(\textbf{v}|_{T_1}+\textbf{v}|_{T_2})\;\; \textrm{if} \;e\not \subset \partial \Omega . } \end{aligned}$$

The order of \(T_1\) and \(T_2\) is not essential.

For a function \(\textbf{v}\in V_h+[H_0^1(\Omega )]^d\), its weak gradient is a piecewise polynomial satisfying \(\nabla _w\textbf{v}|_{T_i} \in [P_{k+1}(T_i)]^{d\times d}\) and

$$\begin{aligned} (\nabla _w\textbf{v},\ \tau )_{T_i} = -(\textbf{v},\ \nabla \cdot \tau )_{T_i}+ {\langle }\{\textbf{v}\}, \ \tau \cdot \textbf{n}{\rangle }_{{\partial T}_i}\quad \forall \tau \in [P_{k+1}(T_i)]^{d\times d}. \end{aligned}$$
(3.4)

Let \(\mathbb {Q}_h\) be the element-wise defined \(L^2\) projection onto the space \([P_{k+1}(T_i)]^{d\times d}\) on \(T_i\).

Lemma 3.2

Let \(\varvec{\phi }\in [H_0^1(\Omega )]^d\), then on \(T_i\subset T\in \mathcal {T}_h\)

$$\begin{aligned} \nabla _w \varvec{\phi }= & {} \mathbb {Q}_h\nabla \varvec{\phi }. \end{aligned}$$
(3.5)

Proof

Using (3.4) and integration by parts, we have that for any \(\tau \in [P_{k+1}(T_i)]^{d\times d}\)

$$\begin{aligned} (\nabla _w \varvec{\phi },\tau )_{T_i}= & {} -(\varvec{\phi },\nabla \cdot \tau )_{T_i} +\langle \{\varvec{\phi }\},\tau \cdot \textbf{n}\rangle _{{\partial T}_i}\\= & {} -(\varvec{\phi },\nabla \cdot \tau )_{T_i} +\langle \varvec{\phi },\tau \cdot \textbf{n}\rangle _{{\partial T}_i}\\= & {} (\nabla \varvec{\phi },\tau )_{T_i}\\= & {} (\mathbb {Q}_h\nabla \phi ,\tau )_{T_i}, \end{aligned}$$

which implies the identity (3.5). \(\square \)

For \(k\ge 1\) and given \(\mathcal {T}_h\), we define the finite element space for the pressure approximation,

$$\begin{aligned} W_h =\left\{ q\in L_0^2(\Omega ): \ q|_T\in P_{k-1}(T)\right\} . \end{aligned}$$
(3.6)

We introduce the following finite element scheme without stabilizers.

Algorithm 1

A numerical approximation for (1.1)–(1.3) can be obtained by seeking \(\textbf{u}_h\in V_h\) and \(p_h\in W_h\) such that for all \(\textbf{v}\in V_h\) and \(q\in W_h\),

$$\begin{aligned} ( \mu \nabla _w\textbf{u}_h,\ \nabla _w\textbf{v})-(\nabla \cdot \textbf{v},\;p_h)= & {} (f,\;\textbf{v}), \end{aligned}$$
(3.7)
$$\begin{aligned} (\nabla \cdot \textbf{u}_h,\;q)= & {} 0. \end{aligned}$$
(3.8)

where \(V_h\) and \(W_h\) are defined in (3.1) and (3.6), respectively.

For any function \(\varphi \in H^1(T)\), the following trace inequality holds true (see Wang and Ye 2014 for details):

$$\begin{aligned} \Vert \varphi \Vert _{e}^2 \le C \left( h_T^{-1} \Vert \varphi \Vert _T^2 + h_T \Vert \nabla \varphi \Vert _{T}^2\right) . \end{aligned}$$
(3.9)

We introduce two semi-norms \({|\hspace{-.02in}|\hspace{-.02in}|} \textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}\) and \(\Vert \textbf{v}\Vert _{1,h}\) for any \(\textbf{v}\in V_h \cup [H^1(\Omega )]^d\) as follows:

$$\begin{aligned} {|\hspace{-.02in}|\hspace{-.02in}|} \textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}^2= & {} ( \nabla _w\textbf{v},\nabla _w\textbf{v}), \end{aligned}$$
(3.10)
$$\begin{aligned} \Vert \textbf{v}\Vert _{1,h}^2= & {} \sum _{T\in \mathcal {T}_h}\sum _i\left( \Vert \nabla \textbf{v}\Vert _{T_i}^2+ h_{T_i}^{-1}\Vert [\textbf{v}]\Vert _{{\partial T}_i}^2\right) , \end{aligned}$$
(3.11)

where \(T=\cup _iT_i\). It is easy to see that \(\Vert \textbf{v}\Vert _{1,h}\) defines a norm in \(V_h\).

Lemma 3.3

There exist two positive constants \(C_1\) and \(C_2\) such that for any \(\textbf{v}\in V_h\), we have

$$\begin{aligned} C_1 \Vert \textbf{v}\Vert _{1,h}\le {|\hspace{-.02in}|\hspace{-.02in}|} \textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|} \le C_2 \Vert \textbf{v}\Vert _{1,h}, \end{aligned}$$
(3.12)

where the semi-norms are defined in (3.10)–(3.11).

Lemma 3.3 has been proved in Ye and Zhang (2020c); Al-Taweel and Wang (2020).

Lemma 3.4

There exists a positive constant \(\beta \) independent of h such that for all \(\rho \in W_h\),

$$\begin{aligned} \sup _{\textbf{v}\in V_h}\frac{(\nabla \cdot \textbf{v},\rho )}{{|\hspace{-.02in}|\hspace{-.02in}|}\textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}}\ge \beta \Vert \rho \Vert . \end{aligned}$$
(3.13)

Proof

We note that the interpolation operator \(\Pi _h\) introduced in (3.2) and (3.3) is defined locally, not on a reference macro-element. For any given \(\rho \in W_h\subset L_0^2(\Omega )\), it is known (Girault and Raviart 1986) that there exists a function \({\tilde{\textbf{v}}}\in [H_0^1(\Omega )]^d\) such that

$$\begin{aligned} \frac{(\nabla \cdot {\tilde{\textbf{v}}},\rho )}{\Vert {\tilde{\textbf{v}}}\Vert _1}\ge C\Vert \rho \Vert , \end{aligned}$$
(3.14)

where \(C>0\) is a constant independent of h. Let \(\textbf{v}=\Pi _h\tilde{\textbf{v}}\in V_h\). It follows from (3.12), (3.9), (3.3) and \(\tilde{\textbf{v}}\in [H_0^1(\Omega )]^d\),

$$\begin{aligned} {|\hspace{-.02in}|\hspace{-.02in}|} \textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}^2\le & {} C\Vert \textbf{v}\Vert _{1,h}^2=C \sum _{T\in \mathcal {T}_h}\sum _i\left( \Vert \nabla \textbf{v}\Vert _{T_i}^2+h_{T_i}^{-1}\Vert [\textbf{v}]\Vert _{{\partial T}_i}^2\right) \\\le & {} C \sum _{T\in \mathcal {T}_h}\sum _i\left( \Vert \nabla \Pi _h\tilde{\textbf{v}}\Vert _{T_i}^2+h_{T_i}^{-1} \Vert [\Pi _h\tilde{\textbf{v}}-\tilde{\textbf{v}}]\Vert _{{\partial T}_i}^2\right) \\\le & {} C \Vert \tilde{\textbf{v}}\Vert _1^2, \end{aligned}$$

which implies

$$\begin{aligned} {|\hspace{-.02in}|\hspace{-.02in}|}\textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}\le C \Vert \tilde{\textbf{v}}\Vert _1. \end{aligned}$$
(3.15)

It follows from (3.2) that

$$\begin{aligned} (\nabla \cdot \textbf{v},\;\rho )= & {} (\nabla \cdot \Pi _h{\tilde{\textbf{v}}},\;\rho )=(\nabla \cdot {\tilde{\textbf{v}}},\;\rho ). \end{aligned}$$

Using the above Eqs. (3.14) and (3.15), we have

$$\begin{aligned} \frac{|(\nabla \cdot \textbf{v},\rho )|}{{|\hspace{-.02in}|\hspace{-.02in}|}\textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}}\ge & {} \frac{|(\nabla \cdot {\tilde{\textbf{v}}},\rho )|}{C \Vert {\tilde{\textbf{v}}}\Vert _1}\ge \beta \Vert \rho \Vert , \end{aligned}$$

for a positive constant \(\beta \). This completes the proof of the lemma. \(\square \)

Lemma 3.5

The weak Galerkin method (3.7)–(3.8) has a unique solution.

Proof

It suffices to show that zero is the only solution of (3.7)–(3.8) if \(\textbf{f}=\textbf{0}\). To this end, let \(\textbf{f}=\textbf{0}\) and take \(\textbf{v}=\textbf{u}_h\) in (3.7) and \(q=p_h\) in (3.8). By adding the two resulting equations, we obtain

$$\begin{aligned} (\nabla _w\textbf{u}_h,\ \nabla _w\textbf{u}_h)=0, \end{aligned}$$

which implies that \(\nabla _w \textbf{u}_h=0\) on each element T. By (3.12), we have \(\Vert \textbf{u}_h\Vert _{1,h}=0\) which implies that \(\textbf{u}_h=0\).

Since \(\textbf{u}_h=\textbf{0}\) and \(\textbf{f}=\textbf{0}\), the Eq. (3.7) becomes \((\nabla \cdot \textbf{v},\ p_h)=0\) for any \(\textbf{v}\in V_h\). Then the inf-sup condition (3.13) implies \(p_h=0\). We have proved the lemma. \(\square \)

4 Error equations

In this section, we will derive the equations that the errors satisfy. First we define an element-wise \(L^2\) projection \(Q_h\) onto the local space \(P_{k-1}(T)\) for \(T\in \mathcal {T}_h\). Let \(\textbf{e}_h=\Pi _h\textbf{u}-\textbf{u}_h\), \(\textbf{E}_h=\textbf{u}-\textbf{u}_h\) and \(\varepsilon _h=Q_hp-p_h\).

Lemma 4.1

The following error equations hold true for any \(\textbf{v}\in V_h\) and \(q\in W_h\),

$$\begin{aligned} ( \mu \nabla _w\textbf{e}_h,\; \nabla _w\textbf{v}) -(\varepsilon _h,\;\nabla \cdot \textbf{v})= & {} \ell _1(\textbf{u},\textbf{v})-\ell _2(\textbf{u},\textbf{v}),\end{aligned}$$
(4.1)
$$\begin{aligned} (\nabla \cdot \textbf{e}_h,\ q)= & {} 0, \end{aligned}$$
(4.2)

where

$$\begin{aligned} \ell _1(\textbf{u},\ \textbf{v})= & {} \mu \sum _{T\in \mathcal {T}_h}\sum _i{\langle }\textbf{v}-\{\textbf{v}\},\ \nabla \textbf{u}\cdot \textbf{n}-\mathbb {Q}_h(\nabla \textbf{u})\cdot \textbf{n}{\rangle }_{{\partial T}_i},\end{aligned}$$
(4.3)
$$\begin{aligned} \ell _2(\textbf{u},\textbf{v})= & {} (\mu \nabla _w (\textbf{u}-\Pi _h\textbf{u}),\nabla _w\textbf{v}). \end{aligned}$$
(4.4)

Proof

First, we test (1.1) by \(\textbf{v}\in V_h\) to obtain

$$\begin{aligned} -( \mu \Delta \textbf{u},\;\textbf{v})+(\nabla p,\ \textbf{v})=(\textbf{f},\; \textbf{v}). \end{aligned}$$
(4.5)

Integration by parts gives

$$\begin{aligned} -(\mu \Delta \textbf{u},\;\textbf{v})=\mu \sum _{T\in \mathcal {T}_h}\sum _i(\nabla \textbf{u},\nabla \textbf{v})_{T_i}- \mu \sum _{T\in \mathcal {T}_h}\sum _i\langle \nabla \textbf{u}\cdot \textbf{n},\textbf{v}-\{\textbf{v}\}\rangle _{{\partial T}_i}, \end{aligned}$$
(4.6)

where we use the fact \(\sum _{T\in \mathcal {T}_h}\sum _i\langle \nabla \textbf{u}\cdot \textbf{n},\{\textbf{v}\}\rangle _{{\partial T}_i}=0\). It follows from integration by parts, (3.4) and (3.5),

$$\begin{aligned}{} & {} \mu \sum _{T\in \mathcal {T}_h}\sum _i(\nabla \textbf{u},\nabla \textbf{v})_{T_i}\nonumber \\= & {} \mu \sum _{T\in \mathcal {T}_h}\sum _i(\mathbb {Q}_h\nabla \textbf{u},\nabla \textbf{v})_{T_i}\nonumber \\= & {} -\mu \sum _{T\in \mathcal {T}_h}\sum _i(\textbf{v},\nabla \cdot (\mathbb {Q}_h\nabla \textbf{u}))_{T_i} +\langle \textbf{v}, \mathbb {Q}_h\nabla \textbf{u}\cdot \textbf{n}\rangle _{{\partial T}_i}\nonumber \\= & {} \mu \sum _{T\in \mathcal {T}_h}\sum _i(\mathbb {Q}_h\nabla \textbf{u}, \nabla _w \textbf{v})_{T_i} +\mu \sum _{T\in \mathcal {T}_h}\sum _i\langle \textbf{v}-\{\textbf{v}\}, \mathbb {Q}_h\nabla \textbf{u}\cdot \textbf{n}\rangle _{{\partial T}_i}\nonumber \\= & {} ( \mu \nabla _w \textbf{u}, \nabla _w \textbf{v})+ \mu \sum _{T\in \mathcal {T}_h}\sum _i\langle \textbf{v}-\{\textbf{v}\},\mathbb {Q}_h\nabla \textbf{u}\cdot \textbf{n}\rangle _{{\partial T}_i}. \end{aligned}$$
(4.7)

Combining (4.6) and (4.7) gives

$$\begin{aligned} -(\mu \Delta \textbf{u},\;\textbf{v})= & {} (\mu \nabla _w \textbf{u},\nabla _w\textbf{v})- \ell _1(\textbf{u},\textbf{v}). \end{aligned}$$
(4.8)

Using integration by parts and \(\textbf{v}\in V_h\), we have

$$\begin{aligned} (\nabla p,\ \textbf{v})= -(p,\nabla \cdot \textbf{v})+{\langle }p, \textbf{v}\cdot \textbf{n}{\rangle }=-(p,\nabla \cdot \textbf{v})=-(Q_hp,\nabla \cdot \textbf{v}). \end{aligned}$$
(4.9)

Substituting (4.8) and (4.9) into (4.5) gives

$$\begin{aligned} ( \mu \nabla _w\textbf{u},\nabla _w\textbf{v})-(Q_hp,\nabla \cdot \textbf{v})=(\textbf{f},\textbf{v})+\ell _1(\textbf{u},\textbf{v}). \end{aligned}$$
(4.10)

The difference of (4.10) and (3.7) implies

$$\begin{aligned} ( \mu \nabla _w\textbf{E}_h,\nabla _w\textbf{v})-(\varepsilon _h,\nabla \cdot \textbf{v}) =\ell _1(\textbf{u},\textbf{v})\quad \forall \textbf{v}\in V_h. \end{aligned}$$
(4.11)

Adding and subtracting \(( \mu \nabla _w\Pi _h\textbf{u},\nabla _w\textbf{v})\) in (4.11), we have

$$\begin{aligned}{ ( \mu \nabla _w\textbf{e}_h,\nabla _w\textbf{v}) -(\varepsilon _h,\nabla \cdot \textbf{v}) =\ell _1(\textbf{u},\textbf{v})-\ell _2(\textbf{u},\textbf{v}), } \end{aligned}$$

which implies (4.1).

Testing Eq. (1.2) by \(q\in W_h\) and using (3.2) give

$$\begin{aligned} (\nabla \cdot \textbf{u},\ q)=(\nabla \cdot \Pi _h\textbf{u},\ q)=0. \end{aligned}$$
(4.12)

The difference of (4.12) and (3.8) implies (4.2). We have proved the lemma. \(\square \)

5 Error estimates in energy norm

In this section, we shall establish optimal order error estimates for the velocity approximation \(\textbf{u}_h\) in \({|\hspace{-.02in}|\hspace{-.02in}|}\cdot {|\hspace{-.02in}|\hspace{-.02in}|}\) norm and for the pressure approximation \(p_h\) in the standard \(L^2\) norm.

Lemma 5.1

Let \(\textbf{w}\in [H^{k+1}(\Omega )]^d\) and \(\textbf{v}\in V_h\). Then, the following estimates hold true

$$\begin{aligned} |\ell _1(\textbf{w},\ \textbf{v})|\le & {} C { \mu } h^{k}|\textbf{w}|_{k+1}{|\hspace{-.02in}|\hspace{-.02in}|} \textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}, \end{aligned}$$
(5.1)
$$\begin{aligned} |\ell _2(\textbf{w},\ \textbf{v})|\le & {} C { \mu } h^{k}|\textbf{w}|_{k+1}{|\hspace{-.02in}|\hspace{-.02in}|} \textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}, \end{aligned}$$
(5.2)
$$\begin{aligned} {|\hspace{-.02in}|\hspace{-.02in}|} \textbf{w}-\Pi _h\textbf{w}{|\hspace{-.02in}|\hspace{-.02in}|}\le & {} C h^k|\textbf{w}|_{k+1}, \end{aligned}$$
(5.3)

where \(\ell _1\) and \(\ell _2\) are defined in (4.3) and (4.4), respectively.

Proof

Using the Cauchy–Schwarz inequality, the trace inequality (3.9), and (3.12), we have

$$\begin{aligned} |\ell _1(\textbf{w},\ \textbf{v})|= & {} | \mu \sum _{T\in \mathcal {T}_h}\sum _i{\langle }\textbf{v}-\{\textbf{v}\},\ \nabla \textbf{w}\cdot \textbf{n}-\mathbb {Q}_h(\nabla \textbf{w})\cdot \textbf{n}{\rangle }_{{\partial T}_i}|\\\le & {} C \mu \sum _{T\in \mathcal {T}_h}\sum _i \Vert \nabla \textbf{w}-\mathbb {Q}_h\nabla \textbf{w}\Vert _{{\partial T}_i} \Vert \textbf{v}-\{\textbf{v}\}\Vert _{{\partial T}_i}\\\le & {} C \mu \left( \sum _{T\in \mathcal {T}_h}h_T\sum _i\Vert ( \nabla \textbf{w}-\mathbb {Q}_h\nabla \textbf{w})\Vert _{{\partial T}_i}^2\right) ^{\frac{1}{2}} \\{} & {} \quad \cdot \left( \sum _{T\in \mathcal {T}_h}\sum _i h_{T_i}^{-1}\Vert [\textbf{v}]\Vert _{{\partial T}_i}^2\right) ^{\frac{1}{2}}\\\le & {} C { \mu } h^{k}|\textbf{w}|_{k+1}{|\hspace{-.02in}|\hspace{-.02in}|} \textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}. \end{aligned}$$

Next we estimate \(|\ell _2(\textbf{w}, \textbf{v})|=|( \mu \nabla _w (\textbf{w}-\Pi _h\textbf{w}),\nabla _w\textbf{v})|\). It follows from (3.4), integration by parts, (3.9) and (3.3) that for any \(\textbf{q}\in [P_{k+1}(T_i)]^{d\times d}\),

$$\begin{aligned} \begin{aligned}&\quad \ |( \mu \nabla _w(\textbf{w}-\Pi _h\textbf{w}), \textbf{q})_{T_i}|\\&= \big |- \mu (\textbf{w}-\Pi _h\textbf{w}, \nabla \cdot \textbf{q})_{T_i}+ \mu {\langle }\{\textbf{w}-\Pi _h\textbf{w}\}, \textbf{q}\cdot \textbf{n}{\rangle }_{{\partial T}_i}\big | \\&= \big |( \mu \nabla (\textbf{w}-\Pi _h\textbf{w}), \textbf{q})_{T_i}+ \mu {\langle }\{\textbf{w}-\Pi _h\textbf{w}\}-(\textbf{w}-\Pi _h\textbf{w}), \textbf{q}\cdot \textbf{n}{\rangle }_{{\partial T}_i}\big | \\&= \mu \Vert \nabla (\textbf{w}-\Pi _h\textbf{w})\Vert _{T_i}\Vert \textbf{q}\Vert _{T_i}+C \mu h^{-1/2}\Vert [\textbf{w}-\Pi _h\textbf{w}]\Vert _{{\partial T}_i}\Vert \textbf{q}\Vert _{T_i} \\&\le C \mu h^k|\textbf{w}|_{k+1, T_i}\Vert \textbf{q}\Vert _{T_i}. \end{aligned} \end{aligned}$$
(5.4)

Letting \(\textbf{q}=\nabla _w\textbf{v}\) in the above equation and taking summation over \(T_i\), we have

$$\begin{aligned} |\ell _2(\textbf{w},\ \textbf{v})|\le & {} C { \mu } h^{k}|\textbf{w}|_{k+1}{|\hspace{-.02in}|\hspace{-.02in}|} \textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}. \end{aligned}$$

Letting \(\textbf{q}=\nabla _w(\textbf{w}-\Pi _h\textbf{w})\) in (5.4) implies (5.3). We have proved the lemma. \(\square \)

Theorem 5.1

Let \((\textbf{u}_h,p_h)\in V_h^0\times W_h\) be the solution of (3.7)–(3.8). Then, the following error estimates hold true

$$\begin{aligned} {|\hspace{-.02in}|\hspace{-.02in}|} \textbf{u}-\textbf{u}_h{|\hspace{-.02in}|\hspace{-.02in}|}\le & {} C h^{k}|\textbf{u}|_{k+1}, \end{aligned}$$
(5.5)
$$\begin{aligned} \Vert Q_hp-p_h\Vert\le & {} C \mu h^{k}|\textbf{u}|_{k+1}. \end{aligned}$$
(5.6)

Proof

By letting \(\textbf{v}=\textbf{e}_h\) in (4.1) and \(q=\varepsilon _h\) in (4.2) and using (4.2), we have

$$\begin{aligned} \mu {|\hspace{-.02in}|\hspace{-.02in}|} \textbf{e}_h{|\hspace{-.02in}|\hspace{-.02in}|}^2= & {} |\ell _1(\textbf{u},\textbf{v})-\ell _2(\textbf{u},\textbf{v})|. \end{aligned}$$
(5.7)

By (5.7), it then follows from (5.1) and (5.2) that

$$\begin{aligned} {|\hspace{-.02in}|\hspace{-.02in}|} \textbf{e}_h{|\hspace{-.02in}|\hspace{-.02in}|}^2 \le C h^{k}|\textbf{u}|_{k+1}{|\hspace{-.02in}|\hspace{-.02in}|} \textbf{e}_h{|\hspace{-.02in}|\hspace{-.02in}|}. \end{aligned}$$
(5.8)

By the triangle inequality, (5.3) and (5.8), (5.5) holds. To estimate \(\Vert \varepsilon _h\Vert \), we have from (4.1) that

$$\begin{aligned} (\varepsilon _h, \nabla \cdot \textbf{v})=( \mu \nabla _w\textbf{e}_h,\nabla _w\textbf{v})-\ell _1(\textbf{u}, \textbf{v})+\ell _2(\textbf{u}, \textbf{v}). \end{aligned}$$

Using the equation above, (5.8), (5.1) and (5.2), we arrive at

$$\begin{aligned} |(\varepsilon _h, \nabla \cdot \textbf{v})|\le C { \mu }h^{k}|\textbf{u}|_{k+1}{|\hspace{-.02in}|\hspace{-.02in}|}\textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}. \end{aligned}$$

Combining the above estimate with the inf-sup condition (3.13) gives

$$\begin{aligned} \Vert \varepsilon _h\Vert \le \beta ^{-1} \frac{ (\varepsilon _h, \nabla \cdot \textbf{v})}{{|\hspace{-.02in}|\hspace{-.02in}|}\textbf{v}{|\hspace{-.02in}|\hspace{-.02in}|}} \le C \mu h^{k}|\textbf{u}|_{k+1}, \end{aligned}$$

which yields the estimate (5.6) and completes the proof of the theorem. \(\square \)

6 Error estimates in \(L^2\) norm

In this section, we shall derive an \(L^2\)-error estimate for the velocity approximation through a duality argument. Recall that \(\textbf{e}_h=\Pi _h\textbf{u}-\textbf{u}_h\) and \(\textbf{E}_h=\textbf{u}-\textbf{u}_h\). To this end, consider the problem of seeking \((\textbf{U},\xi )\) such that

$$\begin{aligned} -\mu \Delta \textbf{U}+\nabla \xi&=\textbf{E}_h&\quad \text{ in }\;\Omega , \end{aligned}$$
(6.1)
$$\begin{aligned} \nabla \cdot \textbf{U}&=0&\quad \text{ in }\;\Omega ,\nonumber \\ \textbf{U}&= 0&\quad \text{ on }\;\partial \Omega . \end{aligned}$$
(6.2)

Assume that the dual problem has the \([H^{2}(\Omega )]^d\times H^1(\Omega )\)-regularity property in the sense that the solution \((\textbf{U},\xi )\in [H^{2}(\Omega )]^d\times H^1(\Omega )\) and the following a priori estimate holds true:

$$\begin{aligned} \mu \Vert \textbf{U}\Vert _{2}+\Vert \xi \Vert _1\le C\Vert \textbf{E}_h\Vert . \end{aligned}$$
(6.3)

Lemma 6.1

For \(\textbf{E}_h=\textbf{u}-\textbf{u}_h\) and \(\ell _1(\cdot ,\cdot )\) defined in (4.3), the following equation holds true,

$$\begin{aligned} \Vert \textbf{E}_h\Vert ^2= & {} \ell _1(\textbf{u},\Pi _h\textbf{U})+\ell _3(\textbf{U},\ \textbf{E}_h)+\ell _4(\textbf{U},\ \textbf{E}_h) -\ell _1(\textbf{U}, \textbf{E}_h). \end{aligned}$$
(6.4)

where

$$\begin{aligned} \ell _3(\textbf{U},\ \textbf{E}_h)= & {} (\mu \nabla _w(\textbf{U}-\Pi _h\textbf{U}), \;\nabla _w\textbf{E}_h),\\ \ell _4(\textbf{U},\;\textbf{E}_h)= & {} \mu \sum _{T\in \mathcal {T}_h}\sum _i(\nabla \textbf{U}-\mathbb {Q}_h\nabla \textbf{U},\ \nabla \textbf{E}_h)_{T_i} \end{aligned}$$

Proof

Testing (6.1) by \(\textbf{E}_h\) gives

$$\begin{aligned} (\textbf{E}_h, \textbf{E}_h)= & {} -(\mu \Delta \textbf{U},\;\textbf{E}_h)+(\nabla \xi ,\ \textbf{E}_h). \end{aligned}$$
(6.5)

Using integration by parts, (3.4) and the fact \(\sum _{T\in \mathcal {T}_h}\sum _i{\langle }\nabla \textbf{U}\cdot \textbf{n},\{\textbf{E}_h\}{\rangle }_{{\partial T}_i}=0\), then

$$\begin{aligned}&\quad \ -(\mu \Delta \textbf{U},\textbf{E}_h) \\&= \mu \sum _{T\in \mathcal {T}_h}\sum _i\big ((\nabla \textbf{U},\ \nabla \textbf{E}_h)_{T_i}-{\langle }\nabla \textbf{U}\cdot \textbf{n},\ \textbf{E}_h- \{\textbf{E}_h\}{\rangle }_{{\partial T}_i}\big )\\&= \mu \sum _{T\in \mathcal {T}_h}\sum _i\big ((\mathbb {Q}_h\nabla \textbf{U},\ \nabla \textbf{E}_h)_{T_i} +(\nabla \textbf{U}-\mathbb {Q}_h\nabla \textbf{U},\ \nabla \textbf{E}_h)_{T_i} \\&\quad \ -{\langle }\nabla \textbf{U}\cdot \textbf{n},\ \textbf{E}_h- \{\textbf{E}_h\}{\rangle }_{{\partial T}_i}\big )\\&= \mu \sum _{T\in \mathcal {T}_h}\sum _i\big (-(\nabla \cdot \mathbb {Q}_h\nabla \textbf{U},\ \textbf{E}_h)_{T_i} +{\langle }\mathbb {Q}_h\nabla \textbf{U}\cdot \textbf{n},\ \textbf{E}_h{\rangle }_{{\partial T}_i} \\&\quad \ -{\langle }\nabla \textbf{U}\cdot \textbf{n},\ \textbf{E}_h- \{\textbf{E}_h\}{\rangle }_{\partial {\partial T}_i}\big ) + \ell _4(\textbf{U},\;\textbf{E}_h)\\&= \mu \sum _{T\in \mathcal {T}_h}\sum _i\big ((\mathbb {Q}_h\nabla \textbf{U},\ \nabla _w\textbf{E}_h)_{T_i} +{\langle }\mathbb {Q}_h\nabla \textbf{U}\cdot \textbf{n},\ \textbf{E}_h-\{\textbf{E}_h\}{\rangle }_{{\partial T}_i} \\&\quad \ -{\langle }\nabla \textbf{U}\cdot \textbf{n},\ \textbf{E}_h- \{\textbf{E}_h\}{\rangle }_{{\partial T}_i}\big ) + \ell _4(\textbf{U},\;\textbf{E}_h)\\&= \mu \sum _{T\in \mathcal {T}_h}\sum _i(\mathbb {Q}_h\nabla \textbf{U},\ \nabla _w\textbf{E}_h)_{T_i}+\ell _4(\textbf{U},\;\textbf{E}_h)-\ell _1(\textbf{U},\textbf{E}_h). \end{aligned}$$

It follows from (3.5) that

$$\begin{aligned}{} & {} \mu \sum _{T\in \mathcal {T}_h}\sum _i(\mathbb {Q}_h\nabla \textbf{U},\ \nabla _w\textbf{E}_h)_{T_i}\\= & {} \mu \sum _{T\in \mathcal {T}_h}\sum _i(\nabla _w \textbf{U},\;\nabla _w\textbf{E}_h)_{T_i}\\= & {} \mu \sum _{T\in \mathcal {T}_h}\sum _i\big ((\nabla _w \Pi _h\textbf{U},\;\nabla _w\textbf{E}_h)_{T_i} +(\nabla _w (\textbf{U}-\Pi _h\textbf{U}),\;\nabla _w\textbf{E}_h)_{T_i}\big )\\= & {} \mu (\nabla _w \Pi _h\textbf{U},\;\nabla _w\textbf{E}_h)+\ell _3(\textbf{U},\ \textbf{E}_h). \end{aligned}$$

The two equations above imply that

$$\begin{aligned} \begin{aligned}&\quad \ -(\mu \Delta \textbf{U},\textbf{E}_h)\\&= (\mu \nabla _w \Pi _h\textbf{U},\;\nabla _w\textbf{E}_h) +\ell _3(\textbf{U},\ \textbf{E}_h)+\ell _4(\textbf{U},\ \textbf{E}_h) -\ell _1(\textbf{U},\textbf{E}_h). \end{aligned} \end{aligned}$$
(6.6)

Letting \(\textbf{v}=\Pi _h\textbf{U}\) in (4.11) gives

$$\begin{aligned} (\mu \nabla _w\textbf{E}_h,\nabla _w\Pi _h\textbf{U})-( \varepsilon _h,\nabla \cdot \Pi _h\textbf{U})=\ell _1(\textbf{u},\Pi _h\textbf{U}). \end{aligned}$$
(6.7)

It follows from (3.2) and (6.2)

$$\begin{aligned} (\varepsilon _h,\nabla \cdot \Pi _h\textbf{U}) =(\varepsilon _h,\nabla \cdot \textbf{U})=0. \end{aligned}$$
(6.8)

The Eqs. (6.7) and (6.8) give

$$\begin{aligned} (\mu \nabla _w \Pi _h\textbf{U},\;\nabla _w\textbf{E}_h)=\ell _1(\textbf{u},\Pi _h \textbf{U}). \end{aligned}$$
(6.9)

Using(6.9), the Eq. (6.6) becomes

$$\begin{aligned} \begin{aligned}&\quad \ -(\mu \Delta \textbf{U},\textbf{E}_h)\\&= \ell _1(\textbf{u},\Pi \textbf{U})+\ell _3(\textbf{U},\ \textbf{E}_h)+\ell _4(\textbf{U},\ \textbf{E}_h) -\ell _1(\textbf{U},\textbf{E}_h). \end{aligned} \end{aligned}$$
(6.10)

It follows from integration by parts and (1.2) and (3.8)

$$\begin{aligned} (\nabla \xi ,\ \textbf{E}_h)= & {} 0. \end{aligned}$$
(6.11)

Combining (6.10)–(6.11) with (6.5), we have obtained (6.4). \(\square \)

Theorem 6.1

Let \((\textbf{u}_h,p_h)\in V_h\times W_h\) be the solution of (3.7)–(3.8). Assume that (6.3) holds true. Then, we have

$$\begin{aligned} \Vert \textbf{u}-\textbf{u}_h\Vert \le Ch^{k+1}|\textbf{u}|_{k+1}. \end{aligned}$$
(6.12)

Proof

By Lemma 6.1, we have

$$\begin{aligned} \Vert \textbf{E}_h\Vert ^2= & {} \ell _1(\textbf{u},\Pi _h\textbf{U})+\ell _3(\textbf{U},\ \textbf{E}_h)+\ell _4(\textbf{U},\ \textbf{E}_h) -\ell _1(\textbf{U}, \textbf{E}_h). \end{aligned}$$
(6.13)

Next we will estimate all the terms on the right hand side of (6.13). Using the Cauchy–Schwarz inequality, the trace inequality (3.9) and the definitions of \(\Pi _h\) and \(\mathbb {Q}_h\) we obtain

$$\begin{aligned}&\quad \ |\ell _1(\textbf{u},\Pi _h\textbf{U})|\\&\le \mu \left| \sum _{T\in \mathcal {T}_h}\sum _i \langle (\nabla \textbf{u}-\mathbb {Q}_h\nabla \textbf{u})\cdot \textbf{n},\; \Pi _h\textbf{U}-\{\Pi _h\textbf{U}\}\rangle _{{\partial T}_i} \right| \\&\le \mu \bigg [\sum _{T\in \mathcal {T}_h}\sum _i\Vert \nabla \textbf{u}-\mathbb {Q}_h\nabla \textbf{u}\Vert ^2_{{\partial T}_i}\bigg ]^{\frac{1}{2}} \bigg [\sum _{T\in \mathcal {T}_h}\sum _i\Vert \Pi _h\textbf{U}-\{\Pi _h\textbf{U}\}\Vert ^2_{{\partial T}_i}\bigg ]^{\frac{1}{2}}\nonumber \\&\le C\mu \bigg [\sum _{T\in \mathcal {T}_h}\sum _ih\Vert \nabla \textbf{u}-\mathbb {Q}_h\nabla \textbf{u}\Vert ^2_{{\partial T}_i}\bigg ]^{\frac{1}{2}} \bigg [\sum _{T\in \mathcal {T}_h}\sum _ih^{-1}\Vert [\Pi _h\textbf{U}-\textbf{U}]\Vert ^2_{{\partial T}_i}\bigg ]^{\frac{1}{2}} \nonumber \\&\le C\mu h^{k+1}|\textbf{u}|_{k+1}|\textbf{U}|_2. \end{aligned}$$

It follows from (5.3) and (5.5) that

$$\begin{aligned} \ell _3(\textbf{U},\ \textbf{E}_h)= & {} \mu \left| \sum _{T\in \mathcal {T}_h}\sum _i(\nabla _w(\textbf{U}-\Pi _h\textbf{U}),\;\nabla _w \textbf{E}_h)_{T_i}\right| \\\le & {} C\mu {|\hspace{-.02in}|\hspace{-.02in}|} \textbf{E}_h{|\hspace{-.02in}|\hspace{-.02in}|} {|\hspace{-.02in}|\hspace{-.02in}|} \textbf{U}-\Pi _h\textbf{U}{|\hspace{-.02in}|\hspace{-.02in}|}\\\le & {} C\mu h^{k+1}|\textbf{u}|_{k+1}|\textbf{U}|_2. \end{aligned}$$

The norm equivalence (3.12) and (5.8) imply

$$\begin{aligned}&\quad \ \ell _4(\textbf{U},\ \textbf{E}_h)\\&= \mu \left| \sum _{T\in \mathcal {T}_h}\sum _i(\nabla \textbf{U}-\mathbb {Q}_h\nabla \textbf{U},\ \nabla \textbf{E}_h)_{T_i}\right| \\&\le C\mu \left( \sum _{T\in \mathcal {T}_h}\sum _i\Vert \nabla \textbf{E}_h\Vert _{T_i}^2\right) ^{1/2} \left( \sum _{T\in \mathcal {T}_h}\sum _i\Vert \nabla \textbf{U}-\mathbb {Q}_h\nabla \textbf{U}\Vert _{T_i}^2\right) ^{1/2}\\&\le C\mu \left( \sum _{T\in \mathcal {T}_h}\sum _i(\Vert \nabla (\textbf{u}-\Pi _h\textbf{u})\Vert _{T_i}^2+\Vert \nabla (\Pi _h\textbf{u}-\textbf{u}_h)\Vert _{T_i}^2)\right) ^{1/2}\\&\quad \cdot \left( \sum _{T\in \mathcal {T}_h}\sum _i\Vert \nabla \textbf{U}-\mathbb {Q}_h\nabla \textbf{U}\Vert _{T_i}^2\right) ^{1/2}\\&\le C\mu h|\textbf{U}|_2(h^k|\textbf{u}|_{k+1}+{|\hspace{-.02in}|\hspace{-.02in}|} \Pi _h\textbf{u}-\textbf{u}_h{|\hspace{-.02in}|\hspace{-.02in}|})\\&\le C\mu h^{k+1}|\textbf{u}|_{k+1}|\textbf{U}|_2. \end{aligned}$$

Using (3.12) and (5.8), we obtain

$$\begin{aligned} |\ell _1(\textbf{U},\textbf{E}_h)|= & {} \mu \left| \sum _{T\in \mathcal {T}_h}\sum _i{\langle }(\nabla \textbf{U}-\mathbb {Q}_h\nabla \textbf{U})\cdot \textbf{n},\ \textbf{E}_h-\{\textbf{E}_h\}{\rangle }_{{\partial T}_i}\right| \\\le & {} \mu \sum _{T\in \mathcal {T}_h}\sum _i h_{T_i}^{1/2}\Vert \nabla \textbf{U}-\mathbb {Q}_h\nabla \textbf{U}\Vert _{{\partial T}_i} h_{T_i}^{-1/2}\Vert [\textbf{E}_h]\Vert _{{\partial T}_i}\\\le & {} C\mu h\Vert \textbf{U}\Vert _2\left( \sum _{T\in \mathcal {T}_h}\sum _ih_{T_i}^{-1}(\Vert [\textbf{e}_h]\Vert ^2_{{\partial T}_i}+\Vert [\textbf{u}-\Pi _h\textbf{u}]\Vert ^2_{{\partial T}_i}\right) ^{1/2}\\\le & {} C \mu h\Vert \textbf{U}\Vert _2({|\hspace{-.02in}|\hspace{-.02in}|}\textbf{e}_h{|\hspace{-.02in}|\hspace{-.02in}|}+\left( \sum _{T\in \mathcal {T}_h}\sum _ih_{T_i}^{-1}\Vert [\textbf{u}-\Pi _h\textbf{u}]\Vert ^2_{{\partial T}_i}\right) ^{1/2}\\\le & {} C\mu h^{k+1 }|\textbf{u}|_{k+1}\Vert \textbf{U}\Vert _2. \end{aligned}$$

Combining all the estimates above with (6.4) yields

$$\begin{aligned} \Vert \textbf{E}_h\Vert ^2 \le C \mu h^{k+1}|\textbf{u}|_{k+1} \Vert \textbf{U}\Vert _2. \end{aligned}$$

The estimate (6.12) follows from the above inequality and the regularity assumption (6.3). We have completed the proof. \(\square \)

7 Numerical experiments

In the 2D numerical test, we solve the Stokes equations (1.1)–(1.3) on the domain of unit square \(\Omega =(0,1)^2\). We choose an exact solution

$$\begin{aligned} \begin{aligned} \textbf{u}(x,y)&=\begin{pmatrix} \partial _y g(x,y) \\ -\partial _x g(x,y)\end{pmatrix}, \quad p(x,y) = \frac{1}{5} - x^4, \end{aligned} \end{aligned}$$
(7.1)

where \(g(x,y)=2^6 x^2(1-x)^2 y^2 (1-y^2)\). We compute this problem on the uniform rectangular meshes shown in Fig. 4. Though we have rectangular meshes, we do not use the rectangular BDFM \(H({\text {div}})\) finite element, but the new macro-triangular \(H({\text {div}})\) finite element, \(V_h\) defined in (3.1). We list the numerical results in Tables 1, 2, 3 and 4. Optimal orders of convergence are achieved in all cases, for all variables, all degree elements, and in all norms. We can see the method is pressure robust, as the error of \(\textbf{u}_h\) is independent of the viscosity \(\mu \). When computing \(P_4\) elements in Table 4, the round-off error is too large that the error of \(\textbf{u}_h\) is affected slightly. As proved in the theory, the pressure error is of order \(O(\mu h^k)\) shown in the tables. Again, when \(\mu =10^{-8}\), the error of \(p_h\) is too small that the round-off error alters the order of convergence.

Fig. 4
figure 4

The first three levels of grids for the computation in Tables 1, 2, 3 and 4

Table 1 Error profile on the meshes show as in Fig. 4 for the solution (7.1)
Table 2 Error profile on the meshes show as in Fig. 4 for the solution (7.1)
Table 3 Error profile on the meshes show as in Fig. 4 for the solution (7.1)
Table 4 Error profile on the meshes show as in Fig. 4 for the solution (7.1)

In the 3D numerical test, we solve the Stokes equations (1.1)–(1.3) on the domain of unit cube \(\Omega =(0,1)^3\), with a non-homogeneous boundary condition, and an exact solution

$$\begin{aligned} \begin{aligned} \textbf{u}(x,y,z)&=\begin{pmatrix} y^4 \\ z^4 \\ x^4 \end{pmatrix}, \quad p(x,y) = \frac{1}{2} - x. \end{aligned} \end{aligned}$$
(7.2)

We compute the problem on the uniform cubic meshes shown in Fig. 5. Again, we do not use the cubic BDFM \(H({\text {div}})\) finite element, but the new macro-tetrahedral \(H({\text {div}})\) finite element, \(V_h\) defined in (3.1). We list the numerical results in Tables 5 and 6. Optimal orders of convergence are achieved in all cases, for all variables, all degree elements, and in all norms. The method is pressure robust, which implies the error of velocity \(\textbf{u}_h\) remains same as \(\mu \rightarrow 0\). The error of pressure \(p_h\) at \(\mu =10^{-8}\) is precisely \(10^{-8}\) of that at \(\mu =1\).

Fig. 5
figure 5

The first three levels of grids for the computation in Tables 5 and 6

Table 5 Error profile on the meshes show as in Fig. 5 for the solution (7.2)
Table 6 Error profile on the meshes show as in Fig. 5 for the solution (7.2)