1 Introduction

Finite element methods are a successful and well-established technique for the solution of partial differential equations. Key tools for the quality assessment of a given finite element approximation and the application of adaptive techniques are so-called a posteriori error estimators. These are functionals that are computable in terms of data and the finite element approximation and aim at quantifying the approximation error. For all known estimators, their actual relationship to the error is spoiled by oscillation, i.e., by some additive terms measuring distances between non-discrete and discrete data. Remarkably, oscillation may be even greater than the error. This flaw directly interferes with the quality assessment and, on top of that, it weakens results on adaptive methods and complicates their proofs.

In this article we introduce a new approach to a posteriori error estimation, where oscillation is error-dominated, i.e. it is bounded by the error of the finite element approximation, up to a multiplicative constant depending on the shape-regularity of the underlying mesh.

We illustrate this new approach in the simplest case, where the weak solution \(u\in H^1_0(\Omega )\) of the Dirichlet-Poisson problem

$$\begin{aligned} -\Delta u = f\quad \text {in}~\Omega , \qquad u=0\quad \text {on}~\partial \Omega \end{aligned}$$
(1.1)

is approximated by the Galerkin approximation U that is continuous and piecewise affine over some simplicial mesh \({\mathcal {M}}\) of \(\Omega \). It is instructive to start by recalling the a posteriori error bounds in terms of the standard residual estimator

$$\begin{aligned} {\mathcal {E}}_{\mathrm {R}}(U,f,{\mathcal {M}}) \mathrel {:=}\left( \sum _{K \in {\mathcal {M}}} h_K \Vert J(U)\Vert _{L^2(\partial K)}^2 + h_K ^2 \Vert f\Vert _{L^2(K)}^2 \right) ^{1/2}; \end{aligned}$$
(1.2)

see, e.g., Ainsworth and Oden [2] or Verfürth [26]. If \(f\in L^2(\Omega )\), then the energy norm error \(\Vert u-U\Vert _{H^1_0(\Omega )}\) and the estimator are almost equivalent. More precisely, we have

$$\begin{aligned} \Vert u-U\Vert _{H^1_0(\Omega )} \!\lesssim \! {\mathcal {E}}_{\mathrm {R}}(U,f,{\mathcal {M}}), \qquad {\mathcal {E}}_{\mathrm {R}}(U,f,{\mathcal {M}}) \!\lesssim \! \Vert u-U\Vert _{H^1_0(\Omega )} \!+\! {{\,\mathrm{osc}\,}}_0(f,{\mathcal {M}}),\nonumber \\ \end{aligned}$$
(1.3)

where the interfering oscillation is given by

$$\begin{aligned} {{\,\mathrm{osc}\,}}_0(f,{\mathcal {M}})^2 \mathrel {:=}\sum _{K \in {\mathcal {M}}} h_K ^2 \Vert f-P_{0,{\mathcal {M}}}f\Vert _{L^2(K)}^2 \quad \text {with}\quad P_{0,{\mathcal {M}}}f|_{K} \mathrel {:=}\frac{1}{|K |} \int _{K} f.\nonumber \\ \end{aligned}$$
(1.4)

Let us discuss the relationship of this classical \(L^2\)-oscillation and the energy norm error; for the proofs of the nontrivial statements, see Sect. 3.8. Customarily, oscillation is associated with higher order. This idea is supported by the following observation: if f is actually in \(H^1(\Omega )\), then \({{\,\mathrm{osc}\,}}_0(f,{\mathcal {M}})=O(h_{\mathcal {M}}^2)\) as \(h_{\mathcal {M}}\mathrel {:=}\max _{K \in {\mathcal {M}}} h_K \searrow 0\).
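The quadratic decay can be observed numerically. The following is a minimal sketch and not part of the analysis: it assumes a one-dimensional analogue of (1.4) on the uniform mesh of \((0,1)\) with n cells, approximates all integrals by a 3-point Gauss rule, and the names `osc0` and `halving_ratio` are hypothetical helpers introduced only for this illustration.

```python
import math

# 3-point Gauss-Legendre rule on [-1, 1] (exact for polynomials of degree <= 5).
_GAUSS = [(-math.sqrt(3 / 5), 5 / 9), (0.0, 8 / 9), (math.sqrt(3 / 5), 5 / 9)]


def osc0(f, n):
    """Approximate the 1D analogue of osc_0(f, M) on the uniform mesh with n cells."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        mid = (i + 0.5) * h
        # Quadrature points and weights mapped to the cell [i*h, (i+1)*h].
        pts = [(mid + 0.5 * h * x, 0.5 * h * w) for x, w in _GAUSS]
        mean = sum(w * f(x) for x, w in pts) / h              # P_0 f on the cell
        l2_sq = sum(w * (f(x) - mean) ** 2 for x, w in pts)   # ||f - P_0 f||^2 on the cell
        total += h ** 2 * l2_sq
    return math.sqrt(total)


def halving_ratio(f, n):
    """Ratio of the oscillation on n cells to the oscillation on 2n cells."""
    return osc0(f, n) / osc0(f, 2 * n)
```

For the smooth load \(f(x)=\sin (\pi x)\), the value `halving_ratio(lambda x: math.sin(math.pi * x), 16)` should be close to 4, in line with the order \(O(h_{\mathcal {M}}^2)\).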

On any fixed mesh however, the oscillation \({{\,\mathrm{osc}\,}}_0(f,{\mathcal {M}})\) may be arbitrarily greater than the energy norm error \(\Vert u-U\Vert _{H^1_0(\Omega )}\). This is a consequence of the fact that the \(L^2\)-norm is strictly stronger than the \(H^{-1}\)-norm. The use of the \(L^2\)-norm in (1.4) can be traced back to its use in the element residual \(h_K \Vert f\Vert _{L^2(K)}\) in (1.2) and so it can be motivated by the request for the computability of the estimator. In fact, in contrast to an element residual based upon some local \(H^{-1}\)-norm of f, this form reduces to the (approximate) computation of an integral.
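A one-dimensional sketch, under assumptions of our own, makes this gap concrete. On the uniform mesh of \((0,1)\) with n cells, take \(f_k(x)=\sin (2\pi k x)\) with k a multiple of n. Every cell then contains whole periods, so all cell means vanish and the oscillation equals \(h\Vert f_k\Vert _{L^2(0,1)} = h/\sqrt{2}\), whereas \(\Vert f_k\Vert _{H^{-1}(0,1)} = 1/(2\sqrt{2}\pi k)\) bounds the error from above (cf. Lemma 1 below). The function name is hypothetical.

```python
import math


def osc0_over_dual_norm(n, k):
    """Ratio osc_0(f_k, M) / ||f_k||_{H^{-1}(0,1)} for f_k(x) = sin(2*pi*k*x).

    M is the uniform mesh of (0, 1) with n cells; k must be a positive multiple
    of n so that each cell contains whole periods and all cell means vanish.
    """
    if k <= 0 or k % n != 0:
        raise ValueError("k must be a positive multiple of n")
    h = 1.0 / n
    osc0 = h / math.sqrt(2.0)                            # h * ||f_k||_{L^2(0,1)}
    # u_k = f_k / (2*pi*k)^2 solves -u'' = f_k with u(0) = u(1) = 0, and
    # ||f_k||_{H^{-1}} = ||u_k'||_{L^2} = 1 / (2*pi*k*sqrt(2)).
    dual = 1.0 / (2.0 * math.pi * k * math.sqrt(2.0))
    return osc0 / dual
```

The ratio equals \(2\pi k/n\) and thus grows without bound as \(k\rightarrow \infty \) on the fixed mesh.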

One may think that the use of the \(L^2\)-norm is the only reason for the possible relative largeness of oscillations like \({{\,\mathrm{osc}\,}}_0(f,{\mathcal {M}})\). Yet, Cohen, DeVore and Nochetto present in [10] a striking example which entails that even the \(H^{-1}\)-oscillation

$$\begin{aligned} \begin{aligned} \min _{g\in {\mathbb {P}}_0({\mathcal {M}})}\Vert f-g\Vert _{H^{-1}(\Omega )}^2&\\ \text {with}\quad&{\mathbb {P}}_0({\mathcal {M}}) \mathrel {:=}\{g\in L^\infty (\Omega ) \mid \forall K \in {\mathcal {M}}\; g|_{K} \text { is constant} \} \end{aligned} \end{aligned}$$
(1.5)

from Stevenson [23] may converge slower than the error; see Lemma 21 below. Notice that this contradicts the aforementioned idea that \({{\,\mathrm{osc}\,}}_0(f,{\mathcal {M}})\) is always of higher order and, moreover, in view of \({{\,\mathrm{osc}\,}}_0(f,{\mathcal {M}})\lesssim {\mathcal {E}}_{\mathrm {R}}(U,f,{\mathcal {M}})\), it entails that also the estimator \({\mathcal {E}}_{\mathrm {R}}(U,f,{\mathcal {M}})\) may decrease slightly slower than the error.

The key tool to overcome the shortcomings of the above oscillations is a new projection operator \({\mathcal {P}}_{{\mathcal {M}}}\) enjoying the following properties; see Sects. 3.3–3.6:

  • \({\mathcal {P}}_{{\mathcal {M}}}f\) is discrete for any functional \(f\in H^{-1}(\Omega )\). In comparison to \(P_{0,{\mathcal {M}}}\), the image of \({\mathcal {P}}_{{\mathcal {M}}}\) is enriched by the span of the face-supported Dirac distributions and so contains true functionals.

  • \({\mathcal {P}}_{{\mathcal {M}}}f\) is computable in a local manner. Here computable means that it can be determined from the information available in the linear systems for finite element approximations.

  • The local dual norms of the new oscillation \(f-{\mathcal {P}}_{{\mathcal {M}}}f\) are dominated by corresponding local errors. This property hinges on the face-supported Dirac distributions and on local \(H^{-1}\)-stability of \({\mathcal {P}}_{{\mathcal {M}}}f\).

  • In contrast to the local dual norms of the residual \(f+\Delta U\), the local dual norms of the discretized residual \({\mathcal {P}}_{{\mathcal {M}}}{f}+\Delta U\) can be estimated from below and above in a computable manner.

Thanks to these properties, we derive in Sect. 3.7 abstract a posteriori bounds such that the oscillation is bounded by the error. In Sect. 4 we provide several realizations leading to hierarchical estimators and estimators based on local problems or based on equilibrated fluxes. Furthermore, in Sect. 4.2 we show that an extension of the standard residual estimator (1.2) onto the image of \({\mathcal {P}}_{{\mathcal {M}}}\) satisfies

$$\begin{aligned} \Vert u-U\Vert _{H^1_0(\Omega )}^2 \eqsim {\mathcal {E}}_{\mathrm {R}}(U,{\mathcal {P}}_{{\mathcal {M}}}{f},{\mathcal {M}})^2 + \sum _{z\in {\mathcal {V}}} \Vert f-{\mathcal {P}}_{{\mathcal {M}}}{f}\Vert _{H^{-1}(\omega _z)}^2, \end{aligned}$$
(1.6)

where \({\mathcal {V}}\) stands for the set of vertices of \({\mathcal {M}}\) and \(\omega _z\) is the star around the vertex z. A comparison with (1.3) immediately yields:

  • Both \({\mathcal {E}}_{\mathrm {R}}(U,f,{\mathcal {M}})\) and the right-hand side of (1.6) bound the energy norm error in terms of U, f, and \({\mathcal {M}}\). However, while the latter one is free of overestimation, the first one may overestimate, even asymptotically.

  • Since \({\mathcal {P}}_{{\mathcal {M}}}{f}\) is discrete and computable in the aforementioned sense, we have that \({\mathcal {E}}_{\mathrm {R}}(U,{\mathcal {P}}_{{\mathcal {M}}}{f},{\mathcal {M}})\) is also computable, while \({\mathcal {E}}_{\mathrm {R}}(U,f,{\mathcal {M}})\) is not.

  • Equivalence (1.6) thus splits the estimation of the error in two parts, reflecting the spirit of Verfürth [26, Remark 1.8] and Ainsworth [1, Section 3.1]: One part is computable and related to the underlying differential operator. The other one depends solely on data; its computation, or rather estimation, hinges on a priori knowledge.

We remark that our new approach also has consequences in the convergence analysis of adaptive methods. In particular, it allows one to generalize and sharpen the basic convergence result for adaptive methods from [19]; see [15].

2 Model problem and discretization

In order to exemplify our new approach to a posteriori error estimation, we consider the homogeneous Dirichlet problem for Poisson’s equation and the energy norm error of the associated linear finite element solution. The purpose of this section is to recall the relevant properties of this boundary value problem and discretization.

We shall use the following notation associated with a (Lebesgue) measurable set \(\omega \) of \({\mathbb {R}}^d\), \(d\in {\mathbb {N}}\). Given \(m\in {\mathbb {N}}\), we let \(L^2(\omega ;{\mathbb {R}}^m)\) denote the Lebesgue space of square integrable functions over \(\omega \) with values in \({\mathbb {R}}^m\). We write \(\left\langle v,\,w\right\rangle _{\omega }\) and \(\Vert \cdot \Vert _{\omega }\) for its scalar product and its induced norm. For \(m=1\), we abbreviate \(L^2(\omega ;{\mathbb {R}})\) to \(L^2(\omega )\).

If \(\omega \subset {\mathbb {R}}^d\) is non-empty and open, \(H^1(\omega )\) stands for the Sobolev space of all functions in \(L^2(\omega )\) whose distributional gradient is also in \(L^2(\omega ;{\mathbb {R}}^d)\). Moreover, we let \(H^1_0(\omega )\) be the closure in \(H^1(\omega )\) of all infinitely differentiable functions with compact support in \(\omega \). If the boundary \(\partial \omega \) of \(\omega \) is sufficiently regular (e.g., Lipschitz), these are all functions in \(H^1(\omega )\) with vanishing trace on the boundary \(\partial \omega \). Thanks to Friedrichs’ inequality, \(H^1_0(\omega )\) is a Hilbert space with scalar product \(\left\langle \nabla \cdot ,\,\nabla \cdot \right\rangle _{\omega }\) and norm \(\Vert \nabla \cdot \Vert _{\omega }\). As usual, \(H^{-1}(\omega )\) indicates the dual space of \(H^1_0(\omega )\), i.e. the space of linear and continuous functionals on \(H^1_0(\omega )\). We identify \(L^2(\omega )\) with its dual space and thus have

$$\begin{aligned} H^1_0(\omega ) \subset L^2(\omega ) \subset H^{-1}(\omega ). \end{aligned}$$
(2.1)

The norm of \(H^{-1}(\omega )\) is given by

$$\begin{aligned} \Vert \ell \Vert _{H^{-1}(\omega )} := \sup _{w\in H^1_0(\omega )} \frac{\left\langle \ell ,\,w\right\rangle _\omega }{\Vert \nabla w\Vert _{\omega }}, \qquad \ell \in H^{-1}(\omega ), \end{aligned}$$

where the dual brackets \(\left\langle \ell ,\,w\right\rangle _\omega := \ell (w)\), \(w\in H^1_0(\omega )\), extend-restrict the scalar product in \(L^2(\omega )\). If \(D\subset {\mathbb {R}}^d\) is a set such that \(\mathring{D}\) is suitable for one of the preceding notations, we also use D instead of the more cumbersome \(\mathring{D}\), e.g. we write also \(H^1(D)\) instead of \(H^1(\mathring{D})\).

Let \(\Omega \) be an open, bounded and connected subset of \({\mathbb {R}}^d\) with Lipschitz boundary and whose closure can be subdivided into simplices. We shall omit \(\Omega \) in the notation of dual pairings and norms. The weak formulation of (1.1) reads as follows:

$$\begin{aligned} \begin{aligned} \text {Given} f\in H^{-1}(\Omega )\text {, find }&u=u_f\in H^1_0(\Omega ) \text { such that } \\&\forall v\in H^1_0(\Omega ) \quad \left\langle \nabla u,\,\nabla v\right\rangle _{} = \left\langle f,\,v\right\rangle . \end{aligned} \end{aligned}$$
(2.2)

In other words: we are looking for the Riesz representation of \(f\) in \(H^1_0(\Omega )\). Notice that the Riesz representation theorem establishes an isomorphism between the space \(H^1_0(\Omega )\) of solutions and the space \(H^{-1}(\Omega )\) of loads. In particular, a unique solution exists not only for \(f\in L^2(\Omega )\) but for all \(f\in H^{-1}(\Omega )\). This fact suggests that, at least conceptually, an approximation method for (2.2), along with its a posteriori analysis, should cover also loads in \(H^{-1}(\Omega )\).

In order to approximate the solution of (2.2), we use a Galerkin approximation based upon finite elements. For the sake of simplicity, we restrict ourselves to simplicial meshes and lowest order.

Let \({\mathcal {M}}\) be a simplicial, face-to-face (conforming) mesh of the domain \(\Omega \). Given an element \(K \in {\mathcal {M}}\), we denote by \(h_K:= {{\,\mathrm{diam}\,}}K:= \sup _{x,y\in K} |x-y|\) its diameter and by \(\rho _K:= \sup \{ {{\,\mathrm{diam}\,}}B \mid B \text { ball in }K \}\) the maximal diameter of inscribed balls. In what follows, ‘\(\lesssim \)’ stands for ‘\(\leqslant C\)’, where the generic constant C may depend on d and the shape coefficient

$$\begin{aligned} \sigma ({\mathcal {M}}) := \max _{K \in {\mathcal {M}}} \sigma _K \quad \text {with}\quad \sigma _K \mathrel {:=}\frac{h_K}{\rho _K}. \end{aligned}$$

If both inequalities ‘\(\lesssim \)’ and ‘\(\gtrsim \)’ hold, we shall use ‘\(\eqsim \)’ as shorthand.

An interelement face of \({\mathcal {M}}\) is a simplex \(F \) with d vertices arising as the intersection \(F =K _1\cap K _2\) of two uniquely determined elements \(K _1,K _2\in {\mathcal {M}}\). Its associated patch is

$$\begin{aligned} \omega _F \mathrel {:=}K _1\cup K _2. \end{aligned}$$
(2.3)

We let \({\mathcal {F}}={\mathcal {F}}({\mathcal {M}})\) denote the set of all \((d-1)\)-dimensional interelement faces of \({\mathcal {M}}\). Given \(F \in {\mathcal {F}}\) and \(K \in {\mathcal {M}}\) with \(F \subset K \), we write

$$\begin{aligned} h_{K;F} = \frac{d|K |}{|F |} \in [\rho _K,h_K ] \end{aligned}$$
(2.4)

for the height of \(K \) over \(F \).
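For a triangle (d = 2), the quantities \(h_K\), \(\rho _K\), \(\sigma _K\) and the heights in (2.4) are elementary to evaluate. The following sketch is only an illustration; it assumes a non-degenerate triangle given by three vertex coordinates, and the function name is hypothetical.

```python
import math


def triangle_geometry(a, b, c):
    """Return h_K, rho_K, sigma_K and the three heights h_{K;F} of a triangle (d = 2)."""
    edges = [math.dist(a, b), math.dist(b, c), math.dist(c, a)]
    area = 0.5 * abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))
    h_K = max(edges)                            # diameter of K = longest edge
    rho_K = 4.0 * area / sum(edges)             # incircle diameter = 2 * area / semi-perimeter
    heights = [2.0 * area / e for e in edges]   # h_{K;F} = d|K|/|F| with d = 2
    return h_K, rho_K, h_K / rho_K, heights
```

For the triangle with vertices (0,0), (3,0), (0,4), this gives \(h_K=5\), \(\rho _K=2\), \(\sigma _K=2.5\), and all three heights lie in \([\rho _K,h_K]\), as stated in (2.4).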

Furthermore, \({\mathcal {V}}={\mathcal {V}}({\mathcal {M}})\) stands for the set of all vertices of \({\mathcal {M}}\). To any vertex \(z\in {\mathcal {V}}\), we associate the sets

$$\begin{aligned} \omega _z := \bigcup \{ K \in {\mathcal {M}}: K \ni z\}, \quad \sigma _z := \bigcup \{F \in {\mathcal {F}}: F \ni z \}, \end{aligned}$$

for which we have

$$\begin{aligned} \#\{K \in {\mathcal {M}}\mid K \ni z\} \lesssim \#\{F \in {\mathcal {F}}\mid F \ni z \} \lesssim 1. \end{aligned}$$
(2.5)

If \(K \in {\mathcal {M}}\) with \(K \subset \omega _z\) for some \(z\in {\mathcal {V}}\), then the diameter \(h_z\) of \(\omega _z\) verifies

$$\begin{aligned} h_K \le h_z \lesssim h_K. \end{aligned}$$
(2.6)

Moreover, if e is a direction, i.e. \(e\in {\mathbb {R}}^d\) with \(|e|=1\), we write \(h_{z;e}\) for the maximal length of a line segment in \(\omega _z\) with direction e. Then

$$\begin{aligned} {\tilde{\rho }}_z \mathrel {:=}\inf _{|e|=1} h_{z;e} \end{aligned}$$
(2.7)

verifies

$$\begin{aligned} \rho _K \leqslant {\tilde{\rho }}_z \lesssim \rho _K \end{aligned}$$
(2.8)

whenever \(K \in {\mathcal {M}}\) with \(K \subset \omega _z\).

Let \({\mathbb {P}}_k\) be the space of polynomials of degree at most \(k\in {\mathbb {N}}_0\) over \({\mathbb {R}}^d\) and let

$$\begin{aligned} {\mathbb {P}}_k({\mathcal {M}}) := \big \{V \in L^\infty (\Omega ) \mid V|_{K}\in {\mathbb {P}}_k(K) \text { for all }K \in {\mathcal {M}}\big \} \end{aligned}$$

be its piecewise counterpart over \({\mathcal {M}}\). The space of continuous, piecewise affine functions over \({\mathcal {M}}\) is then

$$\begin{aligned} {\mathbb {V}}({\mathcal {M}}):= {\mathbb {P}}_1({\mathcal {M}})\cap H^1(\Omega ) = {\mathbb {P}}_1({\mathcal {M}})\cap C^0(\bar{\Omega }). \end{aligned}$$

Its nodal basis \(\{\phi _z\}_{z\in {\mathcal {V}}}\) is defined by

$$\begin{aligned} \phi _z\in {\mathbb {V}}({\mathcal {M}})\quad \text {such that}\quad \phi _z(y):=\delta _{zy}~\text {for all}~z,y\in {\mathcal {V}}. \end{aligned}$$

This basis provides the nodal value representation

$$\begin{aligned} V = \sum _{z\in {\mathcal {V}}} V(z) \phi _z \end{aligned}$$

for any \(V\in {\mathbb {V}}({\mathcal {M}})\) and the partition of unity

$$\begin{aligned} \sum _{z\in {\mathcal {V}}}\phi _z = 1 \quad \text {in }\Omega , \end{aligned}$$
(2.9)

where, for each vertex \(z\in {\mathcal {V}}\), we have \({{\,\mathrm{supp}\,}}\phi _z = \omega _z\), with skeleton \(\sigma _z\). Finally, we recall that, for any element \(K \in {\mathcal {M}}\) and any powers \(\alpha _z\in {\mathbb {N}}_0\), \(z\in {\mathcal {V}}\cap K \), we have

$$\begin{aligned} \int _{K} \prod _{z\in {\mathcal {V}}\cap K} \phi _z^{\alpha _z} = \frac{d!\prod _{z\in {\mathcal {V}}\cap K} \alpha _z!}{(\sum _{z\in {\mathcal {V}}\cap K} \alpha _z + d)!} |K |. \end{aligned}$$
(2.10)
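Formula (2.10) is straightforward to evaluate in exact arithmetic. The sketch below returns the integral divided by \(|K |\); the function name and the convention of passing one exponent per vertex of \(K \) are assumptions made for this illustration.

```python
from fractions import Fraction
from math import factorial


def barycentric_monomial_integral(alphas):
    """Return (1/|K|) * int_K prod_z phi_z^{alpha_z}, following (2.10).

    `alphas` lists one exponent per vertex of the d-simplex K, so len(alphas) = d + 1.
    """
    d = len(alphas) - 1
    numerator = factorial(d)
    for alpha in alphas:
        numerator *= factorial(alpha)
    return Fraction(numerator, factorial(sum(alphas) + d))
```

For instance, the mean value of a single nodal basis function over an element is \(1/(d+1)\), and for a triangle the product of the three nodal basis functions has mean value 1/60.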

The finite element functions satisfying the boundary condition in (2.2) form the space

$$\begin{aligned} {\mathbb {V}}_0({\mathcal {M}}):= \{ V\in {\mathbb {V}}({\mathcal {M}})\mid V(z) = 0 \text { for all } z\in {\mathcal {V}}\cap \partial \Omega \} = {\mathbb {P}}_1({\mathcal {M}}) \cap H^1_0(\Omega ). \end{aligned}$$

The associated Galerkin approximation \(U=U_{f;{\mathcal {M}}}\) is characterized by

$$\begin{aligned} U\in {\mathbb {V}}_0({\mathcal {M}})\quad \text {such that}\quad \forall V\in {\mathbb {V}}_0({\mathcal {M}})\quad \left\langle \nabla U,\,\nabla V\right\rangle _{}=\left\langle f,\,V\right\rangle . \end{aligned}$$
(2.11)

Notice that the right-hand side and so U are well-defined, also for \(f\in H^{-1}(\Omega )\), thanks to the conformity of \({\mathbb {V}}_0({\mathcal {M}})\). Céa’s lemma states that the Galerkin approximation is the best approximation with respect to the energy norm error, i.e.,

$$\begin{aligned} \Vert \nabla u-\nabla U\Vert _{\Omega } \le \Vert \nabla u -\nabla V\Vert _{\Omega } \qquad \text {for all}~V\in {\mathbb {V}}_0({\mathcal {M}}). \end{aligned}$$
(2.12)

In order to determine the Galerkin approximation U, one usually obtains its values at the interior vertices \({\mathcal {V}}_0 := {\mathcal {V}}\cap \Omega \) by solving the symmetric positive definite linear system

$$\begin{aligned} M\alpha = F, \end{aligned}$$

where

$$\begin{aligned} \alpha =(U(z))_{z\in {\mathcal {V}}_0}, \quad M=\big ( \left\langle \nabla \phi _z,\,\nabla \phi _y\right\rangle _{} \big )_{y,z\in {\mathcal {V}}_0}, \quad F=(\left\langle f,\,\phi _y\right\rangle )_{y\in {\mathcal {V}}_0}. \end{aligned}$$
(2.13)

We thus see that the Galerkin approximation U is computable whenever the load evaluations

$$\begin{aligned} \left\langle f,\,\phi _y\right\rangle , y \in {\mathcal {V}}_0, \text { are known exactly.} \end{aligned}$$
(2.14)

Strictly speaking, these evaluations are in general not computable. In fact, even if \(f\in L^2(\Omega )\) is a function, the evaluation of \(\left\langle f,\,\phi _y\right\rangle = \int _{\Omega } f\phi _y\) requires the computation of an integral, which in general can be done only approximately by means of numerical integration. Notwithstanding, error analyses of approximations like (2.11) have proved very useful for the theoretical understanding and underpinning of finite element methods and are therefore very common. Accordingly, we shall suppose that the evaluations (2.14) are known to us. In Sect. 3.6 below, we will discuss which kind of additional information is used in our a posteriori analysis.
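To make the role of the load evaluations (2.14) explicit, the following sketch solves a one-dimensional analogue of the system (2.13) on the uniform mesh of \((0,1)\) with \(n\geqslant 2\) cells. The one-dimensional setting, the tridiagonal (Thomas) solver and the names `solve_poisson_1d` and `load` are assumptions of this illustration; the only information about f that enters is the vector of evaluations \(\left\langle f,\,\phi _y\right\rangle \).

```python
def solve_poisson_1d(n, load):
    """Solve M alpha = F for the 1D analogue of (2.13) on a uniform mesh of (0, 1).

    `load(i)` must return <f, phi_i> for the interior node x_i = i/n, i = 1..n-1.
    Returns the nodal values U(x_1), ..., U(x_{n-1}). Requires n >= 2.
    """
    h = 1.0 / n
    m = n - 1
    diag = [2.0 / h] * m          # <phi_i', phi_i'> = 2/h
    off = -1.0 / h                # <phi_i', phi_{i+1}'> = -1/h
    rhs = [load(i) for i in range(1, n)]
    # Forward elimination of the tridiagonal system (Thomas algorithm).
    for i in range(1, m):
        factor = off / diag[i - 1]
        diag[i] -= factor * off
        rhs[i] -= factor * rhs[i - 1]
    # Back substitution.
    alpha = [0.0] * m
    alpha[-1] = rhs[-1] / diag[-1]
    for i in range(m - 2, -1, -1):
        alpha[i] = (rhs[i] - off * alpha[i + 1]) / diag[i]
    return alpha
```

For \(f=1\) we have \(\left\langle f,\,\phi _i\right\rangle = h\), and calling `solve_poisson_1d(n, lambda i: 1.0 / n)` should reproduce the nodal values of \(u(x)=x(1-x)/2\) up to rounding, since the linear Galerkin approximation is nodally exact in one dimension.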

3 A posteriori analysis with error-dominated oscillation

We present our new approach to a posteriori error analysis by deriving bounds for the energy norm error of the Galerkin approximation (2.11). The new idea for achieving error-dominated oscillation is described in Sect. 3.3.

3.1 Residual norms

Given some load \(f\in H^{-1}(\Omega )\) and a Galerkin approximation \(U_{f;{\mathcal {M}}}\), we want to quantify the energy norm error \(\Vert \nabla (u_f-U_{f;{\mathcal {M}}})\Vert _{}\), where the exact solution \(u_f\) of (2.2) is typically unknown to us.

Our starting point is the so-called residual \({{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\in H^{-1}(\Omega )\) given by

$$\begin{aligned} \left\langle {{\,\mathrm{Res}\,}}(f;{\mathcal {M}}),\,v\right\rangle := \left\langle f,\,v\right\rangle - \left\langle \nabla U_{f;{\mathcal {M}}},\,\nabla v\right\rangle _{} \quad \text {for all } v\in H^1_0(\Omega ). \end{aligned}$$

It is defined in terms of data and the computable Galerkin approximation and vanishes if and only if the latter equals the exact solution. The following lemma shows that appropriately measuring the size of the residual relates to the error.

Lemma 1

(Error, residual and load) We have

$$\begin{aligned} \Vert \nabla (u_f-U_{f;{\mathcal {M}}})\Vert _{} = \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\Omega )} \leqslant \Vert f\Vert _{H^{-1}(\Omega )}. \end{aligned}$$

Proof

Thanks to the differential equation in (2.2), we have, for all \(v\in H^1_0(\Omega )\),

$$\begin{aligned} \left\langle {{\,\mathrm{Res}\,}}(f;{\mathcal {M}}),\,v\right\rangle = \left\langle \nabla (u_f-U_{f;{\mathcal {M}}}),\,\nabla v\right\rangle _{} = \left\langle -\Delta (u_f-U_{f;{\mathcal {M}}}),\,v\right\rangle , \end{aligned}$$
(3.1)

where \(-\Delta \) indicates the distributional Laplacian. Consequently, the claimed equality follows from the fact that \(-\Delta :H^1_0(\Omega ) \rightarrow H^{-1}(\Omega )\) is an isometry (which follows from the Cauchy–Schwarz inequality in \(L^2(\Omega )\) and from testing with \(v=u_f-U_{f;{\mathcal {M}}}\)). The claimed inequality follows by invoking also (2.12):

$$\begin{aligned} \Vert \nabla (u_f-U_{f;{\mathcal {M}}})\Vert _{} \leqslant \Vert \nabla u_f\Vert _{} = \Vert f\Vert _{H^{-1}(\Omega )}. \end{aligned}$$

\(\square \)
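As a sanity check of Lemma 1, consider again a one-dimensional analogue, which is our own assumption: \(-u''=1\) on \((0,1)\) with \(u(x)=x(1-x)/2\), uniform mesh with n cells, and U the nodal interpolant of u (which coincides with the Galerkin approximation in one dimension). The sketch below computes the squared norms exactly; the function name is hypothetical.

```python
from fractions import Fraction


def squared_norms(n):
    """Return (||u'||^2, ||U'||^2, ||(u-U)'||^2) for -u'' = 1 on (0, 1), n uniform cells."""
    h = Fraction(1, n)

    def u(x):
        return x * (1 - x) / 2

    exact = Fraction(1, 12)   # int_0^1 (1/2 - x)^2 dx = ||u'||^2 = ||f||_{H^{-1}}^2
    # U' is constant on each cell and equals the difference quotient of u.
    discrete = sum(((u((i + 1) * h) - u(i * h)) / h) ** 2 * h for i in range(n))
    # Galerkin orthogonality: ||(u-U)'||^2 = ||u'||^2 - ||U'||^2.
    return exact, discrete, exact - discrete
```

The squared error equals \(h^2/12\), which is indeed bounded by \(\Vert f\Vert _{H^{-1}(0,1)}^2 = 1/12\).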

Thus, we aim now at quantifying the dual norm \(\Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\Omega )}\). The following simple observation shows that this task requires much more information than computing the Galerkin approximation.

Lemma 2

(Bounding residual norms) Without any a priori information on the load \(f\in H^{-1}(\Omega )\), the residual norm \(\Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\Omega )}\) cannot be bounded in terms of a finite number of adaptive evaluations of the form: \(\left\langle f,\,v\right\rangle \) with \(v\in H^1_0(\Omega )\).

Proof

Suppose that the claim is false. Then, for each \(f\in H^{-1}(\Omega )\), there is a number \(B(f) \geqslant \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\Omega )}\) which is given in terms of evaluations \(\left\langle f,\,v_i\right\rangle \), \(i=1,\dots ,n_f\), where the choice of \(v_i\) may depend deterministically on the previous evaluations \(\left\langle f,\,v_1\right\rangle , \dots , \left\langle f,\,v_{i-1}\right\rangle \). Fix some functional \(0\ne \ell \in H^{-1}(\Omega )\). Since \(H^1_0(\Omega )\) is infinite-dimensional, we can choose a normalized \(w \in H^1_0(\Omega )\) that is perpendicular to \({\mathbb {V}}_0({\mathcal {M}})\) and all test functions \(v_i\), \(i=1,\dots ,n_{\ell }\), associated with \(\ell \). Set \(\delta := 3B(\ell ) (-\Delta )w\) and observe that \(U_{\delta ;{\mathcal {M}}}=0\) and \(\left\langle \delta ,\,v_i\right\rangle =0\) for all \(i=1,\dots ,n_{\ell }\). Therefore \(\left\langle \ell +\delta ,\,v_i\right\rangle =\left\langle \ell ,\,v_i\right\rangle \) and we obtain the contradiction

$$\begin{aligned} B(\ell )&= B(\ell + \delta ) \geqslant \Vert {{\,\mathrm{Res}\,}}(\ell + \delta ;{\mathcal {M}})\Vert _{H^{-1}(\Omega )} = \Vert \delta + {{\,\mathrm{Res}\,}}(\ell ;{\mathcal {M}})\Vert _{H^{-1}(\Omega )} \\&\geqslant \Vert \delta \Vert _{H^{-1}(\Omega )} - \Vert {{\,\mathrm{Res}\,}}(\ell ;{\mathcal {M}})\Vert _{H^{-1}(\Omega )} \geqslant 3 B(\ell ) - B(\ell ) = 2 B(\ell ) > 0. \end{aligned}$$

\(\square \)
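The mechanism behind this proof can be illustrated in one dimension. This is a sketch under our own assumptions and covers only the evaluations against the nodal basis, not arbitrary adaptive test functions. On the uniform mesh of \((0,1)\) with n cells, the function \(w(x)=a\sin (n\pi x)\) vanishes at all vertices, hence \(\delta =-w''\) satisfies \(\left\langle \delta ,\,\phi _i\right\rangle = \int w'\phi _i' = 0\) for every nodal basis function, while \(\Vert \delta \Vert _{H^{-1}(0,1)} = \Vert w'\Vert = a n\pi /\sqrt{2}\) is as large as we wish.

```python
import math


def hidden_perturbation(n, amplitude):
    """1D illustration: delta = -w'' with w(x) = amplitude * sin(n*pi*x) on (0, 1).

    Returns the load vector (<delta, phi_i>)_i on the uniform mesh with n cells
    and the dual norm ||delta||_{H^{-1}(0,1)} = ||w'||_{L^2(0,1)}.
    """
    h = 1.0 / n

    def w(x):
        return amplitude * math.sin(n * math.pi * x)

    nodes = [i * h for i in range(n + 1)]
    # phi_i' equals 1/h on the left cell and -1/h on the right cell, so
    # <delta, phi_i> = int w' phi_i' = (2 w(x_i) - w(x_{i-1}) - w(x_{i+1})) / h.
    load = [(2 * w(nodes[i]) - w(nodes[i - 1]) - w(nodes[i + 1])) / h
            for i in range(1, n)]
    dual_norm = amplitude * n * math.pi / math.sqrt(2.0)
    return load, dual_norm
```

Up to rounding, the returned load vector is zero, so the Galerkin approximation of \(\delta \) vanishes although the error \(\Vert w'\Vert \) can be made arbitrarily large through the amplitude.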

Remark 3

(Load evaluations vs exact integrals) A similar yet simpler argument shows that, without any a priori information on \(f \in L^2(\Omega )\), also \(\Vert f\Vert _{}\) cannot be bounded in terms of adaptive evaluations \(\int _{\Omega } fv\) with \(v\in L^2(\Omega )\).

Before discussing in Sect. 3.3 repercussions of Lemma 2, it is useful to take into account a further requirement for a posteriori bounds.

3.2 Localized residual norm

Adaptive mesh refinement is an important application of a posteriori bounds. It is usually based upon the comparison of ‘local’ quantities. Therefore, it is of interest to split a posteriori bounds, or the residual norm itself, into local contributions.

Such a localization appears implicitly, e.g., in the a posteriori error analysis of Babuška and Miller [3]. It is based upon the \(W^{1,\infty }\)-partition of unity (2.9) and the orthogonality property:

$$\begin{aligned} \left\langle {{\,\mathrm{Res}\,}}(f;{\mathcal {M}}),\,\phi _z\right\rangle = 0 \qquad \text {for all }z \in {\mathcal {V}}_0 = {\mathcal {V}}\cap \Omega . \end{aligned}$$

We thus introduce the subclass

$$\begin{aligned} {\mathcal {R}}_{{\mathcal {M}}} \mathrel {:=}\{ \ell \in H^{-1}(\Omega ) \mid \forall V\in {\mathbb {V}}_0({\mathcal {M}})\ \left\langle \ell ,\,V\right\rangle = 0 \} \end{aligned}$$

of residuals associated with Galerkin approximations. Recall that \({{\,\mathrm{supp}\,}}\phi _z = \omega _z\) and that \(H^{-1}(\omega _z)\) is a shorthand for \(H^{-1}(\mathring{\omega }_z)\).

Lemma 4

(Localization)

Let \(\ell \in H^{-1}(\Omega )\) be any functional.

  (i)

    If \(\ell \in {\mathcal {R}}_{\mathcal {M}}\), then

    $$\begin{aligned} \Vert \ell \Vert _{H^{-1}(\Omega )}^2 \lesssim \sum _{z\in {\mathcal {V}}} \Vert \ell \Vert _{H^{-1}(\omega _z)}^2, \end{aligned}$$

    where the hidden constant depends only on d and the shape coefficient \(\sigma ({\mathcal {M}})\).

  (ii)

    We have

    $$\begin{aligned} \sum _{z\in {\mathcal {V}}} \Vert \ell \Vert _{H^{-1}(\omega _z)}^2 \leqslant (d+1) \Vert \ell \Vert _{H^{-1}(\Omega )}^2. \end{aligned}$$

Proof

See also Cohen, DeVore, and Nochetto [10, §3.2 and §3.4], Ern and Guermond [11, Proposition 31.7] or Blechta, Málek, and Vohralík [5, Theorem 3.7]. For the sake of completeness, we provide details. In order to show (i), we fix an arbitrary \(v \in H^1_0(\Omega )\). In view of the partition of unity (2.9) and \(\ell \in {\mathcal {R}}_{\mathcal {M}}\), we can write

$$\begin{aligned} \left\langle \ell ,\,v\right\rangle = \sum _{z\in {\mathcal {V}}} \left\langle \ell ,\,v\phi _z\right\rangle = \sum _{z\in {\mathcal {V}}} \left\langle \ell ,\,(v-c_z)\phi _z\right\rangle , \end{aligned}$$
(3.2)

where the reals \(c_z\in {\mathbb {R}}\) are given by

$$\begin{aligned} c_z \mathrel {:=}\frac{1}{|\omega _z|}\int _{\omega _z}v\,\mathrm {d}x \text { for }z\in {\mathcal {V}}_0, \quad \text {and}\quad c_z=0 \text { for }z\in {\mathcal {V}}\setminus {\mathcal {V}}_0. \end{aligned}$$

Thanks to \(0\leqslant \phi _z\leqslant 1\), the inverse estimate \(\Vert \nabla \phi _z\Vert _{L^\infty (\omega _z)} \leqslant \max _{K \subset \omega _z} \rho _{K}^{-1} \lesssim h_z^{-1}\) and the Poincaré-Friedrichs inequality \(\Vert v-c_z\Vert _{\omega _z} \lesssim h_z \Vert \nabla v\Vert _{\omega _z}\) (see, e.g., Nochetto and Veeser [21, Lemma 4]), we have, for any \(z\in {\mathcal {V}}\),

$$\begin{aligned} \Vert \nabla \big ( (v-c_z)\phi _z \big )\Vert _{\omega _z} \leqslant \Vert \nabla v\Vert _{\omega _z} + \Vert v-c_z\Vert _{\omega _z} \Vert \nabla \phi _z\Vert _{L^\infty (\omega _z)} \leqslant C_{\sigma ({\mathcal {M}})} \Vert \nabla v\Vert _{\omega _z}, \end{aligned}$$
(3.3)

where the constant \(C_{\sigma ({\mathcal {M}})}\) depends only on \(\sigma ({\mathcal {M}})\). Thus, (3.2) leads to

$$\begin{aligned} |\left\langle \ell ,\,v\right\rangle | \lesssim \sum _{z\in {\mathcal {V}}} \Vert \ell \Vert _{H^{-1}(\omega _z)} \Vert \nabla v\Vert _{\omega _z} \leqslant \sqrt{d+1} \left( \sum _{z\in {\mathcal {V}}} \Vert \ell \Vert _{H^{-1}(\omega _z)}^2 \right) ^{1/2} \Vert \nabla v\Vert _{} \end{aligned}$$

and the proof of (i) is finished.

To prove (ii), we let \(v_z \in H^1_0(\omega _z)\) with \(\Vert \nabla v_z\Vert _{\omega _z}\leqslant 1\) for any node \(z \in {\mathcal {V}}\) and set \(v=\sum _{z\in {\mathcal {V}}} \left\langle \ell ,\,v_z\right\rangle v_z \in H^1_0(\Omega )\). Then

$$\begin{aligned} \sum _{z\in {\mathcal {V}}}\left\langle \ell ,\,v_z\right\rangle ^2 = \left\langle \ell ,\,v\right\rangle \leqslant \Vert \ell \Vert _{H^{-1}(\Omega )} \Vert \nabla v\Vert _{}, \end{aligned}$$

and, with the help of two Cauchy–Schwarz inequalities,

$$\begin{aligned} \Vert \nabla v\Vert _{}^2&= \sum _{K \in {\mathcal {M}}} \sum _{z,y\in {\mathcal {V}}\cap K} \left\langle \ell ,\,v_z\right\rangle \left\langle \ell ,\,v_y\right\rangle \int _{K} \nabla v_z \cdot \nabla v_y \\&\leqslant \sum _{K \in {\mathcal {M}}} \sum _{z\in {\mathcal {V}}\cap K} (d+1) |\left\langle \ell ,\,v_z\right\rangle |^2 \Vert \nabla v_z\Vert _{K}^2 = (d+1) \sum _{z\in {\mathcal {V}}} |\left\langle \ell ,\,v_z\right\rangle |^2. \end{aligned}$$

Consequently, we conclude (ii) by taking the suprema over all \(v_z\) for all \(z\in {\mathcal {V}}\). \(\square \)
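The factor \(d+1\) in both parts of the proof stems from a simple counting argument: every element has \(d+1\) vertices and therefore belongs to exactly \(d+1\) stars. The following sketch checks this on an arbitrary list of simplices; the data layout (tuples of vertex indices, one value per element) and the function name are assumptions of this illustration.

```python
def sum_over_stars(elements, element_values):
    """Sum, over all vertices z, the element values accumulated on the star omega_z.

    `elements` is a list of tuples of vertex indices (one tuple per simplex) and
    `element_values[i]` is a quantity attached to element i, e.g. ||grad v||_K^2.
    """
    star_totals = {}
    for vertices, value in zip(elements, element_values):
        for z in vertices:
            star_totals[z] = star_totals.get(z, 0.0) + value
    return sum(star_totals.values())
```

For two triangles sharing an edge, `sum_over_stars([(0, 1, 2), (1, 2, 3)], [1.5, 2.5])` returns 12.0, i.e. \(d+1=3\) times the total 4.0.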

Thus, in the context of adaptive mesh refinement, we are also interested in quantifying the single terms of the localized residual norm

$$\begin{aligned} \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}({\mathcal {M}})}^2 \mathrel {:=}\sum _{z\in {\mathcal {V}}} \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\omega _z)}^2. \end{aligned}$$
(3.4)

Of course, we face the same problem for the local residual norms as for the global one.

Corollary 5

(Bounding local residual norms) Without any a priori information on \(f\in H^{-1}(\Omega )\), each local residual norm \(\Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\omega _z)}\), \(z\in {\mathcal {V}}\), cannot be bounded in terms of a finite number of adaptive evaluations of \(f\).

Proof

Replace the domain \(\Omega \) by \(\omega _z\) in the proof of Lemma 2 and extend functionals in \(H^{-1}(\omega _z)\) by 0 on the orthogonal complement of \(H^1_0(\omega _z)\) in \(H^1_0(\Omega )\). \(\square \)

3.3 Towards error-dominated oscillation

Our approach to error-dominated oscillation relies on a projection operator. This section motivates and formulates its key properties. They are summarized in (3.14) and will guide the actual construction in the following sections.

In view of Lemma 2 and Corollary 5, a posteriori bounds for the residual norm or its localized variant require knowledge on the load \(f\) beyond a finite number of evaluations. The actual knowledge of \(f\) can be of different nature and, accordingly, may require different techniques. Here we want to address only aspects of a posteriori error estimation that are independent of the nature of this knowledge. Correspondingly, we split the residual into a discretized residual and data approximation:

$$\begin{aligned} {{\,\mathrm{Res}\,}}(f;{\mathcal {M}}) = \big ( {\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}} \big ) + \big ( f- {\mathcal {P}}_{{\mathcal {M}}}f\big ) \end{aligned}$$
(3.5)

where \({\mathcal {P}}_{{\mathcal {M}}}\) maps onto a subspace \(\underline{{\mathbb {D}}}({\mathcal {M}})\) of \(H^{-1}(\Omega )\) such that

  • \(\Vert {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}({\mathcal {M}})}\) can be bounded with the help of a finite number of evaluations of \(f\) and

  • the task of bounding \(\Vert f- {\mathcal {P}}_{{\mathcal {M}}}f\Vert _{H^{-1}({\mathcal {M}})}\) hinges only on knowledge of the load \(f\); this task may be viewed as a matter of approximation theory since, apart from the choice of the norm, it is independent of the boundary value problem (2.2).

Here we have used the localized dual norm \(\Vert \cdot \Vert _{H^{-1}({\mathcal {M}})}\) in order to allow for applications in mesh adaptivity. It is then desirable that both parts are dominated by the error, i.e., we have

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}({\mathcal {M}})}&\lesssim \Vert \nabla (u_f-U_{f;{\mathcal {M}}})\Vert _{} , \end{aligned}$$
(3.6a)
$$\begin{aligned} \Vert f-{\mathcal {P}}_{{\mathcal {M}}}f\Vert _{H^{-1}({\mathcal {M}})}&\lesssim \Vert \nabla (u_f-U_{f;{\mathcal {M}}})\Vert _{} . \end{aligned}$$
(3.6b)

In view of Lemmas 1 and 4, the two conditions are equivalent.

The construction of a suitable mapping \({\mathcal {P}}_{{\mathcal {M}}}\) is the new twist in our approach. In order to get first hints on this, let us test out several candidates with necessary conditions arising from (3.6b).

The proof of Corollary 5 suggests that the problem lies in the fact that f is taken from an infinite-dimensional space. The projection \({\mathcal {P}}_{0,{\mathcal {M}}}\) into discrete data from (1.4) is thus a candidate for \({\mathcal {P}}_{{\mathcal {M}}}\). This choice, however, does not verify (3.6). In fact, Lemma 1, Lemma 4 (ii), and (3.6b) imply the stability estimate

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}f\Vert _{H^{-1}({\mathcal {M}})} \lesssim \Vert f\Vert _{H^{-1}(\Omega )}, \end{aligned}$$
(3.7)

while \({\mathcal {P}}_{0,{\mathcal {M}}}f\) is not even defined for a general \(f\in H^{-1}(\Omega )\) (and cannot be continuously extended; cf. Lemma 20).

This flaw is easily remedied. For any element \(K \in {\mathcal {M}}\), we replace in (1.4) the characteristic function \(\chi _K \) of \(K \) by the weighted mean

$$\begin{aligned} \psi _K \mathrel {:=}\frac{(2d+1)!}{d!|K |} \prod _{z\in {\mathcal {V}}\cap K} \phi _z \in H^1_0(K) \quad \text {with}\quad \int _{K} \psi _K = 1 \end{aligned}$$
(3.8)

thanks to (2.10) and consider

$$\begin{aligned} \tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}f\mathrel {:=}\sum _{K \in {\mathcal {M}}} \left\langle f,\,\psi _K \right\rangle \chi _K. \end{aligned}$$
(3.9)

Since \(\psi _K \in H^1_0(K)\subset H^1_0(\Omega )\) is an admissible test function, the operator \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}\) is defined for all functionals in \(H^{-1}(\Omega )\) and satisfies the stability estimate (3.7); see Remark 11 below.

But still, the new operator \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}\) does not verify (3.6). To see this, consider \(f=-\Delta V\) with \(V\in {\mathbb {V}}_0({\mathcal {M}})\) arbitrary. We then have

$$\begin{aligned} u_f=U_{f;{\mathcal {M}}} \end{aligned}$$

and therefore \({{{\,\mathrm{Res}\,}}(f;{\mathcal {M}})}=0\) and property (3.6b) entails

$$\begin{aligned} \forall V\in {\mathbb {V}}_0({\mathcal {M}})\quad {\mathcal {P}}_{{\mathcal {M}}}(\Delta V) = \Delta V. \end{aligned}$$
(3.10)

In addition, integration by parts yields that, for all \(v\in H^1_0(\Omega )\),

$$\begin{aligned} \left\langle \Delta V,\,v\right\rangle = -\int _{\Omega } \nabla V\cdot \nabla v = -\sum _{F \in {\mathcal {F}}} \int _{F} J(V)v \,\mathrm {d}s, \end{aligned}$$
(3.11)

where \(\,\mathrm {d}s \) indicates the \((d-1)\)-dimensional Hausdorff measure in \({\mathbb {R}}^d\) and J(V) is the jump in the normal flux \(\nabla V\cdot n\) across interelement sides. More precisely, if \(F = K _1 \cap K _2\) is the intersection of the elements \(K _1,K _2\in {\mathcal {M}}\) with respective outer normals \(\varvec{n}_1\), \(\varvec{n}_2\), then \(J(V)|_{F} \mathrel {:=}\nabla V|_{K _1} \cdot \varvec{n}_1 + \nabla V|_{K _2} \cdot \varvec{n}_2\in {\mathbb {R}}\). If \(V \ne 0\), then we have also \(\Delta V \ne 0\), while (3.11) yields \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}(\Delta V)=0\), in contradiction with (3.10). Hence (3.6) does not hold for \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}\).
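The failure of (3.10) can be observed concretely in one space dimension. The following Python sketch is only an illustration under stated assumptions, not part of the analysis: it takes \(d=1\), a hypothetical four-node mesh of \(\Omega =(0,1)\), and represents functionals as callables acting on test functions. The helper names `hat`, `psi_element`, `laplacian_functional` and `p0_tilde` are made up for this illustration. For \(d=1\), (3.8) reduces to \(\psi _K = 6\phi _a\phi _b/|K|\), and \(\Delta V\) acts through point evaluations at the interior nodes.

```python
from fractions import Fraction as Fr

def hat(nodes, j):
    """Return the hat function phi_j of the 1D mesh `nodes` as a callable."""
    def phi(x):
        if j > 0 and nodes[j - 1] <= x <= nodes[j]:
            return (x - nodes[j - 1]) / (nodes[j] - nodes[j - 1])
        if j < len(nodes) - 1 and nodes[j] <= x <= nodes[j + 1]:
            return (nodes[j + 1] - x) / (nodes[j + 1] - nodes[j])
        return Fr(0)
    return phi

def psi_element(nodes, k):
    """psi_K of (3.8) for K = [nodes[k], nodes[k+1]] and d = 1: 6 * phi_a * phi_b / |K|."""
    a, b = hat(nodes, k), hat(nodes, k + 1)
    h = nodes[k + 1] - nodes[k]
    return lambda x: 6 * a(x) * b(x) / h

def laplacian_functional(nodes, values):
    """<Delta V, v> = -int V' v' for piecewise affine V and v vanishing on the boundary.

    Summation by parts turns this into (right slope - left slope) * v(node),
    summed over the interior nodes.
    """
    slopes = [(values[k + 1] - values[k]) / (nodes[k + 1] - nodes[k])
              for k in range(len(nodes) - 1)]
    def ell(v):
        return sum((slopes[j] - slopes[j - 1]) * v(nodes[j])
                   for j in range(1, len(nodes) - 1))
    return ell

def p0_tilde(ell, nodes):
    """Coefficients <ell, psi_K> of the piecewise constant function in (3.9)."""
    return [ell(psi_element(nodes, k)) for k in range(len(nodes) - 1)]

# Hypothetical mesh and nodal values of some nonzero V in V_0(M).
nodes = [Fr(0), Fr(1, 4), Fr(1, 2), Fr(1)]
V = [Fr(0), Fr(1), Fr(-2), Fr(0)]
lap_V = laplacian_functional(nodes, V)

coeffs = p0_tilde(lap_V, nodes)   # every psi_K vanishes at all nodes
pairing = lap_V(hat(nodes, 1))    # Delta V tested with an admissible hat function
```

In this sketch all entries of `coeffs` vanish although `pairing` does not, which mirrors the contradiction with (3.10): the bubbles \(\psi _K \) cannot see functionals concentrated on the interelement faces.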

The two conditions (3.7) and (3.10) are central to our goals. Although they can be checked without involving the Galerkin approximation (2.11), they are also sufficient for (3.6). Incidentally, they imply that \({\mathcal {P}}_{{\mathcal {M}}}\) has to be a near best ‘interpolation’ operator in light of the Lebesgue lemma.

The failure of (3.10) for \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}\) is not related to the choice of the test functions \(\psi _K \), \(K \in {\mathcal {M}}\), but to its range. In fact, (3.11) and the fundamental lemma of the calculus of variations show that \(\Delta V \not \in L^2(\Omega )\) whenever \(V \ne 0\), while \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}({\mathbb {V}}_0({\mathcal {M}})) \subset L^2(\Omega )\). In other words, to remedy this failure we have to change the range.

Finally, it is desirable that \({\mathcal {P}}_{{\mathcal {M}}}\) is a local operator for two reasons. First, locality is useful when evaluating \({\mathcal {P}}_{{\mathcal {M}}}\). Second, since \(-\Delta \) is a local operator, we have the following lower bound for the local error:

$$\begin{aligned} \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\omega _z)} \leqslant \Vert \nabla (u_f-U_{f;{\mathcal {M}}})\Vert _{\omega _z}, \end{aligned}$$
(3.12)

which follows from testing (3.1) with all v from \(H^1_0(\omega _z)\). This bound can be exploited if we strengthen (3.6) to the local conditions

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}&\lesssim \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\omega _z)}, \end{aligned}$$
(3.13a)
$$\begin{aligned} \Vert f-{\mathcal {P}}_{{\mathcal {M}}}f\Vert _{H^{-1}(\omega _z)}&\lesssim \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\omega _z)} \end{aligned}$$
(3.13b)

for all \(z\in {\mathcal {V}}\). We shall therefore demand the stability (3.7) and invariance (3.10) in a suitable local manner.

In order to formulate local invariance, let us introduce the following notation associated with an open subset \(\omega \subset \Omega \). Whenever two functionals \(\ell _1,\ell _2\in H^{-1}(\Omega )\) satisfy \(\ell _1(v) = \ell _2(v)\) for all \(v \in H^1_0(\omega )\), we say \(\ell _1=\ell _2\) on \(\omega \). Moreover, we write \(\ell _1 \in \underline{{\mathbb {D}}}({\mathcal {M}})\) on \(\omega \) when additionally \(\ell _2\) can be chosen such that \(\ell _2\in \underline{{\mathbb {D}}}({\mathcal {M}})\). Notice that, thanks to the fundamental lemma of the calculus of variations, these notions reduce to the usual ones if \(\ell \in L^2(\Omega )\), i.e. if \(\ell (v) = \int _\Omega gv\) for all \(v \in H^1_0(\Omega )\) with some \(g\in L^2(\Omega )\).

Let us summarize our discussion by a list of desired properties for the operator \({\mathcal {P}}_{{\mathcal {M}}}\) and its range \(\underline{{\mathbb {D}}}({\mathcal {M}})\subset H^{-1}(\Omega )\), which corresponds to the set of all possible discretized residuals. This list provides the guidelines for our approach and choices. Denoting by \(\Delta ({\mathbb {V}}_0({\mathcal {M}})) = \{ \Delta V \mid V\in {\mathbb {V}}_0({\mathcal {M}})\}\) the image of \({\mathbb {V}}_0({\mathcal {M}})\) under the distributional Laplacian, we aim for the following properties:

$$\begin{aligned}&\Delta ({\mathbb {V}}_0({\mathcal {M}})) \subset \underline{{\mathbb {D}}}({\mathcal {M}}), \end{aligned}$$
(3.14a)
$$\begin{aligned}&\text {if }\ell \in \underline{{\mathbb {D}}}({\mathcal {M}})\text { on }\mathring{\omega }_z,\text { then } \Vert \ell \Vert _{H^{-1}(\omega _z)}\text { is quantifiable with a finite number} \nonumber \\&\qquad \text { of evaluations of }\ell , \end{aligned}$$
(3.14b)
$$\begin{aligned}&{\mathcal {P}}_{{\mathcal {M}}}\text { is linear}, \end{aligned}$$
(3.14c)
$$\begin{aligned}&{\mathcal {P}}_{{\mathcal {M}}}f\text { is locally computable in terms of a finite number of evaluations} \nonumber \\&\qquad \hbox { of}\ f, \end{aligned}$$
(3.14d)
$$\begin{aligned}&\text {if }\ell \in \underline{{\mathbb {D}}}({\mathcal {M}})\text { on }\mathring{\omega }_z,\text { then }{\mathcal {P}}_{{\mathcal {M}}}\ell = \ell \text { on }\mathring{\omega }_z, \end{aligned}$$
(3.14e)
$$\begin{aligned}&\Vert {\mathcal {P}}_{{\mathcal {M}}}\ell \Vert _{H^{-1}(\omega _z)} \lesssim \Vert \ell \Vert _{H^{-1}(\omega _z)}~\text {for all}~\ell \in H^{-1}(\omega _z). \end{aligned}$$
(3.14f)

In view of the above discussion, conditions (3.14f), (3.14e) and (3.14a) are equivalent to (3.13); cf. Sect. 3.7. Conditions (3.14d) and (3.14b) allow us to quantify the local dual norms of the approximate residual \({\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}} \in \underline{{\mathbb {D}}}({\mathcal {M}})\) in a computable manner; compare also with Sect. 3.6 below.

In the next three sections we construct two operators \({\mathcal {P}}_{{\mathcal {M}}}\) fulfilling (3.14).

3.4 Discretized residuals and a locally stable biorthogonal system

We present a possible choice of the set \(\underline{{\mathbb {D}}}({\mathcal {M}})\) of discretized residuals and introduce an associated biorthogonal system, which is instrumental in constructing a suitable operator \({\mathcal {P}}_{{\mathcal {M}}}\) with range \(\underline{{\mathbb {D}}}({\mathcal {M}})\).

We set

$$\begin{aligned} \underline{{\mathbb {D}}}({\mathcal {M}})\mathrel {:=}\{ \ell \in H^{-1}(\Omega ) \mid \left\langle \ell ,\,v\right\rangle = \sum _{K \in {\mathcal {M}}} \int _K c_K v\,\mathrm {d}x + \sum _{F \in {\mathcal {F}}} \int _F c_F v\,\mathrm {d}s \nonumber \\ \text {for all } v\in H^1_0(\Omega ) \text { with } c_K,c_F \in {\mathbb {R}}\text { for }K \in {\mathcal {M}}, F \in {\mathcal {F}}\}. \end{aligned}$$
(3.15)

Every functional \(\ell \in \underline{{\mathbb {D}}}({\mathcal {M}})\) is thus constant on each element and on each face. Obviously, condition (3.14a) is verified. More precisely, \(\underline{{\mathbb {D}}}({\mathcal {M}})\) is in general a strict superset of \(\Delta ({\mathbb {V}}_0({\mathcal {M}}))\), since in \(\Delta ({\mathbb {V}}_0({\mathcal {M}}))\) only certain linear combinations of the constants \(c_F \), \(F \in {\mathcal {F}}\) are allowed. The fact that these constants are independent in \(\underline{{\mathbb {D}}}({\mathcal {M}})\) facilitates the definition of \({\mathcal {P}}_{{\mathcal {M}}}\). Moreover, we have added the contributions given by the constants \(c_K \), \(K \in {\mathcal {M}}\), for comparability with the classical oscillations and a posteriori error estimators and because similar contributions will appear for higher order elements; cf. Kreuzer and Veeser [14]. In spite of these enlargements, we still have \(\dim \underline{{\mathbb {D}}}({\mathcal {M}})< \infty \). Consequently, an argument as in the proof of Lemma 2, which hinges on infinite dimension, is ruled out.

Let us associate a biorthogonal system with \(\underline{{\mathbb {D}}}({\mathcal {M}})\). To this end, we introduce the surface Dirac distributions

$$\begin{aligned} \chi _F: \left\{ \begin{matrix} H_0^1(\Omega ) &{}\rightarrow &{}{\mathbb {R}}, \\ v &{}\mapsto &{}\int _{F} v \,\mathrm {d}s, \end{matrix}\right. \quad F \in {\mathcal {F}}, \end{aligned}$$
(3.16a)

and we identify the characteristic functions \(\chi _K \), \(K \in {\mathcal {M}}\), with their associated distributions

$$\begin{aligned} \chi _K:\left\{ \begin{matrix} H_0^1(\Omega ) &{}\rightarrow &{}{\mathbb {R}}, \\ v &{} \mapsto &{}\int _{K} v \,\mathrm {d}x, \end{matrix}\right. \quad K \in {\mathcal {M}}. \end{aligned}$$
(3.16b)

Notice that the definitions of \(\chi _F \) and \(\chi _K \) involve different measures for integration: the \((d-1)\)-dimensional Hausdorff measure for \(\chi _F \) and the d-dimensional Lebesgue measure for \(\chi _K \). Correspondingly, each \(\chi _K \) is absolutely continuous and each \(\chi _F \) is singular with respect to the d-dimensional Lebesgue measure.

We collect all elements and interelement faces in the index set \({\mathcal {I}}={\mathcal {I}}({\mathcal {M}}) \mathrel {:=}{\mathcal {M}}\cup {\mathcal {F}}\) and derive in the next lemma the properties of the functionals \(\chi _i\), \(i\in {\mathcal {I}}\), that are of interest to us.

Lemma 6

(Basis and scaling) The functionals \(\chi _i\), \(i\in {\mathcal {I}}\), are a basis of \(\underline{{\mathbb {D}}}({\mathcal {M}})\). For any element \(K \in {\mathcal {M}}\) and any face \(F \in {\mathcal {F}}\) containing a vertex \(z \in {\mathcal {V}}\), we have

$$\begin{aligned} \Vert \chi _K \Vert _{H^{-1}(\omega _z)} \leqslant \left| K \right| ^{1/2} {\tilde{\rho }}_z \quad \text {and}\quad \Vert \chi _F \Vert _{H^{-1}(\omega _z)} \leqslant |F |^{1/2} {{\tilde{\rho }}}_z^{1/2} \end{aligned}$$

with \({\tilde{\rho }}_z\) from (2.7).

Proof

We will use the Friedrichs inequality

$$\begin{aligned} \forall v \in H^1_0(\omega _z) \qquad \Vert v\Vert _{\omega _z} \leqslant {\tilde{\rho }}_z \Vert \nabla v\Vert _{\omega _z} \end{aligned}$$
(3.17)

and the following trace theorem: if \(F \in {\mathcal {F}}\) with \(F \ni z\) and \(\varvec{n}\) denotes a normal of \(F \), then

$$\begin{aligned} \forall w \in W^{1,1}_0(\omega _z) \qquad \Vert w\Vert _{L^1(F)} \leqslant \frac{1}{2} \Vert \nabla w\cdot \varvec{n}\Vert _{L^1(\omega _z)}. \end{aligned}$$
(3.18)

Given \(K \in {\mathcal {M}}\) with \(K \ni z\) and any \(v\in H^1_0(\omega _z)\), the Cauchy–Schwarz inequality and (3.17) yield

$$\begin{aligned} \left| \left\langle \chi _K,\,v\right\rangle \right| = \left| \int _{K} v \,\mathrm {d}x \right| \leqslant \left| K \right| ^{1/2} \Vert v\Vert _{\omega _z} \leqslant \left| K \right| ^{1/2} {\tilde{\rho }}_z \Vert \nabla v\Vert _{\omega _z}, \end{aligned}$$

which verifies the first claimed inequality. To show the second one, fix \(F \in {\mathcal {F}}\) with \(F \ni z\) and let again \(v \in H^1_0(\omega _z)\). Using (3.18) with \(w=v^2\) and then again (3.17), we derive

$$\begin{aligned} \left| \left\langle \chi _F,\,v\right\rangle \right|= & {} \left| \int _{F} v \,\mathrm {d}s \right| \leqslant \left| F \right| ^{1/2} \Vert v\Vert _{F} \leqslant |F |^{1/2} \Vert v\Vert _{\omega _z}^{1/2} \Vert \nabla v\cdot \varvec{n}\Vert _{\omega _z}^{1/2} \\\leqslant & {} |F |^{1/2} {{\tilde{\rho }}}_z^{1/2} \Vert \nabla v\Vert _{\omega _z} \end{aligned}$$

and also the second claimed inequality is proved. \(\square \)
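For the reader's convenience, the middle step of the last chain can be spelled out as follows. This is merely a sketch of the argument indicated in the proof: it uses (3.18) with \(w=v^2\in W^{1,1}_0(\omega _z)\), the product rule \(\nabla (v^2) = 2v\nabla v\), and the Cauchy–Schwarz inequality:

$$\begin{aligned} \Vert v\Vert _{F}^2 = \Vert v^2\Vert _{L^1(F)} \leqslant \frac{1}{2} \Vert \nabla (v^2)\cdot \varvec{n}\Vert _{L^1(\omega _z)} = \Vert v\, \nabla v\cdot \varvec{n}\Vert _{L^1(\omega _z)} \leqslant \Vert v\Vert _{\omega _z} \Vert \nabla v\cdot \varvec{n}\Vert _{\omega _z}. \end{aligned}$$

Taking square roots and then applying (3.17) together with \(\Vert \nabla v\cdot \varvec{n}\Vert _{\omega _z} \leqslant \Vert \nabla v\Vert _{\omega _z}\) gives \(\Vert v\Vert _{F} \leqslant {\tilde{\rho }}_z^{1/2} \Vert \nabla v\Vert _{\omega _z}\).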

In order to complete the basis of Lemma 6 to a biorthogonal system, we use the following test functions: Given any element \(K \in {\mathcal {M}}\), take

$$\begin{aligned} \psi _K = \frac{(2d+1)!}{d!|K|} \prod _{z\in {\mathcal {V}}\cap K} \phi _z. \end{aligned}$$
(3.19a)

Given any interelement face \(F \in {\mathcal {F}}\), let \(z_i\), \(i=1,2\), be the vertices in the patch \(\omega _F \), see (2.3), that are opposite to \(F \) and set

$$\begin{aligned} \psi _F \mathrel {:=}\frac{(2d-1)!}{(d-1)!|F|} \left( \prod _{z \in {\mathcal {V}}\cap F} \phi _z \right) \left( 1 - (2d+1) \sum _{i=1}^2 \phi _{z_i} \right) . \end{aligned}$$
(3.19b)
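For orientation, the definitions (3.19) can be written out in low dimensions. This is a worked instance only: the vertex labels \(a,b,c\) are introduced here for illustration and, for \(d=1\), we assume the convention \(|F|=1\) for the \(0\)-dimensional Hausdorff (counting) measure of a point \(F=\{z\}\). With \(K \) having vertices \(a,b\) (and \(c\) for \(d=2\)), and \(F \) having vertex \(z\) for \(d=1\) or vertices \(a,b\) for \(d=2\), we get

$$\begin{aligned} d=1:&\quad \psi _K = \frac{6}{|K|}\, \phi _a \phi _b, \qquad \psi _F = \phi _z \big ( 1 - 3 (\phi _{z_1} + \phi _{z_2}) \big ), \\ d=2:&\quad \psi _K = \frac{60}{|K|}\, \phi _a \phi _b \phi _c, \qquad \psi _F = \frac{6}{|F|}\, \phi _a \phi _b \big ( 1 - 5 (\phi _{z_1} + \phi _{z_2}) \big ). \end{aligned}$$

In both cases \(\psi _K \) is a polynomial of degree \(d+1\) on \(K \), and \(\psi _F \) is a polynomial of degree \(d+1\) on each of the two elements of \(\omega _F \).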

Let us verify that the basis \(\chi _i\), \(i\in {\mathcal {I}}\) and the test functions \(\psi _i\), \(i \in {\mathcal {I}}\), actually form a biorthogonal system with a crucial stability condition.

Lemma 7

(Locally stable biorthogonal system) Together with the basis \(\chi _i\), \(i\in {\mathcal {I}}\), the test functions \(\psi _i\), \(i \in {\mathcal {I}}\), form a locally stable biorthogonal system:

  1. (i)

    We have

    $$\begin{aligned} \forall i,j \in {\mathcal {I}} \quad \left\langle \chi _i,\,\psi _j\right\rangle = \delta _{ij}. \end{aligned}$$
  2. (ii)

    Let \({\mathcal {I}}_z \mathrel {:=}\{i\in {\mathcal {I}} \mid i \ni z \}\) denote the elements and faces containing a vertex \(z\in {\mathcal {V}}\). Then

    $$\begin{aligned} \forall i \in {\mathcal {I}}_z \quad \Vert \chi _i\Vert _{H^{-1}(\omega _z)} \Vert \nabla \psi _i\Vert _{\omega _z} \leqslant C_\psi , \end{aligned}$$

    where the stability constant \(C_\psi \) only depends on d and the shape coefficient \(\sigma ({\mathcal {M}})\).

Proof

To show (i), we consider the cases of elements \(j\in {\mathcal {M}}\) and faces \(j\in {\mathcal {F}}\) separately. First, let \(K \in {\mathcal {M}}\) be an element. As already seen in (3.8), we have \(\left\langle \chi _K,\,\psi _K \right\rangle = \int _{K} \psi _K = 1\). Moreover, since \(\psi _K = 0\) in \(\Omega \setminus \mathring{K}\), we infer \(\left\langle \chi _{K '},\,\psi _K \right\rangle = 0\) for any \(K '\in {\mathcal {M}}\setminus \{K \}\) and \(\left\langle \chi _F,\,\psi _K \right\rangle = 0\) for any \(F \in {\mathcal {F}}\).

Second, fix a face \(F \in {\mathcal {F}}\). Using (2.10), we obtain

$$\begin{aligned} \left\langle \chi _F,\,\psi _F \right\rangle = \frac{(2d-1)!}{(d-1)!|F|} \int _{F} \prod _{z \in {\mathcal {V}}\cap F} \phi _z \,\mathrm {d}s = 1. \end{aligned}$$

From \(\psi _F = 0\) in \(\Omega \setminus \mathring{\omega }_F \), where \(\omega _F \) is the patch of the two elements containing the face \(F \), we infer \(\left\langle \chi _{F '},\,\psi _F \right\rangle =0\) for any \(F '\in {\mathcal {F}}\setminus \{F \}\) and \(\left\langle \chi _K,\,\psi _F \right\rangle =0\) for any \(K \in {\mathcal {M}}\) with \(K \not \supset F \). Last, let \(K \in {\mathcal {M}}\) such that \(K \supset F \). Using again (2.10), we deduce

$$\begin{aligned} \left\langle \chi _K,\,\psi _F \right\rangle = \frac{(2d-1)!}{(d-1)!|F|} \left( \int _{K} \prod _{z \in {\mathcal {V}}\cap F} \phi _z \,\mathrm {d}x- (2d+1) \int _{K} \prod _{z \in {\mathcal {V}}\cap K} \phi _z \,\mathrm {d}x \right) = 0. \end{aligned}$$

For (ii), we again treat elements and faces separately. Let \(K \in {\mathcal {M}}\) be an element containing z. The well-known inverse estimate \(\Vert \nabla \psi _K \Vert _{K} \leqslant C_d \rho _K ^{-1} \Vert \psi _K \Vert _{K}\), \(K \subset \omega _z\) and (2.10) imply

$$\begin{aligned} \Vert \nabla \psi _K \Vert _{\omega _z} = \Vert \nabla \psi _K \Vert _{K} \leqslant \frac{C_d}{ |K |^{1/2} \rho _K}. \end{aligned}$$

Combining this with the first inequality in Lemma 6 and (2.8), we obtain the claimed inequality for elements:

$$\begin{aligned} \Vert \chi _K \Vert _{H^{-1}(\omega _z)} \Vert \nabla \psi _K \Vert _{\omega _z} \leqslant C_d \frac{{\tilde{\rho }}_z}{\rho _K} \leqslant C_{d;\sigma ({\mathcal {M}})}. \end{aligned}$$

Let \(F \in {\mathcal {F}}\) be an interelement face containing z and write \(F = K _1 \cap K _2\), where \(K _1, K _2 \in {\mathcal {M}}\) are the two elements containing \(F \). Proceeding as before, we deduce

$$\begin{aligned} \Vert \nabla \psi _F \Vert _{\omega _z}^2= & {} \sum _{n=1,2} \Vert \nabla \psi _F \Vert _{K _n}^2 \leqslant C_d^{2} \sum _{n=1,2} \rho _{K _n}^{-2} \Vert \psi _F \Vert _{K _n}^2 \nonumber \\\leqslant & {} C_d^{2} \sum _{n=1,2} \frac{|K _n|}{|F |^2 \rho _{K _n}^2}. \end{aligned}$$
(3.20)

and

$$\begin{aligned} \Vert \chi _F \Vert _{H^{-1}(\omega _z)} \Vert \nabla \psi _F \Vert _{\omega _z} \leqslant C_d \left( \sum _{n=1,2} \frac{h_{K _n;F}{\tilde{\rho }}_z}{\rho _{K _n}^2} \right) ^{1/2} \leqslant C_{d;\sigma ({\mathcal {M}})}. \end{aligned}$$

\(\square \)
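The three integral identities behind part (i) can be checked in exact arithmetic. The Python sketch below is a sanity check under one explicit assumption: that (2.10) is the standard formula \(\int _S \prod _i \lambda _i^{\alpha _i} = \frac{n!\,\prod _i \alpha _i!}{(n+|\alpha |)!}\,|S|\) for barycentric coordinates \(\lambda _i\) on an \(n\)-simplex \(S\). The function names are hypothetical, and measures are normalized to \(|K|=|F|=1\), which does not affect whether the values equal \(1\) or \(0\).

```python
from fractions import Fraction
from math import factorial

def barycentric_integral(alpha, n):
    """Integral of prod(lambda_i ** alpha_i) over an n-simplex of unit measure.

    Assumed form of (2.10): n! * prod(alpha_i!) / (n + |alpha|)!.
    """
    num = factorial(n)
    for a in alpha:
        num *= factorial(a)
    return Fraction(num, factorial(n + sum(alpha)))

def pairings(d):
    """Return (<chi_K, psi_K>, <chi_F, psi_F>, <chi_K, psi_F>) for dimension d."""
    c_K = Fraction(factorial(2 * d + 1), factorial(d))      # scaling in (3.19a)
    c_F = Fraction(factorial(2 * d - 1), factorial(d - 1))  # scaling in (3.19b)

    # psi_K integrated over K: all d+1 hat functions appear once.
    chi_K_psi_K = c_K * barycentric_integral([1] * (d + 1), d)

    # psi_F integrated over F: the hats of the opposite vertices vanish on F.
    chi_F_psi_F = c_F * barycentric_integral([1] * d, d - 1)

    # psi_F integrated over an element K containing F: only the hat of the
    # vertex of K opposite to F contributes to the sum in (3.19b).
    face_part = barycentric_integral([1] * d + [0], d)
    bubble_part = barycentric_integral([1] * (d + 1), d)
    chi_K_psi_F = c_F * (face_part - (2 * d + 1) * bubble_part)

    return chi_K_psi_K, chi_F_psi_F, chi_K_psi_F

results = {d: pairings(d) for d in range(1, 7)}
```

Under the stated assumption, each entry of `results` equals `(1, 1, 0)`. In particular, the factor \(2d+1\) in (3.19b) is exactly what cancels the element mean of the face bubble.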

In what follows, we shall rely only on the properties of the test functions \(\psi _i\), \(i\in {\mathcal {I}}\), expressed in Lemma 7. In other words: what counts is not their special form, but the fact that they form a stable biorthogonal system with the basis \(\chi _i\), \(i\in {\mathcal {I}}\), of \(\underline{{\mathbb {D}}}({\mathcal {M}})\).

3.5 Construction and properties of \({\mathcal {P}}_{{\mathcal {M}}}\)

We now propose a possible choice for the projection operator \({\mathcal {P}}_{{\mathcal {M}}}\) and verify the desired properties (3.14). Set

$$\begin{aligned} {\mathcal {P}}_{{\mathcal {M}}}\ell = \sum _{i\in {\mathcal {I}}} \left\langle \ell ,\,\psi _i\right\rangle \chi _i, \end{aligned}$$
(3.21)

where the functionals \(\chi _i\), \(i\in {\mathcal {I}}\), are given by (3.16) and the test functions \(\psi _i\), \(i\in {\mathcal {I}}\), by (3.19). Clearly, \({\mathcal {P}}_{{\mathcal {M}}}\) is linear and \({\mathcal {P}}_{{\mathcal {M}}}f\) is locally computable in terms of a finite number of evaluations of \(f\), i.e., we have (3.14c) and (3.14d).

The biorthogonality of these functionals and test functions implies the following local counterparts of the algebraic condition (3.10).

Theorem 8

(Local invariance) For any functional \(\ell \in H^{-1}(\Omega )\), element \(K \in {\mathcal {M}}\), and side \(F \in {\mathcal {F}}\), the operator \({\mathcal {P}}_{{\mathcal {M}}}\) does not change the following discrete restrictions:

  1. (i)

    If \(\ell \in \underline{{\mathbb {D}}}({\mathcal {M}})\) on \(\mathring{K}\), then \({\mathcal {P}}_{{\mathcal {M}}}\ell = \ell \) on \(\mathring{K}\).

  2. (ii)

    If \(\ell \in \underline{{\mathbb {D}}}({\mathcal {M}})\) on \(\mathring{\omega }_F \), then \({\mathcal {P}}_{{\mathcal {M}}}\ell = \ell \) on \(\mathring{\omega }_F \).

Proof

Let \(\ell = c\chi _K\) on \(\mathring{K}\) with \(c\in {\mathbb {R}}\). For any \(i\in {\mathcal {I}}\), we have \(\left\langle \ell ,\,\psi _i\right\rangle = c \int _K \psi _i = c \delta _{K,i}\) by means of Lemma 7 (i). Consequently, \({\mathcal {P}}_{{\mathcal {M}}}\ell = c \chi _K \) on \(\mathring{K}\), which proves (i).

To show (ii), let \(K _1,K _2\in {\mathcal {M}}\) be the two elements containing \(F \) and let \(\ell = c \chi _F + \sum _{i=1,2} c_i \chi _{K _i}\) on \(\mathring{\omega }_F \) with \(c, c_1, c_2 \in {\mathbb {R}}\). Using again Lemma 7 (i), we observe

$$\begin{aligned} \left\langle \ell ,\,\psi _F \right\rangle = c \left\langle \chi _F,\,\psi _F \right\rangle + \sum _{i=1,2} c_i \left\langle \chi _{K_i},\, \psi _F \right\rangle = c \quad \text {and}\quad \left\langle \ell ,\,\psi _{K _i}\right\rangle = c_i \quad \text {for }i=1,2 \end{aligned}$$

and \(\left\langle \ell ,\,\psi _i\right\rangle = 0\) for all \(i\in {\mathcal {I}}\setminus \{F,K_1,K_2\}\). Consequently,

$$\begin{aligned} {\mathcal {P}}_{{\mathcal {M}}}\ell = c \chi _F + \sum _{i=1,2} c_i \chi _{K _i} = \ell \quad \text {on }\mathring{\omega }_F \end{aligned}$$

and also (ii) is verified. \(\square \)

Theorem 8 implies in particular (3.14e). Moreover, it has the following global consequences.

Corollary 9

(Global invariance) The operator \({\mathcal {P}}_{{\mathcal {M}}}\) is a linear projection onto the discretized residuals \(\underline{{\mathbb {D}}}({\mathcal {M}})\) from (3.15). In particular, we have

$$\begin{aligned} {\mathcal {P}}_{{\mathcal {M}}}(\Delta V) = \Delta V \qquad \text {and}\qquad {\mathcal {P}}_{{\mathcal {M}}}(f) = f \end{aligned}$$

for any \(V\in {\mathbb {V}}_0({\mathcal {M}})\) and any \({\mathcal {M}}\)-piecewise constant function \(f \in {\mathbb {P}}_0({\mathcal {M}})\).

Next, we verify the local stability (3.14f) of \({\mathcal {P}}_{{\mathcal {M}}}\). As a by-product, we also obtain the local stability of the operator \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}\), which was left open in Sect. 3.3.

Theorem 10

(Local stability) The linear projection \({\mathcal {P}}_{{\mathcal {M}}}\) is locally \(H^{-1}\)-stable: for any functional \(\ell \in H^{-1}(\Omega )\) and any vertex \(z\in {\mathcal {V}}\), we have

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}\ell \Vert _{H^{-1}(\omega _z)} \lesssim \Vert \ell \Vert _{H^{-1}(\omega _z)}, \end{aligned}$$

where the hidden constant depends only on d and \(\sigma ({\mathcal {M}})\).

Proof

Given \(v \in H^1_0(\omega _z)\), we derive

$$\begin{aligned} |\left\langle {\mathcal {P}}_{{\mathcal {M}}}\ell ,\,v\right\rangle |&\leqslant \sum _{i \in {\mathcal {I}}_z} |\left\langle \ell ,\,\psi _i\right\rangle \left\langle \chi _i,\,v\right\rangle | \leqslant \sum _{i \in {\mathcal {I}}_z} \Vert \ell \Vert _{H^{-1}(\omega _z)}\Vert \nabla \psi _i\Vert _{\omega _z} \Vert \chi _i\Vert _{H^{-1}(\omega _z)}\Vert \nabla v\Vert _{\omega _z} \\&\lesssim \Vert \ell \Vert _{H^{-1}(\omega _z)} \Vert \nabla v\Vert _{\omega _z}, \end{aligned}$$

where we used Lemma 7 (ii) and \(\#{\mathcal {I}}_z \lesssim 1\). \(\square \)

Remark 11

(Stability of \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}\)) The argument in the proof of Theorem 10 also shows that \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}\) is locally \(H^{-1}\)-stable. In fact, one simply replaces \({\mathcal {P}}_{{\mathcal {M}}}\) by \(\tilde{{\mathcal {P}}}_{0,{\mathcal {M}}}\) and the index set \({\mathcal {I}}_z\) by \({\mathcal {I}}_z \cap {\mathcal {M}}\).

Let us conclude this section with the following further remarks on the linear projection \({\mathcal {P}}_{{\mathcal {M}}}\).

Remark 12

(Orthogonality) For any \(\ell \in H^{-1}(\Omega )\), the error \(\ell - {\mathcal {P}}_{{\mathcal {M}}}\ell \) is orthogonal to \(\text {span}\,\{\psi _i \mid i \in {\mathcal {I}}\}\). This is an immediate consequence of Lemma 7 (i).

Remark 13

(Adjoint of \({\mathcal {P}}_{{\mathcal {M}}}\)) Formally, the adjoint of \({\mathcal {P}}_{{\mathcal {M}}}\) is given by

$$\begin{aligned} {\mathcal {P}}_{{\mathcal {M}}}^*v = \sum _{i\in {\mathcal {I}}} \left\langle \chi _i,\,v\right\rangle \psi _i, \qquad v \in H^1_0(\Omega ). \end{aligned}$$

Here Lemma 7 (i) implies

$$\begin{aligned} \int _{K} {\mathcal {P}}_{{\mathcal {M}}}^*v = \int _{K} v \quad \text {and}\quad \int _{F} {\mathcal {P}}_{{\mathcal {M}}}^*v = \int _{F} v \end{aligned}$$
(3.22)

for all elements \(K \in {\mathcal {M}}\) and interelement faces \(F \in {\mathcal {F}}\). The operator \({\mathcal {P}}_{{\mathcal {M}}}^*\) and these conditions, which characterize it, were used in Veeser [24] to derive an a posteriori error upper bound in terms of a hierarchical estimator. That argument, as well as Morin, Nochetto, and Siebert [18, Theorem 3.6] and Verfürth [25, (3.14)], is closely related to Theorem 15 below.

3.6 Required a priori information, an alternative to \({\mathcal {P}}_{{\mathcal {M}}}\), and quantification of the discretized residual

The purpose of this section is twofold. First, we illustrate which type of a priori information on \(f\) in (2.2) is needed to carry out our approach, presenting also a possible alternative to \({\mathcal {P}}_{{\mathcal {M}}}\). Second, we show that a stable biorthogonal system is not only useful to construct \({\mathcal {P}}_{{\mathcal {M}}}\), but also to quantify the local dual norms of discretized residuals.

Clearly, the operator \({\mathcal {P}}_{{\mathcal {M}}}\) of Sect. 3.5 can be applied to the right-hand side \(f\) of (2.2) whenever

$$\begin{aligned} \left\langle f,\,\psi _i\right\rangle ,\, i\in {\mathcal {I}}, \text { are known exactly.} \end{aligned}$$
(3.23)

In order to ensure a meaningful discretized residual, this information goes beyond (2.14), the information necessary for the Galerkin approximation (2.11) on the mesh \({\mathcal {M}}\); it is available, e.g., when one is able to compute the counterpart of (2.11) of order \(d+1\) over \({\mathcal {M}}\).

There are other possibilities to obtain a meaningful discretized residual. The following one fits particularly well with (2.14) in the context of mesh adaptivity. Suppose that we are given an initial mesh and a refinement procedure such that the set \({\mathbb {M}}\) of all refined meshes forms a shape-regular family. Furthermore, suppose that, for any mesh \({\mathcal {M}}\in {\mathbb {M}}\), there is a refinement \({{\widetilde{{\mathcal {M}}}}} \in {\mathbb {M}}\) with vertices \({\mathcal {V}}({{\widetilde{{\mathcal {M}}}}})\) that satisfies the following properties:

$$\begin{aligned} \begin{array}{ll} \forall {\widetilde{K}} \in {\widetilde{{\mathcal {M}}}} \; \exists K \in {\mathcal {M}}&\text {with}\quad {\widetilde{K}} \subset K \text { and } h_K \lesssim h_{{\widetilde{K}}}, \end{array} \end{aligned}$$
(3.24a)
$$\begin{aligned} \begin{array}{ll} \forall i \in {\mathcal {I}}({\mathcal {M}}) \; \exists {\widetilde{z}} \in {\mathcal {V}}({\widetilde{{\mathcal {M}}}})&\text {such that}\quad {\widetilde{z}} \text { is interior to } i. \end{array} \end{aligned}$$
(3.24b)

Let us now fix a mesh \({\mathcal {M}}\in {\mathbb {M}}\) and a refinement \({\widetilde{{\mathcal {M}}}}\in {\mathbb {M}}\) satisfying (3.24). For any \(i \in {\mathcal {I}}({\mathcal {M}})\), using (3.24b), we fix a vertex \({\widetilde{z}} \in {\mathcal {V}}({\widetilde{{\mathcal {M}}}})\) interior to i and denote by \({\widetilde{\phi }}_{{\widetilde{z}}}\) its associated hat function in \({\mathbb {V}}(\widetilde{{\mathcal {M}}})\). We then obtain counterparts \({\widetilde{\psi }}_i\), \(i \in {\mathcal {I}}\), of the test functions \(\psi _i\), \(i \in {\mathcal {I}}\), by using these hat functions with a suitable scaling in place of the element and face bubble functions in (3.19) such that the following lemma holds. We skip the technical details, referring to Morin, Nochetto and Siebert [17] and Veeser [24].

Lemma 14

(Another locally stable biorthogonal system) Together with the basis \(\chi _i\), \(i\in {\mathcal {I}}\), the test functions \({\widetilde{\psi }}_i\), \(i \in {\mathcal {I}}\), form a locally stable biorthogonal system:

  1. (i)

    We have

    $$\begin{aligned} \forall i,j \in {\mathcal {I}} \quad \left\langle \chi _i,\,{\widetilde{\psi }}_j\right\rangle = \delta _{ij}. \end{aligned}$$
  2. (ii)

    Let \({\mathcal {I}}_z = \{i\in {\mathcal {I}} \mid i \ni z \}\) denote the elements and faces containing a vertex \(z\in {\mathcal {V}}\). Then

    $$\begin{aligned} \forall i \in {\mathcal {I}}_z \quad \Vert \chi _i\Vert _{H^{-1}(\omega _z)} \Vert \nabla {\widetilde{\psi }}_i\Vert _{\omega _z} \leqslant C_{{{\tilde{\psi }}}}, \end{aligned}$$

    where the stability constant \(C_{{{\tilde{\psi }}}}\) only depends on d and the shape coefficient \(\sigma ({\mathcal {M}})\).

Thus, the operator

$$\begin{aligned} \widetilde{{\mathcal {P}}}_{\mathcal {M}}\ell \mathrel {:=}\sum _{i \in {\mathcal {I}}} \left\langle \ell ,\,{\widetilde{\psi }}_i\right\rangle \chi _i \end{aligned}$$
(3.25)

defines an alternative to \({\mathcal {P}}_{{\mathcal {M}}}\) and the properties (3.14) without (3.14b) can be established as for \({\mathcal {P}}_{{\mathcal {M}}}\). The operator \(\widetilde{{\mathcal {P}}}_{\mathcal {M}}\) can be evaluated on any mesh \({\mathcal {M}}\in {\mathbb {M}}\) whenever

$$\begin{aligned} \forall {\widetilde{{\mathcal {M}}}}\in {\mathbb {M}} \; \forall z\in {\mathcal {V}}_0({\widetilde{{\mathcal {M}}}}) \quad \left\langle f,\,{\widetilde{\phi }}_z\right\rangle \text { are known exactly,} \end{aligned}$$
(3.26)

where \(\{{\widetilde{\phi }}_z\}_{z\in {\mathcal {V}}_0({\widetilde{{\mathcal {M}}}})}\) denotes the nodal basis of \({\mathbb {V}}_0({\widetilde{{\mathcal {M}}}})\). This is exactly (2.14) for all meshes in \({\mathbb {M}}\). Consequently, it is also needed to ensure that an adaptive algorithm with the above refinement procedure can always compute the Galerkin approximation (2.11).

Let us now turn to the quantification of the discretized residual and verify (3.14b), considering a general locally stable biorthogonal system.

Theorem 15

(Quantifying local dual norms) Let \(\psi _i\), \(i\in {\mathcal {I}}\), be the test functions from Lemma 7 or Lemma 14. If \(\ell \in \underline{{\mathbb {D}}}({\mathcal {M}})\) on a star \(\omega _z\), then the corresponding local dual norm can be quantified by a finite number of evaluations:

$$\begin{aligned} \frac{1}{d+1} \sum _{i\in {\mathcal {I}}_z} \left| \left\langle \ell ,\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert _{}}\right\rangle \right| ^2 \leqslant \Vert \ell \Vert _{H^{-1}(\omega _z)}^2 \lesssim \sum _{i\in {\mathcal {I}}_z} \left| \left\langle \ell ,\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert _{ }}\right\rangle \right| ^2, \end{aligned}$$

where the hidden constant depends on d, \(\sigma ({\mathcal {M}})\), and \(C_\psi \).

Proof

Let us first prove the lower bound, which holds for an arbitrary functional \(\ell \in H^{-1}(\Omega )\). In fact, the definition of the dual norm readily yields

$$\begin{aligned} \left| \left\langle \ell ,\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert _{}}\right\rangle \right| \leqslant \Vert \ell \Vert _{H^{-1}({{\,\mathrm{supp}\,}}\psi _i) } \end{aligned}$$
(3.27)

for any \(i\in {\mathcal {I}}_z\). Notice that the essential supremum of \(x \mapsto \#\{i\in {\mathcal {I}}_z \mid {{\,\mathrm{supp}\,}}\psi _i \ni x \}\) is bounded by \(d+1\). Arguing as in the proof of Lemma 4 (ii), we therefore obtain

$$\begin{aligned} \sum _{i \in {\mathcal {I}}_z} \Vert \ell \Vert _{H^{-1}({\text {supp}}\psi _i)}^2 \leqslant (d+1) \Vert \ell \Vert _{H^{-1}(\omega _z)}^2 \end{aligned}$$
(3.28)

and the proof of the lower bound is finished.
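Spelled out, squaring (3.27), summing over \(i\in {\mathcal {I}}_z\), and inserting (3.28) gives the chain

$$\begin{aligned} \sum _{i\in {\mathcal {I}}_z} \left| \left\langle \ell ,\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert _{}}\right\rangle \right| ^2 \leqslant \sum _{i \in {\mathcal {I}}_z} \Vert \ell \Vert _{H^{-1}({{\,\mathrm{supp}\,}}\psi _i)}^2 \leqslant (d+1) \Vert \ell \Vert _{H^{-1}(\omega _z)}^2, \end{aligned}$$

which is the claimed lower bound after division by \(d+1\).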

To show the upper bound, we (need to) assume that \(\ell \in \underline{{\mathbb {D}}}({\mathcal {M}})\) on \(\omega _z\). Given \(v \in H^1_0(\omega _z)\), we can then write

$$\begin{aligned} \left\langle \ell ,\,v\right\rangle = \sum _{i\in {\mathcal {I}}_z} c_i \left\langle \chi _i,\,v\right\rangle \quad \text {with}\quad c_i \in {\mathbb {R}}. \end{aligned}$$

In light of the biorthogonality, we have \(c_i = \left\langle \ell ,\,\psi _i\right\rangle \). Using also the local stability of the biorthogonal system, we infer

$$\begin{aligned} |\left\langle \ell ,\,v\right\rangle |&\leqslant \sum _{i\in {\mathcal {I}}_z} |\left\langle \ell ,\,\psi _i\right\rangle \left\langle \chi _i,\,v\right\rangle | \\&\leqslant \sum _{i\in {\mathcal {I}}_z} \Vert \nabla \psi _i\Vert _{\omega _z} \Vert \chi _i\Vert _{H^{-1}(\omega _z)} \left| \left\langle \ell ,\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert _{ }}\right\rangle \right| \Vert \nabla v\Vert _{\omega _z} \\&\leqslant C_{\psi } \left( \sum _{i\in {\mathcal {I}}_z} \left| \left\langle \ell ,\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert _{ }}\right\rangle \right| \right) \Vert \nabla v\Vert _{\omega _z}. \end{aligned}$$

Since the solid angle of every simplex containing z is bounded away from 0 in terms of d and the shape coefficient \(\sigma ({\mathcal {M}})\), we have \(\#{\mathcal {I}}_z \leqslant C_{\sigma ({\mathcal {M}})}\). Consequently, the Cauchy–Schwarz inequality on the sum implies the desired upper bound. \(\square \)
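For illustration, the final Cauchy–Schwarz step of the preceding proof can be written out. The shorthand \(a_i\) is introduced here only for this purpose, and \(C_{\sigma ({\mathcal {M}})}\) denotes the bound on \(\#{\mathcal {I}}_z\) stated above. Taking the supremum over all \(v \in H^1_0(\omega _z)\) with \(\Vert \nabla v\Vert _{\omega _z} \leqslant 1\) in the last chain of inequalities, one arrives at

$$\begin{aligned} \Vert \ell \Vert _{H^{-1}(\omega _z)}^2 \leqslant C_\psi ^2 \Big ( \sum _{i\in {\mathcal {I}}_z} a_i \Big )^2 \leqslant C_\psi ^2 \, \#{\mathcal {I}}_z \sum _{i\in {\mathcal {I}}_z} a_i^2 \leqslant C_\psi ^2 \, C_{\sigma ({\mathcal {M}})} \sum _{i\in {\mathcal {I}}_z} a_i^2 \quad \text {with}\quad a_i \mathrel {:=}\left| \left\langle \ell ,\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert _{}}\right\rangle \right| . \end{aligned}$$

Under this reading, the hidden constant in the upper bound of Theorem 15 may be taken as \(C_\psi ^2 \, C_{\sigma ({\mathcal {M}})}\).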

Theorem 15 implies the missing (3.14b) for both operators \({\mathcal {P}}_{{\mathcal {M}}}\) and \(\widetilde{{\mathcal {P}}}_{\mathcal {M}}\) and, in accordance with Sect. 3.3, we have splittings of the local residual norms with the desired properties. Notice that, in view of the discussion of this section and Corollary 5, bounding the terms

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)} \text { or } \Vert \widetilde{{\mathcal {P}}}_{\mathcal {M}}f-f\Vert _{H^{-1}(\omega _z)} \end{aligned}$$

cannot be done in general with a finite number of evaluations of the load \(f\). Notably, these terms involve only the load, and the discretized residuals

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)} \text { or } \Vert \widetilde{{\mathcal {P}}}_{\mathcal {M}}f+\Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)} \end{aligned}$$

can be quantified with finite information, which, in light of Remark 3, is less than the information required for evaluating local \(L^2\)-norms of the load \(f\).

3.7 A posteriori error bounds

We now summarize our preceding results by deriving a posteriori error bounds. The resulting bounds are defined for any load \(f\in H^{-1}(\Omega )\) and the oscillation is dominated by the error.

The following statements remain correct if \({\mathcal {P}}_{{\mathcal {M}}}\) is replaced by \(\widetilde{{\mathcal {P}}}_{\mathcal {M}}\) from (3.25).

Theorem 16

(Abstract upper bound) For any functional \(f\in H^{-1}(\Omega )\) and any conforming mesh \({\mathcal {M}}\), we have

$$\begin{aligned} \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}^2 \lesssim \sum _{z\in {\mathcal {V}}} \Vert {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}^2 + \Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)}^2. \end{aligned}$$

Each local dual norm \(\Vert {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}\) of the discretized residual can be quantified with a finite number of evaluations of \(f\), while the quantification of the local dual norms \(\Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)}\) of the oscillation requires additional a priori information on \(f\).

Proof

Lemma 1, Lemma 4 and a triangle inequality imply the claimed bound. Recalling that

$$\begin{aligned} {\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}} \in \underline{{\mathbb {D}}}({\mathcal {M}}), \end{aligned}$$

Theorem 15 and Corollary 5 ensure the statements about the quantification of the two parts of the bound. \(\square \)

In contrast to previous results available in the literature, the complete upper bound in Theorem 16 is also a lower bound, even locally.

Theorem 17

(Abstract local lower bounds) For any functional \(f\in H^{-1}(\Omega )\) and any conforming mesh \({\mathcal {M}}\), the discretized residual and the oscillation are locally dominated by the error: for every vertex \(z\in {\mathcal {V}}\), we have

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)} \lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z} \end{aligned}$$

and

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}f- f\Vert _{H^{-1}(\omega _z)} \lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}. \end{aligned}$$

Proof

In light of (3.12), the first claimed inequality follows from the triangle inequality and the second one. The latter is a consequence of Theorems 8 and 10 and (3.12):

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}f- f\Vert _{H^{-1}(\omega _z)}&\leqslant \Vert {\mathcal {P}}_{{\mathcal {M}}}(f+ \Delta U_{f;{\mathcal {M}}})\Vert _{H^{-1}(\omega _z)} + \Vert f+ \Delta U_{f;{\mathcal {M}}} \Vert _{H^{-1}(\omega _z)} \\&\lesssim \Vert f+ \Delta U_{f;{\mathcal {M}}} \Vert _{H^{-1}(\omega _z)} \lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}. \end{aligned}$$

\(\square \)
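As a sketch of the first step of the preceding proof, assume, as its use there suggests, that (3.12) bounds the local residual norm \(\Vert f+ \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}\) by the local error. The triangle inequality then gives

$$\begin{aligned} \Vert {\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)} \leqslant \Vert {\mathcal {P}}_{{\mathcal {M}}}f- f\Vert _{H^{-1}(\omega _z)} + \Vert f+ \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)} \lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}, \end{aligned}$$

where the first term on the right-hand side is controlled by the second claimed inequality and the second term by (3.12).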

Squaring and summing, we readily get global lower bounds.

Corollary 18

(Abstract global lower bounds) For any functional \(f\in H^{-1}(\Omega )\) and any conforming mesh \({\mathcal {M}}\), the discretized residual and the oscillation are globally dominated by the error in that

$$\begin{aligned} \sum _{z\in {\mathcal {V}}} \Vert {\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}^2 \lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}^2 \end{aligned}$$

and

$$\begin{aligned} \sum _{z\in {\mathcal {V}}} \Vert {\mathcal {P}}_{{\mathcal {M}}}f- f\Vert _{H^{-1}(\omega _z)}^2 \lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}^2. \end{aligned}$$
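The summation step rests on the finite overlap of the stars. As a brief sketch, assume, as is customary, that each star \(\omega _z\) is the union of the elements containing \(z\). Every simplex \(K \in {\mathcal {M}}\) has \(d+1\) vertices and therefore lies in at most \(d+1\) stars, so that, for any \(w \in H^1_0(\Omega )\),

$$\begin{aligned} \sum _{z\in {\mathcal {V}}} \Vert \nabla w\Vert _{\omega _z}^2 = \sum _{K \in {\mathcal {M}}} \#({\mathcal {V}}\cap K) \, \Vert \nabla w\Vert _{K}^2 \leqslant (d+1) \Vert \nabla w\Vert _{}^2. \end{aligned}$$

Applying this with \(w = u_f- U_{f;{\mathcal {M}}}\) to the squared local bounds of Theorem 17 yields the two global estimates.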

To summarize: if we are able to quantify the oscillation terms \(\Vert {\mathcal {P}}_{{\mathcal {M}}}f- f\Vert _{H^{-1}(\omega _z)}\), \(z\in {\mathcal {V}}\), then the right-hand side in Theorem 16 is a truly equivalent a posteriori error estimator.

Remark 19

(Surrogate oscillation) The quantification of the local dual norms \(\Vert {\mathcal {P}}_{{\mathcal {M}}}f- f\Vert _{H^{-1}(\omega _z)}\), \(z\in {\mathcal {V}}\), of the oscillation appears to be a difficult matter. In [10, Section 7], Cohen, DeVore, and Nochetto consider similar terms for special \(f\) and resort to surrogates that can be approximated with the help of numerical integration. Those surrogates hinge on additional regularity of \(f\), which entails the risk of overestimation; cf. Lemma 20 below.

3.8 Classical versus error-dominated oscillation

In this section we compare the error-dominated oscillation

$$\begin{aligned} \left( \sum _{z\in {\mathcal {V}}} \Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)}^2\right) ^{1/2} \end{aligned}$$

with the classical \(L^2\)- and \(H^{-1}\)-oscillation,

$$\begin{aligned} {{\,\mathrm{osc}\,}}_0(f,{\mathcal {M}}) \quad \text {and}\quad \min _{g \in {\mathbb {P}}_0({\mathcal {M}})} \Vert f- g\Vert _{H^{-1}(\Omega )}, \end{aligned}$$

from (1.4) and (1.5) in the introduction. Doing so, we verify statements of the introduction and substantiate the advantages of the stability and invariance properties of the operator \({\mathcal {P}}_{{\mathcal {M}}}\).

Let us first show that the error-dominated oscillation is always smaller, up to a multiplicative constant, than both classical oscillations. To this end, let \(f\in H^{-1}(\Omega )\) and let \(g\in {\mathbb {P}}_0({\mathcal {M}})\) be an arbitrary piecewise constant approximation over \({\mathcal {M}}\). The local invariance and stability properties of \({\mathcal {P}}_{{\mathcal {M}}}\) in Theorems 8 and 10 imply that, for all \(z\in {\mathcal {V}}\),

$$\begin{aligned} \begin{aligned} \Vert f-{\mathcal {P}}_{{\mathcal {M}}}f\Vert _{H^{-1}(\omega _z)}&\le \Vert f-g\Vert _{H^{-1}(\omega _z)}+\Vert {\mathcal {P}}_{{\mathcal {M}}}(g-f)\Vert _{H^{-1}(\omega _z)} \\&\lesssim \Vert f-g\Vert _{H^{-1}(\omega _z)}. \end{aligned} \end{aligned}$$
(3.29)

Combining this with Lemma 4 (ii) and minimizing over g, we obtain the bound in terms of the classical \(H^{-1}\)-oscillation:

$$\begin{aligned} \sum _{z\in {\mathcal {V}}}\Vert f-{\mathcal {P}}_{{\mathcal {M}}}f\Vert _{H^{-1}(\omega _z)}^2\lesssim \min _{g \in {\mathbb {P}}_0({\mathcal {M}})} \Vert f-g\Vert _{H^{-1}(\Omega )}^2. \end{aligned}$$
(3.30a)

To show the other bound, suppose \(f\in L^2(\Omega )\). Making use of the orthogonality of \(P_{0,{\mathcal {M}}}\) and Poincaré inequalities in the elements of \(\omega _z\), we deduce

$$\begin{aligned} \Vert f- P_{0,{\mathcal {M}}}f\Vert _{H^{-1}(\omega _z)}^2 \lesssim \sum _{K \subset \omega _z} h_K ^2\Vert f-P_{0,{\mathcal {M}}}f\Vert _{K}^2, \end{aligned}$$

which together with (3.29) gives the bound in terms of the \(L^2\)-oscillation:

$$\begin{aligned} \sum _{z\in {\mathcal {V}}}\Vert f-{\mathcal {P}}_{{\mathcal {M}}}f\Vert _{H^{-1}(\omega _z)}^2 \lesssim \sum _{K \in {\mathcal {M}}}h_K ^2\Vert f-P_{0,{\mathcal {M}}}f\Vert _{K}^2={{\,\mathrm{osc}\,}}_0(f,{\mathcal {M}})^2. \end{aligned}$$
(3.30b)

The converse bounds of (3.30) do not hold. For the classical \(L^2\)-oscillation, this applies even on a fixed mesh and is in particular due to stability issues. The following lemma provides an illustration, relating directly to the error instead of the error-dominated oscillation.

Lemma 20

(Overestimation of classical \(L^2\)-oscillation) For any conforming mesh \({\mathcal {M}}\), there exists a sequence \((f_k)_{k}\subset L^2(\Omega )\) such that

$$\begin{aligned} \frac{{{\,\mathrm{osc}\,}}_0(f_k,{\mathcal {M}})}{ \Vert \nabla (u_{f_k} - U_{{f_k};{\mathcal {M}}})\Vert _{}} \rightarrow \infty \qquad \text {as}~k\rightarrow \infty . \end{aligned}$$

Proof

Choose \(f\in H^{-1}(\Omega )\setminus L^2(\Omega )\). Since \(L^2(\Omega )\) is dense in \(H^{-1}(\Omega )\), there exists a sequence \((f_k)_k\subset L^2(\Omega )\) such that \(f_k\rightarrow f\) in \(H^{-1}(\Omega )\). On the one hand, the energy norm errors \(\Vert \nabla (u_{f_k} - U_{{f_k};{\mathcal {M}}})\Vert _{}\) are uniformly bounded with respect to k. On the other hand, in view of \(\lim _{k\rightarrow \infty } \Vert f_k\Vert _{L^2(\Omega )}=\infty \), the oscillation \({{\,\mathrm{osc}\,}}_0(f_k,{\mathcal {M}})\) becomes arbitrarily large for \(k\rightarrow \infty \). \(\square \)

In the case of the classical \(H^{-1}\)-oscillation, (3.30a) cannot be inverted because of invariance issues. Let us illustrate this again by the relationship to the Galerkin error. Consider

$$\begin{aligned} f=-\Delta V \text { for some } V\in {\mathbb {V}}_0({\mathcal {M}}^\dagger )\setminus \{0\}, \end{aligned}$$
(3.31)

where \({\mathcal {M}}^\dagger \) is some conforming simplicial mesh of \(\Omega \). For any conforming refinement \({\mathcal {M}}\) of \({\mathcal {M}}^\dagger \), we then have \(u_f= V = U_{f;{\mathcal {M}}}\) and \(f\not \in {\mathbb {P}}_0({\mathcal {M}})\). Hence

$$\begin{aligned} \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{} = 0 < \min _{g\in {\mathbb {P}}_0({\mathcal {M}})} \Vert f- g\Vert _{H^{-1}(\Omega )}, \end{aligned}$$

where the classical \(H^{-1}\)-oscillation can be made arbitrarily large for a given \({\mathcal {M}}\) but decreases to 0 under suitable refinement. One could argue that the (neighborhoods of the) loads (3.31) are very special, in particular because the optimal convergence rate of (3.31) is formally \(\infty \). Here is another example based upon Cohen, DeVore, and Nochetto [10, Section 6.4], where the optimal nonlinear convergence rate for the error is finite and often encountered in practice.

Lemma 21

(Another overestimation of classical \(H^{-1}\)-oscillation) Let \(\Omega =(0,1)^2\). There is a functional \(f\in H^{-1}(\Omega )\) and a sequence \((L_n)_n\) with \(\log n \gtrsim L_n \rightarrow \infty \) as \(n\rightarrow \infty \) such that

$$\begin{aligned} \min _{\#{\mathcal {M}}\le n}\Vert \nabla (u_f- U_{{f};{\mathcal {M}}})\Vert _{} \lesssim n^{-1/2}, \end{aligned}$$
(3.32a)

and

$$\begin{aligned} \min _{\#{\mathcal {M}}\le n} \min _{g \in {\mathbb {P}}_0({\mathcal {M}})} \Bigg ( \sum _{z\in {\mathcal {V}}({\mathcal {M}})}\Vert f-g\Vert _{H^{-1}(\omega _z)}^2 \Bigg )^{1/2} \ge L_n\, n^{-1/2}, \end{aligned}$$
(3.32b)

where \({\mathcal {M}}\) varies in all meshes created by recursive or iterative newest vertex bisection of some conforming initial mesh \({\mathcal {M}}_0\) of \(\Omega \).

Proof

In [10, Section 6.4] Cohen, DeVore and Nochetto construct some function \(u_f\in H_0^1(\Omega )\) and a sequence \(L_n\) as claimed for which (3.32a) and

$$\begin{aligned} \min _{\#{\mathcal {M}}\le n} \Bigg ( \sum _{z\in {\mathcal {V}}({\mathcal {M}})} \Vert f\Vert _{H^{-1}(\omega _z)}^2 \Bigg )^{1/2} \ge L_n\, n^{-1/2} \end{aligned}$$
(3.33)

hold. It thus remains to establish (3.32b). To this end, we fix temporarily an arbitrary vertex \(z\in {\mathcal {V}}\) of a conforming mesh \({\mathcal {M}}\) and let \(g\in {\mathbb {P}}_0({\mathcal {M}})\). The inverse triangle inequality and (3.12) yield

$$\begin{aligned} \Vert f-g\Vert _{H^{-1}(\omega _z)}&\ge \Vert \Delta U_{f;{\mathcal {M}}} + g\Vert _{H^{-1}(\omega _z)} - \Vert f+ \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)} \\&\ge \Vert \Delta U_{f;{\mathcal {M}}} + g\Vert _{H^{-1}(\omega _z)} - \Vert \nabla (u_f-U_{f;{\mathcal {M}}})\Vert _{\omega _z}. \end{aligned}$$

By Lemma 7, we have, for all \(K \in {\mathcal {M}}\),

$$\begin{aligned}&\left\langle \Delta U_{f;{\mathcal {M}}},\,\psi _K \right\rangle = \sum _{F \in {\mathcal {F}}} J(U_{f;{\mathcal {M}}})|_F \int _F \chi _F \psi _K \,\mathrm {d}s = 0 \end{aligned}$$

and, for all \(F \in {\mathcal {F}}\) and \(K _1,K _2\in {\mathcal {M}}\) with \(K _1\cap K _2=F \),

$$\begin{aligned} \begin{aligned} \left\langle \Delta U_{f;{\mathcal {M}}} + g,\,\psi _F \right\rangle&= \int _F J( U_{f;{\mathcal {M}}})\psi _F \,\mathrm {d}s + \sum _{i=1,2} g|_{K _i}\int _{K _i}\chi _{K_i}\psi _F \,\mathrm {d}x \\&=\left\langle \Delta U_{f;{\mathcal {M}}},\,\psi _F \right\rangle . \end{aligned} \end{aligned}$$

Theorem 15 therefore implies

$$\begin{aligned} \Vert \Delta U_{f;{\mathcal {M}}} + g\Vert _{H^{-1}(\omega _z)}&\gtrsim \sum _{i\in {\mathcal {I}}_z\cap {\mathcal {F}}} \left| \left\langle \Delta U_{f;{\mathcal {M}}} + g,\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert }\right\rangle \right| \\&= \sum _{i\in {\mathcal {I}}_z\cap {\mathcal {F}}} \left| \left\langle \Delta U_{f;{\mathcal {M}}},\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert }\right\rangle \right| \\&= \sum _{i\in {\mathcal {I}}_z} \left| \left\langle \Delta U_{f;{\mathcal {M}}},\,\frac{\psi _i}{\Vert \nabla \psi _i\Vert }\right\rangle \right| \gtrsim \Vert \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}. \end{aligned}$$

Exploiting also Lemma 4, we arrive at

$$\begin{aligned}&\left( \sum _{z\in {\mathcal {V}}} \Vert \Delta U_{f;{\mathcal {M}}} + g\Vert _{H^{-1}(\omega _z)}^2 \right) ^{1/2}\\&\gtrsim \left( \sum _{z\in {\mathcal {V}}} \Vert \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}^2 \right) ^{1/2} \\&\gtrsim \left( \sum _{z\in {\mathcal {V}}} \Vert f\Vert _{H^{-1}(\omega _z)}^2 \right) ^{1/2} - \left( \sum _{z\in {\mathcal {V}}} \Vert f+\Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}^2 \right) ^{1/2} \\&\geqslant \left( \sum _{z\in {\mathcal {V}}} \Vert f\Vert _{H^{-1}(\omega _z)}^2 \right) ^{1/2} - C \, \Vert \nabla (u_f-U_{f;{\mathcal {M}}})\Vert _{}. \end{aligned}$$

Consequently, (3.32a) and (3.33) lead to

$$\begin{aligned} \min _{\#{\mathcal {M}}\le n} \min _{g\in {\mathbb {P}}_0({\mathcal {M}})} \left( \sum _{z\in {\mathcal {V}}} \Vert f- g\Vert _{H^{-1}(\omega _z)}^2 \right) ^{1/2} \geqslant (L_n-C)\,n^{-1/2}, \end{aligned}$$

which, upon redefining \((L_n)_n\), implies (3.32b) and the proof is finished. \(\square \)

Remark 22

(Overestimation of \(H^{-1}\)-variant of standard residual estimator) As pointed out by Cohen, DeVore, and Nochetto [10], the example of Lemma 21 entails that the right-hand side of

$$\begin{aligned} \Vert \nabla (u_f- U_{{f};{\mathcal {M}}})\Vert _{}^2 \lesssim \sum _{z\in {\mathcal {V}}({\mathcal {M}})}\Vert \Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}^2+\Vert f\Vert _{H^{-1}(\omega _z)}^2, \end{aligned}$$

a variant of the standard residual estimator defined for all loads \(f \in H^{-1}(\Omega )\), is overestimating. In Sect. 4.2 below, we propose through our new approach another variant that is free of overestimation.

4 Realizations with classical techniques

The a posteriori error bounds in Sect. 3.7 are abstract in that they are given in terms of the local dual norms \(\Vert \cdot \Vert _{H^{-1}(\omega _z)}\), \(z \in {\mathcal {V}}\), of the discretized residual and the oscillation. For the norms \(\Vert {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}\), \(z\in {\mathcal {V}}\), of the discretized residual, we required a quantification in terms of finite information on the load and provided a possible realization in Theorem 15. In this section we discuss a selection of alternative realizations. All realizations are motivated by classical approaches to a posteriori analysis and cover two explicit and two implicit techniques. It is worth making the following observations:

  • Hierarchical estimators and estimators based upon local problems implicitly introduce a splitting of the residual like the one proposed in Sect. 3.3.

  • The overestimation of the standard residual estimator in Remark 22 can be cured with the help of the splitting of the residual in Sect. 3.3.

  • Employing different local dual norms, the approach of Sect. 3 can be extended to estimators based on flux equilibration.

  • Each realization quantifies a local dual norm of the discretized residual by a computable, equivalent norm. Both equivalence and computability hinge on the finite-dimensional nature of the discretized residual.

4.1 A hierarchical estimator

Hierarchical estimators investigate the residual on an extension of the given finite element space. While higher order extensions were used originally, Bornemann, Erdmann, and Kornhuber show in [6] that an extension containing the functions

$$\begin{aligned} \lambda _K:=\prod _{z\in {\mathcal {V}}\cap K}\phi _z, \quad K \in {\mathcal {M}}, \qquad \text {and}\qquad \lambda _F:=\prod _{z\in {\mathcal {V}}\cap F}\phi _z, \quad F \in {\mathcal {F}}, \end{aligned}$$
(4.1)

already ensures reliability for piecewise constant loads \(f\in {\mathbb {P}}_0({\mathcal {M}})\). The indicators of a corresponding, ‘minimal’ hierarchical estimator are given by

$$\begin{aligned} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i) \mathrel {:=}\left| \left\langle {{\,\mathrm{Res}\,}}(f;{\mathcal {M}}),\,\frac{\lambda _i}{\Vert \nabla \lambda _i\Vert _{}}\right\rangle \right| , \quad i\in {\mathcal {I}}={\mathcal {M}}\cup {\mathcal {F}}, \end{aligned}$$

and computable in terms of \(U_{f;{\mathcal {M}}}\) and the evaluations \(\left\langle f,\,\lambda _i\right\rangle \), \(i \in {\mathcal {I}}\). This definition implies the constant-free local lower bounds

$$\begin{aligned} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i) \leqslant \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}({\text {supp}}\lambda _i)} \end{aligned}$$

and therefore, cf. (3.28), we have that, for every \(z\in {\mathcal {V}}\) and \({\mathcal {I}}_z= \{i\in {\mathcal {I}} \mid i \ni z \}\),

$$\begin{aligned} \left( \sum _{i\in {\mathcal {I}}_z} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i)^2 \right) ^{1/2} \leqslant \sqrt{d+1} \, \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\omega _z)}, \end{aligned}$$
(4.2)

which is a local counterpart of the global lower bound in Veeser [24, Lemma 3.3].

This estimator is very closely related to the discretized residuals of Sect. 3.4 and Theorem 15. Indeed, if \(K \in {\mathcal {M}}\) and \(F \in {\mathcal {F}}\), \(K _1,K _2\in {\mathcal {M}}\) such that \(F = K _1 \cap K _2\), we have

$$\begin{aligned} \psi _K = \frac{(2d+1)!}{d!|K |} \lambda _K \quad \text {and}\quad \psi _F = \frac{(2d-1)!}{(d-1)!|F |} \left( \lambda _F- (2d+1)\sum _{i=1}^2 \lambda _{K _i} \right) \end{aligned}$$
(4.3)

in view of (3.19). Hence \({\text {span}}\{\psi _i \mid i\in {\mathcal {I}}\} = {\text {span}}\{\lambda _i \mid i\in {\mathcal {I}}\}\) and Remark 12 yields \(\left\langle f,\,\lambda _i\right\rangle =\left\langle {\mathcal {P}}_{{\mathcal {M}}}f,\,\lambda _i\right\rangle \), \(i\in {\mathcal {I}}\), and the indicators may be viewed also as evaluations of the discretized residual: for \(i \in {\mathcal {I}}\),

$$\begin{aligned} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i) = \left| \left\langle {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f,{\mathcal {M}}},\,\frac{\lambda _i}{\Vert \nabla \lambda _i\Vert _{}}\right\rangle \right| . \end{aligned}$$

As a consequence, we also have the following counterpart of (4.2):

$$\begin{aligned} \left( \sum _{i\in {\mathcal {I}}_z} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i)^2 \right) ^{1/2} \leqslant \sqrt{d+1} \, \Vert {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f,{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}. \end{aligned}$$
(4.4)

In order to prove the converse bound, we may proceed with the help of \({\mathcal {P}}_{{\mathcal {M}}}^*\) as in [24]. However, having Theorem 15 at our disposal, it is simpler to exploit (4.3). We immediately see

$$\begin{aligned} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},K) = \left| \left\langle {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f,{\mathcal {M}}},\,\frac{\psi _K}{\Vert \nabla \psi _K \Vert _{}}\right\rangle \right| . \end{aligned}$$
(4.5a)

Moreover, given \(F \in {\mathcal {F}}\), \(K _1,K _2\in {\mathcal {M}}\) with \(F = K _1 \cap K _2\), we deduce

$$\begin{aligned} C_d |F |^{-1} \leqslant \max _F \psi _F \leqslant h_F \max _{K _1} | \nabla \psi _F | \lesssim h_F |K _1|^{-1/2} \Vert \nabla \psi _F \Vert _{K _1} \end{aligned}$$

with \(h_F \mathrel {:=}{{\,\mathrm{diam}\,}}F \) and, for \(i \in \{F, K _1, K _2 \}\),

$$\begin{aligned} \Vert \nabla \lambda _i \Vert _{\omega _F} \leqslant C_d \max _{j=1,2} \rho _{K _j}^{-1} |\omega _F |^{1/2}. \end{aligned}$$

We therefore obtain \(\Vert \nabla \psi _F \Vert _{}^{-1} \Vert \nabla \lambda _i\Vert _{} \lesssim |F |\) and

$$\begin{aligned} \left| \left\langle {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f,{\mathcal {M}}},\,\frac{\psi _F}{\Vert \nabla \psi _F \Vert _{}}\right\rangle \right| \lesssim \sum _{i\in \{F,K _1,K _2\}} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i). \end{aligned}$$
(4.5b)

Summing up, the hierarchical estimator quantifies the local discretized residual,

$$\begin{aligned} \sum _{i\in {\mathcal {I}}_z} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i)^2 \eqsim \Vert {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f,{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)}^2, \quad z\in {\mathcal {V}}, \end{aligned}$$

and we have the following a posteriori bounds.

Theorem 23

(Hierarchical estimator with error-dominated oscillation) For any functional \(f\in H^{-1}(\Omega )\) and any conforming mesh \({\mathcal {M}}\), we have the global equivalence

$$\begin{aligned} \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}^2 \eqsim \sum _{i\in {\mathcal {I}}} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i)^2 + \sum _{z\in {\mathcal {V}}} \Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)}^2, \end{aligned}$$

as well as the following local lower bounds: for every \(z \in {\mathcal {V}}\),

$$\begin{aligned} \sum _{i\in {\mathcal {I}}_z} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i)^2&\leqslant (d+1) \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}^2, \\ \Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)}^2&\lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}^2. \end{aligned}$$

The hidden constants depend only on d and \(\sigma ({\mathcal {M}})\).

Proof

Combine Theorem 16, Theorem 17, Corollary 18, (3.12), (4.5), and (4.2). \(\square \)

4.2 An improved standard residual estimator

The standard residual estimator applies suitably scaled norms to the jump and element residual; see, e.g., Verfürth [26, Section 1.4]. In the case of the discretized residual

$$\begin{aligned} {\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f,{\mathcal {M}}} = \sum _{F \in {\mathcal {F}}} \big ( \left\langle f,\,\psi _F \right\rangle +J( U_{f;{\mathcal {M}}})|_{F} \big ) \chi _F + \sum _{K \in {\mathcal {M}}} \left\langle f,\,\psi _K \right\rangle \chi _K, \end{aligned}$$

this leads to the following indicators:

$$\begin{aligned} {\mathcal {E}}_{\mathrm {R}}(U_{f;{\mathcal {M}}},{\mathcal {P}}_{{\mathcal {M}}}f,F)&\mathrel {:=}h_F ^{1/2} \Vert \left\langle f,\,\psi _F \right\rangle +J( U_{f;{\mathcal {M}}})\Vert _{F},&F \in {\mathcal {F}}, \\ {\mathcal {E}}_{\mathrm {R}}(U_{f;{\mathcal {M}}},{\mathcal {P}}_{{\mathcal {M}}}f,K)&\mathrel {:=}h_K \Vert \left\langle f,\,\psi _K \right\rangle \Vert _{K},&K \in {\mathcal {M}}, \end{aligned}$$

where \(h_F \) and \(h_K \) denote, respectively, the diameters of \(F \) and \(K \) and computability is given in terms of \(U_{f;{\mathcal {M}}}\) and (3.23).

These indicators actually quantify the discretized residual, and in a way that is closely tied to Theorem 15: for any interelement face \(F \in {\mathcal {F}}\),

$$\begin{aligned} {\mathcal {E}}_{\mathrm {R}}(U_{f;{\mathcal {M}}},{\mathcal {P}}_{{\mathcal {M}}}f,F) \eqsim \left| \left\langle {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}},\,\frac{\psi _F}{\Vert \nabla \psi _F \Vert _{}}\right\rangle \right| \end{aligned}$$
(4.6a)

and, for any element \(K \in {\mathcal {M}}\),

$$\begin{aligned} {\mathcal {E}}_{\mathrm {R}}(U_{f;{\mathcal {M}}},{\mathcal {P}}_{{\mathcal {M}}}f,K) \eqsim \left| \left\langle {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}},\,\frac{\psi _K}{\Vert \nabla \psi _K \Vert _{}}\right\rangle \right| , \end{aligned}$$
(4.6b)

where the hidden constants depend only on d and \(\sigma ({\mathcal {M}})\). To see (4.6a), let \(F \in {\mathcal {F}}\) be any interelement face. Lemma 7 (i), the trace inequality (3.18) for \(w=\psi _F ^2\) and the Friedrichs inequality (3.17) for \(v=\psi _F \), both with \(\omega _F \) in place of \(\omega _z\), give

$$\begin{aligned}&\left| \left\langle {\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}},\,\frac{\psi _F}{\Vert \nabla \psi _F \Vert _{}}\right\rangle \right| = \left| \left\langle \big (\left\langle f,\,\psi _F \right\rangle + J( U_{f;{\mathcal {M}}})|_F \big )\,\chi _F,\,\frac{\psi _F}{\Vert \nabla \psi _F \Vert _{}}\right\rangle \right| \\&\qquad \leqslant \Vert \left\langle f,\,\psi _F \right\rangle +J( U_{f;{\mathcal {M}}})\Vert _{F} \, \frac{\Vert \psi _F \Vert _{F}}{\Vert \nabla \psi _F \Vert _{}} \leqslant h_F ^{1/2}\Vert \left\langle f,\,\psi _F \right\rangle +J( U_{f;{\mathcal {M}}})\Vert _{F}, \end{aligned}$$

while (3.20) yields \(\Vert \nabla \psi _F \Vert _{\Omega } \lesssim (h_F |F |)^{-1/2}\) and so

$$\begin{aligned} \left| \left\langle {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}},\,\frac{\psi _F}{\Vert \nabla \psi _F \Vert _{}}\right\rangle \right|&=\frac{\Vert \left\langle f,\,\psi _F \right\rangle +J( U_{f;{\mathcal {M}}})\Vert _{F} }{\left| F \right| ^{1/2}\Vert \nabla \psi _F \Vert _{}} \\&\gtrsim h_F ^{1/2} \Vert \left\langle f,\,\psi _F \right\rangle +J( U_{f;{\mathcal {M}}})\Vert _{F}. \end{aligned}$$

Similarly, we obtain (4.6b).

Inserting the combination of Theorem 15 and (4.6) in the abstract a posteriori analysis of Sect. 3.7, we obtain the following result.

Theorem 24

(Standard residual estimator with error-dominated oscillation) For any functional \(f\in H^{-1}(\Omega )\) and any conforming mesh \({\mathcal {M}}\), we have the global equivalence

$$\begin{aligned} \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}^2 \eqsim \sum _{i\in {\mathcal {I}}} {\mathcal {E}}_{\mathrm {R}}(U_{f;{\mathcal {M}}},{\mathcal {P}}_{{\mathcal {M}}}f,i)^2 + \sum _{z\in {\mathcal {V}}} \Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)}^2, \end{aligned}$$

as well as the following local lower bounds: for \(z\in {\mathcal {V}}\),

$$\begin{aligned} \sum _{i\in {\mathcal {I}}_z}{\mathcal {E}}_{\mathrm {R}}(U_{f;{\mathcal {M}}},{\mathcal {P}}_{{\mathcal {M}}}f,i)^2 + \Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)}^2 \lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}^2. \end{aligned}$$

The hidden constants depend only on d and \(\sigma ({\mathcal {M}})\).

Theorem 24 relies on key features of the approach in Sect. 3, which the following remark elaborates on.

Remark 25

(Classical vs new standard residual estimator) In contrast to the classical standard residual estimator (1.2) and its \(H^{-1}\)-variant in Remark 22, the variant of Theorem 24 is completely equivalent to the error. The reason for this improvement lies in a suitable correction of the original jump residual. To elucidate this, remember that both the classical standard residual estimator and its \(H^{-1}\)-variant in Remark 22 do not discretize the residual and therefore compare them to

$$\begin{aligned} \sum _{F \in {\mathcal {F}}} h_F \Vert J(U_{f;{\mathcal {M}}}) + \left\langle f,\,\psi _F \right\rangle \Vert _{F}^2 + \sum _{z\in {\mathcal {V}}} \Big \Vert f-\sum _{F \in {\mathcal {F}}} \left\langle f,\,\psi _F \right\rangle \chi _F \Big \Vert ^2_{H^{-1}(\omega _z)}, \end{aligned}$$

which also does not split off an infinite-dimensional part of the load \(f\). The corrections \(\left\langle f,\,\psi _F \right\rangle \), \(F \in {\mathcal {F}}\), of the jump residual make sure that the new jump residual has the invariance properties necessary for avoiding overestimation, i.e., it vanishes whenever the exact solution happens to be discrete. Corrections with this property have been used previously. For example, Nochetto [20] considers the special case \(f= f_1 +{{\,\mathrm{div}\,}}\varvec{f_2}\), where \(f_1,\varvec{f_2}\) are suitable functions, and assigns \(({{\,\mathrm{div}\,}}\varvec{f_2})|_K \), \(K \in {\mathcal {M}}\), to the element residual, while the jumps in the normal trace of \(\varvec{f_2}\) across interelement sides correct the jump residual. Similarly, in standard residual estimators for the Stokes problem, pressure jumps correct the jump residual associated with the velocity. The novelty is that the corrections \(\left\langle f,\,\psi _F \right\rangle \), \(F \in {\mathcal {F}}\), are defined for an arbitrary \(f\in H^{-1}(\Omega )\) and are also locally \(H^{-1}\)-stable and so fulfill the second necessary condition to avoid local overestimation. Notably, the latter entails that, even if \(f\) is a smooth function, the jump residual will be corrected significantly in certain cases.

4.3 An estimator based on local problems

A local problem lifts the residual to a local extension of the given finite element space and so provides a local correction, the norm of which is used as an error indicator; cf. Babuška and Rheinboldt [4]. While computability requires finite-dimensional extensions, the higher cost with respect to the previous explicit estimators is tied up with the hope of improved accuracy.

The following instance from Verfürth [26, Section 1.7.1 and Remark 1.21] is vertex-based and uses the local extensions

$$\begin{aligned} {\mathbb {U}}_z \mathrel {:=}{\text {span}}\{\lambda _i \mid i\in {\mathcal {I}}_z\} = {\text {span}}\{\psi _i \mid i\in {\mathcal {I}}_z\}, \quad z\in {\mathcal {V}}, \end{aligned}$$

where the functions \(\psi _i\) and \(\lambda _i\) are defined, respectively, in (3.19) and (4.1). Given a vertex \(z\in {\mathcal {V}}\), the indicator is then

$$\begin{aligned} {\mathcal {E}}_{\mathrm {L}}(f,{\mathcal {M}},z) \mathrel {:=}\Vert \nabla \nu _z\Vert _{}, \end{aligned}$$

where

$$\begin{aligned} \nu _z\in {\mathbb {U}}_z \quad \text {such that}\quad \forall \lambda \in {\mathbb {U}}_z \quad \int _\Omega \nabla \nu _z\cdot \nabla \lambda \,\mathrm {d}x = \left\langle {{\,\mathrm{Res}\,}}(f;{\mathcal {M}}) ,\,\lambda \right\rangle . \end{aligned}$$

Thus, \(\nu _z\) is computable in terms of \(U_{f;{\mathcal {M}}}\) and, e.g., (3.23). The indicator \({\mathcal {E}}_{\mathrm {L}}(f,{\mathcal {M}},z)\) may be viewed as an implicit counterpart of \((\sum _{i \in {\mathcal {I}}_z} {\mathcal {E}}_{\mathrm {H}}(f,{\mathcal {M}},i)^2)^{1/2}\) from §4.1. Taking \(\lambda =\nu _z\), we immediately obtain the constant-free lower bound

$$\begin{aligned} {\mathcal {E}}_{\mathrm {L}}(f,{\mathcal {M}},z) \leqslant \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\omega _z)}, \end{aligned}$$
(4.7)

which slightly improves upon (4.2).
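
For convenience, the step behind (4.7) can be written out as follows. This is only a sketch: it assumes that every function in \({\mathbb {U}}_z\) is supported in \(\omega _z\) and vanishes on \(\partial \omega _z\), so that \(\nu _z\in H^1_0(\omega _z)\), and that \(H^{-1}(\omega _z)\) carries the norm dual to \(\Vert \nabla \cdot \Vert _{\omega _z}\). Taking \(\lambda =\nu _z\) in the local problem then gives

$$\begin{aligned} \Vert \nabla \nu _z\Vert _{}^2 = \int _\Omega \nabla \nu _z\cdot \nabla \nu _z \,\mathrm {d}x = \left\langle {{\,\mathrm{Res}\,}}(f;{\mathcal {M}}) ,\,\nu _z \right\rangle \leqslant \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\omega _z)} \Vert \nabla \nu _z\Vert _{\omega _z} = \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\omega _z)} \Vert \nabla \nu _z\Vert _{}. \end{aligned}$$

Dividing by \(\Vert \nabla \nu _z\Vert _{}\), the case \(\nu _z=0\) being trivial, yields (4.7) with constant 1.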

Notice that, in light of Remark 12, the solution \(\nu _z\) can be interpreted also as a lift of the discretized residual \({\mathcal {P}}_{{\mathcal {M}}}f+ \Delta U_{f;{\mathcal {M}}}\). Consequently, the first inequality in

$$\begin{aligned} {\mathcal {E}}_{\mathrm {L}}(f,{\mathcal {M}},z) \leqslant \Vert {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}}\Vert _{H^{-1}(\omega _z)} \lesssim {\mathcal {E}}_{\mathrm {L}}(f,{\mathcal {M}},z) \end{aligned}$$
(4.8)

is correct. The second one follows from Remark 13 and Theorem 10 in the spirit of Morin, Nochetto and Siebert [18]. In fact, for \(v\in H^1_0(\omega _z)\), we have

$$\begin{aligned} \left\langle {\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}},\,v\right\rangle&= \left\langle {{\,\mathrm{Res}\,}}(f;{\mathcal {M}}),\,{\mathcal {P}}_{{\mathcal {M}}}^*v\right\rangle = \int _{\omega _z} \nabla \nu _z\cdot \nabla {\mathcal {P}}_{{\mathcal {M}}}^{*} v \,\mathrm {d}x \\&\leqslant \Vert \nabla \nu _z\Vert _{} \Vert \nabla {\mathcal {P}}_{{\mathcal {M}}}^* v\Vert _{\omega _z} \lesssim {\mathcal {E}}_{\mathrm {L}}(f,{\mathcal {M}},z) \Vert \nabla v\Vert _{\omega _z}. \end{aligned}$$

Theorem 26

(Estimator based on local problems with error-dominated oscillation) For any functional \(f\in H^{-1}(\Omega )\) and any conforming mesh \({\mathcal {M}}\), we have the global equivalence

$$\begin{aligned} \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}^2 \eqsim \sum _{z\in {\mathcal {V}}} \left( {\mathcal {E}}_{\mathrm {L}}(f,{\mathcal {M}},z)^2 + \Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)}^2 \right) , \end{aligned}$$

as well as the following local lower bounds: for every \(z\in {\mathcal {V}}\),

$$\begin{aligned}&{\mathcal {E}}_{\mathrm {L}}(f,{\mathcal {M}},z) \leqslant \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z} \\&\qquad \qquad \qquad \qquad \text {and}\; \Vert {\mathcal {P}}_{{\mathcal {M}}}f-f\Vert _{H^{-1}(\omega _z)} \lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}. \end{aligned}$$

The hidden constants depend only on d and \(\sigma ({\mathcal {M}})\).

Proof

Combine Theorem 16, Theorem 17, Corollary 18, (3.12), (4.7) and (4.8). \(\square \)
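
To make the first local lower bound of Theorem 26 concrete, the following Python sketch checks it numerically in one space dimension. It is an illustration under simplifying assumptions and not the construction of this article: \(\Omega =(0,1)\), the load \(f\) is a smooth function, the mesh is uniform, and the local space at a vertex \(z\) is spanned by the quadratic element bubbles on the elements of \(\omega _z\), which serve as a hypothetical stand-in for the functions \(\lambda _i\) from (4.1). The names `solve_p1` and `local_problem_check` are introduced only for this sketch.

```python
import numpy as np

# Gauss-Legendre rule with 5 points, mapped from [-1, 1] to the reference element [0, 1].
_GP, _GW = np.polynomial.legendre.leggauss(5)
T = 0.5 * (_GP + 1.0)
W = 0.5 * _GW


def solve_p1(f, n):
    """P1 Galerkin approximation of -u'' = f on (0, 1), u(0) = u(1) = 0,
    on a uniform mesh with n elements (hypothetical helper for this sketch)."""
    h = 1.0 / n
    x = np.linspace(0.0, 1.0, n + 1)
    # Stiffness matrix restricted to the n - 1 interior vertices.
    A = (2.0 * np.eye(n - 1) - np.eye(n - 1, k=1) - np.eye(n - 1, k=-1)) / h
    # Load vector, assembled element by element with Gauss quadrature.
    b = np.zeros(n + 1)
    for k in range(n):
        fq = f(x[k] + h * T)
        b[k] += h * np.sum(W * fq * (1.0 - T))
        b[k + 1] += h * np.sum(W * fq * T)
    U = np.zeros(n + 1)
    U[1:-1] = np.linalg.solve(A, b[1:-1])
    return x, U


def local_problem_check(f, du, n):
    """Return the vertex indicators ||nu_z'|| and the local errors
    ||(u - U)'|| on the patches omega_z; du is the exact derivative u'."""
    x, U = solve_p1(f, n)
    h = 1.0 / n
    bubble = 4.0 * T * (1.0 - T)          # element bubble at the quadrature points
    bubble_energy = 16.0 / (3.0 * h)      # integral of |b_K'|^2 over one element
    ind2 = np.zeros(n)                    # squared bubble contribution per element
    err2 = np.zeros(n)                    # squared energy error per element
    for k in range(n):
        xq = x[k] + h * T
        # <Res, b_K> = int_K f b_K, because U' is constant on K and b_K
        # vanishes at both end points of K.
        res_bK = h * np.sum(W * f(xq) * bubble)
        ind2[k] = res_bK**2 / bubble_energy
        slope = (U[k + 1] - U[k]) / h
        err2[k] = h * np.sum(W * (du(xq) - slope) ** 2)
    # Bubbles of different elements are energy-orthogonal, so the local problem
    # decouples; add up the (at most two) element contributions at each vertex.
    pad_ind = np.concatenate(([0.0], ind2, [0.0]))
    pad_err = np.concatenate(([0.0], err2, [0.0]))
    E_L = np.sqrt(pad_ind[:-1] + pad_ind[1:])
    err_loc = np.sqrt(pad_err[:-1] + pad_err[1:])
    return E_L, err_loc


if __name__ == "__main__":
    f = lambda x: np.pi**2 * np.sin(np.pi * x)   # exact solution u(x) = sin(pi x)
    du = lambda x: np.pi * np.cos(np.pi * x)
    E_L, err_loc = local_problem_check(f, du, 16)
    print(np.max(E_L / err_loc))
```

In this setting the local problem at each vertex decouples into one scalar equation per element, so no local linear system has to be solved. Up to quadrature error, the returned indicators should not exceed the corresponding local errors, in line with the constant-free bound \({\mathcal {E}}_{\mathrm {L}}(f,{\mathcal {M}},z) \leqslant \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}\).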

4.4 An estimator based on flux equilibration

While indicators based on local problems provide constant-free local lower bounds, estimators based on flux equilibration aim for a constant-free, or at least explicit, global upper bound. This is achieved with the help of other, more sophisticated liftings within the framework of the fundamental theorem of Prager and Synge [22], which, for the homogeneous Dirichlet problem (1.1), can be formulated as follows: For any \(v\in H_0^1(\Omega )\), we have

$$\begin{aligned} \Vert \nabla (v-u) \Vert _{}=\min \left\{ \Vert \varvec{\xi }\Vert _{}\mid \varvec{\xi }\in L^2(\Omega ;{\mathbb {R}}^d)~\text {with}~ {{\,\mathrm{div}\,}}\varvec{\xi } =\Delta v+f~\text {in}~ H^{-1}(\Omega )\right\} . \end{aligned}$$
(4.9)
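
The inequality "\(\leqslant \)" in (4.9) rests on a short integration-by-parts argument, sketched here for convenience; the shorthand \(w\mathrel {:=}v-u\in H^1_0(\Omega )\) is introduced only for this sketch. Since \(-\Delta u=f\), we have \(\Delta w=\Delta v+f\) in \(H^{-1}(\Omega )\), so that, for any \(\varvec{\xi }\in L^2(\Omega ;{\mathbb {R}}^d)\) with \({{\,\mathrm{div}\,}}\varvec{\xi }=\Delta v+f\),

$$\begin{aligned} \Vert \nabla w\Vert _{}^2 = -\left\langle \Delta w,\,w\right\rangle = -\left\langle {{\,\mathrm{div}\,}}\varvec{\xi },\,w\right\rangle = \int _\Omega \varvec{\xi }\cdot \nabla w\,\mathrm {d}x \leqslant \Vert \varvec{\xi }\Vert _{}\Vert \nabla w\Vert _{}. \end{aligned}$$

Hence \(\Vert \nabla (v-u)\Vert _{}\leqslant \Vert \varvec{\xi }\Vert _{}\) for every admissible \(\varvec{\xi }\), while the choice \(\varvec{\xi }=\nabla (v-u)\) is admissible and attains equality.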

Realizations of this idea in Ainsworth [1], Braess and Schöberl [8], Ern, Smears and Vohralík [12, 13], and Luce and Wohlmuth [16] make use of some classical oscillation. Its replacement by an error-dominated oscillation requires some adjustment to the approach of Sect. 3.

The upper bound in the localization of Lemma 4 involves a non-explicit multiplicative constant. In order to improve on this, we replace the local spaces \(H^1_0(\omega _z)\), \(z\in {\mathcal {V}}\), with

$$\begin{aligned} H_z \mathrel {:=}{\left\{ \begin{array}{ll} \{v \in H^1(\omega _z) \mid \int _{\omega _z}v\,\mathrm {d}x =0\}, &{}\text {if }z\in {\mathcal {V}}_0 = {\mathcal {V}}\cap \Omega , \\ \{v\in H^1(\omega _z) \mid v|_{\partial \omega _z\cap \partial \Omega }=0\}, &{}\text {if } z \in {\mathcal {V}}\setminus {\mathcal {V}}_0, \end{array}\right. } \end{aligned}$$

equip them with the norm \(\Vert \nabla \cdot \Vert _{\omega _z}\), and denote the respective dual spaces by \(H_z^*\).

Lemma 27

(Alternative localization) Let \(\ell \in H^{-1}(\Omega )\) be any functional.

(i) If \(\ell \in {\mathcal {R}}_{\mathcal {M}}\), then

$$\begin{aligned} \Vert \ell \Vert _{H^{-1}(\Omega )}^2 \le (d+1)\sum _{z\in {\mathcal {V}}} \Vert \phi _z\ell \Vert _{H^{*}_z}^2. \end{aligned}$$

(ii) We have

$$\begin{aligned} \sum _{z\in {\mathcal {V}}} \Vert \phi _z\ell \Vert _{H^{*}_z}^2 \lesssim \Vert \ell \Vert _{H^{-1}(\Omega )}^2, \end{aligned}$$

where the hidden constant depends only on d and the shape coefficient \(\sigma ({\mathcal {M}})\).

Proof

The proof is essentially a regrouping of the arguments proving Lemma 4, where (3.3) slips into the proof of (ii); cf. Canuto et al. [9, Proposition 3.1]. \(\square \)

Splitting the residual into discretized residual and oscillation, we then obtain the following abstract error bounds; we do not state the global lower bound, as it is an immediate consequence of the local one.

Lemma 28

(Alternative abstract error bounds) For any functional \(f\in H^{-1}(\Omega )\) and any conforming mesh \({\mathcal {M}}\), we have the global upper bound

$$\begin{aligned}&\Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}^2 \\&\quad \leqslant (d+1) \sum _{z\in {\mathcal {V}}}\left( \Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}})\Vert _{H_z^*}+ \Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f-f)\Vert _{H_z^*}\right) ^2, \end{aligned}$$

as well as the following local lower bounds: for every vertex \(z\in {\mathcal {V}}\),

$$\begin{aligned} \Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}})\Vert _{H_z^*} + \Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f-f)\Vert _{H_z^*} \lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}. \end{aligned}$$

The hidden constants depend only on d and \(\sigma ({\mathcal {M}})\).

Proof

The global upper bound follows from Lemma 27 (i) and the triangle inequality. To prove the local lower bounds, we recall Theorem 17 and take \(\ell ={\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}}\) and \(\ell = {\mathcal {P}}_{{\mathcal {M}}}f-f\) in

$$\begin{aligned} \left\langle \phi _z\ell ,\,v_z\right\rangle =\left\langle \ell ,\,\phi _zv_z\right\rangle \le \Vert \ell \Vert _{H^{-1}(\omega _z)} \Vert \nabla (v_z\phi _z)\Vert _{\omega _z} \lesssim \Vert \ell \Vert _{H^{-1}(\omega _z)}\Vert \nabla v_z\Vert _{\omega _z}, \end{aligned}$$
(4.10)

which exploits (3.3) for \(v_z\in H_z\) and \(z\in {\mathcal {V}}\). \(\square \)
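
Written out, the global upper bound is the following chain. The sketch assumes the identity \(\Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}=\Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\Omega )}\) with \({{\,\mathrm{Res}\,}}(f;{\mathcal {M}})=f+\Delta U_{f;{\mathcal {M}}}\) and that \({{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\in {\mathcal {R}}_{\mathcal {M}}\), as the application of Lemma 27 (i) requires; both are taken from Sect. 3 and not re-derived here.

$$\begin{aligned} \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}^2 = \Vert {{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H^{-1}(\Omega )}^2&\leqslant (d+1)\sum _{z\in {\mathcal {V}}} \Vert \phi _z{{\,\mathrm{Res}\,}}(f;{\mathcal {M}})\Vert _{H_z^*}^2 \\&\leqslant (d+1)\sum _{z\in {\mathcal {V}}} \left( \Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}})\Vert _{H_z^*} + \Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f-f)\Vert _{H_z^*}\right) ^2. \end{aligned}$$

The second step uses the splitting \(\phi _z{{\,\mathrm{Res}\,}}(f;{\mathcal {M}}) = \phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}}) - \phi _z({\mathcal {P}}_{{\mathcal {M}}}f-f)\) and the triangle inequality in \(H_z^*\).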

In order to quantify the local discretized residual, we construct local equilibrated fluxes following the ideas of Braess, Pillwein, and Schöberl [7] and Ern, Smears, and Vohralík [12]. To this end, fix any vertex \(z\in {\mathcal {V}}\) and define the operator \(\pi _z:\{ \phi _z\ell \mid \ell \in H^{-1}(\Omega )\} \rightarrow H_z^*\) by

$$\begin{aligned} \pi _z \big ( \phi _z\ell \big ) := {\left\{ \begin{array}{ll} \phi _z\ell - \frac{\left\langle \phi _z\ell ,\,1\right\rangle }{|\omega _z|} &{}\text {if}~z\in {\mathcal {V}}_0, \\ \phi _z\ell &{}\text {if }z\in {\mathcal {V}}\setminus {\mathcal {V}}_0. \end{array}\right. } \end{aligned}$$
(4.11)

We emphasize that \(\pi _z\big (\phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}})\big )\) can be computed in terms of \(U_{f;{\mathcal {M}}}\) and (3.23). Thanks to the definition of the spaces \(H_z\), \(z\in {\mathcal {V}}\), and the general form of the theorem of Prager and Synge (see, e.g., Verfürth [26, Proposition 1.40]), we have

$$\begin{aligned} \Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}})\Vert _{H_z^*} = \Vert \pi _z\phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}})\Vert _{H_z^*} = \min _{\varvec{\xi }\in {\mathbb {W}}_z}\Vert \varvec{\xi }\Vert _{\omega _z} \end{aligned}$$
(4.12)

with the affine space

$$\begin{aligned} {\mathbb {W}}_z := \big \{ \varvec{\xi }\in L^2(\omega _z;{\mathbb {R}}^d) \mid&{{\,\mathrm{div}\,}}\varvec{\xi } = \pi _z\big (\phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}})\big )\in H_z^* \\&\text {and}~\varvec{\xi }\cdot \varvec{n}= 0~\text {on}~\partial \omega _z~\text {if}~z\in {\mathcal {V}}_0 \\&\text {and}~\varvec{\xi }\cdot \varvec{n}= 0~\text {on}~\partial \omega _z\setminus \partial \Omega ~\text {if}~z\in {\mathcal {V}}\setminus {\mathcal {V}}_0 \big \}, \end{aligned}$$

where the equalities in the definition of \({\mathbb {W}}_z\) have to be understood in the sense of distributions; the space \({\mathbb {W}}_z\) is not empty since \(\left\langle \pi _z\big (\phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}})\big ),\,1\right\rangle =0\) for every \(z\in {\mathcal {V}}_0\).
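
Two short computations explain the role of \(\pi _z\) for an interior vertex \(z\in {\mathcal {V}}_0\). They are sketched under the convention, implicit in (4.11), that a constant \(c\) acts on \(v\in H_z\) by \(\left\langle c,\,v\right\rangle = c\int _{\omega _z}v\,\mathrm {d}x\). First, every \(v\in H_z\) has vanishing mean, so subtracting a constant does not change the action on \(H_z\):

$$\begin{aligned} \left\langle \pi _z(\phi _z\ell ),\,v\right\rangle = \left\langle \phi _z\ell ,\,v\right\rangle - \frac{\left\langle \phi _z\ell ,\,1\right\rangle }{|\omega _z|}\int _{\omega _z}v\,\mathrm {d}x = \left\langle \phi _z\ell ,\,v\right\rangle , \quad v\in H_z, \end{aligned}$$

which gives the first equality in (4.12). Second, the constant is chosen such that

$$\begin{aligned} \left\langle \pi _z(\phi _z\ell ),\,1\right\rangle = \left\langle \phi _z\ell ,\,1\right\rangle - \frac{\left\langle \phi _z\ell ,\,1\right\rangle }{|\omega _z|}\,|\omega _z| = 0. \end{aligned}$$

This is the compatibility condition for \({{\,\mathrm{div}\,}}\varvec{\xi }=\pi _z(\phi _z\ell )\) with \(\varvec{\xi }\cdot \varvec{n}=0\) on \(\partial \omega _z\), because the divergence theorem formally gives \(\left\langle {{\,\mathrm{div}\,}}\varvec{\xi },\,1\right\rangle = \int _{\partial \omega _z}\varvec{\xi }\cdot \varvec{n}\,\mathrm {d}s = 0\).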

In order to introduce a discrete counterpart of \({\mathbb {W}}_z\) in (4.12), we employ the Raviart-Thomas-Nédélec spaces

$$\begin{aligned} \text {RTN}(K) := \{ \varvec{\Xi } : {K}\rightarrow {\mathbb {R}}^d \mid \varvec{\Xi }(x)=\varvec{a}+b x~\text {for some}~ \varvec{a}\in {\mathbb {P}}_1^d, b\in {\mathbb {P}}_1 \}, \quad K \in {\mathcal {M}}, \end{aligned}$$

and define

$$\begin{aligned} {\mathbb {W}}_z({\mathcal {M}}) \mathrel {:=}\big \{ \varvec{\Xi }\in L^2(\omega _z;{\mathbb {R}}^d) \mid \&\varvec{\Xi }|_K \in \text {RTN}(K) \text { for all }K \in {\mathcal {M}}~\text {with}~K \subset \omega _z \\&\text {and}~{{\,\mathrm{div}\,}}\varvec{\Xi } = \pi _z\big (\phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}}) \big )\in H_z^* \\&\text {and}~\varvec{\Xi }\cdot \varvec{n}= 0~\text {on}~\partial \omega _z~\text {if}~z\in {\mathcal {V}}_0 \\&\text {and}~\varvec{\Xi }\cdot \varvec{n}= 0~\text {on}~\partial \omega _z\setminus \partial \Omega ~\text {if}~z\in {\mathcal {V}}\setminus {\mathcal {V}}_0\big \}, \end{aligned}$$

which satisfies

$$\begin{aligned} \min _{\varvec{\Xi }\in {\mathbb {W}}_z({\mathcal {M}})} \Vert \varvec{\Xi }\Vert _{\omega _z} \lesssim \min _{\varvec{\xi }\in {\mathbb {W}}_z} \Vert \varvec{\xi }\Vert _{\omega _z} \le \min _{\varvec{\Xi }\in {\mathbb {W}}_z({\mathcal {M}})} \Vert \varvec{\Xi }\Vert _{\omega _z} \end{aligned}$$
(4.13)

where the hidden constant depends only on d and \(\sigma ({\mathcal {M}})\). Indeed, the right inequality is obvious because of \({\mathbb {W}}_z({\mathcal {M}})\subset {\mathbb {W}}_z\). The left inequality can be proved by an explicit construction; see, e.g., [7, 12]. For ease of presentation, however, we shall simply use the minimizer

$$\begin{aligned} \varvec{\Xi }_z \mathrel {:=}\underset{_{\varvec{\Xi }\in {\mathbb {W}}_z({\mathcal {M}})}}{{\text {arg min}}} \Vert \varvec{\Xi }\Vert _{\omega _z} \end{aligned}$$

and note

$$\begin{aligned} \Vert \varvec{\Xi }_z\Vert _{\omega _z} \lesssim \Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f+\Delta U_{f;{\mathcal {M}}})\Vert _{H_z^*} \leqslant \Vert \varvec{\Xi }_z\Vert _{\omega _z} \end{aligned}$$

in view of (4.12) and (4.13). Inserting this in the abstract bounds of Lemma 28, we readily obtain the following a posteriori bounds; as before, we suppress the global lower bound.

Theorem 29

(Equilibrated flux estimator with error-dominated oscillation) For any functional \(f\in H^{-1}(\Omega )\) and any conforming mesh \({\mathcal {M}}\), we have the global upper bound

$$\begin{aligned} \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}^2 \leqslant (d+1) \sum _{z\in {\mathcal {V}}}\left( \Vert \varvec{\Xi }_z\Vert _{\omega _z}+ \Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f-f)\Vert _{H_z^*}\right) ^2 \end{aligned}$$

as well as the following local lower bounds: for every vertex \(z\in {\mathcal {V}}\),

$$\begin{aligned} \Vert \varvec{\Xi }_z\Vert _{\omega _z}^2+\Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f-f)\Vert _{H_z^*}^2\lesssim \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{\omega _z}^2. \end{aligned}$$

The hidden constant depends only on d and \(\sigma ({\mathcal {M}})\).

In contrast to the cited previous bounds, the upper bound in Theorem 29 contains the multiplicative constant \(d+1\). This constant arises from the localization in Lemma 27. As an alternative to this localization, one may use the constant-free upper bound in the following remark and split the estimator part \(\Vert \varvec{\Xi }\Vert _{}\) therein into local \(L^2\)-contributions.

Remark 30

(Alternative upper bound) Observing that

$$\begin{aligned} \sum _{z\in {\mathcal {V}}}{{\,\mathrm{div}\,}}\varvec{\Xi }_z = f+\Delta U_{f;{\mathcal {M}}} + \sum _{z\in {\mathcal {V}}}\pi _z\big (\phi _z({\mathcal {P}}_{{\mathcal {M}}}f-f)\big ), \end{aligned}$$

we set \(\varvec{\Xi }\mathrel {:=}\sum _{z\in {\mathcal {V}}}\varvec{\Xi }_z\) and apply the theorem of Prager and Synge (4.9) globally and Lemma 27 to obtain

$$\begin{aligned} \Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{} \leqslant \Vert \varvec{\Xi }\Vert _{}+ \sqrt{d+1} \left( \sum _{z\in {\mathcal {V}}}\Vert \phi _z({\mathcal {P}}_{{\mathcal {M}}}f-f)\Vert _{H^*_z}^2\right) ^{1/2}. \end{aligned}$$
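
The first step behind this bound can be sketched as follows. The shorthands \(w\mathrel {:=}U_{f;{\mathcal {M}}}-u_f\) and \(g\mathrel {:=}\sum _{z\in {\mathcal {V}}}\pi _z\big (\phi _z({\mathcal {P}}_{{\mathcal {M}}}f-f)\big )\) are introduced only for this sketch, and each \(\varvec{\Xi }_z\) is extended by zero to \(\Omega \), which is compatible with the vanishing normal traces in the definition of \({\mathbb {W}}_z({\mathcal {M}})\). Using \(\Delta w=\Delta U_{f;{\mathcal {M}}}+f={{\,\mathrm{div}\,}}\varvec{\Xi }-g\), we get

$$\begin{aligned} \Vert \nabla w\Vert _{}^2 = -\left\langle \Delta w,\,w\right\rangle = -\left\langle {{\,\mathrm{div}\,}}\varvec{\Xi },\,w\right\rangle + \left\langle g,\,w\right\rangle = \int _\Omega \varvec{\Xi }\cdot \nabla w\,\mathrm {d}x + \left\langle g,\,w\right\rangle \leqslant \left( \Vert \varvec{\Xi }\Vert _{} + \Vert g\Vert _{H^{-1}(\Omega )}\right) \Vert \nabla w\Vert _{}, \end{aligned}$$

so that \(\Vert \nabla (u_f- U_{f;{\mathcal {M}}})\Vert _{}\leqslant \Vert \varvec{\Xi }\Vert _{}+\Vert g\Vert _{H^{-1}(\Omega )}\). The last term is then bounded by means of Lemma 27, as stated in the remark.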