1 Introduction

In this paper we approach the numerical solution of the p-Poisson problem

$$\begin{aligned} -{\text {div}}(|\nabla u|^{p-2}\nabla u) = f\quad \text {in }\Omega , \nonumber \\ u = 0\quad \text {on }\partial \Omega , \end{aligned}$$
(1.1)

where \(\Omega \subset \mathbb {R}^d\) is open and bounded and \(1<p<\infty \). The solution might be scalar or vector-valued.

Nonlinear problems of this type appear in many applications, e.g. non-Newtonian fluid theory [21], turbulent flow of a gas in porous media, glaciology, or plasticity modeling. Moreover, the p-Laplacian has a similar model character for nonlinear problems as the ordinary Laplace operator has for linear problems; see [22] for an introduction.

As usual we are looking for the weak solution of (1.1). In particular, we are searching for a function \(u\in W_0^{1,p}(\Omega )\) such that

$$\begin{aligned} \int \limits _\Omega |\nabla u|^{p-2} \nabla u \cdot \nabla \xi \, dx = \langle f,\xi \rangle \quad \forall \xi \in W_0^{1,p}(\Omega ), \end{aligned}$$
(1.2)

where in the most general case \(f\in (W_0^{1,p}(\Omega ))^*\). It is well-known that the solution is unique and coincides with the minimizer of the energy \(\mathcal {J}: W_0^{1,p}(\Omega )\rightarrow \mathbb {R}\) defined by

$$\begin{aligned} \mathcal {J}(v) := \tfrac{1}{p}\int \limits _\Omega |\nabla v|^p \, dx -\langle f,v\rangle . \end{aligned}$$
(1.3)

Due to the nonlinearity of the problem it is harder to obtain efficient numerical solutions with guaranteed performance. Our goal is to construct solutions of (1.2) by means of a numerically accessible algorithm. In particular, we construct an iterative algorithm that approximates solutions of (1.2), where in each step only a linear elliptic problem has to be solved. Primarily, we focus here on the iteration on the infinite-dimensional space \(W^{1,p}_0(\Omega )\). However, the same algorithm will immediately apply also to discretized versions of the p-Poisson problem, e.g., by means of finite elements or wavelets. This approach would coincide with the one adopted, for instance, in [5] of first finding an iteration on the infinite-dimensional solution space and then discretizing in space. We will consider in subsequent work the effect of the discretization and its adaptation to error estimators.

In this paper we restrict ourselves to the case \(p\in (1,2]\): we are particularly interested in relatively small values of p, and the case \(p>2\) is already addressed to a certain extent in [5]. We will see, e.g., in Example 20 that our algorithm actually only works properly in the range \(p \in (1,2]\).

Coming from the weak formulation (1.2) one can interpret the problem as a weighted Poisson problem

$$\begin{aligned} \int \limits _\Omega a^{p-2} \nabla u\cdot \nabla \xi \, dx =\langle f,\xi \rangle \quad \forall \xi \in W_0^{1,2}(\Omega ) \end{aligned}$$
(1.4)

for the given f, where \(a:\Omega \rightarrow \mathbb {R}\) and \(a=|\nabla u|\). This suggests iteratively calculating, for a given function \(v_n\), the new iterate \(v_{n+1}\) as the solution of

$$\begin{aligned} \int \limits _\Omega |\nabla v_n|^{p-2} \nabla v_{n+1}\cdot \nabla \xi \, dx = \langle f,\xi \rangle \quad \forall \xi \in W_0^{1,2}(\Omega ). \end{aligned}$$

The advantage of this step is that the calculation of \(v_{n+1}\) only requires solving a linear problem. This allows invoking relatively standard approaches to discretize this step and solve it numerically with guaranteed performance. The problem with this approach, however, is that the weighted Poisson problem is only well posed if the weight is bounded from above and from below away from zero, whereas the weight \(|\nabla v_n|^{p-2}\) may degenerate at points where \({|{\nabla v_n}|}=0\) or \({|{\nabla v_n}|}=\infty \).

To overcome this problem we will use a relaxation argument. To this end, we introduce in our algorithm two relaxation parameters \(\epsilon _-, \epsilon _+ \in (0,\infty )\) with \(\epsilon _- \leqslant \epsilon _+\) that ensure that the weight is truncated properly from below and above. In particular, we replace a by its truncation

$$\begin{aligned} \epsilon _- \vee a \wedge \epsilon _+ := \max {\{{ \epsilon _-, \min {\{{ a, \epsilon _+}\}}}\}}. \end{aligned}$$

Note that this is just the (pointwise) closest point projection of a to the truncation interval \( [\epsilon _-,\epsilon _+]\). The limits \(\epsilon _- \searrow 0\) and \(\epsilon _+ \nearrow \infty \) will recover the unrelaxed or original problem. We also write \(\epsilon :=[\epsilon _-, \epsilon _+]\) and interpret \(\epsilon \) both as a pair \(\{\epsilon _-,\epsilon _+\}\) and as the truncation interval \([\epsilon _-,\epsilon _+]\). We will write \(\epsilon \rightarrow [0,\infty ]\) as a short version of \(\epsilon _- \searrow 0\) and \(\epsilon _+ \nearrow \infty \). We will see later, see Corollary 15, that for f in the Lorentz space \(L^{d,1}(\Omega )\) the lower parameter \(\epsilon _-\) is the crucial one.

According to these considerations we propose the following algorithm:

[Algorithm a (relaxed p-Kačanov iteration): given \(v_0\) and \(\epsilon _0\), compute \(v_{n+1}\) from \(v_n\) by solving the linear problem with the truncated weight \((\epsilon _{n,-} \vee {|{\nabla v_n}|} \wedge \epsilon _{n,+})^{p-2}\), and enlarge the relaxation interval \(\epsilon _n\)]

Since \(0< \epsilon _{n,-} \leqslant \epsilon _{n,+}<\infty \), the equation for \(v_{n+1}\) in the algorithm is always well defined, as it is uniformly elliptic (with constants depending on \(\epsilon _n\)).
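To make this concrete, here is a minimal numerical sketch of the iteration in one space dimension, assuming a uniform finite-difference grid, the right-hand side \(f=1\), and an illustrative schedule for \(\epsilon _n\); none of these choices are prescribed by the algorithm. Each step solves one uniformly elliptic, tridiagonal linear system, exactly as described above.

```python
import numpy as np

# Minimal sketch of the relaxed p-Kacanov iteration in 1D on Omega = (0,1),
# assuming a uniform finite-difference grid and an illustrative schedule for
# the truncation interval eps_n. Each step solves the linear problem
#   -(w v')' = f,  v(0) = v(1) = 0,
# with the truncated weight w = max(eps_-, min(|v_n'|, eps_+))^(p-2).

p = 1.5                          # exponent, 1 < p <= 2
N = 200                          # number of grid intervals
h = 1.0 / N
f = np.ones(N - 1)               # right-hand side at the interior nodes

v = np.zeros(N + 1)              # starting value v_0 = 0
for n in range(50):
    eps_minus = 1.0 / (n + 1)    # illustrative relaxation schedule
    eps_plus = float(n + 1)
    grad = np.diff(v) / h        # piecewise constant gradient of v_n
    a = np.clip(np.abs(grad), eps_minus, eps_plus)
    w = a ** (p - 2.0)           # truncated weight, uniformly elliptic
    # assemble the tridiagonal stiffness matrix of -(w v')'
    main = (w[:-1] + w[1:]) / h**2
    off = -w[1:-1] / h**2
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    v[1:-1] = np.linalg.solve(A, f)   # one Kacanov step = one linear solve

print("max of final iterate:", v.max())
```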

This algorithm is not completely new in the realm of quasi-linear equations. Such an iterative linearization approach is in fact called the Kačanov method in [17, 18], and we refer to those papers for additional references related to the history of this method for solving quasi-linear equations numerically. In [6, 27] it was also proposed and analyzed to solve total variation minimization problems in image processing, which can be formally related to the 1-Laplace differential operator.

Unfortunately, the results obtained in these aforementioned papers cannot be applied straightforwardly to justify the convergence of the Kačanov iteration for equations involving the p-Laplace operator. In particular, to obtain quantitative estimates of convergence with precise rates, as we do in this paper, one needs to employ several finer tools, which have been explored in, e.g., [2, 11, 12, 24], precisely to handle singularities in nonlinear differential operators such as the p-Laplacian. In particular, the theory of N-functions and Orlicz spaces [19], shifted N-functions [11], and Lipschitz truncations, see [13] and [4], have been used systematically in the analysis of such nonlinear operators, allowing the development of a potential theory analogous to the one known for linear equations.

Besides these tools from nonlinear potential theory, the variational formulation of the algorithm, as introduced first in [6], and further used to analyze other related iteratively re-weighted least squares algorithms [10, 16], offers the right framework for the analysis also of the p-Kačanov iteration.

Taking inspiration from [6, 10], in Sect. 2 we provide the variational derivation of this algorithm based on the alternating minimization of a relaxed energy with two parameters.

If we apply the algorithm with a fixed relaxation parameter \(\epsilon \) independent of n, i.e. \(0<\epsilon _- \leqslant \epsilon _+ < \infty \), then our iterates \(v_n\) converge to the unique minimizer \(u_\epsilon \) of another, one-parameter relaxed energy \(\mathcal {J}_\epsilon \). We study this limit in Sect. 4 and present exponential (linear) rates of convergence.

In Sect. 3 we study how the minimizers \(u_\epsilon \) of the relaxed energy \(\mathcal {J}_\epsilon \) converge to the minimizer u of the original problem. This convergence can also be interpreted as a limit in the sense of \(\Gamma \)-convergence [3, 9]. Differently from, e.g., [6], we use a novel argument based on the Lipschitz truncation technique to establish a recovery sequence for the \(\Gamma \)-\(\limsup \). In particular, thanks to the finer tools mentioned above, we can go beyond a pure compactness argument as provided by the \(\Gamma \)-limit and derive precise rates of convergence depending on \(\epsilon \).

Finally, in Sect. 5 we combine the estimates of the two previous sections to deduce an overall error analysis with algebraic rates.

2 Variational formulation of the algorithm

In this section we show that the algorithm can be deduced from an alternating minimization of a relaxed energy. Recall that \(1 < p \leqslant 2\) throughout this article. Since the case \(p=2\) is just the standard Laplace problem, it suffices in the following to consider the case \(1<p<2\) only.

Let us introduce some standard notation. We use \(W^{1,p}(\Omega )\) and \(W^{1,p}_0(\Omega )\) for the Sobolev space without and with zero boundary values. We use c for a generic positive constant whose value may change from line to line. We use \(f \lesssim g\) for \(f \leqslant c\, g\). We also write \(f \eqsim g\) for \(f \lesssim g\) and \(g \lesssim f\).

The most important feature of the algorithm is that it only needs to solve linear sub-problems, which carry their own energy depending on the weight. Therefore, very much inspired by the work [6, 10] and with appropriate adjustments, we extend the energy by an additional parameter \(a \,:\, \Omega \rightarrow [0,\infty )\) such that the new functional is quadratic with respect to v. In particular, we define

$$\begin{aligned} \mathcal {J}(v,a) := \int \limits _\Omega \tfrac{1}{2} a^{p-2}|\nabla v|^2 + \left( \tfrac{1}{p} - \tfrac{1}{2}\right) a^p\, dx - \langle f,v\rangle . \end{aligned}$$

This energy is well-defined for all \(v \in W^{1,p}_0(\Omega )\) and measurable \(a\,:\, \Omega \rightarrow [0,\infty )\) but might take the value \(\infty \).

This relaxed energy is convex with respect to \((v,a)\). Indeed, the term \((\tfrac{1}{p} - \tfrac{1}{2})a^p\) is convex in a (note that \(1 < p \leqslant 2\) makes its coefficient nonnegative), and \(\beta (t,a) := \frac{1}{2} a^{p-2}t^2\) is convex on \([0,\infty )^2\), since

$$\begin{aligned} (\nabla ^2 \beta )(t,a) = \begin{pmatrix} a^{p-2} &{} (p-2)a^{p-3}t \\ (p-2)a^{p-3}t &{}\quad \frac{1}{2} (p-2)(p-3)a^{p-4}t^2 \end{pmatrix} \end{aligned}$$

is nonnegative definite as \(a^{p-2}\geqslant 0\) and \(\det ((\nabla ^2 \beta )(t,a))=\tfrac{1}{2}\,a^{2p-6} t^2 (2-p)(p-1)\geqslant 0\). Notice that in the latter lower bound we specifically used \(1 < p \leqslant 2\).

Remark 1

If \(p>2\), then the relaxed energy \(\mathcal {J}(v,a)\) is neither bounded from below nor convex with respect to a. Therefore, the algorithm derived below using the minimization with respect to a does not lead to a feasible problem for \(p>2\). See also Remark 21.

Note that \(\mathcal {J}(v,a)\) (for fixed a) is quadratic with respect to v and a minimization with respect to v leads formally to the elliptic equation

$$\begin{aligned} -{\text {div}}(a^{p-2} \nabla v)&= f, \end{aligned}$$

see (1.4) for its weak form.

Unfortunately, the ellipticity of this system degenerates for \(a(x)\rightarrow 0\) and \(a(x) \rightarrow \infty \). To overcome this problem we restrict the minimization with respect to a (for fixed v) to functions with values within a relaxation interval \([\epsilon _-, \epsilon _+] \subset (0,\infty )\), i.e. \(\epsilon _- \leqslant a(x) \leqslant \epsilon _+\). This minimization with respect to a (for fixed v) has a simple solution, namely

$$\begin{aligned} \mathop {\mathrm {arg\,min}}_{a\,:\,\epsilon _- \leqslant a \leqslant \epsilon _+} \mathcal {J}(v,a) = \epsilon _- \vee {|{\nabla v}|} \wedge \epsilon _+, \end{aligned}$$
(2.1)

where \(\vee \) denotes the maximum and \(\wedge \) the minimum, since

$$\begin{aligned} \tfrac{\partial }{\partial a} \Big ( \tfrac{1}{2} a^{p-2}|\nabla v|^2 + (\tfrac{1}{p} - \tfrac{1}{2})a^p \Big ) = \tfrac{2-p}{2} a^{p-3} (a^2 - {|{\nabla v}|}^2). \end{aligned}$$
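As a quick illustration of (2.1), the following sketch minimizes the integrand over a fine grid of admissible a and compares the result with the clipped value; the parameter values are arbitrary.

```python
import numpy as np

# Numerical check of (2.1): for fixed t = |grad v|, the map
#   a -> 0.5 * a^(p-2) * t^2 + (1/p - 1/2) * a^p
# on [eps_-, eps_+] is minimized by the clipped value eps_- v t ^ eps_+.

p, em, ep = 1.5, 0.2, 5.0
a = np.linspace(em, ep, 100001)
for t in (0.05, 1.0, 40.0):      # below, inside, and above the interval
    J = 0.5 * a**(p - 2) * t**2 + (1.0 / p - 0.5) * a**p
    a_star = a[np.argmin(J)]     # discrete minimizer
    assert abs(a_star - np.clip(t, em, ep)) < 1e-3
```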

This allows us to define for fixed \(\epsilon = [\varepsilon _-,\varepsilon _+] \subset [0,\infty ]\) another relaxed energy

$$\begin{aligned} \mathcal {J}_\epsilon (v) := \mathcal {J}\left( v,\varepsilon _-\vee \vert \nabla v\vert \wedge \varepsilon _+\right) = \min _{a\,:\,\epsilon _- \leqslant a \leqslant \epsilon _+} \mathcal {J}(v,a). \end{aligned}$$
(2.2)

This immediately implies that the relaxed energy \(\mathcal {J}_\epsilon (v)\) is monotonically decreasing with respect to \(\epsilon \), i.e., enlarging the interval \(\epsilon \) in the sense of inclusion decreases the energy \(\mathcal {J}_\epsilon (v)\).

This new relaxed energy \(\mathcal {J}_\epsilon \) somehow “hides” the constrained minimization with respect to a. We can write \(\mathcal {J}_\epsilon \,:\, W^{1,p}_0(\Omega ) \rightarrow \mathbb {R}\cup {\{{\infty }\}}\) explicitly as

$$\begin{aligned} \mathcal {J}_\epsilon (v) = \int \limits _\Omega \kappa _\epsilon (\vert \nabla v\vert )\, dx - \langle f,v\rangle \end{aligned}$$

with \(\kappa _\epsilon \,:\,\mathbb {R}_{\geqslant 0}\rightarrow \mathbb {R}\) given by

$$\begin{aligned} \kappa _\epsilon (t)&:= {\left\{ \begin{array}{ll} \frac{1}{2}\varepsilon _-^{p-2}t^2 + (\frac{1}{p}-\frac{1}{2})\varepsilon _-^p &{} \text {for }t\leqslant \varepsilon _-\\ \frac{1}{p}t^p &{} \text {for } \varepsilon _-\leqslant t\leqslant \varepsilon _+\\ \frac{1}{2}\varepsilon _+^{p-2}t^2 + (\frac{1}{p}-\frac{1}{2})\varepsilon _+^{p} &{} \text {for } t\geqslant \varepsilon _+. \end{array}\right. } \end{aligned}$$

Note that \(\frac{1}{p} t^p \leqslant \kappa _\epsilon (t)\) for all \(t \geqslant 0\) and \(\frac{1}{p} t^p = \lim _{\epsilon \rightarrow [0,\infty ]} \kappa _\epsilon (t)\) for all \(t \geqslant 0\). Since \(\kappa _\epsilon (t) \eqsim \epsilon _+^{p-2} t^2\) for large t, we see that \(\mathcal {J}_\epsilon (v) < \infty \) if and only if \(v \in W^{1,2}_0(\Omega )\). Moreover, \(\lim _{\epsilon \rightarrow [0,\infty ]} \mathcal {J}_\epsilon (v) = \mathcal {J}(v)\) for all \(v \in W^{1,2}_0(\Omega )\) and \(\mathcal {J}(v) \leqslant \liminf _{\epsilon \rightarrow [0,\infty ]} \mathcal {J}_\epsilon (v)\) for all \(v \in W^{1,p}_0(\Omega )\).
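As a small computational aside, with \(a = \epsilon _- \vee t \wedge \epsilon _+\) the three branches of \(\kappa _\epsilon \) collapse into the single expression \(\tfrac{1}{2} a^{p-2}t^2 + (\tfrac{1}{p}-\tfrac{1}{2})a^p\); the sketch below uses this and verifies that \(\kappa _\epsilon (t) \geqslant \tfrac{1}{p}t^p\), with equality on the truncation interval. The parameter values are arbitrary.

```python
import numpy as np

# Sketch of the truncated integrand kappa_eps: with a = eps_- v t ^ eps_+,
# all three branches of the definition collapse into
#   kappa_eps(t) = 0.5 * a^(p-2) * t^2 + (1/p - 1/2) * a^p.
# We check that kappa_eps(t) >= t^p / p everywhere and that equality holds
# on the truncation interval [eps_-, eps_+].

def kappa_eps(t, p, em, ep):
    a = np.clip(t, em, ep)
    return 0.5 * a**(p - 2) * t**2 + (1.0 / p - 0.5) * a**p

p, em, ep = 1.5, 0.1, 10.0
t = np.linspace(0.0, 20.0, 2001)
assert np.all(kappa_eps(t, p, em, ep) >= t**p / p - 1e-12)
inside = (t >= em) & (t <= ep)
assert np.allclose(kappa_eps(t[inside], p, em, ep), t[inside]**p / p)
```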

Based on the above observations it is natural to minimize \(\mathcal {J}(v,a)\) iteratively, alternating between v and a. Of course, we also have to enlarge the relaxation interval \(\epsilon \). Thus our algorithm reads as follows:

[Algorithm b: alternating minimization of \(\mathcal {J}(v,a)\) in v and a over growing truncation intervals \(\epsilon _n\)]

This is just the algorithm given in the introduction, written in a different form.

3 Convergence in the relaxation parameter

In this section we show that the minimizers \(u_\epsilon \) of the relaxed energy \(\mathcal {J}_\epsilon \) converge to the minimizer u of \(\mathcal {J}\) for \(\epsilon \rightarrow [0,\infty ]\) and derive an upper bound for the relaxation error.

Since \(\mathcal {J}_\epsilon (v) \geqslant \mathcal {J}(v)\) and \(\mathcal {J}\) is coercive on \(W^{1,p}_0(\Omega )\), it follows that \(\mathcal {J}_\epsilon \) is also coercive on \(W^{1,p}_0(\Omega )\). However, \(\mathcal {J}_\epsilon (v)<\infty \) requires \(v \in W^{1,2}_0(\Omega )\), as we have seen above. Certainly, there is a gap between the spaces \(W^{1,p}_0(\Omega )\) and \(W^{1,2}_0(\Omega )\). To close this gap we need a finer analysis of the energies, which requires the use of Orlicz spaces. We state in the following some standard results for these spaces; see for example [19].

A function \(\phi :\mathbb {R}_{\geqslant 0}\rightarrow \mathbb {R}\) is called an N-function if and only if there is a right-continuous, non-decreasing function \(\phi ':\mathbb {R}_{\geqslant 0}\rightarrow \mathbb {R}\) that is positive on the positive real line, with \(\phi '(0)=0\) and \(\lim _{t \rightarrow \infty } \phi '(t)=\infty \), such that \(\phi (t) = \int _0^{t}\phi '(\tau )\, d\tau \). An N-function is said to satisfy the \(\Delta _2\)-condition if and only if there is a constant \(c>1\) such that \(\phi (2t)\leqslant c\,\phi (t)\) for all \(t \geqslant 0\). For an N-function satisfying the \(\Delta _2\)-condition we define the Orlicz space \(L^\phi (\Omega )\) to consist of those functions \(v \in L_{{\mathrm {loc}}}^1(\Omega )\) with \(\int _\Omega \phi (|v|)\, dx<\infty \). It becomes a Banach space with the norm \({\Vert {v}\Vert }_\phi := \inf {\{{\gamma >0\,:\, \int _\Omega \phi ({|{v}|}/\gamma )\,dx\leqslant 1}\}}\). The Orlicz-Sobolev space \(W^{1,\phi }(\Omega )\) then consists of those \(v \in L^\phi (\Omega )\) such that the weak derivative \(\nabla v\) is also in \(L^\phi (\Omega )\), equipped with the norm \({\Vert {v}\Vert }_\phi + {\Vert {\nabla v}\Vert }_\phi \). The space \(W^{1,\phi }_0(\Omega )\) denotes the subspace of those functions from \(W^{1,\phi }(\Omega )\) with zero boundary trace, which coincides with the closure of \(C^\infty _0(\Omega )\) in \(W^{1,\phi }(\Omega )\). For example, choosing \(\phi (t):=\tfrac{1}{p} t^p\) we have \(L^\phi (\Omega ) = L^p(\Omega )\) and \(W^{1,\phi }_0(\Omega ) = W^{1,p}_0(\Omega )\).

The function \(\kappa _\epsilon \) is not an N-function, since \(\kappa _\epsilon (0)\ne 0\). However, if we define

$$\begin{aligned} \phi _\epsilon (t) := \kappa _\epsilon (t) - \kappa _\epsilon (0), \end{aligned}$$
(3.1)

then \(\phi _\epsilon \) is actually an N-function. It can be verified that \(\phi _\epsilon \) satisfies the \(\Delta _2\)-condition with a constant independent of \(\epsilon \).

Since \(\phi _\epsilon (t) \eqsim \epsilon _+^{p-2} t^2\) for large t and \(\Omega \) is bounded, we have \(L^{\phi _\epsilon }(\Omega ) \eqsim L^2(\Omega )\). However, the constant of the embedding \(L^{\phi _\epsilon }(\Omega ) \hookrightarrow L^2(\Omega )\) depends on \(\epsilon \), so this equivalence is not of much use. Instead we use the chain of embeddings

$$\begin{aligned} L^2(\Omega ) \hookrightarrow L^{\phi _\epsilon }(\Omega ) \hookrightarrow L^p(\Omega ), \end{aligned}$$
(3.2)

with constants independent of \(\epsilon \). This follows from the fact that the Simonenko indices of \(\phi _\epsilon \) are within [p, 2]. We refer the reader to, e.g., [28, Chapter 2] for the details.

Since \(\phi _\epsilon \) is strictly convex and \(\kappa _\epsilon (t) = \phi _\epsilon (t) + \kappa _\epsilon (0)\), the energy \(\mathcal {J}_\epsilon \) admits a unique minimizer \(u_\epsilon \in W_0^{1,\phi _\epsilon }(\Omega )\) whose Euler-Lagrange equation is

$$\begin{aligned} \int _\Omega (\varepsilon _- \vee |\nabla u_\epsilon | \wedge \varepsilon _+)^{p-2}\nabla u_\epsilon \cdot \nabla \xi \, dx = \langle f,\xi \rangle \qquad \forall \xi \in W_0^{1,\phi _\epsilon }(\Omega ). \end{aligned}$$
(3.3)

Here we have used that

$$\begin{aligned} \frac{\phi _\epsilon '(t)}{t}&= (\varepsilon _- \vee t \wedge \epsilon _+)^{p-2}. \end{aligned}$$
(3.4)

Remark 2

Let us consider the special case \(\varepsilon _+=\infty \). Then the derivative of the truncated function reads \(\phi _\epsilon '(t) = (\varepsilon _-\vee t)^{p-2} t\). A different modification, namely \((\varepsilon _-+ t)^{p-2} t\), would lead to the so-called shifted N-function of \(\frac{1}{p}t^p\), as introduced in more generality in [11], which has similar properties to our truncated functions. However, the version from this paper is more suitable for our energy relaxation, since it is closer to the original function \(\frac{1}{p} t^p\) on the truncation interval \(\epsilon \) (the derivatives agree there). See the Appendix for more information on uniformly convex Orlicz functions and their shifted versions.

Lemma 3

The functions \(\phi \) and \(\phi _\epsilon \) are uniformly convex Orlicz functions in the sense of Sect. B from the Appendix. The convexity constants are independent of \(\epsilon \).

Proof

The uniform convexity of \(\phi \) follows from \(\frac{\phi ''(t)\,t}{\phi '(t)}= (p-1)\) and Lemma (B.2). Now, for \(t \in (\epsilon _-,\epsilon _+)\) we have \(\frac{\phi _\epsilon ''(t)\,t}{\phi _\epsilon '(t)}= (p-1)\), while for \(t \in (0,\infty ) \setminus [\epsilon _-,\epsilon _+]\) we have \(\frac{\phi _\epsilon ''(t)\,t}{\phi _\epsilon '(t)} = 1\). Hence, the claim for \(\phi _\epsilon \) follows again by Lemma (B.2). \(\square \)

Since \(W^{1,p}_0(\Omega )\) is the largest space in the chain (3.2) and contains both \(u_\epsilon \) and u, it is natural to consider all energies \(\mathcal {J}\) and \(\mathcal {J}_\epsilon \) as functionals on \(W^{1,p}_0(\Omega )\) with possible value \(\infty \).

Let us recall that the goal of this section is to show that \(u_\epsilon \) converges to u in \(W^{1,p}_0(\Omega )\). Since \(W^{1,p}_0(\Omega )\) is uniformly convex, strong convergence is a consequence of weak convergence and norm convergence, or equivalently, in this case, energy convergence \(\mathcal {J}(u_\epsilon ) \rightarrow \mathcal {J}(u)\). It is possible to show the weak convergence as well as that of the energy by means of \(\Gamma \)-convergence. Indeed, we will see in Remark 11 that \(\mathcal {J}_\epsilon \rightarrow \mathcal {J}\) in the sense of \(\Gamma \)-convergence. However, we will derive in the following much stronger results that provide us with a precise rate of convergence for the energies. This energy convergence implies strong convergence of the sequence, see the proof of Corollary 10.

Let us turn to the convergence of the energies \(\mathcal {J}(u_\epsilon ) \rightarrow \mathcal {J}(u)\) for \(\epsilon \rightarrow [0,\infty ]\). Since \(\mathcal {J}_\epsilon \) is monotonically decreasing with respect to \(\epsilon \), it follows from the minimizing properties of u and \(u_\epsilon \) that

$$\begin{aligned} 0&\leqslant \mathcal {J}(u_\epsilon ) - \mathcal {J}(u) \leqslant \mathcal {J}_\epsilon (u_\epsilon ) - \mathcal {J}(u). \end{aligned}$$
(3.5)

Therefore, it suffices to prove the stronger claim

$$\begin{aligned} \mathcal {J}_\epsilon (u_\epsilon ) - \mathcal {J}(u) \rightarrow 0 \qquad \text {as } \epsilon \rightarrow [0,\infty ]. \end{aligned}$$

(3.6)

In fact, we will later need this stronger statement in the other sections.

It follows from the minimizing property of \(u_\epsilon \) that

$$\begin{aligned} \mathcal {J}_\epsilon (u_\epsilon ) - \mathcal {J}(u) \leqslant \mathcal {J}_\epsilon (u) - \mathcal {J}(u). \end{aligned}$$

So it would be natural to estimate \(\mathcal {J}_\epsilon (u) - \mathcal {J}(u)\) in terms of \(\epsilon \) and u. However, the solution u is unfortunately a priori only a \(W^{1,p}_0\)-function, so \(\mathcal {J}_\epsilon (u)\) might be infinite. Hence, we cannot ensure that this difference is small. This is only possible if we assume higher regularity of u. In order to treat arbitrary right-hand sides \(f \in (W^{1,p}_0(\Omega ))^*\) at this point, we have to use a much more subtle argument. For this we need a result from [13, Subsection 3.5] and [4, Theorem 2.7], which allows one to change u on a small set such that it becomes a Lipschitz function. This technique is known as the Lipschitz truncation technique. Its origin goes back to [1]. As a tool we need the Hardy-Littlewood maximal operator, e.g. [25], defined by

$$\begin{aligned} (Mg)(x) := \sup _{r>0} \frac{1}{{|{B_r(x)}|}} \int \limits _{B_r(x)} {|{g(y)}|}\, dy, \end{aligned}$$

where \({|{B_r(x)}|}\) denotes the Lebesgue measure of \(B_r(x)\).

Theorem 4

[Lipschitz truncation [4, 13]] Let \(v\in W_0^{1,p}(\Omega )\) and for all \(\lambda >0\) define

$$\begin{aligned} \mathcal {O}_\lambda := \mathcal {O}_\lambda (v) := \{x\in \Omega \,:\, M(\nabla v)(x)>\lambda \}. \end{aligned}$$

Then, there exists an approximation \(T_\lambda v\in W_0^{1,\infty }(\Omega )\) of v with the following properties:

  (a) \(\{v\ne T_\lambda v\}\subset \mathcal {O}_\lambda \).

  (b) \(\Vert T_\lambda v\Vert _{L^p(\Omega )}\lesssim \Vert v\Vert _{L^p(\Omega )}\).

  (c) \(\Vert \nabla T_\lambda v\Vert _{L^p(\Omega )}\lesssim \Vert \nabla v\Vert _{L^p(\Omega )}\).

  (d) \(\vert \nabla T_\lambda v\vert \lesssim \lambda \chi _{\mathcal {O}_\lambda } + \vert \nabla v\vert \chi _{\Omega \setminus \mathcal {O}_\lambda } \leqslant \lambda \) almost everywhere.

  (e) \({\Vert {\nabla (v-T_\lambda v)}\Vert }_{L^p(\Omega )} \lesssim {\Vert {\nabla v}\Vert }_{L^p(\mathcal {O}_\lambda )}\).

  (f) \(T_\lambda v \rightarrow v\) in \(W^{1,p}_0(\Omega )\) as \(\lambda \rightarrow \infty \).

All our convergence results concerning the relaxation parameter \(\varepsilon \) are based on the following result, which shows how the energy relaxation depends on the truncation interval \(\varepsilon \).

Theorem 5

The estimate

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon ) - \mathcal {J}(u) \lesssim \varepsilon _-^p + \int \limits _{\mathcal {O}_\lambda (u)}|\nabla u|^p \, dx \end{aligned}$$
(3.7)

holds for all \(\lambda \leqslant \varepsilon _+/c_1\), where \(c_1\) is the (hidden) constant from Theorem 4 (d).

Proof

Let \(\lambda \leqslant \varepsilon _+/c_1\) and let \(T_\lambda u\) be the Lipschitz truncation of u. Then

$$\begin{aligned} {|{\nabla T_\lambda u}|} \leqslant c_1\,\lambda \leqslant \varepsilon _+. \end{aligned}$$

Using the minimizing property of \(u_\varepsilon \) and the equation for u we get

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon ) - \mathcal {J}(u)&\leqslant \mathcal {J}_\epsilon (T_\lambda u) - \mathcal {J}(u) = \int \limits _\Omega \left( \kappa _\epsilon (|\nabla T_\lambda u|) - \tfrac{1}{p} |\nabla u|^p \right) \, dx - \langle f,T_\lambda u - u\rangle \\&= \int \limits _\Omega \left( \kappa _\epsilon (|\nabla T_\lambda u|) - \tfrac{1}{p} |\nabla u|^p \right) \, dx - \int \limits _\Omega |\nabla u|^{p-2}\nabla u\cdot \nabla (T_\lambda u - u) \, dx. \end{aligned}$$

Using \({|{\nabla T_\lambda u}|} \leqslant \varepsilon _+\), \(\kappa _\epsilon (t)=\tfrac{1}{p} t^p\) for \(t \in [\varepsilon _-,\varepsilon _+]\), \(\kappa _\epsilon (t) \leqslant \tfrac{1}{p} \varepsilon _-^p\) for \(t \in [0, \varepsilon _-]\), and  \(T_\lambda u=u\) outside of \(\mathcal {O}_\lambda \) we get

$$\begin{aligned} \kappa _\epsilon (|\nabla T_\lambda u|) - \tfrac{1}{p} |\nabla u|^p&\leqslant {\left\{ \begin{array}{ll} \frac{1}{p} \varepsilon _-^p &{}\qquad \text {on } {\{{{|{\nabla T_\lambda u}|} \leqslant \varepsilon _-}\}}, \\ 0 &{}\qquad \text {on } \big (\Omega \setminus \mathcal {O}_\lambda \big ) \cap {\{{{|{\nabla T_\lambda u}|}> \varepsilon _-}\}}, \\ \frac{1}{p} {|{\nabla T_\lambda u}|}^p &{}\qquad \text {on } \mathcal {O}_\lambda \cap {\{{{|{\nabla T_\lambda u}|} > \varepsilon _-}\}}. \end{array}\right. } \end{aligned}$$

This, the previous estimate and Theorem 4 (e) imply

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon ) - \mathcal {J}(u)&\leqslant {|{\Omega }|}\,\tfrac{1}{p} \varepsilon _-^p + \int \limits _{\mathcal {O}_\lambda (u)} \tfrac{1}{p} |\nabla T_\lambda u|^p \, dx + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla u|^{p-1} {|{\nabla (T_\lambda u - u)}|}\, dx \\&\lesssim \varepsilon _-^p + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla T_\lambda u|^p \, dx + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla u|^p \, dx + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla (T_\lambda u-u)|^p \, dx \\&\lesssim \varepsilon _-^p + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla u|^p \, dx. \end{aligned}$$

This proves the claim. \(\square \)

Corollary 6

\(\mathcal {J}_\epsilon (u_\varepsilon ) \rightarrow \mathcal {J}(u)\) and \(\mathcal {J}(u_\varepsilon ) \rightarrow \mathcal {J}(u)\) as \(\varepsilon \rightarrow [0,\infty ]\).

Proof

Due to (3.5) it suffices to prove \(\mathcal {J}_\epsilon (u_\varepsilon ) \rightarrow \mathcal {J}(u)\). Consider the right-hand side of (3.7) with \(\lambda := \varepsilon _+/c_1\). The first term goes to zero as \(\varepsilon _-\rightarrow 0\). Now consider the second term. Since \(\mathcal {O}_\lambda (u) \subset {\{{M(\nabla u) > \lambda }\}}\) and \(\nabla u\in L^p(\Omega )\) we get by the weak \(L^p\)-estimate of the maximal operator \(|\mathcal {O}_\lambda (u)| \lesssim \lambda ^{-p} {\Vert {\nabla u}\Vert }_p^p\). Therefore \(|\mathcal {O}_\lambda (u)|\rightarrow 0\) as \(\varepsilon _+\rightarrow \infty \), which implies \(\int _{\mathcal {O}_\lambda (u)} |\nabla u|^p \, dx \rightarrow 0\) as \(\varepsilon _+\rightarrow \infty \). \(\square \)

Before we continue we need the following natural quantities, see [11].

Definition 7

For \(P\in \mathbb {R}^d\) we define

$$\begin{aligned} A_\epsilon (P):= {\left\{ \begin{array}{ll}\frac{\phi _\epsilon '(\vert P\vert )}{\vert P\vert }P &{}\text {if } P\ne 0\\ 0 &{}\text {if }P=0\end{array}\right. }\quad \text {and}\quad V_\epsilon (P):={\left\{ \begin{array}{ll}\sqrt{\frac{\phi _\epsilon '(\vert P\vert )}{\vert P\vert }}P&{}\text {if }P\ne 0\\ 0&{}\text {if }P=0.\end{array}\right. } \end{aligned}$$

Moreover, by \(A := A_{[0,\infty ]}\) and \(V := V_{[0,\infty ]}\) we denote the unrelaxed versions.

The following two lemmas are modifications of similar results of [11, Lemma 3] and [12, Lemma 16]. In fact, they follow from the properties of uniformly convex Orlicz functions; see Sect. B from the Appendix for more details.

Lemma 8

For \(P,Q\in \mathbb {R}^d\)

$$\begin{aligned} (A_\epsilon (P)-A_\epsilon (Q))\cdot (P-Q)\eqsim \frac{\phi _\epsilon '(|P|\vee |Q|)}{|P|\vee |Q|}\vert P-Q\vert ^2 \eqsim {|{V_\epsilon (P) - V_\epsilon (Q)}|}^2, \end{aligned}$$

where the constants can be chosen independently of \(\varepsilon \).

Proof

This follows directly from the uniform convexity of \(\phi _\epsilon \), see Lemma 3, Lemma 41 and Lemma 40. \(\square \)

Lemma 9

The following estimates hold for arbitrary \(v\in W_0^{1,\phi _\epsilon }(\Omega )\) and \(u_\varepsilon \) being the minimizer of \(\mathcal {J}_\epsilon \):

$$\begin{aligned} \mathcal {J}_\epsilon (v)-\mathcal {J}_\epsilon (u_\varepsilon )&\leqslant \int \limits _\Omega (A_\epsilon (\nabla v)-A_\epsilon (\nabla u_\varepsilon ))\cdot \nabla (v-u_\varepsilon )\, dx \\&\lesssim \int \limits _\Omega |V_\epsilon (\nabla v)-V_\epsilon (\nabla u_\varepsilon )|^2\, dx \\&\lesssim \mathcal {J}_\epsilon (v)-\mathcal {J}_\epsilon (u_\varepsilon ) . \end{aligned}$$

In particular, for the case \(\varepsilon =[0,\infty ]\) the statement also implies

$$\begin{aligned} \mathcal J(v)-\mathcal J(u)&\eqsim \int \limits _\Omega |V(\nabla v)-V(\nabla u)|^2\, dx. \end{aligned}$$

Proof

This is just Lemma 42 applied to \(\phi _\epsilon \). \(\square \)

We are now prepared to show the convergence of minimizers \(u_\epsilon \) of \(\mathcal {J}_\epsilon \) to u.

Corollary 10

\(u_\epsilon \rightarrow u\) in \(W^{1,p}_0(\Omega )\) as \(\varepsilon \rightarrow [0,\infty ]\).

Proof

Due to Corollary 6 we have \(\mathcal {J}(u_\epsilon ) - \mathcal {J}(u) \rightarrow 0\) as \(\epsilon \rightarrow [0,\infty ]\). Now Lemma 9 for the case where \(\epsilon =[0,\infty ]\) and \(v =u_\varepsilon \) implies \(V(\nabla u_\epsilon ) \rightarrow V(\nabla u)\) in \(L^2(\Omega )\). It follows from the shift-change lemma, see Corollary 44 or [12, Corollary 26], that for all \(\delta >0\) there exists \(c_\delta >0\) such that

$$\begin{aligned} {|{\nabla u - \nabla u_\epsilon }|}^p&\leqslant c_\delta {|{V(\nabla u_\epsilon ) - V(\nabla u)}|}^2 + \delta {|{\nabla u}|}^p. \end{aligned}$$

This and \(V(\nabla u_\epsilon ) \rightarrow V(\nabla u)\) in \(L^2(\Omega )\) implies \(\nabla u_\epsilon \rightarrow \nabla u\) in \(L^p(\Omega )\). \(\square \)

Remark 11

It is also possible to deduce \(\mathcal {J}_\epsilon (u_\epsilon ) \rightarrow \mathcal {J}(u)\) and \(u_\epsilon \rightarrow u\) in \(W^{1,p}_0(\Omega )\) by means of \(\Gamma \)-convergence: As the underlying topological space we choose \(W^{1,p}_0(\Omega )\) equipped with the weak topology. Then the Lipschitz truncation provides a recovery sequence for \(v \in W^{1,p}_0(\Omega )\) implying \(\mathop {\Gamma \text {-}\lim }\nolimits _{\epsilon \rightarrow [0,\infty ]} \mathcal {J}_\epsilon = \mathcal {J}\). Indeed, it follows as in the proof of Theorem 5 that for all \(v \in W^{1,p}_0(\Omega )\)

$$\begin{aligned} {\big |{\mathcal {J}_\epsilon (T_{\varepsilon _+/c_1} v) - \mathcal {J}(v)}\big |} \lesssim \varepsilon _-^p + \int _{\mathcal {O}_{\varepsilon _+/c_1}(v)} {|{\nabla v}|}^p\,dx + {\big |{ {\langle {f}, {T_{\varepsilon _+/c_1} v -v} \rangle }}\big |}. \end{aligned}$$

So the properties of the Lipschitz truncation, see Theorem 4 (f), imply that the right-hand side goes to zero as \(\epsilon \rightarrow [0,\infty ]\). Hence, \(T_{\varepsilon _+/c_1} v\) is a recovery sequence of v.

Moreover, \(\mathcal {J}_\epsilon \geqslant \mathcal {J}\), so the standard theory of \(\Gamma \)-convergence [3, 9] proves \(u_\epsilon \rightharpoonup u\) in \(W^{1,p}_0(\Omega )\) and \(\mathcal {J}_\epsilon (u_\epsilon ) \rightarrow \mathcal {J}(u)\) for \(\epsilon \rightarrow [0,\infty ]\). The uniform convexity of \(W^{1,p}_0(\Omega )\) implies \(u_\epsilon \rightarrow u\) in \(W^{1,p}_0(\Omega )\).

To our knowledge this is the first time that the Lipschitz truncation is used to construct a recovery sequence for the \(\Gamma \)-\(\limsup \) in a \(\Gamma \)-convergence argument related to energies on \(W^{1,p}_0(\Omega )\).

Up to now, we discussed the convergence of \(u_\epsilon \rightarrow u\) without any additional assumptions on the data \(f \in (W_0^{1,p}(\Omega ))^*\) and the domain \(\Omega \). If f is more regular and \(\partial \Omega \) is suitably smooth, then we obtain specific rates for the convergence. The rates of convergence will follow from the regularity of \(\nabla u\) in terms of the weak-\(L^q\) spaces \(L^{q,\infty }(\Omega )\), which consists of all functions v such that

$$\begin{aligned} \Vert v\Vert _{L^{q,\infty }(\Omega )} := \sup \limits _{t>0} {\Vert {t\, \chi _{{\{{{|{v}|}>t}\}}}}\Vert }_{L^q(\Omega )} < \infty . \end{aligned}$$

Lemma 12

Let \(\nabla u\in L^{q,\infty }(\Omega )\) for some \(q>p\). Then,

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u) \lesssim \varepsilon _-^p + \varepsilon _+^{-(q-p)}\,\Vert \nabla u\Vert _{L^{q,\infty }(\Omega )}^q. \end{aligned}$$

Proof

First note that \(M:L^{q,\infty }(\Omega )\rightarrow L^{q,\infty }(\Omega )\) is bounded. This follows for example by extrapolation theory, see [8, Theorem 1.1]. In particular,

$$\begin{aligned} \lambda {|{\mathcal {O}_\lambda (u)}|}^{\frac{1}{q}} \eqsim {\Vert {\lambda \,\chi _{{\{{M(\nabla u)> \lambda }\}}}}\Vert }_{L^q(\Omega )} \leqslant {\Vert {M(\nabla u)}\Vert }_{L^{q,\infty }(\Omega )} \lesssim {\Vert {\nabla u}\Vert }_{L^{q,\infty }(\Omega )}. \end{aligned}$$

Moreover, let \(L^{q,1}(\Omega )\) denote the usual Lorentz space, which consists of functions v such that

$$\begin{aligned} \Vert v\Vert _{L^{q,1}(\Omega )} := q\int \limits _0^\infty |\{|v|> t\}|^\frac{1}{q}\, dt = q \int \limits _0^\infty {\Vert {t\, \chi _{{\{{{|{v}|}>t}\}}}}\Vert }_{L^q(\Omega )} \frac{dt}{t} < \infty . \end{aligned}$$

Since \((L^{s',1})^* =L^{s,\infty }\) for \(1<s<\infty \) and \(\frac{1}{s}+\frac{1}{s'}=1\), we obtain with \(s=\frac{q}{p}\)

$$\begin{aligned} \int \limits _{\mathcal {O}_\lambda }|\nabla u|^p\, dx \leqslant {\big \Vert {{|{\nabla u}|}^p}\big \Vert }_{L^{s,\infty }(\Omega )} {\Vert {\chi _{\mathcal {O}_\lambda }}\Vert }_{L^{s',1}(\Omega )} \eqsim \Vert \nabla u\Vert _{L^{q,\infty }(\Omega )}^p\, {|{\mathcal {O}_\lambda }|}^{\frac{1}{s'}} \lesssim \lambda ^{-(q-p)}\, \Vert \nabla u\Vert _{L^{q,\infty }(\Omega )}^{q}, \end{aligned}$$

where \({|{\mathcal {O}_\lambda }|}\) denotes the Lebesgue measure of \(\mathcal {O}_\lambda \). Applying Theorem 5 with \(\lambda := \varepsilon _+/c_1\) yields the statement. \(\square \)

To exemplify the consequences of Lemma 12 we combine it with the regularity results of [7, 14]:

Theorem 13

([7], Theorems 1.3 and 1.4) Let \(\Omega \subset \mathbb {R}^d\) be convex or let its boundary \(\partial \Omega \in W^2 L^{d-1,1}\) (for example \(\partial \Omega \in C^2\) suffices) and additionally \(f\in L^{d,1}(\Omega )\). Then \(\nabla u \in L^\infty (\Omega )\).

Theorem 14

([14], (4.3)) Let \(\Omega \) be a polyhedral domain whose inner angles are strictly less than \(2\pi \), and let \(f\in L^{p'}(\Omega )\), where \(\frac{1}{p} + \frac{1}{p'}=1\). Then \(\nabla u \in L^{\frac{pd}{d-1},\infty }(\Omega )\).

Proof

Actually, it is proven in [14, (4.3)] that \({|{\nabla u}|}^{\frac{p}{2}} \in \mathcal {N}^{\frac{1}{2},2}(\Omega )\) (a Nikolskiĭ space). Now, one can use the embedding \(\mathcal {N}^{\frac{1}{2},2}(\Omega )\hookrightarrow L^{\frac{2d}{d-1},\infty }(\Omega )\) of Lemma 26 in the Appendix. \(\square \)

Corollary 15

Under the assumptions of Theorem 13 we have

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u) \lesssim \varepsilon _-^p. \end{aligned}$$

Proof

Since \(\nabla u \in L^\infty (\Omega )\), we have \(M(\nabla u)\in L^\infty (\Omega )\) and so for \(\lambda :=\varepsilon _+/c_1\) and \(\varepsilon _+\) large enough, \(\mathcal {O}_\lambda (u)=\emptyset \). Hence, Theorem 5 implies the estimate. \(\square \)

Corollary 16

Under the assumptions of Theorem 14 we have

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u) \lesssim \varepsilon _-^p + \varepsilon _+^{-\frac{p}{d-1}}. \end{aligned}$$

Proof

Since \(\nabla u\in L^{\frac{pd}{d-1},\infty }(\Omega )\), an application of Lemma 12 finishes the proof. \(\square \)

Remark 17

The choice \(f=0\) and hence \(u=0\) gives \(\mathcal {J}_\epsilon (u) = \kappa _\epsilon (0) {|{\Omega }|} \eqsim \varepsilon _-^p\). This shows that the estimate in Corollary 15 is sharp.

4 Convergence of the Kačanov-iteration

In this section we study the convergence of the Kačanov-iteration for fixed relaxation parameter \(\epsilon =[\varepsilon _-,\varepsilon _+]\). In particular, for \(v_0\in W_0^{1,2}(\Omega )\) arbitrary we calculate recursively \(v_{n+1}\) by

$$\begin{aligned} \int \limits _\Omega (\varepsilon _-\vee |\nabla v_n|\wedge \varepsilon _+)^{p-2}\nabla v_{n+1}\cdot \nabla \xi \, dx = \langle f,\xi \rangle \quad \forall \xi \in W_0^{1,2}(\Omega ). \end{aligned}$$
(4.1)

We will show that \(v_n\) converges to the minimizer \(u_\epsilon \) of the relaxed energy \(\mathcal {J}_\epsilon \). In particular, we show exponential decay of the energy error \(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\epsilon )\). The proof is based on the following estimate, proved below.

Theorem 18

There is a constant \(c_K>1\) such that

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (v_{n+1}) \geqslant \delta \,(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )) \end{aligned}$$

holds for \(\delta := \tfrac{1}{c_K} (\tfrac{\varepsilon _-}{\varepsilon _+})^{2-p}\).

This theorem says that in each iteration we reduce the energy by a certain part of the remaining energy error. This implies

$$\begin{aligned} \begin{aligned} \mathcal {J}_\epsilon (v_{n+1})-\mathcal {J}_\epsilon (u_\varepsilon )&= \big ( \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon ) \big ) - \big ( \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (v_{n+1}) \big ) \\&\leqslant (1-\delta )\, \big ( \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon ) \big ). \end{aligned} \end{aligned}$$
(4.2)

As a direct consequence we will obtain the following exponential convergence result.

Corollary 19

There is a constant \(c_K>1\) such that

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )\leqslant (1-\delta )^n(\mathcal {J}_\epsilon (v_0)-\mathcal {J}_\epsilon (u_\varepsilon )) . \end{aligned}$$

holds for \(\delta := \tfrac{1}{c_K} (\tfrac{\varepsilon _-}{\varepsilon _+})^{2-p}\).
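For a rough quantitative feeling, the following sketch computes how many iterations Corollary 19 guarantees for a given energy tolerance; since the constant \(c_K\) is not explicit, the value used below is a purely hypothetical assumption.

```python
import math

# Sketch: number of iterations n with (1 - delta)^n <= tol guaranteed by
# Corollary 19, where delta = (1/c_K) * (eps_- / eps_+)^(2-p). The constant
# c_K is not explicit; the value below is a hypothetical assumption.

def guaranteed_steps(eps_minus, eps_plus, p, tol, c_K=10.0):
    delta = (eps_minus / eps_plus) ** (2.0 - p) / c_K
    return math.ceil(math.log(tol) / math.log(1.0 - delta))

print(guaranteed_steps(0.1, 10.0, 1.5, 1e-6))  # wide interval: many steps
print(guaranteed_steps(0.5, 2.0, 1.5, 1e-6))   # narrower interval: fewer
```

As expected, the guaranteed iteration count grows rapidly as the truncation interval widens, which is the reason for the careful schedule in Sect. 5.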

Let us get to the proof of Theorem 18.

Proof (of Theorem 18)

Using Lemma 9, the equation (3.3) for \(u_\varepsilon \), the equation (4.1) for \(v_{n+1}\), and Young’s inequality (see Remark 32) we get, for arbitrary \(\gamma > 0\),

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )&\leqslant \int \limits _\Omega (A_\epsilon (\nabla v_n)-A_\epsilon (\nabla u_\varepsilon ))\cdot \nabla (v_n-u_\varepsilon )\, dx \\&= \int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n\vert )}{\vert \nabla v_n\vert }\nabla ( v_n - v_{n+1})\cdot \nabla (v_n-u_\varepsilon )\, dx \\&\leqslant \tfrac{1}{\gamma }\underbrace{ \tfrac{1}{2} \int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n\vert )}{\vert \nabla v_n\vert }\vert \nabla ( v_n - v_{n+1})\vert ^2\, dx}_{=:I} + \gamma \,\underbrace{\tfrac{1}{2} \int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n\vert )}{\vert \nabla v_n\vert }\vert \nabla (v_n-u_\varepsilon )\vert ^2\, dx}_{=:II} . \end{aligned}$$

Let us define

$$\begin{aligned} \mathcal {J}_\epsilon (v, a) := \mathcal {J}\left( v,\varepsilon _-\vee a\wedge \varepsilon _+\right) . \end{aligned}$$

For the first term I we calculate with the equation (4.1) for \(v_{n+1}\)

$$\begin{aligned} I&= \tfrac{1}{2} \int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n\vert )}{\vert \nabla v_n\vert }\vert \nabla ( v_n - v_{n+1})\vert ^2\, dx \\&= \mathcal {J}_\epsilon (v_n,{|{\nabla v_n}|})-\mathcal {J}_\epsilon (v_{n+1},{|{\nabla v_n}|}) \\&\leqslant \mathcal {J}_\epsilon (v_n,{|{\nabla v_n}|})-\mathcal {J}_\epsilon (v_{n+1},{|{\nabla v_{n+1}}|}) \\&= \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (v_{n+1}). \end{aligned}$$

To establish the inequality above, we used the fact that

$$\begin{aligned} a_{n+1} = \varepsilon _-\vee {|{\nabla v_{n+1}}|}\wedge \varepsilon _+ = \mathop {\mathrm {arg\,min}}\limits _{a\,:\,\epsilon _- \leqslant a \leqslant \epsilon _+} \mathcal {J}(v_{n+1}, a), \end{aligned}$$

and

$$\begin{aligned} \mathcal {J}_\epsilon (v_{n+1},{|{\nabla v_{n+1}}|})&= \mathcal J(v_{n+1},\varepsilon _-\vee {|{\nabla v_{n+1}}|}\wedge \varepsilon _+) \\&\leqslant \mathcal J(v_{n+1},\varepsilon _-\vee {|{\nabla v_{n}}|}\wedge \varepsilon _+) =\mathcal {J}_\epsilon (v_{n+1},{|{\nabla v_n}|}). \end{aligned}$$

For the second term II we use \(\varepsilon _+^{p-2}\leqslant \tfrac{\phi _\epsilon '(t)}{t} \leqslant \varepsilon _-^{p-2}\), implying \(\tfrac{\phi _\epsilon '(t)}{t} \leqslant (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}\tfrac{\phi _\epsilon '(s)}{s}\), for any \(s,t \geqslant 0\), Lemma 8 and Lemma 9 to get

$$\begin{aligned} II&\leqslant \tfrac{1}{2} (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}\int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n \vert \vee \vert \nabla u_\varepsilon \vert )}{\vert \nabla v_n \vert \vee \vert \nabla u_\varepsilon \vert }\vert \nabla (v_n-u_\varepsilon )\vert ^2\, dx \\&\leqslant c\, (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}\,(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )) . \end{aligned}$$

Putting all estimates together we get

$$\begin{aligned} \gamma (1-c\gamma (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p})\, ( \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon ))\leqslant \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (v_{n+1}). \end{aligned}$$

Now, \(\max _{\gamma >0} \gamma (1-c\gamma (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}) = \tfrac{1}{4c} (\tfrac{\varepsilon _-}{\varepsilon _+})^{2-p}\) yields the statement. \(\square \)

Example 20

(Peak function) Let \(\Omega := B_1(0)\) and \(f(x)=-{\text {div}}(\tfrac{x}{|x|})\). Then \(f\notin L^1(\Omega )\) but still \(f \in (W_0^{1,1}(\Omega ))^*\subset (W_0^{1,p}(\Omega ))^*\). The minimizer of \(\mathcal {J}\) is then given by \(u(x) = 1-{|{x}|}\), which looks like a peak. Since \(|\nabla u|\equiv 1\), the factor \({|{\nabla u}|}^{p-2}\) in the p-Laplace operator does not appear for the minimizer. So in this case u also minimizes every \(\mathcal {J}_\epsilon \) as long as \(\varepsilon _-\leqslant 1\) and \(\varepsilon _+\geqslant 1\). This follows from

$$\begin{aligned} \mathcal {J}_\epsilon (u) = \mathcal {J}(u)\leqslant \mathcal {J}(v)\leqslant \mathcal {J}_\epsilon (v) \end{aligned}$$

for all \(v\in W_0^{1,\phi _\epsilon }(\Omega )\).

Let us see how our algorithm performs with the starting value \(v_0:=0\). It is easy to see that \(v_n = \alpha _n u\) with

$$\begin{aligned} \alpha _0 := 0 \quad \text {and}\quad \alpha _{n+1}:=(\varepsilon _-\vee \alpha _n\wedge \varepsilon _+)^{2-p}. \end{aligned}$$
(4.3)

Since \(p\in (1,2)\) one can show \(\alpha _n = \varepsilon _-^{(2-p)^n}\) by induction and

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}(u) = {|{B_1(0)}|}\Big ( \tfrac{1}{p}\,\varepsilon _-^{p(2-p)^n} - \tfrac{1}{p} - \big (\varepsilon _-^{(2-p)^n} -1\big )\Big ). \end{aligned}$$

Note that

$$\begin{aligned} \tfrac{1}{p} t^p - \tfrac{1}{p} - (t-1) \leqslant \tfrac{p-1}{2} (\ln (t))^2 \qquad \text {for all }t \in (0,1]. \end{aligned}$$

Moreover,

$$\begin{aligned} \frac{\tfrac{1}{p} t^p - \tfrac{1}{p} - (t-1)}{\tfrac{p-1}{2} (\ln (t))^2}&\rightarrow 1 \qquad \text {as } t \nearrow 1. \end{aligned}$$
(4.4)

This estimate with \(s:=(2-p)^n \in (0,1]\) and \(t := \varepsilon _-^s \in (0,1]\) gives

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)- \mathcal {J}(u)\leqslant \tfrac{1}{2}|B_1(0)|(p-1)\ln (\varepsilon _-)^2(2-p)^{2n} \end{aligned}$$
(4.5)

This estimate is sharp. Indeed, the energy differences \(\mathcal {J}(v_n)-\mathcal {J}(u) = \mathcal {J}_\epsilon (v_n)-\mathcal {J}(u)\) asymptotically behave like \(\tfrac{1}{2}|B_1(0)|(p-1)\ln (\varepsilon _-)^2(2-p)^{2n}\) for large n, in view of (4.4).

This shows that it is impossible to get an energy reduction as in (4.2) with \(\delta \) independent of \(\epsilon \). Indeed, Corollary 19 would imply

$$\begin{aligned} \mathcal {J}_\epsilon (v_n) - \mathcal {J}_\epsilon (u_\epsilon ) \leqslant (1-\delta )^n \big (\mathcal {J}_\epsilon (v_0) - \mathcal {J}_\epsilon (u_\epsilon )\big ) \leqslant (1-\delta )^n \mathcal {J}_1(v_0), \end{aligned}$$

which contradicts the above asymptotic estimate (4.5).

Nevertheless, our asymptotic shows that in this particular case

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )\leqslant c_\epsilon \,(1-\delta )^n \end{aligned}$$

with \(1-\delta = (2-p)^2 <1\) independent of \(\epsilon \). However, it remains open whether such an estimate holds in the general case.
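The closed form for \(\alpha _n\) and the decay rate \((2-p)^{2n}\) are easy to confirm numerically; the following sketch does so for illustrative values of p and \(\epsilon \).

```python
import numpy as np

# Numerical check for Example 20: the recursion (4.3) started at alpha_0 = 0
# produces alpha_n = eps_-^((2-p)^n), and the energy gap is bounded by (4.5),
# which decays like (2-p)^(2n). The parameter values are illustrative.

p, eps_minus, eps_plus = 1.5, 1e-3, 1e3
B1 = np.pi                        # |B_1(0)| in two dimensions
alpha = 0.0
for n in range(1, 8):
    alpha = np.clip(alpha, eps_minus, eps_plus) ** (2.0 - p)
    assert np.isclose(alpha, eps_minus ** ((2.0 - p) ** n))
    t = alpha
    gap = B1 * (t**p / p - 1.0 / p - (t - 1.0))
    bound = 0.5 * B1 * (p - 1.0) * np.log(eps_minus) ** 2 * (2.0 - p) ** (2 * n)
    assert gap <= bound           # the estimate (4.5); the ratio tends to 1
```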

Remark 21

Although our considerations are all under the assumption \(1<p\leqslant 2\), it is interesting to check how the algorithm performs in the case \(p> 2\) for our Example 20.

If \(p\geqslant 3\) and \(\varepsilon _+:= \tfrac{1}{\varepsilon _-}\) for some \(\varepsilon _-<1\), then it follows from (4.3) that

$$\begin{aligned} \alpha _0 = 0\quad \text {and}\quad \alpha _n = \varepsilon _-^{(-1)^n(p-2)} \quad \text { for } n\geqslant 1 . \end{aligned}$$

Therefore, the Kačanov iteration does not converge for \(p\geqslant 3\).

If \(p \in (2,3)\) and \(\varepsilon _+:= \tfrac{1}{\varepsilon _-}\), then it follows from (4.3) that

$$\begin{aligned} \alpha _0 = 0\quad \text {and}\quad \alpha _n = \varepsilon _-^{(2-p)^n} \quad \text { for } n\geqslant 1 \end{aligned}$$

and \(v_n\) still converges to u.

5 Algebraic rate

As we learned in the last section the Kačanov iteration converges for fixed \(\epsilon \), but the rate depends badly on the choice of the relaxation interval \(\varepsilon =[\varepsilon _-,\varepsilon _+]\). Furthermore, we have algebraic convergence of the error \(\mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u)\) induced by the relaxation. We will combine these results to deduce an algebraic rate of the full error \(\mathcal {J}_{\epsilon _n}(v_n) -\mathcal {J}(u)\) in terms of n for a specific predefined choice of \(\epsilon _n\). To achieve our goal we will use that \(|\nabla u|\in L^{q,\infty }(\Omega )\) for some \(q>p\), which is justified by Theorems 13 and 14.

Let us consider a sequence of solutions created by our relaxed p-Kačanov algorithm. In particular, \(\epsilon _n=[\epsilon _{n,-},\epsilon _{n,+}]\) is now an increasing sequence of intervals. Then exactly as in Theorem 18 we get the following estimate.

Theorem 22

There is a constant \(c_K>1\) such that

$$\begin{aligned} \mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}_{\epsilon _n}(v_{n+1})&\geqslant \delta _n \,(\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}_{\epsilon _n}(u_{\varepsilon _n})) \end{aligned}$$

holds for \( \delta _n := \tfrac{1}{c_K} (\tfrac{\epsilon _{n,-}}{\epsilon _{n,+}})^{2-p}\).

Since \(\epsilon _n \subset \epsilon _{n+1}\), we have \(\mathcal {J}_{\epsilon _{n+1}} \leqslant \mathcal {J}_{\epsilon _n}\). This and Theorem 22 imply

$$\begin{aligned} \mathcal {J}_{\epsilon _{n+1}}(v_{n+1}) - \mathcal {J}(u)&\leqslant \mathcal {J}_{\epsilon _n}(v_{n+1}) - \mathcal {J}(u) \\&= \big (\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u)\big ) - \big ( \mathcal {J}_{\epsilon _n}(v_n)- \mathcal {J}_{\epsilon _n}(v_{n+1}) \big ) \\&\leqslant \big (\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u)\big ) - \delta _n \big ( \mathcal {J}_{\epsilon _n}(v_n)- \mathcal {J}_{\epsilon _n}(u_{\epsilon _n}) \big ) \\&= (1- \delta _n) \big (\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u)\big ) + \delta _n \big ( \mathcal {J}_{\epsilon _n}(u_{\epsilon _n}) - \mathcal {J}(u) \big ). \end{aligned}$$

Now, Lemma 12 and \(|\nabla u|\in L^{q,\infty }(\Omega )\) ensure the existence of \(c_R>0\) such that

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u) \leqslant c_R(\varepsilon _-^p + \varepsilon _+^{-(q-p)}) . \end{aligned}$$
(5.1)

This and the previous estimate therefore imply

$$\begin{aligned} \mathcal {J}_{\epsilon _{n+1}}(v_{n+1}) - \mathcal {J}(u)&\leqslant (1- \delta _n) \big (\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u)\big ) + \delta _n c_R(\epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)}).\nonumber \\ \end{aligned}$$
(5.2)

Without the last term \(\delta _n c_R(\epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)})\) we would get a reduction of the error \(\mathcal {J}_{\epsilon _n}(v_n) -\mathcal {J}(u)\) by the factor \((1-\delta _n)\). On the other hand, this last term is small if \(\epsilon _{n,-} \rightarrow 0\) and \(\epsilon _{n,+} \rightarrow \infty \), so it should not do much harm. Nevertheless, the reduction factor \((1-\delta _n)\) tends to 1 as \(\epsilon _{n,-} \rightarrow 0\) and \(\epsilon _{n,+} \rightarrow \infty \). The idea, however, is the following: if \(\delta _n\) goes to zero slowly enough, then the product \(\prod _{i=1}^n (1-\delta _i)\) still tends to zero algebraically.

Let us be more precise. We define another relaxed energy \(\mathcal {G}_n\) by

$$\begin{aligned} \mathcal {G}_n := \mathcal {J}_{\epsilon _n}(v_n) + K_1\, (\epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)}) \quad \text {and}\quad \mathcal {G}_\infty := \mathcal {J}(u) , \end{aligned}$$
(5.3)

where \(K_1>0\) will be determined below. Moreover, choose \(\alpha , \beta > 0\) such that

$$\begin{aligned} (\alpha +\beta )\,(2-p) \leqslant 1 \end{aligned}$$

(5.4)

and define

$$\begin{aligned} \varepsilon _n := [(n+1)^{-\alpha },(n+1)^\beta ]. \end{aligned}$$
(5.5)

Then

$$\begin{aligned} \delta _n&= \tfrac{1}{c_K} \big (\tfrac{\epsilon _{n,-}}{\epsilon _{n,+}}\big )^{2-p} = \tfrac{1}{c_K} ((n+1)^{-\alpha -\beta })^{2-p} \geqslant \tfrac{1}{c_K} \tfrac{1}{n+1}. \end{aligned}$$
(5.6)

In particular, the algorithm with this choice of \(\epsilon _n\) reads as follows:

[Algorithm c: the relaxed p-Kačanov iteration with the explicit schedule \(\epsilon _n = [(n+1)^{-\alpha },(n+1)^{\beta }]\) from (5.5)]
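A concrete instance of this schedule is sketched below, together with a check of (5.6); the particular \(\alpha \) and \(\beta \), chosen to balance the two terms of \(\rho _n\) and to give equality in (5.4), are our own illustrative assumption.

```python
# Sketch of the schedule (5.5): eps_n = [(n+1)^(-alpha), (n+1)^beta].
# The concrete alpha, beta below balance the two terms of rho_n and give
# equality in (5.4); they are an illustrative choice, not prescribed above.

p, q = 1.5, 3.0                       # q > p from the regularity assumption
alpha = (q - p) / (q * (2.0 - p))     # then alpha * p = (q - p) * beta
beta = p / (q * (2.0 - p))            # and (alpha + beta) * (2 - p) = 1

for n in range(5):
    eps_minus, eps_plus = (n + 1) ** (-alpha), (n + 1) ** beta
    delta_ratio = (eps_minus / eps_plus) ** (2.0 - p)   # = c_K * delta_n
    assert delta_ratio >= 1.0 / (n + 1) - 1e-12         # the bound (5.6)
    rho = eps_minus**p + eps_plus ** (-(q - p))         # relaxation error
    print(n, eps_minus, eps_plus, rho)
```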

We continue to derive a decay estimate for \(\mathcal {G}_n - \mathcal {G}_\infty \).

Lemma 23

There exists \(K_1=K_1(\alpha ,\beta ,p,q)\) (which appears in the definition of \(\mathcal {G}_n\)) and some \(c_3=c_3(\alpha ,\beta ,p,q)\geqslant 1\), such that for all \(n \in \mathbb {N}\)

$$\begin{aligned} (\mathcal {G}_{n+1} - \mathcal {G}_\infty )&\leqslant (1- \tfrac{1}{c_3 (n+1)}) ( \mathcal {G}_n - \mathcal {G}_\infty ). \end{aligned}$$

Proof

Define

$$\begin{aligned} \rho _n := \epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)} = (n+1)^{-\alpha p} + (n+1)^{-(q-p)\beta }. \end{aligned}$$

Hence it follows by Lemma 27 in the Appendix that there exists \(c_2 = c_2(\alpha ,\beta ,p,q) \geqslant 1\) with

$$\begin{aligned} \rho _n - \rho _{n+1}&\geqslant \frac{1}{c_2} \frac{1}{n+1} \rho _n. \end{aligned}$$
(5.7)

In particular, \(\rho _n\) satisfies a decay estimate!

On the other hand it follows from Theorem 22 that

$$\begin{aligned}&\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{\epsilon _{n+1}}(v_{n+1}) \\&\quad \geqslant \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{\epsilon _n}(v_{n+1}) \\&\quad \geqslant \delta _n (\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}_{\epsilon _n}(u_{\varepsilon _n})) \\&\quad \geqslant \frac{1}{c_K} \frac{1}{n+1} (\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}_{\epsilon _n}(u_{\varepsilon _n})) \\&\quad = \frac{1}{c_K} \frac{1}{n+1} (\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}(u)) - \frac{1}{c_K} \frac{1}{n+1}(\mathcal {J}_{\epsilon _n}(u_{\varepsilon _n}) - \mathcal {J}(u)). \end{aligned}$$

We deduce from (5.1), the definition of \(\epsilon _n\) and \(\rho _n\) that

$$\begin{aligned} \mathcal {J}_{\epsilon _n}(u_{\varepsilon _n})- \mathcal {J}(u) \leqslant c_R(\epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)}) = c_R \rho _n. \end{aligned}$$

This and the previous estimate prove

$$\begin{aligned} \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{\epsilon _{n+1}}(v_{n+1})&\geqslant \frac{1}{c_K} \frac{1}{n+1} \,(\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}(u)) - \frac{c_R}{c_K} \frac{1}{n+1} \rho _n. \end{aligned}$$

Since \(\mathcal {G}_n = \mathcal {J}_{\epsilon _n}(v_n) + K_1\rho _n\), it follows together with (5.7) that

$$\begin{aligned} \mathcal {G}_n - \mathcal {G}_{n+1}&\geqslant \frac{1}{c_K} \frac{1}{n+1} \,(\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}(u)) + \bigg ( \frac{K_1}{c_2} - \frac{c_R}{c_K}\bigg ) \frac{1}{n+1} \rho _n. \end{aligned}$$

We finally fix \(K_1\): we choose \(K_1\) so large that

$$\begin{aligned} \frac{K_1}{c_2} - \frac{c_R}{c_K}&\geqslant \frac{K_1}{2\,\max {\{{c_K,c_2}\}}}, \end{aligned}$$

which is always possible. Combining this with our previous estimates we deduce

$$\begin{aligned} \mathcal {G}_n - \mathcal {G}_{n+1}&\geqslant \frac{1}{2\, \max {\{{c_K,c_2}\}} } \frac{1}{n+1} (\mathcal {G}_n - \mathcal {G}_\infty ). \end{aligned}$$

This proves the lemma with \(c_3 = 2\, \max {\{{c_K,c_2}\}}\). \(\square \)

We are now able to present our convergence result.

Theorem 24

Let \(\nabla u\in L^{q,\infty }(\Omega )\) for some \(q>p\) (as given for example in Theorem 13 or Theorem 14). Then, the sequence \((v_n)_{n\in \mathbb N}\) produced by the algorithm described above satisfies

$$\begin{aligned} \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u) \leqslant \mathcal {G}_n-\mathcal {G}_\infty \leqslant n^{-\frac{1}{c_3}}\,(\mathcal {G}_0 - \mathcal {G}_\infty ), \end{aligned}$$

where \(c_3\) is the constant of Lemma 23. In particular, the energy error decreases at least algebraically.

Proof

The estimate \(\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u) \leqslant \mathcal {G}_n-\mathcal {G}_\infty \) is obvious, so it remains to prove the decay of \(\mathcal {G}_n - \mathcal {G}_\infty \). It follows from Lemma 23 that, for \(n \in \mathbb {N}\),

$$\begin{aligned} \mathcal {G}_n - \mathcal {G}_\infty&\leqslant \prod _{i=0}^{n-1} \big ( 1 - \tfrac{1}{c_3 (i+1)}\big ) (\mathcal {G}_0 - \mathcal {G}_\infty ). \end{aligned}$$

Now,

$$\begin{aligned} \prod _{i=0}^n \Big (1- \tfrac{1}{c_3(i+1)} \Big )&= \exp \bigg ( \sum _{i=0}^n\ln \Big (1- \tfrac{1}{c_3(i+1)} \Big ) \bigg ) \leqslant \exp \bigg ( -\sum _{i=0}^n \frac{1}{c_3(i+1)} \bigg ) \\&\leqslant \exp \bigg ( - \frac{\ln (n)}{c_3} \bigg ) = n^{-\frac{1}{c_3}}. \end{aligned}$$

This proves the theorem. \(\square \)
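The product estimate at the end of the proof is also easy to verify numerically; in the following sketch the value of \(c_3\) is a hypothetical assumption.

```python
# Numerical check of the product estimate in the proof above:
#   prod_{k=1}^{n} (1 - 1/(c_3 k)) <= n^(-1/c_3).
# The value of c_3 below is a hypothetical assumption.

c3 = 4.0
prod = 1.0
for n in range(1, 10001):
    prod *= 1.0 - 1.0 / (c3 * n)          # factor for i = n - 1
    if n % 2500 == 0:
        print(n, prod, n ** (-1.0 / c3))  # prod stays below the bound
```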

Remark 25

We have seen that the choice \(\epsilon _n = [(n+1)^{-\alpha }, (n+1)^\beta ]\) ensures that the error decreases at least with an algebraic rate. However, the decay of the relaxed energy error \(\mathcal {G}_n- \mathcal {G}_\infty \) can never be faster than algebraic with this choice of \(\epsilon _n\); in this sense, the choice is also quite restrictive. In the numerical experiments we performed, we have seen that it is possible to decrease \(\epsilon _{n,-}\) and increase \(\epsilon _{n,+}\) much faster and still obtain convergence. Moreover, the observed convergence is much faster than algebraic, and more of exponential type. We will present the details of such numerical experiments in a subsequent work. Let us summarize: the algorithm of this section ensures an algebraic convergence rate, but in practice we expect better behavior for other, perhaps adaptive, choices of \(\epsilon _n\), still to be fully investigated.

6 Numerical experiments

We have performed numerous experiments on the basis of the adaptive finite element method with piecewise linear elements. We developed preliminary versions of error estimators that capture the effect of the truncation, the adaptivity of the mesh, and the fixpoint iteration. Let \(v_n\) denote the iterate generated by the algorithm; we used the following ad hoc estimators:

  • We use

    $$\begin{aligned} \eta ^2_{\epsilon ^+}(v_n)&:= \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{(\epsilon _{n,-},\infty )}(v_n) \end{aligned}$$

    to measure the effect of the upper truncation bound \(\epsilon _{n,+}\) and

    $$\begin{aligned} \eta ^2_{\epsilon ^-}(v_n)&:= \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{(0,\epsilon _{n,+})}(v_n) \end{aligned}$$

    to measure the effect of the lower truncation bound \(\epsilon _{n,-}\); a small computational sketch of these two indicators follows this list.

  • We use the optimal estimators of [2, 12] with the Orlicz function \(\phi _{\epsilon _n}\) to estimate the error due to mesh refinement, i.e. on elements T we use the estimators

    $$\begin{aligned} \eta ^2_h(v_n)&:= \int _T (\phi _{\epsilon _n, {|{\nabla v_n}|}})^*(h_T {|{f}|})\,dx + \sum _{\gamma \subset \partial T} h_\gamma \! \int _\gamma {\big |{\llbracket {V_{\epsilon _n}(\nabla v_n)}\rrbracket _\gamma }\big |}^2\,ds. \end{aligned}$$
  • To measure the error due to the fixpoint iteration (for \(n \geqslant 1\))

    (6.1)

    which is in fact an upper bound for \(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )\) and \(\mathcal {J}_\epsilon (v_{n-1})-\mathcal {J}_\epsilon (u_\varepsilon )\).
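As announced in the first item of the list, here is a small sketch of the two truncation indicators in a 1D finite-difference setting; approximating the limits \(\epsilon _-=0\) and \(\epsilon _+=\infty \) by extreme finite values is an implementation shortcut of this sketch, not part of the definition.

```python
import numpy as np

# Sketch of the truncation indicators eta^2_{eps+} and eta^2_{eps-} in a 1D
# setting: both are differences of discrete relaxed energies J_eps. The
# limits eps_- = 0 and eps_+ = infinity are approximated by extreme finite
# values, which is an implementation shortcut.

def kappa_eps(t, p, em, ep):
    a = np.clip(t, em, ep)
    return 0.5 * a**(p - 2) * t**2 + (1.0 / p - 0.5) * a**p

def J_eps(v, f, h, p, em, ep):
    grad = np.abs(np.diff(v)) / h
    return h * np.sum(kappa_eps(grad, p, em, ep)) - h * np.dot(f, v[1:-1])

def eta2_plus(v, f, h, p, em, ep):   # effect of the upper truncation bound
    return J_eps(v, f, h, p, em, ep) - J_eps(v, f, h, p, em, 1e12)

def eta2_minus(v, f, h, p, em, ep):  # effect of the lower truncation bound
    return J_eps(v, f, h, p, em, ep) - J_eps(v, f, h, p, 1e-12, ep)
```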

We used these estimators to implement a fully adaptive version of our relaxed p-Kačanov iteration.

[Algorithm d: fully adaptive relaxed p-Kačanov iteration, steering the truncation, the mesh refinement, and the fixpoint iteration by the estimators \(\eta ^2_{\epsilon ^+}\), \(\eta ^2_{\epsilon ^-}\), \(\eta ^2_h\), and the fixpoint indicator of (6.1)]

Let us present three experiments that measure different aspects. We have chosen quite critical situations and small exponents p in order to see how the algorithm behaves in particularly bad cases. In particular, we have chosen the quite small exponent \(p= \frac{16}{15}\) for all of our experiments presented here; the results are in fact much nicer for p closer to 2.

  • The bump On \(\Omega = [-1,1]^2\) choose f and the boundary values of u such that \(u(x)=(x_1^2\!-\!1)(x_2^2\!-\!1)\). There is only one critical point at (0, 0), where \(\nabla u(0,0)=0\). This example is chosen to see how the algorithm behaves for smooth functions with isolated extrema.

  • The needle On \(\Omega = [-1,1]^2\) choose f and the boundary values of u such that \(u(x) = {|{x}|}^{1-\frac{1}{p}} -1\). In this case \(\nabla u(x) = {|{x}|}^{-\frac{1}{p}} \frac{x}{{|{x}|}}\) and \(A(\nabla u) = {|{x}|}^{-\frac{1}{p'}} \frac{x}{{|{x}|}}\) and \(V(\nabla u) = {|{x}|}^{-\frac{1}{2}} \frac{x}{{|{x}|}}\). This example is chosen such that \(V(\nabla u) \in W^{\frac{1}{2}}L^{2,\infty }(\Omega )\), meaning that half a derivative of \(V(\nabla u)\) is still in the Lorentz space \(L^{2,\infty }\). This corresponds to the regularity of a p-harmonic function on a slit domain. It is the critical regularity that allows optimal estimates for the adaptive finite element method, while uniform refinement only yields a rate of \(O(h^{1/2})\) instead of the optimal rate O(h).

  • Constant force in slit domain Choose \(f=2\) on the slit domain \(\Omega = (-1,1)^2 \setminus [0,1]^2\) and \(u=0\) on its boundary. In contrast to the linear case \(p=2\), there is no explicit formula for the exact solution. The reentrant corner reduces the regularity of the solution.

There is a notable gap between our theory and the numerical experiments that we performed: the experiments show a significantly faster convergence rate than our worst-case analysis predicts. This indicates that we are on a good track and have developed a good algorithm.

Figures 1, 2 and 3 show the results of our experiments. The pictures show a log-log plot of energy accuracy versus the computational costs. Let us explain the picture from Fig. 1 in more detail; the other figures are analogous. The dotted black line is the most important one and shows the energy difference \(\mathcal {J}_\epsilon (u_n)-\mathcal {J}(u)\), which is the measure of the error between the computed solution and the exact solution. The lines \(\epsilon _+\) and \(\epsilon _-\) represent the truncation parameters \((\epsilon _-,\epsilon _+)\). The other lines represent the error indicators for \((\epsilon _-,\epsilon _+)\), named \(\eta ^2_{\epsilon ^+}\) and \(\eta ^2_{\epsilon ^-}\), for the fixpoint iteration, and for refinement, \(\eta _h^2(u_h)\). The straight black line \(\text {costs}^{-1}\) is the optimal convergence rate (1/costs), where the costs are the accumulated sum of the degrees of freedom over all steps that require solving a linear system. (Here we have assumed a linear cost for solving the linear system, which might be achievable with a multigrid method.) It is important that we use the accumulated cost instead of the degrees of freedom, since only this truly measures the effort, in particular if the number of fixpoint iterations increases.

Fig. 1: The bump

Fig. 2: The needle

Fig. 3: Constant force in slit domain

Let us explain the numerical results in more detail.

  • Bump The algorithm converges optimally with respect to the accumulated costs. The exact solution has a bounded gradient, so the upper truncation bound \(\epsilon _{n,+}\) does not increase much. There is only one critical point at \(x=0\), where \(\nabla u(0)=0\). Thus, the algorithm decreases the lower truncation bound \(\epsilon _{n,-}\) quite moderately, since the error from truncation appears only in a small neighborhood around zero. All estimators decrease nicely with the optimal rate (1/cost). The number of fixpoint iterations remains bounded, so the cost is proportional to the current number of degrees of freedom.

  • The needle The algorithm converges optimally with respect to the accumulated costs. The exact solution has an unbounded gradient at the isolated point zero. Therefore, the upper truncation bound \(\epsilon _{n,+}\) is increased by the algorithm. This upper truncation starts quite late, since it is only necessary in a set of small measure. The number of fixpoint iterations remains bounded.

  • Constant force in slit domain The gradient of the exact solution is bounded, so the upper truncation bound \(\epsilon _{n,+}\) does not increase much. The energy error still decreases very fast. It decreases almost optimally with respect to the accumulated cost, but the slope seems slightly worse. It is, however, still much faster than our worst-case theory predicts.

Overall, we see that our algorithm converges with a rate that is optimal in many cases.