1 Introduction

In this paper we approach the numerical solution of the p-Poisson problem

$$\begin{aligned} -{\text {div}}(|\nabla u|^{p-2}\nabla u) = f\quad \text {in }\Omega , \nonumber \\ u = 0\quad \text {on }\partial \Omega , \end{aligned}$$
(1.1)

where \(\Omega \subset \mathbb {R}^d\) is open and bounded and \(1<p<\infty \). The solution might be scalar or vector-valued.

Nonlinear problems of this type appear in many applications, e.g. non-Newtonian fluid theory [21], turbulent flow of a gas in porous media, glaciology, or plasticity modeling. Moreover, the p-Laplacian has a similar model character for nonlinear problems as the ordinary Laplace operator has for linear problems; see [22] for an introduction.

As usual we are looking for the weak solution of (1.1). In particular, we are searching for a function \(u\in W_0^{1,p}(\Omega )\) such that

$$\begin{aligned} \int \limits _\Omega |\nabla u|^{p-2} \nabla u \cdot \nabla \xi \, dx = \langle f,\xi \rangle \quad \forall \xi \in W_0^{1,p}(\Omega ), \end{aligned}$$
(1.2)

where in the most general case \(f\in (W_0^{1,p}(\Omega ))^*\). It is well-known that the solution is unique and coincides with the minimizer of the energy \(\mathcal {J}: W_0^{1,p}(\Omega )\rightarrow \mathbb {R}\) defined by

$$\begin{aligned} \mathcal {J}(v) := \tfrac{1}{p}\int \limits _\Omega |\nabla v|^p \, dx -\langle f,v\rangle . \end{aligned}$$
(1.3)

Due to the nonlinearity of the problem it is harder to obtain efficient numerical solutions with guaranteed performance. Our goal is to construct solutions of (1.2) by means of a numerically accessible algorithm. In particular, we construct an iterative algorithm that approximates solutions of (1.2), where in each step only a linear elliptic problem has to be solved. Primarily, we focus here on the iteration on the infinite-dimensional space \(W^{1,p}_0(\Omega )\). However, the same algorithm will immediately apply also to discretized versions of the p-Poisson problem, e.g., by means of finite elements or wavelets. This approach would coincide with the one adopted, for instance, in [5] of first finding an iteration on the infinite-dimensional solution space and then discretizing in space. We will consider in subsequent work the effect of the discretization and its adaptation to error estimators.

In this paper we restrict ourselves to the case \(p\in (1,2]\): we are particularly interested in relatively small values of p, and the case \(p>2\) is already addressed to a certain extent in [5]. We will see, e.g., in Example 20 that our algorithm actually only works properly in the range \(p \in (1,2]\).

Coming from the weak formulation (1.2) one can interpret the problem as a weighted Poisson problem

$$\begin{aligned} \int \limits _\Omega a^{p-2} \nabla u\cdot \nabla \xi \, dx =\langle f,\xi \rangle \quad \forall \xi \in W_0^{1,2}(\Omega ) \end{aligned}$$
(1.4)

for the given f, where \(a:\Omega \rightarrow \mathbb {R}\) and \(a=|\nabla u|\). This suggests iteratively calculating, for a given function \(v_n\), the new iterate \(v_{n+1}\) as the solution of

$$\begin{aligned} \int \limits _\Omega |\nabla v_n|^{p-2} \nabla v_{n+1}\cdot \nabla \xi \, dx = \langle f,\xi \rangle \quad \forall \xi \in W_0^{1,2}(\Omega ). \end{aligned}$$

The advantage of this step is that the calculation of \(v_{n+1}\) only requires solving a linear problem. This allows invoking relatively standard approaches to discretize this step and solve it numerically with guaranteed performance. The problem with this approach, however, is that the weighted Poisson problem is only well posed if the weight is bounded from above and from below away from zero, whereas the weight \(|\nabla v_n|^{p-2}\) may degenerate at points where \({|{\nabla v_n}|}=0\) or \({|{\nabla v_n}|}=\infty \).

To overcome this problem we will use a relaxation argument. To this end, we introduce in our algorithm two relaxation parameters \(\epsilon _-, \epsilon _+ \in (0,\infty )\) with \(\epsilon _- \leqslant \epsilon _+\) that ensure that the weight is truncated properly from below and above. In particular, we replace a by its truncation

$$\begin{aligned} \epsilon _- \vee a \wedge \epsilon _+ := \max {\{{ \epsilon _-, \min {\{{ a, \epsilon _+}\}}}\}}. \end{aligned}$$

Note that this is just the (pointwise) closest point projection of a to the truncation interval \( [\epsilon _-,\epsilon _+]\). The limits \(\epsilon _- \searrow 0\) and \(\epsilon _+ \nearrow \infty \) will recover the unrelaxed or original problem. We also write \(\epsilon :=[\epsilon _-, \epsilon _+]\) and interpret \(\epsilon \) both as a pair \(\{\epsilon _-,\epsilon _+\}\) and as the truncation interval \([\epsilon _-,\epsilon _+]\). We will write \(\epsilon \rightarrow [0,\infty ]\) as a short version of \(\epsilon _- \searrow 0\) and \(\epsilon _+ \nearrow \infty \). We will see later, see Corollary 15, that for f in the Lorentz space \(L^{d,1}(\Omega )\) the lower parameter \(\epsilon _-\) is the crucial one.

According to these considerations we propose the following algorithm:

[Algorithm a (relaxed p-Kačanov iteration): given \(v_0\) and \(\epsilon _0\), compute \(v_{n+1}\) from \(v_n\) by solving the linear problem with the truncated weight \((\epsilon _{n,-} \vee {|{\nabla v_n}|} \wedge \epsilon _{n,+})^{p-2}\), and enlarge the relaxation interval \(\epsilon _n\)]

Since \(0< \epsilon _{n,-} \leqslant \epsilon _{n,+}<\infty \), the equation for \(v_{n+1}\) in the algorithm is always well defined, as it is uniformly elliptic (with constants depending on \(\epsilon _n\)).
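To make this concrete, here is a minimal numerical sketch of the iteration in one space dimension, assuming a uniform finite-difference grid, the right-hand side \(f=1\), and an illustrative schedule for \(\epsilon _n\); none of these choices are prescribed by the algorithm. Each step solves one uniformly elliptic, tridiagonal linear system, exactly as described above.

```python
import numpy as np

# Minimal sketch of the relaxed p-Kacanov iteration in 1D on Omega = (0,1),
# assuming a uniform finite-difference grid and an illustrative schedule for
# the truncation interval eps_n. Each step solves the linear problem
#   -(w v')' = f,  v(0) = v(1) = 0,
# with the truncated weight w = max(eps_-, min(|v_n'|, eps_+))^(p-2).

p = 1.5                          # exponent, 1 < p <= 2
N = 200                          # number of grid intervals
h = 1.0 / N
f = np.ones(N - 1)               # right-hand side at the interior nodes

v = np.zeros(N + 1)              # starting value v_0 = 0
for n in range(50):
    eps_minus = 1.0 / (n + 1)    # illustrative relaxation schedule
    eps_plus = float(n + 1)
    grad = np.diff(v) / h        # piecewise constant gradient of v_n
    a = np.clip(np.abs(grad), eps_minus, eps_plus)
    w = a ** (p - 2.0)           # truncated weight, uniformly elliptic
    # assemble the tridiagonal stiffness matrix of -(w v')'
    main = (w[:-1] + w[1:]) / h**2
    off = -w[1:-1] / h**2
    A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
    v[1:-1] = np.linalg.solve(A, f)   # one Kacanov step = one linear solve

print("max of final iterate:", v.max())
```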

This algorithm is not completely new in the realm of quasi-linear equations. Such an iterative linearization approach is in fact called the Kačanov method in [17, 18], and we refer to those papers for additional references related to the history of this method for solving quasi-linear equations numerically. In [6, 27] it was also proposed and analyzed to solve total variation minimization problems in image processing, which can be formally related to the 1-Laplace differential operator.

Unfortunately, the results obtained in these aforementioned papers cannot be applied straightforwardly to justify the convergence of the Kačanov iteration for equations involving the p-Laplace operator. In particular, to obtain quantitative estimates of convergence with precise rates, as we do in this paper, one needs to employ several finer tools, which have been explored in, e.g., [2, 11, 12, 24], precisely to handle singularities in nonlinear differential operators such as the p-Laplacian. In particular, the theory of N-functions and Orlicz spaces [19], shifted N-functions [11], and Lipschitz truncations, see [13] and [4], have been used systematically in the analysis of such nonlinear operators, allowing the development of a potential theory analogous to the one known for linear equations.

Besides these tools from nonlinear potential theory, the variational formulation of the algorithm, as introduced first in [6], and further used to analyze other related iteratively re-weighted least squares algorithms [10, 16], offers the right framework for the analysis also of the p-Kačanov iteration.

Taking inspiration from [6, 10], in Sect. 2 we provide the variational derivation of this algorithm based on the alternating minimization of a relaxed energy with two parameters.

If we apply the algorithm with a fixed relaxation parameter \(\epsilon \) independent of n, i.e. \(0<\epsilon _- \leqslant \epsilon _+ < \infty \), then our iterates \(v_n\) converge to the unique minimizer \(u_\epsilon \) of another, one-parameter relaxed energy \(\mathcal {J}_\epsilon \). We study this limit in Sect. 4 and present exponential (linear) rates of convergence.

In Sect. 3 we study how the minimizers \(u_\epsilon \) of the relaxed energy \(\mathcal {J}_\epsilon \) converge to the minimizer u of the original problem. This convergence can also be interpreted as a limit in the sense of \(\Gamma \)-convergence [3, 9]. Differently from, e.g., [6], we use a novel argument based on the Lipschitz truncation technique to establish a recovery sequence for the \(\Gamma \)-\(\limsup \). In particular, thanks to the finer tools mentioned above, we can go beyond a pure compactness argument as provided by the \(\Gamma \)-limit and derive precise rates of convergence depending on \(\epsilon \).

Finally, in Sect. 5 we combine the estimates of the two previous sections to deduce an overall error analysis with algebraic rates.

2 Variational formulation of the algorithm

In this section we show that the algorithm can be deduced from an alternating minimization of a relaxed energy. Recall that \(1 < p \leqslant 2\) throughout this article. Since the case \(p=2\) is just the standard Laplace problem, it suffices in the following to consider the case \(1<p<2\) only.

Let us introduce some standard notation. We use \(W^{1,p}(\Omega )\) and \(W^{1,p}_0(\Omega )\) for the Sobolev space without and with zero boundary values. We use c for a generic positive constant whose value may change from line to line. We use \(f \lesssim g\) for \(f \leqslant c\, g\). We also write \(f \eqsim g\) for \(f \lesssim g\) and \(g \lesssim f\).

The most important feature of the algorithm is that it only needs to solve linear sub-problems, which carry their own energy depending on the weight. Therefore, very much inspired by the work [6, 10] and with appropriate adjustments, we extend the energy by an additional parameter \(a \,:\, \Omega \rightarrow [0,\infty )\) such that the new functional is quadratic with respect to v. In particular, we define

$$\begin{aligned} \mathcal {J}(v,a) := \int \limits _\Omega \tfrac{1}{2} a^{p-2}|\nabla v|^2 + \left( \tfrac{1}{p} - \tfrac{1}{2}\right) a^p\, dx - \langle f,v\rangle . \end{aligned}$$

This energy is well-defined for all \(v \in W^{1,p}_0(\Omega )\) and measurable \(a\,:\, \Omega \rightarrow [0,\infty )\) but might take the value \(\infty \).

This relaxed energy is convex with respect to \((v,a)\). Indeed, the term \((\tfrac{1}{p} - \tfrac{1}{2})a^p\) is convex in a (note that \(1 < p \leqslant 2\) makes its coefficient nonnegative), and \(\beta (t,a) := \frac{1}{2} a^{p-2}t^2\) is convex on \([0,\infty )^2\), since

$$\begin{aligned} (\nabla ^2 \beta )(t,a) = \begin{pmatrix} a^{p-2} &{} (p-2)a^{p-3}t \\ (p-2)a^{p-3}t &{}\quad \frac{1}{2} (p-2)(p-3)a^{p-4}t^2 \end{pmatrix} \end{aligned}$$

is nonnegative definite as \(a^{p-2}\geqslant 0\) and \(\det ((\nabla ^2 \beta )(t,a))=\tfrac{1}{2}\,a^{2p-6} t^2 (2-p)(p-1)\geqslant 0\). Notice that in the latter lower bound we specifically used \(1 < p \leqslant 2\).

Remark 1

If \(p>2\), then the relaxed energy \(\mathcal {J}(v,a)\) is neither bounded from below nor convex with respect to a. Therefore, the algorithm derived below using the minimization with respect to a does not lead to a feasible problem for \(p>2\). See also Remark 21.

Note that \(\mathcal {J}(v,a)\) (for fixed a) is quadratic with respect to v and a minimization with respect to v leads formally to the elliptic equation

$$\begin{aligned} -{\text {div}}(a^{p-2} \nabla v)&= f, \end{aligned}$$

see (1.4) for its weak form.

Unfortunately, the ellipticity of this system degenerates for \(a(x)\rightarrow 0\) and \(a(x) \rightarrow \infty \). To overcome this problem we restrict the minimization with respect to a (for fixed v) to functions with values within a relaxation interval \([\epsilon _-, \epsilon _+] \subset (0,\infty )\), i.e. \(\epsilon _- \leqslant a(x) \leqslant \epsilon _+\). This minimization with respect to a (for fixed v) has a simple solution, namely

$$\begin{aligned} \mathop {\mathrm {arg\,min}}_{a\,:\,\epsilon _- \leqslant a \leqslant \epsilon _+} \mathcal {J}(v,a) = \epsilon _- \vee {|{\nabla v}|} \wedge \epsilon _+, \end{aligned}$$
(2.1)

where \(\vee \) denotes the maximum and \(\wedge \) the minimum, since

$$\begin{aligned} \tfrac{\partial }{\partial a} \Big ( \tfrac{1}{2} a^{p-2}|\nabla v|^2 + (\tfrac{1}{p} - \tfrac{1}{2})a^p \Big ) = \tfrac{2-p}{2} a^{p-3} (a^2 - {|{\nabla v}|}^2). \end{aligned}$$
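As a quick illustration of (2.1), the following sketch minimizes the integrand over a fine grid of admissible a and compares the result with the clipped value; the parameter values are arbitrary.

```python
import numpy as np

# Numerical check of (2.1): for fixed t = |grad v|, the map
#   a -> 0.5 * a^(p-2) * t^2 + (1/p - 1/2) * a^p
# on [eps_-, eps_+] is minimized by the clipped value eps_- v t ^ eps_+.

p, em, ep = 1.5, 0.2, 5.0
a = np.linspace(em, ep, 100001)
for t in (0.05, 1.0, 40.0):      # below, inside, and above the interval
    J = 0.5 * a**(p - 2) * t**2 + (1.0 / p - 0.5) * a**p
    a_star = a[np.argmin(J)]     # discrete minimizer
    assert abs(a_star - np.clip(t, em, ep)) < 1e-3
```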

This allows us to define for fixed \(\epsilon = [\varepsilon _-,\varepsilon _+] \subset [0,\infty ]\) another relaxed energy

$$\begin{aligned} \mathcal {J}_\epsilon (v) := \mathcal {J}\left( v,\varepsilon _-\vee \vert \nabla v\vert \wedge \varepsilon _+\right) = \min _{a\,:\,\epsilon _- \leqslant a \leqslant \epsilon _+} \mathcal {J}(v,a). \end{aligned}$$
(2.2)

This immediately implies that the relaxed energy \(\mathcal {J}_\epsilon (v)\) is monotonically decreasing with respect to \(\epsilon \), i.e., enlarging the interval \(\epsilon \) in the sense of inclusion decreases the energy \(\mathcal {J}_\epsilon (v)\).

This new relaxed energy \(\mathcal {J}_\epsilon \) somehow “hides” the constrained minimization with respect to a. We can write \(\mathcal {J}_\epsilon \,:\, W^{1,p}_0(\Omega ) \rightarrow \mathbb {R}\cup {\{{\infty }\}}\) explicitly as

$$\begin{aligned} \mathcal {J}_\epsilon (v) = \int \limits _\Omega \kappa _\epsilon (\vert \nabla v\vert )\, dx - \langle f,v\rangle \end{aligned}$$

with \(\kappa _\epsilon \,:\,\mathbb {R}_{\geqslant 0}\rightarrow \mathbb {R}\) given by

$$\begin{aligned} \kappa _\epsilon (t)&:= {\left\{ \begin{array}{ll} \frac{1}{2}\varepsilon _-^{p-2}t^2 + (\frac{1}{p}-\frac{1}{2})\varepsilon _-^p &{} \text {for }t\leqslant \varepsilon _-\\ \frac{1}{p}t^p &{} \text {for } \varepsilon _-\leqslant t\leqslant \varepsilon _+\\ \frac{1}{2}\varepsilon _+^{p-2}t^2 + (\frac{1}{p}-\frac{1}{2})\varepsilon _+^{p} &{} \text {for } t\geqslant \varepsilon _+. \end{array}\right. } \end{aligned}$$

Note that \(\frac{1}{p} t^p \leqslant \kappa _\epsilon (t)\) for all \(t \geqslant 0\) and \(\frac{1}{p} t^p = \lim _{\epsilon \rightarrow [0,\infty ]} \kappa _\epsilon (t)\) for all \(t \geqslant 0\). Since \(\kappa _\epsilon (t) \eqsim \epsilon _+^{p-2} t^2\) for large t, we see that \(\mathcal {J}_\epsilon (v) < \infty \) if and only if \(v \in W^{1,2}_0(\Omega )\). Moreover, \(\lim _{\epsilon \rightarrow [0,\infty ]} \mathcal {J}_\epsilon (v) = \mathcal {J}(v)\) for all \(v \in W^{1,2}_0(\Omega )\) and \(\mathcal {J}(v) \leqslant \liminf _{\epsilon \rightarrow [0,\infty ]} \mathcal {J}_\epsilon (v)\) for all \(v \in W^{1,p}_0(\Omega )\).
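As a small computational aside, with \(a = \epsilon _- \vee t \wedge \epsilon _+\) the three branches of \(\kappa _\epsilon \) collapse into the single expression \(\tfrac{1}{2} a^{p-2}t^2 + (\tfrac{1}{p}-\tfrac{1}{2})a^p\); the sketch below uses this and verifies that \(\kappa _\epsilon (t) \geqslant \tfrac{1}{p}t^p\), with equality on the truncation interval. The parameter values are arbitrary.

```python
import numpy as np

# Sketch of the truncated integrand kappa_eps: with a = eps_- v t ^ eps_+,
# all three branches of the definition collapse into
#   kappa_eps(t) = 0.5 * a^(p-2) * t^2 + (1/p - 1/2) * a^p.
# We check that kappa_eps(t) >= t^p / p everywhere and that equality holds
# on the truncation interval [eps_-, eps_+].

def kappa_eps(t, p, em, ep):
    a = np.clip(t, em, ep)
    return 0.5 * a**(p - 2) * t**2 + (1.0 / p - 0.5) * a**p

p, em, ep = 1.5, 0.1, 10.0
t = np.linspace(0.0, 20.0, 2001)
assert np.all(kappa_eps(t, p, em, ep) >= t**p / p - 1e-12)
inside = (t >= em) & (t <= ep)
assert np.allclose(kappa_eps(t[inside], p, em, ep), t[inside]**p / p)
```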

Based on the above observations it is natural to minimize \(\mathcal {J}(v,a)\) iteratively, alternating between v and a. Of course, we also have to enlarge the relaxation interval \(\epsilon \). Thus our algorithm reads as follows:

[Algorithm b: alternating minimization of \(\mathcal {J}(v,a)\) in v and a over growing truncation intervals \(\epsilon _n\)]

This is just the algorithm given in the introduction, written in a different form.

3 Convergence in the relaxation parameter

In this section we show that the minimizers \(u_\epsilon \) of the relaxed energy \(\mathcal {J}_\epsilon \) converge to the minimizer u of \(\mathcal {J}\) for \(\epsilon \rightarrow [0,\infty ]\) and derive an upper bound for the relaxation error.

Since \(\mathcal {J}_\epsilon (v) \geqslant \mathcal {J}(v)\) and \(\mathcal {J}\) is coercive on \(W^{1,p}_0(\Omega )\), it follows that \(\mathcal {J}_\epsilon \) is also coercive on \(W^{1,p}_0(\Omega )\). However, \(\mathcal {J}_\epsilon (v)<\infty \) requires \(v \in W^{1,2}_0(\Omega )\), as we have seen above. Certainly, there is a gap between the spaces \(W^{1,p}_0(\Omega )\) and \(W^{1,2}_0(\Omega )\). To close this gap we need a finer analysis of the energies, which requires the use of Orlicz spaces. We state in the following some standard results for these spaces; see for example [19].

A function \(\phi :\mathbb {R}_{\geqslant 0}\rightarrow \mathbb {R}\) is called an N-function if and only if there is a right-continuous, non-decreasing function \(\phi ':\mathbb {R}_{\geqslant 0}\rightarrow \mathbb {R}\) that is positive on the positive real line, with \(\phi '(0)=0\) and \(\lim _{t \rightarrow \infty } \phi '(t)=\infty \), such that \(\phi (t) = \int _0^{t}\phi '(\tau )\, d\tau \). An N-function is said to satisfy the \(\Delta _2\)-condition if and only if there is a constant \(c>1\) such that \(\phi (2t)\leqslant c\,\phi (t)\) for all \(t \geqslant 0\). For an N-function satisfying the \(\Delta _2\)-condition we define the Orlicz space \(L^\phi (\Omega )\) to consist of those functions \(v \in L_{{\mathrm {loc}}}^1(\Omega )\) with \(\int _\Omega \phi (|v|)\, dx<\infty \). It becomes a Banach space with the norm \({\Vert {v}\Vert }_\phi := \inf {\{{\gamma >0\,:\, \int _\Omega \phi ({|{v}|}/\gamma )\,dx\leqslant 1}\}}\). The Orlicz-Sobolev space \(W^{1,\phi }(\Omega )\) then consists of those \(v \in L^\phi (\Omega )\) such that the weak derivative \(\nabla v\) is also in \(L^\phi (\Omega )\), equipped with the norm \({\Vert {v}\Vert }_\phi + {\Vert {\nabla v}\Vert }_\phi \). The space \(W^{1,\phi }_0(\Omega )\) denotes the subspace of those functions from \(W^{1,\phi }(\Omega )\) with zero boundary trace, which coincides with the closure of \(C^\infty _0(\Omega )\) in \(W^{1,\phi }(\Omega )\). For example, choosing \(\phi (t):=\tfrac{1}{p} t^p\) we have \(L^\phi (\Omega ) = L^p(\Omega )\) and \(W^{1,\phi }_0(\Omega ) = W^{1,p}_0(\Omega )\).

The function \(\kappa _\epsilon \) is not an N-function, since \(\kappa _\epsilon (0)\ne 0\). However, if we define

$$\begin{aligned} \phi _\epsilon (t) := \kappa _\epsilon (t) - \kappa _\epsilon (0), \end{aligned}$$
(3.1)

then \(\phi _\epsilon \) is actually an N-function. It can be verified that \(\phi _\epsilon \) satisfies the \(\Delta _2\)-condition with a constant independent of \(\epsilon \).

Since \(\phi _\epsilon (t) \eqsim \epsilon _+^{p-2} t^2\) for large t and \(\Omega \) is bounded, we have \(L^{\phi _\epsilon }(\Omega ) \eqsim L^2(\Omega )\). However, the constant of the embedding \(L^{\phi _\epsilon }(\Omega ) \hookrightarrow L^2(\Omega )\) depends on \(\epsilon \), so this equivalence is not of much use. Instead we use the chain of embeddings

$$\begin{aligned} L^2(\Omega ) \hookrightarrow L^{\phi _\epsilon }(\Omega ) \hookrightarrow L^p(\Omega ), \end{aligned}$$
(3.2)

with constants independent of \(\epsilon \). This follows from the fact that the Simonenko indices of \(\phi _\epsilon \) are within [p, 2]. We refer the reader to, e.g., [28, Chapter 2] for the details.

Since \(\phi _\epsilon \) is strictly convex and \(\kappa _\epsilon (t) = \phi _\epsilon (t) + \kappa _\epsilon (0)\), the energy \(\mathcal {J}_\epsilon \) admits a unique minimizer \(u_\epsilon \in W_0^{1,\phi _\epsilon }(\Omega )\) whose Euler-Lagrange equation is

$$\begin{aligned} \int _\Omega (\varepsilon _- \vee |\nabla u_\epsilon | \wedge \varepsilon _+)^{p-2}\nabla u_\epsilon \cdot \nabla \xi \, dx = \langle f,\xi \rangle \qquad \forall \xi \in W_0^{1,\phi _\epsilon }(\Omega ). \end{aligned}$$
(3.3)

Here we have used that

$$\begin{aligned} \frac{\phi _\epsilon '(t)}{t}&= (\varepsilon _- \vee t \wedge \epsilon _+)^{p-2}. \end{aligned}$$
(3.4)

Remark 2

Let us consider the special case \(\varepsilon _+=\infty \). Then the derivative of the truncated function reads \(\phi _\epsilon '(t) = (\varepsilon _-\vee t)^{p-2} t\). A different modification, namely \((\varepsilon _-+ t)^{p-2} t\), would lead to the so-called shifted N-function of \(\frac{1}{p}t^p\), as introduced in more generality in [11], which has similar properties to our truncated functions. However, the version from this paper is more suitable for our energy relaxation, since it is closer to the original function \(\frac{1}{p} t^p\) on the truncation interval \(\epsilon \) (the derivatives agree there). See the Appendix for more information on uniformly convex Orlicz functions and their shifted versions.

Lemma 3

The functions \(\phi \) and \(\phi _\epsilon \) are uniformly convex Orlicz functions in the sense of Sect. B from the Appendix. The convexity constants are independent of \(\epsilon \).

Proof

The uniform convexity of \(\phi \) follows from \(\frac{\phi ''(t)\,t}{\phi '(t)}= (p-1)\) and Lemma (B.2). Now, for \(t \in (\epsilon _-,\epsilon _+)\) we have \(\frac{\phi _\epsilon ''(t)\,t}{\phi _\epsilon '(t)}= (p-1)\), while for \(t \in (0,\infty ) \setminus [\epsilon _-,\epsilon _+]\) we have \(\frac{\phi _\epsilon ''(t)\,t}{\phi _\epsilon '(t)} = 1\). Hence, the claim for \(\phi _\epsilon \) follows again by Lemma (B.2). \(\square \)

Since \(W^{1,p}_0(\Omega )\) is the largest space in the chain (3.2) and contains both \(u_\epsilon \) and u, it is natural to consider all energies \(\mathcal {J}\) and \(\mathcal {J}_\epsilon \) as functionals on \(W^{1,p}_0(\Omega )\) with possible value \(\infty \).

Let us recall that the goal of this section is to show that \(u_\epsilon \) converges to u in \(W^{1,p}_0(\Omega )\). Since \(W^{1,p}_0(\Omega )\) is uniformly convex, strong convergence is a consequence of weak convergence and norm convergence, or equivalently, in this case, energy convergence \(\mathcal {J}(u_\epsilon ) \rightarrow \mathcal {J}(u)\). It is possible to show the weak convergence as well as that of the energy by means of \(\Gamma \)-convergence. Indeed, we will see in Remark 11 that \(\mathcal {J}_\epsilon \rightarrow \mathcal {J}\) in the sense of \(\Gamma \)-convergence. However, we will derive in the following much stronger results that provide us with a precise rate of convergence for the energies. This energy convergence implies strong convergence of the sequence, see the proof of Corollary 10.

Let us turn to the convergence of the energies \(\mathcal {J}(u_\epsilon ) \rightarrow \mathcal {J}(u)\) for \(\epsilon \rightarrow [0,\infty ]\). Since \(\mathcal {J}_\epsilon \) is monotonically decreasing with respect to \(\epsilon \), it follows from the minimizing properties of u and \(u_\epsilon \) that

$$\begin{aligned} 0&\leqslant \mathcal {J}(u_\epsilon ) - \mathcal {J}(u) \leqslant \mathcal {J}_\epsilon (u_\epsilon ) - \mathcal {J}(u). \end{aligned}$$
(3.5)

Therefore, it suffices to prove the stronger claim

$$\begin{aligned} \mathcal {J}_\epsilon (u_\epsilon ) - \mathcal {J}(u) \rightarrow 0 \qquad \text {as } \epsilon \rightarrow [0,\infty ]. \end{aligned}$$

(3.6)

In fact, we will later need this stronger statement in the other sections.

It follows from the minimizing property of \(u_\epsilon \) that

$$\begin{aligned} \mathcal {J}_\epsilon (u_\epsilon ) - \mathcal {J}(u) \leqslant \mathcal {J}_\epsilon (u) - \mathcal {J}(u). \end{aligned}$$

So it would be natural to estimate \(\mathcal {J}_\epsilon (u) - \mathcal {J}(u)\) in terms of \(\epsilon \) and u. However, the solution u is unfortunately a priori only a \(W^{1,p}_0\)-function, so \(\mathcal {J}_\epsilon (u)\) might be infinite. Hence, we cannot ensure that this difference is small. This is only possible if we assume higher regularity of u. In order to treat arbitrary right-hand sides \(f \in (W^{1,p}_0(\Omega ))^*\) at this point, we have to use a much more subtle argument. For this we need a result from [13, Subsection 3.5] and [4, Theorem 2.7], which allows one to change u on a small set such that it becomes a Lipschitz function. This technique is known as the Lipschitz truncation technique. Its origin goes back to [1]. As a tool we need the Hardy-Littlewood maximal operator, e.g. [25], defined by

$$\begin{aligned} (Mg)(x) := \sup _{r>0} \frac{1}{{|{B_r(x)}|}} \int \limits _{B_r(x)} {|{g(y)}|}\, dy, \end{aligned}$$

where \({|{B_r(x)}|}\) denotes the Lebesgue measure of \(B_r(x)\).

Theorem 4

[Lipschitz truncation [4, 13]] Let \(v\in W_0^{1,p}(\Omega )\) and for all \(\lambda >0\) define

$$\begin{aligned} \mathcal {O}_\lambda := \mathcal {O}_\lambda (v) := \{x\in \Omega \,:\, M(\nabla v)(x)>\lambda \}. \end{aligned}$$

Then, there exists an approximation \(T_\lambda v\in W_0^{1,\infty }(\Omega )\) of v with the following properties:

  (a) \(\{v\ne T_\lambda v\}\subset \mathcal {O}_\lambda \).

  (b) \(\Vert T_\lambda v\Vert _{L^p(\Omega )}\lesssim \Vert v\Vert _{L^p(\Omega )}\).

  (c) \(\Vert \nabla T_\lambda v\Vert _{L^p(\Omega )}\lesssim \Vert \nabla v\Vert _{L^p(\Omega )}\).

  (d) \(\vert \nabla T_\lambda v\vert \lesssim \lambda \chi _{\mathcal {O}_\lambda } + \vert \nabla v\vert \chi _{\Omega \setminus \mathcal {O}_\lambda } \leqslant \lambda \) almost everywhere.

  (e) \({\Vert {\nabla (v-T_\lambda v)}\Vert }_{L^p(\Omega )} \lesssim {\Vert {\nabla v}\Vert }_{L^p(\mathcal {O}_\lambda )}\).

  (f) \(T_\lambda v \rightarrow v\) in \(W^{1,p}_0(\Omega )\) as \(\lambda \rightarrow \infty \).

All our convergence results concerning the relaxation parameter \(\varepsilon \) are based on the following result, which shows how the energy relaxation depends on the truncation interval \(\varepsilon \).

Theorem 5

The estimate

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon ) - \mathcal {J}(u) \lesssim \varepsilon _-^p + \int \limits _{\mathcal {O}_\lambda (u)}|\nabla u|^p \, dx \end{aligned}$$
(3.7)

holds for all \(\lambda \leqslant \varepsilon _+/c_1\), where \(c_1\) is the (hidden) constant from Theorem 4 (d).

Proof

Let \(\lambda \leqslant \varepsilon _+/c_1\) and let \(T_\lambda u\) be the Lipschitz truncation of u. Then

$$\begin{aligned} {|{\nabla T_\lambda u}|} \leqslant c_1\,\lambda \leqslant \varepsilon _+. \end{aligned}$$

Using the minimizing property of \(u_\varepsilon \) and the equation for u we get

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon ) - \mathcal {J}(u)&\leqslant \mathcal {J}_\epsilon (T_\lambda u) - \mathcal {J}(u) = \int \limits _\Omega \left( \kappa _\epsilon (|\nabla T_\lambda u|) - \tfrac{1}{p} |\nabla u|^p \right) \, dx - \langle f,T_\lambda u - u\rangle \\&= \int \limits _\Omega \left( \kappa _\epsilon (|\nabla T_\lambda u|) - \tfrac{1}{p} |\nabla u|^p \right) \, dx - \int \limits _\Omega |\nabla u|^{p-2}\nabla u\cdot \nabla (T_\lambda u - u) \, dx. \end{aligned}$$

Using \({|{\nabla T_\lambda u}|} \leqslant \varepsilon _+\), \(\kappa _\epsilon (t)=\tfrac{1}{p} t^p\) for \(t \in [\varepsilon _-,\varepsilon _+]\), \(\kappa _\epsilon (t) \leqslant \tfrac{1}{p} \varepsilon _-^p\) for \(t \in [0, \varepsilon _-]\), and  \(T_\lambda u=u\) outside of \(\mathcal {O}_\lambda \) we get

$$\begin{aligned} \kappa _\epsilon (|\nabla T_\lambda u|) - \tfrac{1}{p} |\nabla u|^p&\leqslant {\left\{ \begin{array}{ll} \frac{1}{p} \varepsilon _-^p &{}\qquad \text {on } {\{{{|{\nabla T_\lambda u}|} \leqslant \varepsilon _-}\}}, \\ 0 &{}\qquad \text {on } \big (\Omega \setminus \mathcal {O}_\lambda \big ) \cap {\{{{|{\nabla T_\lambda u}|}> \varepsilon _-}\}}, \\ \frac{1}{p} {|{\nabla T_\lambda u}|}^p &{}\qquad \text {on } \mathcal {O}_\lambda \cap {\{{{|{\nabla T_\lambda u}|} > \varepsilon _-}\}}. \end{array}\right. } \end{aligned}$$

This, the previous estimate and Theorem 4 (e) imply

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon ) - \mathcal {J}(u)&\leqslant {|{\Omega }|}\,\tfrac{1}{p} \varepsilon _-^p + \int \limits _{\mathcal {O}_\lambda (u)} \tfrac{1}{p} |\nabla T_\lambda u|^p \, dx + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla u|^{p-1} {|{\nabla (T_\lambda u - u)}|}\, dx \\&\lesssim \varepsilon _-^p + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla T_\lambda u|^p \, dx + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla u|^p \, dx + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla (T_\lambda u-u)|^p \, dx \\&\lesssim \varepsilon _-^p + \int \limits _{\mathcal {O}_\lambda (u)} |\nabla u|^p \, dx. \end{aligned}$$

This proves the claim. \(\square \)

Corollary 6

\(\mathcal {J}_\epsilon (u_\varepsilon ) \rightarrow \mathcal {J}(u)\) and \(\mathcal {J}(u_\varepsilon ) \rightarrow \mathcal {J}(u)\) as \(\varepsilon \rightarrow [0,\infty ]\).

Proof

Due to (3.5) it suffices to prove \(\mathcal {J}_\epsilon (u_\varepsilon ) \rightarrow \mathcal {J}(u)\). Consider the right-hand side of (3.7) with \(\lambda := \varepsilon _+/c_1\). The first term goes to zero as \(\varepsilon _-\rightarrow 0\). Now consider the second term. Since \(\mathcal {O}_\lambda (u) \subset {\{{M(\nabla u) > \lambda }\}}\) and \(\nabla u\in L^p(\Omega )\) we get by the weak \(L^p\)-estimate of the maximal operator \(|\mathcal {O}_\lambda (u)| \lesssim \lambda ^{-p} {\Vert {\nabla u}\Vert }_p^p\). Therefore \(|\mathcal {O}_\lambda (u)|\rightarrow 0\) as \(\varepsilon _+\rightarrow \infty \), which implies \(\int _{\mathcal {O}_\lambda (u)} |\nabla u|^p \, dx \rightarrow 0\) as \(\varepsilon _+\rightarrow \infty \). \(\square \)

Before we continue we need the following natural quantities, see [11].

Definition 7

For \(P\in \mathbb {R}^d\) we define

$$\begin{aligned} A_\epsilon (P):= {\left\{ \begin{array}{ll}\frac{\phi _\epsilon '(\vert P\vert )}{\vert P\vert }P &{}\text {if } P\ne 0\\ 0 &{}\text {if }P=0\end{array}\right. }\quad \text {and}\quad V_\epsilon (P):={\left\{ \begin{array}{ll}\sqrt{\frac{\phi _\epsilon '(\vert P\vert )}{\vert P\vert }}P&{}\text {if }P\ne 0\\ 0&{}\text {if }P=0.\end{array}\right. } \end{aligned}$$

Moreover, by \(A := A_{[0,\infty ]}\) and \(V := V_{[0,\infty ]}\) we denote the unrelaxed versions.

The following two lemmas are modifications of similar results of [11, Lemma 3] and [12, Lemma 16]. In fact, they follow from the properties of uniformly convex Orlicz functions; see Sect. B from the Appendix for more details.

Lemma 8

For \(P,Q\in \mathbb {R}^d\)

$$\begin{aligned} (A_\epsilon (P)-A_\epsilon (Q))\cdot (P-Q)\eqsim \frac{\phi _\epsilon '(|P|\vee |Q|)}{|P|\vee |Q|}\vert P-Q\vert ^2 \eqsim {|{V_\epsilon (P) - V_\epsilon (Q)}|}^2, \end{aligned}$$

where the constants can be chosen independently of \(\varepsilon \).

Proof

This follows directly from the uniform convexity of \(\phi _\epsilon \), see Lemma 3, Lemma 41 and Lemma 40. \(\square \)

Lemma 9

The following estimates hold for arbitrary \(v\in W_0^{1,\phi _\epsilon }(\Omega )\) and \(u_\varepsilon \) being the minimizer of \(\mathcal {J}_\epsilon \):

$$\begin{aligned} \mathcal {J}_\epsilon (v)-\mathcal {J}_\epsilon (u_\varepsilon )&\leqslant \int \limits _\Omega (A_\epsilon (\nabla v)-A_\epsilon (\nabla u_\varepsilon ))\cdot \nabla (v-u_\varepsilon )\, dx \\&\lesssim \int \limits _\Omega |V_\epsilon (\nabla v)-V_\epsilon (\nabla u_\varepsilon )|^2\, dx \\&\lesssim \mathcal {J}_\epsilon (v)-\mathcal {J}_\epsilon (u_\varepsilon ) . \end{aligned}$$

In particular, for the case \(\varepsilon =[0,\infty ]\) the statement also implies

$$\begin{aligned} \mathcal J(v)-\mathcal J(u)&\eqsim \int \limits _\Omega |V(\nabla v)-V(\nabla u)|^2\, dx. \end{aligned}$$

Proof

This is just Lemma 42 applied to \(\phi _\epsilon \). \(\square \)

We are now prepared to show the convergence of minimizers \(u_\epsilon \) of \(\mathcal {J}_\epsilon \) to u.

Corollary 10

\(u_\epsilon \rightarrow u\) in \(W^{1,p}_0(\Omega )\) as \(\varepsilon \rightarrow [0,\infty ]\).

Proof

Due to Corollary 6 we have \(\mathcal {J}(u_\epsilon ) - \mathcal {J}(u) \rightarrow 0\) as \(\epsilon \rightarrow [0,\infty ]\). Now Lemma 9 for the case where \(\epsilon =[0,\infty ]\) and \(v =u_\varepsilon \) implies \(V(\nabla u_\epsilon ) \rightarrow V(\nabla u)\) in \(L^2(\Omega )\). It follows from the shift-change lemma, see Corollary 44 or [12, Corollary 26], that for all \(\delta >0\) there exists \(c_\delta >0\) such that

$$\begin{aligned} {|{\nabla u - \nabla u_\epsilon }|}^p&\leqslant c_\delta {|{V(\nabla u_\epsilon ) - V(\nabla u)}|}^2 + \delta {|{\nabla u}|}^p. \end{aligned}$$

This and \(V(\nabla u_\epsilon ) \rightarrow V(\nabla u)\) in \(L^2(\Omega )\) implies \(\nabla u_\epsilon \rightarrow \nabla u\) in \(L^p(\Omega )\). \(\square \)

Remark 11

It is also possible to deduce \(\mathcal {J}_\epsilon (u_\epsilon ) \rightarrow \mathcal {J}(u)\) and \(u_\epsilon \rightarrow u\) in \(W^{1,p}_0(\Omega )\) by means of \(\Gamma \)-convergence: As the underlying topological space we choose \(W^{1,p}_0(\Omega )\) equipped with the weak topology. Then the Lipschitz truncation provides a recovery sequence for \(v \in W^{1,p}_0(\Omega )\) implying \(\mathop {\Gamma \text {-}\lim }\nolimits _{\epsilon \rightarrow [0,\infty ]} \mathcal {J}_\epsilon = \mathcal {J}\). Indeed, it follows as in the proof of Theorem 5 that for all \(v \in W^{1,p}_0(\Omega )\)

$$\begin{aligned} {\big |{\mathcal {J}_\epsilon (T_{\varepsilon _+/c_1} v) - \mathcal {J}(v)}\big |} \lesssim \varepsilon _-^p + \int _{\mathcal {O}_{\varepsilon _+/c_1}(v)} {|{\nabla v}|}^p\,dx + {\big |{ {\langle {f}, {T_{\varepsilon _+/c_1} v -v} \rangle }}\big |}. \end{aligned}$$

So the properties of the Lipschitz truncation, see Theorem 4 (f), imply that the right-hand side goes to zero as \(\epsilon \rightarrow [0,\infty ]\). Hence, \(T_{\varepsilon _+/c_1} v\) is a recovery sequence of v.

Moreover, \(\mathcal {J}_\epsilon \geqslant \mathcal {J}\), so the standard theory of \(\Gamma \)-convergence [3, 9] proves \(u_\epsilon \rightharpoonup u\) in \(W^{1,p}_0(\Omega )\) and \(\mathcal {J}_\epsilon (u_\epsilon ) \rightarrow \mathcal {J}(u)\) for \(\epsilon \rightarrow [0,\infty ]\). The uniform convexity of \(W^{1,p}_0(\Omega )\) implies \(u_\epsilon \rightarrow u\) in \(W^{1,p}_0(\Omega )\).

To our knowledge this is the first time that the Lipschitz truncation is used to construct a recovery sequence for the \(\Gamma \)-\(\limsup \) in a \(\Gamma \)-convergence argument related to energies on \(W^{1,p}_0(\Omega )\).

Up to now, we discussed the convergence of \(u_\epsilon \rightarrow u\) without any additional assumptions on the data \(f \in (W_0^{1,p}(\Omega ))^*\) and the domain \(\Omega \). If f is more regular and \(\partial \Omega \) is suitably smooth, then we obtain specific rates for the convergence. The rates of convergence will follow from the regularity of \(\nabla u\) in terms of the weak-\(L^q\) spaces \(L^{q,\infty }(\Omega )\), which consists of all functions v such that

$$\begin{aligned} \Vert v\Vert _{L^{q,\infty }(\Omega )} := \sup \limits _{t>0} {\Vert {t\, \chi _{{\{{{|{v}|}>t}\}}}}\Vert }_{L^q(\Omega )} < \infty . \end{aligned}$$

Lemma 12

Let \(\nabla u\in L^{q,\infty }(\Omega )\) for some \(q>p\). Then,

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u) \lesssim \varepsilon _-^p + \varepsilon _+^{-(q-p)}\,\Vert \nabla u\Vert _{L^{q,\infty }(\Omega )}^q. \end{aligned}$$

Proof

First note that \(M:L^{q,\infty }(\Omega )\rightarrow L^{q,\infty }(\Omega )\) is bounded. This follows for example by extrapolation theory, see [8, Theorem 1.1]. In particular,

$$\begin{aligned} \lambda {|{\mathcal {O}_\lambda (u)}|}^{\frac{1}{q}} \eqsim {\Vert {\lambda \,\chi _{{\{{M(\nabla u)> \lambda }\}}}}\Vert }_{L^q(\Omega )} \leqslant {\Vert {M(\nabla u)}\Vert }_{L^{q,\infty }(\Omega )} \lesssim {\Vert {\nabla u}\Vert }_{L^{q,\infty }(\Omega )}. \end{aligned}$$

Moreover, let \(L^{q,1}(\Omega )\) denote the usual Lorentz space, which consists of functions v such that

$$\begin{aligned} \Vert v\Vert _{L^{q,1}(\Omega )} := q\int \limits _0^\infty |\{|v|> t\}|^\frac{1}{q}\, dt = q \int \limits _0^\infty {\Vert {t\, \chi _{{\{{{|{v}|}>t}\}}}}\Vert }_{L^q(\Omega )} \frac{dt}{t} < \infty . \end{aligned}$$

Since \((L^{s',1})^* =L^{s,\infty }\) for \(1<s<\infty \) and \(\frac{1}{s}+\frac{1}{s'}=1\), we obtain with \(s=\frac{q}{p}\)

$$\begin{aligned} \int \limits _{\mathcal {O}_\lambda }|\nabla u|^p\, dx \leqslant {\big \Vert {{|{\nabla u}|}^p}\big \Vert }_{L^{s,\infty }(\Omega )} {\Vert {\chi _{\mathcal {O}_\lambda }}\Vert }_{L^{s',1}(\Omega )} \eqsim \Vert \nabla u\Vert _{L^{q,\infty }(\Omega )}^p\, {|{\mathcal {O}_\lambda }|}^{\frac{1}{s'}} \lesssim \lambda ^{-(q-p)}\, \Vert \nabla u\Vert _{L^{q,\infty }(\Omega )}^{q}, \end{aligned}$$

where \({|{\mathcal {O}_\lambda }|}\) denotes the Lebesgue measure of \(\mathcal {O}_\lambda \). Applying Theorem 5 with \(\lambda := \varepsilon _+/c_1\) yields the statement. \(\square \)

To exemplify the consequences of Lemma 12 we combine it with the regularity results of [7, 14]:

Theorem 13

([7], Theorems 1.3 and 1.4) Let \(\Omega \subset \mathbb {R}^d\) be convex or let its boundary \(\partial \Omega \in W^2 L^{d-1,1}\) (for example \(\partial \Omega \in C^2\) suffices) and additionally \(f\in L^{d,1}(\Omega )\). Then \(\nabla u \in L^\infty (\Omega )\).

Theorem 14

([14], (4.3)) Let \(\Omega \) be a polyhedral domain whose inner angles are strictly less than \(2\pi \), and let \(f\in L^{p'}(\Omega )\), where \(\frac{1}{p} + \frac{1}{p'}=1\). Then \(\nabla u \in L^{\frac{pd}{d-1},\infty }(\Omega )\).

Proof

Actually, it is proven in [14, (4.3)] that \({|{\nabla u}|}^{\frac{p}{2}} \in \mathcal {N}^{\frac{1}{2},2}(\Omega )\) (a Nikolskiĭ space). Now, one can use the embedding \(\mathcal {N}^{\frac{1}{2},2}(\Omega )\hookrightarrow L^{\frac{2d}{d-1},\infty }(\Omega )\) of Lemma 26 in the Appendix. \(\square \)

Corollary 15

Under the assumptions of Theorem 13 we have

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u) \lesssim \varepsilon _-^p. \end{aligned}$$

Proof

Since \(\nabla u \in L^\infty (\Omega )\), we have \(M(\nabla u)\in L^\infty (\Omega )\) and so for \(\lambda :=\varepsilon _+/c_1\) and \(\varepsilon _+\) large enough, \(\mathcal {O}_\lambda (u)=\emptyset \). Hence, Theorem 5 implies the estimate. \(\square \)

Corollary 16

Under the assumptions of Theorem 14 we have

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u) \lesssim \varepsilon _-^p + \varepsilon _+^{-\frac{p}{d-1}}. \end{aligned}$$

Proof

Since \(\nabla u\in L^{\frac{pd}{d-1},\infty }(\Omega )\), an application of Lemma 12 finishes the proof. \(\square \)

Remark 17

The choice \(f=0\) and hence \(u=0\) gives \(\mathcal {J}_\epsilon (u) = \kappa _\epsilon (0) {|{\Omega }|} \eqsim \varepsilon _-^p\). This shows that the estimate in Corollary 15 is sharp.

4 Convergence of the Kačanov-iteration

In this section we study the convergence of the Kačanov-iteration for fixed relaxation parameter \(\epsilon =[\varepsilon _-,\varepsilon _+]\). In particular, for \(v_0\in W_0^{1,2}(\Omega )\) arbitrary we calculate recursively \(v_{n+1}\) by

$$\begin{aligned} \int \limits _\Omega (\varepsilon _-\vee |\nabla v_n|\wedge \varepsilon _+)^{p-2}\nabla v_{n+1}\cdot \nabla \xi \, dx = \langle f,\xi \rangle \quad \forall \xi \in W_0^{1,2}(\Omega ). \end{aligned}$$
(4.1)

We will show that \(v_n\) converges to the minimizer \(u_\epsilon \) of the relaxed energy \(\mathcal {J}_\epsilon \). In particular, we show exponential decay of the energy error \(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\epsilon )\). The proof is based on the following estimate, proved below.

Theorem 18

There is a constant \(c_K>1\) such that

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (v_{n+1}) \geqslant \delta \,(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )) \end{aligned}$$

holds for \(\delta := \tfrac{1}{c_K} (\tfrac{\varepsilon _-}{\varepsilon _+})^{2-p}\).

This theorem says that in each iteration we reduce the energy by a certain part of the remaining energy error. This implies

$$\begin{aligned} \begin{aligned} \mathcal {J}_\epsilon (v_{n+1})-\mathcal {J}_\epsilon (u_\varepsilon )&= \big ( \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon ) \big ) - \big ( \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (v_{n+1}) \big ) \\&\leqslant (1-\delta )\, \big ( \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon ) \big ). \end{aligned} \end{aligned}$$
(4.2)

As a direct consequence we will obtain the following exponential convergence result.

Corollary 19

There is a constant \(c_K>1\) such that

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )\leqslant (1-\delta )^n(\mathcal {J}_\epsilon (v_0)-\mathcal {J}_\epsilon (u_\varepsilon )) . \end{aligned}$$

holds for \(\delta := \tfrac{1}{c_K} (\tfrac{\varepsilon _-}{\varepsilon _+})^{2-p}\).
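For a rough quantitative feeling, the following sketch computes how many iterations Corollary 19 guarantees for a given energy tolerance; since the constant \(c_K\) is not explicit, the value used below is a purely hypothetical assumption.

```python
import math

# Sketch: number of iterations n with (1 - delta)^n <= tol guaranteed by
# Corollary 19, where delta = (1/c_K) * (eps_- / eps_+)^(2-p). The constant
# c_K is not explicit; the value below is a hypothetical assumption.

def guaranteed_steps(eps_minus, eps_plus, p, tol, c_K=10.0):
    delta = (eps_minus / eps_plus) ** (2.0 - p) / c_K
    return math.ceil(math.log(tol) / math.log(1.0 - delta))

print(guaranteed_steps(0.1, 10.0, 1.5, 1e-6))  # wide interval: many steps
print(guaranteed_steps(0.5, 2.0, 1.5, 1e-6))   # narrower interval: fewer
```

As expected, the guaranteed iteration count grows rapidly as the truncation interval widens, which is the reason for the careful schedule in Sect. 5.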

Let us get to the proof of Theorem 18.

Proof (of Theorem 18)

Using Lemma 9, the equation (3.3) for \(u_\varepsilon \), the equation (4.1) for \(v_{n+1}\), and Young’s inequality (see Remark 32) we get, for arbitrary \(\gamma > 0\),

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )&\leqslant \int \limits _\Omega (A_\epsilon (\nabla v_n)-A_\epsilon (\nabla u_\varepsilon ))\cdot \nabla (v_n-u_\varepsilon )\, dx \\&= \int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n\vert )}{\vert \nabla v_n\vert }\nabla ( v_n - v_{n+1})\cdot \nabla (v_n-u_\varepsilon )\, dx \\&\leqslant \tfrac{1}{\gamma }\underbrace{ \tfrac{1}{2} \int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n\vert )}{\vert \nabla v_n\vert }\vert \nabla ( v_n - v_{n+1})\vert ^2\, dx}_{=:I} + \gamma \,\underbrace{\tfrac{1}{2} \int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n\vert )}{\vert \nabla v_n\vert }\vert \nabla (v_n-u_\varepsilon )\vert ^2\, dx}_{=:II} . \end{aligned}$$

Let us define

$$\begin{aligned} \mathcal {J}_\epsilon (v, a) := \mathcal {J}\left( v,\varepsilon _-\vee a\wedge \varepsilon _+\right) . \end{aligned}$$

For the first term I we calculate with the equation (4.1) for \(v_{n+1}\)

$$\begin{aligned} I&= \tfrac{1}{2} \int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n\vert )}{\vert \nabla v_n\vert }\vert \nabla ( v_n - v_{n+1})\vert ^2\, dx \\&= \mathcal {J}_\epsilon (v_n,{|{\nabla v_n}|})-\mathcal {J}_\epsilon (v_{n+1},{|{\nabla v_n}|}) \\&\leqslant \mathcal {J}_\epsilon (v_n,{|{\nabla v_n}|})-\mathcal {J}_\epsilon (v_{n+1},{|{\nabla v_{n+1}}|}) \\&= \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (v_{n+1}). \end{aligned}$$

To establish the inequality above, we used the fact that

$$\begin{aligned} a_{n+1} = \varepsilon _-\vee {|{\nabla v_{n+1}}|}\wedge \varepsilon _+ = \mathop {\mathrm {arg\,min}}\limits _{a\,:\,\epsilon _- \leqslant a \leqslant \epsilon _+} \mathcal {J}(v_{n+1}, a), \end{aligned}$$

and

$$\begin{aligned} \mathcal {J}_\epsilon (v_{n+1},{|{\nabla v_{n+1}}|})&= \mathcal J(v_{n+1},\varepsilon _-\vee {|{\nabla v_{n+1}}|}\wedge \varepsilon _+) \\&\leqslant \mathcal J(v_{n+1},\varepsilon _-\vee {|{\nabla v_{n}}|}\wedge \varepsilon _+) =\mathcal {J}_\epsilon (v_{n+1},{|{\nabla v_n}|}). \end{aligned}$$

For the second term II we use \(\varepsilon _+^{p-2}\leqslant \tfrac{\phi _\epsilon '(t)}{t} \leqslant \varepsilon _-^{p-2}\), implying \(\tfrac{\phi _\epsilon '(t)}{t} \leqslant (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}\tfrac{\phi _\epsilon '(s)}{s}\), for any \(s,t \geqslant 0\), Lemma 8 and Lemma 9 to get

$$\begin{aligned} II&\leqslant \tfrac{1}{2} (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}\int \limits _\Omega \tfrac{\phi _\epsilon '(\vert \nabla v_n \vert \vee \vert \nabla u_\varepsilon \vert )}{\vert \nabla v_n \vert \vee \vert \nabla u_\varepsilon \vert }\vert \nabla (v_n-u_\varepsilon )\vert ^2\, dx \\&\leqslant c\, (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}\,(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )) . \end{aligned}$$

Putting all estimates together we get

$$\begin{aligned} \gamma (1-c\gamma (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p})\, ( \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon ))\leqslant \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (v_{n+1}). \end{aligned}$$

Now, \(\max _{\gamma >0} \gamma (1-c\gamma (\tfrac{\varepsilon _+}{\varepsilon _-})^{2-p}) = \tfrac{1}{4c} (\tfrac{\varepsilon _-}{\varepsilon _+})^{2-p}\) yields the statement. \(\square \)

Example 20

(Peak function) Let \(\Omega := B_1(0)\) and \(f(x)=-{\text {div}}(\tfrac{x}{|x|})\). Then \(f\notin L^1(\Omega )\) but still \(f \in (W_0^{1,1}(\Omega ))^*\subset (W_0^{1,p}(\Omega ))^*\). The minimizer of \(\mathcal {J}\) is then given by \(u(x) = 1-{|{x}|}\), which looks like a peak. Since \(|\nabla u|\equiv 1\), the factor \({|{\nabla u}|}^{p-2}\) in the p-Laplace operator does not appear for the minimizer. So in this case u also minimizes every \(\mathcal {J}_\epsilon \) as long as \(\varepsilon _-\leqslant 1\) and \(\varepsilon _+\geqslant 1\). This follows from

$$\begin{aligned} \mathcal {J}_\epsilon (u) = \mathcal {J}(u)\leqslant \mathcal {J}(v)\leqslant \mathcal {J}_\epsilon (v) \end{aligned}$$

for all \(v\in W_0^{1,\phi _\epsilon }(\Omega )\).

Let us see how our algorithm performs with the starting value \(v_0:=0\). It is easy to see that \(v_n = \alpha _n u\) with

$$\begin{aligned} \alpha _0 := 0 \quad \text {and}\quad \alpha _{n+1}:=(\varepsilon _-\vee \alpha _n\wedge \varepsilon _+)^{2-p}. \end{aligned}$$
(4.3)

Since \(p\in (1,2)\) one can show \(\alpha _n = \varepsilon _-^{(2-p)^n}\) by induction and

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}(u) = {|{B_1(0)}|}\Big ( \tfrac{1}{p}\,\varepsilon _-^{p(2-p)^n} - \tfrac{1}{p} - \big (\varepsilon _-^{(2-p)^n} -1\big )\Big ). \end{aligned}$$

Note that

$$\begin{aligned} \tfrac{1}{p} t^p - \tfrac{1}{p} - (t-1) \leqslant \tfrac{p-1}{2} (\ln (t))^2 \qquad \text {for all }t \in (0,1]. \end{aligned}$$

Moreover,

$$\begin{aligned} \frac{\tfrac{1}{p} t^p - \tfrac{1}{p} - (t-1)}{\tfrac{p-1}{2} (\ln (t))^2}&\rightarrow 1 \qquad \text {as } t \nearrow 1. \end{aligned}$$
(4.4)

This estimate with \(s:=(2-p)^n \in (0,1]\) and \(t := \varepsilon _-^s \in (0,1]\) gives

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)- \mathcal {J}(u)\leqslant \tfrac{1}{2}|B_1(0)|(p-1)\ln (\varepsilon _-)^2(2-p)^{2n} \end{aligned}$$
(4.5)

This estimate is sharp. Indeed, the energy differences \(\mathcal {J}(v_n)-\mathcal {J}(u) = \mathcal {J}_\epsilon (v_n)-\mathcal {J}(u)\) asymptotically behave like \(\tfrac{1}{2}|B_1(0)|(p-1)\ln (\varepsilon _-)^2(2-p)^{2n}\) for large n, in view of (4.4).

This shows that it is impossible to get an energy reduction as in (4.2) with \(\delta \) independent of \(\epsilon \). Indeed, Corollary 19 would imply

$$\begin{aligned} \mathcal {J}_\epsilon (v_n) - \mathcal {J}_\epsilon (u_\epsilon ) \leqslant (1-\delta )^n \big (\mathcal {J}_\epsilon (v_0) - \mathcal {J}_\epsilon (u_\epsilon )\big ) \leqslant (1-\delta )^n \mathcal {J}_1(v_0), \end{aligned}$$

which contradicts the above asymptotic estimate (4.5).

Nevertheless, our asymptotic shows that in this particular case

$$\begin{aligned} \mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )\leqslant c_\epsilon \,(1-\delta )^n \end{aligned}$$

with \(1-\delta = (2-p)^2 <1\) independent of \(\epsilon \). However, it remains open whether such an estimate holds in the general case.
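The closed form for \(\alpha _n\) and the decay rate \((2-p)^{2n}\) are easy to confirm numerically; the following sketch does so for illustrative values of p and \(\epsilon \).

```python
import numpy as np

# Numerical check for Example 20: the recursion (4.3) started at alpha_0 = 0
# produces alpha_n = eps_-^((2-p)^n), and the energy gap is bounded by (4.5),
# which decays like (2-p)^(2n). The parameter values are illustrative.

p, eps_minus, eps_plus = 1.5, 1e-3, 1e3
B1 = np.pi                        # |B_1(0)| in two dimensions
alpha = 0.0
for n in range(1, 8):
    alpha = np.clip(alpha, eps_minus, eps_plus) ** (2.0 - p)
    assert np.isclose(alpha, eps_minus ** ((2.0 - p) ** n))
    t = alpha
    gap = B1 * (t**p / p - 1.0 / p - (t - 1.0))
    bound = 0.5 * B1 * (p - 1.0) * np.log(eps_minus) ** 2 * (2.0 - p) ** (2 * n)
    assert gap <= bound           # the estimate (4.5); the ratio tends to 1
```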

Remark 21

Although our considerations are all under the assumption \(1<p\leqslant 2\), it is interesting to check how the algorithm performs in the case \(p> 2\) for our Example 20.

If \(p\geqslant 3\) and \(\varepsilon _+:= \tfrac{1}{\varepsilon _-}\) for some \(\varepsilon _-<1\), then it follows from (4.3) that

$$\begin{aligned} \alpha _0 = 0\quad \text {and}\quad \alpha _n = \varepsilon _-^{(-1)^n(p-2)} \quad \text { for } n\geqslant 1 . \end{aligned}$$

Therefore, the Kačanov iteration does not converge for \(p\geqslant 3\).

If \(p \in (2,3)\) and \(\varepsilon _+:= \tfrac{1}{\varepsilon _-}\), then it follows from (4.3) that

$$\begin{aligned} \alpha _0 = 0\quad \text {and}\quad \alpha _n = \varepsilon _-^{(2-p)^n} \quad \text { for } n\geqslant 1 \end{aligned}$$

and \(v_n\) still converges to u.

5 Algebraic rate

As we learned in the last section the Kačanov iteration converges for fixed \(\epsilon \), but the rate depends badly on the choice of the relaxation interval \(\varepsilon =[\varepsilon _-,\varepsilon _+]\). Furthermore, we have algebraic convergence of the error \(\mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u)\) induced by the relaxation. We will combine these results to deduce an algebraic rate of the full error \(\mathcal {J}_{\epsilon _n}(v_n) -\mathcal {J}(u)\) in terms of n for a specific predefined choice of \(\epsilon _n\). To achieve our goal we will use that \(|\nabla u|\in L^{q,\infty }(\Omega )\) for some \(q>p\), which is justified by Theorems 13 and 14.

Let us consider a sequence of solutions created by our relaxed p-Kačanov algorithm. In particular, \(\epsilon _n=[\epsilon _{n,-},\epsilon _{n,+}]\) is now an increasing sequence of intervals. Then exactly as in Theorem 18 we get the following estimate.

Theorem 22

There is a constant \(c_K>1\) such that

$$\begin{aligned} \mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}_{\epsilon _n}(v_{n+1})&\geqslant \delta _n \,(\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}_{\epsilon _n}(u_{\varepsilon _n})) \end{aligned}$$

holds for \( \delta _n := \tfrac{1}{c_K} (\tfrac{\epsilon _{n,-}}{\epsilon _{n,+}})^{2-p}\).

Since \(\epsilon _n \subset \epsilon _{n+1}\), we have \(\mathcal {J}_{\epsilon _{n+1}} \leqslant \mathcal {J}_{\epsilon _n}\). This and Theorem 22 imply

$$\begin{aligned} \mathcal {J}_{\epsilon _{n+1}}(v_{n+1}) - \mathcal {J}(u)&\leqslant \mathcal {J}_{\epsilon _n}(v_{n+1}) - \mathcal {J}(u) \\&= \big (\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u)\big ) - \big ( \mathcal {J}_{\epsilon _n}(v_n)- \mathcal {J}_{\epsilon _n}(v_{n+1}) \big ) \\&\leqslant \big (\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u)\big ) - \delta _n \big ( \mathcal {J}_{\epsilon _n}(v_n)- \mathcal {J}_{\epsilon _n}(u_{\epsilon _n}) \big ) \\&= (1- \delta _n) \big (\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u)\big ) + \delta _n \big ( \mathcal {J}_{\epsilon _n}(u_{\epsilon _n}) - \mathcal {J}(u) \big ). \end{aligned}$$

Now, Lemma 12 and \(|\nabla u|\in L^{q,\infty }(\Omega )\) ensure the existence of \(c_R>0\) such that

$$\begin{aligned} \mathcal {J}_\epsilon (u_\varepsilon )- \mathcal {J}(u) \leqslant c_R(\varepsilon _-^p + \varepsilon _+^{-(q-p)}) . \end{aligned}$$
(5.1)

This and the previous estimate therefore imply

$$\begin{aligned} \mathcal {J}_{\epsilon _{n+1}}(v_{n+1}) - \mathcal {J}(u)&\leqslant (1- \delta _n) \big (\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u)\big ) + \delta _n c_R(\epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)}).\nonumber \\ \end{aligned}$$
(5.2)

Without the last term \(\delta _n c_R(\epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)})\) we would get a reduction of the error \(\mathcal {J}_{\epsilon _n}(v_n) -\mathcal {J}(u)\) by the factor \((1-\delta _n)\). On the other hand, this last term is small if \(\epsilon _{n,-} \rightarrow 0\) and \(\epsilon _{n,+} \rightarrow \infty \), so it should not do much harm. Nevertheless, the reduction factor \((1-\delta _n)\) tends to 1 as \(\epsilon _{n,-} \rightarrow 0\) and \(\epsilon _{n,+} \rightarrow \infty \). The idea, however, is the following: if \(\delta _n\) goes to zero slowly enough, then the product \(\prod _{i=1}^n (1-\delta _i)\) still tends to zero algebraically.

Let us be more precise. We define another relaxed energy \(\mathcal {G}_n\) by

$$\begin{aligned} \mathcal {G}_n := \mathcal {J}_{\epsilon _n}(v_n) + K_1\, (\epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)}) \quad \text {and}\quad \mathcal {G}_\infty := \mathcal {J}(u) , \end{aligned}$$
(5.3)

where \(K_1>0\) will be determined below. Moreover, choose \(\alpha , \beta > 0\) such that

$$\begin{aligned} (\alpha +\beta )\,(2-p) \leqslant 1 \end{aligned}$$

(5.4)

and define

$$\begin{aligned} \varepsilon _n := [(n+1)^{-\alpha },(n+1)^\beta ]. \end{aligned}$$
(5.5)

Then

$$\begin{aligned} \delta _n&= \tfrac{1}{c_K} \big (\tfrac{\epsilon _{n,-}}{\epsilon _{n,+}}\big )^{2-p} = \tfrac{1}{c_K} ((n+1)^{-\alpha -\beta })^{2-p} \geqslant \tfrac{1}{c_K} \tfrac{1}{n+1}. \end{aligned}$$
(5.6)

In particular, the algorithm with this choice of \(\epsilon _n\) reads as follows:

[Algorithm c: the relaxed p-Kačanov iteration with the explicit schedule \(\epsilon _n = [(n+1)^{-\alpha },(n+1)^{\beta }]\) from (5.5)]
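A concrete instance of this schedule is sketched below, together with a check of (5.6); the particular \(\alpha \) and \(\beta \), chosen to balance the two terms of \(\rho _n\) and to give equality in (5.4), are our own illustrative assumption.

```python
# Sketch of the schedule (5.5): eps_n = [(n+1)^(-alpha), (n+1)^beta].
# The concrete alpha, beta below balance the two terms of rho_n and give
# equality in (5.4); they are an illustrative choice, not prescribed above.

p, q = 1.5, 3.0                       # q > p from the regularity assumption
alpha = (q - p) / (q * (2.0 - p))     # then alpha * p = (q - p) * beta
beta = p / (q * (2.0 - p))            # and (alpha + beta) * (2 - p) = 1

for n in range(5):
    eps_minus, eps_plus = (n + 1) ** (-alpha), (n + 1) ** beta
    delta_ratio = (eps_minus / eps_plus) ** (2.0 - p)   # = c_K * delta_n
    assert delta_ratio >= 1.0 / (n + 1) - 1e-12         # the bound (5.6)
    rho = eps_minus**p + eps_plus ** (-(q - p))         # relaxation error
    print(n, eps_minus, eps_plus, rho)
```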

We continue to derive a decay estimate for \(\mathcal {G}_n - \mathcal {G}_\infty \).

Lemma 23

There exists \(K_1=K_1(\alpha ,\beta ,p,q)\) (which appears in the definition of \(\mathcal {G}_n\)) and some \(c_3=c_3(\alpha ,\beta ,p,q)\geqslant 1\), such that for all \(n \in \mathbb {N}\)

$$\begin{aligned} (\mathcal {G}_{n+1} - \mathcal {G}_\infty )&\leqslant (1- \tfrac{1}{c_3 (n+1)}) ( \mathcal {G}_n - \mathcal {G}_\infty ). \end{aligned}$$

Proof

Define

$$\begin{aligned} \rho _n := \epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)} = (n+1)^{-\alpha p} + (n+1)^{-(q-p)\beta }. \end{aligned}$$

Hence it follows by Lemma 27 in the Appendix that there exists \(c_2 = c_2(\alpha ,\beta ,p,q) \geqslant 1\) with

$$\begin{aligned} \rho _n - \rho _{n+1}&\geqslant \frac{1}{c_2} \frac{1}{n+1} \rho _n. \end{aligned}$$
(5.7)

In particular, \(\rho _n\) satisfies a decay estimate!

On the other hand it follows from Theorem 22 that

$$\begin{aligned}&\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{\epsilon _{n+1}}(v_{n+1}) \\&\quad \geqslant \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{\epsilon _n}(v_{n+1}) \\&\quad \geqslant \delta _n (\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}_{\epsilon _n}(u_{\varepsilon _n})) \\&\quad \geqslant \frac{1}{c_K} \frac{1}{n+1} (\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}_{\epsilon _n}(u_{\varepsilon _n})) \\&\quad = \frac{1}{c_K} \frac{1}{n+1} (\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}(u)) - \frac{1}{c_K} \frac{1}{n+1}(\mathcal {J}_{\epsilon _n}(u_{\varepsilon _n}) - \mathcal {J}(u)). \end{aligned}$$

We deduce from (5.1), the definition of \(\epsilon _n\) and \(\rho _n\) that

$$\begin{aligned} \mathcal {J}_{\epsilon _n}(u_{\varepsilon _n})- \mathcal {J}(u) \leqslant c_R(\epsilon _{n,-}^p + \epsilon _{n,+}^{-(q-p)}) = c_R \rho _n. \end{aligned}$$

This and the previous estimate prove

$$\begin{aligned} \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{\epsilon _{n+1}}(v_{n+1})&\geqslant \frac{1}{c_K} \frac{1}{n+1} \,(\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}(u)) - \frac{c_R}{c_K} \frac{1}{n+1} \rho _n. \end{aligned}$$

Since \(\mathcal {G}_n = \mathcal {J}_{\epsilon _n}(v_n) + K_1\rho _n\), it follows together with (5.7) that

$$\begin{aligned} \mathcal {G}_n - \mathcal {G}_{n+1}&\geqslant \frac{1}{c_K} \frac{1}{n+1} \,(\mathcal {J}_{\epsilon _n}(v_n)-\mathcal {J}(u)) + \bigg ( \frac{K_1}{c_2} - \frac{c_R}{c_K}\bigg ) \frac{1}{n+1} \rho _n. \end{aligned}$$

We finally fix \(K_1\): we choose \(K_1\) so large that

$$\begin{aligned} \frac{K_1}{c_2} - \frac{c_R}{c_K}&\geqslant \frac{K_1}{2\,\max {\{{c_K,c_2}\}}}, \end{aligned}$$

which is always possible. Combining this with our previous estimates we deduce

$$\begin{aligned} \mathcal {G}_n - \mathcal {G}_{n+1}&\geqslant \frac{1}{2\, \max {\{{c_K,c_2}\}} } \frac{1}{n+1} (\mathcal {G}_n - \mathcal {G}_\infty ). \end{aligned}$$

This proves the lemma with \(c_3 = 2\, \max {\{{c_K,c_2}\}}\). \(\square \)

We are now able to present our convergence result.

Theorem 24

Let \(\nabla u\in L^{q,\infty }(\Omega )\) for some \(q>p\) (as given for example in Theorem 13 or Theorem 14). Then, the sequence \((v_n)_{n\in \mathbb N}\) produced by the algorithm described above satisfies

$$\begin{aligned} \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u) \leqslant \mathcal {G}_n-\mathcal {G}_\infty \leqslant n^{-\frac{1}{c_3}}\,(\mathcal {G}_0 - \mathcal {G}_\infty ), \end{aligned}$$

where \(c_3\) is the constant of Lemma 23. In particular, the energy error decreases at least algebraically.

Proof

The estimate \(\mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}(u) \leqslant \mathcal {G}_n-\mathcal {G}_\infty \) is obvious, so it remains to prove the decay of \(\mathcal {G}_n - \mathcal {G}_\infty \). It follows from Lemma 23 that, for \(n \in \mathbb {N}\),

$$\begin{aligned} \mathcal {G}_n - \mathcal {G}_\infty&\leqslant \prod _{i=0}^{n-1} \big ( 1 - \tfrac{1}{c_3 (i+1)}\big ) (\mathcal {G}_0 - \mathcal {G}_\infty ). \end{aligned}$$

Now,

$$\begin{aligned} \prod _{i=0}^n \Big (1- \tfrac{1}{c_3(i+1)} \Big )&= \exp \bigg ( \sum _{i=0}^n\ln \Big (1- \tfrac{1}{c_3(i+1)} \Big ) \bigg ) \leqslant \exp \bigg ( -\sum _{i=0}^n \frac{1}{c_3(i+1)} \bigg ) \\&\leqslant \exp \bigg ( - \frac{\ln (n)}{c_3} \bigg ) = n^{-\frac{1}{c_3}}. \end{aligned}$$

This proves the theorem. \(\square \)
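The product estimate at the end of the proof is also easy to verify numerically; in the following sketch the value of \(c_3\) is a hypothetical assumption.

```python
# Numerical check of the product estimate in the proof above:
#   prod_{k=1}^{n} (1 - 1/(c_3 k)) <= n^(-1/c_3).
# The value of c_3 below is a hypothetical assumption.

c3 = 4.0
prod = 1.0
for n in range(1, 10001):
    prod *= 1.0 - 1.0 / (c3 * n)          # factor for i = n - 1
    if n % 2500 == 0:
        print(n, prod, n ** (-1.0 / c3))  # prod stays below the bound
```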

Remark 25

We have seen that the choice \(\epsilon _n = [(n+1)^{-\alpha }, (n+1)^\beta ]\) ensures that the error decreases at least with an algebraic rate. However, the decay of the relaxed energy error \(\mathcal {G}_n- \mathcal {G}_\infty \) can never be faster than algebraic with this choice of \(\epsilon _n\); in this sense, the choice is also quite restrictive. In the numerical experiments we performed, we have seen that it is possible to decrease \(\epsilon _{n,-}\) and increase \(\epsilon _{n,+}\) much faster and still obtain convergence. Moreover, the observed convergence is much faster than algebraic, and more of exponential type. We will present the details of such numerical experiments in a subsequent work. Let us summarize: the algorithm of this section ensures an algebraic convergence rate, but in practice we expect better behavior for other, perhaps adaptive, choices of \(\epsilon _n\), still to be fully investigated.

6 Numerical experiments

We have performed numerous experiments on the basis of the adaptive finite element method with piecewise linear elements. We developed preliminary versions of error estimators that capture the effect of the truncation, the adaptivity of the mesh, and the fixpoint iteration. Let \(v_n\) denote the iterate generated by the algorithm; we used the following ad hoc estimators:

  • We use

    $$\begin{aligned} \eta ^2_{\epsilon ^+}(v_n)&:= \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{(\epsilon _{n,-},\infty )}(v_n) \end{aligned}$$

    to measure the effect of the upper truncation bound \(\epsilon _{n,+}\) and

    $$\begin{aligned} \eta ^2_{\epsilon ^-}(v_n)&:= \mathcal {J}_{\epsilon _n}(v_n) - \mathcal {J}_{(0,\epsilon _{n,+})}(v_n) \end{aligned}$$

    to measure the effect of the lower truncation bound \(\epsilon _{n,-}\); a small computational sketch of these two indicators follows this list.

  • We use the optimal estimators of [2, 12] with the Orlicz function \(\phi _{\epsilon _n}\) to estimate the error due to mesh refinement, i.e. on elements T we use the estimators

    $$\begin{aligned} \eta ^2_h(v_n)&:= \int _T (\phi _{\epsilon _n, {|{\nabla v_n}|}})^*(h_T {|{f}|})\,dx + \sum _{\gamma \subset \partial T} h_\gamma \! \int _\gamma {\big |{\llbracket {V_{\epsilon _n}(\nabla v_n)}\rrbracket _\gamma }\big |}^2\,ds. \end{aligned}$$
  • To measure the error due to the fixpoint iteration (for \(n \geqslant 1\))

    (6.1)

    which is in fact an upper bound for \(\mathcal {J}_\epsilon (v_n)-\mathcal {J}_\epsilon (u_\varepsilon )\) and \(\mathcal {J}_\epsilon (v_{n-1})-\mathcal {J}_\epsilon (u_\varepsilon )\).
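As announced in the first item of the list, here is a small sketch of the two truncation indicators in a 1D finite-difference setting; approximating the limits \(\epsilon _-=0\) and \(\epsilon _+=\infty \) by extreme finite values is an implementation shortcut of this sketch, not part of the definition.

```python
import numpy as np

# Sketch of the truncation indicators eta^2_{eps+} and eta^2_{eps-} in a 1D
# setting: both are differences of discrete relaxed energies J_eps. The
# limits eps_- = 0 and eps_+ = infinity are approximated by extreme finite
# values, which is an implementation shortcut.

def kappa_eps(t, p, em, ep):
    a = np.clip(t, em, ep)
    return 0.5 * a**(p - 2) * t**2 + (1.0 / p - 0.5) * a**p

def J_eps(v, f, h, p, em, ep):
    grad = np.abs(np.diff(v)) / h
    return h * np.sum(kappa_eps(grad, p, em, ep)) - h * np.dot(f, v[1:-1])

def eta2_plus(v, f, h, p, em, ep):   # effect of the upper truncation bound
    return J_eps(v, f, h, p, em, ep) - J_eps(v, f, h, p, em, 1e12)

def eta2_minus(v, f, h, p, em, ep):  # effect of the lower truncation bound
    return J_eps(v, f, h, p, em, ep) - J_eps(v, f, h, p, 1e-12, ep)
```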

We used these estimators to implement a fully adaptive version of our relaxed p-Kačanov iteration.

[Algorithm d: fully adaptive relaxed p-Kačanov iteration, steering the truncation, the mesh refinement, and the fixpoint iteration by the estimators \(\eta ^2_{\epsilon ^+}\), \(\eta ^2_{\epsilon ^-}\), \(\eta ^2_h\), and the fixpoint indicator of (6.1)]

Let us present three experiments that measure different aspects. We have chosen quite critical situations and small exponents p in order to see how the algorithm behaves in particularly bad cases. In particular, we have chosen the quite small exponent \(p= \frac{16}{15}\) for all of our experiments presented here; the results are in fact much nicer for p closer to 2.

  • The bump On \(\Omega = [-1,1]^2\) choose f and the boundary values of u such that \(u(x)=(x_1^2\!-\!1)(x_2^2\!-\!1)\). There is only one critical point at (0, 0), where \(\nabla u(0,0)=0\). This example is chosen to see how the algorithm behaves for smooth functions with isolated extrema.

  • The needle On \(\Omega = [-1,1]^2\) choose f and the boundary values of u such that \(u(x) = {|{x}|}^{1-\frac{1}{p}} -1\). In this case \(\nabla u(x) = {|{x}|}^{-\frac{1}{p}} \frac{x}{{|{x}|}}\) and \(A(\nabla u) = {|{x}|}^{-\frac{1}{p'}} \frac{x}{{|{x}|}}\) and \(V(\nabla u) = {|{x}|}^{-\frac{1}{2}} \frac{x}{{|{x}|}}\). This example is chosen such that \(V(\nabla u) \in W^{\frac{1}{2}}L^{2,\infty }(\Omega )\), meaning that half a derivative of \(V(\nabla u)\) is still in the Lorentz space \(L^{2,\infty }\). This corresponds to the regularity of a p-harmonic function on a slit domain. It is the critical regularity that allows optimal estimates for the adaptive finite element method, while uniform refinement only yields a rate of \(O(h^{1/2})\) instead of the optimal rate O(h).

  • Constant force in slit domain Choose \(f=2\) on the slit domain \(\Omega = (-1,1)^2 \setminus [0,1]^2\) and \(u=0\) on its boundary. In contrast to the linear case \(p=2\), there is no explicit formula for the exact solution. The reentrant corner reduces the regularity of the solution.

There is a notable gap between our theory and the numerical experiments that we performed: the experiments show a significantly faster convergence rate than our worst-case analysis predicts. This indicates that we are on a good track and have developed a good algorithm.

Figures 1, 2 and 3 show the results of our experiments. The pictures show a log-log plot of energy accuracy versus the computational costs. Let us explain the picture from Fig. 1 in more detail; the other figures are analogous. The dotted black line is the most important one and shows the energy difference \(\mathcal {J}_\epsilon (u_n)-\mathcal {J}(u)\), which is the measure of the error between the computed solution and the exact solution. The lines \(\epsilon _+\) and \(\epsilon _-\) represent the truncation parameters \((\epsilon _-,\epsilon _+)\). The other lines represent the error indicators for \((\epsilon _-,\epsilon _+)\), named \(\eta ^2_{\epsilon ^+}\) and \(\eta ^2_{\epsilon ^-}\), for the fixpoint iteration, and for refinement, \(\eta _h^2(u_h)\). The straight black line \(\text {costs}^{-1}\) is the optimal convergence rate (1/costs), where the costs are the accumulated sum of the degrees of freedom over all steps that require solving a linear system. (Here we have assumed a linear cost for solving the linear system, which might be achievable with a multigrid method.) It is important that we use the accumulated cost instead of the degrees of freedom, since only this truly measures the effort, in particular if the number of fixpoint iterations increases.

Fig. 1: The bump

Fig. 2: The needle

Fig. 3: Constant force in slit domain

Let us explain the numerical results in more detail.

  • Bump The algorithm converges optimally with respect to the accumulated costs. The exact solution has a bounded gradient, so the upper truncation bound \(\epsilon _{n,+}\) does not increase much. There is only one critical point at \(x=0\), where \(\nabla u(0)=0\). Thus, the algorithm decreases the lower truncation bound \(\epsilon _{n,-}\) quite moderately, since the error from truncation appears only in a small neighborhood around zero. All estimators decrease nicely with the optimal rate (1/cost). The number of fixpoint iterations remains bounded, so the cost is proportional to the current number of degrees of freedom.

  • The needle The algorithm converges optimally with respect to the accumulated costs. The exact solution has an unbounded gradient at the isolated point zero. Therefore, the upper truncation bound \(\epsilon _{n,+}\) is increased by the algorithm. This upper truncation starts quite late, since it is only necessary in a set of small measure. The number of fixpoint iterations remains bounded.

  • Constant force in slit domain The gradient of the exact solution is bounded, so the upper truncation bound \(\epsilon _{n,+}\) does not increase much. The energy error still decreases very fast. It decreases almost optimally with respect to the accumulated cost, but the slope seems slightly worse. It is, however, still much faster than our worst-case theory predicts.

Overall, we see that our algorithm converges with a rate that is optimal in many cases.