1 Introduction

In the present paper, we consider the quasilinear wave equation

$$\begin{aligned} \lambda (u(t,x)) \partial _{tt}u(t,x) = \mathrm {\Delta }u(t,x) + g(t,x,u(t,x),\partial _tu(t,x)), \end{aligned}$$
(1.1)

for \( t\in [0,T]\), \(x \in \varOmega \subset {\mathbb {R}}^N\), \( N=1, 2, 3.\) We assume the domain \(\varOmega \) to be bounded with a sufficiently regular boundary and impose homogeneous Dirichlet boundary conditions. We discretize (1.1) in space using isoparametric finite elements and employ for the time discretization a semi-implicit Euler and midpoint rule as well as an exponential Euler and midpoint rule. We derive error bounds in norms stronger than the standard energy \(H^1 \times L^2\)-norm.

The first wellposedness results for a large class of quasilinear wave type equation were given by Kato in [25, 26]. This approach was refined in [11] for the problem (1.1) to account for the state-dependent norms necessary in the analysis. A typical example in nonlinear acoustics is the model \(\lambda (u) = 1 - u^m\) for some \(m\in {\mathbb {N}}\). Hence, in order to ensure \(\lambda (u)>0\), a key ingredient in the proof is to establish pointwise bounds on \(u\) (as well as \(\partial _tu\)), often via Sobolev’s embedding \(H^2\hookrightarrow L^\infty \). To inherit this property in the spatial discretization, we need pointwise bounds on the numerical approximations in the error analysis. However, since the finite elements space is not \(H^2\)-conforming, we cannot follow the above approach.

So far in the literature, bounds in \(H^1 \times L^2\) are shown by inverse estimates which yield a factor \(h^{-\beta }\) for some \(\beta \ge 1\) with the spatial mesh width h. This induces unsatisfactory CFL-type conditions and excludes linear finite elements. In contrast with this, we adapt the idea from the wellposedness and perform the error analysis not in the energy space \(H^1 \times L^2\), but employ a discrete version of the \(H^2\)-norm. A discrete variant of Sobolev’s embedding and a suitably defined solution space for the numerical approximation allow us to remove lower bounds on the polynomial degree of the finite element space and significantly improve the CFL-type condition compared to the literature. For the temporal step size \(\tau \) and the spatial mesh width h, we show convergence in \(N=2\) under the restriction \(\tau \lesssim h^\alpha \), for any \(\alpha >0\), and in \(N=3\) we have \(\tau \lesssim h^{1/2 +\alpha }\) for the first-order methods in time and \(\tau \lesssim h^{1/4 +\alpha }\) for the second-order method. In addition, we fully remove the CFL-type condition for \(N=1\).

The strategy of the semi discrete proof relies on a bootstrap argument. We set up a suitable solution space for the numerical approximation and show that the initial value lies in this. Instead of the usual choice of interpolated initial values, we have to use a Ritz map for which we provide a computable alternative of the correct order. Since we are working with a finite-dimensional subspace, this directly yields local wellposedness up to some time \(t^*_h>0\). On this possibly short time interval, we prove convergence in the stronger norm and use this to extend \(t^*_h\) beyond T and to close the argument. For the fully discrete error bounds, this approach is generalized using an induction argument.

We give a brief overview of the literature on the numerical treatment of quasilinear wave equations. In the pioneering works [10, 24, 27, 37], existence of solutions to quasilinear and nonlinear evolution equations is established, and one can find approximation rates of the implicit and semi-implicit Euler method. Within an (extended) Kato framework, optimal order for these methods was achieved in [22] and rigorous error bounds for the time discretization by higher-order Runge–Kutta methods are derived in [23, 28].

Concerning the spatial discretization, the results in [21] yield optimal order of convergence for the equation (1.1), however, only for polynomials of degree greater than two. For the strongly damped Westervelt equation, continuous and discontinuous Galerkin methods were analyzed in [1, 34]. Very recently, mixed finite elements for the Kuznetsov and Westervelt equations were studied in [33].

In [31], error bounds for two variant of the midpoint rule are derived of optimal order, but only for polynomials of degree greater than two and under a stronger CFL-type condition compared to our results. In the case of one-dimensional wave equation subject to periodic boundary conditions, full discretization error bounds are established in [19]. A sophisticated energy technique combined with the properties of the spectral discretization yields convergence without a CFL-type condition.

For a slightly different quasilinear wave equation, optimal error bounds in \(L^2\) for continuous finite elements were considered in the literature. One-step methods of different order are analyzed in [3, 4, 17], and two-step methods are considered in [5]. For a class of linearly implicit single-step schemes as well as a linearly and a fully implicit two-step scheme, optimal error bounds are derived in [32]. However, all of these results require a CFL-type condition at least as strong as \(\tau \lesssim h\) and do not allow for linear finite elements. We expect that our technique can be generalized to these problems, but this will be part of future research.

The paper is organized as follows: We describe in Sect. 2 the analytical framework, the space discretization by isoparametric Lagrange finite element, and state our main results. The proof of the spatial convergence rates is given in Sect. 3, where we first reduce the main result to error bounds in a stronger energy norm which is established afterward. In Sect. 4, we extend this technique to the fully discrete case for the three presented methods. Certain stability estimates and the bounds on the defects are given in Sect. 5, and some postponed results are shown in Appendices A, B, C, and D.

1.1 Notation

In the rest of the paper, we use the notation

$$\begin{aligned} a \lesssim b , \end{aligned}$$

if there is a constant \(C>0\) independent of the spatial parameter h and the time step size \(\tau \) such that \(a \le Cb\). For the sake of readability, we introduce the notation \(t^{n} = n \tau \) and

$$\begin{aligned} x^n {:}{=}x(t^{n}) \end{aligned}$$

for an arbitrary time-dependent, continuous object x(t). If it is clear from the context, we write \(L^p\) instead of \(L^p(\varOmega )\) or \(L^p(\varOmega _h)\).

2 General Setting

For a bounded domain \(\varOmega \subset {\mathbb {R}}^N\), \(N=1,2,3\), with boundary \(\partial \varOmega \in C^{s}\), \(s\in {\mathbb {N}}\), we study the quasilinear wave Eq. (1.1) with homogeneous Dirichlet boundary conditions and initial values

$$\begin{aligned} u(0)&= u^{0},&\partial _tu(0)&= v^{0} . \end{aligned}$$

We note that the operator \(-\mathrm {\Delta }\) is positive and self-adjoint on \(L^2(\varOmega )\), and we define the spaces \(H= L^2(\varOmega )\) and \(V= H_0^1(\varOmega )\). Throughout the paper, we impose the following conditions on the function \(\lambda \) and \(g\). Additional requirements are stated in our main results.

Assumption 2.1

\((\lambda _1)\):

The function \(\lambda :{\mathbb {R}}\rightarrow {\mathbb {R}}\) satisfies \(\lambda \in C^2({\mathbb {R}},{\mathbb {R}})\).

\((\lambda _2)\):

There is some radius \({\widehat{r}}_\infty >0\) such that there is a constant \(c_{\lambda } = c_{\lambda } ({\widehat{r}}_\infty )>0\) such that

$$\begin{aligned} c_{\lambda }&\le \lambda (x), \quad |x| \le {{\widehat{r}}_\infty } . \end{aligned}$$
\((g_1)\):

The function \(g:[0,T] \times {\overline{\varOmega }} \times {\mathbb {R}}\times {\mathbb {R}}\rightarrow {\mathbb {R}}\) satisfies \(g\in C^2([0,T] \times {\overline{\varOmega }} \times {\mathbb {R}}\times {\mathbb {R}},{\mathbb {R}})\).

\((g_2)\):

For \(x \in \partial \varOmega \) and \(y=z=0\) it holds \( g(t,x,y,z) = 0\).

The conditions \((\lambda _1),(g_1)\) are structural assumptions which allow us to show crucial stability estimates. The lower bound in \((\lambda _2)\) prevents the degeneracy of (1.1). The main effort in the discretization and error analysis is to ensure that this condition is inherited. We note that condition \((g_2)\) implies in particular that for \(u, v \in V\) one has \(g(t,u,v)\in V\), and that all conditions are already required for the wellposedness. We recall an example for the quasilinear problem (1.1) given in [11].

Example 2.2

Let \(K \in C^4({\mathbb {R}},{\mathbb {R}})\) with \(1 + K'(0)> 0 \) and consider the problem

$$\begin{aligned} \partial _{tt}( u+ K(u) ) = \varDelta u, \end{aligned}$$

for example, with the Kerr model \(K(z) = \alpha z^3\) for \(\alpha \in {\mathbb {R}}\). If we rewrite it in the form (1.1), we obtain

$$\begin{aligned} \lambda (z) = 1 + K'(z), \qquad g(t,x,u,v) = -K''(u) v^2, \end{aligned}$$

which satisfy Assumption 2.1. Denoting the fractional powers of the Laplacian by \({\mathcal {H}}_k {:}{=}{\mathcal {D}}\bigl ( (-\varDelta )^{k/2} \bigr )\), under suitable smallness assumptions on the initial values, the existence of a solution

$$\begin{aligned} u \in C([0,T],{\mathcal {H}}_3) \cap C^1([0,T],{\mathcal {H}}_2) \cap C^2([0,T],{\mathcal {H}}_1) \end{aligned}$$

is shown in [11, Thm. 4.1]. \(\lozenge \)

Equivalently to (1.1), we consider the quasilinear wave equation in first-order formulation

$$\begin{aligned} \varLambda (y(t)) \partial _ty(t)&= \textrm{A}y(t) + G(t,y(t)),&t&\in [0,T],&y&= \begin{pmatrix} u\\ \partial _tu\end{pmatrix} , \end{aligned}$$
(2.1)

with initial value \(y(0) = y^{0}\) in the product space \(X= V\times H\), and

$$\begin{aligned} y^{0}= & {} \begin{pmatrix} u^{0} \\ v^{0} \end{pmatrix}, \hspace{2ex} \varLambda (y) = \begin{pmatrix} \textrm{Id}&{} 0 \\ 0 &{} \lambda (u) \end{pmatrix}, \hspace{2ex}\\ \textrm{A}= & {} \begin{pmatrix} 0 &{} \textrm{Id}\\ \mathrm {\Delta }&{} 0 \end{pmatrix}, \hspace{2ex} G(t,y) = \begin{pmatrix} 0 \\ g(t,u,\partial _tu) \end{pmatrix}. \end{aligned}$$

Remark 2.3

The assumption on the regularity of the boundary is not essential in the error analysis, which also works on a convex, polygonal domain. Hence, one could apply a conforming finite element method. However, since the wellposedness of quasilinear equations requires a regular boundary, we will work in the nonconforming framework in the following.

2.1 Space Discretization

We study the nonconforming space discretization of (2.1) based on isoparametric finite elements. For further details on this approach, we refer to [15, 16]. In particular, we introduce a shape-regular and quasi-uniform triangulation \({\mathcal {T}}_h\), consisting of isoparametric elements of degree \(k \in {\mathbb {N}}\) and let \(\partial \varOmega \in C^{k+1}\). The computational domain \(\varOmega _h\) is given by

$$\begin{aligned} \varOmega _h= \bigcup _{K \in {\mathcal {T}}_h} K \approx \varOmega , \end{aligned}$$

where the subscript h denotes the maximal diameter of all elements \(K \in {\mathcal {T}}_h\). In the following, we require that h is sufficiently small such that all cited results below hold true. We note that the smallness only depends on the geometry of the domain \(\varOmega \) and the polynomial degree k. The semi-discrete approximations are given by \(u_h(t) \approx u(t)\) and \(v_h(t) \approx \partial _tu(t)\). Based on the transformations \(F_K\) mapping the reference element \({\widehat{K}}\) to \(K \in {\mathcal {T}}_h\), we introduce the finite element space of degree k

$$\begin{aligned} W_h&= \{\varphi \in C_0({\overline{\varOmega }}_h) \mid \varphi |_{K} = {\widehat{\varphi }} \circ (F_K)^{-1} \text { with } {\widehat{\varphi }} \in {\mathcal {P}}^k({\widehat{K}}) \text { for all } K \in {\mathcal {T}}_h\} \,. \end{aligned}$$

Here, \({\mathcal {P}}^k({\widehat{K}})\) consists of all polynomials on \({\widehat{K}}\) of degree at most k. The discrete approximation spaces are given by

$$\begin{aligned} H_h&= \bigl (W_h,\left( \cdot \mid \cdot \right) _{L^2(\varOmega _h)}\bigr ), \quad V_h= \bigl (W_h,\left( \cdot \mid \cdot \right) _{H_0^1(\varOmega _h)}\bigr ) \,, \end{aligned}$$

and we set \(X_h= V_h\times H_h\).

Following the detailed construction in [16, Sec. 5], we introduce the lift operator \({\mathcal {L}}_h:H_h\rightarrow H\). In particular, for \(p \in [1,\infty ]\) there are constants \(c_{p}, C_{p} > 0\) with

$$\begin{aligned} c_{p} \left\Vert \varphi _h \right\Vert _{L^p(\varOmega _h)}&\le \left\Vert {\mathcal {L}}_h\varphi _h \right\Vert _{L^p(\varOmega )} \le C_{p} \left\Vert \varphi _h \right\Vert _{L^p(\varOmega _h)},{} & {} \varphi _h \in L^p, \end{aligned}$$
(2.2a)
$$\begin{aligned} c_{p} \left\Vert \varphi _h \right\Vert _{W^{1,p}(\varOmega _h)}&\le \left\Vert {\mathcal {L}}_h\varphi _h \right\Vert _{W^{1,p}(\varOmega )} \le C_{p} \left\Vert \varphi _h \right\Vert _{W^{1,p}(\varOmega _h)},{} & {} \varphi _h \in W^{1,p}, \end{aligned}$$
(2.2b)

cf. [16, Prop. 5.8]. By construction, the boundary nodes of \(\varOmega _h\) lie on \(\partial \varOmega \) and zero boundary conditions are preserved by \({\mathcal {L}}_h\), see [16, Sec. 8.5]. Further by [15, Sec. 4], the lift preserves values at the nodes, i.e., in particular

$$\begin{aligned} I_h{\mathcal {L}}_h\varphi _h = \varphi _h, \quad \varphi _h \in V_h, \end{aligned}$$
(2.3)

where we denote the nodal interpolation operator by \(I_h:C_0(\varOmega ) \rightarrow V_h\) and, enriching the space \(W_h\) by basis functions corresponding to the boundary nodes, its extension \(I_h^{e}:C(\varOmega ) \rightarrow C(\varOmega _h)\). Further, we define the adjoint lift operators \({\mathcal {L}}_h^{H*}:H\rightarrow H_h\) and \({\mathcal {L}}_h^{V*}:V\rightarrow V_h\) by

$$\begin{aligned} \bigl ( {\mathcal {L}}_h^{H*}\varphi \mid \psi _h \bigr )_{H_h}&= \left( \varphi \mid {\mathcal {L}}_h\psi _h \right) _{L^2(\varOmega )},&\varphi&\in H, \, \psi _h \in H_h, \end{aligned}$$
(2.4a)
$$\begin{aligned} \bigl ( {\mathcal {L}}_h^{V*}\varphi \mid \psi _h \bigr )_{V_h}&= \left( \varphi \mid {\mathcal {L}}_h\psi _h \right) _{H_0^1(\varOmega )},&\varphi&\in V, \, \psi _h \in V_h. \end{aligned}$$
(2.4b)

We note that in the conforming case, i.e., \(\varOmega = \varOmega _h\), \({\mathcal {L}}_h^{H*}\) and \({\mathcal {L}}_h^{V*}\) coincide with the \(L^2\)-projection \(\pi _h:L^2(\varOmega _h) \rightarrow H_h\) and the Ritz projection \(R_h:H^1_0(\varOmega _h) \rightarrow V_h\), respectively, given by

$$\begin{aligned} \left( \pi _h\psi \mid \psi _h \right) _{H_h}&= \left( \psi \mid \psi _h \right) _{L^2(\varOmega _h)},&\psi&\in L^2(\varOmega _h), \, \psi _h \in H_h, \end{aligned}$$
(2.5a)
$$\begin{aligned} \left( R_h\psi \mid \psi _h \right) _{V_h}&= \left( \psi \mid \psi _h \right) _{H_0^1(\varOmega _h)},&\psi&\in H^1_0(\varOmega _h), \, \psi _h \in V_h. \end{aligned}$$
(2.5b)

For \(u_h,v_h\in V_h\) we define the discrete operator \(\lambda _h(u_h) :H_h\rightarrow H_h\) and the discrete right-hand side \(g_h\) by

$$\begin{aligned} \lambda _h(u_h) \varphi _h&= \pi _h\bigl ( I_h\lambda ( {\mathcal {L}}_hu_h) \, \varphi _h \bigr ),&g_h(t, u_h, v_h)&= I_hg(t,{\mathcal {L}}_hu_h, {\mathcal {L}}_hv_h) , \end{aligned}$$
(2.6)

respectively. Denoting by \(I_h^{\varOmega _h}:C(\varOmega _h) \rightarrow V_h\) the nodal interpolation operator with the same nodes as \(I_h\), we have by (2.3) the identity \( I_h\lambda ( {\mathcal {L}}_hu_h) = I_h^{\varOmega _h}\lambda ( u_h)\) and similarly for \(g_h\). The first-order counterparts of (2.6) are given by

$$\begin{aligned} y_h&= \begin{pmatrix} u_h\\ v_h\end{pmatrix} , \quad \varLambda _h(y_h) = \begin{pmatrix} \textrm{Id}&{} 0 \\ 0 &{} \lambda _h(u_h) \end{pmatrix}, \quad G_h(t,y_h) = \begin{pmatrix} 0 \\ g_h(t, u_h, v_h) \end{pmatrix}. \end{aligned}$$
(2.7)

Moreover, we show in Sect. 2.5 that under certain assumption on \(u_h\in V_h\) there exists a modified \(L^2\)-projection \(Q_h(u_h) :L^2(\varOmega _h) \rightarrow V_h\) such that the inverse of \(\lambda (u_h)\) is given by

$$\begin{aligned} \lambda _h^{-1}{{(u_h)}} \varphi _h&= Q_h(u_h) \bigl ( (I_h\lambda ({\mathcal {L}}_hu_h))^{-1} \varphi _h \bigr ),&\varLambda _h^{-1}{(y_h)}&= \begin{pmatrix} \textrm{Id}&{} 0 \\ 0 &{} \lambda _h^{-1}{{(u_h)}} \end{pmatrix}. \end{aligned}$$
(2.8)

Finally, we introduce the operators \(\mathrm {\Delta }_h:V_h\rightarrow H_h\) and \(\textrm{A}_h:X_h\rightarrow X_h\) given by

$$\begin{aligned} - \left( \mathrm {\Delta }_h\varphi _h \mid \psi _h \right) _{H_h}&= \left( \varphi _h \mid \psi _h \right) _{V_h},&\textrm{A}_h&= \begin{pmatrix} 0 &{} \textrm{Id}\\ \mathrm {\Delta }_h&{} 0 \end{pmatrix},&\varphi _h, \psi _h&\in V_h. \end{aligned}$$
(2.9)

Note that \(\mathrm {\Delta }_h\) is symmetric and \(\textrm{A}_h\) is skew-symmetric with respect to \(H_h\) and \(X_h\), respectively, but they are not uniformly bounded with respect to h. The spatially discrete quasilinear wave equation in first-order formulation then reads

$$\begin{aligned} \varLambda _h(y_h(t)) \partial _ty_h(t)&= \textrm{A}_hy_h(t) + G_h(t,y_h),&t&\in [0,T], \end{aligned}$$
(2.10)

with the initial value \(y_h(0) = y_h^{0}\).

2.2 Choice of the Initial Value

As already mentioned, an appropriately chosen initial value is a key ingredient in the subsequent error analysis. An ideal initial value would include the adjoint lift operator \({\mathcal {L}}_h^{V*}\) defined in (2.4b). However, in order to compute this operator, integrals over the exact domain \(\varOmega \) have to be evaluated.

We thus propose an alternative that involves to use a finite element space of degree \(k' \ge k+1\) denoted by \({\widetilde{V}}_h\) over the computational domain \({\widetilde{\varOmega }}_h\). Further, let \({\widetilde{{\mathcal {L}}}}_h\) and \({\widetilde{I}}_h\) be the corresponding lift and interpolation operators. Then, for \(u\in H^2\), we define the modified Ritz map \({\widetilde{R}}_hu\) via

$$\begin{aligned} \bigl ( {\widetilde{R}}_hu \mid \varphi _h \bigr )_{V_h} = \bigl ( {\widetilde{I_h}} u \mid {\widetilde{{\mathcal {L}}_h}}^{-1} {\mathcal {L}}_h\varphi _h \bigr )_{{\widetilde{V}}_h}, \qquad \varphi _h \in V_h. \end{aligned}$$
(2.11)

We use this operator together with the interpolation to define the initial value by

$$\begin{aligned} y_h^{0} = \begin{pmatrix} u_h^{0} \\ v_h^{0} \end{pmatrix} = \begin{pmatrix} {\widetilde{R}}_hu^{0} \\ I_hv^{0} \end{pmatrix}. \end{aligned}$$
(2.12)

In Appendix A, we prove the following approximation property and discuss the computation of \({\widetilde{R}}_h\).

Proposition 2.4

For \(u^{0} \in H^{k+2}(\varOmega ) \cap V\), the difference of the adjoint lift \({\mathcal {L}}_h^{V*}\) defined in (2.4b) and \({\widetilde{R}}_h\) in (2.11) satisfies the bound

$$\begin{aligned} \bigl \Vert {\mathcal {L}}_h^{V*}u^{0} - {\widetilde{R}}_hu^{0} \bigr \Vert _{H^1(\varOmega _h)} + h \bigl \Vert \mathrm {\Delta }_h\bigl ( {\mathcal {L}}_h^{V*}u^{0} - {\widetilde{R}}_hu^{0} \bigr ) \bigr \Vert _{L^2(\varOmega _h)}&\le C h^{k+1} \bigl \Vert u^{0} \bigr \Vert _{H^{k+2}} , \end{aligned}$$

where the constant C is independent of h.

We emphasize that the precise construction of the initial value is not important in the error analysis, but only the bounds obtained in Proposition 2.4. Hence, if we can compute the adjoint lift exactly, which is the Ritz projection in the conforming case, then one can also choose \(u_h^{0} = {\mathcal {L}}_h^{V*}u^{0}\). However, we cannot make the standard choice \(u_h^{0} = I_hu^{0}\), since this would imply the statement of Proposition 2.4 only with k instead of \(k+1\).

2.3 Main Result for the Semi-Discretization in Space

Before we state our main error bounds, we chose some exponent \({p^*}\), depending on the dimension \(N= 1,2,3\), as

$$\begin{aligned} N< {p^*}{\left\{ \begin{array}{ll} \le \infty , &{}N= 1, \\< \infty , &{}N= 2, \\ < 6, &{}N=3. \end{array}\right. } \end{aligned}$$
(2.13)

This choice in particular implies the Sobolev embeddings

$$\begin{aligned} H^1 \hookrightarrow L^{p^*}\qquad \text { and } \qquad H^2\hookrightarrow W^{1,{p^*}} \hookrightarrow L^\infty . \end{aligned}$$
(2.14)

Our first main result gives an error bound on the spatially discrete solution defined in (2.10), and the proof is given in Sect. 3. Recall the fractional powers of the Dirichlet Laplacian denoted by \({\mathcal {H}}_k {:}{=}{\mathcal {D}}\bigl ( (-\varDelta )^{k/2} \bigr )\).

Theorem 2.5

Let \(\partial \varOmega \in C^{k+1}\), and Assumption 2.1 hold. Further, let the solution \(u\) satisfy

$$\begin{aligned} \begin{aligned} u&\in C ([0,T], {\mathcal {H}}_3 \cap H^{k+3}(\varOmega )) \cap C^2 ([0,T], V\cap W^{k+1,\infty }(\varOmega ) ), \\ \lambda (u)&\in C ([0,T], W^{k+1,\infty }(\varOmega ) ), \quad g(\cdot ,u,\partial _tu) \in C([0,T], H^{k+1}(\varOmega ) ), \end{aligned} \end{aligned}$$
(2.15)

and choose the initial value (2.12). Then, there is \(h_0>0\) such that for all \(h \le h_0\), it holds for \(t \in [0,T]\)

$$\begin{aligned} \left\Vert u(t) - {\mathcal {L}}_hu_h(t) \right\Vert _{W^{1,{p^*}}(\varOmega )} + \left\Vert \partial _tu(t) - {\mathcal {L}}_hv_h(t) \right\Vert _{H^1(\varOmega )} \le C h^k \end{aligned}$$

with a constant \(C>0\) which is independent of h.

Using (2.14), the theorem implies convergence in the maximum norm for \(u_h\) and in \(L^{p^*}\) for \(v_h\) and is in particular applicable to linear finite elements. We note that the results from the literature so far had the limitation \(k\ge 2\).

2.4 Main Results for Full Discretization

We further discuss the convergence of four different fully discrete schemes. We recall that by \(\tau >0\) we denote the time step size and define for \(n=0,\ldots ,N\) the times \(t^{n} = n \tau \), with \(T = N \tau \). The fully discrete approximations are given by \(u_h^{n} \approx u(t^{n})\) and \(v_h^{n} \approx \partial _tu(t^{n})\). The proofs of the convergence results are given in Sect. 4.

2.4.1 Semi-Implicit Euler Method

For a variant of the implicit Euler method, we introduce the discrete derivative

$$\begin{aligned} \partial _{\tau }a_n {:}{=}\tfrac{1}{\tau } \bigl ( a^n - a^{n-1} \bigr ), \quad n\ge 1, \qquad \partial _{\tau }a_0 {:}{=}a_0, \end{aligned}$$
(2.16)

and consider as in [10, 22] the semi-implicit Euler method

$$\begin{aligned} \varLambda _h(y_h^{n}) \partial _{\tau }y_h^{n+1} = \textrm{A}_hy_h^{n+1} + G_h(t^{n},y_h^{n}), \quad n \ge 0, \end{aligned}$$
(2.17)

by freezing the nonlinear parts at the numerical approximation in the last step. The computation of the next approximation thus only requires the solution of a linear system. For the analysis, we impose the following weak CFL-type condition

$$\begin{aligned} \tau \le c h^{N/{p^*}+ \varepsilon _0} \end{aligned}$$
(2.18)

with \({p^*}\) from (2.13) and some arbitrary \(\varepsilon _0>0\). This yields the following convergence result.

Theorem 2.6

Let \(\partial \varOmega \in C^{k+1}\), and Assumption 2.1 hold. Further, let the solution \(u\) in addition to (2.15) satisfy

$$\begin{aligned} u\in C^3 ([0,T], L^{2}(\varOmega )), \end{aligned}$$

and choose the initial value (2.12). Then, under the condition (2.18) there are \(h_0,\tau _0 > 0\) such that for all \(h \le h_0\) and \(\tau \le \tau _0\), it holds for \(0 \le t^{n} \le T\)

$$\begin{aligned} \left\Vert u(t^{n}) - {\mathcal {L}}_hu_h^{n} \right\Vert _{W^{1,{p^*}}(\varOmega )} + \left\Vert \partial _tu(t^{n}) - {\mathcal {L}}_hv_h^{n} \right\Vert _{H^1(\varOmega )} \le C \bigl ( \tau + h^k \bigr ) \end{aligned}$$

with a constant \(C>0\) which is independent of h and \(\tau \).

We emphasize that the CFL-type condition in (2.18) is essentially no restriction for \(N=2\) since \({p^*}\) can be chosen arbitrarily large due to (2.13). For \(N=3\), the CFL roughly yields \(\tau \lesssim h^{1/2+\varepsilon }\). However, even for \(k=1\), the error behaves as \(\tau + h\), and one would choose \(\tau \sim h\) anyway.

2.4.2 Semi-Implicit Midpoint Rule or Crank–Nicolson Scheme

As a second order in time method, we consider a variant of the midpoint rule proposed in [28]

$$\begin{aligned} \varLambda _h({\bar{y}}_h^{n + 1/2}) \partial _{\tau }y_h^{n+1}&= \textrm{A}_hy_h^{n+1/2} + G_h(t^{n+1/2},{\bar{y}}_h^{n + 1/2}), \quad n \ge 1, \end{aligned}$$
(2.19a)

with average \(y_h^{n+1/2}\) and extrapolation \({\bar{y}}_h^{n + 1/2}\) given by

$$\begin{aligned} y_h^{n+1/2}&= \tfrac{1}{2} \bigl ( y_h^{n+1} + y_h^{n}\bigr ), \quad {\bar{y}}_h^{n + 1/2} = \tfrac{3}{2} y_h^{n} - \tfrac{1}{2} y_h^{n -1 }. \end{aligned}$$
(2.19b)

The first approximation \(y^{1}\) is computed with the Euler method (2.17), and as before, in every time step only a linear system has to be solved. For the analysis of the second-order method, we can weaken the CFL-type condition compared to (2.18) and require only

$$\begin{aligned} \tau \le c h^{N/2{p^*}+ \varepsilon _0}. \end{aligned}$$
(2.20)

Theorem 2.7

Let \(\partial \varOmega \in C^{k+1}\), and Assumption 2.1 hold. Further, let the solution \(u\) in addition to (2.15) satisfy

$$\begin{aligned} u\in C^2 ([0,T], {\mathcal {H}}_3) \cap C^3 ([0,T], {\mathcal {H}}_2) \cap C^4 ([0,T], L^{2}(\varOmega )), \end{aligned}$$

and choose the initial value (2.12). Then, under the condition (2.20) there are \(h_0,\tau _0>0\) such that for all \(h \le h_0\) and \(\tau \le \tau _0\), it holds for \(0\le t^{n} \le T\)

$$\begin{aligned} \left\Vert u(t^{n}) - {\mathcal {L}}_hu_h^{n} \right\Vert _{W^{1,{p^*}}(\varOmega )} + \left\Vert \partial _tu(t^{n}) - {\mathcal {L}}_hv_h^{n} \right\Vert _{H^1(\varOmega )} \le C (\tau ^2 + h^k), \end{aligned}$$

where C is independent of h and \(\tau \).

Since, there is again essentially no CFL-type condition for \(N=2\), we only discuss the case \(N=3\). We require \(\tau \lesssim h^{1/4+\varepsilon }\), whereas in [31] not only \(k\ge 2\) but also \( \tau \lesssim h^{3/4+\varepsilon } \) has to be imposed.

2.4.3 Exponential Euler Method

We turn to exponential methods which employ the variation-of-constants formula and an exact evaluation of the matrix exponential applied to a vector. For the approximation \(y_h^{n} \approx y(t^{n})\), we use the shorthand notation \({\textbf{A}}_h^{\hspace{-0.3mm}n} = \varLambda _h^{-1}(y_h^{n}) \textrm{A}_h\) and consider the method which was proposed in [12]

$$\begin{aligned} \begin{aligned} y_h^{n+1}&= \textrm{e}^{\tau {\textbf{A}}_h^{\hspace{-0.3mm}n} } y_h^{n} + \tau \varphi _1(\tau {\textbf{A}}_h^{\hspace{-0.3mm}n}) G_h(t^{n},y_h^{n}) \\&= y_h^{n} + \tau \varphi _1(\tau {\textbf{A}}_h^{\hspace{-0.3mm}n}) \bigl ( {\textbf{A}}_h^{\hspace{-0.3mm}n} y_h^{n} + \varLambda _h^{-1}(y_h^{n}) G_h(t^{n},y_h^{n}) \bigr ) \end{aligned} \end{aligned}$$

with the analytic function \(\varphi _1(z) = \int _0^1 e^{s z}\,ds\). We obtain the following error bound.

Theorem 2.8

Let \(\partial \varOmega \in C^{k+1}\), and Assumption 2.1 hold. Further, let the solution \(u\) satisfy (2.15), and choose the initial value (2.12). Then, under the condition (2.18) there are \(h_0,\tau _0>0\) such that for all \(h \le h_0\) and \(\tau \le \tau _0\), it holds for \(0 \le t^{n} \le T\)

$$\begin{aligned} \left\Vert u(t^{n}) - {\mathcal {L}}_hu_h^{n} \right\Vert _{W^{1,{p^*}}(\varOmega )} + \left\Vert \partial _tu(t^{n}) - {\mathcal {L}}_hv_h^{n} \right\Vert _{H^1(\varOmega )} \le C \bigl ( \tau + h^k \bigr ) , \end{aligned}$$

where C is independent of h and \(\tau \).

We note that the CFL-type condition is the same as in the error bound of the semi-implicit Euler in Theorem 2.6.

2.4.4 Exponential Midpoint Rule

A second-order exponential variant is for example given by the exponential midpoint rule proposed in [12]. Using the notation in (2.19b), we define

$$\begin{aligned} {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} {:}{=}\varLambda _h^{-1}({\bar{y}}_h^{n + 1/2}) \textrm{A}_h\end{aligned}$$

and consider the scheme

$$\begin{aligned} y_h^{n+1}&= \textrm{e}^{\tau {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} } y_h^{n} + \tau \varphi (\tau {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2}) \varLambda _h^{-1}({\bar{y}}_h^{n + 1/2}) G_h(t^{n+1/2},{\bar{y}}_h^{n + 1/2}) \\&= y_h^{n} + \tau \varphi (\tau {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2}) \bigl ( {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} y_h^{n} + \varLambda _h^{-1}({\bar{y}}_h^{n + 1/2}) G_h(t^{n+1/2},{\bar{y}}_h^{n + 1/2}) \bigr ). \end{aligned}$$

Employing the techniques established for the proofs of Theorems 2.6 and 2.8, and combining them with the techniques in [12], allows for a convergence result as in Theorem 2.7 under the weaker CFL-type condition (2.20).

Theorem 2.9

Let \(\partial \varOmega \in C^{k+1}\), and Assumption 2.1 hold. Further, let the solution \(u\) in addition to (2.15) satisfy

$$\begin{aligned} \begin{aligned} u&\in C^1 ([0,T], H^{k+3} ) \cap C^3([0,T], W^{k+1,\infty }(\varOmega )) \cap C^4([0,T], H^{1}(\varOmega )), \\ \lambda (u)&\in C^3([0,T], W^{k+1,\infty }(\varOmega )), \quad g(\cdot ,u,\partial _tu) \in C^1([0,T], {\mathcal {H}}_2 ) \cap C^3([0,T], H^{1}(\varOmega ) ), \end{aligned} \end{aligned}$$

and choose the initial value (2.12). Then, under the condition (2.20) there are \(h_0,\tau _0>0\) such that for all \(h \le h_0\) and \(\tau \le \tau _0\), it holds for \(0\le t^{n} \le T\)

$$\begin{aligned} \left\Vert u(t^{n}) - {\mathcal {L}}_hu_h^{n} \right\Vert _{W^{1,{p^*}}(\varOmega )} + \left\Vert \partial _tu(t^{n}) - {\mathcal {L}}_hv_h^{n} \right\Vert _{H^1(\varOmega )} \le C (\tau ^2 + h^k), \end{aligned}$$

where C is independent of h and \(\tau \).

2.5 Numerical Experiments

To illustrate our theoretical findings, we present some numerical experiments for the non-exponential methods. We first illustrate the optimality of our error bounds using a smooth solution and then consider the formation of a shock wave.

2.5.1 Smooth Solution

Let \(\varOmega = B_1(0) \subset {\mathbb {R}}^2\) be the two-dimensional unit sphere and consider Eq. (1.1) from Example 2.2 with \(\alpha = - \frac{1}{6}\) and data given by

$$\begin{aligned} u^{0}({\textbf{x}})&= \frac{1}{10} \sin (\pi r^2)^3 x_1 x_2 \,,{} & {} v^{0} ({\textbf{x}}) = \frac{1}{10} \sin (\pi r^2)^3 x_1 x_2 \,, \\ \lambda (u)&= 1- \frac{1}{2} u^2,{} & {} g(t,u, v) = uv^2 + {\widehat{f}}(t), \end{aligned}$$

where \(r^2 = |{\textbf{x}}|^2\). The additional forcing term \({\widehat{f}}\) is chosen such that the exact solution is given by

$$\begin{aligned} u(t,{\textbf{x}}) = \frac{1}{10} \textrm{e}^t \sin (\pi r^2)^3 x_1 x_2. \end{aligned}$$

A simple calculation shows that the regularity assumptions of Theorems 2.5, 2.6, and 2.7 are satisfied. The scaling by a factor 10 is used to approximately normalize the \(W^{1,\infty }\)-norm of the solution \(u\).

2.5.2 Discretization

We discretize in space using the mass and stiffness matrices

$$\begin{aligned} \bigl ( M_h (u_h) \bigr )_{i,j}&{:}{=}\bigl ( (I_h^{\varOmega _h}\lambda (u_h) ) \varphi _i \mid \varphi _j \bigr )_{L^2(\varOmega _h)}, \quad \bigl ( {\widetilde{M}}_h \bigr )_{i,j} {:}{=}\left( \varphi _i \mid \varphi _j \right) _{L^2(\varOmega _h)}, \\ \bigl ( L_h \bigr )_{i,j}&{:}{=}\left( \nabla \varphi _i \mid \nabla \varphi _j \right) _{L^2(\varOmega _h)} , \end{aligned}$$

where we denote by \((\varphi _i)_i\) the nodal basis of \(V_h\). Then, the discrete solution in (2.10) satisfies

$$\begin{aligned} M_h \bigl (u_h(t)) \partial _{tt}u_h(t) = - L_h u_h(t) + {\widetilde{M}}_h I_h^{\varOmega _h}g(t,u_h(t),v_h(t)), \end{aligned}$$

by abusing the notation for the coefficient vectors and their corresponding function in \(V_h\). The Euler method in (2.17) is then given for \(n\ge 0\) by

$$\begin{aligned} \bigl ( M_h^n + \tau ^2 L_h \bigr ) v_h^{n+1}&= M_h^n v_h^{n} - \tau L_h u_h^{n} + \tau {\widetilde{M}}_h I_h^{\varOmega _h}g(t^{n},u_h^{n},v_h^{n}) , \\ u_h^{n+1}&= u_h^{n} + \tau v_h^{n+1}, \end{aligned}$$

where we abbreviate \(M_h^n = M_h \bigl (u_h^{n} )\). For the fully discrete midpoint rule, (2.19) is then given for \(n\ge 1\) by

$$\begin{aligned} \bigl ( M_h^{n+1/2} + \frac{\tau ^2}{4} L_h \bigr ) v_h^{n+1}&= \bigl ( M_h^{n+1/2} - \frac{\tau ^2}{4} L_h \bigr ) v_h^{n} - \tau L_h u_h^{n} \\&\quad + \tau {\widetilde{M}}_h I_h^{\varOmega _h}g(t^{n+1/2}, {\bar{u}}_h^{n + 1/2},{\bar{v}}_h^{n + 1/2}) , \\ u_h^{n+1}&= u_h^{n} + \frac{\tau }{2} \bigl ( v_h^{n} + v_h^{n+1} \bigr ), \end{aligned}$$

denoting the extrapolations by \({\bar{u}}_h^{n + 1/2} = \frac{3}{2} u_h^{n} - \frac{1}{2} u_h^{n-1}\) and \({\bar{v}}_h^{n + 1/2} = \frac{3}{2} v_h^{n} - \frac{1}{2} v_h^{n-1} \), and the mass matrix by \(M_h^{n+1/2} = M_h \bigl ({\bar{u}}_h^{n + 1/2} ) \). For the step \(n=0\), we use the Euler scheme from above. We implemented the numerical experiments in C++ using the finite element library deal.II (version 9.4) [2, 6]. A precise description of the implementation can be found for example in [29, Ch. 6.5.1]. For the implementation of the initial value in (2.12), we refer to Appendix A. Concerning the computational costs, let us note that in each step the right-hand side as well as the mass matrix have to be assemble. The stiffness matrix is stored after assembling it before the time stepping. In addition, a linear system for the sparse matrix \(M_h + \frac{\tau ^2}{4} L_h\) has to be solved in each step using the conjugate gradient method. The codes written by Malik Scheifinger under the author’s supervision to reproduce the experiments are available at https://doi.org/10.35097/1792.

2.5.3 Numerical Results

For the problem described above, we performed experiments for the time and space discretization, where we used finite elements of order \(k=1,2,3\). In the error bounds of Sect. 2, for \(N=2\), the norm \(W^{1,p} \times H^1\) is used for \(p< \infty \) arbitrarily large. Hence, we chose \(p=\infty \) in our experiments, but note that the plots were qualitatively very similar for finite p. Since the computation of the lift of a finite element function is very laborious, and in application usually also not available, we do not compute the full error in the form \({\mathcal {L}}_hu- u_h\). Instead, in our numerical examples we consider the error

$$\begin{aligned} {\textbf{E}}(t) {:}{=}\left\Vert u_h(t) - I_hu(t) \right\Vert _{W^{1,\infty }(\varOmega _h)} + \left\Vert v_h{} (t) - I_h\partial _tu(t) \right\Vert _{H^1(\varOmega _h)}, \end{aligned}$$

for the nodal interpolation operator \(I_h\) which is of the same order by the standard interpolation estimates. Note that in practice, one is only interested in \(u_h\), and the computation of the error here is only relevant to confirm our theoretical error bounds.

Fig. 1
figure 1

Left: error \({\textbf{E}}(0.8)\) of the semi-implicit midpoint rule (with time step size \(\tau = 8 \cdot 10^{-4}\) and \(\tau = 2.67 \cdot 10^{-4}\)) combined with finite elements of order \(k = 1,2,3\) plotted against the mesh width h. The dashed lines indicate order \(h^k\) for \(k=1,2,3\). Right: Error \({\textbf{E}}(0.8)\) of the semi-implicit Euler method and midpoint rule combined with finite elements of order \(k=3\) (\(h = 1.52 \cdot 10^{-2}\)) plotted against the time step size \(\tau \). The dashed lines indicate order 1 and 2

In the left part of Fig. 1, the convergence of the error with respect to the spatial mesh width h is shown when using the semi-implicit midpoint rule with \(\tau = 8 \cdot 10^{-4}\). We observe that for finite elements of order k the error converges with order k in space as predicted by Theorem 2.5 and 2.7 until the error for \(k=3\) is dominated by the error of the temporal approximation. For \(k=3\), we ran the same experiment again with the smaller time step size \(\tau = 2.67 \cdot 10^{-4}\) to remove this plateau. Running the same experiment with the semi-implicit Euler method instead of the midpoint rule yields a qualitatively similar picture. Due to slower convergence in time, the error already stagnates at about \(10^{-3}\).

In the right part of Fig. 1, we consider the convergence of the error with respect to the time step size \(\tau \) for the semi-implicit Euler method and midpoint rule. In space, we discretized with finite elements of order \(k=3\) and \(h = 1.52 \cdot 10^{-2}\) such that the spatial error is negligible. Aligning to Theorem 2.6, we observe convergence of order 1 in time for the Euler method and, confirming Theorem 2.7, convergence of order 2 for the midpoint rule.

2.5.4 Steepening Wave

In this second experiment, we consider the formation of a shock wave which is an often observed phenomenon in nonlinear waves. Since we are in a bounded domain, we force the wave to form a large gradient close to the origin. To this end, we chose our data by

$$\begin{aligned} u^{0}({\textbf{x}})&= (1-r^2)^3 \arctan (x_1 ) \,,{} & {} v^{0} ({\textbf{x}}) = - (1-r^2)^3 \frac{\alpha x_1}{x_1^2 + 1 }\,, \\ \lambda (u)&= 1- \frac{1}{2} u^2,{} & {} g(t,u, v) = uv^2 + {\widehat{f}}(t), \end{aligned}$$

where \(r^2 = |{\textbf{x}}|^2\). The additional forcing term \({\widehat{f}}\) is chosen such that the exact solution is given by

$$\begin{aligned} u(t,{\textbf{x}}) = (1-r^2)^3 \arctan \left( \frac{x_1}{1-\alpha t} \right) . \end{aligned}$$
(2.21)

We observe that for \(\alpha t \rightarrow 1\), the maximum norm of \(\nabla u\) tends to infinity. We thus simulate up to the end time \(T = 1\) for different, increasing values of \(\alpha <1\), and in Fig. 2 we depicted the corresponding solutions at the end time. The discretization in space and time is performed as described in Sect. 2.4.1.

Fig. 2
figure 2

Plots of the (exact) shock wave solution given in (2.21) for the final time \(t = 1 \) and different values of \(\alpha = 0.6, 0.8, 0.9, 0.95\)

2.5.5 Numerical Results

We restrict ourselves to the approximation quality in the spatial discretization in this case and thus apply the semi-implicit midpoint rule with \(\tau = 8 \cdot 10^{-4}\) and linear and quadratic finite elements. We then use increasing values of \(\alpha = 0.6, 0.8, 0.9, 0.95\), which can be translated into simulating closer to the blow-up point \(\alpha t = 1\). We depicted the convergence in Fig. 3. In the linear case, we observe that we obtain a reasonable approximation for moderate values of \(\alpha \), which correspond to the smooth case. However, when a shock occurs, our method suffers from large errors due to the large gradient. Compared to the linear polynomials, using quadratic polynomials appears to be advantageous, not only because of the better resolution near the blow-up, but also because of smaller error constants. Nevertheless, in the quadratic case, the error constants become large for \(\alpha \rightarrow 1\) as well.

Fig. 3
figure 3

Error \({\textbf{E}}(1.0)\) of the semi-implicit midpoint rule (with time step size \(\tau = 8 \cdot 10^{-4}\)) combined with finite elements of order \(k=1\) (left) and order \(k=2\) (right) plotted against the mesh width h for the values of \(\alpha = 0.6,0.8,0.9,0.95\). The dashed lines indicate order h or \(h^2\), respectively

2.6 Additional Results for Isoparametric Finite Elements

In this section, we provide further estimates on the spatially discrete objects which are used throughout the paper. As shown in [16, Thm. 5.9], we have for the nodal interpolation operator for \(m \in \{0,1\}\), \(1 \le p \le \infty \), and \(1 \le \ell \le k\) the estimates

$$\begin{aligned} \left\Vert (\textrm{Id}- {\mathcal {L}}_hI_h^{e}) \varphi \right\Vert _{W^{m,p}(\varOmega )}&\lesssim h^{\ell +1-m} \left\Vert \varphi \right\Vert _{W^{\ell +1,p}(\varOmega )},&\varphi&\in W^{\ell +1,p}(\varOmega ). \end{aligned}$$
(2.22)

Further, by [9, Thm. 3.1.6] \(\ell =0\) is allowed for \(N< p \le \infty \). Another crucial property of the interpolation concerns the stability when applied to the product of functions. We give a proof in Appendix B.

Lemma 2.10

Let \(\psi _h \in V_h\), \(\delta >0\), and \(\varphi \in W^{1,N+\delta }(\varOmega )\). Then,

$$\begin{aligned} \left\Vert I_h( \varphi \, {\mathcal {L}}_h\psi _h) \right\Vert _{L^2}&\le C \left\Vert \varphi \right\Vert _{L^\infty } \left\Vert \psi _h \right\Vert _{L^2} , \\ \left\Vert I_h( \varphi \, {\mathcal {L}}_h\psi _h) \right\Vert _{H^1}&\le C \left\Vert \varphi \right\Vert _{W^{1,N+\delta }} \left\Vert \psi _h \right\Vert _{H^1} , \end{aligned}$$

where the constant \(C>0\) is independent of h.

Concerning the adjoint lifts defined in (2.4), we show in Appendix B the following bounds for \(1\le \ell \le k \)

$$\begin{aligned} \bigl \Vert {\mathcal {L}}_h^{H*}\varphi \bigr \Vert _{H_h}&\lesssim \left\Vert \varphi \right\Vert _{L^2(\varOmega )},&\varphi&\in L^2(\varOmega ), \end{aligned}$$
(2.23a)
$$\begin{aligned} \bigl \Vert (I_h- {\mathcal {L}}_h^{H*}) \varphi \bigr \Vert _{H_h}&\lesssim h^{\ell +1} \left\Vert \varphi \right\Vert _{H^{\ell +1}(\varOmega )},&\varphi&\in H^{\ell +1}(\varOmega ) \cap V. \end{aligned}$$
(2.23b)

An interpolation argument between [16, Lem. 3.8] and [14, Thm. 2.5] yields

$$\begin{aligned} \bigl \Vert (\textrm{Id}- {\mathcal {L}}_h{\mathcal {L}}_h^{V*}) \varphi \bigr \Vert _{W^{1,p}(\varOmega )}&\lesssim h^{\ell } \left\Vert \varphi \right\Vert _{W^{\ell +1,p}(\varOmega )},&\varphi&\in H^{\ell +1}(\varOmega ) \cap V, \end{aligned}$$
(2.24)

for \(2 \le p \le \infty \), \(0\le \ell \le k \). We will further make use of the inverse estimates, cf. [8, Thm. 4.5.11] or [30, Lem. 5.6],

$$\begin{aligned} \left\Vert \varphi _h \right\Vert _{V_h} \le C h^{-1} \left\Vert \varphi _h \right\Vert _{L^2(\varOmega _h)}, \qquad \left\Vert \varphi _h \right\Vert _{L^q} \le C h^{N/q-N/p} \left\Vert \varphi _h \right\Vert _{L^p}, \end{aligned}$$
(2.25)

for \(1\le p \le q \le \infty \).

For \(u_h\in V_h\) with \(\left\Vert u_h \right\Vert _{L^\infty } \le {\widehat{r}}_\infty \) and \(\left\Vert u_h \right\Vert _{W^{1,N+\delta }} \le r\), we define an inner product for \(\varphi ,\psi \in L^2(\varOmega _h)\) and the corresponding \(L^2\)-projection \(Q_h(u_h) :L^2(\varOmega _h) \rightarrow V_h\) used in (2.8) for \(\psi _h \in V_h\) by

$$\begin{aligned} \left( \varphi \mid \psi \right) _{\lambda _h} {:}{=}\left( I_h\lambda ({\mathcal {L}}_hu_h) \, \varphi \mid \psi \right) _{L^2(\varOmega _h)}, \qquad \left( Q_h(u_h) \psi \mid \psi _h \right) _{\lambda _h} = \left( \psi \mid \psi _h \right) _{\lambda _h}, \end{aligned}$$

and obtain by the standard techniques for \(p \in [2,\infty ]\) and \(\psi \in L^p, \varphi \in H^1(\varOmega _h)\)

$$\begin{aligned} \left\Vert \pi _h\psi \right\Vert _{L^p(\varOmega _h)}&\lesssim \left\Vert \psi \right\Vert _{L^p(\varOmega _h)}, \quad \left\Vert \pi _h\varphi \right\Vert _{H^1(\varOmega _h)} \lesssim \left\Vert \varphi \right\Vert _{H^1(\varOmega _h)}, \end{aligned}$$
(2.26a)
$$\begin{aligned} \left\Vert Q_h( u_h) \psi \right\Vert _{L^p(\varOmega _h)}&\lesssim \left\Vert \psi \right\Vert _{L^p(\varOmega _h)}, \quad \left\Vert Q_h(u_h) \varphi \right\Vert _{H^1(\varOmega _h)} \lesssim \left\Vert \varphi \right\Vert _{H^1(\varOmega _h)}, \end{aligned}$$
(2.26b)

see for example [35] in the case \(p=\infty \). The constants are controlled by the norms of \(u_h\) in \(L^\infty \) and \(W^{1,N+\delta }\).

Finally, we introduce the first-order lift operator \({\varvec{{\mathcal {L}}_h}}:W^{\ell ,p}(\varOmega _h)^2 \rightarrow W^{\ell ,p}(\varOmega )^2\), \(\ell =0,1\), \(2 \le p \le \infty \), the adjoint lift \({\varvec{{\mathcal {L}}_h}}^{*}:X\rightarrow X_h\), and the reference operator \(J_h:V\times V\rightarrow X_h\) defined by

$$\begin{aligned} {\varvec{{\mathcal {L}}_h}}= \begin{pmatrix} {\mathcal {L}}_h&{} 0 \\ 0 &{} {\mathcal {L}}_h\end{pmatrix}, \qquad {\varvec{{\mathcal {L}}_h}}^{*}= \begin{pmatrix} {\mathcal {L}}_h^{V*}&{} 0 \\ 0 &{} {\mathcal {L}}_h^{H*}\end{pmatrix}, \qquad J_h= \begin{pmatrix} {\mathcal {L}}_h^{V*}&{} 0 \\ 0 &{} {\mathcal {L}}_h^{V*}\end{pmatrix}, \end{aligned}$$
(2.27)

which are bounded uniformly in h due to (2.2), (2.23), and (2.24). From the proof of [20, Lem. 4.7], we then obtain the identity

$$\begin{aligned} \textrm{A}_hJ_h= {\varvec{{\mathcal {L}}_h}}^{*}\textrm{A}, \end{aligned}$$
(2.28)

which is used several times in the proofs.

3 Error Analysis for the Space Discretization

In this section, we give the proof of Theorem 2.5. We decompose the error into

$$\begin{aligned} \begin{aligned} y(t) - {\varvec{{\mathcal {L}}_h}}y_h(t)&= \bigl ( \textrm{Id}- {\varvec{{\mathcal {L}}_h}}J_h\bigr ) y(t) + {\varvec{{\mathcal {L}}_h}}\bigl ( J_hy(t) - y_h(t) \bigr ) \\&{=}{:}e_{J_h} (t) + {\varvec{{\mathcal {L}}_h}}e_h(t), \end{aligned} \end{aligned}$$

where the projection error \(e_{J_h}\) is easily bounded using (2.24). The first part of the proof consists in reducing the bound on \(\left\Vert e_h \right\Vert _{W^{1,{p^*}}\times H^1}\) to an estimate in the stronger norm induced by \(\left\Vert \textrm{A}_h\cdot \right\Vert _{X_h}\), and not in the standard \(X_h\)-norm.

The second part consists in establishing the stronger norm bound on \(\left\Vert \textrm{A}_he_h \right\Vert _{X_h}\) in Sect. 3.2. We note that a key idea is to set up an appropriate solution space for the numerical approximation, see (3.3), which allows for an appropriate formulation of the error equation. We give a detailed explanation in Remark 3.6.

3.1 Reduction to Stronger Norm Estimates

For \({p^*}\) defined in (2.13), we chose some fixed \(\delta >0\) such that

$$\begin{aligned} \frac{1}{2} - \frac{1}{N+\delta } \le \frac{1}{{p^*}}, \end{aligned}$$
(3.1)

a radius \(r_\infty < {\widehat{r}}_\infty \) from Assumption 2.1, and another radius \(r'_\infty >0\), such that

$$\begin{aligned} \left\Vert u \right\Vert _{L^\infty {( L^\infty )}} \le r_\infty , \text { and } \max \bigl \{ \left\Vert u \right\Vert _{L^\infty {(W^{1,N+\delta })}}, \left\Vert \partial _tu \right\Vert _{L^\infty {(W^{1,N+\delta })}} \bigr \} \le \tfrac{1}{2} r'_\infty , \end{aligned}$$
(3.2)

where \(\left\Vert x \right\Vert _{L^\infty {(X)}} {:}{=}\max _{[0,T]}\left\Vert x(t) \right\Vert _{X} \). We denote by \(t^*_h\) the time with

$$\begin{aligned} \begin{aligned} t^*_h&{:}{=}\sup \{ t \in [0,T] \mid \sup _{ s \in [0,t] } \left\Vert {\mathcal {L}}_hu_h(s) \right\Vert _{L^{\infty }} \le {\widehat{r}}_\infty \text { and } \\&\quad \sup _{ s \in [0,t] } \left\Vert {\mathcal {L}}_hu_h(s) \right\Vert _{W^{1,N+\delta }}, \sup _{ s \in [0,t] } \left\Vert {\mathcal {L}}_hv_h(s) \right\Vert _{W^{1,N+\delta }} \le r'_\infty \}. \end{aligned} \end{aligned}$$
(3.3)

We assume for a moment that the set is not empty and hence \(t^*_h>0\), see Proposition 3.5. The following result is a direct consequence of Lemma 2.10 and the key ingredient to ensure wellposedness of the discrete equation. In addition, it enables us to employ energy techniques in the error analysis.

Lemma 3.1

Let Assumption 2.1 hold. We have for \(t \in [0,t^*_h]\), \(1 \le p \le \infty \), and \(j =0,1\) the bounds

$$\begin{aligned} \bigl \Vert \partial _t^j \lambda _h(u_h(t)) \varphi _h \bigr \Vert _{L^{p}}&\le C_{\lambda } \left\Vert \varphi _h \right\Vert _{L^{p}},&\bigl \Vert \partial _t^j \lambda _h^{-1}{{(u_h(t) )}} \varphi _h \bigr \Vert _{L^{p}}&\le C_{\lambda } \left\Vert \varphi _h \right\Vert _{L^{p}}, \\ \bigl \Vert \lambda _h(u_h(t)) \varphi _h \bigr \Vert _{H^1}&\le C_{\lambda } \left\Vert \varphi _h \right\Vert _{H^1},&\bigl \Vert \lambda _h^{-1}{{(u_h(t) )}} \varphi _h \bigr \Vert _{H^1}&\le C_{\lambda } \left\Vert \varphi _h \right\Vert _{H^1}, \end{aligned}$$

with a constant \(C_{\lambda } > 0\) depending only on \(\lambda \), its derivatives and \({\widehat{r}}_\infty , r'_\infty \), but is independent of h and \(t^*_h\).

Proof

We use the definition of \(\lambda _h\) and \(\lambda _h^{-1}\) in (2.6) and (2.8), respectively, to conclude the assertion from Assumption 2.1, the stability in (2.26), the interpolation property (2.22), and (3.3). \(\square \)

Making extensive use of Lemma 3.1, we show via energy techniques in Sect. 3.2 the following error bound on

$$\begin{aligned} \Bigl ( \bigl \Vert \mathrm {\Delta }_h\bigl ( {\mathcal {L}}_h^{V*}u(t) - u_h(t) \bigr ) \bigr \Vert _{L^2}^2 + \bigl \Vert {\mathcal {L}}_h^{V*}\partial _tu(t) - v_h(t) \bigr \Vert _{H^1}^2 \Bigr )^{1/2} = \bigl \Vert \textrm{A}_he_h(t) \bigr \Vert _{X_h}.\nonumber \\ \end{aligned}$$
(3.4)

Note that initially the result is only valid as long as the bounds in (3.3) hold.

Proposition 3.2

Under the assumptions of Theorem 2.5, it holds for \(0\le t \le t^*_h\)

$$\begin{aligned} \bigl \Vert \textrm{A}_he_h(t) \bigr \Vert _{X_h} \le C h^k , \end{aligned}$$

where C is independent of h and \(t^*_h\).

From this bound, we are able to extract convergence as well as to extend the final time \(t^*_h\) beyond T for sufficiently small h. Concerning \(u_h\), we show in the following lemma how to obtain convergence in the maximum norm and first-order Sobolev norms, but postpone the proof to Appendix C. Further, we may directly deduce the bounds on \(u_h\) in (3.3). Note that this lemma can be seen as a discrete analogue to (2.14) and is an improved variant of the results in [7, 13]. Similar bounds were already shown in [18, Thm. 1.12] and [36, Thm. 3].

Lemma 3.3

Let \({p^*}\) be given by (2.13). Then, there is a constant C independent of h such that

$$\begin{aligned} \left\Vert \varphi _h \right\Vert _{L^\infty (\varOmega _h)} + \left\Vert \varphi _h \right\Vert _{W^{1,{p^*}}(\varOmega _h)} \le C \left\Vert \mathrm {\Delta }_h\varphi _h \right\Vert _{L^2(\varOmega _h)} \end{aligned}$$

for all \(\varphi _h \in V_h\). In the case \(N=3\), the statement also holds for \({p^*}= 6\).

A further ingredient in the proof of the main result is to employ the \(H^1\)-bound on \(v_h\) in Proposition 3.2 to derive boundedness in \(L^\infty \) and \(W^{1,N+\delta }\) and thus extend the final time \(t^*_h\).

Lemma 3.4

Let \(\varphi _h \in V_h\) and \(\varphi \in W^{k+1,\infty }(\varOmega ) \cap V\), and assume that

$$\begin{aligned} \bigl \Vert {\mathcal {L}}_h^{V*}\varphi - \varphi _h \bigr \Vert _{H^1(\varOmega _h)}&\le C h^k . \end{aligned}$$
(3.5)

Then, we have for \({p^*}\) defined in (2.13) and \(\delta \) chosen in (3.1)

$$\begin{aligned} \left\Vert {\mathcal {L}}_h\varphi _h \right\Vert _{L^\infty (\varOmega )}&\le \left\Vert \varphi \right\Vert _{L^\infty (\varOmega )} + C h^{k -N/{p^*}} , \\ \left\Vert {\mathcal {L}}_h\varphi _h \right\Vert _{W^{1,N+\delta }(\varOmega )}&\le \left\Vert \varphi \right\Vert _{W^{1,N+\delta }(\varOmega )} + C h^{k -N/{p^*}} , \end{aligned}$$

with a constant C independent of h.

Since \(k\ge 1\), the choice in (2.13) enables to us to deduce the desired bounds (3.3) in \(L^\infty \) and \(W^{1,N+\delta }\) from approximation properties in \(H^1\) and hence allows us to extend the final time \(t^*_h\).

Proof

(of Lemma 3.4) For \(\psi _h \in V_h\), we combine the inverse estimate (2.25) and the Sobolev embedding \(H^1(\varOmega _h) \hookrightarrow L^{{p^*}}(\varOmega _h)\) and conclude by (3.1)

$$\begin{aligned} \left\Vert \psi _h \right\Vert _{L^\infty (\varOmega _h)}&\le C h^{-N/{p^*}} \left\Vert \psi _h \right\Vert _{L^{p^*}(\varOmega _h)}\le C h^{-N/{p^*}} \left\Vert \psi _h \right\Vert _{H^1(\varOmega _h)} , \end{aligned}$$
(3.6a)
$$\begin{aligned} \left\Vert \psi _h \right\Vert _{W^{1,N+\delta }(\varOmega _h)}&\le C h^{N/(N+\delta ) -N/2} \left\Vert \psi _h \right\Vert _{H^1(\varOmega _h)} \le C h^{-N/{p^*}} \left\Vert \psi _h \right\Vert _{H^1(\varOmega _h)} , \end{aligned}$$
(3.6b)

with a constant C independent of h. For the desired bound, we expand with the adjoint lift \({\mathcal {L}}_h^{V*}\) and obtain by (2.24)

$$\begin{aligned} \left\Vert {\mathcal {L}}_h\varphi _h \right\Vert _{L^\infty (\varOmega )}&\le \left\Vert \varphi \right\Vert _{L^\infty (\varOmega )} + \bigl \Vert \varphi - {\mathcal {L}}_h{\mathcal {L}}_h^{V*}\varphi \bigr \Vert _{L^\infty (\varOmega )} + \bigl \Vert {\mathcal {L}}_h{\mathcal {L}}_h^{V*}\varphi - {\mathcal {L}}_h\varphi _h \bigr \Vert _{L^\infty (\varOmega )} \\&\le \left\Vert \varphi \right\Vert _{L^\infty (\varOmega )} + C h^{k} \left\Vert \varphi \right\Vert _{W^{k+1,\infty }(\varOmega )} + C \bigl \Vert {\mathcal {L}}_h^{V*}\varphi - \varphi _h \bigr \Vert _{L^\infty (\varOmega _h)} . \end{aligned}$$

Since \( {\mathcal {L}}_h^{V*}\varphi - \varphi _h \in V_h\), the first assertion then follows from (3.6a) together with (3.5). The second estimate is derived fully analogously. \(\square \)

Hence, once we have shown Proposition 3.2, we can give the proof of our first main result.

Proof

(of Theorem 2.5)

Inserting the adjoint lift, we obtain for \(t\in [0,t^*_h]\) with (2.24), (3.4), and Lemma 3.3 the bound

$$\begin{aligned} \left\Vert u(t) - {\mathcal {L}}_hu_h(t) \right\Vert _{W^{1,{p^*}}}&\le \bigl \Vert (\textrm{Id}- {\mathcal {L}}_h{\mathcal {L}}_h^{V*}) u(t) \bigr \Vert _{W^{1,{p^*}}} + C \left\Vert \textrm{A}_he_h(t) \right\Vert _{X_h} \le C h^k \,, \end{aligned}$$

and similarly

$$\begin{aligned} \left\Vert \partial _tu(t) - {\mathcal {L}}_hv_h(t) \right\Vert _{H^1}&\le \bigl \Vert (\textrm{Id}- {\mathcal {L}}_h{\mathcal {L}}_h^{V*}) \partial _tu(t) \bigr \Vert _{H^1} + C \left\Vert \textrm{A}_he_h(t) \right\Vert _{X_h} \le C h^k \,, \end{aligned}$$

with a constant C independent of h and \(t^*_h\). Hence, it remains to show \(t^*_h= T\). Combining the bounds in Proposition 3.2 and Lemma 3.3, we show by (3.2) for h sufficiently small that

$$\begin{aligned} \left\Vert {\mathcal {L}}_hu_h(t^*_h) \right\Vert _{L^\infty (\varOmega )}&\le \left\Vert u(t^*_h) \right\Vert _{L^\infty (\varOmega )} + C h^k< {\widehat{r}}_\infty \,, \\ \left\Vert {\mathcal {L}}_hu_h(t^*_h) \right\Vert _{W^{1,N+\delta }(\varOmega )}&\le \left\Vert u(t^*_h) \right\Vert _{W^{1,N+\delta }(\varOmega )} + C h^k < r'_\infty \,, \\ \end{aligned}$$

as well as with Proposition 3.2 and Lemma 3.4

$$\begin{aligned} \left\Vert {\mathcal {L}}_hv_h(t^*_h) \right\Vert _{W^{1,N+\delta }(\varOmega )}&\le \left\Vert \partial _tu(t^*_h) \right\Vert _{W^{1,N+\delta }(\varOmega )} + C h^{k -N/{p^*}} < r'_\infty . \end{aligned}$$

Thus, the continuity of the discrete solution \(y_h\) and the equivalence of all norms in finite-dimensional spaces yields \(t^*_h\ge T\). In particular, the statement of Theorem 2.5 is true for \(t\in [0,T]\). \(\square \)

3.2 Proof of Proposition 3.2

The rest of this section is devoted to the proof of Proposition 3.2. The first step is to show that the set defined in (3.3) is not empty.

Proposition 3.5

The initial error satisfies

$$\begin{aligned} \left\Vert \textrm{A}_he_h(0) \right\Vert _{X_h}&\le C h^k , \end{aligned}$$

where C is independent of h. In particular, it holds \(0 < t^*_h\le T\).

Proof

The bound directly follows from the choice (2.12), the interpolation properties in (2.22), and the bounds in Proposition 2.4. To show that \(t^*_h>0\), we proceed as in the proof of Theorem 2.5 with \(t=0\) instead of \(t = t^*_h\). \(\square \)

With the aid of Lemma 3.1, we are able to define with \(\varLambda _h(y_h)\) from (2.7) the state-dependent inner products

$$\begin{aligned} \left( \varphi _h \mid \psi _h \right) _{\varLambda _h,t}&{:}{=}\left( \varLambda _h(y_h(t)) \varphi _h \mid \psi _h \right) _{X_h},&t \in [0,t^*_h], \, \varphi _h, \psi _h&\in X_h. \end{aligned}$$

The corresponding norm is equivalent to the norm of \(X_h\), i.e., we have

$$\begin{aligned} c_{\varLambda _h} \left\Vert \varphi _h \right\Vert _{X_h}&\le \left\Vert \varphi _h \right\Vert _{\varLambda _h,t} \le C_{\varLambda _h} \left\Vert \varphi _h \right\Vert _{X_h},&t \in [0,t^*_h], \, \varphi _h&\in X_h, \end{aligned}$$
(3.7)

with the constants from Lemma 3.1.

3.3 Error Equation

We study the bound on the discrete error \(e_h\) and derive an evolution equation for it. Inserting the projected solution \(J_hy\) of (2.1) in (2.10), we obtain

$$\begin{aligned} \varLambda _h(y_h(t)) J_h\partial _ty(t)&= \textrm{A}_hJ_hy(t) + G_h(t, J_hy) \\&\quad + \bigl ( \varLambda _h(y_h(t)) - \varLambda _h( J_hy(t)) \bigr ) J_h\partial _ty(t) + \delta _{h}(t) \end{aligned}$$

with defect

$$\begin{aligned} \begin{aligned} \delta _{h}(t)&= \bigl ( \varLambda _h(J_hy(t)) J_h- J_h\varLambda (y(t)) \bigr ) \partial _ty(t) \\&\quad + \bigl (J_h\textrm{A}- \textrm{A}_hJ_h\bigr ) y(t) + \bigl ( J_hG(t, y) - G_h(t, J_hy) \bigr ). \end{aligned} \end{aligned}$$
(3.8)

This leads us to the error equation

$$\begin{aligned} \begin{aligned} \varLambda _h(y_h(t)) \partial _te_h(t)&= \textrm{A}_he_h(t) + \varGamma _h(t) + \delta _{h}(t), \end{aligned} \end{aligned}$$
(3.9)

where the stability term is given by

$$\begin{aligned} \varGamma _h(t)&{:}{=}\bigl ( G_h(t, J_hy(t) ) - G_h(t, y_h(t) ) \bigr )\nonumber \\&\quad + \bigl ( \varLambda _h(y_h(t))- \varLambda _h( J_hy(t)) \bigr ) J_h\partial _ty(t) . \end{aligned}$$
(3.10)

Remark 3.6

Let us explain the main differences to the error analysis presented by Maier and Hochbruck [21, 31] and Makridakis [32]. In [21, 31], instead of \(\varLambda _h(y_h)\) they use \(\varLambda _h( I_hy)\) which has the properties from Lemma 3.1. However, to bound the stability term, inverse inequalities are used which induce restrictions on the polynomial degree and also the CFL-type condition. Our technique is more related to [32], where bounds on \(\left\Vert u_h \right\Vert _{W^{1,\infty }}\) replace (3.3).

However, in both approaches the error analysis is performed in \(H^1\times L^2\). They thus have to impose stronger CFL-type conditions to close the argument.\(\lozenge \)

We introduce the state-dependent operator

$$\begin{aligned} {\textbf{A}}_h(t) = \varLambda _h^{-1}{(y_h(t))} \textrm{A}_h\end{aligned}$$

and define the modified error as

$$\begin{aligned} {\widetilde{e}}_h(t) {:}{=}{\textbf{A}}_h(t) e_h(t). \end{aligned}$$

Differentiating the term \(\varLambda _h(y_h(t)) {\widetilde{e}}_h(t)\) and using (3.9) lead to the following modified error equation

$$\begin{aligned} \begin{aligned} \varLambda _h(y_h(t)) \partial _t{\widetilde{e}}_h(t)&= \textrm{A}_h{\widetilde{e}}_h(t) - \bigl ( \partial _t\varLambda _h(y_h(t)) \bigr ) {\widetilde{e}}_h(t) \\&\quad + \textrm{A}_h\varLambda _h^{-1}{(y_h(t))} \bigl (\varGamma _h(t) + \delta _{h}(t) \bigr ). \end{aligned} \end{aligned}$$
(3.11)

We state two results on the stability term and the defect and postpone their proofs to Sect. 5.

Lemma 3.7

For \(0\le t\le t^*_h\), it holds

$$\begin{aligned} \left\Vert \textrm{A}_h\varGamma _h(t) \right\Vert _{X_h} \le C \left\Vert \textrm{A}_he_h(t) \right\Vert _{X_h} \end{aligned}$$

with a constant C independent of h and \(t^*_h\).

Similarly, we show the optimal error bound of the defect in the stronger norm.

Lemma 3.8

For \(0\le t\le t^*_h\) it holds

$$\begin{aligned} \left\Vert \textrm{A}_h\delta _{h}(t) \right\Vert _{X_h} \le C h^k \end{aligned}$$

with a constant C independent of h and \(t^*_h\).

In addition, we note that by (2.7) and Lemma 3.1 there is a constant C independent of h and \(t^*_h\) such that for all \(x_h \in X_h\) it holds

$$\begin{aligned} \left\Vert \textrm{A}_h\varLambda _h(y_h(t)) x_h \right\Vert _{X_h} \le C \left\Vert \textrm{A}_hx_h \right\Vert _{X_h}, \quad 0\le t\le t^*_h. \end{aligned}$$
(3.12)

With these two lemmas and the bound on the initial error in Proposition 3.5, we conclude the remaining estimate.

Proof

(of Proposition 3.2) We first compute

$$\begin{aligned} \partial _t \left\Vert {\widetilde{e}}_h(t) \right\Vert _{\varLambda _h,t}^2 = \left( \bigl ( \partial _t\varLambda _h(y_h(t)) \bigr ) {\widetilde{e}}_h(t) \mid {\widetilde{e}}_h(t) \right) _{X_h} + 2 \left( \varLambda _h(y_h(t)) \partial _t{\widetilde{e}}_h(t) \mid {\widetilde{e}}_h(t) \right) _{X_h}. \end{aligned}$$

Inserting the error equation (3.11), we use the skew-symmetry of \(\textrm{A}_h\) and combine the bounds in (3.12) and Lemmas 3.1, 3.7, and 3.8 to obtain

$$\begin{aligned} \partial _t \left\Vert {\widetilde{e}}_h(t) \right\Vert _{\varLambda _h,t}^2&\le C \left\Vert {\widetilde{e}}_h(t) \right\Vert _{\varLambda _h,t}^2 + C h^{2 k}. \end{aligned}$$

The application of a Gronwall lemma together with Proposition 3.5 and (3.7) then yields the assertion. \(\square \)

4 Error Analysis for the Full Discretization

We carry over the results and techniques established in the last section to the fully discrete schemes. We work with a discrete analogous of (3.3) given by a final time step \({n^*}\) which allows us to perform the next time step to \(t^{{n^*}+1}\), that is

$$\begin{aligned} {n^*}&{:}{=}\max \Bigl \{ 0 \le n \le N - 1 \mid \max _{ k=0,\dots ,n } \left\Vert {\mathcal {L}}_hu_h^{k} \right\Vert _{L^\infty } \le {\widehat{r}}_\infty , \text { and }\nonumber \\&\quad \max _{ k=0,\dots ,n } \max \bigl \{ \left\Vert {\mathcal {L}}_hu_h^{k} \right\Vert _{W^{1,N+\delta }}, \left\Vert {\mathcal {L}}_hv_h^{k} \right\Vert _{W^{1,N+\delta }}, \left\Vert {\mathcal {L}}_h\partial _{\tau }u_h^{k} \right\Vert _{W^{1,N+\delta }} \bigr \} \le r'_\infty \Bigr \}.\nonumber \\ \end{aligned}$$
(4.1)

In particular, we will establish \({n^*}\ge N-1\). Note that by (2.16) formally, we have to show that \({n^*}\ge 1\) for the last term in (4.1), which can be interpreted as providing both the base cases \(n=0,1\) in the induction. However, the case \(n=0\) is already covered by Proposition 3.2, such that the set in (4.1) is not empty, and it holds \({n^*}\ge 0\).

Further, note that similar to Lemma 3.1 we conclude from the bounds in (4.1) that for \(0 \le n \le {n^*}\), \(1 \le p \le \infty \), and \(j =0,1\) it holds

$$\begin{aligned} \bigl \Vert \partial _{\tau }^j \lambda _h(u_h^{n}) \varphi _h \bigr \Vert _{L^{p}}&\le C_{\lambda } \left\Vert \varphi _h \right\Vert _{L^{p}},&\bigl \Vert \partial _{\tau }^j \lambda _h^{-1}{{(u_h^{n})}} \varphi _h \bigr \Vert _{L^{p}}&\le C_{\lambda } \left\Vert \varphi _h \right\Vert _{L^{p}}, \end{aligned}$$
(4.2)

and the bounds in Lemma 3.1 in the \(H^1\)-norm remain valid.

Throughout this section, we employ several times the estimate from Lemma 3.3, and also a straightforward extension of Lemma 3.4 including the temporal convergence rate.

Lemma 4.1

Let \(\varphi _h \in V_h\) and \(\varphi \in W^{k+1,\infty }(\varOmega ) \cap V\), and assume that for some \(\ell \in \{1,2\}\) it holds

$$\begin{aligned} \bigl \Vert {\mathcal {L}}_h^{V*}\varphi - \varphi _h \bigr \Vert _{H^1(\varOmega _h)}&\le C \bigl (\tau ^\ell + h^k \bigr ) . \end{aligned}$$

Then, we have

$$\begin{aligned} \left\Vert {\mathcal {L}}_h\varphi _h \right\Vert _{L^\infty (\varOmega )}&\le \left\Vert \varphi \right\Vert _{L^\infty (\varOmega )} + C h^{ -N/{p^*}} \bigl (\tau ^\ell + h^k \bigr ) , \\ \left\Vert {\mathcal {L}}_h\varphi _h \right\Vert _{W^{1,N+\delta }(\varOmega )}&\le \left\Vert \varphi \right\Vert _{W^{1,N+\delta }(\varOmega )} + C h^{ -N/{p^*}} \bigl (\tau ^\ell + h^k \bigr ) , \end{aligned}$$

with a constant C independent of h and \(\tau \).

4.1 Euler

First note that for the Euler method (2.17), we have by construction \(\partial _{\tau }u_h^{k} = v_h^{k}\) such that it is sufficient to check the first three conditions. As above, we define the discrete error by \(e_h^{n} = J_hy(t^{n}) - y_h^{n}\) and aim to show as in Proposition 3.2 the following bound.

Proposition 4.2

Under the assumptions of Theorem 2.6, for \(0\le n \le {n^*}+ 1\) it holds the bound

$$\begin{aligned} \bigl \Vert \textrm{A}_he_h^{n} \bigr \Vert _{X_h} \le C \bigl (\tau + h^k \bigr ) , \end{aligned}$$

where C is independent of h, \(\tau \) and \({n^*}\).

As in the spatially discrete case, this estimate allows us to immediately conclude our main result.

Proof

(of Theorem 2.6) We proceed along the lines of the proof of Theorem 2.5 to conclude the convergence up to \(t^{{n^*}+1}\). In addition, Lemma 3.3 and the CFL-type condition (2.18) together with Lemma 4.1 for \(\ell =1\) further allow us to prove \({n^*}\ge N-1\) for \(h,\tau \) sufficiently small, and the assertion is shown for all n. \(\square \)

The rest of this section is devoted to the proof of Proposition 4.2. In order to derive the error equation, we insert the projected exact solution \(J_hy\) of (2.1) in the scheme (2.17) and derive

$$\begin{aligned} \varLambda _h(y_h^{n}) J_h\partial _{\tau }y(t^{n+1})&= \textrm{A}_hJ_hy(t^{n+1}) + G_h(t^{n}, J_hy(t^{n}) ) \\&\quad + \bigl ( \varLambda _h(y_h^{n}) - \varLambda _h( J_hy(t^{n}) ) \bigr ) J_h\partial _{\tau }y(t^{n+1}) + \delta _{\text {Eu}}^{n+1} \end{aligned}$$

with defect \(\delta _{\text {Eu}}^{n+1} = \delta _{h,\text {Eu}}^{n+1} + \delta _{\tau ,\text {Eu}}^{n+1} \) given by

$$\begin{aligned} \delta _{h,\text {Eu}}^{n+1}&= \bigl (J_h\textrm{A}- \textrm{A}_hJ_h\bigr ) y(t^{n+1}) + J_hG(t^{n}, y(t^{n}) ) - G_h(t^{n}, J_hy(t^{n}) ) \nonumber \\&\quad + \bigl ( \varLambda _h( J_hy(t^{n}) ) J_h- J_h\varLambda ( y(t^{n}) ) \bigr ) \partial _{\tau }y(t^{n+1}) , \end{aligned}$$
(4.3a)
$$\begin{aligned} \delta _{\tau ,\text {Eu}}^{n+1}&= J_h\varLambda ( y(t^{n}) ) \partial _{\tau }y(t^{n+1}) - J_h\varLambda ( y(t^{n+1}) ) \partial _ty(t^{n+1}) \nonumber \\&\quad + J_hG(t^{n+1}, y(t^{n+1}) ) - J_hG(t^{n}, y(t^{n}) ) . \end{aligned}$$
(4.3b)

This yields the discrete error equation

$$\begin{aligned} \begin{aligned} \varLambda _h(y_h^{n}) \partial _{\tau }e_h^{n+1}&= \textrm{A}_he_h^{n+1} + \varGamma _h^n + \delta _{\text {Eu}}^{n+1}, \end{aligned} \end{aligned}$$
(4.4)

where the stability term is given by

$$\begin{aligned} \varGamma _h^{n}&{:}{=}&\bigl ( G_h(t^{n}, J_hy(t^{n})) - G_h(t^{n}, y_h) \bigr ) \nonumber \\{} & {} + \bigl ( \varLambda _h(y_h^{n} ) - \varLambda _h( J_hy(t^{n})) \bigr ) J_h\partial _{\tau }y(t^{n+1}). \end{aligned}$$
(4.5)

In order to obtain a recursion for \(e_h^{n+1}\), we recall the state-dependent operator and define the corresponding resolvent

$$\begin{aligned} {\textbf{A}}_h^{\hspace{-0.3mm}n} = \varLambda _h^{-1}(y_h^{n}) \textrm{A}_h, \qquad {\textbf{R}}_{\text {Eu},n} {:}{=}\bigl ( I - \tau {\textbf{A}}_h^{\hspace{-0.3mm}n} \bigr )^{-1}. \end{aligned}$$
(4.6)

A simple calculation shows that for the inner product

$$\begin{aligned} \left( \varphi _h \mid \psi _h \right) _{n} {:}{=}\left( \varLambda _h(y_h^{n}) \varphi _h \mid \psi _h \right) _{X_h}, \qquad \varphi _h, \psi _h \in X_h, \end{aligned}$$

which satisfies by (4.1) the same bounds as in (3.7), we obtain

$$\begin{aligned} \left\Vert {\textbf{R}}_{\text {Eu},n} \varphi _h \right\Vert _{n} \le \left\Vert \varphi _h \right\Vert _{n}, \end{aligned}$$

and rewrite (4.4) as

$$\begin{aligned} e_h^{n+1} = {\textbf{R}}_{\text {Eu},n} e_h^{n} + \tau {\textbf{R}}_{\text {Eu},n} \varLambda _h^{-1}(y_h^{n}) \bigl ( \varGamma _h^n + \delta _{\text {Eu}}^{n+1} \bigr ). \end{aligned}$$

Since \( {\textbf{A}}_h^{\hspace{-0.3mm}n}\) commutes with \({\textbf{R}}_{\text {Eu},n}\), we obtain

$$\begin{aligned} \begin{aligned} {\textbf{A}}_h^{\hspace{-0.3mm}n} e_h^{n+1}&= {\textbf{R}}_{\text {Eu},n} {\textbf{A}}_h^{\hspace{-0.3mm}n} e_h^{n} + \tau {\textbf{R}}_{\text {Eu},n} {\textbf{A}}_h^{\hspace{-0.3mm}n} \varLambda _h^{-1} (y_h^{n}) \bigl ( \varGamma _h^n + \delta _{\text {Eu}}^{n+1} \bigr ) \end{aligned} \end{aligned}$$

which has to be resolved. Proceeding as in Lemma 3.7, and noting that for any norm it holds

$$\begin{aligned} \left\Vert \partial _{\tau }y(t^{n}) \right\Vert _{} \le \max \limits _{t\in [t^{n-1}, t^{n}]} \left\Vert \partial _ty(t) \right\Vert _{}, \end{aligned}$$
(4.7)

we have for \(0\le n\le {n^*}\) the stability bound

$$\begin{aligned} \left\Vert \textrm{A}_h\varGamma _h^{n} \right\Vert _{X_h} \le C \left\Vert \textrm{A}_he_h^{n} \right\Vert _{X_h} \end{aligned}$$

with a constant C independent of h, \(\tau \) and \({n^*}\). Similarly, we show the optimal consistency error of the defect in the stronger norm, see Sect. 5 for the proof.

Lemma 4.3

For \(0\le n\le {n^*}\), it holds

$$\begin{aligned} \bigl \Vert \textrm{A}_h\delta _{\text {Eu}}^{n+1} \bigr \Vert _{X_h} \le C \bigl ( \tau + h^k \bigr ) \end{aligned}$$

with a constant C independent of h, \(\tau \) and \({n^*}\).

Hence, we have already established the estimate

$$\begin{aligned} \bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n} e_h^{n+1} \bigr \Vert _{n} \le \left\Vert {\textbf{A}}_h^{\hspace{-0.3mm}n} e_h^{n} \right\Vert _{n} + C \tau \left\Vert \textrm{A}_he_h^{n} \right\Vert _{X_h} + C \tau \bigl ( \tau + h^k \bigr ), \end{aligned}$$
(4.8)

and the last step toward the main result is to change the norms.

Lemma 4.4

For \(1\le n\le {n^*}\), it holds for all \(\varphi _h \in V_h\)

$$\begin{aligned} \left\Vert {\textbf{A}}_h^{\hspace{-0.3mm}n} \varphi _h \right\Vert _{n} \le (1 + C \tau ) \left\Vert {\textbf{A}}_h^{\hspace{-0.3mm}n-1} \varphi _h \right\Vert _{n-1} \end{aligned}$$

with a constant C independent of h, \(\tau \) and \({n^*}\).

Proof

Expanding the norm as

$$\begin{aligned} \left\Vert {\textbf{A}}_h^{\hspace{-0.3mm}n} \varphi _h \right\Vert _{n}^2&= \bigl ( \textrm{A}_h\varphi _h \mid \varLambda _h^{-1}(y_h^{n}) \textrm{A}_h\varphi _h \bigr )_{X_h} \\&= \bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n-1} \varphi _h \bigr \Vert _{n-1}^2 + \tau \bigl ( \textrm{A}_h\varphi _h \mid \partial _{\tau }\varLambda _h^{-1} (y_h^{n}) \textrm{A}_h\varphi _h \bigr )_{X_h} \end{aligned}$$

and using (4.2) several times give the assertion. \(\square \)

With this, we are able to proof the estimate on \(\textrm{A}_he_h^{n+1}\).

Proof

(of Proposition 4.2)

We first consider the case \({n^*}= 0\). Hence, (4.8) with \(n=0\) directly yields the assertion without the use of Lemma 4.4 and hence without any bound on \(\partial _{\tau }u_h^{k}\). With this, we established \({n^*}\ge 1\).

In the case \({n^*}\ge 1\), we employ Lemma 4.4 in (4.8) and make use of the norm equivalences to obtain

$$\begin{aligned} \bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n} e_h^{n+1} \bigr \Vert _{n} \le (1+C \tau ) \bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n-1} e_h^{n} \bigr \Vert _{n-1} + C \tau \bigl ( \tau + h^k \bigr ). \end{aligned}$$

Resolving the recursion and using Proposition 3.5 yields the result. \(\square \)

4.2 Midpoint

The proof is very similar to the Euler method and hence, we only sketch the relevant details. First note that by construction in (2.19) it holds

$$\begin{aligned} \partial _{\tau }u_h^{k} = \tfrac{1}{2} ( v_h^{k} + v_h^{k-1} ), \end{aligned}$$

such that the last bound in (4.1) does not have to be shown separately. Again, we aim at the following bound.

Proposition 4.5

Under the assumptions of Theorem 2.7, for \(0\le n \le {n^*}+ 1\) it holds the bound

$$\begin{aligned} \bigl \Vert \textrm{A}_he_h^{n} \bigr \Vert _{X_h} \le C \bigl (\tau ^2 + h^k \bigr ) , \end{aligned}$$

where C is independent of h, \(\tau \) and \({n^*}\).

Combining Lemma 4.1 with the weaker CFL-type condition (2.20) yields the convergence result.

Proof

(of Theorem 2.7) As in the proof of Theorem 2.6, the convergence follows directly. To show that \({n^*}\ge N-1\), we employ Lemma 4.1 with \(\ell =2\) together with the CFL-type condition (2.20). \(\square \)

Hence, it remains to show Proposition 4.5. As for the Euler method, we derive the following error equation

$$\begin{aligned} \begin{aligned} \varLambda _h({\bar{y}}_h^{n + 1/2}) \partial _{\tau }e_h^{n+1}&= \textrm{A}_he_h^{n+1/2} + \varGamma _h^n + \delta _{\text {M}}^{n+1}, \end{aligned} \end{aligned}$$
(4.9)

with a stability term similar to the one in (4.5) satisfying

$$\begin{aligned} \left\Vert \textrm{A}_h\varGamma _h^n \right\Vert _{X_h} \le C \bigl ( \left\Vert \textrm{A}_he_h^{n} \right\Vert _{X_h} + \bigl \Vert \textrm{A}_he_h^{n-1} \bigr \Vert _{X_h} \bigr ), \end{aligned}$$
(4.10)

and a composed defect \(\delta _{\text {M}}^{n+1} = \delta _{h,\text {M}}^{n+1} + \delta _{\tau ,\text {M}}^{n+1}\). The first component is basically the same as \(\delta _{h,\text {Eu}}^{n+1}\) in (4.3b), and the second satisfies

$$\begin{aligned} \begin{aligned} \delta _{\tau ,\text {M}}^{n+1}&= J_h\varLambda ( {\bar{y}}^{n + 1/2} ) \partial _{\tau }y(t^{n+1}) - J_h\varLambda ( y(t^{n+1/2}) ) \partial _ty(t^{n+1/2}) \\&\quad + J_h\textrm{A}\bigl ( y(t^{n+1/2}) - \tfrac{1}{2} ( y(t^{n+1}) + y(t^{n})) \bigr ), \\&\quad + J_hG(t^{n+1/2}, y(t^{n+1/2}) ) - J_hG(t^{n+1/2}, {\bar{y}}^{n + 1/2} ), \end{aligned} \end{aligned}$$
(4.11)

such that we derive in Sect. 5 the desired order of convergence.

Lemma 4.6

For \(0\le n\le {n^*}\), it holds

$$\begin{aligned} \bigl \Vert \textrm{A}_h\delta _{\text {M}}^{n+1} \bigr \Vert _{X_h} \le C \bigl ( \tau ^2 + h^k \bigr ) \end{aligned}$$

with a constant C independent of h, \(\tau \) and \({n^*}\).

We solve for \(e_h^{n+1}\) in the error equation (4.9) and define for the solution-dependent operator \({\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} = \varLambda _h^{-1}({\bar{y}}^{n + 1/2} ) \textrm{A}_h\) the maps

$$\begin{aligned} {\mathcal {R}}_{\pm ,n+1/2} {:}{=}I \pm \tfrac{\tau }{2} {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2}, \qquad {\textbf{R}}_{\text {m},n+1/2} {:}{=}{\mathcal {R}}_{-,n+1/2}^{-1} {\mathcal {R}}_{+,n+1/2}. \end{aligned}$$

A simple calculation shows that for the inner product

$$\begin{aligned} \left( \varphi _n \mid \psi _h \right) _{n+1/2} {:}{=}\bigl ( \varLambda _h({\bar{y}}^{n + 1/2} ) \varphi _n \mid \psi _h \bigr )_{X_h}, \qquad \varphi _n, \psi _h \in X_h, \end{aligned}$$

we have

$$\begin{aligned} \bigl \Vert {\mathcal {R}}_{-,n+1/2}^{-1} \varphi _h \bigr \Vert _{n+1/2} \le \left\Vert \varphi _h \right\Vert _{n+1/2}, \qquad \left\Vert {\textbf{R}}_{\text {m},n+1/2} \varphi _h \right\Vert _{n+1/2} = \left\Vert \varphi _h \right\Vert _{n+1/2}. \end{aligned}$$

Rewriting (4.9) and multiplying by \({\textbf{A}}_h^{\hspace{-0.3mm}n+1/2}\), we obtain

$$\begin{aligned} \begin{aligned} {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} e_h^{n+1}&= {\textbf{R}}_{\text {m},n+1/2} {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} e_h^{n} \\&\quad + \tau {\mathcal {R}}_{-,n+1/2}^{-1} {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} \varLambda _h^{-1}({\bar{y}}^{n + 1/2} ) \bigl ( \varGamma _h^n + \delta _{\text {M}}^{n+1} \bigr ). \end{aligned} \end{aligned}$$
(4.12)

Finally, we have as in Lemma 4.4 the following bound when changing the norm.

Lemma 4.7

For \(1\le n\le {n^*}\) it holds

$$\begin{aligned} \bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} \varphi _h \bigr \Vert _{n+1/2} \le (1 + C \tau ) \bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n-1/2} \varphi _h \bigr \Vert _{n-1/2} \end{aligned}$$

with a constant C independent of h, \(\tau \) and \({n^*}\).

Proof

First note that

$$\begin{aligned} \bigl \Vert \partial _{\tau }( \tfrac{3}{2} u_h^{k} - \tfrac{1}{2} u_h^{k-1} ) \bigr \Vert _{L^\infty }&\le \tfrac{3}{2} \bigl \Vert \partial _{\tau }u_h^{k} \bigr \Vert _{L^\infty } + \tfrac{1}{2} \bigl \Vert \partial _{\tau }u_h^{k-1} \bigr \Vert _{L^\infty } , \end{aligned}$$

as well as

$$\begin{aligned} \bigl \Vert \tfrac{3}{2} u_h^{k} - \tfrac{1}{2} u_h^{k-1} \bigr \Vert _{L^\infty }&= \bigl \Vert u_h^{k} + \tfrac{\tau }{2} \partial _{\tau }u_h^{k} \bigr \Vert _{L^\infty } . \end{aligned}$$

This allows us to proceed as in Lemma 4.4 and to bound \(\partial _{\tau }^j \varLambda _h({\bar{y}}^{n + 1/2} )\), \(j=0,1\), and the inverse \(\varLambda _h^{-1}({\bar{y}}^{n + 1/2} )\). \(\square \)

We are then able to conclude the error bound for the midpoint rule.

Proof

(of Proposition 4.5) Using the error equation (4.12), we employ Lemmas 4.6 and 4.7, and (4.10) to obtain

$$\begin{aligned} \bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} e_h^{n+1} \bigr \Vert _{n+1/2}&\le (1 + C \tau ) \bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n-1/2} e_h^{n} \bigr \Vert _{n-1/2} \\&\quad + C \tau \bigl ( \bigl \Vert \textrm{A}_he_h^{n} \bigr \Vert _{X_h} + \bigl \Vert \textrm{A}_he_h^{n-1} \bigr \Vert _{X_h} \bigr ) + C \tau \bigl ( \tau ^2 + h^k \bigr ). \end{aligned}$$

With the bound on \(\textrm{A}_he_h^{0}\) from Proposition 3.5 and using the fact the first step is given by the Euler method, we obtain with Proposition 4.2

$$\begin{aligned} \bigl \Vert \textrm{A}_he_h^{1} \bigr \Vert _{X_h} \le C \tau \bigl ( \tau + h^k \bigr ), \end{aligned}$$

which yields by a Gronwall lemma the assertion. \(\square \)

4.3 Exponential Euler

For the exponential method, we apply a similar approach and derive the necessary bound in the stronger energy norm. However, there is no direct relation to the discrete derivatives of the error. In this case, we have to prove an additional error estimate.

Proposition 4.8

Under the assumptions of Theorem 2.8, for \(0\le n \le {n^*}+ 1\) there hold the bounds

$$\begin{aligned} \bigl \Vert \textrm{A}_he_h^{n} \bigr \Vert _{X_h}&\le C \bigl (\tau + h^k \bigr ), \end{aligned}$$

and for \(1\le n \le {n^*}+ 1\)

$$\begin{aligned} \bigl \Vert \partial _{\tau }e_h^{n} \bigr \Vert _{X_h}&\le C \bigl (\tau + h^k \bigr ), \end{aligned}$$

where C is independent of h, \(\tau \), and \({n^*}\).

Once this is established, the last main result directly follows.

Proof

(of Theorem 2.8) In order to conclude the convergence rates, we only employ the first estimate in Proposition 4.8. To show \({n^*}\ge N-1\), again the first estimate allows us to guarantee the first three bounds in (4.1). The bound on \({\mathcal {L}}_h\partial _{\tau }u_h^{k}\) follows from the second estimate in Proposition 4.8 combined with Lemma 4.1 and the CFL-type condition (2.18). \(\square \)

The rest of this section is devoted to the proof of Proposition 4.8. We introduce the auxiliary approximation \({\tilde{y}}^{n}(t^{n} + s)\) for \(s\in [0,\tau ]\) as the solution of

$$\begin{aligned} \varLambda _h(y_h^{n}) \partial _t{\tilde{y}}^{n}(t^{n} + s)&= \textrm{A}_h{\tilde{y}}^{n} (t^{n} + s) + G_h(t^{n}, y_h^{n} ) ,\quad {\tilde{y}}^{n}(t^{n}) =y_h^{n} \end{aligned}$$
(4.13)

and thus satisfies \({\tilde{y}}^{n}(t^{n} + \tau ) = y_h^{n+1}\). In order to derive the error equation, we insert the projected exact solution \(J_hy\) in (4.13)

$$\begin{aligned} \varLambda _h(y_h^{n}) J_h\partial _ty(t^{n} + s)&= \textrm{A}_hJ_hy(t^{n} + s) + G_h(t^{n}, J_hy(t^{n}) ) \\&\quad + \bigl ( \varLambda _h(y_h^{n}) - \varLambda _h( J_hy(t^{n}) ) \bigr ) J_h\partial _ty(t^{n} + s) + \delta _{\text {ExEu}}^{n+1}(t^{n} + s) \end{aligned}$$

with defect \(\delta _{\text {ExEu}}^{n+1} = \delta _{h,\text {ExEu}}^{n+1} + \delta _{\tau ,\text {ExEu}}^{n+1} \) given by

$$\begin{aligned} \delta _{h,\text {ExEu}}^{n+1}(t^{n} + s)&= \bigl (J_h\textrm{A}- \textrm{A}_hJ_h\bigr ) y(t^{n} + s) + J_hG(t^{n}, y(t^{n}) ) - G_h(t^{n}, J_hy(t^{n}) ) \\&\quad + \bigl ( \varLambda _h( J_hy(t^{n}) ) J_h- J_h\varLambda ( y(t^{n}) ) \bigr ) \partial _ty(t^{n} + s) , \\ \delta _{\tau ,\text {ExEu}}^{n+1}(t^{n} + s)&= J_h\varLambda ( y(t^{n}) ) \partial _ty(t^{n} + s) -< J_h\varLambda ( y(t^{n} + s) \partial _ty(t^{n} + s) \\&\quad + J_hG(t^{n} + s, y(t^{n} + s) ) - J_hG(t^{n}, y(t^{n}) ) . \end{aligned}$$

Similarly, we define the auxiliary error by

$$\begin{aligned} {\tilde{e}}_h^{n}(t^{n} + s) {:}{=}J_hy(t^{n} + s) - {\tilde{y}}^{n}(t^{n} + s) , \quad {\tilde{e}}_h^{n}(t^{n}) = e_h^{n}, \quad {\tilde{e}}_h^{n}(t^{n} + \tau ) = e_h^{n+1} . \end{aligned}$$

This yields the discrete error equation for \(s\in [0,\tau ]\)

$$\begin{aligned} \varLambda _h(y_h^{n}) \partial _t{\tilde{e}}_h^{n} (t^{n} + s)&= \textrm{A}_h{\tilde{e}}_h^{n} (t^{n} + s) + \varGamma _h^{n} (t^{n} + s) + \delta _{\text {ExEu}}^{n+1}(t^{n} + s) \end{aligned}$$
(4.14)

with stability term

$$\begin{aligned} \varGamma _h^{n} (t^{n} + s) {:}{=}\bigl ( G_h(t^{n}, J_hy(t^{n})) - G_h(t^{n}, y_h) \bigr ) + \bigl ( \varLambda _h(y_h^{n} ) - \varLambda _h( J_hy(t^{n})) \bigr ) J_h\partial _ty((t^{n} + s)). \end{aligned}$$

Using the variation-of-constants formula with the state-dependent operator defined in (4.6), we obtain from (4.14)

$$\begin{aligned} {\tilde{e}}_h^{n}(t^{n} + s)&= \textrm{e}^{ s {\textbf{A}}_h^{\hspace{-0.3mm}n} } e_h^{n} + \int _0^s \textrm{e}^{(s-\sigma ) {\textbf{A}}_h^{\hspace{-0.3mm}n} } \varLambda _h^{-1} (y_h^{n})\bigl ( \varGamma _h^n (t^{n} + \sigma ) + \delta _{\text {ExEu}}^{n+1}(t^{n} + \sigma ) \bigr ) \,\mathrm{{d}} \sigma . \end{aligned}$$

To obtain the error bounds stated in Proposition 4.8, we need the following two estimates which follow along the lines of Lemmas 3.7 and 4.3: For \(0\le n \le {n^*}\) it holds

$$\begin{aligned} \sup \limits _{s \in [0,\tau ]} \left\Vert \textrm{A}_h\varGamma _h^n (t^{n} + s) \right\Vert _{X_h}&\le C \left\Vert \textrm{A}_he_h^{n} \right\Vert _{X_h} , \end{aligned}$$
(4.15a)
$$\begin{aligned} \sup \limits _{s \in [0,\tau ]} \bigl \Vert \textrm{A}_h\delta _{\text {ExEu}}^{n+1}(t^{n} + s) \bigr \Vert _{X_h}&\le C \bigl ( \tau + h^k \bigr ) , \end{aligned}$$
(4.15b)

with a constant C independent of h, \(\tau \) and \({n^*}\). This allows us to conclude the bounds in the two stronger norms.

Proof

(of Proposition 4.8) We proceed as in the proof of Proposition 4.2 in order to obtain the bound on \(\textrm{A}_h{\tilde{e}}_h^{n}\) in the form

$$\begin{aligned} \sup \limits _{s \in [0,\tau ]} \bigl \Vert \textrm{A}_h{\tilde{e}}_h^{n}(t^{n} + s) \bigr \Vert _{X_h}&\le C \bigl (\tau + h^k \bigr ) , \end{aligned}$$
(4.16)

which implies the first statement in the proposition. For the discrete derivative of the error, we employ (4.7), (4.14), and (4.15) to conclude

$$\begin{aligned} \bigl \Vert \partial _{\tau }e_h^{n} \bigr \Vert _{X_h}&\le \sup \limits _{s \in [0,\tau ]} \bigl \Vert \partial _t{\tilde{e}}_h^{n}(t^{n} + s) \bigr \Vert _{X_h} \\&\le C \sup \limits _{s \in [0,\tau ]} \bigl \Vert \textrm{A}_h{\tilde{e}}_h^{n}(t^{n} + s) \bigr \Vert _{X_h} + C \left\Vert \textrm{A}_he_h^{n} \right\Vert _{X_h} + C \bigl (\tau + h^k \bigr ) \\&\le C \bigl (\tau + h^k \bigr ) , \end{aligned}$$

where we used (4.16) in the last step. \(\square \)

4.4 Exponential Midpoint Rule

For the exponential midpoint rule, we combine the approaches presented for the semi-implicit midpoint rule and the exponential Euler method. In particular, we have to prove error bounds in the stronger norm as well as for the discrete derivative of the error.

Proposition 4.9

Under the assumptions of Theorem 2.9, for \(0\le n \le {n^*}+ 1\), there hold the bounds

$$\begin{aligned} \bigl \Vert \textrm{A}_he_h^{n} \bigr \Vert _{X_h}&\le C \bigl (\tau ^2 + h^k \bigr ), \end{aligned}$$

and for \(1\le n \le {n^*}+ 1\)

$$\begin{aligned} \bigl \Vert \partial _{\tau }e_h^{n} \bigr \Vert _{X_h}&\le C \bigl (\tau ^2 + h^k \bigr ), \end{aligned}$$

where C is independent of h, \(\tau \) and \({n^*}\).

Once this is established, the last main result directly follows.

Proof

(of Theorem 2.9) We only combine the argument presented in the proofs of Theorems 2.7 and 2.8 to conclude the assertion. \(\square \)

The rest of this section is devoted to the proof of Proposition 4.9. We introduce the auxiliary approximation \({\tilde{y}}^{n}(t^{n} + s)\) for \(s\in [0,\tau ]\) as the solution of

$$\begin{aligned} \varLambda _h({\bar{y}}_h^{n + 1/2}) \partial _t{\tilde{y}}^{n}(t^{n} + s)&= \textrm{A}_h{\tilde{y}}^{n} (t^{n} + s) + G_h(t^{n+1/2}, {\bar{y}}_h^{n + 1/2} ) \end{aligned}$$
(4.17)

with \({\tilde{y}}^{n}(t^{n}) =y_h^{n}\), and thus satisfies \({\tilde{y}}^{n}(t^{n} + \tau ) = y_h^{n+1}\). In order to derive the error equation, we insert the projected exact solution \(J_hy\) in (4.17) and conclude

$$\begin{aligned} \varLambda _h({\bar{y}}_h^{n + 1/2}) J_h\partial _ty(t^{n} + s)&= \textrm{A}_hJ_hy(t^{n} + s) + G_h(t^{n+1/2}, J_h{\bar{y}}^{n + 1/2} ) + \delta _{\text {ExM}}^{n+1}(t^{n} + s) \\&\quad + \bigl ( \varLambda _h({\bar{y}}_h^{n + 1/2}) - \varLambda _h( J_h{\bar{y}}^{n + 1/2} ) \bigr ) J_h\partial _ty(t^{n} + s) \end{aligned}$$

with defect \(\delta _{\text {ExM}}^{n+1} = \delta _{h,\text {ExM}}^{n+1} + \delta _{\tau ,\text {ExM},1}^{n+1} + \delta _{\tau ,\text {ExM},2}^{n+1} \) given by

$$\begin{aligned} \delta _{h,\text {ExM}}^{n+1}(t^{n} + s)&= \bigl (J_h\textrm{A}- \textrm{A}_hJ_h\bigr ) y(t^{n} + s) \\&\quad + J_hG(t^{n+1/2}, {\bar{y}}^{n + 1/2} ) - G_h(t^{n+1/2}, J_h{\bar{y}}^{n + 1/2} ) \\&\quad + \bigl ( \varLambda _h( J_h{\bar{y}}^{n + 1/2}) J_h- J_h\varLambda ( {\bar{y}}^{n + 1/2} ) \bigr ) \partial _ty(t^{n} + s) , \\ \delta _{\tau ,\text {ExM},1}^{n+1}(t^{n} + s)&= J_h\varLambda ( {\bar{y}}^{n + 1/2} ) \partial _ty(t^{n} + s) - J_h\varLambda ( y(t^{n+1/2}) ) \partial _ty(t^{n} + s) \\&\quad + J_hG(t^{n+1/2}, y(t^{n+1/2}) ) - J_hG(t^{n+1/2}, J_h{\bar{y}}^{n + 1/2} ) , \\ \delta _{\tau ,\text {ExM},2}^{n+1}(t^{n} + s)&= J_h\varLambda ( y(t^{n+1/2}) ) \partial _ty(t^{n} + s) - J_h\varLambda ( y(t^{n} + s) ) \partial _ty(t^{n} + s) \\&\quad + J_hG(t^{n} + s, y(t^{n} + s) ) - J_hG(t^{n+1/2}, y(t^{n+1/2}) ) . \end{aligned}$$

Deriving the error equation and using the variation-of-constants formula, we obtain with the state-dependent operator \({\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} = \varLambda _h^{-1}({\bar{y}}_h^{n + 1/2} ) \textrm{A}_h\)

$$\begin{aligned} e_h^{n+1}&= \textrm{e}^{ \tau {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2}} e_h^{n} + \int _0^\tau \textrm{e}^{(\tau -\sigma ) {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} } \varLambda _h^{-1}({\bar{y}}_h^{n + 1/2}) \bigl ( \varGamma _h^n (t^{n} + \sigma ) + \delta _{\text {ExM}}^{n+1}(t^{n} + \sigma ) \bigr ) \,\mathrm{{d}} \sigma \end{aligned}$$

with stability term

$$\begin{aligned} \varGamma _h^{n} (t^{n} + s)&{:}{=}G_h(t^{n+1/2}, J_h{\bar{y}}^{n + 1/2} ) - G_h(t^{n+1/2}, {\bar{y}}_h^{n + 1/2} ) \\&\quad + \bigl ( \varLambda _h({\bar{y}}_h^{n + 1/2}) - \varLambda _h( J_h{\bar{y}}^{n + 1/2} ) \bigr ) J_h\partial _ty(t^{n} + s) . \end{aligned}$$

Unlike for the exponential Euler method, one has to pay more attention to the derivation of the error bound on the discrete derivatives. In order to show the error bound, we do not only apply \( {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2}\) to the error equation, but also apply the discrete derivative \(\partial _{\tau }\). In a straightforward manner, one can derive the following auxiliary result.

Lemma 4.10

Under the assumptions of Theorem 2.9, it holds for \(0\le \sigma \le \tau \)

$$\begin{aligned} \bigl \Vert \partial _{\tau }\bigl ( \textrm{e}^{\sigma {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} } \varphi _h^n \bigr ) - \textrm{e}^{\sigma {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} } \partial _{\tau }\varphi _h^n \bigr \Vert _{X_h} \le C \tau \left\Vert \textrm{A}_h\varphi _h^n \right\Vert _{X_h} \end{aligned}$$

with a constant C independent of h, \(\tau \) and \({n^*}\).

This leads to several error terms which appeared above similarly. Along the lines of Lemmas 3.7, 4.3 and 4.6, we immediately conclude the following bounds: For \(0\le n \le {n^*}\), the stability terms are bounded by

$$\begin{aligned} \sup \limits _{s \in [0,\tau ]} \left\Vert \textrm{A}_h\varGamma _h^n (t^{n} + s) \right\Vert _{X_h}&\le C \bigl ( \left\Vert \textrm{A}_he_h^{n} \right\Vert _{X_h} + \bigl \Vert \textrm{A}_he_h^{n-1} \bigr \Vert _{X_h} \bigr ) , \end{aligned}$$
(4.18a)
$$\begin{aligned} \sup \limits _{s \in [0,\tau ]} \left\Vert \partial _{\tau }\varGamma _h^n (t^{n} + s) \right\Vert _{X_h}&\le C \sum _{j \in \{n-1,n\} } \bigl \Vert \partial _{\tau }e_h^{j} \bigr \Vert _{X_h} + \bigl \Vert e_h^{j} \bigr \Vert _{X_h} , \end{aligned}$$
(4.18b)

and further the defects satisfy

$$\begin{aligned} \sup \limits _{s \in [0,\tau ]} \bigl \Vert \textrm{A}_h\delta _{h,\text {ExM}}^{n+1}(t^{n} + s) \bigr \Vert _{X_h} + \sup \limits _{s \in [0,\tau ]} \bigl \Vert \partial _{\tau }\delta _{h,\text {ExM}}^{n+1}(t^{n} + s) \bigr \Vert _{X_h}&\le C h^k , \end{aligned}$$
(4.19a)
$$\begin{aligned} \sup \limits _{s \in [0,\tau ]} \bigl \Vert \textrm{A}_h\delta _{\tau ,\text {ExM},1}^{n+1}(t^{n} + s) \bigr \Vert _{X_h} + \sup \limits _{s \in [0,\tau ]} \bigl \Vert \partial _{\tau }\delta _{\tau ,\text {ExM},1}^{n+1}(t^{n} + s) \bigr \Vert _{X_h}&\le C \tau ^2 , \end{aligned}$$
(4.19b)
$$\begin{aligned} \sup \limits _{s \in [0,\tau ]} \bigl \Vert \textrm{A}_h\delta _{\tau ,\text {ExM},2}^{n+1}(t^{n} + s) \bigr \Vert _{X_h}&\le C \tau , \end{aligned}$$
(4.19c)

with a constant C independent of h, \(\tau \) and \({n^*}\). The main difficulty is to extract the additional order of convergence in the defect \(\delta _{\tau ,\text {ExM},2}^{n}\). We show in Sect. 5 the following lemma.

Lemma 4.11

Under the assumptions of Theorem 2.9, it holds

$$\begin{aligned} \Bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} \int _0^\tau \textrm{e}^{(\tau -\sigma ) {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} } \varLambda _h^{-1}({\bar{y}}_h^{n + 1/2}) \delta _{\tau ,\text {ExM},2}^{n+1}(t^{n} + \sigma ) \bigr ) \,\mathrm{{d}} \sigma \Bigr \Vert _{X_h}&\le C \tau ^3 , \\ \Bigl \Vert \int _0^\tau \textrm{e}^{(\tau -\sigma ) {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} } \varLambda _h^{-1}({\bar{y}}_h^{n + 1/2}) \partial _{\tau }\delta _{\tau ,\text {ExM},2}^{n+1}(t^{n} + \sigma ) \bigr ) \,\mathrm{{d}} \sigma \Bigr \Vert _{X_h}&\le C \tau ^3 , \end{aligned}$$

with a constant C independent of h, \(\tau \), and \({n^*}\).

From this, we conclude the error bounds in Proposition 4.9.

Proof

(of Proposition 4.9) Following the lines of the preceding proofs of Propositions 4.5 and 4.8, one establishes the error bound on \(\left\Vert \textrm{A}_he_h^{n} \right\Vert _{X_h}\). Applying the discrete derivative to the error equation and employing the bounds in Lemma 4.10 combined with (4.18) and (4.19) yield

$$\begin{aligned} \partial _{\tau }e_h^{n+1}&= \textrm{e}^{ \tau {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2}} \partial _{\tau }e_h^{n} + \int _0^\tau \textrm{e}^{(\tau -\sigma ) {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} } \varLambda _h^{-1}({\bar{y}}_h^{n + 1/2}) \partial _{\tau }\delta _{\text {ExM}}^{n+1}(t^{n} + \sigma ) \,\mathrm{{d}} \sigma + \varDelta _h^n , \end{aligned}$$

where the remainder term \(\varDelta _h^n\) satisfies

$$\begin{aligned} \left\Vert \varDelta _h^n \right\Vert _{X_h} \le C \tau \bigl ( \left\Vert \partial _{\tau }e_h^{n} \right\Vert _{X_h} + \bigl \Vert \partial _{\tau }e_h^{n-1} \bigr \Vert _{X_h} \bigr ) + C \tau \bigl ( \tau ^2 + h^k). \end{aligned}$$

Finally, the application of Lemma 4.11 yields the desired estimate on \(\left\Vert \partial _{\tau }e_h^{n} \right\Vert _{X_h}\) and closes the proof. \(\square \)

5 Estimates for Stability Terms and Defects

This section is devoted to the proofs of the postponed stability and consistency estimate from Sects. 3 and 4.

5.1 Stability

In the following, we give a detailed proof for the stability term given in (3.10). We emphasize that the corresponding bounds used in Sect. 4 are derived fully analogously, and we thus refrain from giving the details here.

Proof

(of Lemma 3.7) We consider the two contributions of \(\varGamma _h\) in (3.10) separately.

(a) We first note by (2.6) and (2.7) that

$$\begin{aligned} \textrm{A}_h\bigl ( G_h(t, J_hy) \!-\! G_h(t, y_h) \bigr )\!\! =\!\begin{pmatrix} I_hg(t,{\mathcal {L}}_h{\mathcal {L}}_h^{V*}u, {\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _tu) \!-\! I_hg(t,{\mathcal {L}}_hu_h, {\mathcal {L}}_hv_h) \\ 0 \end{pmatrix}. \end{aligned}$$

Without loss of generality, we show the assertion only for \(g(t, u, \partial _tu) = g(\partial _tu)\) and obtain

$$\begin{aligned}&\bigl \Vert \textrm{A}_h\bigl ( G_h(t, J_hy) - G_h(t, y_h) \bigr ) \bigr \Vert _{X_h} \\&\quad = \, \bigl \Vert I_hg({\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _tu) - I_hg({\mathcal {L}}_hv_h) \bigr \Vert _{V_h} \\&\quad = \, \bigl \Vert I_h\bigl ( \int \limits _0^1 g'( \sigma {\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _tu+ (1-\sigma ) {\mathcal {L}}_hv_h) \,d\sigma \, ( {\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _tu- {\mathcal {L}}_hv_h) \bigr ) \bigr \Vert _{V_h} \\&\quad \lesssim \, \bigl \Vert \int \limits _0^1 g'( \sigma {\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _tu+ (1-\sigma ) {\mathcal {L}}_hv_h) \,d\sigma \bigr \Vert _{W^{1,N+ \delta }} \bigl \Vert {\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _tu- {\mathcal {L}}_hv_h \bigr \Vert _{V_h} , \end{aligned}$$

where we used Lemma 2.10 for the last estimate. The latter term is estimated with (3.4) by \(\textrm{A}_he_h\) in the \(X_h\)-norm. For the integral part, we use the stability of \({\mathcal {L}}_h{\mathcal {L}}_h^{V*}\) in (2.24) with \(\ell =0\) to bound it by a constant depending on the \(W^{1,N+\delta }\)-norms of \(\partial _tu\) and \(v_h\). Hence, the bounds in (3.3) yield the stability for \(G_h\).

(b) Next, we consider by (2.6) and (2.7)

$$\begin{aligned} \textrm{A}_h\bigl ( \varLambda _h( J_hy(t)) \! -\! \varLambda _h(y_h(t)) J_h\partial _ty(t)\!\! =\! \begin{pmatrix} \pi _h\Bigl ( I_h\bigl ( \lambda ({\mathcal {L}}_h{\mathcal {L}}_h^{V*}u) - \lambda ({\mathcal {L}}_hu_h) \bigr ) {\mathcal {L}}_h^{V*}\partial _t^2 u\Bigr ) \\ 0 \end{pmatrix}, \end{aligned}$$

and estimate with the stability of the \(L^2\)-projection in (2.26) and of the interpolation (2.22), the algebra property of \(W^{1,N+\delta }\), and the stability of \({\mathcal {L}}_h^{V*}\) in (2.24)

$$\begin{aligned}&\left\Vert \textrm{A}_h\bigl ( \varLambda _h( J_hy(t)) - \varLambda _h(y_h(t)) J_h\partial _ty(t) \right\Vert _{X_h} \\&\quad \le C \bigl \Vert \lambda ({\mathcal {L}}_h{\mathcal {L}}_h^{V*}u) - \lambda ({\mathcal {L}}_hu_h) \bigr \Vert _{W^{1,N+\delta }} \bigl \Vert {\mathcal {L}}_h^{V*}\partial _t^2 u \bigr \Vert _{W^{1,N+\delta }} \\&\quad \le C \bigl \Vert {\mathcal {L}}_h^{V*}u- u_h \bigr \Vert _{W^{1,N+\delta }} \bigl \Vert \partial _t^2 u \bigr \Vert _{W^{1,N+\delta }}, \end{aligned}$$

with a constant depending on the \(W^{1,N+\delta }\)-norms of \(u\) and \(u_h\). Using Lemma 3.3 and (3.4), the first term is bounded by \(\textrm{A}_he_h\) in the \(X_h\)-norm. \(\square \)

5.2 Defects

We first estimate the spatial defect (3.8) which will reappear in a modified form in the defects of the full discretization.

Proof

(of Lemma 3.8)

  1. (a)

    We compute with (2.27) and (2.28)

    $$\begin{aligned} \textrm{A}_h\bigl (J_h\textrm{A}- \textrm{A}_hJ_h\bigr ) y(t) = \textrm{A}_h\bigl (J_h- {\varvec{{\mathcal {L}}_h}}^{*}\bigr )\textrm{A}y(t) = \begin{pmatrix} ( {\mathcal {L}}_h^{V*}- {\mathcal {L}}_h^{H*}) \mathrm {\Delta }u\\ 0 \end{pmatrix} \end{aligned}$$

    and, inserting the interpolation, estimate with (2.22), (2.23), (2.24), and (2.25),

    $$\begin{aligned} \left\Vert \textrm{A}_h\bigl (J_h\textrm{A}- \textrm{A}_hJ_h\bigr ) y(t) \right\Vert _{X_h} \le C h^k \left\Vert \mathrm {\Delta }u \right\Vert _{H^{k+1}}. \end{aligned}$$
  2. (b)

    As above, we only consider the case \(g(t, u, \partial _tu) = g(\partial _tu)\) and obtain with (2.6) and (2.7)

    $$\begin{aligned} \textrm{A}_h\bigl ( J_hG(t, y) - G_h(t, J_hy) \bigr ) = \begin{pmatrix} {\mathcal {L}}_h^{V*}g(\partial _tu) - I_hg({\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _tu) \\ 0 \end{pmatrix}. \end{aligned}$$

    From this, we conclude with (2.22) and (2.24)

    $$\begin{aligned}&\left\Vert \textrm{A}_h\bigl ( J_hG(t, y) - G_h(t, J_hy) \bigr ) \right\Vert _{X_h} \\&\quad \le \bigl \Vert ({\mathcal {L}}_h^{V*}- I_h) g( \partial _tu) \bigr \Vert _{V_h} + \bigl \Vert I_hg(\partial _tu) - I_hg({\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _tu) \bigr \Vert _{V_h} \\&\quad \le \,C h^k \left\Vert g( \partial _tu) \right\Vert _{H^{k+1}} + C \bigl \Vert (\textrm{Id}-{\mathcal {L}}_h{\mathcal {L}}_h^{V*}) \partial _tu \bigr \Vert _{W^{1,\infty }} \\&\quad \le \,C ( \left\Vert g( \partial _tu) \right\Vert _{H^{k+1}} + \left\Vert \partial _tu \right\Vert _{W^{k+1,\infty }} ) \, h^k , \end{aligned}$$

    which gives the desired convergence rate.

  3. (c)

    We compute with (2.6) and (2.7)

    $$\begin{aligned} \textrm{A}_h\bigl ( J_h\varLambda (y) \! -\! \varLambda _h(J_hy) J_h\bigr ) \partial _ty\!\! =\!\!\begin{pmatrix} {\mathcal {L}}_h^{V*}( \lambda (u) \partial _t^2 u)\! -\! \pi _h\bigl ( I_h\lambda ({\mathcal {L}}_h{\mathcal {L}}_h^{V*}u) {\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _t^2 u\bigr ) \\ 0 \end{pmatrix}, \end{aligned}$$

    such that again (2.22), (2.23) and (2.24) yield the estimate

    $$\begin{aligned}&\left\Vert \textrm{A}_h\bigl ( J_h\varLambda (y) - \varLambda _h(J_hy) J_h\bigr ) \partial _ty \right\Vert _{X_h} \\&\quad \le \bigl \Vert ({\mathcal {L}}_h^{V*}- {\mathcal {L}}_h^{H*}) \lambda (u) \partial _t^2 u \bigr \Vert _{V_h} + \bigl \Vert ({\mathcal {L}}_h^{H*}- \pi _h{\mathcal {L}}_h^{-1}) \lambda (u) \partial _t^2 u \bigr \Vert _{V_h} \\&\qquad + \bigl \Vert \lambda (u) \partial _t^2 u- {\mathcal {L}}_hI_h\lambda ({\mathcal {L}}_h{\mathcal {L}}_h^{V*}u) {\mathcal {L}}_h{\mathcal {L}}_h^{V*}\partial _t^2 u \bigr \Vert _{V_h} \\&\quad \le \,C (\left\Vert \lambda ( u) \right\Vert _{W^{k+1,\infty }} , \bigl \Vert \partial _t^2u \bigr \Vert _{W^{k+1,\infty }} ) \, h^k + h^{-1} \bigl \Vert ({\mathcal {L}}_h^{H*}- \pi _h{\mathcal {L}}_h^{-1}) \lambda (u) \partial _t^2 u \bigr \Vert _{H_h}. \end{aligned}$$

The last term is estimated using [16, Lem. 8.24] to obtain

$$\begin{aligned} \bigl \Vert ({\mathcal {L}}_h^{H*}- \pi _h{\mathcal {L}}_h^{-1}) \lambda (u) \partial _t^2 u \bigr \Vert _{H_h}&\lesssim h^{k} \bigl \Vert {\mathcal {L}}_h^{-1} \lambda (u) \partial _t^2 u \bigr \Vert _{H_h} \\&\lesssim h^{k} \bigl \Vert ( {\mathcal {L}}_h^{-1} - I_h) \lambda (u) \partial _t^2 u \bigr \Vert _{H_h} + h^{k} \bigl \Vert I_h\lambda (u) \partial _t^2 u \bigr \Vert _{H_h}, \end{aligned}$$

and (2.22) together with (B.2) gives the desired bound. \(\square \)

For the fully discrete defects, we rely on further Lipschitz bounds of the nonlinearities \(\varLambda \) and \(G\) which we collect in the next lemma. Since we work in a first-order framework, we denote in the following for any function \(x \in X\), the projection onto the first and second component by \(x_1\) or \(x_2\), respectively.

Lemma 5.1

Let \(x,y,z \in X\), and let Assumption 2.1 hold.

  1. (a)

    If \(x_1,y_1,z_2 \in W^{1,\infty }(\varOmega )\), then

    $$\begin{aligned} \left\Vert \textrm{A}\bigl ( \varLambda ( x ) - \varLambda ( y ) \bigr ) z \right\Vert _{X} \le C \left\Vert x_1 - y_1 \right\Vert _{H^1(\varOmega )} , \end{aligned}$$

    where the constant depends on the \(W^{1,\infty }\)-norms of \(x_1,y_1,z_2\).

  2. (b)

    If \(x_1,x_2,y_1,y_2 \in W^{1,\infty }(\varOmega )\), then

    $$\begin{aligned} \left\Vert \textrm{A}\bigl ( G(t, x ) - G(s, y ) \bigr ) \right\Vert _{X} \le C \bigl ( |t-s| + \left\Vert x_1 - y_1 \right\Vert _{H^1(\varOmega )} + \left\Vert x_2 - y_2 \right\Vert _{H^1(\varOmega )} \bigr ) \end{aligned}$$

    where the constant depends on the \(W^{1,\infty }\)-norms of \(x_1,x_2,y_1,y_2\).

Proof

We expand the difference in part (a) as

$$\begin{aligned} \left\Vert \textrm{A}\bigl ( \varLambda ( x ) - \varLambda ( y ) \bigr ) z \right\Vert _{X}&= \left\Vert \bigl ( \lambda ( x_1 ) - \lambda ( y_1 ) \bigr ) z_2 \right\Vert _{H^1} \\&= \bigl \Vert \int _0^1 \lambda '( \sigma x_1 + (1-\sigma y_1) )( x_1 - y_1 ) z_2 \bigr \Vert _{H^1} \\&\lesssim \sup \limits _{\sigma \in [0,1 ]} \bigl \Vert \lambda '( \sigma x_1 + (1-\sigma y_1) ) \bigr \Vert _{W^{1,\infty }} \left\Vert x_1 - y_1 \right\Vert _{H^1} \left\Vert z_2 \right\Vert _{W^{1,\infty }}, \end{aligned}$$

and the assumptions in the lemma yield the bound. The very same computation yields the second estimate (b). \(\square \)

Thus, the defect of the Euler method can be bounded in a straightforward way.

Proof

(of Lemma 4.3) We recall the splitting \(\delta _{\text {Eu}}^{n+1} = \delta _{h,\text {Eu}}^{n+1} + \delta _{\tau ,\text {Eu}}^{n+1} \) of the defect in (4.3), and note that by proof of Lemma 3.8 it holds

$$\begin{aligned} \bigl \Vert \textrm{A}_h\delta _{h,\text {Eu}}^{n+1} \bigr \Vert _{X_h} \le C h^k. \end{aligned}$$

We treat the two parts in (4.3b) separately. The second term involving \(G\) is bounded using (2.28) and Lemma 5.1. Further, we expand

$$\begin{aligned}&\textrm{A}_hJ_h\varLambda ( y(t^{n}) ) \partial _{\tau }y(t^{n+1}) - \textrm{A}_hJ_h\varLambda ( y(t^{n+1}) ) \partial _ty(t^{n+1}) = \, {\varvec{{\mathcal {L}}_h}}^{*}\textrm{A}\bigl ( \varLambda ( y(t^{n}) ) \\&\quad - \varLambda ( y(t^{n+1}) ) \bigr ) \partial _ty(t^{n+1}) + \, {\varvec{{\mathcal {L}}_h}}^{*}\textrm{A}\varLambda ( y(t^{n}) ) \bigl ( \partial _{\tau }y(t^{n+1}) - \partial _ty(t^{n+1}) \bigr ), \end{aligned}$$

such that Lemma 5.1 is employed on the first part. For the second part, note that by the fundamental theorem of calculus we obtain in any norm

$$\begin{aligned} \bigl \Vert \partial _{\tau }z(t^{n+1}) - \partial _tz(t^{n+1}) \bigr \Vert _{} \le \frac{\tau }{2} \sup \limits _{s \in [0,\tau ]} \bigl \Vert \partial _t^2 z(t^{n} + s ) \bigr \Vert _{} , \end{aligned}$$

and the claim follows. \(\square \)

Similarly, we bound the defect of the midpoint rule in (4.11), and as above we do not have to treat the spatial part \(\delta _{h,\text {M}}^{n+1}\).

Proof

(of Lemma 4.6) We first note that from Taylor expansions and the Peano kernel theorem, we conclude in any norm the bounds

$$\begin{aligned} \bigl \Vert \partial _{\tau }z(t^{n+1}) - \partial _tz(t^{n+1/2}) \bigr \Vert _{}&\le \frac{\tau ^2}{24} \sup \limits _{s \in [0,\tau ]} \bigl \Vert \partial _t^3 z(t^{n} + s ) \bigr \Vert _{} , \\ \bigl \Vert z(t^{n+1/2}) - \tfrac{1}{2} \bigl ( z(t^{n+1}) + z(t^{n}) \bigr ) \bigr \Vert _{}&\le \frac{\tau ^2}{4} \sup \limits _{s \in [0,\tau ]} \bigl \Vert \partial _t^2 z(t^{n} + s ) \bigr \Vert _{} , \\ \bigl \Vert z(t^{n+1/2}) - \bigl ( \tfrac{3}{2} z(t^{n}) - \tfrac{1}{2} z(t^{n- 1}) \bigr ) \bigr \Vert _{}&\le \frac{3 \tau ^2}{8} \sup \limits _{s \in [0,\tau ]} \bigl \Vert \partial _t^2 z(t^{n} + s ) \bigr \Vert _{}. \end{aligned}$$

Combining this with Lemma 5.1 and the proof of Lemma 4.3 yields the desired bounds on the defect. \(\square \)

We finally treat the principle defect of the exponential midpoint rule. We pursue the strategy adapted from [12, Prop. 5.3]. Before we give the proof, we need two auxiliary results. The first one allows us to compare function evaluations of finite element objects with their interpolation. The proof is given in Appendix B.

Lemma 5.2

Let \(L \in {\mathbb {N}}\) and assume that \(f :{\mathbb {R}}^L \rightarrow {\mathbb {R}}\) is sufficiently often differentiable and that \(\varphi _{i,h} \in V_h\) satisfies

$$\begin{aligned} \left\Vert \varphi _{i,h} \right\Vert _{L^\infty } + \left\Vert \varphi _{i,h} \right\Vert _{W^{1,4}} \le C \end{aligned}$$

for \(i = 1, \ldots . L\). Then,

$$\begin{aligned} \left\Vert f(\varphi _{1,h},\ldots ,\varphi _{L,h}) - I_hf({\mathcal {L}}_h\varphi _{1,h},\ldots ,{\mathcal {L}}_h\varphi _{L,h}) \right\Vert _{L^2(\varOmega _h)}&\lesssim h^2 , \\ \left\Vert f( \varphi _{1,h},\ldots ,\varphi _{L,h}) - I_hf({\mathcal {L}}_h\varphi _{1,h},\ldots , {\mathcal {L}}_h\varphi _{L,h}) \right\Vert _{H^1(\varOmega _h)}&\lesssim h , \end{aligned}$$

with a constant independent of h.

On the continuous level, the chain rule allows us to bound terms of the form \(\mathrm {\Delta }(\mu (u) w )\) by the norms of \(\mathrm {\Delta }u\) and \(\mathrm {\Delta }w\). Even though this is not straightforward in the discrete case, one can establish a very similar result. We recall the Ritz projection \(R_h\) defined in (2.5) and note that the proof is given in Appendix D.

Lemma 5.3

Let \(u_h, w_h \in V_h\) and assume that \(\left\Vert \mathrm {\Delta }_hu_h \right\Vert _{L^2} + \left\Vert \mathrm {\Delta }_hw_h \right\Vert _{L^2} \le C\). Further, let \(\mu :{\mathbb {R}}\rightarrow {\mathbb {R}}\) be twice continuously differentiable. Then,

$$\begin{aligned} \left\Vert \mathrm {\Delta }_hR_h(\mu (u_h) w_h) \right\Vert _{L^2(\varOmega _h)} \le C \end{aligned}$$

with a constant C independent of h.

With these preparations, we can provide the error bounds on the defects of the exponential midpoint rule.

Proof

(of Lemma 4.11) Let us denote

$$\begin{aligned} d_n^1(t)&{:}{=}J_h\bigl ( \varLambda ( y(t^{n+1/2}) ) - \varLambda ( y(t) ) \bigr ) \partial _ty(t) , \\ d_n^2(t)&{:}{=}J_h\bigl ( G(t, y(t) ) - G(t^{n+1/2}, y(t^{n+1/2}) ) \bigr ) , \end{aligned}$$

then it remains to show for \(i=1,2\)

$$\begin{aligned} \Bigl \Vert {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} \int _0^\tau \textrm{e}^{(\tau -\sigma ) {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} } \varLambda _h^{-1} ({\bar{y}}_h^{n + 1/2}) d_n^i(t^{n}+ \sigma ) \bigr ) \,\mathrm{{d}} \sigma \Bigr \Vert _{X_h}&\le C \tau ^3 , \\ \Bigl \Vert \int _0^\tau \textrm{e}^{(\tau -\sigma ) {\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} } \varLambda _h^{-1} ({\bar{y}}_h^{n + 1/2}) \partial _{\tau }d_n^i (t^{n} + \sigma ) \bigr ) \,\mathrm{{d}} \sigma \Bigr \Vert _{X_h}&\le C \tau ^3 . \end{aligned}$$

Following the lines of the proof of [12, Prop. 5.3], we observe that for the first bound on \(d_n^1\), we have to establish the bounds

$$\begin{aligned} \bigl \Vert \textrm{A}_h{\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} \varLambda _h^{-1} ({\bar{y}}_h^{n + 1/2}) J_h\bigl ( \partial _t\varLambda ( y(t) ) \bigr ) \partial _ty(t) \bigr \Vert _{X_h}&\le C , \end{aligned}$$
(5.1a)
$$\begin{aligned} \bigl \Vert \textrm{A}_h\varLambda _h^{-1} ({\bar{y}}_h^{n + 1/2}) J_h\bigl ( \varLambda ( y(t^{n+1/2}) ) - \varLambda ( y(t) ) \bigr ) \partial _t^2 y(t) \bigr \Vert _{X_h}&\le C \tau , \end{aligned}$$
(5.1b)
$$\begin{aligned} \bigl \Vert \textrm{A}_h\varLambda _h^{-1} ({\bar{y}}_h^{n + 1/2})J_h\Bigl ( \bigl ( \partial _t^2 \varLambda ( y(t) ) \bigr ) \partial _ty(t) + \bigl ( \partial _t\varLambda ( y(t) ) \bigr ) \partial _t^2 y(t) \Bigr ) \bigr \Vert _{X_h}&\le C , \end{aligned}$$
(5.1c)

as well as for \(d_n^2\) the bounds

$$\begin{aligned} \bigl \Vert \textrm{A}_h{\textbf{A}}_h^{\hspace{-0.3mm}n+1/2} \varLambda _h^{-1}({\bar{y}}_h^{n + 1/2}) J_h\partial _tG(t, y(t) ) \bigr \Vert _{X_h}&\le C , \end{aligned}$$
(5.2a)
$$\begin{aligned} \bigl \Vert \textrm{A}_h\varLambda _h^{-1} ({\bar{y}}_h^{n + 1/2}) J_h\partial _t^2 G(t, y(t) ) \bigr \Vert _{X_h}&\le C . \end{aligned}$$
(5.2b)

For the discrete derivative, we have similar terms which have one \(\textrm{A}_h\) less, and instead \(\partial _{\tau }\) applied to objects following \(\varLambda _h^{-1}({\bar{y}}_h^{n + 1/2})\). In the following, we only discuss the bounds in (5.1a) and (5.2a) since the remaining ones are more standard.

Denoting \(w_h {:}{=}{\mathcal {L}}_h^{V*}\bigl ( \lambda '(u) \, \partial _tu\, \partial _t^2 u\bigr )\), it is sufficient to show for the bound in (5.1a) the estimate

$$\begin{aligned} \bigl \Vert \mathrm {\Delta }_hQ_h\bigl ( (I_h\lambda ({\mathcal {L}}_h{\bar{u}}_h^{n + 1/2})^{-1} ) w_h \bigr ) \bigr \Vert _{H_h} \le C, \end{aligned}$$

since the first component vanishes. In the following, we will often use that by (2.28) and the assumptions of Theorem 2.9 it holds

$$\begin{aligned} \left\Vert \mathrm {\Delta }_hw_h \right\Vert _{L^2(\varOmega _h)} = \bigl \Vert {\mathcal {L}}_h^{H*}\mathrm {\Delta }\bigl ( \lambda '(u) \, \partial _tu\, \partial _t^2 u\bigr ) \bigr \Vert _{L^2(\varOmega _h)} \le C, \end{aligned}$$

and to keep the notation simple, we will write \(u_h\) instead of \({\bar{u}}_h^{n + 1/2}\). We split the term using the inverse estimate in the form (A.1) in

$$\begin{aligned} \bigl \Vert \mathrm {\Delta }_hQ_h\bigl ( (I_h\lambda ( {\mathcal {L}}_hu_h) )^{-1} w_h \bigr ) \bigr \Vert _{H_h}&\lesssim \bigl \Vert \mathrm {\Delta }_hR_h\bigl ( \lambda (u_h)^{-1} w_h \bigr ) \bigr \Vert _{H_h} \\&\quad + h^{-1} \bigl \Vert Q_h\bigl ( (I_h\lambda ({\mathcal {L}}_hu_h) )^{-1} w_h \bigr ) - R_h\bigl ( \lambda (u_h)^{-1} w_h \bigr ) \bigr \Vert _{V_h} . \end{aligned}$$

The first term is bounded by Lemma 5.3 using

$$\begin{aligned} \bigl \Vert \mathrm {\Delta }_h{\bar{u}}_h^{n + 1/2} \bigr \Vert _{L^2(\varOmega _h)} \lesssim \left\Vert \mathrm {\Delta }_hu_h^{n} \right\Vert _{L^2(\varOmega _h)} + \left\Vert \mathrm {\Delta }_hu_h^{n-1} \right\Vert _{L^2(\varOmega _h)}. \end{aligned}$$

To split the second term further, we add and subtract in the second term \( Q_h\bigl ( \lambda (u_h)^{-1} w_h \bigr )\), and obtain for the first term with the \(L^2\)-stability of \(Q_h\) in (2.26) and the inverse estimate (2.25)

$$\begin{aligned}&\, h^{-1} \bigl \Vert Q_h\bigl ( ((I_h\lambda ({\mathcal {L}}_hu_h) )^{-1} - \lambda (u_h)^{-1} ) w_h \bigr ) \bigr \Vert _{V_h} \\&\quad \le \, h^{-2} \bigl \Vert (I_h\lambda ({\mathcal {L}}_hu_h) )^{-1} - \lambda (u_h)^{-1} \bigr \Vert _{L^2(\varOmega _h)} \left\Vert w_h \right\Vert _{L^\infty } \\&\quad \le \, h^{-2} \bigl \Vert I_h\lambda ({\mathcal {L}}_hu_h) - \lambda (u_h) \bigr \Vert _{L^2(\varOmega _h)} \left\Vert \mathrm {\Delta }_hw_h \right\Vert _{L^2(\varOmega _h)}, \end{aligned}$$

where we used in the last step Lemma 3.3, the identity

$$\begin{aligned} (I_h\lambda ({\mathcal {L}}_hu_h) )^{-1} - \lambda (u_h)^{-1} = (I_h\lambda ({\mathcal {L}}_hu_h) )^{-1} \bigl ( \lambda (u_h)- (I_h\lambda ({\mathcal {L}}_hu_h) ) \bigr ) \lambda (u_h)^{-1}, \end{aligned}$$

and the maximum norm estimate on \(u_h\). Finally, Lemma 5.2 leads to a uniform bound in h. Using the identity

$$\begin{aligned} Q_h\varphi - R_h\varphi = Q_h\bigl ( \varphi - I_h\varphi ) + R_h\bigl ( I_h\varphi - \varphi \bigr ), \end{aligned}$$

the assertion follows, once we have established

$$\begin{aligned} h^{-2} \bigl \Vert \bigl (\textrm{Id}- I_h) \bigl ( \lambda (u_h)^{-1} w_h \bigr ) \bigr \Vert _{L^2(\varOmega _h)} + h^{-1} \bigl \Vert \bigl (\textrm{Id}- I_h) \bigl ( \lambda (u_h)^{-1} w_h \bigr ) \bigr \Vert _{H_0^{1}(\varOmega _h)} \le C. \end{aligned}$$

However, applying Lemma 5.2 once more gives precisely this estimate. The bound in (5.2a) is derived in the very same way. \(\square \)