1 Introduction

In the present paper we are concerned with a-posteriori error estimation for model order reduction applied to a semi-discrete in space quasilinear parabolic partial differential equation (PDE) with non-monotone nonlinearity. The PDE appears for instance as state equation in the optimal control problems from [9, 35], and is used in the modeling of heat conduction, when the thermal conductivity of the material under consideration is temperature-dependent, cf. e.g. [48, 49, 58].

The numerical treatment of evolution equations and related problems is challenging. For instance, the discretization of associated optimal control problems leads to large-scale optimization problems that are highly expensive to solve. This is especially true for nonlinear equations. Therefore, model order reduction-techniques (MOR) play an important role in this context and other many-query scenarios, i.e. in situations where the same equation has to be solved repeatedly for different right hand sides or parameters. The aim of MOR is to replace the high-dimensional original model by a suitable model with less degrees of freedom, the so-called reduced-order model. A prominent method of MOR for parabolic PDEs is the so-called Proper Orthogonal Decomposition (POD) method, [64]. This approach uses so-called snapshots of the dynamical system to construct a low-dimensional subspace of e.g. a high-dimensional finite-element space. More generally speaking, projection of a high-dimensional dynamical or parametric system onto smaller dimensional spaces leads to so-called reduced basis methods (RB), see e.g. [31]. These subspaces need to be in some sense capable of expressing the original trajectory of the system sufficiently well. The question of estimating the model order reduction error arises naturally and has been subject to intensive research. We refer e.g. to [52] and the references therein for RB-methods in PDE-constrained optimization in general, and to [53] or the survey [29] for POD in particular.

Since there is a huge amount of literature about POD/RB-MOR, not just for uncontrolled equations but even in the context of PDE-constrained optimization, we have to restrict ourselves to an incomplete literature overview. POD-error-estimates have been obtained in the a-priori regime for linear parabolic equations in [39], and for certain nonlinear equations e.g. in [12, 40, 59]. For recently obtained RB-a-posteriori error-estimates for quasilinear equations with monotone nonlinearity related to magneto(quasi)statics, both elliptic and parabolic, we refer to [33, 34]. For a general nonlinear parabolic PDE including a-posteriori error estimation on the time-discrete level we exemplarily cite [19], and refer to its introduction for an overview over further literature. Moreover, let us mention some results related to optimal control: A-posteriori POD-errors for linear-quadratic optimal control problems have been derived in [63]. The technique has been extended to nonlinear problems in [38] and problems with mixed control-state constraints in [28]. POD-a-posteriori error-estimates for an optimal control problem with semilinear state equation with monotone nonlinearity have been discussed in [53]. Appropriate coupling between numerical optimization and MOR is an active area of research, see e.g. [26]. For approaches based on a-posteriori error estimation and trust-region type-algorithms, respectively, we refer to [7, 57], and in particular [52, 53]. A different method, so-called optimality system POD, has been proposed in [41]. A related aspect, the interplay between POD and discretization, is under consideration in e.g. [21,22,23]. Balancing of discretization- and POD-errors for an optimal control problem is addressed in [27]. Finally, we exemplarily cite [2] and [37] for RB/POD-MOR applied to robust and multiobjective optimization, respectively.

For quasilinear parabolic PDEs or related optimal control problems, even the analysis of the full-order model is less complete. We mention [30, 50] for the analysis of the equation itself based on the functional analytic tool of nonautonomous maximal parabolic regularity [5], and refer to [9, 10] for first- and second-order optimality conditions of related optimal control problems. For additional state constraints we refer to [36]. A quasilinear version of the so-called thermistor problem has been addressed in [48, 49], and convergence of the SQP method applied to the model problem from [9] has been proven in [35]. For earlier literature on quasilinear parabolic optimal control problems, and optimal control of quasilinear elliptic PDEs with refer to the introductions of [9, 10]. Finite element discretization error-estimates for the state equation from [10] are obtained in [11]. Having in mind a possible coupling of numerical optimization and MOR à la [52, 53], we start in the present paper with deriving a-posteriori reduced basis errors for the corresponding state equation. Herein, the presence of a non-monotone nonlinearity is the main difference to earlier publications concerned with RB-a-posteriori error-estimates for nonlinear PDEs [33, 34, 53], and also poses the main difficulty in our analysis. Moreover, compared to [19, 33, 52] also time-discretization errors are included in our estimates which may prevent the choice of unnecessarily accurate reduced-models below the time-discretization-error in practice. Nevertheless, our reference solution is semi-discrete (in space), i.e. we fix a spatial discretization and do not address errors arising from this.

The paper is organized as follows: We start by introducing the model problem and the underlying assumptions in Sect. 2. Further, we provide a short overview over the results obtained on this equation so far, and introduce its semi-discretization (in space), and the reduced-order counterpart thereof. Moreover, we provide a sketchy outlook how the results subsequently obtained might be applied in the context of PDE-constrained optimization. In Sect. 3 we prove RB-a-posteriori error-estimates for a reduced-order trajectory with certain time regularity. Depending on how much regularity of the semi-discrete (in space) solution we are willing to exploit, we present two different approaches to obtain a-posteriori error-estimates. In case that hyperreduction of the nonlinearity is done by empirical interpolation (EIM) we provide estimates that also include the additional EIM-error in Sect. 4. Finally, we illustrate our results by numerical experiments in Sect. 5.

Notation Throughout the paper we follow the conventions of [9] and use standard notation for (Bochner-)Lebesgue, (Bochner-)Sobolev-, and Hölder-spaces. By subscript D we denote incorporation of certain homogeneous Dirichlet boundary conditions. If the underlying spatial domain becomes clear from the context we will omit it, i.e. we write \(H^1_D\) instead of \(H^1_D(\Omega )\) for instance.

2 Model problem, assumptions, and RB-MOR

In this section we first introduce the continuous model equation with non-monotone nonlinearity, and recall some regularity results under appropriate assumptions. Second, we state the semi-discrete (in space) version and its RB-reduction. Finally, we briefly explain how the results obtained in this paper might be applied to an optimal control problem associated to the model equation.

2.1 Model problem, assumptions, and regularity results

Let us start with defining the setting for the following quasilinear parabolic PDE:

$$\begin{aligned} \left. \begin{aligned} \partial _t u + \mathscr {A}(u)u&= f \qquad&\text {on } I \times \Omega , \\ u(0)&= u_0 \qquad&\text {on } \Omega . \end{aligned} \qquad \right\} \qquad \qquad \qquad (\hbox {Eq}) \end{aligned}$$

The nonlinear operator \(\mathscr {A}\) is defined by

$$\begin{aligned} \langle \mathscr {A}(u) \varphi , \psi \rangle _{H^{-1}_D, H^1_D} := \int \limits _{\Omega } \xi (u)\mu \nabla \varphi \nabla \psi \text {d}x\text {d}t, \qquad \varphi , \psi \in L^2(I,H^1_D), \end{aligned}$$

whenever \(u\!:I \times \Omega \rightarrow \mathbb {R}\) is measurable. Boundary conditions are incorporated in the right hand side f and the respective function spaces. The precise assumptions required for obtaining a well-posed problem are stated below.

Before going into the detailed assumptions, we would like to comment briefly on the non-monotone structure of the nonlinearity in (Eq). The main difficulty as well as the main novelty of this paper arise from this fact. Recall that a nonlinear operator \(\mathscr {N}\!:X \rightarrow X^*\) on a Banach space X is called monotone if

$$\begin{aligned} \langle \mathscr {N}(x)-\mathscr {N}(y), x-y \rangle _{X^*,X} \ge 0 \qquad \forall x,y \in X, \end{aligned}$$

and strongly monotone if there exists a constant \(c > 0\) such that

$$\begin{aligned} \langle \mathscr {N}(x)-\mathscr {N}(y), x-y \rangle _{X^*,X} \ge c\Vert x-y \Vert _X^2 \qquad \forall x,y \in X, \end{aligned}$$

cf. e.g. [56, 65] for this notion and its application in the theory of nonlinear PDEs. It has turned out that exploitation of strong monotonicity of the nonlinear terms is also an important step in the derivation of RB-a-posteriori error-estimates for semilinear parabolic [53], quasilinear elliptic [34], and quasilinear parabolic [33] PDEs, respectively. Note that the quasilinear nonlinearities in [33, 34] refer to problems from magneto(quasi)statics and depend on the gradient of the solution. The nonlinear operator \(H^1_D \rightarrow H^{-1}_D\) under consideration in the present paper, however, is given by the map \(u \mapsto \mathscr {A}(u)u\), and hence it depends on the solution u and not on its gradient \(\nabla u\). The counterexample [20, Example 8.18] shows that this essentially changes the structure of the nonlinearity: It cannot be expected to be monotone, and therefore it is essentially different to those considered in [33, 34, 53]. In fact, the main difficulty in the derivation of RB-a-posteriori error-estimates will be to find a workaround for the missing strong monotonicity of our nonlinearity.

For the rest of this paper, we rely on the following minimal assumptions:

Assumption 2.1

  1. 1.

    \(\Omega \subset \mathbb {R}^d\), \(d \in \{1,2,3\}\), is a bounded domain with boundary \(\partial \Omega \). \(\Gamma _N \subset \partial \Omega \) is relatively open and denotes the Neumann boundary part, whereas \(\Gamma _D = \partial \Omega \setminus \Gamma _N\) is the part of the boundary equipped with homogeneous Dirichlet conditions. By subscript D we denote that a space of functions on \(\Omega \) incorporates such homogeneous Dirichlet boundary conditions on \(\Gamma _D\). We assume that \(\Omega \cup \Gamma _N\) is Gröger regular [25] such that all chart-maps in the definition of Gröger regularity can be chosen volume-preserving. For some \(T > 0\) we define the time interval \(I = [0,T]\).

  2. 2.

    The function \(\xi \!:\mathbb {R}\rightarrow \mathbb {R}\) is differentiable with bounded derivative, and \(\mu \!:\Omega \rightarrow \mathbb {R}^{d\times d}\) is measurable, uniformly bounded, and coercive in the following sense:

    $$\begin{aligned} 0< \mu _{\bullet } := \inf _{x \in \Omega } \inf _{z \in \mathbb {R}^d \setminus \{0\}} \frac{z^T \mu (x) z}{z^T z}, \qquad \mu ^{\bullet } := \sup _{x \in \Omega } \sup _{1 \le i,j \le d} |\mu _{i,j}(x) |< \infty . \end{aligned}$$

    We assume a coercivity condition \(0 < \xi _{\bullet } \le \xi \le \xi ^{\bullet }\) for \(\xi \) as well. With this we define as above

    $$\begin{aligned} \langle \mathscr {A}(u) \varphi , \psi \rangle _{L^2(I,W^{1,2}_D)} := \int \limits _I \int \limits _\Omega \xi (u) \mu \nabla \varphi \nabla \psi \text {d}x\text {d}t, \qquad \varphi , \psi \in L^2(I,W^{1,2}_D), \end{aligned}$$

    whenever u is a measurable function on \(\Omega \).

  3. 3.

    We assume that there is some \(p \in (d,4)\) such that

    $$\begin{aligned} -\nabla \cdot \mu \nabla + 1\!:W^{1,p}_D \rightarrow W^{-1,p}_D \end{aligned}$$

    is a topological isomorphism. For \(s > 2\) such that \(\frac{1}{s} < \frac{1}{2}\left( 1-\frac{d}{p}\right) \) we choose

    $$\begin{aligned} f \in L^s(I,W^{-1,p}_D), \qquad u_0 \in (W^{-1,p}_D,W^{1,p}_D)_{1/s',s}. \end{aligned}$$

It has been shown in [50, Theorem 5.3] that under these assumptions (Eq) is well-posed: There exists a unique solution u with regularity

$$\begin{aligned} u \in W^{1,s}(I,W^{-1,p}_D) \cap L^s(I,W^{1,p}_D). \end{aligned}$$

In fact, this is even true for \(\xi \) being locally instead of globally Lipschitz-continuous. In the present paper we focus on a semi-discrete formulation of (Eq) and its solution. For globally Lipschitz-continuous \(\xi \), existence and sufficient regularity of such a semi-discrete solution is automatically ensured due to space-discretization, even without Gröger regularity of the domain or the isomorphism property in Assumption 2.1.3, see Proposition 2.2 and the introduction of Sect. 3.2 for details. Therefore, we keep the summary of results concerning (Eq) on the continuous level rather short: We refer for instance to [9, 36, 50] for a discussion of Assumption 2.1 and only recall two regularity results from the literature that might be seen as motivation for exploiting the respective regularity of the semi-discrete in space solution lateron: Regularity in case of right hand sides in the slightly more regular Bessel potential-spaces \(H^{-\zeta ,p}\) and appropriate initial regularity has been addressed in [9, section 3.2]. In particular, \(C^{\alpha }(I,W^{1,p}_D)\)-regularity of the solution with some \(\alpha > 0\) and \(p \in (d,4)\) is obtained under fairly general regularity assumptions that admit certain constellations of non-smooth domains and coefficients, mixed boundary conditions, and distributional right-hand sides f. An equation similar to the one in the present paper has been considered in [10, Theorem 2.3] for \(C^{1,1}\)-smooth domains and coefficients, homogeneous Dirichlet boundary conditions, integrable right-hand sides, and a possibly unbounded nonlinearity, including a semilinear term. The authors obtain \(W^2\)-regularity results that enable the derivation of finite element error-estimates [11]. Applying their setting to (Eq) yields \(C^{0,\alpha }(I, W^{1,\infty })\)-regularity of the solutions with some \(\alpha > 0\), cf. also [36, Corollary 5.4].

2.2 Semi-discretization in space and RB-MOR

We now introduce the semi-discrete (in space) counterpart of (Eq). Its solution will serve as the so-called truth-solution, i.e. the reference solution to which the a-posteriori error-estimates will refer to. In particular, this means that we do not address spatial discretization errors in this paper. Moreover, we also introduce the RB-reduced counterpart of the semi-discrete (in space) equation.

Let \(V_h\) be an \(H^1_D\)-conforming finite element space on \(\Omega \) and \(I_h\!:H^1_D \rightarrow V_h\) be an appropriate interpolation operator. We may introduce the semi-discrete (in space) counterpart of (Eq) as follows: Find \(u_h\in W^{1,2}(I,V_h^*) \cap L^2(I,V_h)\) such that

$$\begin{aligned} \left. \begin{aligned} \langle \partial _t u_h(t), \varphi _h \rangle _{H^{-1}_D, H^1_D} + \langle \mathscr {A}(u_h(t)) u_h(t), \varphi _h \rangle _{H^{-1}_D, H^1_D}&= \langle f(t), \varphi _h \rangle _{H^{-1}_D, H^1_D} \\&\qquad \forall t \in I, \varphi _h \in V_h, \\ u_h(0)&= I_h u_0. \end{aligned} \right\} \qquad ({\hbox {Eq}_h}) \end{aligned}$$

Due to finite-dimensionality of \(V_h\), (\({\hbox {Eq}_h}\)) results in a system of ordinary differential equations (ODEs) for the coefficients of \(u_h\) w.r.t. some basis of \(V_h\). This allows to discuss existence of solutions to (\({\hbox {Eq}_h}\)), even without using all parts of Assumption 2.1.

Proposition 2.2

Assume that \(V_h\) is a subspace of \(C(\overline{\Omega })\). For any right-hand side \(f \in L^2(I,H^{-1}_D)\) and initial value \(I_h u_0 \in V_h\) there exists a unique solution \(u_h\!:I \rightarrow V_h\) of (\({\hbox {Eq}_h}\)) such that

$$\begin{aligned} \Vert u_h\Vert _{L^\infty (I,L^2) \cap L^2(I,H^1_D)} + \Vert u_h\Vert _{W^{1,2}(I,V_h^*) \cap L^2(I,V_h)} \le C \left( \Vert f \Vert _{L^2(I,H^{-1}_D)} + \Vert I_h u_0 \Vert _{L^2} \right) , \end{aligned}$$

with a constant \(C > 0\) independent of \(V_h\), f and \(I_h u_0\).

For the rest of the paper we will refer to \(u_h\) as the semi-discrete (in space) solution to (Eq), or, shorter, the truth-solution. By u we will denote the continuous in space and time solution to (Eq), short: the true solution. To our best knowledge, a finite element error analysis for \(u_h\) in the setting of Assumption 2.1 without additional suppositions has not been carried out in the existing literature. We refer to the references in [11] or to the introduction of [16] for an overview of results under additional assumptions. Let us only mention two results: A \(L^\infty (I,L^2) \cap L^2(I,H^1)\)-quasi-bestapproximation result has been derived in [17] with a technique slightly related to the one applied in Sect. 3.2. In [16] for instance pointwise error-estimates have been obtained.

Proof

Let \((\varphi _i)_{i=1,\ldots ,N_h}\), \(N_h = \dim V_h\), be a basis for \(V_h\). Writing \(u_h(t) = \sum _{i=1}^{N_h} \mathbf{x}_i(t) \varphi _i\), the coefficient vector \(\mathbf{x}(t) \in \mathbb {R}^{N_h}\) fulfills

$$\begin{aligned} \mathbf{M} \partial _t \mathbf{x}(t) + \mathbf{A}(\mathbf{x}(t)) \mathbf{x}(t) = \mathbf{f}(t), \qquad \mathbf{x}(0) = \mathbf{x}_0, \end{aligned}$$
(1)

with

$$\begin{aligned} \mathbf{M}&:= \left( \langle \varphi _i, \varphi _j \rangle _{L^2} \right) _{i,j}, \\ \mathbf{A(\mathbf{z})}&:= \left( \int \limits _\Omega \xi \left( \sum _{n=1}^{N_h} \mathbf{z}_n \varphi _n \right) \mu \nabla \varphi _i \nabla \varphi _j dx \right) _{i,j}, \\ \mathbf{f}(t)&:= \left( \langle f(t), \varphi _i \rangle _{H^{-1}_D, H^1_D} \right) _i, \end{aligned}$$

and \(\mathbf{x}_0\) being the coefficient vector of \(I_h u_0\). Using continuity of the basis functions, as well as global Lipschitz-continuity and boundedness of \(\xi \), one easily verifies that (1) satisfies the assumptions of [61, Lemma 5.7]. Therefore, existence and uniqueness of a solution \(\mathbf{x} \in W^{1,2}(I,\mathbb {R}^{N_h})\) to (1) follows, which implies existence and uniqueness of a solution \(u_h\in W^{1,2}(I,V_h)\) to (\({\hbox {Eq}_h}\)). The estimate follows by standard techniques: Testing (\({\hbox {Eq}_h}\)) with \(\varphi _h = u_h\) and integrating by parts yields the first summand of the estimate. The second summand is obtained by testing with an arbitray \(\varphi _h \in L^2(I,V_h)\). \(\square \)

Note that the embedding \(V_h \hookrightarrow C(\overline{\Omega })\) is crucial in the previous argument: It ensures Lipschitz continuity of \(\mathbf{A}(\cdot )\) and therefore existence of a solution to (1) via a generalized Picard-Lindelöf principle. If \(V_h\) is a classical Lagrange finite element space on a polygonal (polyhedral) domain \(\Omega \) equipped with a triangular (tetrahedral) mesh, this assumption is obviously fulfilled. However, we would like to point out that except for the assumptions from Proposition 2.2 we do not rely on further details of spatial discretization.

Performing expensive computations in the high-dimensional space \(V_h\) can be avoided by application of the reduced basis approach: We replace the finite element space \(V_h\) by a much smaller n-dimensional subspace \(V_h^n \subset V_h\) that is related to the physical properties of the system and might be determined by the well-known POD approach [39, 64], for instance. Although our arguments do not rely on the particular choice of \(V_h^n\) and therefore also cover general RB-methods, we clearly have in mind \(V_h^n\)’s obtained by POD and also restrict our numerical experiments in Sect. 5 to this case. Having at hand a reduced ansatz-space \(V_h^n \subset V_h\) and a suitable projection \(P_n\!:V_h \rightarrow V_h^n\), usually the \(L^2\)- or \(H^1\)-orthogonal projection, we introduce the reduced-order counterpart of (\({\hbox {Eq}_h}\)) as follows: Find \(u_h^n\in W^{1,2}(I,(V_h^n)^*) \cap L^2(I,V_h^n)\) such that

$$\begin{aligned} \left. \begin{aligned} \langle \partial _t u_h^n(t), \varphi _h^n \rangle _{H^{-1}_D, H^1_D} + \langle \mathscr {A}(u_h^n(t)) u_h^n(t), \varphi _h^n \rangle _{H^{-1}_D, H^1_D}&= \langle f(t), \varphi _h^n \rangle _{H^{-1}_D, H^1_D} \\&\qquad \forall t \in I, \varphi _h^n \in V_h^n, \\ u_h^n(0)&= P_n I_h u_0. \end{aligned} \right\} \qquad \qquad ({\text {Eq}_h\text {-RB}_n}) \end{aligned}$$

A reader familiar with ROM-techniques may already have noticed that the nonlinear term in (1) does not allow for efficient evaluation within the reduced-order model. We can overcome this issue by so-called hyperreduction techniques, e.g. the empirical interpolation method (EIM), which will be addressed in Sect. 4.

2.3 Outlook towards optimal control

In [9, Example 2.5] and [35] the following optimal control problem governed by (Eq) has been addressed:

$$\begin{aligned} \left. \begin{aligned} \min _{u,q} J(u,q)&:= \frac{1}{2} \Vert u-u_d \Vert _{L^2(I \times \Omega )}^2 + \frac{\gamma }{2} \Vert q \Vert _{L^2(I,\mathbb {R}^k)}^2 \\ \text {s.t.} \qquad&q \in Q_{ad} := \left\{ q \in L^2(I,\mathbb {R}^k)\!:q_a \le q \le q_b \text { a.e. on } I \right\} , \\&\text {and ({Eq}) with right-hand side } f = \sum _{i=1}^k q_i b_i. \end{aligned}\qquad \right\} \qquad \qquad (\hbox {OCP}) \end{aligned}$$

Hereby, \(u_d \in L^2(I\times \Omega )\) denotes the desired state, \(\gamma > 0\) is a Tikhonov-parameter, \(b_i \in H^{-\zeta ,p}_D \subset H^1_D\), \(i=1,\ldots ,k\), are some fixed spatial control functions, and \(q_a, q_b \in L^\infty (I,\mathbb {R}^k)\) define box-constraints for the control. The special structure of the right hand side f is commonly referred to as purely time-dependent control in the literature [44]. Fixing a space-discretization for (Eq) as described earlier in this section results in a semi-discrete (in space) counterpart (OCP\(_h\)) of (OCP), which we may consider again as reference object. Proposition 2.2 shows that this problem is well-defined.

At this point, let us make some short remarks about discretization of optimal control problems: Besides the discretization of the state variable and the state equation one also has to decide how to treat the control variable. There are several well-established concepts addressing the latter issue, e.g. variational discretization [14, 32], piecewise linear [54, 55], and piecewise constant [6] control discretization. For respective a-priori finite element discretization error-estimates we refer e.g. to [46, 47] for linear-quadratic parabolic optimal control, or to [51] for a semilinear parabolic state equation, also including discussion of the purely time-dependent control setting. Due to the purely time-dependent control structure of (OCP), the semi-discrete (in space) problem (OCP\(_h\)) may be regarded as variational discretization of (OCP) when only space is discretized. We mention e.g. [53] for the same semi-discrete (in space) approach to a-posteriori POD-errors for an optimal control problem governed by a semilinear parabolic PDE, or [15] for semi-discrete (in space) finite element error-estimates for optimal control of the instationary Navier-Stokes equation. There are different possibilities to choose an appropriate time-discretization. Our estimates in particular apply to CG1 time-discretization, but an extension to e.g. DG0 (implicit Euler) discretization can be easily stated, see Remark 4.4.

In numerical algorithms for the solution of (OCP\(_h\)), we may have to evaluate the semi-discrete reduced functional \(j(q) := J(u_h(q),q)\) where \(u_h(q)\) denotes the solution of (\({\hbox {Eq}_h}\)) associated with several control functions q. Since repeated evaluation of \(j_h\) is costly, RB-MOR can be applied to (\({\hbox {Eq}_h}\)). Therefore, and due to additional time-discretization, we only have the possibility to compute an approximate solution \(u_h^{n,m}= u_h^{n,m}(q)\) instead of \(u_h(q)\). A short computation shows that the resulting error in the reduced functional can be estimated as follows:

$$\begin{aligned} |J(u_h^{n,m},q) - J(u_h,q) |\le \left[ \frac{1}{2} \Vert u_h^{n,m}-u_h\Vert _{L^2(I,L^2)} + \Vert u_h^{n,m}-u_d \Vert _{L^2(I,L^2)} \right] \Vert u_h^{n,m}-u_h\Vert _{L^2(I,L^2)}. \end{aligned}$$

Consequently, we immediately obtain an a-posteriori error for the reduced functional of (OCP\(_h\)) if we have an \(L^2(I,L^2)\)-estimate for solutions of (\({\hbox {Eq}_h}\)) at hand. This may be regarded as motivation for the results in the present paper. Note that the above estimate for the functional error differs from the estimates obtained in [52, Theorems 4 and 9] because we do not utilize adjoint information. Deriving a-posteriori reduced modeling errors for the adjoint equation of (OCP\(_h\)), and consequently for the gradient of the reduced functional, is beyond the scope of the present paper. We refer to [52, 53] for such estimates in case of different model problems.

3 A-posteriori RB-error-estimates

In this section we state and prove our first main results: A-posteriori error-estimates for (\({\hbox {Eq}_h}\)) including both reduced-order and time-discretization errors. For the reason of clarity we exclude hyperreduction for the nonlinearity at this point, and address this issue in the following section.

We roughly follow the ansatz of [53], where a semilinear equation with monotone nonlinearity has been discussed. To overcome the difficulties arising from the fact that our nonlinearity is not monotone we present two different approaches: The first approach, see Sect. 3.2, relies on exploiting \(L^\infty (I,W^{1,\infty })\)-regularity of the truth-solution \(u_h\) and allows to obtain explicit estimates of “classical” structure in terms of the error in the initial condition and the \(V_h^*\)-residual of the discrete solution under consideration. As a semi-discrete in space solution, \(u_h\) obviously exhibits the required regularity for any fixed (spatial) discretization level. However, since the error-estimates will depend on the value of the \(L^\infty (I,W^{1,\infty })\)-norm of \(u_h\) it is desirable to have uniform bounds for this norm for all sufficiently fine spatial discretization levels. We believe that we can only expect such a uniform bound if the continuous in space and time solution of (Eq) exhibits \(L^\infty (I,W^{1,\infty })\)-regularity, which is guaranteed in the setting of [10]. However, in the less regular setting of [9] we cannot expect such a result. Therefore, the second approach, see Sect. 3.3, is motivated by the intention to exploit less regularity of the truth-solution, more precisely: \(L^\infty (I,W^{1,p})\)-regularity for some \(p > d\). For continuous in space and time solutions of (Eq) this regularity is guaranteed in the setting of [9]. The price to pay for exploiting less regularity of \(u_h\) is that we do not obtain an explicit formula for the error-estimate; instead the evaluation of the estimate requires the solution of an ODE. Moreover, for technical reasons we require additional assumptions on time regularity of the residual and the size of the initial error. We start by fixing the following notation and assumptions:

Assumption 3.1

  1. 1.

    Assume that \(V_h \subset H^1_D \cap C(\overline{\Omega }) {\cap W^{1,\infty }(\Omega )}\) is an \(N_h\)-dimensional conforming finite element space, and \(V_h^n\) a n-dimensional subspace of \(V_h\). By \(u_h\in W^{1,2}(I,V_h^*) \cap L^2(I,V_h)\) we denote the truth-solution, i.e. the unique solution to (\({\hbox {Eq}_h}\)).

  2. 2.

    Moreover, let \(u_h^n\in W^{1,2}(I,(V_h^n)^*) \cap L^2(I,V_h^n)\) be arbitrary. By \(e_h^n:= u_h^n-u_h\) we denote the error with respect to the truth-solution.

  3. 3.

    We denote the global Lipschitz-constant of \(\xi \) by \(|\xi ' |_\infty \).

We have in mind the following situation: \(u_h^n\) is the solution of a time-discrete counterpart of (Eq\(_h\)-RB\(_n\)), and we want to estimate how good \(u_h^n\) approximates the truth-solution \(u_h\). Note that in order to ensure that \(u_h^n\) meets the regularity requirements of Assumption 3.1 we have to choose a time-discretization for (Eq\(_h\)-RB\(_n\)) that results in sufficiently regular solutions, e.g. the Crank-Nicolson scheme in its CG1-DG0 Petrov-Galerkin form. Time-discrete solutions of (Eq\(_h\)-RB\(_n\)) obtained by Discontinuous Galerkin time-discretization, e.g. backward Euler, do not fulfill Assumption 3.1. Since discontinuous time-discretization might be of particular interest in the context of PDE-constrained optimization we outline an approach to overcome this restriction in Remark 4.4.

3.1 Some preliminary calculations

In this subsection we follow the residual-based ansatz of [53] as far as possible without modification, i.e. up to the point where strong monotonicity of the nonlinearity would be required. From that point on we develop two different approaches that will be discussed in the following subsections.

First, we introduce the residual of \(u_h^n\) by

$$\begin{aligned} r_h^n(t) := \partial _t u_h^n(t) + \mathscr {A}(u_h^n(t))u_h^n(t) - f(t) \in (V_h^n)^* \hookrightarrow H^{-1}_D, \quad t \in I. \end{aligned}$$
(2)

To keep notation short we will omit the argument “t” in the following. A short computation utilizing (\({\hbox {Eq}_h}\)) shows that

$$\begin{aligned} \langle r_h^n, \varphi _h \rangle _{H^{-1}_D,H^1_D} = \langle \partial _t e_h^n, \varphi _h \rangle _{H^{-1}_D,H^1_D} + \langle \mathscr {A}(u_h^n)u_h^n- \mathscr {A}(u_h)u_h, \varphi _h \rangle _{H^{-1}_D, H^1_D} \end{aligned}$$
(3)

holds for all \(\varphi _h \in V_h\). We consider \(V_h\) as a vector space canonically equipped with the \(H^1_D\)-norm. Therefore, its dual \(V_h^*\) is canonically equipped with the following norm:

$$\begin{aligned} \Vert \ell _h \Vert _{V_h^*} := \sup _{0 \ne \psi _h \in V_h} \frac{\langle \ell _h, \psi _h \rangle _{H^{-1}_D, H^1_D}}{\Vert \psi _h \Vert _{H^1_D}} = \sup _{0 \ne \psi _h \in V_h} \frac{\ell _h(\psi _h)}{\Vert \psi _h \Vert _{H^1_D}}. \end{aligned}$$
(4)

Note that this norm is not equal to the \(H^{-1}_D\)-norm, because we only test with elements \(\psi _h\) from \(V_h\) in (4). For later use we state the following observation:

Lemma 3.2

Let Assumption 3.1 hold. Then the function \(I \rightarrow \mathbb {R}\), \(t \mapsto \Vert r_h^n(t) \Vert _{V_h^*}^2\) is well-defined a.e. and \(L^2\)-integrable.

Proof

This follows from the definition of \(r_h^n\) and the regularity assumed for \(u_h^n\). \(\square \)

Plugging in \(\varphi _h = e_h^n(t)\) for every fixed t in (3), and using the classical integration by parts formula from [56, Remark 7.5] we obtain

$$\begin{aligned} \frac{d}{dt} \frac{1}{2} \Vert e_h^n\Vert _{L^2}^2 + \langle \mathscr {A}(u_h^n)u_h^n- \mathscr {A}(u_h)u_h, e_h^n\rangle _{H^{-1}_D,H^1_D} = \langle r_h^n, e_h^n\rangle _{H^{-1}_D,H^1_D}. \end{aligned}$$
(5)

Note that the second summand on the left-hand side of (5) causes problems in our case: If the nonlinearity \(u \mapsto \mathscr {A}(u)u\) was strongly monotone, we could proceed as done in [53] for a semilinear term and estimate as follows:

$$\begin{aligned} \langle \mathscr {A}(u_h^n)u_h^n- \mathscr {A}(u_h)u_h, e_h^n\rangle _{H^{-1}_D,H^1_D} \ge c |e_h^n|_{H^1_D}^2. \end{aligned}$$

However, as pointed out in Sect. 2.1 such an estimate cannot be expected to hold true. We cannot even bound the term under consideration from below by zero. Therefore, we have to proceed in a different way and split the problematic term into a coercive part and a remainder as follows:

$$\begin{aligned}&\langle \mathscr {A}(u_h^n) u_h^n- \mathscr {A}(u_h)u_h, u_h^n-u_h\rangle _{H^{-1}_D,H^1_D} \\&\quad = \int \limits _\Omega (\xi (u_h^n) \mu \nabla u_h^n- \xi (u_h)\mu \nabla u_h)\nabla (u_h^n-u_h)\text {d}x\\&\quad = \int \limits _\Omega \xi (u_h^n) \mu \nabla e_h^n\nabla e_h^n\text {d}x+ \int \limits _\Omega (\xi (u_h^n)-\xi (u_h)) \mu \nabla u_h\nabla e_h^n\text {d}x\\&\quad \ge \xi _\bullet \mu _\bullet |e_h^n|_{H^1_D}^2 + \int \limits _\Omega (\xi (u_h^n)-\xi (u_h)) \mu \nabla u_h\nabla e_h^n\text {d}x. \end{aligned}$$

Plugging this into (5) yields

$$\begin{aligned}&\frac{d}{dt} \frac{1}{2} \Vert e_h^n\Vert _{L^2}^2 + \xi _\bullet \mu _\bullet |e_h^n|_{H^1_D} \nonumber \\&\quad \le \langle r_h^n, e_h^n\rangle _{H^{-1}_D,H^1_D} - \int \limits _\Omega (\xi (u_h^n)-\xi (u_h)) \mu \nabla u_h\nabla e_h^n\text {d}x, \end{aligned}$$
(6)

i.e. except for the remainder term that we have shifted to the right-hand-side we have preserved a similar structure as in [53]. Formula (6) will serve as the common basis for our two different approaches in the following subsections. The main challenge in both cases is to estimate the second summand on the right-hand side in (6) in such a way that Gronwall’s Lemma or a similar comparison principle can be applied to the resulting inequality.

3.2 A-posteriori estimates—Approach I

The approach of this subsection is closer to [53] than the second one, and relies on \(L^\infty (I,W^{1,\infty })\)-regularity of the truth-solution. An analogous regularity assumption on the true solution u and a similar estimate of the nonlinear term as below have been used in [17] in the context of finite element errors.

Theorem 3.3

Let Assumptions 2.1and 3.1hold, and let \(c_\text {Lip}> 0\) be such that

$$\begin{aligned} |u_h(t) |_{W^{1,\infty }} \le c_\text {Lip}\qquad \forall t \in I. \end{aligned}$$

Moreover, let \(\varepsilon , \eta > 0\) be chosen such that

$$\begin{aligned} \eta + \varepsilon |\xi ' |_\infty \mu ^\bullet c_\text {Lip}= \xi _\bullet \mu _\bullet \end{aligned}$$

and define \(\beta := 2\left( \frac{1}{2\varepsilon } |\xi ' |_\infty \mu ^\bullet c_\text {Lip}+ \xi _\bullet \mu _\bullet \right) \). Then, the following a-posteriori error-estimates for \(u_h^n\) hold:

$$\begin{aligned} \Vert e_h^n(t) \Vert _{L^2}^2&\le e^{\beta t}\Vert u_h^n(0)-u_h(0) \Vert _{L^2}^2 + \eta ^{-1} \int \limits _0^t e^{\beta (t-s)} \Vert r_h^n(s) \Vert _{V_h^*}^2 \text {d}s, \end{aligned}$$
(7)
$$\begin{aligned} \Vert e_h^n\Vert _{L^2(I,L^2)}^2&\le \beta ^{-1} \left( e^{\beta T} - 1 \right) \Vert u_h^n(0)-u_h(0) \Vert _{L^2}^2 \nonumber \\&\quad + \eta ^{-1} \beta ^{-1} \int \limits _0^T (e^{\beta (T-t)}-1) \Vert r_h^n(t) \Vert _{V_h^*}^2 \text {d}t. \end{aligned}$$
(8)
$$\begin{aligned} \Vert e_h^n\Vert _{L^2(I,H^1_D)}^2&\le \xi _\bullet ^{-1} \mu _\bullet ^{-1} e^{\beta T} \Vert u_h^n(0)-u_h(0) \Vert _{L^2}^2 \nonumber \\&\quad + \xi _\bullet ^{-1} \mu _\bullet ^{-1} \eta ^{-1} \int \limits _0^T e^{\beta (T-t)} \Vert r_h^n(t)\Vert _{V_h^*}^2 \text {d}t. \end{aligned}$$
(9)

Proof

We proceed with the argument from the previous subsection. Starting with the estimate (6) we bound the remaining term of the nonlinearity in the following way:

$$\begin{aligned} \left| \int \limits _\Omega (\xi (u_h^n)-\xi (u_h)) \mu \nabla u_h\nabla e_h^n\text {d}x\right| \le |\xi ' |_\infty \mu ^\bullet c_\text {Lip}\Vert e_h^n\Vert _{L^2} \Vert e_h^n\Vert _{H^1_D}. \end{aligned}$$
(10)

Using \(W^{1,\infty }\)-regularity for \(u_h\) we can estimate one of the \(e_h^n\)-factors in the \(L^2\)-norm, which would not be possible assuming only \(W^{1,p}\)-regularity for \(u_h\) with some finite p. With the help of Young’s inequality we arrive at

$$\begin{aligned} \frac{d}{dt} \frac{1}{2} \Vert e_h^n\Vert _{L^2}^2 + \xi _\bullet \mu _\bullet \Vert e_h^n\Vert _{H^1_D}^2&\le \xi _\bullet \mu _\bullet \Vert e_h^n\Vert _{L^2}^2 + \langle r_h^n, e_h^n\rangle _{H^{-1}_D,H^1_D} \\&\quad + |\xi ' |_\infty \mu ^\bullet c_\text {Lip}\left( \frac{1}{2\varepsilon } \Vert e_h^n\Vert _{L^2}^{{2}} + \frac{\varepsilon }{2} \Vert e_h^n\Vert _{H^1_D}^{{2}} \right) \end{aligned}$$

with some \(\varepsilon > 0\). Another application of Young’s inequality yields

$$\begin{aligned} \frac{d}{dt} \frac{1}{2} \Vert e_h^n\Vert _{L^2}^2 + \xi _\bullet \mu _\bullet \Vert e_h^n\Vert _{H^1_D}^2&\le \left( \frac{1}{2\varepsilon } |\xi ' |_\infty \mu ^\bullet c_\text {Lip}+ \xi _\bullet \mu _\bullet \right) \Vert e_h^n\Vert _{L^2}^2 + \frac{1}{2\eta } \Vert r_h^n\Vert _{V_h^*}^2 \\&\quad + \left( \frac{\eta }{2} + \frac{\varepsilon }{2} |\xi ' |_\infty \mu ^\bullet c_\text {Lip}\right) \Vert e_h^n\Vert _{H^1_D}^2 \end{aligned}$$

with \(\eta > 0\). Now, choose \(\eta , \varepsilon \) as in the statement of the theorem and obtain

$$\begin{aligned} \frac{d}{dt} \frac{1}{2} \Vert e_h^n\Vert _{L^2}^2 + \frac{1}{2} \xi _\bullet \mu _\bullet \Vert e_h^n\Vert _{H^1_D}^2&\le \beta \cdot \frac{1}{2} \Vert e_h^n\Vert _{L^2}^2 + \frac{1}{2\eta } \Vert r_h^n\Vert _{V_h^*}^2, \end{aligned}$$
(11)

where we will use from now on the abbreviation \(\beta = 2\left( \frac{1}{2\varepsilon } |\xi ' |_\infty \mu ^\bullet c_\text {Lip}+ \xi _\bullet \mu _\bullet \right) \) to enhance readability. With the help of Gronwall’s Lemma [18, Corollary 2] we obtain an a-posteriori estimate for the \(L^\infty (I,L^2)\)-error from this:

$$\begin{aligned} \Vert e_h^n(t) \Vert _{L^2}^2 \le \Vert P_n I_h u_0 - I_h u_0 \Vert _{L^2}^2 e^{\beta t} + \eta ^{-1} \int \limits _0^t \Vert r_h^n(s) \Vert _{V_h^*}^2 e^{\beta (t-s)} \text {d}s. \end{aligned}$$

The second summand thereof is integrated using integration by parts, i.e.

$$\begin{aligned} \int \limits _0^T e^{\beta t} \left( \int \limits _0^t e^{-\beta s} \Vert r_h^n(s) \Vert _{V_h^*}^2 \text {d}s\right) \text {d}t= \beta ^{-1} \int \limits _0^T (e^{\beta (T-t)} - 1) \Vert r_h^n(t) \Vert _{V_h^*}^2 \text {d}t, \end{aligned}$$

and together with the first summand we obtain the \(L^2(I,L^2)\)-estimate (8). As in [53] the \(L^2(I,H^1)\)-estimate (9) is obtained from (11) by integrating with respect to time over I and using (8). \(\square \)

It is possible to exploit less regularity of \(u_h\) as we will outline in the following. The price to pay is that the constants in the modified estimates cannot be computed explicitely, in general. In fact, if \(d=2\) and \(u_h\in L^\infty (I,W^{1,4})\), estimate (10) can be replaced by

$$\begin{aligned} \left| \int \limits _\Omega (\xi (u_h^n)-\xi (u_h)) \mu \nabla u_h\nabla e_h^n\text {d}x\right| \le |\xi ' |_\infty \mu ^\bullet \Vert e_h^n\Vert _{L^4} |u_h|_{W^{1,4}} |e_h^n|_{H^1}. \end{aligned}$$

In order to apply Gronwall’s Lemma as before, one has to observe

$$\begin{aligned} \Vert e_h^n\Vert _{L^4} |u_h|_{W^{1,4}} |e_h^n|_{H^1}&\le \frac{1}{2\epsilon } \Vert e_h^n\Vert _{L^4}^2 |u_h|_{W^{1,4}}^2 + \frac{\epsilon }{2} |e_h^n|_{H^1}^2 \\&\le \frac{C_{LGN,2}^2}{2\epsilon } |u_h|_{W^{1,4}}^2 \Vert e_h^n\Vert _{L^2}|e_h^n|_{H^1} + \frac{\epsilon }{2} |e_h^n|_{H^1}^2, \end{aligned}$$

where Young’s inequality with parameter \(\epsilon > 0\) has been used in the first, and the 2D-Ladyzhenskaya-Gagliardo-Nirenberg interpolation inequality

$$\begin{aligned} \Vert \varphi \Vert _{L^4} \le C_{LGN,2} \Vert \varphi \Vert _{L^2}^{\frac{1}{2}}\Vert |\varphi |_{W^{1,4}}^\frac{1}{2}, \qquad \varphi \in W^{1,4}, \end{aligned}$$

in the second step. Finally, a second application of Young’s inequality, again with parameter \(\epsilon \), allows to obtain

$$\begin{aligned} \Vert e_h^n\Vert _{L^4} |u_h|_{W^{1,4}} |e_h^n|_{H^1} \le \frac{C_{LGN,2}^4}{8\epsilon ^3} |u_h|_{W^{1,4}}^4 \Vert e_h^n\Vert _{L^2}^2 + \epsilon |e_h^n|_{H^1}^2. \end{aligned}$$

In dimension \(d=3\) one has to apply

$$\begin{aligned} \Vert \varphi \Vert _{L^4} \le C_{LGN,3} \Vert \varphi \Vert _{L^2}^\frac{1}{4} |\varphi |_{W^{1,4}}^\frac{3}{4}, \qquad \varphi \in W^{1,4}, \end{aligned}$$

and Young’s inequality with exponents 4 and \(\frac{4}{3}\) to arrive at

$$\begin{aligned} \Vert e_h^n\Vert _{L^4} |u_h|_{W^{1,4}} |e_h^n|_{H^1} \le \frac{C_{LGN,3}^8}{64\epsilon ^8} |u_h|_{W^{1,4}}^8 \Vert e_h^n\Vert _{L^2}^2 + \left( \frac{\epsilon }{2} + \frac{3 \epsilon ^\frac{4}{3}}{4}\right) |e_h^n|_{H^1}^2. \end{aligned}$$

In both cases one can proceed similarly as in the proof before in order to prove estimates for \(e_h^n\) of structure analogous to those in Theorem 3.3. The main difference is that the constants \(C_{LGN,2}, C_{LGN,3} > 0\), whose exact values are unknown in general, enter the estimates. For upper bounds of these constants we refer e.g. to [1, Theorem 7.3] and the references therein.

3.3 A-posteriori estimates—Approach II

We derive a-posteriori error-estimates that rely on \(L^\infty (I,W^{1,p}_D)\)-regularity for the truth-solution \(u_h\) with some \(p > d\), only. For technical reasons we will have to impose an additional assumption on time-regularity of the residual and the size of the initial error. We start with the following auxiliary result:

Lemma 3.4

Let Assumptions 2.1 and 3.1 hold, and let \(\varepsilon , \eta > 0\) satisfy

$$\begin{aligned} \xi _\bullet \mu _\bullet = \eta + \varepsilon \cdot \mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} c_p. \end{aligned}$$

The error-function \(t\mapsto \Vert e_h^n(t) \Vert _{L^2}^2\) satisfies the differential inequality

$$\begin{aligned} \varphi '(t) \le \alpha \varphi (t) + \beta \varphi (t)^r + \gamma (t), \qquad t \in I, \end{aligned}$$
(12)

with the constants \(\alpha = 2 \xi _\bullet \mu _\bullet \), \(\beta = \varepsilon ^{-1} \mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} c_p\), the function \(\gamma = \gamma (t) = \eta ^{-1} \Vert r_h^n(t) \Vert _{V_h^*}^2\), and the exponent \(r = \frac{2}{q} = 1-\frac{2}{p} \in (0,1)\).

As already for the discussion of the equation on the continuous level, cf. [9, 50], boundedness of \(\xi \) is essential in our argument below.

Proof

We pick up the argument from Sect. 3.1. We start by estimating the second summand on the right-hand side of (6) as follows:

$$\begin{aligned} \left| \int \limits _\Omega (\xi (u_h^n)-\xi (u_h))\mu \nabla u_h\nabla e_h^n\text {d}x\right| \le \mu ^\bullet \Vert \xi (u_h^n) - \xi (u_h) \Vert _{L^q} |u_h|_{W^{1,p}} |e_h^n|_{H^1} \end{aligned}$$

with \(p^{-1} + q^{-1} + \frac{1}{2} = 1\), i.e. \(q = \frac{2p}{p-2}\). Note that we define the \(W^{1,p}\)-semi-norm as follows:

$$\begin{aligned} |\varphi |_{W^{1,p}}^p := \int \limits _{\Omega } |\nabla \varphi |_2^p dx = \int \limits _\Omega \left( \sum _{i=1}^d \left( \frac{\partial \varphi }{\partial x_i}\right) ^2\right) ^{p/2}\text {d}x. \end{aligned}$$

Next, we apply the well-known Riesz-Thorin interpolation-inequality

$$\begin{aligned} \Vert f \Vert _{L^q} \le \Vert f \Vert _{L^\infty }^{1-\frac{2}{q}} \Vert f \Vert _{L^2}^{\frac{2}{q}}, \qquad f \in L^\infty , q \in (2,\infty ), \end{aligned}$$

to \(f = \xi (u_h^n)-\xi (u_h)\), use \(\Vert f \Vert _{L^\infty } \le 2\xi ^\bullet \) and the Lipschitz-estimate \(\Vert \xi (u_h^n)-\xi (u_h) \Vert _{L^2} \le |\xi ' |_\infty \Vert e_h^n\Vert _{L^2}\), and arrive at

$$\begin{aligned} \left| \int \limits _\Omega (\xi (u_h^n)-\xi (u_h))\mu \nabla u_h\nabla e_h^n\text {d}x\right| \le \mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} |u_h|_{W^{1,p}} \Vert e_h^n\Vert _{L^2}^{\frac{2}{q}}|e_h^n|_{H^1}. \end{aligned}$$
(13)

Note that this is the point where uniform boundedness of the nonlinearity \(\xi \) enters. As before, we will estimate the product of the last two factors by Young’s inequality and move the \(H^1\)-semi-norm term to the left-hand side of (6) in order to obtain an ODE for the \(L^2\)-error. This is why we are not able to convert the \(\Vert e_h^n\Vert _{L^2}^{2/q}\)-term to an \(\Vert e_h^n\Vert _{L^2}^{2}\)-term by application of Young’s inequality, because we need to generate an \(|e_h^n|_{H^1}^2\)-term from the second factor, such that this term can be canceled by the left-hand side of (6).

Plugging (13) into (6) and using Young’s inequality twice we obtain:

$$\begin{aligned} \frac{d}{dt} \frac{1}{2} \Vert e_h^n\Vert _{L^2}^2 + \xi _\bullet \mu _\bullet \Vert e_h^n\Vert _{H^1_D}\le & {} \frac{\eta }{2} \Vert e_h^n\Vert _{H^1_D}^2 + \frac{1}{2\eta } \Vert r_h^n\Vert _{V_h^*}^2 + \xi _\bullet \mu _\bullet \Vert e_h^n\Vert _{L^2}^2 \\&+ \mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} |u_h|_{W^{1,p}} \\&\cdot \left( \frac{\varepsilon }{2} \Vert e_h^n\Vert _{H^1}^2 + \frac{1}{2\varepsilon } \Vert e_h^n\Vert _{L^2}^{\frac{4}{q}} \right) . \end{aligned}$$

Here, \(\varepsilon , \eta > 0\) are the parameters appearing in Young’s inequality. Choosing them as in the statement of the lemma yields

$$\begin{aligned}&\frac{d}{dt} \Vert e_h^n\Vert _{L^2}^2 + \xi _\bullet \mu _\bullet \Vert e_h^n\Vert _{H^1}^2 \nonumber \\&\quad \le \eta ^{-1} \Vert r_h^n\Vert _{V_h^*}^2 + 2 \xi _\bullet \mu _\bullet \Vert e_h^n\Vert _{L^2}^2 + \varepsilon ^{-1} \mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} c_p \Vert e_h^n\Vert _{L^2}^{4/q}, \end{aligned}$$
(14)

from which the claim follows. \(\square \)

Let us briefly comment on the rather challenging structure of (12): First, note that \(r \in (0,1)\), i.e. the right-hand side in (12) only depends Lipschitz-continuously on \(\varphi (t)\), if \(\varphi (t)\) stays uniformly away from zero, which can be ensured for \(\varphi (0) > 0\) only. Moreover, the Lipschitz-constant on sets bounded uniformly away from zero increases, if r gets smaller. The latter, however, is the case if \(p > d\) gets smaller, i.e. if we exploit less regularity of the truth-solution. In other words: The smaller the initial error, and the less regularity of the truth-solution we use, the more ill-posed (12) becomes.

Theorem 3.5

Let Assumptions 2.1and 3.1hold, and let \(p > d\) and \(c_p > 0\) such that

$$\begin{aligned} |u_h(t) |_{W^{1,p}} \le c_{p} \qquad \forall t \in I. \end{aligned}$$

Moreover, we assume that the initial error does not vanish, i.e. \(\Vert e_h^n(0) \Vert _{L^2} > 0\), and that \(t \mapsto \Vert r_h^n(t) \Vert _{V_h^*}^2\) is piecewise continuous on I. Let \(\varepsilon , \eta > 0\) be chosen such that

$$\begin{aligned} \xi _\bullet \mu _\bullet = \eta + \varepsilon \cdot \mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} c_p \end{aligned}$$

holds. Given the constants \(\alpha = 2 \xi _\bullet \mu _\bullet \), \(\beta = \varepsilon ^{-1} \mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} c_p\), and \(r = 1-\frac{2}{p}\), let \(\varphi \!:I \rightarrow [0,\infty )\) be the solution to

$$\begin{aligned} \varphi '(t)&= \alpha \varphi (t) + \beta \varphi (t)^r + \eta ^{-1} \Vert r_h^n(t) \Vert _{V_h^*}^2, \qquad t \in I, \\ \varphi (0)&= \Vert u_h^n(0)-u_h(0) \Vert _{L^2}^2. \end{aligned}$$

Then the following a-posteriori error-estimates hold true:

$$\begin{aligned} \Vert e_h^n(t) \Vert _{L^2}^2&\le \varphi (t), \quad \forall t \in I, \qquad \Vert e_h^n\Vert _{L^2(I,L^2)}^2 \le \int \limits _0^T \varphi (s) \text {d}s, \end{aligned}$$
(15)
$$\begin{aligned} \Vert e_h^n\Vert _{L^2(I,H^1_D)}^2&\le \frac{1}{\xi _\bullet \mu _\bullet } \left( \Vert u_h^n(0)-u_h(0) \Vert _{L^2}^2 + \eta ^{-1} \Vert r_h^n\Vert _{L^2(I,V_h^*)}^2 \right. \nonumber \\&\quad \left. \alpha \int \limits _0^T \varphi (s) ds + \beta \int \limits _0^T \varphi (s)^{2/q}\text {d}s\right) . \end{aligned}$$
(16)

Proof

In order to apply [18, Theorem 44] to Lemma 3.4 we have to verify that \(f(t,z) = \alpha z + \beta z^r + \eta ^{-1} \Vert r_h^n(t) \Vert _{V_h^*}^2\) satisfies the required assumption, i.e. that given \(\varphi _0 > 0\) there is \(\varepsilon > 0\) such that the initial value problem

$$\begin{aligned} \varphi '(t) = f(t,\varphi (t)), \qquad \varphi (0) = \varphi _0 +\delta , \end{aligned}$$

has a solution on the time interval I as long as \(\delta \in [0,\varepsilon ]\). Existence of a local solution on some time interval \([0,T_{\max })\) with \(T_{\max } \in (0,T]\) is clear due to Peano’s existence-theorem, since \(t \mapsto \Vert r_h^n(t) \Vert _{V_h^*}^2\) is piecewise continuous on I and \(u \mapsto \alpha u + \beta u^r\) even admits a continuous extension \(\mathbb {R}\rightarrow \mathbb {R}\), \(u \mapsto \alpha u + \beta \text {sign}(u) |u |^r\). Further, due to \(\alpha> 0, \beta > 0\), and \(\Vert r_h^n(t) \Vert _{V_h^*}^2 \ge 0\) it is clear that \(\varphi \) is monotone increasing for \(\varphi _0 > 0\). The theory of ODEs shows that either \(T_{\max } = T\) or \(T_{\max } < T\) and \(\varphi (t) \rightarrow \infty \) as \(t \rightarrow T_{\max }\). We argue by contradiction that \(T_{\max } < T\) is impossible and therefore assume that \(T_{\max } < T\). Then, because of \(\varphi (t) \rightarrow \infty \) as \(t \rightarrow T_{\max }\), there is \(t_0 \in (0,T)\) such that \(\varphi (t) > 1\) for \(t \ge t_0\). For \(t \ge t_0\) it holds due to \(r \in (0,1)\) that

$$\begin{aligned} \varphi '(t) = \alpha \varphi (t) + \beta \varphi (t)^r + \eta ^{-1}\Vert r_h^n(t) \Vert _{V_h^*}^2 \le (\alpha + \beta ) \varphi (t) + \eta ^{-1}\Vert r_h^n(t) \Vert _{V_h^*}^2. \end{aligned}$$

By Gronwall’s Lemma [18, Corollary 2] we conclude that \(\varphi (t)\) stays bounded on I, which contradicts the assumption \(T_{\max } < T\). Therefore, all solutions \(\varphi \) have to exist on the whole time interval I. Thus, we have shown the estimate for the \(L^\infty (I,L^2)\)-error, from which we immediately obtain the \(L^2(I,L^2)\)-error by integration. Following again [53] we integrate (12) to obtain (16). \(\square \)

Note that the additional assumption on the residual is fulfilled, if e.g. \(u_h^n\) is piecewise \(C^1\) w.r.t. time on I. An explicit comparison principle for (12) as in [18, Corollary 2] would allow to obtain also explicit formulas in the estimates of Theorem 3.5. Unfortunately, we only found such results in the literature for the special cases \(\gamma \equiv 0\) or \(\alpha = 0\) [18, Theorems 21 and 23], that are not of interest in the present context.

4 A-posteriori RB- and EIM-error-estimates

It is a well-known issue in RB-methods that the evaluation of nonlinear terms such as \(\xi (u)\) requires access to the full number of degrees of freedom. Since the reasoning behind MOR is to avoid such computations within the full model, alternatives have to be found. In order to allow for an efficient offline-online splitting, the evaluation of nonlinearities in the reduced-order model for (Eq) needs to be done by methods of hyperreduction, e.g. the Empirical Interpolation Method (EIM, [8]). In this section we describe a very basic version of the latter technique applied to our model problem, and show how the additional errors can be incorporated in the a-posteriori error-estimates of Theorems 3.3 and 3.5 using the same technique as in [33, 34].

4.1 Empirical interpolation of \(\mathscr {A}\)

First, we introduce EIM as far as required for our purpose and as concise as possible. For details see e.g. [8, 64]. In order to present the main idea as clearly as possible, we stick to the continuous setting and omit space-discretization; the generalization to finite element spaces with a nodal basis is straightforward. Given so-called snapshots \(y_1, \ldots , y_N \in C(\overline{\Omega })\), and a tolerance \(\text {tol}_\text {EIM}> 0\), determine via a Greedy procedure some functions \(\Xi _1,\ldots ,\Xi _m \in C(\overline{\Omega })\), and interpolation points \(x_1, \ldots , x_m \in \overline{\Omega }\) such that

$$\begin{aligned} \xi (y_\ell (x_j)) = \sum _{k=1}^m c_{\ell ,k} \Xi _k(x_j), \qquad \ell =1,\ldots ,N, \quad j=1,\ldots ,m. \end{aligned}$$

implies \(\left\| \xi (y_\ell ) - \sum _{k=1}^m c_{\ell ,k} \Xi _k \right\| _{L^\infty } \le \text {tol}_\text {EIM}\). For some \(w \in C(\overline{\Omega })\) we define the EIM-approximation of \(\xi (w)\) as

$$\begin{aligned} \xi ^{\text {EIM}}_m(w) = \sum _{k=1}^m c_k \Xi _k \end{aligned}$$

where \(c \in \mathbb {R}^m\) solves the \(m \times m\)-system \( \xi (w(x_j)) = \sum _{k=1}^m c_k \Xi _k(x_j)\), \(j=1,\ldots ,m\). With this we may introduce a RB-EIM-reduced counterpart of (Eq) as

$$\begin{aligned} \left. \begin{aligned} \langle \partial _t u_h^{n,m}(t), \varphi _h^n \rangle _{H^{-1}_D, H^1_D} + \langle \mathscr {A}^{\text {EIM}}_m(u_h^{n,m}(t)) u_h^{n,m}(t), \varphi _h^n \rangle _{H^{-1}_D, H^1_D}&= \langle f(t), \varphi _h^n \rangle _{H^{-1}_D, H^1_D} \\&\qquad \forall t \in I, \varphi _h^n \in V_h^n, \\ u_h^{n,m}(0)&= P_n I_h u_0, \end{aligned} \right\} \qquad ({\hbox {Eq}_h\hbox {-RB}_n\hbox {-EIM}_m}) \end{aligned}$$

where \(\mathscr {A}_m^\text {EIM}\) denotes the EIM-reduced version of the nonlinear differential operator defined by

$$\begin{aligned} \langle \mathscr {A}_m^\text {EIM}(u) \varphi , \psi \rangle _{H^{-1}_D, H^1_D} := \int \limits _{\Omega } \xi _m^\text {EIM}(u)\mu \nabla u \nabla \varphi \text {d}x, \qquad \varphi , \psi \in H^1_D. \end{aligned}$$

Note that there is an efficient online evaluation of \(\mathscr {A}_m^\text {EIM}\), because the stiffness-matrices associated to the operators \(-\nabla \cdot \Xi _k \mu \nabla \) can be precomputed in the offline-phase. Therefore, we only have to deal with m and n degrees of freedom, respectively, when dealing with \(\mathscr {A}_m^\text {EIM}\). In the following we will denote the EIM-error by

$$\begin{aligned} \Delta ^{\text {EIM}}_m(u) := \Vert \xi (u) - \xi ^\text {EIM}_m(u) \Vert _{L^\infty }. \end{aligned}$$

Results addressing a-priori convergence of EIM can be found e.g. in [8, 24, 45]. For sophisticated algorithmic coupling of model order reduction and hyperreduction we refer to [19, 60] for instance. Other kinds of hyperreduction include e.g. discrete empirical interpolation (DEIM, [13]), or dynamic mode decomposition (DMD, [3]).

4.2 A-posteriori RB-error-estimates including the EIM-error

In this section we extend the results from Sect. 3 by incorporating also EIM-errors: It is clear that \(\mathscr {A}(u_h^n)u_h^n\), and therefore \(r_h^n\), cannot be computed efficiently during the online-phase due to the fact that the assembly of the stiffness-matrix for \(\mathscr {A}(u_h^n)\) requires us to use the full number of degrees of freedom. Hence, the estimates of Theorems 3.3 and 3.5 cannot be evaluated efficiently in the online-phase. Consequently, evaluation of \(r_h^n\) has to be avoided. Instead, given an arbitrary \(u_h^{n,m}\) fulfilling Assumption 3.1, we introduce the EIM-reduced residual \(r_h^{n,m}\) of \(u_h^{n,m}\) as

$$\begin{aligned} r_h^{n,m}(t) := \partial _t u_h^{n,m}(t) + \mathscr {A}^\text {EIM}_m(u_h^{n,m}(t))u_h^{n,m}(t) - f(t) \in V_h^* \hookrightarrow H^{-1}_D, \qquad t \in I. \end{aligned}$$
(17)

It is obvious, that \(r_h^{n,m}\) allows an efficient online evaluation. It remains to show how the error \(u_h^{n,m}-u_h\) to the truth-solution can be estimated in terms of \(r_h^{n,m}\) instead of \(r_h^n\). Since all changes in the arguments already known from Sect. 3 are straightforward utilizing the estimates (18) and (19) below, we omit the details and only state the results. As before, we will we omit the argument “t” in the following. A short computation as in Sect. 3.1 shows that the RB-EIM-error \(e_h^{n,m}:= u_h^{n,m}-u_h\) fulfills

$$\begin{aligned} \langle r_h^{n,m}, \varphi _h \rangle _{H^{-1}_D, H^1_D}&= \langle \partial _t e_h^{n,m}, \varphi _h \rangle _{H^{-1}_D, H^1_D} + \langle \mathscr {A}(u_h^{n,m}) u_h^{n,m}- \mathscr {A}(u_h)u_h, \varphi _h \rangle _{H^{-1}_D, H^1_D} \\&\quad + \langle \mathscr {A}^\text {EIM}_m(u_h^{n,m})u_h^{n,m}- \mathscr {A}(u_h^{n,m})u_h^{n,m}, \varphi _h \rangle _{H^{-1}_D, H^1_D}. \end{aligned}$$

As before it follows:

$$\begin{aligned} \frac{d}{dt} \frac{1}{2} \Vert e_h^{n,m}\Vert _{L^2}^2 + \xi _\bullet \mu _\bullet |e_h^{n,m}|_{H^1}^2\le & {} \langle r_h^{n,m}, e_h^{n,m}\rangle _{H^{-1}_D, H^1_D} \nonumber \\&- \int \limits _\Omega (\xi (u_h^{n,m})-\xi (u_h)) \mu \nabla u_h\nabla e_h^{n,m}\text {d}x\nonumber \\&- \int \limits _\Omega (\xi ^\text {EIM}(u_h^{n,m}) - \xi (u_h^{n,m})) \mu \nabla u_h^{n,m}\nabla e_h^{n,m}\text {d}x. \end{aligned}$$
(18)

The second summand on the right-hand side can be estimated as in Sects. 3.2 and 3.3. The third summand is estimated as follows:

$$\begin{aligned} \left| \int \limits _\Omega (\xi ^\text {EIM}(u_h^{n,m}) - \xi (u_h^{n,m})) \mu \nabla u_h^{n,m}\nabla e_h^{n,m}\text {d}x\right|\le & {} \Delta ^\text {EIM}_m(u_h^{n,m}) \mu ^\bullet |u_h^{n,m}|_{H^1} \Vert e_h^{n,m}\Vert _{H^1} \nonumber \\\le & {} \frac{1}{2\delta } \Delta ^\text {EIM}_m(u_h^{n,m}) \mu ^\bullet |u_h^{n,m}|_{H^1}^2 + \frac{1}{2}\delta \Delta ^\text {EIM}_m(u_h^{n,m}) \mu ^\bullet \Vert e_h^{n,m}\Vert _{H^1}^2, \end{aligned}$$
(19)

where \(\delta > 0\) is the parameter in Young’s inequality. With this, we are ready to state the modified versions of the two main results from Sect. 3, beginning with the modified version of Theorem 3.3:

Theorem 4.1

Let Assumptions 2.1and 3.1hold, and let \(c_\text {Lip}> 0\) such that

$$\begin{aligned} |u_h(t) |_{W^{1,\infty }} \le c_\text {Lip}\qquad \forall t \in I. \end{aligned}$$

Given \(u_h^{n,m}\), choose \(\varepsilon , \eta , \delta > 0\) such that

$$\begin{aligned} \eta + \varepsilon |\xi ' |_\infty \mu ^\bullet c_\text {Lip}+ \delta \Delta _m^\text {EIM}\mu ^\bullet = \xi _\bullet \mu _\bullet , \end{aligned}$$

is satisfied with the EIM-error \(\Delta _m^\text {EIM}:= \sup _{t\in I} \Delta _m^\text {EIM}(u_h^{n,m}(t))\). Moreover, we introduce the constant \(\beta := 2\left( \frac{1}{2\varepsilon } |\xi ' |_\infty \mu ^\bullet c_\text {Lip}+ \xi _\bullet \mu _\bullet \right) \). Then the following a-posteriori error-estimates for \(u_h^{n,m}\) hold true:

$$\begin{aligned} \Vert e_h^{n,m}(t) \Vert _{L^2}^2&\le e^{\beta t}\Vert u_h^{n,m}(0)-u_h(0) \Vert _{L^2}^2 \nonumber \\&\quad + \int \limits _0^t e^{\beta (t-s)} \left( \eta ^{-1} \Vert r_h^{n,m}(s) \Vert _{V_h^*}^2 + \delta ^{-1} \Delta _m^\text {EIM}\mu ^\bullet |u_h^{n,m}(s) |_{H^1}^2 \right) \text {d}s, \end{aligned}$$
(20)
$$\begin{aligned} \Vert e_h^{n,m}\Vert _{L^2(I,L^2)}^2&\le \beta ^{-1} \left( e^{\beta T} - 1 \right) \Vert u_h^{n,m}(0)-u_h(0) \Vert _{L^2}^2 \nonumber \\&\quad + \beta ^{-1} \int \limits _0^T (e^{\beta (T-t)}-1) \left( \eta ^{-1} \Vert r_h^{n,m}(t) \Vert _{V_h^*}^2 + \delta ^{-1} \Delta _m^\text {EIM}\mu ^\bullet |u_h^{n,m}(t) |_{H^1}^2 \right) \text {d}t, \end{aligned}$$
(21)
$$\begin{aligned} \Vert e_h^{n,m}\Vert _{L^2(I,H^1_D)}^2&\le \xi _\bullet ^{-1} \mu _\bullet ^{-1} e^{\beta T} \Vert u_h^{n,m}(0)-u_h(0) \Vert _{L^2}^2 \nonumber \\&\quad + \xi _\bullet ^{-1} \mu _\bullet ^{-1} \int \limits _0^T e^{\beta (T-t)} \left( \eta ^{-1} \Vert r_h^{n,m}(t) \Vert _{V_h^*}^2 + \delta ^{-1} \Delta _m^\text {EIM}\mu ^\bullet |u_h^{n,m}(t) |_{H^1}^2 \right) \text {d}t. \end{aligned}$$
(22)

We also fix the following simplified estimates, that are less sharp but exhibit a favorable structure: They are weighted sums of the initial \(L^2\)-error, the \(L^2\)-\(V_h^*\)-norm of the residual, and the EIM-error. This allows to determine the optimal choice of the parameters \(\varepsilon , \eta , \delta \) for these simpler estimates.

Corollary 4.2

Under the assumptions of the previous theorem it holds:

$$\begin{aligned} \Vert e_h^{n,m}(t) \Vert _{L^2}^2&\le e^{\beta t}\Vert u_h^{n,m}(0)-u_h(0) \Vert _{L^2}^2 + e^{\beta t} \eta ^{-1} \int \limits _0^t\Vert r_h^{n,m}(s) \Vert _{V_h^*}^2 \text {d}s\\&\quad + \delta ^{-1} e^{\beta t} \Delta _m^\text {EIM}\mu ^\bullet \int \limits _0^t |u_h^{n,m}(s) |_{H^1}^2 \text {d}s, \\ \Vert e_h^{n,m}\Vert _{L^2(I,L^2)}^2&\le \beta ^{-1} \left( e^{\beta T} - 1 \right) \Vert u_h^{n,m}(0)-u_h(0) \Vert _{L^2}^2 \\&\quad + \beta ^{-1} (e^{\beta T}-1) \left( \eta ^{-1} \Vert r_h^{n,m}\Vert _{L^2(I,V_h^*)}^2 + \delta ^{-1} \Delta _m^\text {EIM}\mu ^\bullet |u_h^{n,m}|_{L^2(I,H^1)}^2 \right) , \\ \Vert e_h^{n,m}\Vert _{L^2(I,H^1_D)}^2&\le e^{\beta T} \xi _\bullet ^{-1} \mu _\bullet ^{-1} \Vert u_h^{n,m}(0)-u_h(0) \Vert _{L^2}^2 \\&\quad + e^{\beta T} \xi _\bullet ^{-1} \mu _\bullet ^{-1} \left( \eta ^{-1} \Vert r_h^{n,m}\Vert _{L^2(I,V_h^*)}^2 + \delta ^{-1} \Delta _m^\text {EIM}\mu ^\bullet |u_h^{n,m}|_{L^2(I,H^1)}^2 \right) . \end{aligned}$$

The same technique as in Sect. 3.3 yields the following result:

Theorem 4.3

Let Assumptions 2.1and 3.1hold, and let \(p > d\) and \(c_p > 0\) such that

$$\begin{aligned} |u_h(t) |_{W^{1,p}} \le c_{p} \qquad \forall t \in I. \end{aligned}$$

Moreover, we assume that the initial error does not vanish, i.e. \(\Vert e_h^{n,m}(0) \Vert _{L^2} > 0\), and that \(t \mapsto \Vert r_h^n(t) \Vert _{V_h^*}^2\) is piecewise continuous on I. Choose \(\varepsilon , \eta , \delta > 0\) such that

$$\begin{aligned} \xi _\bullet \mu _\bullet = \eta + \varepsilon \cdot \mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} c_p + \delta \Delta _m^\text {EIM}\mu ^\bullet \end{aligned}$$

is satisfied for the EIM-error \(\Delta ^\text {EIM}_m = \sup _{t \in I} \Delta ^\text {EIM}_M(u_h^{n,m}(t))\). Given the constants \(\alpha = 2 \xi _\bullet \mu _\bullet \), \(\beta = \varepsilon ^{-1} \mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} c_p\), and \(r = 1-\frac{2}{p}\), let \(\varphi \!:I \rightarrow [0,\infty )\) be the solution to

$$\begin{aligned} \varphi '(t)&= \alpha \varphi (t) + \beta \varphi (t)^r + \eta ^{-1} \Vert r_h^{n,m}(t) \Vert _{V_h^*}^2 + \delta ^{-1} \Delta ^\text {EIM}_m \mu ^\bullet |u_h^{n,m}(t) |_{H^1}^2, \qquad t \in I, \\ \varphi (0)&= \Vert e_h^{n,m}(0) \Vert _{L^2}^2. \end{aligned}$$

Then the following a-posteriori error-estimates hold true:

$$\begin{aligned} \Vert e_h^{n,m}(t) \Vert _{L^2}^2&\le \varphi (t), \quad \forall t \in I, \qquad \Vert e_h^{n,m}\Vert _{L^2(I,L^2)}^2 \le \int \limits _0^T \varphi (s) \text {d}s, \end{aligned}$$
(23)
$$\begin{aligned} \Vert e_h^{n,m}\Vert _{L^2(I,H^1_D)}^2&\le \frac{1}{\xi _\bullet \mu _\bullet } \left( \Vert u_h^{n,m}(0)-u_h(0) \Vert _{L^2}^2 + \eta ^{-1} \Vert r_h^n\Vert _{L^2(I,V_h^*)}^2 \right. \nonumber \\&\quad + \delta ^{-1} \Delta ^\text {EIM}_m \mu ^\bullet |u_h^{n,m}|_{L^2(I,H^1)}^2 \left. + \alpha \int \limits _0^T \varphi (s) \text {d}s+ \beta \int \limits _0^T \varphi (s)^{2/q}\text {d}s\right) . \end{aligned}$$
(24)

Note that the error-estimates in the results above hold true without additional assumptions on the size of residuals and EIM-errors. In practice, however, in order to obtain an accurate reduced-order model and corresponding small error-estimates, it will be necessary to construct RB- and EIM-bases in such a way that the size of residuals and EIM-errors is balanced appropriately. We do not address this issue here and refer for instance to [19, 60].

Moreover, let us point out that the EIM-error \(\Delta _m^\text {EIM}(u_h^{n,m})\) at \(u_h^{n,m}\) cannot be computed without referring to the full number of degrees of freedom; however, computation of \(\Vert \xi (u_h^{n,m})-\xi _m^\text {EIM}(u_h^{n,m})\Vert _{L^\infty }\) in the full degrees of freedom is still much cheaper than computation of the respective full stiffness-matrices associated with the nonlinear elliptic operator that would be required for the computation of \(r_h^n\). In contrast, note that the \(H^1\)-semi-norm of \(u_h^{n,m}\) required in Theorems 4.1 and 4.3 admits efficient online evaluation, because it is induced by a bilinear form whose matrix w.r.t. the basis of \(V_h^n\) can be precomputed and saved. Similarly, also the weight-matrices for the evaluation of the EIM-reduced residual can be precomputed and saved in the offline-phase.

To conclude this section, we shortly outline a possibility to relax Assumption 3.1 (2) in order to allow error estimation also for a discontinuous in time trajectory.

Remark 4.4

Let \(u_h^{n,m}\) e.g. be given as

$$\begin{aligned} u_h^{n,m}:= \mathbf{1}_{\{0\}} U^{n,m}_{h,0} + \sum _{\ell =1}^{N_t} \mathbf{1}_{(t_{\ell -1},t_\ell ]} U^{n,m}_{h,\ell }, \qquad U^{n,m}_{h,\ell } \in V_h^n, \quad \ell = 0,\ldots ,N_t, \end{aligned}$$

for a partition \(0 = t_0< t_1< \cdots< t_{N_t -1} < t_{N_t} = T\). Such \(u_h^{n,m}\) might be obtained by applying the backward Euler method in its DG0-formulation to (Eq\(_h\)-RB\(_n\)-EIM\(_m\)). Since our error-estimates do not apply directly to \(u_h^{n,m}\) due to discontinuity w.r.t. time, we replace \(u_h^{n,m}\) by its piecewise linear and continuous w.r.t time interpolation \({{\hat{u}}}_h^{n,m}\) w.r.t. the same partition defined by \({{\hat{u}}}^{n,m}_h(t_\ell ) := u_h^{n,m}(t_\ell ) = U_{h,\ell }^{n,m}\) for \(\ell = 0,\ldots ,N_t\). Obviously, Theorems 4.1 and 4.3 apply to \({{\hat{u}}}_h^{n,m}\), and to obtain an estimate for the overall error we need to add the interpolation error \({{\hat{u}}}_h^{n,m}-u_h^{n,m}\). The latter can be computed explicitely:

$$\begin{aligned} \Vert u_h^{n,m}- {{\hat{u}}}_h^{n,m} \Vert _{L^\infty (I,L^2)}^2&\le \max _{1 \le \ell \le N_t} \Vert U_{h,\ell }^{n,m} - U_{h,\ell -1}^{n,m}\Vert _{L^2}^2, \\ \Vert u_h^{n,m}- {{\hat{u}}}_h^{n,m} \Vert _{L^2(I,L^2)}^2&\le \sum _{\ell =1}^{N_t} \frac{1}{3}(t_\ell -t_{\ell -1}) \Vert U_{h,\ell }^{n,m} - U_{h,\ell -1}^{n,m}\Vert _{L^2}^2, \\ \Vert u_h^{n,m}- {{\hat{u}}}_h^{n,m} \Vert _{L^2(I,H^1)}^2&\le \sum _{\ell =1}^{N_t} \frac{1}{3}(t_\ell -t_{\ell -1}) \Vert U_{h,\ell }^{n,m} - U_{h,\ell -1}^{n,m}\Vert _{H^1}^2. \end{aligned}$$

The appearance of such jump-terms is what we may expect for an a-posteriori error for a discontinuous-in-time trajectory. Note that compared to classical a-posteriori error-estimates for discontinuous-in-time methods, see [42, 62] for instance, we do not assume that \(u_h^{n,m}\) is the solution to a discrete-in-time analogue to (Eq\(_h\)-RB\(_n\)).

5 Numerical illustration for POD-MOR

In this final section of the paper we illustrate and compare the quality of our RB-EIM-a-posteriori error-estimates numerically for three prototypical test problems. Although the results of this paper apply to general RB-methods, our particular focus is on POD-MOR. Therefore, we restrict ourselves to reduced ansatz-spaces \(V_h^n\) spanned by a POD-basis of rank n in our numerical tests.

5.1 Test problems and technical details

The two-dimensional domain \(\Omega = [0,1]^2\) and the time interval \(I = [0,1]\) are the same in all three test problems. We fix two discs \(C_1 = B_{\frac{1}{5}}\left( \frac{1}{4},\frac{1}{4}\right) \) and \(C_2 = B_{\frac{1}{5}}\left( \frac{3}{4},\frac{3}{4}\right) \), and the three boundary parts \(\Gamma _1 = \{x \in \partial \Omega \!:x_2 = 1 \}\), \(\Gamma _2 = \{ x \in \partial \Omega \!:x_1 = 0, x_2 < \frac{1}{2}\}\), \(\Gamma _3 = \{ x \in \partial \Omega \!:x_1 = 1, x_2 < \frac{1}{2}\}\). The nonlinearity is given by

$$\begin{aligned} \xi (u) = \frac{3}{4} + \frac{1}{2(1+e^{5u})}. \end{aligned}$$

We introduce the three test problems P1-P3 by equipping the equation

$$\begin{aligned} \partial _t u - \nabla \cdot \xi (u) \nabla u&= 10 \sin (2\pi t) \mathbf{1}_{C_1} - 10 \cos (2\pi t) \mathbf{1}_{C_2}, \end{aligned}$$

with the following boundary and initial conditions, respectively:

(P1) Pure homogeneous Dirichlet boundary conditions and zero initial condition.

(P2) Pure homogeneous Neumann boundary conditions and zero initial condition.

(P3) Mixed boundary conditions: homogeneous Dirichlet boundary condition \(u = 0\) on \(I \times \Gamma _1\), non-homogeneous Neumann conditions \(\xi (u) \partial _n u = \sin (2\pi t)\) on \(I \times \Gamma _2\), and \(\xi (u) \partial _n u = -\cos (2\pi t)\) on \(I \times \Gamma _3\), and natural boundary condition \(\partial _n u = 0\) on the remaining part of the boundary. The initial condition is \([u(0)](x_1,x_2) := \frac{1}{10}(1-x_1)\).

Space- and time-discretization All computations are done utilizing FEniCS [4, 43] and piecewise linear finite elements on a mesh generated by mshr, the mesh-generation tool of FEniCS, with \(N_h = 5769\) degrees of freedom and maximum cell diameter \(h_{\max } \approx 2.1 \cdot 10^{-2}\). The POD-basis is generated with snapshots coming from an (implicit) Crank-Nicolson solution of the equation with \(N_t = 2500\) timesteps (“reference solution”). Hereby, the appearing nonlinear equations are solved by the built-in nonlinear solver of FEniCS. The same set of snapshots is also used to generate the EIM-approximation of the nonlinearity in a standard greedy procedure with \(L^\infty \)-tolerance \(10^{-6}\) independent of the number of POD-basis functions, i.e. we do not balance accuracy of POD- and EIM-approximation. The POD-EIM-reduced equation is again solved utilizing the (implicit) Crank-Nicolson scheme with \(N_t = 2500\) timesteps, whereby the nonlinear algebraic equations appearing in every timestep are solved by a standard Newton-method that is initialized with a semi-implicit Euler step as first guess (“reduced solution”). Approximate true \(L^2(I,L^2)\)-, \(L^\infty (I,L^2)\), and \(L^2(I,H^1)\)-errors are computed with respect to a further numerical solution that is computed on the same finite element mesh, but with a four times higher number of timesteps than for the snapshot generation (“truth-solution”). Finally, to ensure comparability between the different test problems and norms, all errors and estimates are relative errors, i.e. the absolute error or error-estimate is divided by the corresponding norm of the truth-solution.

Estimation of the required parameters Parameters like \(\xi _\bullet , \mu _\bullet , |\xi ' |_\infty \) etc. are known from the problem data. The solution-dependent parameters are found as follows: The norms of \(u_h\) are computed exactly based on the truth-solution in order to give the possibility to determine whether our estimates are sharp or not under the exact data. In real applications we would have to estimates those norms appropriately. The quality of the error-estimates –as absolute values– can heavily deteriorate in case of “safe” (i.e. large) estimates for the parameters. The same might happen in case of just inconvenient problem data due to the exponential terms in the estimates. However, we would like to point out that one might still hope in such a case that the relative behavior of the estimates, i.e. whether they decrease/increase by some factor, provides some information on the quality of the reduced model. Although we compute the EIM-error \(\Delta _m^\text {EIM}\) as defined in Sect. 4.1 by accessing the full number of degrees of freedom, we did not observe significant time consumption for this. We believe that this is due to the fact that evaluation of \(\xi (u_h^{n,m})\) in the full model is much cheaper than assembling the corresponding stiffness-matrices in the full model.

Choice of the exponent p In order to obtain meaningful results we had to use relatively large values for p, e.g. \(p=16\). Therefore, choosing p according to the requirements of [9], i.e. only slightly larger than d in general, seems to be difficult.

Estimates for Approach I (Theorem 4.1) For Approach I we determine the parameters \(\varepsilon , \eta , \delta \) in such a way that the simpler estimates for the \(L^2(I,H^1)\)-error in Corollary 4.2 become optimal, and plug in the same parameters into the estimates from Theorem 4.1. Integrals with respect to time (residuals or weighted residuals in the formulas of Theorem 4.1) are evaluated using Gauss-quadrature of order 2 on every subinterval given by the timesteps.

Estimates for Approach II (Theorem 4.3) The following parameters turned out to be a good choice:

$$\begin{aligned} \eta = \frac{1}{10} (1-\Delta _m^\text {EIM}) \xi _\bullet \mu _\bullet , \qquad \varepsilon = \frac{9}{10} \frac{1-\Delta _m^\text {EIM}}{\mu ^\bullet (2\xi ^\bullet )^{1-\frac{2}{q}} |\xi ' |_\infty ^{\frac{2}{q}} c_p}, \qquad \delta = \frac{\xi _\bullet \mu _\bullet }{\mu ^\bullet }. \end{aligned}$$

Note that optimization of the parameters as in Approach I is not possible because we do not have an explicit formula at hand. The ODE for the evaluation of \(\varphi \) is solved utilizing the backward difference formulae solver (BDF) within the solve_ivp-routine from scipy.integrate, with relative tolerance rtol=\(10^{-6}\), and absolute tolerance atol=\(10^{-3} \cdot \Vert u_h(0) - u_h^n(0)\Vert _{L^2}^2\). The maximal allowed step size is the same as the size of timesteps in the reduced model. We found that among other methods (Runge-Kutte with 2/3 and 4/5 stages, Radau) this choice delivered the best results. However, it is clear that the numerical approximation of \(\varphi \) is challenging (in particular for small p or small initial values), which might influence the reliability of the results.

5.2 Discussion of the results

Figures 1, 2, 3 and 4 show the results of our experiments. It can be seen that Approach I yields better results the smoother the truth-solution is: Test problems P1 and P2 (homogeneous boundary conditions, Figs. 1 and 2) perform better than the problem with mixed boundary conditions (Test problem P3, Fig. 3). Moreover, we observe that the a-posteriori error-estimates of both approaches start stagnating at about the same point at which also the true errors stagnate due to time-discretization. Indeed, in Fig. 4 it can be seen that this stagnation comes from stagnation of the residual norm at roughly the same magnitude as the size of time-steps. This indicates that from that point on the overall accuracy of the reduced-order model cannot be improved further by increasing the number of basis functions. See also e.g. [28] for balancing of POD-MOR- and time-discretization-errors for linear-quadratic parabolic optimal control problems.

Fig. 1
figure 1

Test problem P1 (homogeneous Dirichlet boundary conditions): a) Estimates from Approach I (\(\bullet \): Theorem 4.1 with optimized parameters, dashed lines: Corollary 4.2). b) Estimates from Approach II (\(\blacktriangle \): \(p=6\), \(\blacksquare \): \(p=16\), \(\blacklozenge \), \(p=32\)). \(L^\infty (I,L^2)\)-, \(L^2(I,L^2)\)-, and \(L^2(I,H^1)\)-errors are displayed in black, blue, and red, respectively. Approximate true errors w.r.t. the truth-solution are included in dotted lines

Fig. 2
figure 2

Example P2 (homogeneous Neumann boundary conditions): a) Estimates from Approach I (\(\bullet \): Theorem 4.1 with optimized parameters, dashed lines: Corollary 4.2). b) Estimates from Approach II for \(p=16\). \(L^\infty (I,L^2)\)-, \(L^2(I,L^2)\)-, and \(L^2(I,H^1)\)-errors are displayed in black, blue, and red, respectively. Approximate true errors w.r.t. the truth-solution are included in dotted lines

Fig. 3
figure 3

Example P3 (mixed boundary conditions): a Estimates from Approach I (\(\bullet \): Theorem 4.1 with optimized parameters, dashed lines: Corollary 4.2). b Estimates from Approach II for \(p=16\). \(L^\infty (I,L^2)\)-, \(L^2(I,L^2)\)-, and \(L^2(I,H^1)\)-errors are displayed in black, blue, and red, respectively. Approximate true errors w.r.t. the truth-solution are included in dotted lines

Fig. 4
figure 4

Error contributions in Example P3 (mixed boundary conditions): blue \(\blacksquare \): residual norms \(\Vert r_h^n\Vert _{L^2(I,V_h^*)}\), black \(\bullet \): initial errors \(\Vert e_h^n(0) \Vert _{L^2}\), red \(\blacklozenge \): EIM-errors \(\Delta ^{\text {EIM}}\) for \(u_h^n\)

How much Approach II depends on the choice of the exponent p can be seen in Fig. 2b. The estimates stagnate very early for small p, i.e. Approach II unfortunately does not yield reasonable results in that case. For large p the estimates seem to get closer to the values of Approach I. In this sense one might interpret Approach II as a modification of Approach I that trades strength of the required assumption (bigger p means stronger assumption) against quality of results (smaller p means less meaningful results and numerical instability).

For the computing times observed in our numerical experiments we refer to Table 1: The evaluation of the POD-EIM-reduced model is about 25- to 100-times faster than the evaluation of the full model. We believe that even higher speedups might be possible in case of finer finite element discretization. Compared to the computing time for the full model, evaluation of the a-posteriori error-estimates from Approach I is quite cheap: Evaluation of the POD-EIM-reduced model together with computation of an error-estimate still yields a speedup of factor at least 10. As expected, evaluation of the estimates from Approach II needs slightly more time.

Table 1 Computing times for the setup of the EIM-reduction of the nonlinearity, the evaluation of the POD-EIM-reduced model, and the error-estimates, respectively