1 Introduction

Modeling, predicting, and control of fracture or damage in solid materials are of great technical importance for the safety requirements of structures in various fields of engineering, e.g., automobile components, aerospace, and marine industries. Therefore, developing a comprehensive model of fracture propagation has long been a challenge in physics, mechanics, and material sciences (Lawn 1993; Marder and Fineberg 1996). The classical method for modeling the fracture propagation is to consider a sharp interface in order to separate the structure explicitly into a fully broken part and a fully intact one. This approach implies tracking the exact position of the interface to be able to follow the propagation of the fracture. Therefore, in finite element settings for fracture description, the numerical implementation requires handling of the discontinuities. To overcome the problem of explicit interface tracking, the phase-field method, going back to Ambrosio and Tortorelli (1990), is now widely used for the description of fracture phenomena. This method is also attractive because of the ability of simulating the fracture initiation, propagation, merging, and branching. The phase-field approach to model the fracture, as a two-phase discontinuous model with a sharp interface, consists of introducing a continuous phase-field variable in order to approximate the sharp fracture discontinuity. The phase-field variable smoothly differentiates between the two phases. In fact, the fracture phase-field represents the smooth transition from the fully destroyed phase to the fully intact part. The fracture propagation is tracked by the evolution of the phase field.

In this paper, our linearized model is based upon classical Griffith’s theory of fracture (Griffith 1921) which was rewritten as a variational model in Francfort and Marigo (1998). For some overview and summary of the obtained results, see, e.g., Bourdin et al. (2008), Miehe et al. (2010), Ambati et al. (2015). As the variational inequality, resulting from the fracture irreversibility, is sometimes undesirable when working in optimization, we allow for a regularization of the irreversibility analyzed in Neitzel et al. (2017, 2019).

The approximation error analysis for finite element simulations of fractures is only considered in forward simulations. See, e.g., Negri (2003) for fracture propagation without a phase-field, or Burke et al. (2010, 2013) for a phase-field fracture model. However, even in this literature only qualitative convergence has been shown. To the best of our knowledge, no quantitative convergence explicitly clarifying the dependence of the convergence speed on the mesh-size and the problem data can be found in studies on FE-analysis of fracture propagation. Since in optimization problems the problem data vary, this quantitative dependence is crucial in the discretization error analysis of optimization problems.

To deal with this lack of analysis, we provide analysis for a linearized fracture model considered within an optimization problem. While this equation is linear, it does not correspond to a positive definite bilinear form. Furthermore, the regularity in the linear equation is severely limited, due to the non-smooth coefficients induced by the known, low, regularity of the nonlinear fracture problem. These two points are in contrast to the usual assumptions made in the discretization error analysis of optimization problems. In fact, known regularity results for the coefficients allow for \(W^{1,p}\) regularity of the solutions only, thereby prohibiting quantitative rates of convergence in \(W^{1,2}\) as would be needed for the standard error analysis of elliptic optimization problems, compare, e.g., Falk (1973), Geveci (1979), Malanowski (1982), Arada et al. (2002), Rösch (2006), Meyer and Rösch (2004), Meyer and Rösch (2005), Hinze (2005) to name only a few.

To circumvent the problems coming from indefinite bilinear forms, we will utilize an approach proposed by Schatz (1974), to show that if the continuous linearized fracture model admits a unique solution, the discretized equation does so too, for sufficiently small mesh size. We can then expand the technique of Gaspoz et al. (2019), to assert that the same asymptotic error estimates hold for the adjoint problem as well.

Finally, the lack in regularity can be avoided, as Haller-Dintelmann et al. (2019) have shown that a slightly improved differentiability may be assumed for solutions to fracture problems. This will be crucial for our work in obtaining quantitative estimates for the discretization error.

While we do not tackle the nonlinear fracture problem, the obtained discretization errors can then be utilized in SQP-type methods applied to the control of nonlinear fracture problems to efficiently couple discretization error and progress in the optimization variable, see, e.g., Ziems and Ulbrich (2011). It should be noted that we consider a time-discrete damage model, only. For the considered quasi-static fracture model, the time evolution is not smooth, in general, and thus typical convergence estimates, for the forward evolution of the fracture, provide only weak convergence in space-time, see, e.g., Bartels et al. (2018). It is subject to future research to see, if concepts such as BV-solutions, see, e.g., Knees (2019), provide a suitable framework for an analogous error analysis with respect to the time discretization.

This paper is organized as follows. In Sect. 2, we review the linearized model of the fracture control problem which was discussed in Neitzel et al. (2017). Section 3 contains the finite element discretization of the considered model, and an priori error estimate for the discretized model. This section is subdivided, for better clarity, into three parts.

First, in Sect. 3.1, we consider the case when the linearized equation is an isomorphism, but the corresponding bilinear form is not positive definite. Based upon an approach by Schatz (1974), we will utilize compactness to show that for sufficiently small mesh sizes the discretized equation remains an isomorphism, and that the error satisfies a quasi-best approximation property. Second, based upon this result, and an improved differentiability result by Haller-Dintelmann et al. (2019), we can utilize standard techniques to derive a posteriori error estimates for the optimization problem using a discretization approach suggested by Hinze (2005) in Sect. 3.2. Finally, in Sect. 3.3, we will extend a new technique, developed in Gaspoz et al. (2019), to show that a similar estimate also holds for the adjoint variable, which will also provide a means for obtaining improved error estimates. We will place particular emphasis on the stability of the estimates with respect to variations of the linearization point as it is needed for the inexact iterative solution of a corresponding nonlinear optimization problem via, e.g., an SQP algorithm.

Section 4 presents the numerical test highlighting the reduced rates of convergence compared to the standard setting where smooth coefficients are considered.

Throughout, c denotes a generic constant which may be different at each instance.

2 The linearized fracture control problem

Let \(\varOmega \subset {{\mathbb {R}}}^2\) be a bounded polygonal domain, with boundary \(\partial \varOmega\) consisting of \(\varGamma _D\) and \(\varGamma _N\) with

$$\begin{aligned} {\mathscr {H}}^{1}(\varGamma _D) \ne 0 \quad \text {and } \quad {\mathscr {H}}^{1}(\varGamma _N) \ne 0, \end{aligned}$$

where \({\mathscr {H}}^{1}\) is the 1-dimensional Hausdorff-measure, and \(\varGamma _D\) and \(\varGamma _N\) are the parts where Dirichlet and Neumann boundary conditions are imposed, respectively. We assume that \(\varOmega \cup \varGamma _N\) is regular in the sense of Gröger (1989), and the fracture propagation is controlled by the traction force q acting on the boundary \(\varGamma _N\).

By u we represent the vector-valued displacement field, in the space of admissible displacements \(H^1_D(\varOmega ; {{\mathbb {R}}}^2):=\{v \in H^1(\varOmega ; {{\mathbb {R}}}^2) \, | \, v=0 \, \text {on} \; \varGamma _D\}\). The usual \(L^2(\varOmega )\)-scalar product, and the corresponding norm, are denoted by \((\cdot , \cdot )\) and \(\Vert \cdot \Vert\) respectively. We use appropriate subscripts 1, p, or r in the norms in corresponding Sobolev spaces \(W^{1,p}(\varOmega )\) or \(H^r(\varOmega ) = W^{r,2}(\varOmega )\). The \(L^2(\varGamma _N;{{\mathbb {R}}}^2)\)-norm and the associated scalar product will be indicated by a subscript \(\varGamma _N\).

Following Neitzel et al. (2017), the fracture \({\mathscr {C}}\) is initially modeled by Griffith’s criterion for brittle fracture, which assumes the fracture propagates when the elastic energy restitution rate reaches its critical value \(G_c\). It is then regularized by a phase-field approach. The phase-field variable \(\varphi\) represents the fracture region by \(\varphi =0\) and the non-fractured part by \(\varphi =1\). The values in between, \(0<\varphi <1\), correspond to a transition zone with width \(\varepsilon\) on each side of the fracture path. The problem is then to find u(t) and \(\varphi (t)\) minimizing the energy of the system subject to the irreversibility constraint

$$\begin{aligned} \varphi (t_2) \le \varphi (t_1) \quad \forall \, t_1 \le t_2. \end{aligned}$$

After introducing a time partition, the time evolution of the fracture is given by a sequence of problems associated to each time step. As the error estimate, considered here, remains invariant for any time level, we ignore the time discretization, and provide the argument only for one time step.

In order to avoid degeneracy in the elastic energy, the model is further regularized by the parameter \(\kappa >0\), \(\kappa \ll \varepsilon\), and the coefficient function \(g(\varphi )\). To guarantee the irreversibility of the fracture as well as the differentiability, the regularized fracture model is relaxed by a penalization term with some positive factor \(\gamma\). Letting \({\mathbb {C}}\) represent the elasticity tensor, and \(e(u)=\frac{1}{2}(\nabla u + \nabla u^T)\) the symmetric gradient, the fracture model presented in Neitzel et al. (2017) asserts that any energy minimizer \(\mathbf{u}= (u, \varphi )\) satisfies the Euler–Lagrange equations

$$\begin{aligned} \begin{aligned}&\Bigl (g(\varphi ){\mathbb {C}} e(u),e(v)\Bigr ) - (q,v)_{\varGamma _N} = 0\\&G_c\varepsilon (\nabla \varphi ,\nabla \psi ) - \frac{G_c}{\varepsilon } (1-\varphi ,\psi )+ (1-\kappa )( \varphi {\mathbb {C}} e(u):e(u), \psi )\\&\quad + \gamma ([(\varphi -\varphi ^{0})^+]^3,\psi )=0 \end{aligned} \end{aligned}$$
(1)

for a given \(\varphi ^0 \in H^1(\varOmega )\), \(0 \le \varphi ^0 \le 1\), a given \(q \in Q\), and any \((v,\psi ) \in V\). Here

$$\begin{aligned} Q := L^2(\varGamma _N;{{\mathbb {R}}}^2), \quad \quad V := H^1_D(\varOmega ;{{\mathbb {R}}}^2)\times H^1(\varOmega ) = V_u \times V_\varphi , \end{aligned}$$

and the coefficient function is given by

$$\begin{aligned} g(\varphi ):=(1-\kappa )\varphi ^2 + \kappa . \end{aligned}$$

It is further shown in Neitzel et al. (2017) that there exists at least one solution \(\mathbf{u}=(u,\varphi )\) of (1) in V, while any solution of (1) satisfies the additional regularity

$$\begin{aligned} \mathbf{u}\in W^{1,p}(\varOmega ;\,{{\mathbb {R}}}^2) \times L^\infty (\varOmega ) \end{aligned}$$

for some \(p>2\), depending only on \(\kappa\) and \(\varOmega\). More precisely, it holds \(0 \le \varphi \le 1\), and there exists a constant \(c_\kappa\) depending on \(\kappa\) such that

$$\begin{aligned} \Vert u\Vert _{1,p} \le c_\kappa \Vert q\Vert _{\varGamma _N}. \end{aligned}$$

Very recently, a higher regularity of the solution is derived in Haller-Dintelmann et al. (2019). Based on this, noting that \(\varOmega\) is \(W^{2,p/2}\) regular for sufficiently small \(p > 2\), see, e.g., Grisvard (1985, Theorem 4.3.2.4), and assuming \(\varphi ^0 \in W^{2,p/2}(\varOmega )\), Neitzel et al. (2019) provides the \(\gamma\) independent estimate

$$\begin{aligned} \Vert u\Vert _{1+s} \le c(\Vert q\Vert _{\varGamma _N}, \varepsilon ) \end{aligned}$$
(2)

with a sufficiently small positive s, depending only on \(\kappa\), \(\varOmega\).

Since the fracture is modeled in order to finally propagate subject to an optimal control, it is required to provide an appropriate means for discussing first order necessary optimality conditions, as well as the potential approximation of the nonlinear optimization problem by a sequence of linear-quadratic problems. Therefore, the model is then linearized at a given point \((q_k,\mathbf{u}_k)=(q_k,u_k,\varphi _k)\). The discussion above shows that we can assume the regularity of this point, which we fix in the following:

Assumption 1

We assume the existence of constants \(s\in (0,1/2)\) and \(p>2\) and C such that

$$\begin{aligned} (q_k,\mathbf{u}_k)&=(q_k,u_k,\varphi _k) \in Q \times {\varvec{W}} \\&:= Q \times (V \cap (H^{1+s}(\varOmega ;{{\mathbb {R}}}^2) \cap W^{1,p}(\varOmega ;{{\mathbb {R}}}^2) \times W^{1,p}(\varOmega ))) \end{aligned}$$

with

$$\begin{aligned} \Vert q_k\Vert _{\varGamma _N} , \Vert u_k\Vert _{1+s}, \Vert u_k\Vert _{1,p}, \Vert \varphi _k\Vert _{1,p} \le C. \end{aligned}$$

Further, we split \({\varvec{W}} = {\varvec{W}}_u \times {\varvec{W}}_\varphi\) and we assume that the linearized operator A given by (3), below, has trivial kernel.

Then the linearized fracture model at point \((q_k,\mathbf{u}_k) \in Q \times {\varvec{W}}\) reads as follows. For given \(q \in Q\) and \(\varphi ^0 \in {\varvec{W}}_\varphi\), \(0 \le \varphi ^0 \le 1\), find \(\mathbf{u}= (u,\varphi ) \in V\) such that for any \((v,\psi ) \in V\)

$$\begin{aligned} \begin{aligned}&\Bigl ( g(\varphi _k) {\mathbb {C}} e(u),e(v)\Bigr ) +2(1-\kappa )(\varphi _k{\mathbb {C}} e(u_k)\varphi ,e(v)) =(q,v)_{\varGamma _N}\\&G_c\varepsilon (\nabla \varphi ,\nabla \psi ) +\frac{G_c}{\varepsilon } (\varphi ,\psi ) + (1-\kappa )( \varphi {\mathbb {C}} e(u_k):e(u_k), \psi )\\&\quad + 3\gamma ([(\varphi _k-\varphi ^{0})^+]^2\varphi ,\psi ) + 2(1-\kappa )( \varphi _k {\mathbb {C}} e(u_k):e(u), \psi ) = 0. \end{aligned} \end{aligned}$$
(3)

Clearly, (3) defines a bilinear form \(a: V \times V \rightarrow {{\mathbb {R}}}\), and a corresponding linear operator \(A:V\rightarrow V^*\), where \(V^*\) denotes the dual space of V. Defining further the compact operator \(B:Q \rightarrow V^*\) by

$$\begin{aligned} \langle B q,(v,\psi )\rangle _{V^*,V} := (q,v)_{\varGamma _N} \end{aligned}$$
(4)

for all \((v, \psi ) \in V\), the linearized Euler–Lagrange equations (3) can be expressed as

$$\begin{aligned} A \mathbf{u}= B q. \end{aligned}$$
(5)

It is worthwhile to mention that the operator \(A:V\rightarrow V^*\) satisfies a Gårding inequality, see Neitzel et al. (2017),

Lemma 1

The variational form \(a(\cdot , \cdot )\) is continuous on \(V \times V\), and satisfies a Gårding-like inequality. Namely, there exist constants \(c_c, c_1, c_2\) depending only on C in Assumption 1, and some \(r \in (0,1)\) such that

$$\begin{aligned} a(\mathbf{u};\mathbf{v}) \le c_c \Vert \mathbf{u}\Vert _V \Vert \mathbf{v}\Vert _V, \quad \forall \, \mathbf{u},\mathbf{v}\in V, \end{aligned}$$

and

$$\begin{aligned} a(\mathbf{v};\mathbf{v}) \ge c_1 \Vert \mathbf{v}\Vert ^2_{V} - c_2 \Vert \psi \Vert ^2_{r}, \quad \forall \, \mathbf{v}= (v,\psi ) \in V. \end{aligned}$$

With this, we consider the following optimal control problem for fracture propagation, where the displacement u is forced to be as close as possible to a desired displacement \(u^{\mathrm{d}} \in L^2(\varOmega )\), by the action of the control variable q.

Find \((q,\mathbf{u}) = (q,(u,\varphi ))\in (Q\times V)\) solving

$$\begin{aligned} \begin{aligned} \min _{q,\mathbf{u}} \;J(q,\mathbf{u}) :&= \frac{1}{2}\Vert u-u_{\mathrm{d}} \Vert ^2 + \frac{\alpha }{2}\Vert q\Vert ^2_{\varGamma _N}\\ \text {s.t.} \; A \mathbf{u}&= B q, \end{aligned} \end{aligned}$$
(6)

in which the parameter \(\alpha >0\) scales the cost of the control.

According to Neitzel et al. (2017), the problem (6) admits a unique solution. In contrast to standard analysis for (6), even if it is assumed that A is an isomorphism, A is usually not coercive, cf. Lemma 1. Based on Haller-Dintelmann et al. (2019), Assumption 1 asserts the regularity \(\mathbf{u}\in H^{1+s}\) and the bound

$$\begin{aligned} \Vert \mathbf{u}\Vert _{1+s} \le c(\mathbf{u}_k) \Vert q\Vert _Q, \end{aligned}$$
(7)

with the same s as in Assumption 1, for any solution \(\mathbf{u}\) with data q to (5). In fact, the constant \(c(\mathbf{u}_k)\) depends only on \(\Vert u_k\Vert _{1+s}\) and \(\Vert \varphi _k\Vert _{1,p}\).

Remark 1

Before we continue, let us discuss Assumption 1.

First, problem (6) can be thought of as a model for a QP approximation to a nonlinear optimization problem involving (1) as constraint. The assumed regularity \((q_k,u_k) \in Q\times W\) can be satisfied, if minimizing sequences \((q_k,u_k)\) for this problem satisfy \(\Vert q_k\Vert _{\varGamma _N} \le C\). This property is usually ensured, as this boundedness is crucial when showing existence of solutions to minimization problems by the direct method in the calculus of variations. The regularity of \(u_k\) and \(\varphi _k\) is then natural due to the regularity estimate in Haller-Dintelmann et al. (2019, Section 7). By the method of the proof for this regularity, it is natural to assume that the linear operator is \(H^{1+s}\) regular with the same s. By the methods used in Haller-Dintelmann et al. (2019) the number s obtained will always be smaller than 1/2, although more regularity could be possible. As \(s < 1/2\) conveniently fits to the embedding \(L^2(\varGamma _N;{{\mathbb {R}}}^2) \subset H^{-1+s}(\varOmega ;{{\mathbb {R}}}^2)\), we fix \(s < 1/2\) for a more convenient notation.

Second, for the same optimization problem, the assumption that A has trivial kernel at least in the considered local solution as linearization point \((q_k,\mathbf{u}_k)\) is a constraint qualification and asserts that KKT conditions are necessary at this solution. Since the set of invertible operators is open, it is natural to assume the same close to the solution.

3 A priori finite element error estimate

This section is devoted to discretization and the derivation of corresponding error estimates for equation (5). We consider a conforming finite element method (FEM) to discretize the problem (6) in space. Let \(\{{\mathscr {T}}_h \}\) be a sequence of meshes with mesh size \(h>0\), \(h \rightarrow 0\). The mesh \({\mathscr {T}}_h\) consists of open rectangles T which provide a decomposition of \({{\overline{\varOmega }}}\), that is

$$\begin{aligned} {{\overline{\varOmega }}} = \bigcup _{T \in {\mathscr {T}}_h} {{\overline{T}}}, \end{aligned}$$

such that the mesh matches the splitting of the boundary into \(\varGamma _D\) and \(\varGamma _N\). The mesh size h is defined by \(h:= \max _{T \in {\mathscr {T}}_h} \text {diam} (T)\), and \({\mathscr {T}}_h\) satisfies the standard quasi-uniform mesh properties in the sense of Brenner and Scott (2008). With this setting, we consider a conforming finite dimensional space \(V_h \subset V\), with piecewise bilinear test- and ansatz functions, over the decomposed domain \({\mathscr {T}}_h\). We start by considering the linearized phase-field model for finding \(\mathbf{u}= (u,\varphi ) \in V\) solving

$$\begin{aligned} a(\mathbf{u};\mathbf{v})=(q,v)_{\varGamma _N} \quad \forall \mathbf{v}= (v,\psi ) \in V. \end{aligned}$$
(8)

Its finite element approximation \(\mathbf{u}_h=(u_h,\varphi _h) \in V_h\) is then given as usual by solving

$$\begin{aligned} a(\mathbf{u}_h;\mathbf{v}_h)=(q,v_h)_{\varGamma _N} \quad \forall \mathbf{v}_h = (v_h,\psi _h)\in V_h. \end{aligned}$$
(9)

3.1 Forward problem

In this section, we first provide the error analysis for finite element approximations of the state variable \(\mathbf{u}= (u, \varphi ) \in V\) for fixed given \(q \in Q\). To this end, let \(I_h: H^{1 + s} \rightarrow V_h\) be an interpolation operator satisfying the interpolation error estimate

$$\begin{aligned} \Vert w-I_h w \Vert _{V} \le c_I h ^{s} \Vert w\Vert _{1+s} \end{aligned}$$

for any \(w \in H^{1+s}\). Taking this into account, we infer the following theorem.

Theorem 1

Let Assumption 1hold. Then there are constants \(h_0\) and c, such that for all \(h \le h_0\), the discretized PDE (9) admits a unique solution, and the solutions \(\mathbf{u}\in V\) and \(\mathbf{u}_h \in V_h\) of the PDE (8) and its discretization (9) satisfy the following quasi-best approximation property

$$\begin{aligned} \Vert \mathbf{u}- \mathbf{u}_h \Vert _V \le c \inf _{\mathbf{v}\in V_h} \Vert \mathbf{u}- \mathbf{v}\Vert _V. \end{aligned}$$

Moreover, \(h_0\) and c are independent of the linearization point, and only depend on C in Assumption 1.

Proof

Following the technique by Schatz (1974), see also Brenner and Scott (2008), we first show that any solution of the problem (9), if any exists, satisfies the quasi-best approximation error estimate. Next, with the help of the obtained estimation result, we provide the argument for existence of a unique solution to (9). To this end, let us assume that \(\mathbf{u}_h\) is such a solution. Furthermore, for compactness of notation, let us denote the error by

$$\begin{aligned} e_{\mathbf{u}} = (e_u, e_{\varphi }) := \mathbf{u}- \mathbf{u}_h = (u-u_h, \varphi - \varphi _h). \end{aligned}$$

Based on Lemma 1, we have

$$\begin{aligned} c_1 \Vert \mathbf{v}\Vert _V^2 \le a(\mathbf{v}; \mathbf{v}) + c_2 \Vert \psi \Vert ^2_r, \quad \forall \, \mathbf{v}= (v, \psi ) \in V, \end{aligned}$$

for some \(r \in (0,1)\). Letting \(\mathbf{v}= e_{\mathbf{u}}\) in the inequality above, based on the Galerkin orthogonality and continuity of the bilinear form a, we obtain the following for all \(\mathbf{v}_h \in V_h\).

$$\begin{aligned} \begin{aligned} c_1 \Vert e_{\mathbf{u}} \Vert _V^2&\le a(e_{\mathbf{u}}; e_{\mathbf{u}}) + c_2 \Vert e_{\varphi } \Vert ^2_r \\&= a(e_{\mathbf{u}}; \mathbf{u}- \mathbf{v}_h) + c_2 \Vert e_{\varphi } \Vert ^2_r \\&\le c_c \Vert e_{\mathbf{u}}\Vert _V \Vert \mathbf{u}- \mathbf{v}_h\Vert _V + c_2 \Vert e_{\varphi } \Vert ^2_r. \end{aligned} \end{aligned}$$
(10)

Next, we consider that since \(A : V \rightarrow V^*\) is an isomorphism, where

$$\begin{aligned} \left\langle A \mathbf{u}; \mathbf{v}\right\rangle = a(\mathbf{u}; \mathbf{v}), \end{aligned}$$

the mapping \(A^* : V^* \rightarrow V\) is an isomorphism too.

Noting that

$$\begin{aligned} (v,\psi ) \mapsto \left( \frac{e_{u}}{\Vert e_{u}\Vert _r} , v \right) _r + \left( \frac{e_{\varphi }}{\Vert e_{\varphi }\Vert _r} , \psi \right) _r \end{aligned}$$

is an element in \((H^r(\varOmega ;{{\mathbb {R}}}^2)\times H^r(\varOmega ))^*\), and considering the fact that \((H^r(\varOmega ;{{\mathbb {R}}}^2)\times H^r(\varOmega ))^*\) is embedded in \(V^*\), we observe that the adjoint equation

$$\begin{aligned} a(\mathbf{v}; {\varvec{\lambda }}) = \left\langle A^* {\varvec{\lambda }}; \mathbf{v}\right\rangle = \left( \frac{e_{u}}{\Vert e_{u}\Vert _r} , v \right) _r + \left( \frac{e_{\varphi }}{\Vert e_{\varphi }\Vert _r} , \psi \right) _r \end{aligned}$$
(11)

has a unique solution \({\varvec{\lambda }}= (\lambda _u,\lambda _\varphi )\) in V.

Without loss of generality, let \(s >0\) in Assumption 1 be such that \(r \le 1-s\). Then

$$\begin{aligned} \left( \frac{e_{u}}{\Vert e_{u}\Vert _r} , v \right) _r + \left( \frac{e_{\varphi }}{\Vert e_{\varphi }\Vert _r} , \psi \right) _r&\le \Vert v\Vert _r + \Vert \psi \Vert _r \\&\le c (\Vert v \Vert _{1-s} + \Vert \psi \Vert _{1-s}), \end{aligned}$$

which implies that the right hand side of equation (11) is an element of \((H^{1-s}(\varOmega ;{{\mathbb {R}}}^2)\times H^{1-s}(\varOmega ))^* = H^{-1 + s}(\varOmega ;{{\mathbb {R}}}^2)\times H^{-1+s}(\varOmega )\). Therefore, by elliptic regularity for A, the solution \({\varvec{\lambda }}\) of the adjoint equation (11) belongs also to the same space \(H^{1 + s}\), with \(\Vert {\varvec{\lambda }}\Vert _{1+s}\le c_\lambda\). That is, \({\varvec{\lambda }}\in H^1_D \cap H^{1 + s}\).

Now, we can employ the Aubin-Nitsche duality argument, along with the Galerkin orthogonality and the continuity of the bilinear form a, to obtain for \({\varvec{\lambda }}_h = I_h \lambda \in V_h\),

$$\begin{aligned} \Vert e_u \Vert _r + \Vert e_\varphi \Vert _r&= \left( \frac{e_{u}}{\Vert e_{u}\Vert _r} , e_u \right) _r + \left( \frac{e_{\varphi }}{\Vert e_{\varphi }\Vert _r} , e_\varphi \right) _r \\&= a(e_\mathbf{u}; {\varvec{\lambda }}) \\&= a (e_\mathbf{u}; {\varvec{\lambda }}- {\varvec{\lambda }}_h) \\&\le c_c \Vert e_\mathbf{u}\Vert _{V} \Vert {\varvec{\lambda }}- {\varvec{\lambda }}_h \Vert _{V} \\&\le c_c c_I h^s \Vert e_\mathbf{u}\Vert _{V} \Vert {\varvec{\lambda }}\Vert _{1+s} \\&\le c_c c_I c_\lambda h^s \Vert e_\mathbf{u}\Vert _{V}, \end{aligned}$$

using the previously defined interpolation operator to bound the best approximation error. Consequently, we get

$$\begin{aligned} \Vert e_\varphi \Vert _r \le c_0 h^s \Vert e_\mathbf{u}\Vert _{V}, \end{aligned}$$
(12)

with \(c_0=c_c c_I c_\lambda\). Combining (10) and (12) we obtain

$$\begin{aligned} c_1 \Vert e_\mathbf{u}\Vert _V \le c_c \Vert \mathbf{u}- \mathbf{v}\Vert _V + c_0 c_2 h^s \Vert e_\mathbf{u}\Vert _V, \end{aligned}$$
(13)

which implies that for \(h \le h_0\), where \(h_0=\frac{1}{2} ( \frac{c_1}{c_0 c_2})^{1/s}\), the following desired quasi-best approximation property holds:

$$\begin{aligned} \Vert \mathbf{u}- \mathbf{u}_h \Vert _V \le \frac{2c_c}{c_1} \inf _{\mathbf{v}\in V_h} \Vert \mathbf{u}- \mathbf{v}\Vert _V. \end{aligned}$$
(14)

To complete the proof, it is left to show the existence of \(\mathbf{u}_h\) as a unique solution to (9), when \(h \le h_0\). Since (9) describes a finite dimensional linear system, it suffices to show that the bilinear form

$$\begin{aligned} a(\cdot ;\cdot ) :V_h\times V_h \rightarrow {{\mathbb {R}}}\end{aligned}$$

has a trivial kernel for \(h\le h_0\). This is clear by first noting that \(q=0\) in (8) implies \(\mathbf{u}=0\), since \(A: V \rightarrow V^*\) is an isomorphism. Then considering the error estimate (14) which allows us to conclude that \(q=0\) in (9) implies \(\mathbf{u}_h=0 \in V_h\) for \(h \le h_0\).

Since \(h_0\) and c only depend on the constants \(c_1\), \(c_2\), \(c_3\), and \(c_c\) the independence of \(h_0\) and c from the linearization point follows by Lemma 1. \(\square\)

Consequently, we obtain the following quantitative convergence rate, which will be used to derive further estimates in what follows.

Corollary 1

Let Assumption 1be satisfied, \(q \in Q\) be arbitrary, and \(\mathbf{u}\in V\) be the corresponding solution to the PDE (8). Then there exist positive constants \(h_0\), c and s such that, the solution \(\mathbf{u}_h \in V_h\) to (9) exists for all \(h \le h_0\) and the discretization error satisfies

$$\begin{aligned} \Vert \mathbf{u}-\mathbf{u}_h \Vert _V \le c h^{s} \Vert q\Vert _{\varGamma _N} \end{aligned}$$

where \(h_0\) and c can be bounded by Assumption 1and s is as in Assumption 1provided the value of s is chosen in accordance with Remark 1as the \(H^{1+s}\) regularity of A.

Proof

This is an immediate consequence of combining the regularity estimate (7) with the quasi-best approximation of Theorem 1. \(\square\)

3.2 The control problem

The result obtained in Theorem 1 provides a means to estimate the error in approximating the solution \((q,\mathbf{u})\) of the optimal control problem (6) by a conforming finite element method. Following the idea of Hinze (2005), the control space Q does not need to be discretized as the optimality conditions imply a variational discretization of Q. Let us consider the variational form of the optimization problem

$$\begin{aligned} \begin{aligned} \min _{q,\mathbf{u}} \;J(q,\mathbf{u})&= \frac{1}{2}\Vert u-u^{\mathrm{d}}\Vert ^2 + \frac{\alpha }{2}\Vert q\Vert ^2_{\varGamma _N}\\ \text {s.t.} \quad a(\mathbf{u};\mathbf{v})&=(q,v)_{\varGamma _N} \quad \forall \, \mathbf{v}=(v,\psi ) \in V, \end{aligned} \end{aligned}$$
(15)

and the corresponding discretized model

$$\begin{aligned} \begin{aligned} \min _{q_h,\mathbf{u}_h} \;J(q_h,\mathbf{u}_h)&= \frac{1}{2}\Vert u_h-u^{\mathrm{d}}\Vert ^2 + \frac{\alpha }{2}\Vert q_h\Vert ^2_{\varGamma _N}\\ \text {s.t.} \quad a(\mathbf{u}_h;\mathbf{v}_h)&=(q_h,v_h)_{\varGamma _N} \quad \forall \, \mathbf{v}_h=(v_h,\varphi _h) \in V_h. \end{aligned} \end{aligned}$$
(16)

The error estimate can now be derived.

Theorem 2

Let \(({{\bar{q}}}, {{\bar{\mathbf{u}}}})=({{\bar{q}}}, ({{\bar{u}}}, {{\bar{\varphi }}}))\) be the solution to the problem (15), and \(({{\bar{q}}}_h, {{\bar{\mathbf{u}}}}_h) = ({{\bar{q}}}_h, ({{\bar{u}}}_h, {{\bar{\varphi }}}_h))\) be the solution to the problem (16), with \(h \le h_0\); \(h_0\) being the constant introduced in Theorem 1. Then we have the following estimate, for some positive c and s from Assumption 1,

$$\begin{aligned} \alpha \Vert {{\bar{q}}} - {{\bar{q}}}_h \Vert _{\varGamma _N}^2 + \Vert {{\bar{u}}} - {{\bar{u}}}_h \Vert ^2 \le c \left( 1+\frac{1}{\alpha }\right) h^{2 s}. \end{aligned}$$

Proof

With most of the work done in Theorem 1 the proof is now standard. We recall that the variational problem (3) can also be written in operator form (5). Since A is an isomorphism, by \(A \mathbf{u}= B q\) we can define the solution operator \(S \in {\mathscr {L}} (Q,L^2(\varOmega ;{{\mathbb {R}}}^2))\), \(S=A^{-1} B\), such that \(\mathbf{u}= Sq\). Analogously, the discrete operator \(S_h \in {\mathscr {L}} (Q,L^2(\varOmega ;{{\mathbb {R}}}^2))\) is defined corresponding to (9). As the phase-field variable \(\varphi\) does not play a role directly in the objective function J, we may ignore \(\varphi\), and consider \(u=Sq\) and \(u_h=S_hq_h\) to construct the reduced objective functions

$$\begin{aligned} j(q):=\frac{1}{2}\Vert Sq-u^{\mathrm{d}}\Vert ^2 + \frac{\alpha }{2}\Vert q\Vert ^2_{\varGamma _N} \end{aligned}$$

and

$$\begin{aligned} j(q_h):=\frac{1}{2}\Vert S_h q_h-u^{\mathrm{d}}\Vert ^2 + \frac{\alpha }{2}\Vert q_h\Vert ^2_{\varGamma _N} \end{aligned}$$

for the problems (15) and (16), respectively. Denoting the adjoint of operator S by \(S^*\), the necessary, and sufficient, optimality conditions for \({{\bar{q}}}\) and \({{\bar{q}}}_h\) are

$$\begin{aligned} (S^*(S {{\bar{q}}} - u^{\mathrm{d}}) + \alpha {{\bar{q}}}, q - {{\bar{q}}})_{\varGamma _N}&= 0, \quad \forall \, q \in Q, \end{aligned}$$
(17)
$$\begin{aligned} (S_h^*(S_h {{\bar{q}}}_h - u^{\mathrm{d}}) + \alpha {{\bar{q}}}_h, q - \bar{q}_h)_{\varGamma _N}&= 0, \quad \forall \, q \in Q. \end{aligned}$$
(18)

Noting that \({{\bar{q}}}_h\) is a feasible test function for (17), and \({{\bar{q}}}\) is a feasible test function for (18), we obtain

$$\begin{aligned} \begin{aligned} 0&= (S^*(S {{\bar{q}}} - u^{\mathrm{d}})-S^*_h(S_h{{\bar{q}}}_h-u^{\mathrm{d}}) + \alpha {{\bar{q}}} - \alpha {{\bar{q}}}_h, {{\bar{q}}}_h - {{\bar{q}}})_{\varGamma _N} \\&= - \alpha \Vert {{\bar{q}}}_h - {{\bar{q}}} \Vert _{\varGamma _N}^2 + (S {{\bar{q}}} - u^{\mathrm{d}}, S({{\bar{q}}}_h - {{\bar{q}}})) - (S_h {{\bar{q}}}_h - u^{\mathrm{d}}, S_h({{\bar{q}}}_h - {{\bar{q}}}))_{\varGamma _N}, \end{aligned} \end{aligned}$$

and consequently, after standard calculations,

$$\begin{aligned} \begin{aligned} \alpha \Vert {{\bar{q}}} - {{\bar{q}}}_h \Vert _{\varGamma _N}^2 + \Vert {{\bar{u}}} - {{\bar{u}}}_h \Vert ^2&= (S {{\bar{q}}} - S_h {{\bar{q}}}_h, (S-S_h){{\bar{q}}}) \\&\quad + ((S-S_h)^*(S\bar{q} - u^{\mathrm{d}}), {{\bar{q}}}_h - {{\bar{q}}})_{\varGamma _N}. \end{aligned} \end{aligned}$$
(19)

By Corollary 1, we have

$$\begin{aligned} \Vert (S-S_h)q \Vert \le c h^{s} \Vert q\Vert _{\varGamma _N}, \end{aligned}$$
(20)

for some c and \(s>0\). Applying (20) together with the Young’s inequality to the right hand side of (19) we obtain

$$\begin{aligned} \alpha \Vert {{\bar{q}}} - {{\bar{q}}}_h \Vert _{\varGamma _N}^2 + \Vert {{\bar{u}}} - {{\bar{u}}}_h \Vert ^2 \le c h ^{2s} + \dfrac{c}{\alpha } h ^{2s} + \frac{\alpha }{2} \Vert {{\bar{q}}} - {{\bar{q}}}_h \Vert _{\varGamma _N}^2, \end{aligned}$$

and thus the assertion. \(\square\)

3.3 Improved error estimate

Of course, the error estimate in Theorem 2 is suboptimal, since (20) only used the \(H^1\) error for the finite element discretization and an \(L^2\)-type error would improve the convergence rate to \(h^{4s}\) on the right side of the inequality in Theorem 2. However, this estimate still contains a bad scaling in terms of the regularization parameter \(\alpha\) as observed in Gaspoz et al. (2019). We will thus follow their ideas to obtain the correct error estimates with correct asymptotic dependence on \(\alpha\).

To do so, we start by considering a suitably reduced first-order optimality system of (6). We will then be able to transform this problem into the form of the already studied in Sect. 3.1 for the forward problem allowing to reuse the results obtained there.

Clearly, see, e.g., Neitzel et al. (2017), the following optimality conditions hold true for (6):

Theorem 3

Let Assumption 1be given, and let \(({{\bar{q}}}, {{\bar{\mathbf{u}}}}) \in Q \times V\) be the solution to (6). Then there exists a Lagrange multiplier \({{\bar{\mathbf{z}}}} = ({{\bar{z}}}, {{\bar{\zeta }}}) \in V\) such that the system

$$\begin{aligned} \begin{aligned} A {{\bar{\mathbf{u}}}}&= B {{\bar{q}}} \quad \text {in } V^* \\ A^* {{\bar{\mathbf{z}}}}&= {{\bar{u}}} - u^{\mathrm{d}} \quad \text {in } V^* \\ \alpha {{\bar{q}}}&= -B^* {{\bar{\mathbf{z}}}} \quad \text {on } \varGamma _N \end{aligned} \end{aligned}$$
(21)

is satisfied. Here we implicitly understand \(u-u^{\mathrm{d}} = (u-u^{\mathrm{d}},0) \in V^*\).

Moreover, because of the convexity of (6), any triplet \(({{\bar{q}}}, {{\bar{\mathbf{u}}}}, {{\bar{\mathbf{z}}}}) \in Q \times V \times V\) solving (21) gives rise to a solution of (6).

We notice that, by replacing \({{\bar{q}}}\) with \(-\frac{1}{\alpha } B^* {{\bar{\mathbf{z}}}}\), the reduced form of the optimality system (21) can be written in the following matrix form for \(({{\bar{\mathbf{u}}}},{{\bar{\mathbf{z}}}}) \in V \times V\),

$$\begin{aligned} \begin{pmatrix} A & \quad \frac{1}{\alpha } B B^* \\ - I_u & \quad A^* \end{pmatrix} \begin{pmatrix} {{\bar{\mathbf{u}}}} \\ {{\bar{\mathbf{z}}}} \end{pmatrix} = \begin{pmatrix} 0 \\ - u^{\mathrm{d}} \end{pmatrix} \end{aligned}$$
(22)

where \(I_u\) denotes the identity on the displacement component in V, i.e., \(I_u (v,\psi ) = v\) for any \((v,\psi ) \in V\).

The operator matrix in (22) is a compact perturbation of the diagonal operator \(\begin{pmatrix} A & \quad 0\\ 0 & \quad A^* \end{pmatrix}\). Hence we will be able to show analogous results to Sect. 3.1 for the optimization problem based on a Gårding-like inequality for the reduced optimality system.

In order to analyze that, we formulate the corresponding bilinear form. To this end, let us consider the variational form of the reduced optimality system (22):

$$\begin{aligned} \begin{aligned}&(A {{\bar{\mathbf{u}}}} , \mathbf{v}) + \frac{1}{\alpha } (B^* {{\bar{\mathbf{z}}}} , B^* \mathbf{v})_{\varGamma _N} = 0, \quad \forall \mathbf{v}= (v,\psi ) \in V,\\&(A \mathbf{v}, {{\bar{\mathbf{z}}}}) - ({{\bar{u}}} , v) = -(u^{\mathrm{d}}, v) \quad \forall \mathbf{v}= (v,\psi )\in V. \end{aligned} \end{aligned}$$
(23)

We define the normed space \(X:=V \times V\), in which any element \(\mathbf{x}=(\mathbf{x}_1,\mathbf{x}_2)\in X\) with \(\mathbf{x}_1=(x_{1,u},x_{1,v})\) and \(\mathbf{x}_2=(x_{2,u},x_{2,v})\) has the norm

$$\begin{aligned} \Vert \mathbf{x}\Vert _X := (\Vert \mathbf{x}_1\Vert _V^{2} + \Vert \mathbf{x}_2\Vert _V^{2})^{1/2}. \end{aligned}$$

Summing up the two equations (23), allows us to introduce the bilinear form \(b: X \times X \rightarrow {{\mathbb {R}}}\), associated with the operator in (22) as

$$\begin{aligned} b(\varvec{\varPhi },\varvec{\varPsi }) := a(\varvec{\varPhi }_1, \varvec{\varPsi }_1) + a(\varvec{\varPsi }_2, \varvec{\varPhi }_2) + \left( \frac{1}{\alpha }(B^* \varvec{\varPhi }_2 , B^* \varvec{\varPsi }_1)_{\varGamma _N} - (\varvec{\varPhi }_{1,u} , \varvec{\varPsi }_{2,u}) \right) \end{aligned}$$
(24)

for any \(\varvec{\varPhi }_1=(\varvec{\varPhi }_{1,u},\varvec{\varPhi }_{1,v}), \varvec{\varPhi }_2=(\varvec{\varPhi }_{2,u},\varvec{\varPhi }_{2,v}), \varvec{\varPsi }_1=(\varvec{\varPsi }_{1,u},\varvec{\varPsi }_{1,v}), \varvec{\varPsi }_2=(\varvec{\varPsi }_{2,u},\varvec{\varPsi }_{2,v}) \in V\), with \(\varvec{\varPhi }:=(\varvec{\varPhi }_1, \varvec{\varPhi }_2), \varvec{\varPsi }:=(\varvec{\varPsi }_1, \varvec{\varPsi }_2) \in X\). Then, with \({{\bar{\mathbf{x}}}}=({{\bar{\mathbf{u}}}}, {{\bar{\mathbf{z}}}})\), the variational formulation (23) reads as

$$\begin{aligned} b({{\bar{\mathbf{x}}}},\varvec{\varPsi }) = - ( u^{\mathrm{d}}, \varvec{\varPsi }_{2,u}) \quad \forall \varvec{\varPsi }\in X. \end{aligned}$$

We are then able to conclude the following lemma.

Lemma 2

For any given \((q_k,\mathbf{u}_k)\) satisfying Assumption 1the bilinear form \(b:X \times X \rightarrow {{\mathbb {R}}}\), defined by (24), is continuous on \(X \times X\), and satisfies a Gårding-like inequality; more precisely, there exist constants \({\tilde{c}}_c, {\tilde{c}}_1, {\tilde{c}}_2\) and some \(r \in (0,1)\), depending on C in Assumption 1, such that

$$\begin{aligned} | b(\varvec{\varPhi }, \varvec{\varPsi }) | \le {\tilde{c}}_c \Vert \varvec{\varPhi }\Vert _X \Vert \varvec{\varPsi }\Vert _X \end{aligned}$$

and

$$\begin{aligned} b(\varvec{\varPhi },\varvec{\varPhi }) + {\tilde{c}}_2\Vert \varvec{\varPhi }\Vert ^2_r \ge {\tilde{c}}_1 \Vert \varvec{\varPhi }\Vert ^2_X \end{aligned}$$

where the \(\Vert \cdot \Vert _r\) norm is to be understood component-wise.

Proof

With the help of Lemma 1, and considering the boundedness of the compact operator \(B^*\), it is straightforward to observe that b satisfies the mentioned Gårding’s inequality and the continuity relation. \(\square\)

Having introduced the bilinear form \(b: X \times X \rightarrow {{\mathbb {R}}}\), we notice that the variational formulation of the reduced optimality system (22) consists of finding \(\mathbf{x}=(\mathbf{u}, \mathbf{z}) \in X\) solving

$$\begin{aligned} b(\mathbf{x},\varvec{\varPsi }) = - (u^{\mathrm{d}}, \varvec{\varPsi }_{2,u}) \quad \forall \varvec{\varPsi }=(\varvec{\varPsi }_1,\varvec{\varPsi }_2) \in X. \end{aligned}$$
(25)

The discretized problem is thus given by finding \(\mathbf{x}_h=(\mathbf{u}_h,\mathbf{z}_h) \in X_h = V_h \times V_h\) solving

$$\begin{aligned} b(\mathbf{x}_h,\varvec{\varPsi }_h) = - (u^{\mathrm{d}}, \varvec{\varPsi }_{h,2,u}) \quad \forall \varvec{\varPsi }_h=(\varvec{\varPsi }_{h,1},\varvec{\varPsi }_{h,2}) \in X_h. \end{aligned}$$
(26)

To estimate the error between the solutions \(\mathbf{x}\in X\) and \(\mathbf{x}_h \in X_h\), to the problems (25) and (26) respectively, let us define \(e_{\mathbf{x}}:=\mathbf{x}- \mathbf{x}_h\); that is \(e_{\mathbf{x}}=(e_{\mathbf{u}},e_{\mathbf{z}})=(\mathbf{u}-\mathbf{u}_h,\mathbf{z}-\mathbf{z}_h)\).

We need the following intermediate estimate for our final argument.

Lemma 3

Let \(e_\mathbf{x}=(e_\mathbf{u},e_\mathbf{z})=((e_u,e_\varphi ),(e_z,e_\zeta ))\), and s from Assumption 1be such that \(r \le 1-s\) for \(r\in (0,1)\) for Lemma 2, and Assumption 1hold. Then there is a constant \(h_0\) such that the following estimate holds true

$$\begin{aligned} \Vert e_\mathbf{x}\Vert _r \le c h^s \Vert e_\mathbf{x}\Vert _X \end{aligned}$$

for some constant c, and for all \(h \le h_0\).

Proof

Based on Lemma 2, and Wloka (1987, Theorem 12.8), we get that the matrix operator \(M:V \times V \rightarrow V^* \times V^*\) is Fredholm of index zero, and it is straightforward to show that \({\text {kern}} M=\{0\}\).

To this end, let \((\mathbf{u}, \mathbf{z}) \in {\text {kern}} \, M\). By definition, this implies

$$\begin{aligned} {\left\{ \begin{array}{ll} (A \mathbf{u}, \mathbf{v}) + \frac{1}{\alpha } (B^* \mathbf{z}, B^* \mathbf{v})_{\varGamma _N} = 0, &\quad \forall \mathbf{v}\in V,\\ (A \mathbf{v}, \mathbf{z}) - ( \mathbf{u}, \mathbf{v}) = 0, &\quad \forall \mathbf{v}\in V. \end{array}\right. } \end{aligned}$$
(27)

Testing the first equation in (27) with \(\mathbf{v}:= \mathbf{z}\), and the second one with \(\mathbf{v}:= \mathbf{u}\), and then subtracting the second equation from the first one we get

$$\begin{aligned} \frac{1}{\alpha } \Vert B^* \mathbf{z}\Vert _{\varGamma _N}^2 + \Vert \mathbf{u}\Vert ^2 = 0. \end{aligned}$$

This simply implies \(\mathbf{u}=0\). Noting that \(A^*\) is an isomorphism, it is then an immediate result from the second equation in (27) that \(\mathbf{z}=0\).

Therefore, M is an isomorphism, hence the same argument presented in the proof of Theorem 1 can be applied to obtain

$$\begin{aligned} \Vert e_\varphi \Vert _r + \Vert e_\zeta \Vert _r \le c h^s \Vert e_\mathbf{x}\Vert _X, \end{aligned}$$

and consequently the desired bound follows. \(\square\)

We are now in the place to present the final results.

Theorem 4

Given Assumption 1, let \(h\le h_1\) for some sufficiently small \(h_1 \le h_0\). Then there is a constant \(c > 0\) such that the solutions \(\mathbf{x}=(\mathbf{u},\mathbf{z}) \in X\) and \(\mathbf{x}_h=(\mathbf{u}_h,\mathbf{z}_h) \in X_h\) to the problems (25) and (26) satisfy the quasi-best approximation property

$$\begin{aligned} \Vert \mathbf{x}- \mathbf{x}_h \Vert _X \le c \inf _{\varvec{\varPsi }_h \in X_h} \Vert \mathbf{x}- \varvec{\varPsi }_h \Vert _X. \end{aligned}$$

Again, \(h_1\) is independent of the linearization point and only depends on the bound C in Assumption 1.

Proof

Analogous to Theorem 1, we get from Lemma 3 that for any \(h \le h_0\) the estimate

$$\begin{aligned} {\tilde{c}}_1\Vert e_{\mathbf{x}}\Vert _V^2 \le {\tilde{c}}_c \Vert e_{\mathbf{x}}\Vert _V \Vert \mathbf{x}-\varvec{\varPsi }_h\Vert _V + {\tilde{c}}_2\Vert e_{\mathbf{x}}\Vert _r^2\le {\tilde{c}}_c \Vert e_{\mathbf{x}}\Vert _V \Vert \mathbf{x}-\varvec{\varPsi }_h\Vert _V + ch^{2s}\Vert e_{\mathbf{x}}\Vert _V^2 \end{aligned}$$

holds true for arbitrary \(\varvec{\varPsi }_h \in V_h\). This shows the result once \(h \le h_1\) with

$$\begin{aligned} ch_1^{2s} \le \frac{{\tilde{c}}_1}{2}. \end{aligned}$$

\(\square\)

Corollary 2

Under the assumption of Theorem 4, there exist constants c such that we have the following further estimates.

$$\begin{aligned} \Vert \mathbf{u}-\mathbf{u}_h\Vert +\Vert \mathbf{z}-\mathbf{z}_h\Vert \le ch^{2s}, \end{aligned}$$

and

$$\begin{aligned} \Vert B^*\mathbf{z}-B^*\mathbf{z}_h\Vert _{\varGamma _N} \le ch^{2s}. \end{aligned}$$

Proof

We start by showing the existence of a constant c satisfying

$$\begin{aligned} \Vert e_{\mathbf{u}}\Vert _V + \Vert e_{\mathbf{z}}\Vert _V \le c h^s. \end{aligned}$$
(28)

To this end, we notice that by having \(A^*\mathbf{z}= u - u^d\) in \(V^*\), and the bound (7), Theorem 4 can be applied to result in

$$\begin{aligned} \begin{aligned} \Vert e_{\mathbf{u}}\Vert _V + \Vert e_{\mathbf{z}}\Vert _V&\le c \, \Vert e_{\mathbf{x}}\Vert _X \\&\le c \, \inf _{\varvec{\varPsi }_h \in X_h} \Vert \mathbf{x}- \varvec{\varPsi }_h \Vert _X \\&\le c \, \Vert \mathbf{x}- I_h\mathbf{x}\Vert _X \\&\le c \, \sqrt{ch^{2s} \Vert \mathbf{u}\Vert ^2_{1+s} + ch^{2s}\Vert \mathbf{z}\Vert ^2_{1+s}} \\&\le c \, h^s. \end{aligned} \end{aligned}$$

Then, combined with Lemma 3, the first statement is proven as follows.

$$\begin{aligned} \begin{aligned} \Vert e_{\mathbf{u}}\Vert + \Vert e_{\mathbf{z}}\Vert&\le c \bigl ( \Vert e_{\mathbf{u}}\Vert _r + \Vert e_{\mathbf{z}}\Vert _r \bigr )\\&\le c \, h^s \Vert e_{\mathbf{x}}\Vert _X \\&\le c \, h^{2s}. \end{aligned} \end{aligned}$$

Concerning the second statement, we utilize another duality argument which implies the existence of a unique \(\lambda \in H^{1+ \min \{s,\frac{1}{2}\}} = H^{1+s}\) such that for any \(\mathbf{v}\in V\) we have

$$\begin{aligned} a(\mathbf{v}, \lambda ) = \left( \frac{BB^*e_\mathbf{z}}{\Vert B^*e_\mathbf{z}\Vert _{\varGamma _N}}, \mathbf{v}\right) _{\varGamma _N}. \end{aligned}$$

Letting \(\mathbf{v}:= e_\mathbf{z}\) we get

$$\begin{aligned} \Vert B^*e_\mathbf{z}\Vert _{\varGamma _N}&= \left( \frac{B^*e_\mathbf{z}}{\Vert B^*e_\mathbf{z}\Vert _{\varGamma _N}}, B^*e_{\mathbf{z}}\right) _{\varGamma _N} \\&= a(e_{\mathbf{z}}, \lambda ) \\&= a(e_{\mathbf{z}}, \lambda - I_h \lambda ) \\&\le c \, \Vert e_{\mathbf{z}} \Vert _V \Vert \lambda - I_h \lambda \Vert _V \\&\le c \, h^s \, h^s \Vert \lambda \Vert _{1+s} \\&\le c \, h^{2s}. \end{aligned}$$

\(\square\)

4 Numerical experiment

In this section, we present the numerical implementation to simulate the fracture problem (3). The aim is to demonstrate the validity of the error estimates we have obtained in the previous section. Let us consider the two-dimensional square domain \(\varOmega = (-1,1)^2\), with the boundary \(\partial \varOmega = \varGamma _N \, \cup \, \varGamma _D \, \cup \, \varGamma _{\text {free}}\) consisting of three different parts, where

$$\begin{aligned} \varGamma _N = \{ (x,1) | -1 \le x \le 1\}, \quad \quad \varGamma _D = \{ (x,-1) | -1 \le x \le 1\}, \end{aligned}$$

and \(\varGamma _{\text {free}}\) represents the rest of the boundary. On the boundary piece \(\varGamma _N\) a control q is applied in normal direction, whereas on the Dirichlet boundary \(\varGamma _D\) the displacement vector is prescribed by \(u=0\). \(\varGamma _{\text {free}}\) is a free boundary over which a natural boundary condition for the displacement is employed.

Since the rates are infact implied by the linearized fracture model, we consider the forward problem, only. The fixed parameters at the linearized model (3) are set as follows. The control acting on \(\varGamma _N\) is \(q=10\), the penalization parameter is \(\gamma = 10^8\), the fracture energy release rate is \(G_c = 1\), the bulk regularization parameter is \(\kappa = 10^{-4}\), and the phase-field regularization parameter is \(\varepsilon = 0.088\). We linearize the fracture model at point \(\mathbf{u}_k = (u_k,\varphi _k)\), with

$$\begin{aligned} u_k(x,y) = (0, (1+y) \times 10^{-5}), \quad (x,y) \in \varOmega , \end{aligned}$$

and \(\varphi _k = \varphi ^0\). Note that by this choice the penalty term vanishes. The initial fracture \(\varphi ^0\) will be specified in each of the Examples 4.1 to 4.3. Notice that while in some examples \(\varphi ^0\) is not \(W^{1,p}\) the coefficient \(g(\varphi ^0)\) remains regular enough to assert \(H^{1+s}\) regularity of \((u,\varphi )\).

To approximate the solution of (3), we discretize the model by choosing standard \(Q_1\) finite elements for the displacement u and the phase-field \(\varphi\). The implementation is performed with the help of software DOpElib (Goll et al. 2017), which is developed based on the finite element software library deal.II (Bangerth et al. 2007; Arndt et al. 2017).

We start the calculations on a twice globally refined quadrilateral mesh of the domain, i.e., \(h_0= 0.5\) using the edge-length to label the triangulations.

Since the exact solution of the problem (3) is not available, to investigate the impact of mesh refinement on the accuracy of the approximated solution, we follow two strategies to estimate the order of convergence:

(1):

We compare the solutions at coarser meshes with the solution at the finest mesh, here eight refinements of the initial mesh. According to Corollary 1 we expect that

$$\begin{aligned} \Vert u - u_{h_i} \Vert _V \approx c \, h_i^{s} \end{aligned}$$

where \(h_i = h_0/{2^{i}}\), \(i=0, 1, 2, \ldots\), is the element size on level i. We denote the solution at the finest mesh by \({\hat{u}} = u_8\), and approximate u by \({\hat{u}}\), and estimate

$$\begin{aligned} s \approx \log _2 \left( \frac{\Vert {\hat{u}} - u_{h_{i-1}} \Vert _V}{\Vert {\hat{u}} - u_{h_{i}} \Vert _V} \right) , \quad i=1,2, \ldots . \end{aligned}$$
(2):

It should be noted that strategy I is known to provide bad estimates for s if the reference solution is not good enough compared to the level i. Since s is not known exactly, but could potentially be very small for the examples, we perform a second test for the convergence order:

$$\begin{aligned} s \approx \log _2 \left( \frac{\Vert u_{h_{i-1}} - u_{h_i} \Vert _V}{\Vert u_{h_i} - u_{h_{i+1}} \Vert _V} \right) , \quad i=1,2, \ldots . \end{aligned}$$

If both tests provide similar results we consider the approximate rates to be correctly estimated.

4.1 Example 1: low regularity

In the first test, we consider a simple initial phase-field satisfying Assumption 1 with selectable regularity. Let us consider

$$\begin{aligned} \varphi ^0(x,y)=|(x,y)|^\sigma \end{aligned}$$

with \(\sigma \in [0.5, 0.9]\). To ensure that the integration error of the coefficient is not dominating the error, a summed quadrature rule is utilized on the elements with a vertex at the singularity of \(\varphi ^0\). As it can be seen in Table 1 both strategies for estimating s provide similar values for coarse meshes while on fine meshes strategy I provides a too large value for s due to insufficient quality of the reference value \({\hat{u}}\).

Table 1 Example 1; Mesh refinement result and the order of convergence

We observe convergence rates s between 0.3 and 0.5 in accordance with our theory. The varying convergence rates highlight the variability of s in Assumption 1.

4.2 Example 2: rectangular hole

Inside the domain \(\varOmega\), we initially consider a horizontal fracture represented by

$$\begin{aligned} \varphi ^0 = {\left\{ \begin{array}{ll} 0, & \quad \text {on} \, (-0.125-d,0.125+d) \times (-d,d) \\ 1, & \quad \text {otherwise} \end{array}\right. } \end{aligned}$$

where \(d=h_2 = h_0/4\). The value for d is chosen small enough to have a reasonable shape of fracture while simultaneously, the discontinuity in the data is resolved on mesh level two. As can be seen in Table 2, both strategies provide nearly identical values for \(s \approx 0.9\). The initial fracture and resulting approximated phase-field and displacement solutions are depicted in Figs. 1 and 2. We notice, that the observed convergence rate is larger than 1/2. Due to the observation that the error for the PDE is quasi-best, see, Theorem 4, this can be explained by the observed smoothness in the Figs. 1 and 2, where the only visible singularity is aligned with the boundary of the initial phase-field and thus with the mesh.

Table 2 Example 2; Mesh refinement result and the order of convergence
Fig. 1
figure 1

Example 2; Initial fracture \(\varphi ^0\) (left) and approximated phase-field \(\varphi\) (right) at the 5th mesh level

Fig. 2
figure 2

Example 2; The x-component (left) and the y-component (right) of approximated displacement u at the 5th mesh level

4.3 Example 3: Kellogg’s test

Finally, we examine the convergence results by considering the example introduced in Kellogg (1971) for our initial fracture, since it has been the subject of attention in many papers concerning singularities of interface problems. Because of singularities, it is difficult to obtain accurate numerical approximations to interface problems by standard finite element methods. To be more specific, we start with the fracture

$$\begin{aligned} \varphi ^0 = {\left\{ \begin{array}{ll} 0, & \quad x_1x_2<0 \\ 1, & \quad \text {otherwise} \end{array}\right. } \end{aligned}$$

which is illustrated in Fig. 3(left), and is a special case of a solution to the well-known Kellogg’s example highlighting the related singularity in the origin.

Indeed both strategies for estimating s provide similar values of \(s \approx 0.7\). Figures 3(right) and 4 display approximated solutions \(\varphi\) and u, respectively.

Fig. 3
figure 3

Example 3; Initial fracture \(\varphi ^0\) (left) and approximated phase-field \(\varphi\) (right) at the 5th mesh level

Fig. 4
figure 4

Example 3; The x-component (left) and the y-component (right) of approximated displacement u at the 5th mesh level

Again, the observed convergence rate, in Table 3, is larger than 1/2 relating to relatively smooth solutions as observed in Figs. 3 and 4.

Table 3 Example 3; Mesh refinement result and the order of convergence