1 Introduction

An interesting class of PDE problems that is recently drawing attention in the literature is that of coupled bulk-surface partial differential equations (BSPDEs). Given a number \(d\in {\mathbb {N}}\) of space dimensions, a BSPDE is a system of \(m\in {\mathbb {N}}\) PDEs posed in the bulk \(\varOmega \subset {\mathbb {R}}^d\) coupled with \(n\in {\mathbb {N}}\) PDEs posed on the surface \(\varGamma := \partial \varOmega \) through either linear or non-linear coupling, see for instance [37]. For ease of presentation, we confine the exposition to the case \(m=n=1\), even if the whole study applies to the case of arbitrary m and n. In the time-independent case, let \(u({\varvec{x}})\) and \(v({\varvec{x}})\) be the bulk and surface variables obeying the following elliptic bulk-surface (BS) linear problem:

$$\begin{aligned} {\left\{ \begin{array}{ll} - \varDelta u ({\varvec{x}}) + u({\varvec{x}}) = f({\varvec{x}}), &{}\quad {\varvec{x}}\in \varOmega ;\\ - \varDelta _\varGamma v ({\varvec{x}}) + v({\varvec{x}}) + \dfrac{\partial u}{\partial \varvec{\nu }}({\varvec{x}}) = g({\varvec{x}}), &{}\quad {\varvec{x}}\in \varGamma ;\\ \dfrac{\partial }{\partial \varvec{\nu }} u({\varvec{x}}) = -\alpha u({\varvec{x}}) + \beta v({\varvec{x}}), &{}\quad {\varvec{x}}\in \varGamma ,\\ \end{array}\right. } \end{aligned}$$
(1)

where \(\alpha , \beta > 0\), while \(\varDelta \) and \(\varDelta _{\varGamma }\) denote the Laplace and Laplace–Beltrami operators respectively, and \(\varvec{\nu }\) denotes the outward unit normal vector field on \(\varGamma \) (see “Appendix A” for notations and definitions). The above model (1) was considered in [27].

For a nonlinear time-dependent generalisation, let \(u({\varvec{x}},t)\) and \(v({\varvec{x}},t)\) be the bulk and surface variables obeying the following bulk-surface reaction–diffusion system (BSRDS):

$$\begin{aligned} {\left\{ \begin{array}{ll} \dfrac{\partial u}{\partial t} ({\varvec{x}},t) - d_u\varDelta u ({\varvec{x}},t) = q(u({\varvec{x}},t)), &{}\quad {\varvec{x}}\in \varOmega , \ t \in [0,T];\\ \dfrac{\partial v}{\partial t} ({\varvec{x}},t) - d_v\varDelta _\varGamma v ({\varvec{x}},t) + \dfrac{\partial u}{\partial \varvec{\nu }} ({\varvec{x}},t) = r(u({\varvec{x}},t), v({\varvec{x}},t)), &{}\quad {\varvec{x}}\in \varGamma , \ t \in [0,T];\\ \dfrac{\partial u}{\partial \varvec{\nu }} ({\varvec{x}},t) = s(u({\varvec{x}},t), v({\varvec{x}},t)), &{}\quad {\varvec{x}}\in \varGamma , \ t \in [0,T], \end{array}\right. } \end{aligned}$$
(2)

where T is the final time and q(u), r(uv), s(uv) are possibly nonlinear functions. The model comprises several time-dependent BSPDE models existing in the literature, see for example [22, 29, 37, 41].

The quickly growing interest toward BSPDEs arises from the numerous applications of such PDE problems in different areas, such as cellular biological systems [22, 29, 36, 40] or fluid dynamics [15, 17, 34], among many other applications. Among the various state-of-the art numerical methods for the spatial discretisation of BSPDEs existing in the literature we mention finite elements [27, 33, 37, 38], trace finite elements [32], cut finite elements [17] and discontinuous Galerkin methods [20].

On the other hand, the Virtual Element Method (VEM) for bulk-only PDEs is a recent extension of the well-known finite element method (FEM) for the numerical approximation of several classes of PDEs on flat domains [5] or surfaces [31]. The key feature of VEM is that of being a polygonal method, i.e. it handles elements of a quite general polygonal shape, rather than just of triangular shape [5]. The success of virtual elements is due to several advantages arising from non-polygonal mesh generality. A non-exhaustive list of such advantages includes: (i) computationally cheap mesh pasting [12, 19, 31], (ii) efficient adaptive algorithms [18], flexible approximation of the domain and in particular of its boundary [23], and the possibility of enforcing higher regularity to the numerical solution [4, 9, 16], just to mention a few. Thanks to these advantages, several extensions of the original VEM for the Poisson equation [5] were developed for numerous PDE problems, such as heat [46] and wave equations [45], reaction–diffusion systems [1], Cahn–Hilliard equation [4], Stokes equation [8], linear elasticity [6], plate bending [16], fracture problems [11], eigenvalue problems [39] and many more.

The purpose of the present paper is to introduce a bulk-surface virtual element method (BSVEM) for the spatial discretisation of a coupled system of BSPDEs in two space dimensions. The proposed method combines the VEM for the bulk equation(s) with the surface finite element method (SFEM) for the surface equation(s). We apply the proposed method to (i) the linear elliptic BS Poisson problem (1) and (ii) the BSRDS (2).

The main novelty in the present study is devoted to error analysis. In fact, the simultaneous presence of non-triangular elements and boundary approximation error (which cannot be neglected in the context of BSPDEs, because the boundary is itself the domain of a surface PDE) provide new numerical analysis challenges. We prove that the proposed method possesses optimal second-order convergence in space, provided the exact solution is \(H^{2+1/4}\) in the bulk and \(H^2\) on the surface. H denotes the Hilbert space. This is slightly more than the usual requirement of \(H^2\) both in the bulk and on the surface [33]. However, our analysis requires this slightly higher regularity assumption only in the simultaneous presence of a curved boundary \(\varGamma \) and non-triangular elements close to the boundary, which is a novel case. Otherwise, our results fall back to the known cases in the literature, see for instance [33] for the case of triangular BSFEM and [1] for the case of polygonal VEM in the absence of curvature in the boundary \(\varGamma \). The extension to higher order cases would require the usage of curved elements, because otherwise the geometric error arising from the boundary approximation would dominate over the numerical error, see [24]. This challenge will be addressed in future studies. The proposed analysis has three by-products:

  1. 1.

    The bulk-VEM of lowest polynomial order \(k=1\) possesses optimal convergence also in the presence of curved boundaries. A first work in this direction is found in [10], in which the authors consider polygonal elements with a curved boundary that match the exact domain in order to avoid errors arising from boundary approximation. Subsequently, in [14], the need of matching the exact domain was removed by introducing suitable corrections in the method. In the present work, instead, we obtain similar results in the low polynomial case \(k=1\) by harnessing in a novel way the geometric error estimates of BS polyhedral domains in [27].

  2. 2.

    A potential alternative approach to the lifting operator used in the analysis of SPDEs [26] and BSPDEs [27] is the Sobolev extension operator. In fact, we prove that, if a function is \(H^{2+1/4}\) instead of \(H^2\) in the two-dimensional bulk, the Sobolev extension retains optimal approximation properties of lifting. Moreover, the Sobolev extension of a function has the property of preserving its \(W^{m,p}\) class, while its lift does not because it is not \({\mathscr {C}}^k\) for any positive integer k. This property, which is crucial in our analysis, is potentially beneficial to the error analysis of bulk-only or BSVEM approximation of more general PDEs, where the boundary curvature was not accounted for, see for instance [5, 8, 31, 39, 45, 46].

  3. 3.

    We construct a special inverse trace operator to account for general boundary conditions. Like the standard inverse trace operator, our inverse trace maps a function \(v \in H^1(\varGamma )\) to a function \(v_B \in H^1(\varOmega )\) such that \(\Vert v_B\Vert _{H^1(\varOmega )} \le C\Vert v\Vert _{H^1(\varGamma )}\), with C depending on the domain \(\varOmega \). Our inverse trace has the stronger property that \(\Vert v_B\Vert _{L^2(\varOmega )} \le C\Vert v\Vert _{L^2(\varGamma )}\) and \(|v_B|_{H^1(\varOmega )} \le C \Vert v\Vert _{H^1(\varGamma )}\). This means that the proposed operator preserves both \(L^2\) and \(H^1\) norms up to the same multiplicative constants, i.e. it is an \(L^2\)-preserving inverse trace.

The proposed method has all the benefits of polygonal meshes, two of which will be illustrated in the present work. First, the usage of suitable polygons drastically reduces the computational complexity of matrix assembly on equal meshsize in comparison to the triangular BSFEM. Similar results are obtained in the literature through other methods, such as trace FEMs [32] or cut FEMs [17]. Second, a curved portion of the boundary can be approximated with a single element with many edges. The BSVEM lends itself to other advantages due to its polygonal nature, such as efficient algorithms for adaptivity or mesh pasting, see for instance [18]. These aspects will be addressed in future studies.

Hence, the structure of our paper is as follows. In Sect. 2, we elaborate on the weak formulations, existence and regularity for problems (1) and (2). In Sect. 3, we introduce polygonal BS meshes, analyse geometric error, define suitable function spaces, analyse their approximation properties and present the spatial discretisation of the considered BSPDE problems. In Sect. 4, we carry out the convergence error analysis for the parabolic case, the main result being optimal second-order spatial convergence in the \(L^2\) norm, both in the bulk and on the surface. In Sect. 5 we present an IMEX-Euler time discretisation of the parabolic problem. In Sect. 6 we show that polygonal meshes can be exploited to significantly reduce the computational time of the matrix assembly. In Sect. 7 we illustrate our findings through three numerical examples. The first example shows (i) optimised matrix assembly and (ii) optimal convergence for the elliptic problem (1). The second example shows optimal convergence in space and time for the parabolic problem (2) for the linear case. The third example compares the BSFEM- and BSVEM-IMEX Euler approximations of the wave pinning RDS considered in [22]. In “Appendix A” we provide some basic definitions and results required in our analysis. In “Appendix B” we report some lengthy proofs involved in Sect. 4.

2 Weak formulations of BSPDEs, existence and regularity

In this section we state the weak formulations of the elliptic (1) and parabolic problems (2), respectively. For problem (1), we multiply the first two equations of (1) by two test functions \(\varphi \in H^1(\varOmega )\) and \(\psi \in H^1(\varGamma )\), respectively, then we apply Green’s formula in the bulk \(\varOmega \) and on the one-dimensional manifold \(\varGamma \) [26]. We obtain the following formulation: find \(u \in H^1(\varOmega )\) and \(v \in H^1(\varGamma )\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \int _{\varOmega }\nabla u \cdot \nabla \varphi + \int _{\varOmega }u\varphi = \int _{\varOmega }f\varphi + \int _{\varGamma }\dfrac{\partial u}{\partial \varvec{\nu }}\varphi ;\\ \displaystyle \int _{\varGamma }\nabla _{\varGamma }v\cdot \nabla _{\varGamma }\psi + \int _{\varGamma }v\psi + \int _{\varGamma }\dfrac{\partial u}{\partial \varvec{\nu }} \psi = \int _{\varGamma }g\psi , \end{array}\right. } \end{aligned}$$
(3)

for all \(\varphi \in H^1(\varOmega )\) and \(\psi \in H^1(\varGamma )\). By using the third equation of (1) in (3), we obtain the following weak formulation: find \(u\in H^1(\varOmega )\) and \(v\in H^1(\varGamma )\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \int _{\varOmega }\nabla u \cdot \nabla \varphi + \int _{\varOmega }u\varphi + \int _{\varGamma }(\alpha u -\beta v)\varphi = \int _{\varOmega }f\varphi ;\\ \displaystyle \int _{\varGamma }\nabla _{\varGamma }v\cdot \nabla _{\varGamma }\psi + \int _{\varGamma }(-\alpha u + (\beta +1)v) \psi = \int _{\varGamma }g\psi , \end{array}\right. } \end{aligned}$$
(4)

for all \(\varphi \in H^1(\varOmega )\) and \(\psi \in H^1(\varGamma )\).

By reasoning similarly to the elliptic problem (1), we obtain the following weak formulation of the parabolic problem (2): find \(u \in L^\infty ([0,T]; H^1(\varOmega ))\) and \(v\in L^\infty ([0,T]; H^1(\varGamma ))\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \int _{\varOmega }\dfrac{\partial u}{\partial t}\varphi + d_u\int _{\varOmega }\nabla u \cdot \nabla \varphi = \int _{\varOmega }q(u)\varphi + \int _{\varGamma }s(u,v)\varphi ;\\ \displaystyle \int _{\varGamma }\dfrac{\partial v}{\partial t}\psi + d_v\int _{\varGamma }\nabla _{\varGamma }v \cdot \nabla _{\varGamma }\psi + \int _{\varGamma }s(u,v)\psi = \int _{\varGamma }r(u,v)\psi , \end{array}\right. } \end{aligned}$$
(5)

for all \(\varphi \in L^\infty ([0,T]; H^1(\varOmega ))\) and \(\psi \in L^\infty ([0,T]; H^1(\varGamma ))\). The following theorem contains existence, uniqueness and regularity results for the weak problem (4).

Theorem 1

[Existence, uniqueness and regularity for problem (4)] If \(\varGamma \) is a \({\mathscr {C}}^3\) surface, \(f\in L^2(\varOmega )\) and \(g\in L^2(\varGamma )\), then the weak elliptic problem (4) has a unique solution \((u,v) \in H^2(\varOmega ) \times H^2(\varGamma )\) that fulfils

$$\begin{aligned}&\Vert (u,v)\Vert _{H^2(\varOmega ) \times H^2(\varGamma )} \le C \Vert (f,g)\Vert _{L^2(\varOmega ) \times L^2(\varGamma )}; \end{aligned}$$
(6)
$$\begin{aligned}&\Vert {{\,\mathrm{Tr}\,}}(u)\Vert _{H^{3/2}(\varOmega )} \le C \Vert (f,g)\Vert _{L^2(\varOmega ) \times L^2(\varGamma )}, \end{aligned}$$
(7)

where \(C>0\) is a constant that depends on \(\alpha , \beta \) and \(\varOmega \).

Proof

See [27] for estimate (6). Estimate (7) follows from (6) by using the trace inequality (A.11). \(\square \)

The problem of existence, uniqueness and regularity for the parabolic problem (5) is much more complicated and strictly depends on the nature of the kinetics \(q(\cdot )\), \(r(\cdot )\) and of the coupling kinetics \(s(\cdot )\). For the remainder of this work we will adopt the following set of assumptions.

Assumption 1

We assume that:

  • \(\varGamma \) is a \({\mathscr {C}}^3\) surface, qrs are \({\mathscr {C}}^2\) and globally Lipschitz functions.

  • The initial datum \((u_0,v_0)\) fulfils \(u_0\in H^2(\varOmega )\), \({{\,\mathrm{Tr}\,}}(u_0) \in H^2(\varGamma )\) and \(v_0\in H^2(\varGamma )\).

  • There exists a unique solution (uv) that fulfils

    $$\begin{aligned}&\Vert (u,{\dot{u}})\Vert _{L^\infty ([0,T]; H^{2+1/4}(\varOmega ))} + \Vert ({{\,\mathrm{Tr}\,}}(u),{{\,\mathrm{Tr}\,}}({\dot{u}}),v,{\dot{v}})\Vert _{L^\infty ([0,T]; H^2(\varGamma ))}\nonumber \\&\quad \qquad \le C \exp (T)\left( \Vert u_0\Vert _{H^{2+1/4}(\varOmega )} + \Vert ({{\,\mathrm{Tr}\,}}(u_0),v_0)\Vert _{H^2(\varGamma )}\right) , \end{aligned}$$
    (8)

    where \(T>0\) is the final time and \(C>0\) depends on \(\varOmega \), \(\Vert q\Vert _{{\mathscr {C}}^2({\mathbb {R}})}\), \(\Vert r\Vert _{{\mathscr {C}}^2({\mathbb {R}}^2)}\) and \(\Vert s\Vert _{{\mathscr {C}}^2({\mathbb {R}}^2)}\).

In many applications, assuming globally Lipschitz kinetics is too restrictive and an ad-hoc analysis is required, see for instance [29]. However, there are notable examples of BSRDS with globally Lipschitz kinetics, such as the wave pinning model studied in [22] and considered in the numerical example in Sect. 7.3. From here onwards, we shall assume that the weak parabolic problem (5) has a unique and sufficiently regular solution.

3 The bulk-surface virtual element method

In this section we will introduce the bulk-surface virtual element method (BSVEM) through the following steps:

  • describe the polygonal BS meshes that will be used in the method (Sect. 3.1);

  • quantify the geometric error arising from polygonal approximation of BS domains (Sect. 3.2);

  • introduce the discrete function spaces and bilinear forms used in the method and their approximation properties (Sects. 3.33.4);

  • present the spatially discrete formulations of the elliptic- and parabolic problems (1) and (2), respectively (Sect. 3.5).

3.1 Polygonal bulk-surface meshes

Let \(h>0\) be a positive number called meshsize and let \(\varOmega _h = \cup _{E\in {\mathscr {E}}_h} E\) be a polygonal approximation of the bulk \(\varOmega \), where \({\mathscr {E}}_h\) is a set of non-degenerate polygons. The polygonal bulk \(\varOmega _h\) automatically induces a piecewise linear approximation \(\varGamma _h\) of \(\varGamma \), defined by \(\varGamma _h = \partial \varOmega _h\), exactly as in the case of triangular meshes, see for example [27]. Notice that we can write \(\varGamma _h = \cup _{F \in {\mathscr {F}}_h} F\), where \({\mathscr {F}}_h\) is the set of the edges of \(\varOmega _h\) that lie on \(\varGamma _h\). We assume that:

  1. (F1)

    the diameter of each element \(E\in {\mathscr {E}}_h\) does not exceed h;

  2. (F2)

    for any two distinct elements \(E_1\), \(E_2 \in {\mathscr {E}}_h\), their intersection \(E_1\cap E_2\) is either empty, or a common vertex, or a common edge.

  3. (F3)

    all nodes of \(\varGamma _h\) lie on \(\varGamma \);

  4. (F4)

    every edge \(F\in {\mathscr {F}}_h\) is contained in the Fermi stripe U of \(\varGamma \) (see Fig. 1 for an illustration).

  1. (V1)

    there exists \(\gamma _1 > 0\) such that every \(E \in {\mathscr {E}}_h\) is star-shaped with respect to a ball of radius \(\gamma _1 h_E\), where \(h_E\) is the diameter of E;

  2. (V2)

    there exists \(\gamma _2 > 0\) such that for all \(E \in {\mathscr {E}}_h\), the distance between any two nodes of E is at least \(\gamma _2 h_E\).

Assumptions (F1)–(F4) are standard in the SFEM literature, see for instance [26], while assumptions (V1)–(V2) are standard in the VEM literature, see for instance [5]. The combined assumptions (F1)–(V2) will prove sufficient in our BS setting. In the following definitions and results we provide the necessary theory for estimating the geometric error arising from the boundary approximation.

Definition 1

(Essentials of polygonal BS meshes) An edge \({\bar{e}}\) of any element \(E\in {\mathscr {E}}_h\) is called a boundary edge if \({\bar{e}} \subset \varGamma _h\), otherwise \({\bar{e}}\) is called an inner edge. Let \({{\mathscr {B}}}{{\mathscr {E}}}(E)\) and \({{\mathscr {I}}}{{\mathscr {E}}}(E)\) be the sets of boundary and inner edges of E, respectively. An element \(E\in {\mathscr {E}}_h\) is called an external element if it has at least one boundary edge, otherwise E is called an internal element. Let \(\varOmega _B\) be the discrete narrow band defined as the union of the external elements of \(\varOmega _h\) as illustrated in Fig. 1b. From Assumption (F4), for any boundary edge \({\bar{e}}\), we have that \({\varvec{a}}({\bar{e}}) \subset \varGamma \), where \({\varvec{a}}\) is the normal projection defined in Lemma A1. Hence, for sufficiently small h and for all \(E\in {\mathscr {E}}_h\), it is possible to define the exact element \(\breve{E}\) as the compact set enclosed by the edges \({{\mathscr {I}}}{{\mathscr {E}}}(E) \cup \{{\varvec{a}}({\bar{e}}) | {\bar{e}} \in {{\mathscr {B}}}{{\mathscr {E}}}(E)\}\), see Fig. 2 for an illustration.

Remark 1

(Properties of polygonal BS meshes) For any BS mesh \((\varOmega _h, \varGamma _h)\) of meshsize \(h>0\) it holds that:

  • for sufficiently small \(h>0\), the discrete narrow band \(\varOmega _B\) is contained in the Fermi stripe U as shown in Fig. 1b (blue color).

  • the collection of all exact elements is a coverage of the exact bulk \(\varOmega \), that is \(\cup _{E\in {\mathscr {E}}_h} \breve{E} = \varOmega \).

Fig. 1
figure 1

Illustration of continuous domain, discrete domain and related notations (color figure online)

Let \(N \in {\mathbb {N}}\) and let \({\varvec{x}}_i\), \(i=1,\dots ,N\), be the nodes of \(\varOmega _h\), which can be ordered in an arbitrary way. However, if \(\varOmega \) has a rectangular shape and the nodes are ordered along a Cartesian grid, the matrices associated with the method will have a block-tridiagonal structure. Let \(M \in {\mathbb {N}}\), \(M < N\) and assume that the nodes of \(\varGamma _h\) are \({\varvec{x}}_k\), \(k=1,\dots ,M\), i.e. the first M nodes of \(\varOmega _h\). Throughout the paper we need the following reduction matrix \(R \in {\mathbb {R}}^{N\times M}\) defined as \(R := [I_M; 0]\), where \(I_M\) is the \(M\times M\) identity matrix. Equivalently, \(R = (r_{ik})\) is defined as

$$\begin{aligned} r_{ik} = {\left\{ \begin{array}{ll} \delta _{ik}&{} \quad \text {if}\ i=1,\ldots ,M;\\ 0&{} \quad \text {if}\ i=M+1,\ldots ,N, \end{array}\right. } \end{aligned}$$
(9)

for all \(k=1,\dots ,M\), where \(\delta _{ik}\) is the Kronecker symbol. The reduction matrix R fulfils the following two properties:

  • For \({\varvec{v}}\in {\mathbb {R}}^{N}\), \(R^T{\varvec{v}}\in {\mathbb {R}}^M\) is the vector with the first M entries of \({\varvec{v}}\);

  • For \({\varvec{w}}\in {\mathbb {R}}^M\), \(R{\varvec{w}}\in {\mathbb {R}}^N\) is the vector whose first M entries are those of \({\varvec{w}}\) and the other \(N-M\) entries are 0.

In what follows, we will use the matrix R for an optimised implementation of the BSVEM.

Fig. 2
figure 2

Construction of the exact element \(\breve{E}\) corresponding to the polygonal element E according to Definition 1. If h is sufficiently small (depending on the curvature of \(\varGamma \) and the mesh regularity parameter \(\gamma _1\)) then \(\varGamma \) cannot intersect any inner edges and curved elements are well-defined (subplots a, b). Otherwise, if \(\varGamma \) intersects an inner edge of E, \(\breve{E}\) is not well-defined (subplot c)

Fig. 3
figure 3

On the mapping \(G_E\) between a triangular element E and its curved counterpart \(\breve{E}\). In the case of one boundary edge (subplots a, b), a \({\mathscr {C}}^2\) homeomorphism \(G_E: E\rightarrow \breve{E}\) is known to exist, see [27]. With two adjacent boundary edges (subplots c, d) the mapping \(G_E\) cannot be smooth, because \(G_E\) maps the non-smooth curve \({\bar{e}}_1 \cup {\bar{e}}_2\) onto the smooth curve \({\varvec{a}}({\bar{e}}_1) \cup {\varvec{a}}({\bar{e}}_2)\). The approach proposed in Lemma 1 solves this issue

3.2 Variational crime

We now consider the geometric error due to the boundary approximation. Since the surface variational crime in surface finite elements is well-understood [26], we will mainly focus on the variational crime in the bulk. To this end, it is useful to analyse the relation between any element \(E\in {\mathscr {E}}_h\) and its exact counterpart \(\breve{E}\), see Definition 1. For the special case of triangular meshes with at most one boundary edge per element (Fig. 3a, b), there exists a \({\mathscr {C}}^2\) homeomorphism \(G_E: E \rightarrow \breve{E}\) that is quadratically close to the identity with respect to the meshsize, see [27]. Instead, when an element E has adjacent, non-collinear boundary edges (which can occur even in triangular meshes, see Fig. 3c, d), such a smooth homeomorphism does not exist. In fact, a smooth mapping cannot map the adjacent boundary edges of E - a curve with corners- onto a portion of the smooth curve \(\varGamma \). However, in the following result we show the existence of a homeomorphism between E and \(\breve{E}\) with slightly weaker regularity, which is sufficient for our purposes.

Lemma 1

(Parameterisation of the exact geometry) Let h be sufficiently small (depending on the curvature of \(\varGamma \) and the mesh regularity parameter \(\gamma _1\)) and let \({\mathscr {E}}_h\) fulfil assumptions (F1)–(V2). There exists a homeomorphism \(G:\varOmega _h \rightarrow \varOmega \) such that \(G \in W^{1,\infty }(\varOmega _h)\) and

$$\begin{aligned}&G|_{\varGamma _h} = {\varvec{a}}|_{\varGamma _h}; \end{aligned}$$
(10)
$$\begin{aligned}&G|_{\varOmega _h {\setminus } \varOmega _B} = Id; \end{aligned}$$
(11)
$$\begin{aligned}&\Vert JG - Id\Vert _{L^\infty (\varOmega _B)} \le Ch; \end{aligned}$$
(12)
$$\begin{aligned}&\Vert \det (JG) - 1\Vert _{L^\infty (\varOmega _B)} \le Ch; \end{aligned}$$
(13)
$$\begin{aligned}&\Vert G - Id\Vert _{L^\infty (\varOmega _B)} \le Ch^2, \end{aligned}$$
(14)

where \({\varvec{a}}\) is the normal projection defined in Lemma A1, JG is the Jacobian of G and C is a constant that depends on \(\varGamma \) and the constants \(\gamma _1\), \(\gamma _2\) are those considered in Assumptions (V1)–(V2).

Fig. 4
figure 4

Steps of the construction of G as described in the proof of Lemma 1

Proof

Let \(E\in {\mathscr {E}}_h\). By Assumption (V2), E is star-shaped with respect to a ball \({\mathscr {B}}_E\) of diameter \(R_E \ge \gamma _2 h_E\), let \({\varvec{x}}_E\) be the center of such ball (Fig. 4a). For \({\bar{e}} \in {{\mathscr {B}}}{{\mathscr {E}}}(E)\) let \(T_{{\bar{e}}}\) be the triangle spanned by \({\bar{e}}\) and \({\varvec{x}}_E\) (Fig. 4b). Let us now consider the collection of all the \(T_{{\bar{e}}}\)’s defined as follows: \({{\mathscr {B}}}{{\mathscr {T}}}_h := \{T_{{\bar{e}}} | {\bar{e}}\in E, \ E \in {\mathscr {E}}_h\}\). We need to prove that \({{\mathscr {B}}}{{\mathscr {T}}}_h\) is quasi-uniform, i.e we need to prove that for all \(E\in {\mathscr {E}}_h\) and \({\bar{e}}\in {{\mathscr {B}}}{{\mathscr {E}}}(E)\), the triangle \(T_{{\bar{e}}}\) has an inscribed ball of diameter greater or equal to \({\bar{\gamma }} h_E\), where the constant \({\bar{\gamma }} > 0\) depends on \(\gamma _1\) and \(\gamma _2\), only. To this end, if \(h_{{\bar{e}}}\) is the height of \(T_{{\bar{e}}}\) relative to the basis \({\bar{e}}\) (Fig. 4c), then we have

$$\begin{aligned} |h_{{\bar{e}}}| \ge R_E \ge \gamma _1 h_E, \qquad \text {from Assumption } (V1);\end{aligned}$$
(15)
$$\begin{aligned} |{\bar{e}}| \ge \gamma _2 h_E, \qquad \text {from Assumption } (V2). \end{aligned}$$
(16)

In addition, since no edge of \(T_{{\bar{e}}}\) is longer than \(h_E\), our claim follows.

Let now \(\breve{T}_{{\bar{e}}}\) be the curved triangle corresponding to \(T_{{\bar{e}}}\) (Fig. 4d), which again is well-defined by assuming h sufficiently small depending on \(\varGamma \) and \({\bar{\gamma }}\), which in turn depends on \(\gamma _1\) and \(\gamma _2\). Since the triangulation \({{\mathscr {B}}}{{\mathscr {T}}}_h\) is quasi-uniform, then, from [27, Proposition 4.7 and its proof] there exists a diffeomorphism \(G_{{\bar{e}}}: T_{{\bar{e}}} \rightarrow \breve{T}_{{\bar{e}}}\) such that

$$\begin{aligned}&G_{{\bar{e}}}({\varvec{x}}) = {\varvec{a}}({\varvec{x}}), \qquad \forall {\varvec{x}}\in {\bar{e}}; \end{aligned}$$
(17)
$$\begin{aligned}&G_{{\bar{e}}}({\varvec{x}}) = {\varvec{x}}, \qquad \forall {\varvec{x}}\in \partial T_{{\bar{e}}} {\setminus } {\bar{e}}; \end{aligned}$$
(18)
$$\begin{aligned}&\Vert JG_{{\bar{e}}} - Id\Vert _{L^\infty (T_{{\bar{e}}})} \le Ch; \end{aligned}$$
(19)
$$\begin{aligned}&\Vert \det (JG)_{{\bar{e}}} - 1\Vert _{L^\infty (T_{{\bar{e}}})} \le Ch; \end{aligned}$$
(20)
$$\begin{aligned}&\Vert G_{{\bar{e}}} - Id\Vert _{L^\infty (T_{{\bar{e}}})} \le Ch^2, \end{aligned}$$
(21)

where C depends only on \({\bar{\gamma }}\), which in turn depends only on \(\gamma _1\) and \(\gamma _2\). We are ready to construct the mapping \(G:\varOmega _h \rightarrow \varOmega \) as follows:

$$\begin{aligned} G({\varvec{x}}) = {\left\{ \begin{array}{ll} G_{{\bar{e}}}({\varvec{x}}), &{}\quad \text {if} \ {\varvec{x}}\in T_{{\bar{e}}} \ \text {for some} \ {\bar{e}} \in {{\mathscr {B}}}{{\mathscr {E}}}(E), \ E\in {\mathscr {E}}_h;\\ {\varvec{x}}, &{} \quad \text {otherwise}. \end{array}\right. } \end{aligned}$$
(22)

Property (18) can now be rephrased by saying that G, restricted to any inner edge of \(\varOmega _h\), is the identity. In addition, since all the \(G_{{\bar{e}}}\)’s are homeomorphisms, we obtain that G is a homeomorphism between \(\varOmega _h\) and \(\varOmega \). Finally, from (17), (19), (20) and (22) we obtain the desired estimates (10)–(13). \(\square \)

Remark 2

(Virtual elements for bulk-only PDEs) Lemma 1 has an important consequence in the analysis of boundary approximation for VEMs for the case of bulk PDEs. For triangular finite elements, boundary approximation is a well-understood topic, see for instance [21]. For VEMs, the first work in this direction is [10], in which a VEM on exact curved polygons is considered, in order to take out the geometric error. However, in the lowest-order VEM on polygonal domains, it is empirically known that the geometric error does not prevent optimality, as discussed in [14]. To the best of the authors’ knowledge, the present study provides, as a by-product, the first rigorous proof of this fact.

Thanks to Lemma 1 it is possible to define bulk- and surface-lifting operators.

Definition 2

(Bulk- and surface-lifting operators) Given \(V:\varOmega _h\rightarrow {\mathbb {R}}\) and \(W:\varGamma _h \rightarrow {\mathbb {R}}\), their lifts are defined by \(V^\ell := V \circ G^{-1}\) and \(W^\ell := W \circ G^{-1}\), respectively. Conversely, given \(v:\varOmega \rightarrow {\mathbb {R}}\) and \(w:\varGamma \rightarrow {\mathbb {R}}\), their inverse lifts are defined by \(v^{-\ell } := v \circ G\) and \(w^{-\ell } := w \circ G\), respectively, with \(G:\varOmega _h \rightarrow \varOmega \) being the mapping defined in Lemma 1.

Lemma 1 also enables us to show the equivalence of Sobolev norms under lifting as illustrated next.

Lemma 2

(Equivalence of norms under lifting) There exists two constants \(c_2> c_1 > 0\) depending on \(\varGamma \) and \(\gamma _2\) such that, for all \(V: \varOmega _h \rightarrow {\mathbb {R}}\) and for all \(W: \varGamma _h \rightarrow {\mathbb {R}}\),

$$\begin{aligned}&c_1 \Vert V^\ell \Vert _{L^2(\varOmega _h^\ell )} \quad \le \Vert V\Vert _{L^2(\varOmega _h)} \quad \le c_2\Vert V^\ell \Vert _{L^2(\varOmega _h^\ell )}; \end{aligned}$$
(23)
$$\begin{aligned}&c_1 |V^\ell |_{H^1(\varOmega _h^\ell )} \le |V|_{H^1(\varOmega _h)} \le c_2|V^\ell |_{H^1(\varOmega _h^\ell )}; \end{aligned}$$
(24)
$$\begin{aligned}&c_1 \Vert W^\ell \Vert _{L^2(\varGamma )} \quad \le \Vert W\Vert _{L^2(\varGamma _h)} \quad \le c_2\Vert W^\ell \Vert _{L^2(\varGamma )}; \end{aligned}$$
(25)
$$\begin{aligned}&c_1 |W^\ell |_{H^1(\varGamma )} \le |W|_{H^1(\varGamma _h)} \le c_2|W^\ell |_{H^1(\varGamma )}. \end{aligned}$$
(26)

Proof

Estimates (23)–(24) are found by using the map G introduced in Lemma 1 in the proof of [27, Proposition 4.9]. A proof of (25)–(26) can be found in [26, Lemma 4.2]. \(\square \)

We are ready to estimate the effect of lifting on bulk and surface integrals.

Lemma 3

(Geometric error of lifting) If \(u, \varphi \in H^{1}(\varOmega )\), then

$$\begin{aligned}&\left| \int _{\varOmega }\nabla u \cdot \nabla \varphi - \int _{\varOmega _h}\nabla u^{-\ell } \cdot \nabla \varphi ^{-\ell } \right| \le Ch|u|_{H^1(\varOmega _B^\ell )}|\varphi |_{H^1(\varOmega _B^\ell )}, \end{aligned}$$
(27)
$$\begin{aligned}&\left| \int _{\varOmega }u \varphi - \int _{\varOmega _h}u^{-\ell } \varphi ^{-\ell } \right| \le Ch \Vert u\Vert _{L^{2}(\varOmega _B^\ell )}\Vert \varphi \Vert _{L^{2}(\varOmega _B^\ell )}, \end{aligned}$$
(28)

where C depends on \(\varGamma \), \(\gamma _1\) and \(\gamma _2\). If \(v,\psi \in H^1(\varGamma )\), then

$$\begin{aligned}&\left| \int _{\varGamma }\nabla _{\varGamma }v \cdot \nabla _{\varGamma }\psi - \int _{\varGamma _h}\nabla _{\varGamma _h}v^{-\ell } \cdot \nabla _{\varGamma _h}\psi ^{-\ell } \right| \le Ch^2|v|_{H^1(\varGamma )} |\psi |_{H^1(\varGamma )}; \end{aligned}$$
(29)
$$\begin{aligned}&\left| \int _{\varGamma }v \psi - \int _{\varGamma _h}v^{-\ell } \psi ^{-\ell } \right| \le Ch^2\Vert v\Vert _{L^2(\varGamma )} \Vert \psi \Vert _{L^2(\varGamma )}, \end{aligned}$$
(30)

where C depends on \(\varGamma \), \(\gamma _1\) and \(\gamma _2\).

Proof

To prove (27)–(28) it is sufficient to use the bulk geometric estimates (11)–(13) in the proof of [27, Lemma 6.2]. A proof of (29)–(30) can be found in [26]. \(\square \)

Lemma 3 allows to rephrase integrals on the exact domain as integrals on the discrete domain, and vice-versa, up to a small error that is O(h) in the bulk and \(O(h^2)\) on the surface.

Remark 3

(Preservation of regularity under lifting) For \(E\in {\mathscr {E}}_h\) the inverse lift of an \(H^2(\breve{E})\) function is not, in general, \(H^2(E)\), cf. Remark A1 and Lemma 1. This problem does not arise in the context of triangular BSFEM in the presence of curved boundaries, see for instance [33], or bulk virtual elements in the absence of curved boundaries, see for instance [1]. In the context of bulk- or bulk-surface VEM in the presence of curved boundaries, our analysis requires full \(H^2\)-regularity of the exact solution mapped on the polygonal domain. Because \(u^{-\ell }\) does not retain the regularity class of u, we need an alternative mapping of u on \(\varGamma _h\). To this end, we recall the following Theorem.

Theorem 2

(Sobolev extension theorem) Assume that \(\varOmega \) has a Lipschitz boundary \(\varGamma \), let \(r\in {\mathbb {N}}\) and \(p\in [1,+\infty ]\). Then, for any function \(u\in W^{r,p}(\varOmega )\), there exists an extension \({\tilde{u}} \in W^{r,p}({\mathbb {R}}^2)\) such that \({\tilde{u}}_{|\varOmega } = u\) and

$$\begin{aligned} \Vert {\tilde{u}}\Vert _{W^{r,p}({\mathbb {R}}^2)} \le C\Vert u\Vert _{W^{r,p}(\varOmega )}, \end{aligned}$$
(31)

where C depends on \(\varOmega \) and r.

Proof

See [44]. \(\square \)

We are now able to approximate the exact solution u through \({\tilde{u}}_{|\varOmega _h}\), that is the restriction to \(\varOmega _h\) of the Sobolev extension \({\tilde{u}}\) of the exact solution u. We present the following variant of Lemma 3 in order to quantify the geometric error of the Sobolev extension.

Lemma 4

(Geometric error of Sobolev extension) If \(0< \gamma < 1\), there exist \(C>0\) and \(C_\gamma > 0\) such that

$$\begin{aligned}&\Vert {\tilde{u}} - u^{-\ell }\Vert _{L^2(\varOmega _h)} \le Ch^2\Vert u\Vert _{H^{1+3/4}(\varOmega )}, \quad \forall \ u\in H^{1+3/4}(\varOmega ); \end{aligned}$$
(32)
$$\begin{aligned}&|{\tilde{u}} - u^{-\ell }|_{H^1(\varOmega _h)} \le Ch^{3/2}\Vert u\Vert _{H^2(\varOmega )} + C_\gamma h^{1/2 + 2\gamma }\Vert u\Vert _{H^{2+\gamma }(\varOmega )}, \quad \forall \ u\in H^{2 + \gamma }(\varOmega ). \end{aligned}$$
(33)

Proof

By using (31), (A.14) with \(\gamma = \frac{3}{4}\), (11) and (14) we have that

$$\begin{aligned} \begin{aligned}&\Vert {\tilde{u}}-u^{-\ell }\Vert _{L^2(\varOmega _h)} = \Vert {\tilde{u}} - {\tilde{u}}\circ G\Vert _{L^2(\varOmega _h)} \le C\Vert {\tilde{u}}\Vert _{H^{1+3/4}(\varOmega _h)}\Vert (Id-G)^{3/4}\Vert _{L^2(\varOmega _h)}\\ =&C\Vert u\Vert _{H^{1+3/4}(\varOmega )}\Vert (Id-G)^{3/4}\Vert _{L^2(\varOmega _B)} \le C\Vert u\Vert _{H^{1+3/4}(\varOmega )} |\varOmega _B|^{1/2}\Vert Id-G\Vert _{L^\infty (\varOmega _B)}^{3/4}\\ \le&Ch^{1/2} h^{3/2}\Vert u\Vert _{H^{1+3/4}(\varOmega )} = Ch^2\Vert u\Vert _{H^{1+3/4}(\varOmega )}, \end{aligned} \end{aligned}$$
(34)

which proves (32). Notice that, in the last line of (34), the \(h^{1/2}\) term is the effect of the Sobolev extension being exact except on the discrete narrow band \(\varOmega _B\), while the \(h^{3/2}\) terms is the approximation accuracy, which is intuitively justified by the Sobolev index \(1+3/4\) being larger than the exponent 3/2. Using (A.10), (31), (12) and (14) we have that

$$\begin{aligned}&|{\tilde{u}} - u^{-\ell }|_{H^1(\varOmega _h)} = \Vert \nabla {\tilde{u}} - (JG^T\ \nabla {\tilde{u}})\circ G\Vert _{L^2(\varOmega _h)}\nonumber \\ \le&\Vert (Id-JG^T\circ G)\Vert _{L^\infty (\varOmega _h)}\Vert \nabla {\tilde{u}}\Vert _{L^2(\varOmega _B)} + \Vert JG^T\circ G\Vert _{L^\infty (\varOmega _h)}\Vert \nabla {\tilde{u}} - \nabla {\tilde{u}}\circ G\Vert _{L^2(\varOmega _h)}\nonumber \\ \le&Ch\Vert \nabla {\tilde{u}}\Vert _{L^2(\varOmega _B)} + C\Vert \nabla {\tilde{u}} - \nabla {\tilde{u}}\circ G\Vert _{L^2(\varOmega _h)} \le Ch^{3/2}\Vert u\Vert _{H^2(\varOmega )} \nonumber \\&+ C\Vert \nabla {\tilde{u}} - \nabla {\tilde{u}}\circ G\Vert _{L^2(\varOmega _h)}. \end{aligned}$$
(35)

Since \({\tilde{u}} \in H^{2+\gamma }(\varOmega _h)\), then \(\nabla {\tilde{u}} \in H^{1+\gamma }(\varOmega _h)\). Hence, by reasoning as in (34) we have that

$$\begin{aligned} \Vert \nabla {\tilde{u}} - \nabla {\tilde{u}}\circ G\Vert _{L^2(\varOmega _h)} \le C_\gamma h^{1/2 + 2\gamma }\Vert u\Vert _{H^{2+\gamma }(\varOmega )}. \end{aligned}$$
(36)

By substituting (36) into (35) we get the desired estimate. \(\square \)

3.3 Virtual element space and operators in the bulk

In this section, we define virtual element spaces on polygons and polygonal domains by following [7]. Let E be a polygon in \({\mathbb {R}}^2\). A preliminary virtual element space on E is given by

$$\begin{aligned} \tilde{{\mathbb {V}}}(E) := \Big \{v \in H^1(E) \cap {\mathscr {C}}^0(E) \Big |\ v_{|e} \in {\mathbb {P}}_1(e),\ \forall \ e\in \text {edges}(E) \wedge \varDelta v \in {\mathbb {P}}_1(E) \Big \},\nonumber \\ \end{aligned}$$
(37)

where \({\mathbb {P}}_1(E)\) is the space of linear polynomials on the polygon E. The functions in \({\mathbb {V}}(E)\) are not known in closed form, but we are able to use them in a spatially discrete method, hence the name virtual. Let us consider the elliptic projection \(\varPi ^\nabla _E : \tilde{{\mathbb {V}}}(E) \rightarrow {\mathbb {P}}_1(E)\) defined by

$$\begin{aligned} \displaystyle \int _E \nabla \big (v - \varPi ^\nabla _E v\big ) \cdot \nabla p_1 = 0 \quad \forall \ p_1 \in {\mathbb {P}}_1(E); \quad \text {and} \quad \displaystyle \int _{\partial E} \big (v - \varPi ^\nabla _E v\big ) = 0. \end{aligned}$$
(38)

Using Green’s formula, it is easy to see that the operator \(\varPi _E^\nabla \) is computable, see [3] for the details. The so-called enhanced virtual element space in two dimensions is now defined as follows:

$$\begin{aligned} {\mathbb {V}}(E) := \left\{ v\in \tilde{{\mathbb {V}}}(E) \Bigg |\ \int _E vp_1 = \int _E \big (\varPi _E^\nabla v\big )p_1,\ \forall \ p_1\in {\mathbb {P}}_1(E)\right\} . \end{aligned}$$
(39)

The practical usability of the space \({\mathbb {V}}(E)\) stems from the following result.

Proposition 1

(Degrees of freedom) Let \(n\in {\mathbb {N}}\). If E is a polygon with n vertices \({\varvec{x}}_i\), \(i=1,\dots ,n\), then \(\dim ({\mathbb {V}}(E)) = n\) and each function \(v \in {\mathbb {V}}(E)\) is uniquely defined by the nodal values \(v({\varvec{x}}_i)\), \(i=1,\dots ,n\). Hence, the nodal values constitute a set of degrees of freedom.

Proof

See [3]. \(\square \)

For \(s=1,2\) we define the broken bulk Sobolev seminorms as \(|u|_{s,\varOmega ,h} := \sum _{E\in {\mathscr {E}}_h} |u_{|E}|_{H^s(E)}\). The approximation properties of the space \({\mathbb {V}}(E)\) are given by the following result.

Proposition 2

(Projection error on \({\mathbb {P}}_1(E)\)) For all \(E\in {\mathscr {E}}_h\), \(s\in \{1,2\}\) and \(w\in H^s(E)\) there exists \(w_\pi \in {\mathbb {P}}_1(E)\) such that

$$\begin{aligned} \Vert w-w_\pi \Vert _{L^2(E)} + h_E |w-w_\pi |_{H^1(E)} \le C h_E^s |w|_{H^s(E)}, \end{aligned}$$
(40)

where C is a constant that depends only on \(\gamma _1\).

Proof

See [3]. \(\square \)

Remark 4

(Regularity of \({\mathbb {V}}(E)\) functions) If E is a convex polygon, from the properties of the Poisson problem on convex Lipschitz domains, it holds that \({\mathbb {V}}(E) \subset H^2(E)\), see for instance [42]. Otherwise, if E is non-convex, we may only assert that \({\mathbb {V}}(E) \subset H^{1+\varepsilon }(E)\) for \(0\le \varepsilon < 1/2\), see [42]. In either case, \({\mathbb {V}}(E) \subset {\mathscr {C}}^0(E)\), see Theorem A4. We will account for this regularity issue in devising a numerical method with optimal convergence.

The discontinuous and continuous bulk virtual element spaces are defined by pasting local spaces:

$$\begin{aligned} {\mathbb {V}}_{\varOmega ,h} := \{v:\varOmega _h \rightarrow {\mathbb {R}} \ | \ v_{|E} \in {\mathbb {V}}(E), \ \forall \ E \in {\mathscr {E}}_h\}; \quad \text {and} \quad {\mathbb {V}}_\varOmega := {\mathbb {V}}_{\varOmega ,h} \cap {\mathscr {C}}^0(\varOmega _h). \end{aligned}$$
(41)

Thanks to Remark 4, the only source of discontinuity in \({\mathbb {V}}_{\varOmega ,h}\) are jumps across edges. In \({\mathbb {V}}_\varOmega \) we consider the Lagrange basis \(\{\varphi _i, \ i=1,\dots ,N\}\) where, for each \(i=1,\dots ,N\), \(\varphi _i\) is the unique \({\mathbb {V}}_\varOmega \) function such that \(\varphi _i({\varvec{x}}_j) = \delta _{ij},\) for all \(j = 1,\dots ,N\), with \(\delta _{ij}\) being the Kronecker symbol. The set \(\{\varphi _i, \ i=1,\dots ,N\}\) is a basis of \({\mathbb {V}}_\varOmega \) thanks to Proposition 1.

In the remainder of this section, let E be an element of \(\varOmega _h\). The stabilizing form \(S_E: {\mathbb {V}}(E) \times {\mathbb {V}}(E) \rightarrow {\mathbb {R}}\), is defined by

$$\begin{aligned} S_E(v,w) := \sum _{P \in \text { vertices }(E)} v(P)w(P), \qquad \forall \ v,\, w\in {\mathbb {V}}(E). \end{aligned}$$
(42)

The \(L^2\) projector \(\varPi ^0_E: {\mathbb {V}}(E) \rightarrow {\mathbb {P}}_1(E)\) is defined as follows: for \(w \in {\mathbb {V}}(E)\):

$$\begin{aligned} \int _E \big (w - \varPi ^0_E w\big )p_1 = 0, \qquad \forall \ \ p_1\in {\mathbb {P}}_1(E). \end{aligned}$$
(43)

As shown in [3], \(\varPi _E^0\) is computable because \(\varPi _E^0 = \varPi _E^\nabla \). Even if \(\varPi _E^0\) is not a new projector, the presentation and the analysis of the method benefit from the usage of the equivalent definition (43). Moreover, since \(\varPi _E^0 = \varPi _E^\nabla \), the boundedness property of projection operators in Hilbert spaces translates to

$$\begin{aligned} \big \Vert \varPi ^0_E w\big \Vert _{L^2(E)} \le \Vert w\Vert _{L^2(E)} \quad \text {and} \qquad \big |\varPi ^0_E w\big |_{H^1(E)} \le |w|_{H^1(E)}. \end{aligned}$$
(44)

We are now ready to introduce the approximate \(L^2\) bilinear form \(m_E: {\mathbb {V}}(E) \times {\mathbb {V}}(E) \rightarrow {\mathbb {R}}\) and the approximate gradient-gradient bilinear form \(a_E: {\mathbb {V}}(E) \times {\mathbb {V}}(E) \rightarrow {\mathbb {R}}\), defined as follows:

$$\begin{aligned}&m_E(v,w) := \int _E \big (\varPi ^0_E v\big )\big (\varPi ^0_E w\big ) + {{\,\mathrm{area}\,}}(E) S_E\big (v-\varPi ^0_E v, w - \varPi ^0_E w\big ); \end{aligned}$$
(45)
$$\begin{aligned}&a_E(v,w) := \int _E \big (\nabla \varPi ^\nabla _E v\big ) \cdot \big (\nabla \varPi ^\nabla _E w\big ) + h_E S_E\big (v-\varPi ^\nabla _E v, w - \varPi ^\nabla _E w\big ), \end{aligned}$$
(46)

for all \(v,w\in {\mathbb {V}}(E)\), where \(h_E\) is the diameter of E. The following result easily follows from the construction of the bilinear forms of \(a_E\) and \(m_E\).

Proposition 3

(Stability and consistency) The bilinear forms \(a_h\) and \(m_h\) are consistent, i.e. for all \(v \in {\mathbb {V}}(E)\) and \(p \in {\mathbb {P}}_1(E)\)

$$\begin{aligned} a_E(v,p) = \int _E \nabla v \cdot \nabla p; \quad \text {and} \quad m_E(v,p) = \int _E vp. \end{aligned}$$
(47)

The bilinear forms \(a_h\) and \(m_h\) are stable, meaning that there exists two constants \(0< \alpha _* < \alpha ^*\) depending on \(\gamma _2\) such that, for all \(v\in {\mathbb {V}}(E)\)

$$\begin{aligned}&\alpha _* \int _E \nabla v \cdot \nabla v \le a_E(v,v) \le \alpha ^* \int _E \nabla v \cdot \nabla v; \end{aligned}$$
(48)
$$\begin{aligned}&\alpha _* \int _E v^2 \le m_E(v,v) \le \alpha ^* \int _E v^2. \end{aligned}$$
(49)

Proof

See [5]. \(\square \)

We observe from (48) and (49) that the error in the approximate bilinear forms \(a_E\) and \(m_E\) is not a function of the meshsize h, see also [5]. Nevertheless, we will show that the method retains optimal convergence thanks to the consistency properties (47). The global bilinear forms are defined by pasting the corresponding local bilinear forms as follows:

$$\begin{aligned}&m_h(v,w) := \sum _{E\in {\mathscr {E}}_h} m_E(v_{|E}, w_{|E}); \quad \text {and} \quad a_h(v,w) := \sum _{E\in {\mathscr {E}}_h} a_E(v_{|E}, w_{|E}), \end{aligned}$$
(50)

for all \(v,w\in {\mathbb {V}}_\varOmega \). A consequence of Proposition 3 is that \(m_h: {\mathbb {V}}_\varOmega \times {\mathbb {V}}_\varOmega \rightarrow {\mathbb {R}}\) is positive definite, while \(a_h: {\mathbb {V}}_\varOmega \times {\mathbb {V}}_\varOmega \rightarrow {\mathbb {R}}\) is positive semi-definite.

The bilinear form \(m_h\) is not sufficient to discretise load terms like \(\int _{\varOmega }f\varphi \), because f is not in the space \({\mathbb {V}}_\varOmega \). We resolve this issue by combining the approaches in [1, 30, 31]. From [1] we take the usage of the projection operator \(\varPi ^\nabla \), while from [30, 31] we take the usage of the Lagrange interpolant. In the context of virtual elements, the Lagrange interpolant is defined as follows.

Definition 3

(Virtual Lagrange Interpolant) Given an element-wise continuous function \(f : \varOmega _h \rightarrow {\mathbb {R}}\), \(f_{|E} \in {\mathscr {C}}(E)\) for all \(E\in {\mathscr {E}}_h\), the virtual Lagrange interpolant \(I_\varOmega f\) of f is the unique \({\mathbb {V}}_{\varOmega ,h}\) function such that \(I_\varOmega f_{|E}({\varvec{x}}_i) = f({\varvec{x}}_i)\) for all i,  with \({\varvec{x}}_i\in \ \text {nodes}(E)\) and \(E\in {\mathscr {E}}_h\). In particular, if \(f \in {\mathscr {C}}(\varOmega _h)\), then \(I_\varOmega f\) is the unique \({\mathbb {V}}_\varOmega \) function such that \(I_\varOmega f({\varvec{x}}_i) = f({\varvec{x}}_i)\) for all \(i=1,\dots ,N\).

The following result provides an estimate of the interpolation error in the bulk.

Proposition 4

(Interpolation error in the bulk) If \(w:\varOmega _h \rightarrow {\mathbb {R}}\) is such that \(w_E \in H^2(E)\) for all \(E\in {\mathscr {E}}_h\), then the interpolant \(I_\varOmega (w)\) fulfils

$$\begin{aligned} \Vert w-I_\varOmega (w)\Vert _{L^2(\varOmega _h)} + h|w-I_\varOmega (w)|_{1,\varOmega ,h} \le Ch^2|w|_{2,\varOmega ,h}, \end{aligned}$$
(51)

where \(C>0\) depends only on \(\gamma _1\).

Proof

See [3]. \(\square \)

Unlike projection operators, we may not assert that the interpolant \(I_\varOmega \) is bounded in \(L^2(\varOmega _h)\). However from (51) we have the quasi-boundedness of \(I_\varOmega \) in \(L^2(\varOmega _h)\):

$$\begin{aligned} \Vert I_\varOmega (w)\Vert _{L^2(\varOmega _h)} \le \Vert w\Vert _{L^2(\varOmega _h)} + Ch^2|w|_{2,\varOmega ,h}, \end{aligned}$$
(52)

where \(C>0\) depends only on \(\gamma _1\).

3.4 Finite element space and operators on the surface

Let \(F\in {\mathscr {F}}_h\) be an edge of the approximated curve \(\varGamma _h\). The local finite element space on F is the space \({\mathbb {P}}_1(F)\) of linear polynomials on F. The global finite element space on \(\varGamma _h\) is defined by \({\mathbb {V}}_\varGamma := \Big \{v \in {\mathscr {C}}^0(\varGamma _h) \Big |\ v_{|F} \in {\mathbb {P}}_1(F),\ \text {for all}\ F \in {\mathscr {F}}_h \Big \}\), i.e. the space of piecewise linear functions on the approximated curve \(\varGamma _h\). On \({\mathbb {V}}_\varGamma \) we consider the Lagrange basis \(\{\psi _k, \ k=1,\dots , M\}\) where, for each \(k=1,\dots , M\), \(\psi _k\) is the unique \({\mathbb {V}}_\varGamma \) function such that \(\psi _k({\varvec{x}}_l) = \delta _{kl}\) for all \(l=1,\dots ,M\). It is easy to see that the Lagrange basis functions of \({\mathbb {V}}_\varGamma \) are the restrictions to \(\varGamma _h\) of the first M Lagrange basis functions of \({\mathbb {V}}_\varOmega \), that is:

$$\begin{aligned} \varphi _{k | \varGamma _h} = \psi _k, \qquad \forall \ k=1,\dots ,M. \end{aligned}$$
(53)

Before introducing the spatially discrete formulations, we are left to treat terms like \(\int _{\varGamma }g\varphi \), since g is not in the boundary finite element space \({\mathbb {V}}_\varGamma \).

Definition 4

(Surface Lagrange interpolant) If \(g: \varGamma _h \rightarrow {\mathbb {R}}\) is a continuous function, the Lagrange interpolant \(I_\varGamma (g)\) of g is the unique \({\mathbb {V}}_\varGamma \) function such that \(I_\varGamma (g)({\varvec{x}}_i) = g({\varvec{x}}_i)\) for all \(i=1,\dots ,M\).

We consider the broken surface Sobolev norm \(|v|_{2,\varGamma ,h} := \sum _{F\in {\mathscr {F}}_h} |V_{|F}|_{H^2(F)}\). The following are basic properties of Lagrange interpolation.

Lemma 5

(Properties of Lagrange interpolation on the surface) Let \(v\in {\mathscr {C}}^0(\varGamma _h)\) such that, for every \(F\in {\mathscr {F}}_h\), \(v_{|F}\in H^2(F)\). Then

$$\begin{aligned} \Vert v-I_\varGamma (v)\Vert _{L^2(\varGamma _h)} + h|v-I_\varGamma (v)|_{H^1(\varGamma _h)} \le Ch^2 |v|_{2,\varGamma ,h}. \end{aligned}$$
(54)

For any \(w\in {\mathscr {C}}^0(\varGamma _h)\), the surface Lagrange interpolant fulfils

$$\begin{aligned} {|I_\varGamma (w)| \le I_\varGamma (|w|).} \end{aligned}$$
(55)

Proof

See [21] for (54), while (55) is a consenquence of \(I_\varGamma \) being monotonic, i.e. \(I_\varGamma (v) \le I_\varGamma (w)\) for any \(v,w \in {\mathscr {C}}^0(\varGamma _h)\) such that \(v\le w\). \(\square \)

3.5 The spatially discrete formulations

We are now ready to introduce the BSVEM discretisation of the weak problems (4) and (5). The discrete counterpart of the elliptic problem (4) is: find \(U \in {\mathbb {V}}_\varOmega \) and \(V \in {\mathbb {V}}_\varGamma \) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle a_h(U, \varphi ) + m_h(U, \varphi ) + \int _{\varGamma _h}(\alpha U -\beta V)\varphi = m_h(I_\varOmega (f), \varphi );\\ \displaystyle \int _{\varGamma _h}\nabla _{\varGamma _h}V\cdot \nabla _{\varGamma _h}\psi + \int _{\varGamma _h}(-\alpha U + (\beta +1)V) \psi = \int _{\varGamma _h}I_\varGamma (g)\psi , \end{array}\right. } \end{aligned}$$
(56)

for all \(\varphi \in {\mathbb {V}}_\varOmega \) and \(\psi \in {\mathbb {V}}_{\varGamma }\). We express the spatially discrete solution (UV) in the Lagrange bases as follows:

$$\begin{aligned} U({\varvec{x}}) = \sum _{i=1}^N \xi _i \varphi _i({\varvec{x}}), \quad {\varvec{x}}\in \varOmega _h;\quad \text {and} \quad V({\varvec{x}}) = \sum _{k=1}^M \eta _k \psi _k({\varvec{x}}), \quad {\varvec{x}}\in \varGamma _h. \end{aligned}$$
(57)

Hence, problem (56) is equivalent to: find \(\varvec{\xi }:= (\xi _i, \dots , \xi _N)^{T} \in {\mathbb {R}}^{N}\) and \(\varvec{\eta }:= (\eta _1,\dots , \eta _M)^{T} \in {\mathbb {R}}^{M}\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \begin{aligned} &{}\displaystyle \sum _{i=1}^N\displaystyle \xi _i a_h(\varphi _i, \varphi _j) + \sum _{i=1}^N \xi _i m_h(\varphi _i, \varphi _j) + \alpha \sum _{k=1}^M \xi _k \int _{\varGamma _h}\varphi _k\varphi _l\\ &{} \qquad \quad - \beta \sum _{k=1}^M \eta _k \int _{\varGamma _h}\psi _k\varphi _l = \sum _{i=1}^N f({\varvec{x}}_i) m_h(\varphi _i, \varphi _j); \\ &{}\displaystyle \sum _{k=1}^M \eta _k \int _{\varGamma _h}\nabla _{\varGamma _h}\psi _k \cdot \nabla _{\varGamma _h}\psi _l -\alpha \sum _{k=1} ^ M \xi _k \int _{\varGamma _h}\varphi _k\psi _l\\ &{} \qquad \quad + (\beta +1) \sum _{k=1} ^ M \eta _k \int _{\varGamma _h}\psi _k \psi _l = \sum _{k=1}^M g({\varvec{x}}_k)\int _{\varGamma _h}\psi _k\psi _l, \end{aligned} \end{array}\right. } \end{aligned}$$
(58)

for all \(j=1,\dots ,N\) and \(l=1,\dots ,M\). We consider the matrices \(A_\varOmega = (a_{i,j}^{\varOmega }) \in {\mathbb {R}}^{N\times N}\), \(M_\varOmega = (m_{i,j}^{\varOmega }) \in {\mathbb {R}}^{N \times N}\), \(A_\varGamma = (a_{k,l}^{\varGamma }) \in {\mathbb {R}}^{M \times M}\), \(M_{\varGamma } = (m_{k,l}^{\varGamma }) \in {\mathbb {R}}^{M \times M}\) defined as follows:

$$\begin{aligned}&a_{i,j}^{\varOmega } := a_h(\varphi _i, \varphi _j); \quad \text {and}\quad m_{i,j}^{\varOmega } := m_h(\varphi _i, \varphi _j), \qquad i,j=1,\dots ,N; \end{aligned}$$
(59)
$$\begin{aligned}&a_{k,l}^{\varGamma } := \int _{\varGamma _h}\nabla _{\varGamma _h}\psi _k\cdot \nabla _{\varGamma _h}\psi _l; \quad \text {and} \quad m_{k,l}^{\varGamma } := \int _{\varGamma _h}\psi _k \psi _l, \qquad k,l=1,\dots ,M. \end{aligned}$$
(60)

Moreover, we consider the column vectors \({\varvec{f}}\in {\mathbb {R}}^{N}\) and \({\varvec{g}}\in {\mathbb {R}}^{M}\) defined by

$$\begin{aligned} {\varvec{f}}:= \left[ \begin{matrix} f({\varvec{x}}_1)\\ \vdots \\ f({\varvec{x}}_N) \end{matrix}\right] ; \qquad \text {and} \qquad {\varvec{g}}:= \left[ \begin{matrix} g({\varvec{x}}_1)\\ \vdots \\ g({\varvec{x}}_M) \end{matrix}\right] . \end{aligned}$$
(61)

By using (53) we can now rewrite the discrete formulation (58) in matrix-vector form as a block \((N+M)\times (N+M)\) linear algebraic system:

$$\begin{aligned} {\left\{ \begin{array}{ll} A_\varOmega \varvec{\xi }+ M_\varOmega \varvec{\xi }+ \alpha RM_\varGamma R^T\varvec{\xi }- \beta R M_\varGamma \varvec{\eta }= M_\varOmega {\varvec{f}};\\ A_\varGamma \varvec{\eta }- \alpha M_\varGamma R^T\varvec{\xi }+ (\beta + 1) M_\varGamma \varvec{\eta }= M_\varGamma {\varvec{g}}. \end{array}\right. } \end{aligned}$$
(62)

In compact form, (62) reads

(63)

which is uniquely solvable, as a consequence of the positive semi-definiteness of \(a_h\) and the positive definiteness of \(m_h\).

The spatial discretisation of the parabolic problem (5) is: find \(U\in L^2([0,T]; {\mathbb {V}}_\varOmega )\) and \(V \in L^2([0,T]; {\mathbb {V}}_\varGamma )\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle m_h \left( \dfrac{\partial U}{\partial t}, \varphi \right) + d_u a_h(U, \varphi ) = m_h(\varPi ^0 I_\varOmega (q(\varPi ^0 U)), \varphi ) + \int _{\varGamma _h}I_\varGamma (s(U,V))\varphi ;\\ \displaystyle \int _{\varGamma _h}\dfrac{\partial V}{\partial t}\psi + d_v \int _{\varGamma }\nabla _{\varGamma _h}V \cdot \nabla _{\varGamma _h}\psi + \int _{\varGamma _h}I_\varGamma (s(U,V))\psi = \int _{\varGamma _h}I_\varGamma (r(U,V))\psi , \end{array}\right. } \end{aligned}$$
(64)

for all \(U \in L^2([0,T]; {\mathbb {V}}_{\varOmega })\) and \(V \in L^2([0,T]; {\mathbb {V}}_\varGamma )\). The discrete initial conditions are prescribed as follows

$$\begin{aligned} U_0 = I_\varOmega (u_0); \quad \text {and} \quad V_0 = I_\varGamma (v_0). \end{aligned}$$
(65)

Remark 5

(Special cases) If every element \(E\in {\mathscr {E}}_h\) is convex or f is linear, optimal convergence is retained by replacing the term \(m_h(\varPi ^0 I_\varOmega (q(\varPi ^0 U)), \varphi )\) with \(m_h(I_\varOmega (q(U)), \varphi )\), i.e by removing the projection operator \(\varPi ^0\). By expressing the time-dependent semi-discrete solution (UV) in the Lagrange bases as follows

$$\begin{aligned} U({\varvec{x}}, t) = \sum _{i=1}^N \xi _i(t) \varphi _i({\varvec{x}}), \quad {\varvec{x}}\in \varOmega _h; \quad \text {and} \quad V({\varvec{x}}, t) = \sum _{k=1}^M \eta _k(t) \psi _k({\varvec{x}}), \quad {\varvec{x}}\in \varGamma _h, \end{aligned}$$
(66)

the fully discrete problem can be written in matrix-vector form as an \((M+N)\times (M+N)\) nonlinear ODE system of the form:

$$\begin{aligned} {\left\{ \begin{array}{ll} M_\varOmega {\dot{\varvec{\xi }}}(t) + {d_u}K_\varOmega \varvec{\xi }(t)= M_\varOmega q(\varvec{\xi }(t)) + R M_\varGamma h(R^{T}\varvec{\xi }(t), \varvec{\eta }(t));\\ M_\varGamma {\dot{\varvec{\eta }}}(t) + {d_v}K_\varGamma \varvec{\eta }(t) = - M_\varGamma h(R^{T}\varvec{\xi }(t), \varvec{\eta }(t)) + M_\varGamma r(R^{T}\varvec{\xi }(t), \varvec{\eta }(t)). \end{array}\right. } \end{aligned}$$
(67)

Remark 6

(Implementation) Thanks to the reduction matrix R, we are able to implement the spatially discrete problems (62) and (67) by using only two kinds of mass matrix (\(M_\varOmega \) and \(M_\varGamma \)), two kinds of stiffness matrix (\(A_\varOmega \) and \(A_\varGamma \)) and R itself. In previous works on BSFEM as illustrated in [37], five kinds of mass matrix were used to evaluate the integrals appearing in the spatially discrete formulation. We stress once again that, since the pre-existing BSFEM is a special case of the proposed BSVEM, this work provides, as a by-product, an optimised matrix implementation of the BSFEM.

4 Stability and convergence analysis

The spatially discrete parabolic problem (64) fulfils the following stability estimates.

Lemma 6

[Stability estimates for the spatially discrete parabolic problem (64)] There exists \(C>0\) depending on q, r, s and \(\varOmega \) such that

$$\begin{aligned}&\sup _{t\in [0,T]}\Big (\Vert U\Vert _{L^2(\varOmega _h)}^2 + \Vert V\Vert _{L^2(\varGamma _h)}^2\Big ) + \int _0^T \Big (|U|_{H^1(\varOmega _h)}^2 + |V|_{H^1(\varGamma _h)}^2\Big )\nonumber \\&\le C\Big (1+ \Vert U_0\Vert _{L^2(\varOmega _h)}^2 + \Vert V_0\Vert _{L^2(\varGamma _h)}^2\Big )\exp (CT); \end{aligned}$$
(68)
$$\begin{aligned}&\sup _{t\in [0,T]}\Big (|U|_{H^1(\varOmega _h)}^2 + |V|_{H^1(\varGamma _h)}^2\Big ) + \int _0^T \Big (\Vert {\dot{U}}\Vert _{L^2(\varOmega _h)}^2 + \Vert {\dot{V}}\Vert _{L^2(\varGamma _h)}^2\Big )\nonumber \\&\le C\Big (1 + \Vert U_0\Vert _{H^1(\varOmega _h)}^2 + \Vert V_0\Vert _{H^1(\varGamma _h)}^2\Big )\exp (CT). \end{aligned}$$
(69)

Proof

The proof relies on standard energy techniques. By choosing \(\varphi = U\) and \(\psi = V\) in (64), using the Lipschitz continuity of q, r, s, the Young’s inequality and summing over the equations we have

$$\begin{aligned} \begin{aligned} \frac{1}{2}\frac{\mathrm {d}}{\mathrm {d}t} \left( m_h(U, U) + \Vert V\Vert _{L^2(\varGamma _h)}^2\right) + d_u a_h(U,U) + d_v |V|_{H^1(\varGamma _h)}^2\\ \le C\left( 1 + \Vert U\Vert _{L^2(\varOmega _h)}^2 + \Vert V\Vert _{L^2(\varGamma _h)}^2\right) + c \Vert U_{|\varGamma _h}\Vert _{L^2(\varGamma _h)}^2, \end{aligned} \end{aligned}$$
(70)

where \(c>0\) is arbitrarily small, thanks to the Young’s inequality. By applying (24), (A.11) and (48) to the last term in (70) we can choose c such that we have

$$\begin{aligned} \begin{aligned} \frac{1}{2}\frac{\mathrm {d}}{\mathrm {d}t} \left( m_h(U, U) + \Vert V\Vert _{L^2(\varGamma _h)}^2\right) + d_u a_h(U,U) + d_v |V|_{H^1(\varGamma _h)}^2\\ \le C\left( 1 + \Vert U\Vert _{L^2(\varOmega _h)}^2 + \Vert V\Vert _{L^2(\varGamma _h)}^2\right) +\frac{d_u}{2} a_h(U,U). \end{aligned} \end{aligned}$$
(71)

By using (49) into (71) we have

$$\begin{aligned} \begin{aligned} \frac{1}{2}\frac{\mathrm {d}}{\mathrm {d}t} \left( m_h(U, U) + \Vert V\Vert _{L^2(\varGamma _h)}^2\right) \le&C\left( m_h(U,U) + \Vert V\Vert _{L^2(\varOmega _h)}^2\right) \\&+ \,C - \frac{d_u}{2} a_h(U,U) - d_v|V|_{H^1(\varGamma _h)}^2. \end{aligned} \end{aligned}$$
(72)

An application of Grönwall’s lemma to (72) yields

$$\begin{aligned} \begin{aligned} \sup _{t\in [0,T]}\Big (m_h(U,U) +&\Vert V\Vert _{L^2(\varGamma _h)}^2\Big ) \le \Big (1+ m_h(U_0, U_0) + \Vert V_0\Vert _{L^2(\varGamma _h)}^2\Big )\exp (CT) - 1\\ -&\int _0^T \Big (\frac{d_u}{2} a_h(U,U) + d_v |V|_{H^1(\varGamma _h)}^2\Big )\exp \{C(T-t)\}\mathrm {d}t, \end{aligned} \end{aligned}$$
(73)

which yields (68) after an application of (24). Similarly, by choosing \(\varphi = {\dot{U}}\) and \(\psi = {\dot{V}}\) in (64) and summing over the equations we have

$$\begin{aligned} \begin{aligned}&m_h({\dot{U}}, {\dot{U}}) + \Vert {\dot{V}}\Vert _{L^2(\varGamma _h)}^2 + \frac{1}{2}\frac{\mathrm {d}}{\mathrm {d}t}\left( d_u a_h(U,U) + d_v |V|_{H^1(\varGamma _h)}^2\right) \\&\quad \le C\left( 1 + \Vert U\Vert _{L^2(\varOmega _h)}^2 + \Vert V\Vert _{L^2(\varGamma _h)}^2\right) + C a_h(U,U) + \frac{1}{2}\Big (m_h({\dot{U}}, {\dot{U}}) + \Vert {\dot{V}}\Vert _{L^2(\varGamma _h)}^2\Big ), \end{aligned} \end{aligned}$$
(74)

where we have exploited the Lipschitz continuity of q, r, s and the Young’s inequality and (49). From (74) we immediately get

$$\begin{aligned} \begin{aligned}&\frac{1}{2}\Big (m_h({\dot{U}}, {\dot{U}}) + \Vert {\dot{V}}\Vert _{L^2(\varGamma _h)}^2\Big ) + \frac{1}{2}\frac{\mathrm {d}}{\mathrm {d}t}\Big (d_u a_h(U,U) + d_v |V|_{H^1(\varGamma _h)}^2\Big )\\&\quad \le C\Big (a_h(U,U) + |V|_{H^1(\varGamma _h)}^2\Big ) + C\left( 1 + \Vert U\Vert _{L^2(\varOmega _h)}^2 + \Vert V\Vert _{L^2(\varGamma _h)}^2\right) . \end{aligned} \end{aligned}$$
(75)

By applying Grönwall’s lemma and then using (48)–(49) we obtain (69). \(\square \)

To derive error estimates for the spatially discrete solution we need suitable bulk and surface Ritz projections. The surface Ritz projection is taken from [28], while the bulk Ritz projection is tailor-made.

Definition 5

(Surface Ritz projection) The surface Ritz projection of a function \(v\in H^1(\varGamma )\) is the unique function \({\mathscr {R}}v \in {\mathbb {V}}_\varGamma \) such that

$$\begin{aligned} \int _{\varGamma _h}\nabla _{\varGamma _h}{\mathscr {R}}v\cdot \nabla _{\varGamma _h}\psi = \int _{\varGamma }\nabla _{\varGamma }v\cdot \nabla _{\varGamma }\psi \quad \forall \psi \in {\mathbb {V}}_\varGamma ; \quad \text {and} \quad \int _{\varGamma _h}{\mathscr {R}}v = \int _{\varGamma }v. \end{aligned}$$
(76)

Theorem 3

(Error bounds for the surface Ritz projection) The surface Ritz projection fulfils the optimal a priori error bound

$$\begin{aligned} \Vert v - ({\mathscr {R}}v)^\ell \Vert _{L^2(\varGamma )} + h\Vert v - ({\mathscr {R}}v)^\ell \Vert _{H^1(\varGamma )} \le Ch^2\Vert v\Vert _{H^2(\varGamma )}, \end{aligned}$$
(77)

where \(C>0\) depends only on \(\varGamma \) and \(\ell \) is the lifting.

Proof

See [28]. \(\square \)

We now define a suitable Ritz projection in the bulk.

Definition 6

(Bulk Ritz projection) The bulk Ritz projection of a function \(u\in H^1(\varOmega )\) is the unique function \({\mathscr {R}}u \in {\mathbb {V}}_\varOmega \) such that

$$\begin{aligned} a_h({\mathscr {R}}u, \psi ) = \int _{\varOmega }\nabla u\cdot \nabla \psi ^\ell \quad \forall \ \psi \in {\mathbb {V}}_\varOmega ; \quad \text {and} \quad {\mathscr {R}}u_{|\varGamma _h} = I_\varGamma (u^{-\ell }), \end{aligned}$$
(78)

where \(-\ell \) is the inverse lifting.

In the following theorems we show that the bulk Ritz projection of a sufficiently regular function fulfils optimal a priori error bounds in \(H^1(\varOmega )\), \(H^1(\varGamma )\), \(L^2(\varGamma )\) and \(L^2(\varOmega )\) norms.

Theorem 4

(\(H^1(\varOmega )\) a priori error bound for the bulk Ritz projection) For any \(u\in H^{2+1/4}(\varOmega )\) it holds that

$$\begin{aligned} |u - ({\mathscr {R}}u)^\ell |_{H^1(\varOmega )} \le Ch\Vert u\Vert _{H^2(\varOmega )} + Ch\Vert u\Vert _{H^{2+1/4}(\varOmega )}, \end{aligned}$$
(79)

where \(\ell \) is the lifting and C depends on \(\varOmega \) and the constants \(\gamma _1\) and \(\gamma _2\) in (V1)–(V2).

Proof

We set \(e_h := {\mathscr {R}}u - {\tilde{u}}\). From (27), (33), (40), (47), (48) and (78) we have

$$\begin{aligned}&\alpha _*|e_h|_{H^1(\varOmega _h)}^2 \le a_h(e_h, e_h) = a_h({\mathscr {R}}u, e_h) - a_h({\tilde{u}}, e_h)\\&\qquad = \int _{\varOmega }\nabla u\cdot \nabla e_h^{\ell } - \sum _{E\in {\mathscr {E}}_h} (a_E({\tilde{u}}, e_h)) = \int _{\varOmega }\nabla u\cdot \nabla e_h^{\ell } - \sum _{E\in {\mathscr {E}}_h} (a_E({\tilde{u}} - {\tilde{u}}_\pi , e_h) + a_E({\tilde{u}}_\pi ,e_h))\\&\qquad = \int _{\varOmega }\nabla u\cdot \nabla e_h^{\ell } - \int _{\varOmega _h}\nabla {\tilde{u}}\cdot \nabla e_h - \sum _{E\in {\mathscr {E}}_h} a_E({\tilde{u}} - {\tilde{u}}_\pi , e_h)\\ =&\int _{\varOmega }\nabla u \cdot \nabla e_h^{\ell } - \int _{\varOmega _h}\nabla u^{-\ell }\cdot \nabla e_h + \int _{\varOmega _h}\nabla u^{-\ell }\cdot \nabla e_h - \int _{\varOmega _h}\nabla {\tilde{u}}\cdot \nabla e_h - \sum _{E\in {\mathscr {E}}_h} a_E({\tilde{u}} - {\tilde{u}}_\pi , e_h)\\ \le&C\left( h\Vert u\Vert _{H^2(\varOmega )} + h^{3/2}\Vert u\Vert _{H^2(\varOmega )} + Ch\Vert u\Vert _{H^{2+1/4}(\varOmega )}\right) |e_h|_{H^1(\varOmega _h)}, \end{aligned}$$

which yields, for \(h \le h_0\)

$$\begin{aligned} |e_h|_{H^1(\varOmega _h)} \le \left( Ch\Vert u\Vert _{H^2(\varOmega )} + Ch\Vert u\Vert _{H^{2+1/4}(\varOmega )}\right) . \end{aligned}$$
(80)

By using (24), (33) and (80) we get

$$\begin{aligned} \begin{aligned} |u - ({\mathscr {R}}u)^\ell |_{H^1(\varOmega )} \le C|u^{-\ell } - {\mathscr {R}}u|_{H^1(\varOmega _h)} \le&C\big (|u^{-\ell } - {\tilde{u}}|_{H^1(\varOmega _h)} + |e_h|_{H^1(\varOmega _h)}\big )\\ \le&Ch\Vert u\Vert _{H^2(\varOmega )} + Ch\Vert u\Vert _{H^{2+1/4}(\varOmega )}. \end{aligned} \end{aligned}$$

\(\square \)

In order to prove the \(L^2\) convergence, care must be taken about inverse trace operators. A consequence of Theorem A3 in the Appendix A is that, given \(v\in H^1(\varGamma )\), there exists \(v_B\in H^1(\varOmega )\) such that \({{\,\mathrm{Tr}\,}}(v_B) = v\) and \(\Vert v_B\Vert _{H^1(\varOmega )} \le C\Vert v\Vert _{H^1(\varGamma )}\). However, for our purposes, we need the existence of a constant \(C>0\) such that the bounds \(\Vert v_B\Vert _{L^2(\varOmega )} \le C\Vert v\Vert _{L^2(\varGamma )}\) and \(|v_B|_{H^1(\varOmega )} \le C\Vert v\Vert _{H^1(\varGamma )}\) are simultaneously fulfilled, namely we need a \(L^2\)-preserving inverse trace operator. In the following result, we prove the existence of such an operator under special assumptions on the regularity of \(\varGamma \).

Lemma 7

(\(L^2\)-preserving inverse trace) Assume the boundary \(\varGamma \) is \({\mathscr {C}}^3\). Then, for any \(v\in H^1(\varGamma )\) such that \(\Vert v\Vert _{L^2(\varGamma )}\) is sufficiently small, there exists \(v_B\in H^1(\varOmega )\) such that

$$\begin{aligned} \qquad \Vert v_B\Vert _{L^2(\varOmega )} \le C\Vert v\Vert _{L^2(\varGamma )}; \quad \text {and} \quad |v_B|_{H^1(\varOmega )} \le C\Vert v\Vert _{H^1(\varGamma )}. \end{aligned}$$
(81)

Proof

With \(\delta \) as defined in Theorem A1, we take \(0 \le \delta _0 \le \delta \) and we define

$$\begin{aligned} v_B({\varvec{x}}) := {\left\{ \begin{array}{ll} v({\varvec{a}}({\varvec{x}}))\left( 1+\frac{d({\varvec{x}})}{\delta _0}\right) , &{} \quad \text {if}\ {\varvec{x}}\in U_{\delta _0};\\ 0, &{} \quad \text {if}\ {\varvec{x}}\in \varOmega {\setminus } U_{\delta _0}. \end{array}\right. } \end{aligned}$$
(82)

By using (A.9) we have

$$\begin{aligned} \begin{aligned} \int _{\varOmega }v_B^2({\varvec{x}})\mathrm {d}x =&\int _{U_h} v_B^2({\varvec{x}})\mathrm {d}x = \int _{-\delta _0}^0\mathrm {d}s \int _{\varGamma _s} v^2({\varvec{a}}({\varvec{x}}))\left( 1+\frac{d({\varvec{x}})}{\delta _0}\right) ^2\mathrm {d}x \\ =&\int _{-\delta _0}^0\left( 1+\frac{s}{\delta _0}\right) ^2\mathrm {d}s \int _{\varGamma _s} v^2({\varvec{a}}({\varvec{x}}))\mathrm {d}x = \frac{\delta _0}{3}\int _{\varGamma _s} v^2({\varvec{a}}({\varvec{x}}))\mathrm {d}x. \end{aligned} \end{aligned}$$
(83)

Since the decomposition \((d({\varvec{x}}), {\varvec{a}}({\varvec{x}}))\) is unique (see Lemma A1) and all the points \({\varvec{x}}\in \varGamma _s\) share the same distance \(d({\varvec{x}})\), then the mapping \({\varvec{a}}_s := {\varvec{a}}_{|\varGamma _s} : \varGamma _s \rightarrow \varGamma \) is invertible. Moreover, since \({\varvec{a}}_0 = Id_{|\varGamma }\) (which implies \(\Vert \nabla _\varGamma {\varvec{a}}\Vert = 1\)) and \({\varvec{a}}\in {\mathscr {C}}^2(U)\) (see Remark A1), we can choose \(\delta _0\) small enough such that \(0 < c \le \Vert \nabla _{\varGamma _s} {\varvec{a}}_s\Vert \le C\). Hence, \({\varvec{a}}_s^{-1}\) is \({\mathscr {C}}^2\) as well, which implies that \(\Vert \nabla _{\varGamma } {\varvec{a}}_s^{-1}\Vert \le C\). Hence by setting \({\varvec{y}}= {\varvec{a}}({\varvec{x}})\) we have

$$\begin{aligned} \int _{\varGamma _s} v^2({\varvec{a}}({\varvec{x}}))\mathrm {d}x = \int _{\varGamma }v^2({\varvec{y}})\Vert \nabla _{\varGamma } {\varvec{a}}_s^{-1}({\varvec{y}})\Vert \mathrm {d}{\varvec{y}}\le C\Vert v\Vert _{L^2(\varGamma )}^2, \end{aligned}$$
(84)

where \(\Vert \cdot \Vert \) is the Euclidean norm. By combining (83) and (84) we obtain the first inequality in (81), since \(\delta _0\) depends only on \(\varGamma \). An application of the chain rule and Leibniz’s rule yields

$$\begin{aligned} \nabla v_B({\varvec{x}}) = \nabla _{\varGamma }v({\varvec{a}}({\varvec{x}}))J{\varvec{a}}({\varvec{x}})\left( 1+\frac{d({\varvec{x}})}{\delta _0}\right) + v({\varvec{a}}({\varvec{x}}))\frac{\nabla d({\varvec{x}})}{\delta _0}. \end{aligned}$$

Since \(d({\varvec{x}})\) and \({\varvec{a}}({\varvec{x}})\) are both \({\mathscr {C}}^2\) on the compact set \(U_\delta \supset U_{\delta _0}\) and \(-\delta _0 \le d({\varvec{x}}) \le 0\), we obtain

$$\begin{aligned} \Vert \nabla v_B({\varvec{x}})\Vert \le C\Vert \nabla _{\varGamma }v({\varvec{a}}({\varvec{x}}))\Vert + \frac{C}{\delta _0}|v({\varvec{a}}({\varvec{x}}))|, \qquad \forall \ {\varvec{x}}\in U_{\delta _0}, \end{aligned}$$
(85)

where \(|\cdot |\) is the absolute value. Thanks to the continuous pasting in (82), it is easy to show that the bound (85) for the distributional gradient \(\nabla v_B\) still holds on the junction \(\varGamma _{\delta _0}\) between \(U_{\delta _0}\) and \(\varOmega {\setminus } U_{\delta _0}\). From (A.9), (85) and the Young’s inequality we have

$$\begin{aligned} \begin{aligned} \int _{\varOmega }\Vert \nabla v_B({\varvec{x}})\Vert ^2\mathrm {d}{\varvec{x}}\le&C \int _{-\delta _0}^0\mathrm {d}s\int _{\varGamma _s} \left( \Vert \nabla _{\varGamma }v({\varvec{a}}({\varvec{x}}))\Vert ^2 + \frac{|v({\varvec{a}}({\varvec{x}}))|^2}{\delta _0^2}\right) \mathrm {d}{\varvec{x}}\\ =&\int _{-\delta _0}^0\mathrm {d}s\int _{\varGamma } \left( \Vert \nabla _{\varGamma }v({\varvec{y}})\Vert ^2 + \frac{|v({\varvec{y}})|^2}{\delta _0^2}\right) \Vert \nabla _{\varGamma } {\varvec{a}}_s^{-1}({\varvec{y}})\Vert \mathrm {d}{\varvec{y}}\\ \le&C \delta _0 \int _{\varGamma } \left( \Vert \nabla _{\varGamma }v({\varvec{y}})\Vert ^2 + \frac{|v({\varvec{y}})|^2}{\delta _0^2}\right) \mathrm {d}{\varvec{y}}\le C\left( \Vert v\Vert _{L^2(\varGamma )}^2 + |v|_{H^1(\varGamma )}^2\right) , \end{aligned} \end{aligned}$$
(86)

which proves the second inequality in (81). \(\square \)

Theorem 5

[\(H^1(\varGamma )\), \(L^2(\varGamma )\) and \(L^2(\varOmega )\) error bounds for the bulk Ritz projection] Let \(\varOmega \) have a \({\mathscr {C}}^3\) boundary. Then, for any \(u\in H^{2+1/4}(\varOmega )\) such that \({{\,\mathrm{Tr}\,}}(u)\in H^2(\varGamma )\) and for h sufficiently small, it holds that

$$\begin{aligned}&\Vert u - ({\mathscr {R}}u)^{\ell }\Vert _{L^2(\varOmega )} \le Ch^{2}\left( \Vert u\Vert _{H^2(\varOmega )} + \Vert u\Vert _{H^{2+1/4}(\varOmega )}\right) , \end{aligned}$$
(87)
$$\begin{aligned}&\Vert u - ({\mathscr {R}}u)^{\ell }\Vert _{L^2(\varGamma )} + h|u - ({\mathscr {R}}u)^{\ell }|_{H^1(\varGamma )} \le Ch^2\Vert u\Vert _{H^2(\varGamma )}, \end{aligned}$$
(88)

with C depending on \(\varOmega \), \(\gamma _1\) and \(\gamma _2\). In (87), the term in \(H^{2+1/4}(\varOmega )\) norm arises only in the simultaneous presence of curved boundaries and non-triangular boundary elements.

Proof

See “Appendix B”. \(\square \)

In the following theorem we prove optimal convergence in \(L^\infty ([0,T], H^1(\varOmega )\times H^1(\varGamma ))\) norm for the spatially discrete parabolic problem (64), by harnessing the techniques used in [1, 26, 30].

Theorem 6

(Convergence of the BSVEM for the parabolic case) Assume that the kinetics q, r, s are \({\mathscr {C}}^2\) and globally Lipschitz continuous (or at least Lipschitz on the range of the discrete solution). Assume that the exact solution (uv) of the parabolic problem (2) is such that u, \(u_t\in L^\infty ([0,T]; H^{2+1/4}(\varOmega ))\) and v, \(v_t\), \({{\,\mathrm{Tr}\,}}(u)\), \({{\,\mathrm{Tr}\,}}(u_t)\in L^\infty ([0,T];H^2(\varGamma ))\). Let (UV) be the solution of (64). Then it holds that

$$\begin{aligned} \Vert u - U^\ell \Vert _{L^\infty ([0,T]; L^2(\varOmega ))} + \Vert v-V^\ell \Vert _{L^\infty ([0,T]; L^2(\varGamma ))} \le Ch^2, \end{aligned}$$
(89)

where the constant C depends on the diffusion coefficient \(d_u\), the final time T and on the following norms:

  • \(\Vert (u, u_t)\Vert _{L^\infty ([0,T]; H^{2}(\varOmega ))}\), \(\Vert (v,v_t,{{\,\mathrm{Tr}\,}}(u),{{\,\mathrm{Tr}\,}}(u_t))\Vert _{L^\infty ([0,T]; H^2(\varGamma ))}\) in any case;

  • \(\Vert (u, u_t)\Vert _{L^\infty ([0,T]; H^{2+1/4}(\varOmega ))}\) only in the simultaneous presence of curved boundaries and non-triangular boundary elements.

Proof

See “Appendix B”. \(\square \)

5 Time discretisation of the parabolic problem

For the time discretisation of the semi-discrete formulation (67) of the parabolic problem (2) we use the IMEX (IMplicit-EXplicit) Euler method, which approximates diffusion terms implicitly and reaction– and boundary terms explicitly, see for instance [30]. This choice is meant to make the implementation as simple as possible. However, if the boundary conditions and/or the reaction terms are linear, the aforementioned terms can be approximated implicitly without involving any additional nonlinear solver. We choose a timestep \(\tau > 0\) and we consider the equally spaced discrete times \(t_n := n\tau \), for \(n=0,\dots , N_T\), with \(N_T := \left\lceil \frac{T}{\tau }\right\rceil \). For \(n=0,\dots , N_T\) we denote by \(\varvec{\xi }^n\) and \(\varvec{\eta }^n\) the numerical solution at time \(t_n\). The IMEX Euler time discretisation of (67) reads

$$\begin{aligned} {\left\{ \begin{array}{ll} M_\varOmega \dfrac{\varvec{\xi }^{n+1} - \varvec{\xi }^n}{\tau } + {d_u}K_\varOmega \varvec{\xi }^{n+1} = M_\varOmega q(\varvec{\xi }^n) + RM_\varGamma s(R^T\varvec{\xi }^n, \varvec{\eta }^n);\\ M_\varGamma \dfrac{\varvec{\eta }^{n+1} - \varvec{\eta }^n}{\tau } + {d_v}K_\varGamma \varvec{\eta }^{n+1} = M_\varGamma (-s(R^T\varvec{\xi }^n, \varvec{\eta }^n) + r(R^T\varvec{\xi }^n, \varvec{\eta }^n)), \end{array}\right. } \end{aligned}$$
(90)

for \(n=0,\dots , N_T-1\), where R is the reduction matrix introduced in (9). By solving for \(\varvec{\xi }^{n+1}\) and \(\varvec{\eta }^{n+1}\) we obtain the following time stepping scheme:

$$\begin{aligned} {\left\{ \begin{array}{ll} (M_\varOmega + \tau {d_u}K_\varOmega ) \varvec{\xi }^{n+1} = M_{\varOmega }(\varvec{\xi }^n + \tau q(\varvec{\xi }^n)) + \tau RM_\varGamma s(R^T\varvec{\xi }^n, \varvec{\eta }^n);\\ (M_\varGamma + \tau {d_v}K_\varGamma )\varvec{\eta }^{n+1} = M_\varGamma (\varvec{\eta }^n + \tau r(R^T\varvec{\xi }^n, \varvec{\eta }^n) - \tau s(R^T\varvec{\xi }^n, \varvec{\eta }^n)), \end{array}\right. } \end{aligned}$$
(91)

for \(n=0,\dots , N_T-1\). Hence, two linear systems must be solved at each time step, where the matrix coefficients are the same for all n.

6 Construction of meshes optimised for matrix assembly

One of the benefits of using suitable polygonal elements is the reduction in computational complexity of the matrix assembly. This is useful especially for (i) time-independent problems and (ii) time-dependent problems on evolving domains, where matrix assembly might take the vast majority of the overall computational time.

As an example, given an arbitrarily shaped domain \(\varOmega \) with \({\mathscr {C}}^1\) boundary \(\varGamma \), we construct a polygonal mesh designed for fast matrix assembly, by proceeding as follows. Suppose that the bulk \(\varOmega \) is contained in a square Q. We discretise Q with a Cartesian grid made up of square mesh elements and assume that at least one of such squares is fully contained in the interior of \(\varOmega \) (Fig. 5a). Then we discard the elements that are outside Q (Fig. 5b). Finally, we project on \(\varGamma \) the nodes that are outside \(\varOmega \), thereby producing a discrete narrow band of irregular quadrilaterals (highlighted in purple in Fig. 5c). The mesh is then post-processed in two steps as illustrated in Fig. 6 to comply with Assumptions (V1)–(V2). Let \(\varepsilon > 0\) be a threshold. First, all \(h\varepsilon \)-close nodes are merged, see Fig. 6a and b. After that, the only chance for a polygon with at most four vertices fulfilling (V1) to not fulfil (V2) is the presence of small angles. In this case, all \(\varepsilon \)-small angles are reduced to the null angle, see Fig. 6c and d. It is worth remarking that Virtual Elements were proven to be robust with respect to distorted elements (see [7]), so \(\varepsilon \) can be chosen small.

The resulting mesh \(\varOmega _h\) is made up of equal square elements, except for the elements that are close to \(\varGamma \), as we can see in Fig. 5c, which results in faster matrix assembly. Let h be the the meshsize of \(\varOmega _h\). By construction, \(h = h_Q\), where \(h_Q\) is the meshsize of the Cartesian grid. Of course, \(\varOmega _h\) is made up of \({\mathscr {O}}(1/h_Q^2) = {\mathscr {O}}(1/h^2)\) elements. However, by definition of Hausdorff dimension, the number of squares of Q that intersect \(\varGamma \) is \({\mathscr {O}}(1/h_Q) = {\mathscr {O}}(1/h)\), hence the number on non-square elements of \(\varOmega _h\) is only \({\mathscr {O}}(1/h)\). This implies that, when assembling the mass- and stiffness- matrices \(M_\varOmega \) and \(A_\varOmega \), respectively, only \({\mathscr {O}}(1/h)\) element-wise local matrices must be actually computed, since the local matrices for a square element are pre-computed.

It is worth remarking that the advantage described in this section becomes even more striking in higher space dimension: if the embedding space has dimension \(D\in {\mathbb {N}}\), \(D \ge 2\), then the procedure described in this section reduces the computational complexity of matrix assembly from \({\mathscr {O}}(1/h^D)\) to \({\mathscr {O}}(1/h^{D-1})\). Matrix assembly optimization can be also achieved through alternative approaches, such as cut FEM [17] or trace FEM [32]. However, in these works, the authors adopt a level set representation of the boundary \(\varGamma \), which we do not need in this study, as we exploit the usage of arbitrary polygons to approximate \(\varGamma \).

Fig. 5
figure 5

Generation of a polygonal BS mesh that allows for optimised matrix assembly

Fig. 6
figure 6

Enforcement of conditions (V1)–(V2) on the polygonal mesh for optimised matrix assembly. The neighbouring elements -not shown in the picture- are affected in the process, but properties (V1)–(V2) hold true regardless

7 Numerical simulations

In this section we provide three numerical examples to compare the performances of BSFEM and BSVEM. In the first example, we show (i) optimal convergence in the case of the elliptic problem (1) and (ii) the optimised matrix assembly introduced in Sect. 6. In the second example, we show optimal convergence in space and time in the case of the parabolic problem (2). In the third example, we compare the BSFEM and BSVEM solutions of the parabolic BS wave pinning model studied in [22].

7.1 Experiment 1: the elliptic problem

We start by constructing an exact solution for the elliptic problem (1) on the unit circle \(\varOmega := \{(x,y)\in {\mathbb {R}}^2 | x^2 + y^2 = 1\}\) by using the fact that \(-\varDelta _\varGamma xy = {4}xy\), i.e. the function \(w(x,y) := xy\) is an eigenfunction of the Laplace–Beltrami operator on the unit circumference \(\varGamma = \partial \varOmega \). Specifically, in (1) we choose the following load terms \(f(x,y) := xy\), for \((x,y) \in \varOmega \) and \(g(x,y) := \left( 2+\frac{5(\alpha +2)}{\beta }\right) xy\) for \((x,y) \in \varGamma \). Here \(\alpha \) and \(\beta \) are the parameters that appear in the model, such that the exact solution is given by \(u(x,y) := xy\) for \((x,y) \in \varOmega \) and \(v(x,y) := \frac{\alpha +2}{\beta }xy\) for \((x,y) \in \varGamma \). For illustrative purposes, we choose \(\alpha =1\) and \(\beta =2\). We solve the problem with BSFEM and BSVEM on two respective sequences of five meshes with decreasing meshsizes. For BSVEM, we use optimised meshes as in Sect. 6, while for BSFEM we use quasi-uniform Delaunay triangular meshes. See Fig. 7 for an illustration. For each mesh, we measure:

  • the \(L^2(\varOmega )\times L^2(\varGamma )\) and \(L^\infty (\varOmega )\times L^\infty (\varGamma )\) errors of the numerical solution and the respective experimental orders of convergence (EOCs), that are quadratic in the meshsize. In this case BSVEM is almost three times as accurate as BSFEM on similar meshsizes, probably because the mesh reflects the symmetry of the problem;

  • the condition number \(\text {cond}_{ell}\) of the linear system (63). The BSVEM is approximately four times as ill conditioned as BSFEM on similar meshsizes. An approach to reduce this gap is described in [13];

  • only for BSVEM, the number \(N_\varOmega \) of all elements and the number \(N_\varGamma \) of elements that intersect the boundary. The first is proportional to \(\frac{1}{h^2}\), while the latter is proportional to \(\frac{1}{h}\), as predicted in Sect. 6. This illustrates that only \(O(\frac{1}{h})\) local matrices must be computed during matrix assembly, even if the mesh has \(O(\frac{1}{h^2})\) elements.

In Fig. 8 we show the BSVEM numerical solution on the finest mesh. In Tables 1 and 2 we show the errors, the respective EOCs, and the numbers \(N_\varOmega \), \(N_\varGamma \) and \(\text {cond}_{ell}\).

Fig. 7
figure 7

Illustrative representation of the meshes used for the numerical experiments

Fig. 8
figure 8

Experiment 1 for the elliptic problem (1). Numerical solution on the finest of the meshes listed in Table 1, with meshsize \(h = 4.7611e{-}2\). Left plot shows the solution u in the bulk and the right plot shows the solution v on the surface (curve) \(\varGamma \)

Table 1 Experiment 1 for the elliptic problem (1)
Table 2 Experiment 1 for the elliptic problem (1), solved with BSFEM on a sequence of triangular meshes with meshsizes as similar as possible as those in Table 1

7.2 Experiment 2: linear parabolic problem

We now want to test convergence for the parabolic problem (2) in the linear case \(q(u) = -u\), \(r(u,v) = 2u\), \(s(u,v) = \frac{4}{3}v\), \(d_u = 1\) and \(d_v = \frac{1}{4}\), on the unit circle \(\varOmega := \{(x,y) \in {\mathbb {R}}^2 | x^2 + y^2 \le 1\}\). We consider the initial conditions \(u(x,y,0) = xy\) and \(v(x,y,0) = \frac{3}{2}xy\), such that the exact solution \(u(x,y,t) = e^{-t}xy\) and \(v(x,y,t) = \frac{3}{2}e^{-t}xy\). We solve the problem via BSFEM- and BSVEM-IMEX Euler on the same sequences of meshes of Experiment 1 with coresponding timesteps \(\tau _i = 10^{-3} \times 4^{-i}\), \(i=0,\dots ,4\) and final time \(T=1\). The errors in \(L^\infty ([0,T], L^2(\varOmega )\times L^2(\varGamma ))\) and \(L^\infty ([0,T], L^\infty (\varOmega )\times L^\infty (\varGamma ))\) norms and the respective EOCs are shown in Tables 3 and 4. For both methods we observe superconvergence, with the BSVEM being more accurate on similar meshsizes.

Table 3 Experiment 2 for the parabolic problem (2)
Table 4 Experiment 2 for the parabolic problem (2)

7.3 Experiment 3: nonlinear parabolic problem

In this section we compare the BSFEM- and BSVEM-IMEX Euler numerical solutions of the bulk-surface wave pinning (BSWP) model considered in [22, Fig. 4]. The non-dimensionalised BSWP model seeks to find a bulk concentration \(b: \varOmega \times [0,T] \rightarrow {\mathbb {R}}\) and a surface concentration \(a:\varGamma \times [0,T] \rightarrow {\mathbb {R}}\) such that

$$\begin{aligned} {\left\{ \begin{array}{ll} \displaystyle \varepsilon \frac{\partial b}{\partial t} - \varDelta b = 0, &{} \quad {\varvec{x}}\in \varOmega ,\\ \displaystyle \varepsilon \frac{\partial a}{\partial t} - \varepsilon ^2 \varDelta _\varGamma a = f(a,b), &{} \quad {\varvec{x}}\in \varGamma ,\\ -(\varvec{\nu }\cdot \nabla b) = f(a,b), &{} \quad {\varvec{x}}\in \varGamma , \end{array}\right. } \end{aligned}$$
(92)

where the kinetic f is of the form \(f(a,b) := \left( k_0 + \gamma {(a^2)}/{(1+a^2)} \right) b - a\). We solve the BSWP model (92) on the unit circle \(\varOmega := \{(x,y) \in {\mathbb {R}}^2 | x^2 + y^2 \le 1\}\). We choose the parameters of the model as \(\varepsilon ^2 = 0.001\), \(k_0 = 0.05\) and \(\gamma = 0.79\). The initial condition, plotted in Fig. 9a, is \(b(x,y,0) = 2.487\), for \((x,y)\in \varOmega \) and \(a(x,y,0) = 0.309 + 0.35(1 + sign(x))\exp (-20 y^2)\) for \((x,y) \in \varGamma \). The final time is \(T=4.5\). For the BSFEM we take a quasi-uniform Delaunay triangular mesh, while for the BSVEM we take a polygonal mesh designed for optimised matrix assembly, as explained in Sect. 6. An illustrative coarser representation of the meshes typically used is shown in Fig. 7. The details of these meshes are reported in Table 5. As we can see, on almost equal meshsizes, the BSVEM generates (i) a significantly large number of boundary nodes, which translates into better boundary approximation and (ii) less number of elements in the bulk, which implies faster matrix assembly. For both spatial discretisations, the time discretisation is computed with timestep \(\tau = 2e{-}3\). The condition numbers \(\text {cond}_\varOmega := \text {cond}(M_\varOmega + \tau {d_u}K_\varOmega )\) and \(\text {cond}_\varGamma := \text {cond}(M_\varGamma + \tau {d_v}K_\varGamma )\) of BSVEM on our polygonal mesh are only about 2.5 times as large as those of BSFEM on a quasi-uniform Delaunay mesh. This result, which could be further improved through an orthogonal polynomial basis approach [13] or through a different time integrator, confirms the well-known robustness of VEM with respect to general polygonal meshes [7]. The BSVEM solution at various times is shown in Fig. 9.

Fig. 9
figure 9

Experiment 3: BSVEM solutions of the BSWP model (92) at different times

Table 5 Experiment 3 for the wave pinning model (92): Details of the meshes used for the BSFEM and BSVEM, respectively

8 Conclusions

In this study, we have considered a bulk-surface virtual element method (BSVEM) for the numerical approximation of linear elliptic and semilinear parabolic coupled BSPDE problems on smooth BS domains. The proposed method simultaneously extends the BSFEM for BSRDSs [37] and the VEM for linear elliptic [5] and semilinear parabolic [1] bulk PDEs. The method has a simplified vector-matrix form that can be exploited also in the special case of the BSFEM considered in [37].

We have introduced polygonal BS meshes in two space dimensions and we have shown that the geometric error arising from domain approximation is \({\mathscr {O}}(h)\) in the bulk and \({\mathscr {O}}(h^2)\) on the surface, where h is the meshsize, exactly as in the special case of the triangular BS meshes used in the BSFEM [27]. Suitable polygonal meshes reduce the asymptotic computational complexity of matrix assembly from \({\mathscr {O}}(1/h^2)\) to \({\mathscr {O}}(1/h)\). In future studies we will show that polygonal meshes can unlock new efficient adaptive refinement strategies.

We have introduced novel theory to address the challenges of the error analysis. First, if the exact solution (uv) is \(H^{2+1/4}(\varOmega )\times H^2(\varGamma )\), the lifting operator can be replaced with the Sobolev extension operator. Second, if \(\varGamma \) is sufficiently smooth, there exists an \(L^2\)-preserving inverse trace operator that preserves the \(L^2\) norm and the \(H^1\) seminorm up to the same scale factor. Third, we have introduced a tailor-made Ritz projection in the bulk for the VEM that accounts for boundary conditions and that fulfils optimal error bounds. By using our bulk-Ritz projection we have drawn two consequences. First, the lowest order bulk-VEM [5] retains optimal convergence even in the simultaneous presence of curved boundaries and non-zero boundary conditions, a result that lacked a rigorous proof in the literature. Second, the proposed BSVEM possesses optimal spatial convergence, that is \({\mathscr {O}}(h^2)\) in \(L^\infty ([0,T], L^2(\varOmega )\times L^2(\varGamma ))\) norm, where T is the final time.

Numerical examples validate our findings in terms of (i) convergence rate in space and time for both the elliptic and the parabolic case and (ii) the computational advantages of polygonal meshes. Given the encouraging results, the extension of BSVEM to higher space dimension and different BSPDE problems is under development. Moreover, the generalisation of problem (2) to evolving BS domains, which would comprise additional models addressed in the literature such as [35], is left for future work.