1 Introduction

The Monge–Ampère equation is a fully nonlinear partial differential equation that appears in geometric analysis and related applications. Various aspects of this important equation can be found in the monographs [6, 20, 38, 42, 44, 49, 72].

The Dirichlet boundary value problem for the Monge–Ampère equation is given by

$$\begin{aligned} \det D^2 u&=\psi&\qquad&\hbox { in}\ \Omega , \end{aligned}$$
(1.1a)
$$\begin{aligned} u&=\phi&\qquad&\hbox { on}\ \partial \Omega . \end{aligned}$$
(1.1b)

If \(\Omega \) is a smooth and strictly convex domain, \(\psi \in C^3({\bar{\Omega }})\) is strictly positive on \({\bar{\Omega }}\) and \(\phi \in C^{4,\delta }({\bar{\Omega }})\) for some \(\delta \in (0,1)\), then (1.1) has a unique strictly convex solution \(u\in C^{4,\delta }({\bar{\Omega }})\) (cf. [21, p. 371, Remark 2]). Our goal is to develop finite element methods that can capture such smooth convex solutions of (1.1).

Remark 1.1

Throughout this paper we will follow the standard notation for differential operators, function spaces and norms that can be found for example in [1, 17, 23].

As a first step, we consider a finite element method for (1.1) on polygonal domains. Accordingly we assume that \(\Omega \subset \mathbb {R}^2\) is a bounded convex polygon,

$$\begin{aligned} \phi&\in {H^4(\Omega )}, \end{aligned}$$
(1.2)
$$\begin{aligned} \psi&\in {H^2(\Omega )}\;\hbox { is strictly positive on}\ {\bar{\Omega }}, \end{aligned}$$
(1.3)

and

$$\begin{aligned} \text {the boundary value problem } (1.1) \text { has a strictly convex solution } u\in {H^4(\Omega )}, \end{aligned}$$
(1.4)

i.e., there exists a positive constant \(\alpha _\sharp \) such that the Hessian \(D^2 u\) satisfies

$$\begin{aligned} \xi ^t (D^2u)(x)\xi \ge \alpha _\sharp |\xi |^2 \qquad \forall \,x\in \Omega , \;\xi \in \mathbb {R}^2. \end{aligned}$$
(1.5)

Remark 1.2

The extension of our method to strictly convex smooth domains, where the regularity (1.4) follows from appropriate regularity of the data, will be carried out in a forthcoming paper.

Remark 1.3

Since a \(2\times 2\) symmetric matrix and its cofactor matrix have identical eigenvalues, the estimate (1.5) is equivalent to

$$\begin{aligned} \xi ^t \mathrm {Cof}(D^2u)(x)\xi \ge \alpha _\sharp |\xi |^2 \qquad \forall \,x\in \Omega , \;\xi \in \mathbb {R}^2. \end{aligned}$$
(1.6)

Remark 1.4

Note that under assumption (1.3) a sufficiently smooth solution of (1.1) is strictly convex if and only if \(\Delta u\ge 0\) on \(\Omega \). This is the key motivation for the finite element method in this paper.

There are many numerical approaches to the Dirichlet boundary value problem of the Monge–Ampère equation (and related equations) in 2 and 3 spatial dimensions, with respect to different solution classes (classical solutions, Aleksandrov solutions [2] and viscosity solutions [54]). They include (i) geometric finite difference methods [63, 66, 68, 69], (ii) monotone finite difference methods [7,8,9, 39,40,41, 48, 50, 67], (iii) augmented Lagrangian and least-squares finite element methods [19, 28,29,30,31], (iv) finite element methods based on the vanishing moment approach [3, 35,36,37, 57], (v) finite element methods based on \(L_2\) projection [4, 5, 10, 11, 13, 15, 27, 51, 58,59,60], (vi) finite element methods based on a reformulation of the Monge–Ampère equation as a Hamilton–Jacobi–Bellman equation [14, 34], and (v) two-scale methods [53, 64, 65]. Comprehensive reviews of the literature can be found in [33, 61].

The method in this paper is also based on a nonlinear least-squares approach. It is different from the least-squares method of Dean and Glowinski [19, 29, 31] in that our least-squares problem is posed only on the finite element spaces and the discrete problems are solved purely as optimization problems.

The key ingredient in our method is an enhanced cubic Lagrange element with exotic degrees of freedom (dofs) that enables us to enforce the convexity of the finite element solutions, which then allows us to develop a simple error analysis based on existing results for second order elliptic problems in non-divergence form.

The rest of the paper is organized as follows. We introduce the enhanced cubic Lagrange element in Sect. 2 together with the discrete nonlinear least-squares problem. We then present a priori and a posteriori error analyses in Sect. 3 and numerical results in Sect. 4. We end with some concluding remarks in Sect. 5. We also put some of the details in three appendices so that the main flow of the presentation is not distracted. Appendix A contains the derivation of a stability result for elliptic problems in non-divergence form needed for the error analysis in Sect. 3. Details of the optimization algorithm that we use to solve the discrete problems are given in Appendix B. An algorithm that we use to check the elementwise convexity of the approximate solutions is outlined in Appendix C.

Throughout the paper we will use C to denote a generic positive constant independent of the mesh size.

2 The discrete problem

The discrete problem is a nonlinear least-squares problem with box constraints. It is based on an exotic finite element space whose degrees of freedom (dofs) can enforce the convexity of the solutions.

2.1 An enhanced cubic Lagrange finite element

We begin by introducing a new finite element where some of the degrees of freedom (dofs) are associated with nodes outside the element domain. Consequently the construction of the local basis requires information from outside an element. Below we will treat a polynomial on an element as the restriction of a polynomial on \(\mathbb {R}^2\) and use the same notation to denote both. In other words, we will identify \(P_k(\mathbb {R}^2)\) with the space \(P_k(T)\) of polynomials of (total) degree \(\le k\) on a triangle T.

The construction of the finite element is based on the following lemma, where \({\hat{T}}\) is the reference simplex with vertices (0, 0), (1, 0) and (0, 1), and \(\varphi _{{\hat{T}}}=x_1x_2(1-x_1-x_2)\) is the cubic bubble function that vanishes on the boundary of \({\hat{T}}\).

Lemma 2.1

A function \(v\in P_3({\hat{T}})\oplus \varphi _{{\hat{T}}}^2 P_1({\hat{T}})\) is uniquely determined by the 10 dofs of the standard cubic Lagrange finite element together with the values of \(\Delta v\) at the three points (1, 1), \((-1,1)\) and \((1,-1)\) (cf. Fig. 1).

Proof

Suppose v vanishes at the 9 vertex and edge nodes, then v belongs to \(\langle { \varphi _{{\hat{T}}}}\rangle \oplus \varphi _{{\hat{T}}}^2 P_1({\hat{T}})\). A direct calculation shows that 0 is the only polynomial in \(\langle { \varphi _{{\hat{T}}}}\rangle \oplus \varphi _{{\hat{T}}}^2 P_1({\hat{T}})\) that vanishes at the center of \({\hat{T}}\) and whose Laplacian also vanishes at the three points (1, 1), \((-1,1)\) and \((1,-1)\). \(\square \)

Fig. 1
figure 1

dofs of the enhanced \(P_3\) Lagrange element

Remark 2.2

The space \(P_3({\hat{T}})\oplus \varphi _{{\hat{T}}}^2 P_1({\hat{T}})\) and the 13 dofs in Lemma 2.1 do not define a finite element on \({\hat{T}}\) in the classical sense of Ciarlet in [23, page 78] because the shape functions are treated as functions defined globally on \(\mathbb {R}^2\), and not as functions defined just on the element domain.

Remark 2.3

The vertices of the reference simplex are the midpoints of the edges of the triangle with vertices (1, 1), \((-1,1)\) and \((1,-1)\). For a general triangle T, the triangle whose midpoints are the vertices of T will be denoted by \({T}_\dag \).

On an arbitrary triangle T, the space of shape functions of the enhanced cubic Lagrange element is \(P_3(T)\oplus \varphi _{\scriptscriptstyle T}^2P_1(T)\), where \(\varphi _{\scriptscriptstyle T}\) is the cubic bubble function that vanishes on the boundary of T. The dofs of \(v\in P_3(T)\oplus \varphi _{\scriptscriptstyle T}^2P_1(T)\) are (i) the values of v at the three vertices, (ii) the values of v at the two points that trisect each edge, (iii) the value of v at the center of T, and (iv) the values of \(\mathrm {tr}(J_{\scriptscriptstyle T}^t D^2 vJ_{\scriptscriptstyle T})\) at the three vertices of \({T}_\dag \), where \(J_{\scriptscriptstyle T}\in \mathbb {R}^{2\times 2}\) is the Jacobian matrix of an affine map that maps the reference simplex to T.

Remark 2.4

The enhanced cubic Lagrange element is affine-equivalent (cf. [17, 23]) by construction. The exotic dofs at the vertices of \({T}_\dag \) are responsible for enforcing the elementwise convexity of the discrete solutions of (1.1).

2.2 The finite element space \({V}_{h}\)

Let \(\mathcal {T}_h\) be a quasi-uniform simplicial triangulation of \(\Omega \). A function v belongs to the finite element space \(V_h \subset H^1(\Omega )\) if and only if (i) v belongs to \(C({\bar{\Omega }})\) and (ii) the restriction of v to \(T\in \mathcal {T}_h\) belongs to \(P_3(T)\oplus \varphi _{\scriptscriptstyle T}^2P_1(T)\).

The (global) dofs of \(v\in V_h\) are (i) the values of v at the vertices of \(\mathcal {T}_h\), (ii) the values of v at the points that trisect the edges of \(\mathcal {T}_h\), (iii) the values of v at the centers of the triangles in \(\mathcal {T}_h\), and (iv) the values of \(\mathrm {tr}(J_{\scriptscriptstyle T}^t D^2 v_{\scriptscriptstyle T}J_{\scriptscriptstyle T})\) at the three vertices of \({T}_\dag \) for each \(T\in \mathcal {T}_h\), where \(v_{\scriptscriptstyle T}\) is the restriction of v to T and \(J_{\scriptscriptstyle T}\) is the Jacobian of an affine map that maps the reference simplex to T.

Remark 2.5

The dofs in (iii) and (iv) define the bubble functions in \(\langle \varphi _{\scriptscriptstyle T}\rangle \oplus \varphi _{\scriptscriptstyle T}^2P_1(T)\).

It follows from the extension theorems for Sobolev spaces (cf. [1, Chapter 5]) that the solution \(u\in H^4(\Omega )\) of (1.1) can be extended to a strictly convex function in \(H^4({\tilde{\Omega }})\) where \({\tilde{\Omega }}\) is an open set that contains \({\bar{\Omega }}\) in its interior. We will denote this extension again by u. We assume that h is sufficiently small so that

$$\begin{aligned} {T}_\dag \subset {\tilde{\Omega }}\qquad \forall \,T\in \mathcal {T}_h. \end{aligned}$$
(2.1)

The nodal interpolant \(\Pi _hu\in V_h\) is then defined by the condition that u and \(\Pi _hu\) share the same global dofs mentioned above.

We will denote the piecewise Hessian operator by \(D_h^2\), the set of the interior edges of \(\mathcal {T}_h\) by \(\mathcal {E}_h^i\), the length of an edge e by |e|, and the jump of the normal derivative of v across an (interior) edge by \([[\partial v/\partial n]]\).

Lemma 2.6

The following estimates are valid for \(\Pi _hu\):

$$\begin{aligned}&\Vert u-\Pi _h u\Vert _{L_2(\Omega )}+h|u-\Pi _h u|_{H^1(\Omega )}+h\Vert u-\Pi _hu\Vert _{L_\infty (\Omega )} \nonumber \\&\quad +h^2\Vert D_h^2(u-\Pi _hu)\Vert _{L_2(\Omega )}+ h^4\Big (\sum _{T\in \mathcal {T}_h}|D_h^2(\Pi _h u)|_{H^2(T)}^2\Big )^\frac{1}{2} \le C h^4|u|_{H^4({\tilde{\Omega }})}, \end{aligned}$$
(2.2)
$$\begin{aligned}&|u-\Pi _hu|_{W^{2,\infty }(T)}\le Ch|u|_{H^4({\tilde{\Omega }})} \qquad \forall \,T\in \mathcal {T}_h, \end{aligned}$$
(2.3)
$$\begin{aligned}&\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial (\Pi _hu)/\partial n]]\Vert _{L_2(e)}^2=\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial (u-\Pi _hu)/\partial n]]\Vert _{L_2(e)}^2\nonumber \\&\quad \le Ch^4|u|_{H^4({\tilde{\Omega }})}^2, \end{aligned}$$
(2.4)
$$\begin{aligned}&\Vert D_h^2(\Pi _h u)\Vert _{L_2(\Omega )}^2+\sum _{T\in \mathcal {T}_h}|D_h^2(\Pi _h u)|_{H^2(T)}^2 +\max _{T\in \mathcal {T}_h}|\Pi _h u|_{W^{2,\infty }(T)}^2\le C\Vert u\Vert _{H^4({\tilde{\Omega }})}^2. \end{aligned}$$
(2.5)

Proof

The estimates (2.2) and (2.3) follow from the invariance of cubic polynomials under the local nodal interpolation operator, the Bramble-Hilbert lemma [12] and scaling. The estimate (2.2) then implies the estimate (2.4) through the trace theorem with scaling, and the estimate (2.5) follows from (2.2) and (2.3) through the triangle inequality and the Sobolev Embedding Theorem [1, Theorem 4.12]. \(\square \)

In view of the estimate for \(\Vert D_h^2(u-\Pi _hu)\Vert _{L_2(\Omega )}\) in (2.2) and the bound for \(\max _{T\in \mathcal {T}_h}|\Pi _h u|_{W^{2,\infty }(T)}\) in (2.5), we immediately arrive at

$$\begin{aligned} \Vert \det D_h^2(\Pi _hu)-\det D^2u\Vert _{L_2(\Omega )}\le Ch^2, \end{aligned}$$
(2.6)

where the positive constant C is independent of h.

Remark 2.7

All the estimates for \(\Pi _h u\) are also valid for \(\Pi _h\phi \). In particular we have

$$\begin{aligned} \Vert D_h^2(\phi -\Pi _h\phi )\Vert _{L_2(\Omega )}+{ \Big (}\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial (\Pi _h\phi )/\partial n]]\Vert _{L_2(e)}^2{ \Big )^\frac{1}{2}}\le Ch^2. \end{aligned}$$

2.3 A nonlinear least-squares problem with box constraints

Let \(\phi _h\) be the one dimensional cubic Lagrange interpolant of \(\phi \) along \(\partial \Omega \) and the (convex) subset \(L_h\) of \(V_h\) be defined by

$$\begin{aligned} L_h&=\{v\in V_h:\, v=\phi _h\;\text {on}\;\partial \Omega \;\text {and}\; \mathrm {tr}(J_{\scriptscriptstyle T}^tD^2v_{\scriptscriptstyle T}J_{\scriptscriptstyle T}) \ge 0\; \hbox { at the vertices of}\ {T}_\dag \nonumber \\&\quad \hbox { for every}\ T\in \mathcal {T}_h\}. \end{aligned}$$
(2.7)

Remark 2.8

The inequality constraints in the definition of \(L_h\) are motivated by the observation in Remark 1.4 and they are box constraints for the dofs of \(V_h\) introduced at the beginning of Sect. 2.2.

Remark 2.9

Note that \(\phi _h=\Pi _h\phi =\Pi _h u\) on \(\partial \Omega \) and hence \(v=\Pi _h u\) on \(\partial \Omega \) for all \(v\in L_h\).

The discrete problem is to find

$$\begin{aligned} u_h=\mathop {\mathrm{argmin}}_{v\in L_h}\mathcal {J}_h(v), \end{aligned}$$
(2.8)

where the cost function \(\mathcal {J}_h\) is defined by

$$\begin{aligned} \mathcal {J}_h(v)&=\frac{h^4}{2}\Vert D_h^2 v\Vert _{L_2(\Omega )}^2+\frac{1}{2}\sum _{T\in \mathcal {T}_h}|\mathrm {tr}(J_{\scriptscriptstyle T}^tD^2v_{\scriptscriptstyle T}J_{\scriptscriptstyle T})|_{H^2(T)}^2\nonumber \\&\quad +\frac{1}{2}\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial v/\partial n]]\Vert _{L_2(e)}^2\nonumber \\&\quad +\frac{1}{2}\Vert \det D_h^2 v-\psi \Vert _{L_2(\Omega )}^2, \end{aligned}$$
(2.9)

and \(v_{\scriptscriptstyle T}\) is the restriction of v to T. Note that the Frobenius norm of \(J_{\scriptscriptstyle T}\) satisfies

$$\begin{aligned} |J_{\scriptscriptstyle T}|\approx h. \end{aligned}$$
(2.10)

We will use \(\Vert \cdot \Vert _h\) to denote the mesh-dependent norm defined by

$$\begin{aligned} \Vert v\Vert _h^2=\Vert D_h^2v\Vert _{L_2(\Omega )}^2+\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial v/\partial n]]\Vert _{L_2(e)}^2. \end{aligned}$$
(2.11)

Remark 2.10

The first two terms in the definition of \(\mathcal {J}_h\) are regularization terms that are crucial for the well-posedness of the discrete problem and for enforcing the elementwise convexity of the discrete solutions. The third term is a penalty term that compensates for the fact that \(V_h\not \subset H^2(\Omega )\). The last term is the least-squares term for (1.1a).

The solvability of (2.8) is justified by the following result.

Lemma 2.11

The cost function \(\mathcal {J}_h:L_h\longrightarrow [0,\infty )\) has a global minimizer.

Proof

According to the Poincaré-Friedrichs inequality for piecewise \(H^2\) functions in [18], we have

$$\begin{aligned} \Vert v\Vert _{L_2(\Omega )}\le C\Vert v\Vert _h \qquad \forall \,v\in V_h\cap H^1_0(\Omega ), \end{aligned}$$

which implies (cf. Remark 2.9)

$$\begin{aligned} \Vert v-\Pi _hu\Vert _{L_2(\Omega )}\le C\Vert v-\Pi _hu\Vert _h \qquad \forall \,v\in L_h. \end{aligned}$$
(2.12)

It follows from (2.2), (2.4), (2.5), (2.11) and (2.12) that \(\Vert v\Vert _h\) (and hence \(\mathcal {J}_h(v)\)) approaches \(\infty \) if v belongs to \(L_h\) and \(\Vert v\Vert _{L_2(\Omega )}\) approaches \(\infty \). \(\square \)

3 Error analysis

We will show that any \(u_h\) satisfying (2.8) will converge to the solution u of (1.1) as \(h\downarrow 0\) and the order of convergence is 2. Since our optimization algorithm does not guarantee that a global minimizer of \(\mathcal {J}_h\) can be found, it is also useful to have an a posteriori error estimate that can demonstrate the convergence of our method numerically.

3.1 Some a priori bounds for \({u_h}\)

Since \(\Pi _h u\) belongs to \(L_h\), it follows from (1.1a), (2.4)–(2.6) and (2.10) that

$$\begin{aligned}&h^4\Vert D_h^2u_h\Vert _{L_2(\Omega )}^2+\sum _{T\in \mathcal {T}_h}|\mathrm {tr}(J_{\scriptscriptstyle T}^tD^2(u_h)_{\scriptscriptstyle T}J_{\scriptscriptstyle T})|_{H^2(T)}^2\nonumber \\&\qquad +\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial u_h/\partial n]]\Vert _{L_2(e)}^2 +\Vert \det D_h^2 u_h-\psi \Vert _{L_2(\Omega )}^2\nonumber \\&\quad = 2J_h(u_h)\nonumber \\&\quad \le 2J_h(\Pi _h u)\nonumber \\&\quad =h^4\Vert D_h^2(\Pi _hu)\Vert _{L_2(\Omega )}^2 +\sum _{T\in \mathcal {T}_h}|\mathrm {tr}(J_{\scriptscriptstyle T}^tD^2(\Pi _hu)_{\scriptscriptstyle T}J_{\scriptscriptstyle T})|_{H^2(T)}^2\nonumber \\&\qquad +\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial (\Pi _hu)/\partial n]]\Vert _{L_2(e)}^2+\Vert \det D_h^2(\Pi _hu)-\det D^2 u\Vert _{L_2(\Omega )}^2\nonumber \\&\quad \le Ch^4. \end{aligned}$$
(3.1)

Consequently we have

$$\begin{aligned} \Vert D_h^2u_h\Vert _{L_2(\Omega )}&\le C, \end{aligned}$$
(3.2)
$$\begin{aligned} \Big (\sum _{T\in \mathcal {T}_h}|\mathrm {tr}(J_{\scriptscriptstyle T}^tD^2(u_h)_{\scriptscriptstyle T}J_{\scriptscriptstyle T})|_{H^2(T)}^2\Big )^\frac{1}{2}&\le Ch^2, \end{aligned}$$
(3.3)
$$\begin{aligned} \Big (\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial u_h/\partial n]]\Vert _{L_2(e)}^2\Big )^\frac{1}{2}&\le Ch^2, \end{aligned}$$
(3.4)
$$\begin{aligned} \Vert \det D_h^2 u_h-\psi \Vert _{L_2(\Omega )}&\le Ch^2. \end{aligned}$$
(3.5)

Let \(T\in \mathcal {T}_h\) be arbitrary and \(\psi _{\scriptscriptstyle T}\) be the \(P_1\) Lagrange interpolant of \(\psi \) on T. We have, by a standard inverse estimate (cf. [17, 23]),

$$\begin{aligned} \Vert \det D_h^2 u_h-\psi \Vert _{L_\infty (T)}&\le \Vert \det D_h^2 u_h-\psi _{\scriptscriptstyle T}\Vert _{L_\infty (T)} +\Vert \psi -\psi _{\scriptscriptstyle T}\Vert _{L_\infty (T)}\nonumber \\&\le Ch^{-1}\Vert \det D_h^2 u_h-\psi _{\scriptscriptstyle T}\Vert _{L_2(T)}+\Vert \psi -\psi _{\scriptscriptstyle T}\Vert _{L_\infty (T)}\nonumber \\&\le Ch^{-1}\big (\Vert \det D_h^2 u_h-\psi \Vert _{L_2(T)}+\Vert \psi -\psi _{\scriptscriptstyle T}\Vert _{L_2(T)}\big ) \nonumber \\&\qquad +\Vert \psi -\psi _{\scriptscriptstyle T}\Vert _{L_\infty (T)}, \end{aligned}$$

which together with (3.5) and the assumption that \(\psi \in { H^2(\Omega )}\) implies

$$\begin{aligned} \Vert \det D_h^2 u_h-\psi \Vert _{L_\infty (T)}\le Ch \qquad \forall \,T\in \mathcal {T}_h. \end{aligned}$$
(3.6)

3.2 The elementwise convexity of \({u_h}\)

Let \(q_{\scriptscriptstyle T}\) be a polynomial defined on \(T\in \mathcal {T}_h\). Recall that \(q_{\scriptscriptstyle T}\) is the restriction of a polynomial defined on \(\mathbb {R}^2\) which is also denoted by \(q_{\scriptscriptstyle T}\). We define \(I_Tq_{\scriptscriptstyle T}\) to be the restriction of \(I_{{T}_\dag }q_{\scriptscriptstyle T}\) on T, where \(I_{{T}_\dag }\) is the \(P_1\) nodal interpolation operator associated with \({T}_\dag \). Note that any linear polynomial on T is invariant under \(I_T\), and according to (2.7),

$$\begin{aligned} I_T \mathrm {tr}(J_{\scriptscriptstyle T}^t D^2v_{\scriptscriptstyle T}J_{\scriptscriptstyle T})\ge 0 \quad \hbox { on}\ T \text { for all } v\in L_h. \end{aligned}$$
(3.7)

We have, by (3.3), a standard inverse estimate (cf. [17, 23]) and the Bramble-Hilbert lemma,

$$\begin{aligned}&\Vert \mathrm {tr}(J_{\scriptscriptstyle T}^tD^2 (u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T})-I_T \mathrm {tr}(J_{\scriptscriptstyle T}^t D^2 (u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T}) \Vert _{L_\infty (T)}\nonumber \\&\quad \le C h^{-1}\Vert \mathrm {tr}(J_{\scriptscriptstyle T}^tD^2(u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T})-I_T \mathrm {tr}(J_{\scriptscriptstyle T}^t D^2(u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T})\Vert _{L_2(T)}\nonumber \\&\quad \le C h|\mathrm {tr}(J_{\scriptscriptstyle T}^tD^2(u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T})|_{H^2(T)}\nonumber \\&\quad \le Ch^3. \end{aligned}$$
(3.8)

Lemma 3.1

There exists a positive constant \(\alpha _\flat \) independent of h such that, for h sufficiently small, we have

$$\begin{aligned} \xi ^t D_h^2u_h\xi \ge \alpha _\flat |\xi |^2\qquad \hbox {on all}\ T\in \mathcal {T}_h \text { and for all } \xi \in \mathbb {R}^2, \end{aligned}$$
(3.9)

or equivalently the minimum eigenvalue of \(D_h^2u_h\) is bounded below by a positive constant independent of h.

Proof

Let \(T\in \mathcal {T}_h\) be arbitrary. From (3.6), we have

$$\begin{aligned} \det D^2(u_h)_{\scriptscriptstyle T} \ge \frac{1}{2} \min _{x\in {\bar{\Omega }}}\psi (x)>0 \qquad \hbox { on}\ T \end{aligned}$$

if h is sufficiently small. Consequently, in view of (2.10), we also have

$$\begin{aligned} h^{-4}\det (J_{\scriptscriptstyle T}^t D^2(u_h)_{\scriptscriptstyle T}J_{\scriptscriptstyle T})\ge \frac{ \delta ^2}{2} \min _{x\in {\bar{\Omega }}}\psi (x) \qquad \forall \,T\in \mathcal {T}_h, \end{aligned}$$
(3.10)

where the positive constant \(\delta \le \min \{h^{-2}|\det J_{\scriptscriptstyle T}|:\, T\in \mathcal {T}_h\}\) is independent of h.

On the other hand, on each \(T\in \mathcal {T}_h\), we have

$$\begin{aligned} h^{-2}\mathrm {tr}(J_{\scriptscriptstyle T}^tD^2(u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T})&\ge h^{-2}\big [\mathrm {tr}(J_{\scriptscriptstyle T}^t D^2(u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T})-I_T\mathrm {tr}(J_{\scriptscriptstyle T}^tD^2(u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T})\big ]\nonumber \\&\ge -h^{-2}\Vert \mathrm {tr}(J_{\scriptscriptstyle T}^t D^2(u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T})-I_T \mathrm {tr}(J_{\scriptscriptstyle T}^t D^2(u_h)_{\scriptscriptstyle T} J_{\scriptscriptstyle T}) \Vert _{L_\infty (T)}\nonumber \\&\ge -Ch \end{aligned}$$
(3.11)

by (3.7) and (3.8), where the positive constant C is independent of h.

We conclude from (3.10) and (3.11) that, for h sufficiently small, the minimum eigenvalue of \(h^{-2}J_{\scriptscriptstyle T}^tD^2(u_h)_{\scriptscriptstyle T}J_{\scriptscriptstyle T}\) on the triangle T is bounded below by a positive constant independent of T and h, which implies that the same is true for \(D_h^2u_h\) because \(\mathcal {T}_h\) is a quasi-uniform triangulation. \(\square \)

Therefore, for h sufficiently small, \(u_h\) is a strictly convex polynomial on each \(T\in \mathcal {T}_h\). Note that (3.9) is equivalent to

$$\begin{aligned} \xi ^t \mathrm {Cof}(D_h^2u_h)\xi \ge \alpha _\flat |\xi |^2\qquad \text { on all } T\in \mathcal {T}_h \text { and for all } \xi \in \mathbb {R}^2. \end{aligned}$$
(3.12)

3.3 A priori error estimates

We have, by the fundamental theorem of calculus (cf. [38, Lemma A.1]),

$$\begin{aligned} \det D^2 u-\det D_h^2 u_h=\Big [\int _0^1 \mathrm {Cof}D_h^2(tu+(1-t)u_h) dt\Big ]:D_h^2(u-u_h), \end{aligned}$$
(3.13)

which is valid for all points in \(\Omega \) except those on the edges of \(\mathcal {T}_h\). Here and below we use the colon to denote the Frobenius inner product between matrices.

Let \(A\in [L_\infty (\Omega )]^{2\times 2}\) be defined by the integral on the right-hand side of (3.13). For h sufficiently small, we have, by (1.6), (3.6) and (3.12),

$$\begin{aligned} \alpha |\xi |^2\le \xi ^tA(x)\xi \le \beta |\xi |^2 \qquad \forall \,\xi \in \mathbb {R}^2 \quad \text {and almost all } x\in \Omega , \end{aligned}$$
(3.14)

where \(0<\alpha \le \beta \) are constants independent of h.

The proof of the following lemma is given in Appendix A.

Lemma 3.2

Under the condition (3.14) we have

$$\begin{aligned} \Vert D_h^2(\zeta - v)\Vert _{L_2(\Omega )}&\le \frac{1}{1-\delta }\Big [\alpha ^{-1}\Vert A:D_h^2 (\zeta -v)\Vert _{L_2(\Omega )}\nonumber \\&\quad +2C_\dag \Big (\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial v/\partial n]]\Vert _{L_2(e)}^2\Big )^\frac{1}{2}\Big ] \end{aligned}$$
(3.15)

for all \(\zeta \in H^2(\Omega )\cap H^1_0(\Omega )\) and \(v\in V_h\cap H^1_0(\Omega )\), where

$$\begin{aligned} \delta =\frac{\beta -\alpha }{(\alpha ^2+\beta ^2)^\frac{1}{2}} \end{aligned}$$
(3.16)

and the positive constant \(C_\dag \) only depends on the shape regularity of \(\mathcal {T}_h\).

We can now establish an a priori error estimate for \(u_h\).

Theorem 3.3

There exists a positive constant C independent of h such that

$$\begin{aligned} \Vert u-u_h\Vert _h\le Ch^2. \end{aligned}$$
(3.17)

Proof

We can assume h is sufficiently small so that (3.14) is satisfied. We begin with the triangle inequality

$$\begin{aligned} \Vert D_h^2(u-u_h)\Vert _{L_2(\Omega )}\le & {} \Vert D_h^2\big ((u-\phi )-(u_h-\Pi _h\phi )\big )\Vert _{L_2(\Omega )}\nonumber \\&+\Vert D_h^2(\phi -\Pi _h\phi )\Vert _{L_2(\Omega )}. \end{aligned}$$
(3.18)

Note that \(u-\phi \in H^2(\Omega )\cap H^1_0(\Omega )\) by (1.1b) and \(u_h-\Pi _h\phi \in V_h\cap H^1_0(\Omega )\) by (2.7) and Remark 2.9. Hence it follows from (3.15) that

$$\begin{aligned}&\Vert D_h^2\big ((u-\phi )-(u_h-\Pi _h\phi )\big )\Vert _{L_2(\Omega )}\nonumber \\&\quad \le C\Big [\Vert A:D_h^2\big ((u-\phi )-(u_h-\Pi _h\phi )\big )\Vert _{L_2(\Omega )}\nonumber \\&\qquad +\Big (\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial (u_h-\Pi _h\phi )/\partial n]]\Vert _{L_2(e)}^2\Big )^\frac{1}{2}\Big ]\nonumber \\&\quad \le C\Big [\Vert A:D_h^2(u-u_h)\Vert _{L_2(\Omega )}+\Big (\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial u_h/\partial n]]\Vert _{L_2(e)}^2\Big )^\frac{1}{2}\nonumber \\&\qquad +\Vert D_h^2(\phi -\Pi _h\phi )\Vert _{L_2(\Omega )}+\Big (\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial \Pi _h\phi /\partial n]]\Vert _{L_2(e)}^2\Big )^\frac{1}{2}\Big ]. \end{aligned}$$
(3.19)

Putting (1.1a), (3.13), (3.18) and (3.19) together, we have

$$\begin{aligned} \Vert D_h^2(u-u_h)\Vert _{L_2(\Omega )}&\le C\Big [\Vert \psi -\det D_h^2u_h\Vert _{L_2(\Omega )}+\Big (\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial u_h/\partial n]]\Vert _{L_2(e)}^2\Big )^\frac{1}{2}\nonumber \\&\quad +\mathrm {Osc}(\phi )\Big ], \end{aligned}$$
(3.20)

where

$$\begin{aligned} \mathrm {Osc}(\phi )=\Vert D_h^2(\phi -\Pi _h\phi )\Vert _{L_2(\Omega )}+ \Big (\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial \Pi _h\phi /\partial n]]\Vert _{L_2(e)}^2\Big )^\frac{1}{2}. \end{aligned}$$
(3.21)

It follows from Remark 2.7, (3.4), (3.5), (3.20) and (3.21) that

$$\begin{aligned} \Vert D_h^2(u-u_h)\Vert _{L_2(\Omega )}\le Ch^2, \end{aligned}$$
(3.22)

and then the estimate (3.17) follows from (2.11), (3.4) and (3.22). \(\square \)

Remark 3.4

A careful tracking of the constant C in (3.17) shows that it takes the form of \(C_* (\beta /\alpha )[(1/\alpha )+1]\), where \(\alpha \) and \(\beta \) are the constants in (3.14) and the positive constant \(C_*\) depends only on u, \(\phi \) and the shape regularity of \(\mathcal {T}_h\).

According to the Poincaré-Friedrichs and Sobolev inequalities for piecewise \(H^2\) functions (cf. [16, 18]), we have

$$\begin{aligned} \Vert \zeta \Vert _{L_2(\Omega )}+|\zeta |_{H^1(\Omega )}+\Vert \zeta \Vert _{L_\infty (\Omega )}\le C\Vert \zeta \Vert _h \end{aligned}$$
(3.23)

for all \(\zeta \in H^1_0(\Omega )\) that is piecewise \(H^2\) with respect to \(\mathcal {T}_h\). It follows from (2.2), (2.4), Remark 2.9, (2.11), (3.17), (3.23) and the triangle inequality that

$$\begin{aligned}&\Vert u-u_h\Vert _{L_2(\Omega )}+|u-u_h|_{H^1(\Omega )}+\Vert u-u_h\Vert _{L_\infty (\Omega )}\nonumber \\&\quad \le \Vert u-\Pi _hu\Vert _{L_2(\Omega )}+|u-{ \Pi _h} u_h|_{H^1(\Omega )}+\Vert u-{ \Pi _h}u_h\Vert _{L_\infty (\Omega )}\nonumber \\&\qquad +C(\Vert u-\Pi _hu\Vert _h+\Vert u-u_h\Vert _h)\nonumber \\&\quad \le Ch^2. \end{aligned}$$
(3.24)

Remark 3.5

Numerical results in Example 4.1 indicate that the convergence in \(\Vert \cdot \Vert _{L_2(\Omega )}\), \(|\cdot |_{H^1(\Omega )}\) and \(\Vert \cdot \Vert _{L_\infty (\Omega )}\) is better than \(O(h^2)\).

3.4 An a posteriori error estimate

Since finding a global minimizer of a nonconvex function is in general NP-hard, an optimization algorithm usually only produces an approximate stationary point \({\tilde{u}}_h\) of the cost function \(\mathcal {J}_h\). Therefore we need more than the a priori error estimate (3.17) to ensure the convergence of the approximate solutions to the solution u of (1.1).

In the following discussion, it suffices to assume that u belonging to \(W^{2,\infty }(\Omega )\) is strictly convex, i.e., (1.5) is satisfied almost everywhere in \(\Omega \).

Let \({\tilde{u}}_h\in L_h\) be elementwise strictly convex. Then the relation (3.13) is valid with \(u_h\) replaced by \({\tilde{u}}_h\), and we have, by Lemma 3.2 and the arguments in the proof of Theorem 3.3,

$$\begin{aligned} \Vert D_h^2(u-{\tilde{u}}_h)\Vert _{L_2(\Omega )}&\le C\Big [\Vert \det D_h^2{\tilde{u}}_h-\psi \Vert _{L_2(\Omega )}+\Big (\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial {\tilde{u}}_h/\partial n]]\Vert _{L_2(e)}^2\Big )^\frac{1}{2}\nonumber \\&\quad +\mathrm {Osc}(\phi )\Big ], \end{aligned}$$
(3.25)

where the positive constant C depends only on the minimum and maximum eigenvalues of \(D^2u\) and \(D_h^2{\tilde{u}}_h\) over \(\Omega \) and the shape regularity of \(\mathcal {T}_h\).

Therefore, after verifying the elementwise strict convexity of an approximate solution \({{\tilde{u}}}_h\) for (2.8), we can monitor the convergence of \({{\tilde{u}}}_h\) by evaluating the residual-based error estimator

$$\begin{aligned} \eta _h({{\tilde{u}}}_h)=\Vert \det D_h^2{{\tilde{u}}}_h-\psi \Vert _{L_2(\Omega )}+\Big (\sum _{e\in \mathcal {E}_h^i}|e|^{-1}\Vert [[\partial {\tilde{u}}_h/\partial n]]\Vert _{L_2(e)}^2\Big )^\frac{1}{2}. \end{aligned}$$
(3.26)

According to (2.11) and (3.25), the estimator \(\eta _h\) is reliable for the error measured by the norm \(\Vert \cdot \Vert _h\):

$$\begin{aligned} \Vert u-{\tilde{u}}_h\Vert _h\le C\big (\eta _h({\tilde{u}}_h)+\mathrm {Osc}(\phi )\big ). \end{aligned}$$
(3.27)

Moreover \(\mathrm {Osc}(\phi )\) is \(O(h^2)\) (cf. Remark 2.7).

In the other direction the obvious relations

$$\begin{aligned} |e|^{-1}\Vert [[\partial {\tilde{u}}_h/\partial n]]\Vert _{L_2(e)}^2&= |e|^{-1}\Vert [[\partial (u- {\tilde{u}}_h)/\partial n]]\Vert _{L_2(e)}^2 \end{aligned}$$
(3.28)

and

$$\begin{aligned} \Vert \det D_h^2{\tilde{u}}_h-\psi \Vert _{L_2(T)}&=\Vert \det D^2_h{\tilde{u}}_h-\det D^2u\Vert _{L_2(T)} \end{aligned}$$
(3.29)

imply that \(\eta _h({\tilde{u}}_h)\) is also locally efficient. Therefore we can use \(\eta _h\) to generate adaptive meshes when the solution of (1.1) is less smooth.

4 Numerical results

We have tested our method on three examples with known solutions. For each example we solve (2.7)–(2.9) by an active set algorithm (cf. Appendix B) that produces an approximate stationary point of the cost function in (2.9). The elementwise convexity of the approximate solutions are checked numerically by Algorithm C.1 in Appendix C.

For Example 4.2 and Example 4.3, where the known solutions do not belong to \(H^4(\Omega )\), we have also solved (2.7)–(2.9) on adaptive meshes generated by the error estimator in (3.26) and a Dörfler marking strategy [32].

The relative errors of the approximate solution \({\tilde{u}}_h\) in various norms are defined by

$$\begin{aligned} e^r_{2,h} = \frac{\Vert u - {\tilde{u}}_h\Vert _{h}}{\Vert u\Vert _{H^2(\Omega )}},\quad e^r_{1,h} = \frac{\Vert u - {\tilde{u}}_h\Vert _{H^1(\Omega )}}{\Vert u\Vert _{H^1(\Omega )}},\quad e^r_{0,h} = \frac{\Vert u - {\tilde{u}}_h\Vert _{L^2(\Omega )}}{\Vert u\Vert _{L^2(\Omega )}} \end{aligned}$$

and

$$\begin{aligned} e^r_{\infty ,h} = \frac{\max \limits _{p\in \mathcal {V}_h}|u(p) - {\tilde{u}}_h(p)|}{\Vert u\Vert _{L^\infty (\Omega )}}, \end{aligned}$$

where \(\mathcal {V}_h\) is the set of the vertices of the triangulation \(\mathcal {T}_h\).

All the numerical experiments were carried out on a MacBook Pro laptop computer with a 2.8GHz Quad-Core Intel Core i7 processor and with 16GB 2133 MHz LPDDR3 memory. We use MATLAB (R2018b v.9.5.0) in our computations.

Example 4.1

This example is from [28], where \(\Omega \) is the unit square \((0,1)^2\),

$$\begin{aligned} \psi (x) = (1+|x|^2)e^{|x|^2/2} \quad \text {and}\quad \phi (x) = e^{x^2/2}. \end{aligned}$$

The exact solution is \(u = e^{x^2/2}\). The assumptions (1.2)–(1.4) are satisfied.

The errors of the approximate solutions \({\tilde{u}}_h\) obtained by the optimization algorithm on uniform meshes are reported in Table 1. The order of convergence for \(e_{2,h}^r\) is 2, which agrees with the estimate in Theorem 3.3 for the solutions \(u_h\) of (2.8). The orders of convergence for \(e_{1,h}^r\), \(e_{0,h}^r\) and \(e_{\infty ,h}^r\) are higher.

Table 1 Relative errors versus mesh size h and orders of convergence (Example 4.1)

The residual \(\eta _h({\tilde{u}}_h)\) and the cost \(\mathcal {J}_h({\tilde{u}}_h)\) are reported in Table 2, their behaviors agree with the estimates (3.1), (3.4) and (3.5) for the minimizer \(u_h\). It is observed from the CPU times in Table 2 that a good approximate solution at \(h=2^{-2}\) was computed in 3 seconds.

We have verified that all the approximate solutions are elementwise strictly convex, and the reliability of the error estimator \(\eta _h\) can be observed by comparing \(e_{2,h}^r\) in Table 1 and \(\eta _h({\tilde{u}}_h)\) in Table 2.

We have also solved the same problem on four other regular polygons (cf. Fig. 2), where the diameters of these polygons are 2.3660 (triangle), 1.6420 (pentagon), 1.5774 (hexagon) and 1.5307 (octagon).

It is observed from the convergence histories of \(e_{2,h}^r\) and \(e_{\infty ,h}^r\) in Fig. 3 that the performance of our method is similar for all five polygons.

Table 2 Residual, Cost and CPU Time (Example 4.1)
Fig. 2
figure 2

Regular triangle, pentagon, hexagon and octagon

Fig. 3
figure 3

Convergence histories of \(e_{2,h}^r\) (left) and \(e_{\infty ,h}^r\) (right) for five regular polygons (Example 4.1)

Example 4.2

This example is from [63], where \(\Omega = (-1,1)^2\),

$$\begin{aligned} \psi (x) = \left\{ \begin{aligned}&16\quad&\ \text{ in }\ \ |x|\le 1/2\\&64 - 16|x|^{-1} \quad&\text{ in }\ \ 1/2\le |x| \end{aligned} \right. , \end{aligned}$$

the exact solution is

$$\begin{aligned} u(x) = \left\{ \begin{aligned}&2|x|^2\quad&\ \text{ in }\ \ |x|\le 1/2\\&2(|x| - 1/2)^2 + 2|x|^2\quad&\text{ in }\ \ 1/2\le |x| \end{aligned} \right. , \end{aligned}$$

and \(\phi \in H^4(\Omega )\) equals u in a neighborhood of \(\partial \Omega \). For this example, the function \(\psi \) is piecewise smooth and discontinuous along the circle defined by \(|x|=1/2\), and the Aleksandrov solution u is a piecewise smooth \(C^1\) function.

The computational meshes are generated by a bisection procedure to fit the circle where \(\psi \) is discontinuous. The first two meshes and the final mesh are presented in Fig. 4.

Fig. 4
figure 4

Computational meshes (Example 4.2)

We have verified that all the approximate solutions are elementwise strictly convex. The relative errors are reported in Table 3. The convergence of \(e_{2,h}^r\) is of a reduced order \(\approx 0.5\), and the orders of convergence are higher for the lower order norms.

The residual \(\eta _h({\tilde{u}}_h)\), the cost \(\mathcal {J}_h({\tilde{u}}_h)\) and the CPU time are provided in Table 4. It is observed that a satisfactory approximate solution with 1009 dofs was computed in less than 3 seconds. The reliability estimate (3.27) is confirmed by comparing the values of \(e_{2,h}^r\) and \(\eta _h({\tilde{u}}_h)\).

Table 3 Relative errors versus mesh size h and orders of convergence (Example 4.2)
Table 4 Residual, Cost and CPU Time (Example 4.2)

We also tested the performance of the a posteriori error estimator in (3.26) for this example. The convergence histories of \(e_{2,h}^r\) and \(e_{\infty ,h}^r\) on bisection and adaptive meshes are shown in Fig. 5. The advantages of the adaptive meshes can be observed until round-off errors interfere at finer meshes.

The discrete solution on the final bisection mesh and the adaptive mesh with 3385 dofs are displayed in Fig. 6.

Fig. 5
figure 5

Convergence histories of \(e_{2,h}^r\) (left) and \(e_{\infty ,h}^r\) (right) on bisection and adaptive meshes (Example 4.2)

Fig. 6
figure 6

(Left) graph of the computed solution on the final bisection mesh and (Right) adaptive mesh with 3385 dofs (Example 4.2)

Example 4.3

This example is from [64], which is a modification of an example in [48]. The domain \(\Omega \) is the unit square \( = (0,1)^2\),

$$\begin{aligned} \psi (x) = \max \left( 1 - \frac{0.2}{|x - (1/2,1/2)|},0\right) , \end{aligned}$$

the exact solution of this example is

$$\begin{aligned} u(x) = \frac{1}{2}(\max (|x - (1/2,1/2)| - 0.2),0)^2, \end{aligned}$$

and \(\phi \in H^4(\Omega )\) equals u in a neighborhood of \(\partial \Omega \). For this example the function \(\psi \) vanishes on the disc defined by \(|x-(1/2,1/2)|\le 0.2\) and the Aleksandrov solution u is a piecewise smooth \(C^1\) function.

We solved this problem by the nonlinear least-squares method on uniform meshes. The approximate solutions are elementwise strictly convex outside the disc where \(\psi =0\). The relative errors of \({\tilde{u}}_h\) are reported in Table 5. The order of convergence for \(e_{2,h}^r\) is roughly 0.5 and the orders of convergence for \(e_{1,h}^r\), \(e_{0,h}^r\) and \(e_{\infty ,h}^r\) are better than 1. The discrete solution at \(h=2^{-5}\) can be found in Fig. 8.

The residual \(\eta _h({\tilde{u}}_h)\), the cost \(\mathcal {J}_h({\tilde{u}}_h)\) and the CPU time are provided in Table 6. Comparing \(e_{2,h}^r\) in Table 5 and \(\eta _h({\tilde{u}}_h)\) in Table 6, one can see that the reliability estimate (3.27) is no longer valid because of the lack of strict convexity for both the discrete solutions and the continuous solution inside the disc where \(\psi =0\). It can also be seen that a satisfactory approximate solution at \(h=2^{-3}\) was computed in less than 8 seconds.

Table 5 Relative errors versus mesh size h and orders of convergence (Example 4.3)
Table 6 Residual, Cost and CPU Time (Example 4.3)

We also tested the performance of the a posteriori residual error estimator in (3.26) for this example. The convergence histories of \(e_{2,h}^r\) and \(e_{\infty ,h}^r\) on uniform and adaptive meshes are shown in Fig. 7. The advantage of adaptive meshes is observed. The adaptive mesh with 30187 dofs in Fig. 8 clearly captures the singularities of the exact solution along the circle defined by \(|x-(1/2,1/2)|=0.2\).

Fig. 7
figure 7

Convergence histories of \(e_{2,h}^r\) (left) \(e_{\infty ,h}^r\) (right) on uniform and adaptive meshes (Example 4.3)

Fig. 8
figure 8

(Left) graph of the computed solution on uniform mesh with \(h=2^{-5}\) and (Right) adaptive mesh with 30187 dofs (Example 4.3)

5 Concluding remarks

By going beyond the classical definition of a finite element, we are able to construct a \(C^0\) interior penalty method for the Dirichlet boundary value problem of the Monge–Ampère equation, where the elementwise convexity of the approximate solutions can be enforced. This in turn enables us to use existing results for second order elliptic equations in nondivergence form to obtain both a priori and a posteriori error estimates. The a posteriori error estimate is a significant part of our method since it allows us to access the convergence of the approximate solutions generated by optimization algorithms that are not necessarily global minimizers.

The approach in this paper can be extended to smooth domains. We also note that convexity enforcing is useful for the problem of prescribed Gaussian curvature (cf. [38, 42]) and the nonlinear least-squares approach can be applied to the Pucci equations (cf. [20, 42]). These are some of our ongoing projects.