1 Introduction

Most inorganic solids in nature are polycrystals. They are composed of microscopic crystallites (grains) of varying size and orientation in which the atoms are arranged in a periodic, crystalline pattern. In spite of their ubiquity, it remains poorly understood why in these materials such highly regular structures develop at the microscale. The core challenge is to investigate the phenomenon of crystallization, that is, the tendency of atoms to self-assemble into a crystal structure. An ultimate solution would be to understand this as a consequence of the interatomic interactions, where such interactions are determined by the laws of quantum mechanics.

In view of the current state of research, however, the crystallization question seems out of reach in this generality. It is thus necessary to consider reduced models and to study simplified theories which, however, retain essential features of the interatomic interactions. We follow this route by restricting to zero temperature and by describing our system in the frame of Molecular Mechanics [1, 30, 37] as a classical system of particles, whose interaction is given in terms of an empirical pair interaction potential. Moreover, we consider planar rather than three-dimensional models. Given a configuration \(X =\{x_1,\ldots ,x_N\} \subset {\mathbb {R}}^2\) consisting of a finite number of particles, their configurational energy \({\mathcal {E}}(X)\) takes the form

$$\begin{aligned} {\mathcal {E}}(X) = \frac{1}{2}\sum \limits _{i \ne j} V_{\mathrm{pair}} \big (|x_i-x_j|\big ), \end{aligned}$$

where \( V_{\mathrm{pair}} :[0,+\infty ) \rightarrow \overline{{\mathbb {R}}}\) denotes the pair potential. (The factor 1/2 accounts for double counting.) Such potentials typically are repulsive for close-by atoms while two atoms at larger distances (yet still in their interaction range) exert attractive forces on each other. The latter favors the formation of clusters, whereas the short-range repulsion guarantees that the atoms keep a minimal distance.

Notably, even for commonly used models such as the Lennard–Jones potential, the crystallization problem is still open beyond the one-dimensional setting. (In one dimension, the situation is considerably easier: crystallization at zero temperature for Lennard–Jones interactions is shown in [31]. Recent results for positive temperature including an analysis of boundary layers are obtained in [34, 35]. For results on dimers we refer to [6, 29].) The first rigorous results for a two-dimensional system were achieved in [32, 33, 43]; see also the recent paper [18]. For the very special choice of the ‘Heitmann–Radin sticky disk’ interaction potential

$$\begin{aligned} V_{\mathrm{sticky}} (r) = {\left\{ \begin{array}{ll} +\infty &{}\text {if }\,\, r<1,\\ -1 &{}\text {if }\,\,r=1,\\ 0&{}\text {if }\,\, r>1, \end{array}\right. } \end{aligned}$$
(1.1)

it was shown in [33] that ground states, that is, minimizers under the cardinality constraint \(\#X= N\), crystallize: they are subsets of the triangular lattice. The potential \( V_{\mathrm{sticky}}\) is pictured schematically in Fig. 1.

Fig. 1 The interaction potential \( V_{\mathrm{sticky}}\)

On the one hand, the potential \( V_{\mathrm{sticky}}\) draws its motivation from being the most basic choice of a potential featuring the properties discussed above. On the other hand, it models extremely brittle materials and might be viewed as an ‘infinitely brittle’ limiting model for more generic interaction potentials, in which the hard core radius, the equilibrium distance, and the interaction range coincide. Slightly more general potentials are discussed in [43] which, however, do not allow for soft elastic interactions either. Still only partial results are available for more general potentials or higher dimensions, see [7] for a recent survey. Most noteworthy, [21, 47] in two and [24] in three dimensions prove that crystalline structures have optimal bulk energy scaling and crystals are ground states subject to their own boundary conditions. The bulk scaling, however, is insufficient, and prescribed boundary conditions are prohibitive, in view of our goal to investigate the emergence of polycrystals. For this task, it is indispensable both to work at the surface energy scale, which is much finer than the bulk scaling, and to allow for free boundary conditions.

The ground states of sticky disk potentials in two dimensions are by now very well understood, and not only on the atomic microscale. In [3] the macroscopic shape was identified as being the Wulff shape of an associated crystalline perimeter functional. Fine properties and surface fluctuations were investigated in [45] and quantified in terms of an \(N^{3/4}\) law (see the comment below (1.2)). Sharp constants for this law were then established in [17] and the uniqueness of ground states was characterized in [19]. We also mention extensions to other crystals [16, 40, 42] and dimers [26, 27]. By way of contrast, in dimension three or higher the recent results [11, 39, 41] characterize optimal energy configurations within classes of lattices and are in this sense conditional on crystallization.

The main objective of our contribution is to advance our understanding of (microscopic) crystallization and formation of macroscopic clusters beyond ground states and single crystals. Indeed, all of the aforementioned results ultimately rely on the emergence of a single crystal which is supported on a unique periodic structure. Restricting our analysis to the basic Heitmann–Radin sticky disk potential (1.1), we succeed in deriving a rather complete picture on the formation of general polycrystals by considering the \(\Gamma \)-limit for the interaction energy in the surface energy regime in the infinite particle limit. (We refer to [8, 14] for an exhaustive treatment of \(\Gamma \)-convergence.) The first relevant steps in this direction were obtained in [20], where the authors prove a compactness result for polycrystals and identify the \(\Gamma \)-limit in the case of a single crystal limiting configuration. In the present work, we prove a full \(\Gamma \)-convergence result and provide a limiting continuum model consisting of grains that are characterized by a rotation and, in addition, a micro-translation. We also analyze in depth the surface energy of grain boundaries both for vacuum–solid and solid–solid phase transitions.

We proceed to describe our particle model in more detail. The minimal energy of a configuration \(X_N = \{x_1,\ldots ,x_N\} \subset {\mathbb {R}}^2\) of N particles has been determined already in [32]:

$$\begin{aligned} \min \{{\mathcal {E}}(X_N) :\#X_N = N\} = -\lfloor 3N - \sqrt{12N-3}\rfloor = -3N + \mathrm{O}(\sqrt{N}). \end{aligned}$$
(1.2)

The leading order term \(-3N\) comes from \(N - \mathrm{O}(\sqrt{N})\) atoms in the bulk, each having six neighbors. The lower order term \(\sim \sqrt{N}\) is due to missing neighbors of a number \(\mathrm{O}(\sqrt{N})\) of atoms at the boundary and is thus a surface energy. (The aforementioned \(N^{3/4}\) law quantifies the surprisingly large possible deviations of ground states from the macroscopic Wulff shape which involve a number of \(\sim N^{3/4} \gg \sqrt{N}\) particles.)
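As a sanity check of (1.2), the following Python sketch counts bonds in hexagonal clusters of the triangular lattice and compares the resulting energy with the right-hand side of (1.2). It is only an illustration under stated assumptions: the function names are hypothetical, atoms are stored in integer lattice coordinates \((p,q)\) standing for \(p + q\omega \) with \(\omega = \frac{1}{2}+\frac{i}{2}\sqrt{3}\), and only the particle numbers \(N = 3k^2+3k+1\) of full hexagons are tested.

```python
import math
from itertools import combinations


def v_sticky_sq(d2):
    """Heitmann-Radin potential (1.1) as a function of the squared distance.

    For lattice points p + q*omega the squared distance is the integer
    p*p + p*q + q*q, so the case r = 1 can be tested exactly.
    """
    if d2 < 1:
        return math.inf
    return -1 if d2 == 1 else 0


def hexagon(k):
    """Lattice coordinates (p, q) of the hexagonal cluster of radius k."""
    return [(p, q) for p in range(-k, k + 1) for q in range(-k, k + 1)
            if abs(p + q) <= k]


def energy(points):
    """Configurational energy; each unordered pair is counted once."""
    total = 0
    for (p1, q1), (p2, q2) in combinations(points, 2):
        dp, dq = p1 - p2, q1 - q2
        total += v_sticky_sq(dp * dp + dp * dq + dq * dq)
    return total


def min_energy(n):
    """Right-hand side of (1.2), -floor(3n - sqrt(12n - 3)), in integers."""
    ceil_sqrt = math.isqrt(12 * n - 4) + 1   # ceil(sqrt(12n - 3)) for n >= 1
    return -(3 * n - ceil_sqrt)


for k in range(1, 5):
    X = hexagon(k)
    # columns: radius k, particle number N, energy of the hexagon, bound (1.2)
    print(k, len(X), energy(X), min_energy(len(X)))
```

For these particle numbers \(12N-3 = (6k+3)^2\) is a perfect square, so the hexagon has exactly \(3N - (6k+3)\) bonds and the last two columns agree.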

As polycrystals will not be ground states in general, but rather metastable states with surface energy contributions from atoms at individual grain boundaries, we proceed to address the class of all configurations at the finite surface energy scaling, that is, we consider \(X_N \subset {\mathbb {R}}^2\), \(\# X_N = N\), with bounded normalized energy

$$\begin{aligned} \frac{{\mathcal {E}}(X_N)+3N}{\sqrt{N}} = \frac{1}{2\sqrt{N}}\sum \limits _{x \in X_N} \Big ( 6 + \sum \limits _{y \in X_N {\setminus } \lbrace x \rbrace } V_{\mathrm{sticky}} \big (|x-y|\big )\Big ) \end{aligned}$$

as \(N \rightarrow \infty \). Here, we have subtracted the minimal energy \(-3\) per particle times the number of particles and rescaled with \(\sqrt{N}\).

The diameter of an N-particle configuration \(X_N\) with energy given in (1.2) is \(\sim \sqrt{N}\). To obtain configurations which are contained in a bounded domain, we therefore rescale the configuration by a factor \(\varepsilon :=1/\sqrt{N}\), that is, \(X_\varepsilon := \varepsilon X_N\). We then study the asymptotics of the energy \(E_\varepsilon (X_\varepsilon )\) where the energy functional \(E_\varepsilon \) is defined on finite point sets \(X \subset \mathbb R^2\) by

$$\begin{aligned} E_\varepsilon (X) = \frac{1}{2}\sum \limits _{x \in X}\varepsilon \Big (6+ \sum \limits _{y \in X{\setminus } \lbrace x\rbrace } V_{\mathrm{sticky}}\Big ( \frac{|x-y|}{\varepsilon } \Big )\Big ). \end{aligned}$$
(1.3)

This will allow us to pass to a macroscopic description as \(\varepsilon \rightarrow 0\). In what follows, we consider the energy \(E_\varepsilon \) in (1.3) without cardinality constraint since the energy has already been normalized with respect to the minimal energy per particle.

Our main results are a full \(\Gamma \)-convergence proof for the functionals \(E_\varepsilon \) towards a surface energy functional (Theorem 2.3) and a detailed analysis of the limiting continuum surface energy density (Proposition 2.2 and Theorem 2.5). We also prove a corresponding compactness result for bounded energy sequences (Theorem 2.1), which turns out to be comparatively straightforward. The proofs in fact also provide a rather complete picture of the structure of grain boundaries. We collect these findings of independent interest in Theorem 5.4. Our continuum description keeps track not only of the orientation angles of various grains but depends additionally on a micro-translation vector which in particular measures the translational offset of two lattices with the same orientation. Indeed, the introduction of such an augmented field does not only provide a finer characterization of the continuum limit, but turns out to be crucial when polycrystals with multiple solid–solid grain boundaries are considered.

The limiting surface energy \(\varphi \) is a function of the relative orientation of the two grains, their microscopic translation misfit, and the normal to the interface. For solid–vacuum surfaces this was identified in [3, 20] as the Finsler norm whose unit ball is shaped like a Voronoi cell of the lattice in the solid part. In other words, this is just the surface energy density of the crystal perimeter. For solid–solid interfaces, however, the problem is considerably more subtle as there are atomic interactions across the interface. In softer materials, one expects dislocations to accumulate and elastic strain to concentrate near such grain boundaries. We refer to [23, 38] for recent mathematical developments on substantiating the Read–Shockley formula [44] in such a regime. By way of contrast, within our extremely brittle set-up, generically \(\varphi \) turns out to be given by the sum of the solid–vacuum surface energies of the two grains. Here, the term generic refers to the fact that the surface energy may be smaller only for a countable number of mismatch angles between the two lattices, and corresponding micro-translations contained in a finite number of spheres.

We proceed with some comments on the general proof strategy. As is customary for variational limits with interfacial energies, the density \(\varphi \) is expressed in terms of a cell formula minimizing the asymptotic surface energy between two grains separated by a flat grain boundary. In such cell problems, it is instrumental to pass from a mere \(L^1\)-convergence to fixed boundary values in order to match the \(\Gamma \)-\(\liminf \) and \(\Gamma \)-\(\limsup \) inequalities. Motivated by [5, 25, 46] for vectorial problems in liquid–liquid phase transitions and [13, 15, 36] in solid–solid phase transitions, we use a cut-off construction, the so-called fundamental estimate, to replace an asymptotic realization by the exact attainment of converging boundary values in a first step. Here, our extremely brittle set-up on the one hand renders geometric rigidity estimates easier as compared to, for example, [13, 15]. On the other hand, this calls for carefully refined cut-off constructions since very small modifications in the configurations may induce a lot of energy. However, in contrast to [13, 15], a cell problem with converging boundary data turns out to be insufficient in the presence of multiple grain boundaries. Thus, a further step is needed to show that they can be replaced by fixed boundary values. This passage is also subtle due to our rigid set-up, which requires a thorough analysis of possible touching points of two lattices (points with distance \(\varepsilon \)). Finally, let us also mention that related, very general \(\Gamma \)-convergence results for elastic materials exhibiting discontinuities along surfaces, see for example [4, 10, 28], do not apply to our situation. Most notably, in [28], a model similar to ours featuring rigid grains is considered. Unfortunately, these results cannot be used in our setting as they fundamentally rely on continuous surface interactions.

At the core of our proofs, there are two key steps to which we devote Sections 5 and 6, respectively. Firstly, Lemma 5.1 allows us to reduce the cell formula to two lattices only. An expanded version of this observation is detailed in Theorem 5.4. It shows that in our brittle set-up there are no interpolating boundary layers at interfaces. This is done by employing techniques from graph theory in order to exclude inclusions of grains of different orientation as the prescribed boundary datum. The basic idea behind its proof is that to each admissible configuration one can associate its bond-graph and for this graph such inclusions induce non-triangular faces which in turn lead to fewer bonds than a competitor without such inclusions. This can be quantified via the face defect, see definition (5.4). Once established, this in particular results in a largely simplified analysis of the interaction energy with vacuum as compared to [20], see Lemma 6.1. More importantly, it is crucial for the second main ingredient of the proof: the quantification of solid–solid interactions with the help of Lemma 6.2, which clarifies when the surface energy can be smaller than twice the interaction energy with vacuum and plays a pivotal role in order to show that converging boundary values can be replaced by fixed ones. This can be understood as a rigidity theorem for the mismatch-angle between two grains: the generically expected interaction energy can exceed the grain boundary energy only for finitely many mismatch angles depending on the excess. Its proof relies on the fact that such an energy gap can only occur if the two lattices have many touching points (points with distance \(\varepsilon \)). This entails that the touching points of the two lattices have to be rather equi-distributed along the interface. This, however, can only happen in a periodic landscape, which reduces the possible mismatch-angle to a finite set. 
Many further ingredients of our proofs are more standard (blow-up, density arguments, fundamental estimate, ...), but technically challenging in our case since the energy is very rigid and thus very sensitive to small changes of the configuration.

The paper is organized as follows: in Section 2 we introduce the model and present the main results. Section 3 is devoted to the proofs of compactness and \(\Gamma \)-convergence. They fundamentally rely on a fine characterization of the surface energy density whose proof is postponed to Sections 4–7. In Section 4 we address the fundamental estimate and in Section 7 we show that converging boundary values can be replaced by fixed ones. Sections 5 and 6 are devoted to the reduction of the cell formula to two lattices only and to the characterization of solid–vacuum/solid–solid interactions at grain boundaries, respectively.

2 Setting of the Problem and Main Results

In this section we introduce our model, give basic definitions, and present our main results.

2.1 Configurations and Atomistic Energy

In what follows we always assume that X is a finite subset of \({\mathbb {R}}^2\). We denote by \( V_{\mathrm{sticky}} :[0,+\infty ) \rightarrow \overline{{\mathbb {R}}}\) the Heitmann–Radin potential defined in (1.1), see Fig. 1. By \(\varepsilon >0\) we denote the atomic spacing. The normalized atomistic energy \(E_\varepsilon \) of a given configuration X is given by (1.3). The notion normalized has been explained in the introduction and is chosen in such a way that an infinite triangular lattice with spacing \(\varepsilon \) has energy zero. Equivalently, the energy can be expressed in terms of the neighborhoods of the atoms. To this end, we introduce the neighborhood of \(x \in X\) by

$$\begin{aligned} {\mathcal {N}}_\varepsilon (x) =\{y\in X : |x-y| = \varepsilon \}. \end{aligned}$$
(2.1)

If \(\varepsilon =1\), we omit the subscript \(\varepsilon \) and just write \({\mathcal {N}}(x)\) for simplicity. In view of \( V_{\mathrm{sticky}} (r) = \infty \) for \(r \in (0,1)\), an elementary geometric argument shows that for configurations X with \(E_\varepsilon (X) < +\infty \) it holds that

$$\begin{aligned} \# {\mathcal {N}}_\varepsilon (x) \leqq 6 \ \ \ \text {for all }x \in X. \end{aligned}$$
(2.2)

In particular, if \(\# {\mathcal {N}}_\varepsilon (x) = 6\), the neighbors form a regular hexagon with center x and diameter \(2\varepsilon \). By (1.1) and (1.3) we can now rewrite the energy as

$$\begin{aligned} E_\varepsilon (X)= \frac{1}{2}\sum \limits _{x\in X} \varepsilon \big (6-\#{\mathcal {N}}_\varepsilon (x)\big ). \end{aligned}$$

Additionally, for \(X \subset {\mathbb {R}}^2\) and Borel sets \(B \subset {\mathbb {R}}^2\), we define a localized version of the energy by

$$\begin{aligned} E_\varepsilon (X,B)= \frac{1}{2}\sum \limits _{x\in X\cap B} \varepsilon \big (6-\#{\mathcal {N}}_\varepsilon (x)\big ). \end{aligned}$$
(2.3)
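The neighborhood formulation of the energy translates directly into a computation. The following Python sketch is only an illustration under stated assumptions: the function names are hypothetical, points of \({\mathbb {R}}^2\) are stored as complex numbers, the condition \(|x-y| = \varepsilon \) is tested up to a small relative tolerance, and the Borel set B in (2.3) is passed as a membership predicate. Configurations violating the hard core constraint are assigned the value \(+\infty \).

```python
import math

OMEGA = complex(0.5, math.sqrt(3) / 2)   # lattice vector omega


def scaled_hexagon(k, eps):
    """Hexagonal cluster of radius k in the scaled triangular lattice."""
    return [eps * (p + q * OMEGA)
            for p in range(-k, k + 1) for q in range(-k, k + 1)
            if abs(p + q) <= k]


def neighbor_counts(X, eps, tol=1e-9):
    """Return #N_eps(x) for each x in X, or None if two atoms are too close."""
    counts = [0] * len(X)
    for i, x in enumerate(X):
        for j, y in enumerate(X):
            if i == j:
                continue
            d = abs(x - y)
            if d < eps * (1 - tol):
                return None              # hard core violated: infinite energy
            if d <= eps * (1 + tol):
                counts[i] += 1           # |x - y| = eps up to rounding
    return counts


def energy(X, eps, in_region=lambda x: True):
    """E_eps(X, B) as in (2.3); the default predicate gives E_eps(X)."""
    counts = neighbor_counts(X, eps)
    if counts is None:
        return math.inf
    return 0.5 * eps * sum(6 - c for x, c in zip(X, counts) if in_region(x))


for k in (1, 4, 12):
    n = 3 * k * k + 3 * k + 1            # number of atoms in the hexagon
    eps = 1 / math.sqrt(n)               # rescaling eps = 1 / sqrt(N)
    print(k, n, energy(scaled_hexagon(k, eps), eps))
```

For these hexagons the energy equals \((6k+3)/\sqrt{3k^2+3k+1}\), which stays bounded and tends to \(2\sqrt{3}\) as \(k \rightarrow \infty \), in line with the surface energy scaling discussed in the introduction.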

2.2 Basic Definitions

This subsection is devoted to basic notions which we will use throughout the paper.

Notation We let \({\mathbb {S}}^1 = \lbrace x\in {\mathbb {R}}^2 :|x| = 1 \rbrace \). Given \(\nu \in {\mathbb {S}}^1\), we denote by \(\nu ^\bot \in {\mathbb {S}}^1\) the unit vector obtained by rotating \(\nu \) by \(\pi /2\) in a clockwise sense. The scalar product between two vectors \(x,y \in \mathbb R^2\) is denoted by \(\langle x,y \rangle \). Without further notice, we sometimes identify vectors \(x \in \mathbb R^2\) with elements of \(\mathbb C\). In particular, we identify rotations in the plane with a multiplication with a unit vector in \({\mathbb {C}}\): namely, the rotation of \(x \in \mathbb R^2\) by an angle \(\theta \in [0,2\pi )\) is indicated by \(e^{i\theta } x\). For \( t\in \mathbb R\), we write \(\lfloor t \rfloor = \max \lbrace k\in \mathbb Z:k \leqq t\rbrace \) and \(\lceil t \rceil = \min \lbrace k\in \mathbb Z:k \geqq t\rbrace \).

We denote by \({\mathcal {L}}^2\) and \({\mathcal {H}}^1\) the two-dimensional Lebesgue measure and the one-dimensional Hausdorff measure, respectively. We write \(\chi _E\) for the characteristic function of any \(E\subset \mathbb R^2\), which is 1 on E and 0 otherwise. If E is a set of finite perimeter, we denote its essential boundary by \(\partial ^* E\), see [2, Definition 3.60]. For \(r>0\) and \(x \in {\mathbb {R}}^2\), we denote by \(B_r(x)\) the open ball of radius r centered in x. For simplicity, we write \(B_r\) if \(x=0\). Given \(A \subset {\mathbb {R}}^2\), \(\tau \in {\mathbb {R}}^2\), and \(\lambda \in {\mathbb {R}}\), we define

$$\begin{aligned}&A+ \tau = \{x+\tau : x \in A \},\quad \lambda A = \{\lambda x : x \in A\} \text { and }\nonumber \\&(A)_\varepsilon =\{x+y :\, x \in A, y \in B_\varepsilon \}. \end{aligned}$$
(2.4)

For \(x_1,x_2 \in {\mathbb {R}}^2\), we define the line segment between \(x_1\) and \(x_2\) by

$$\begin{aligned} {[}x_1;x_2] =\big \{\lambda x_1 + (1-\lambda ) x_2 : \lambda \in {[}0,1]\big \}. \end{aligned}$$
(2.5)

By \(Q^\nu = \lbrace y \in \mathbb R^2:-\frac{1}{2} \leqq \langle y,\nu \rangle< \frac{1}{2}, -\frac{1}{2} \leqq \langle y,\nu ^\bot \rangle < \frac{1}{2} \rbrace \) we denote the half-open unit cube in \({\mathbb {R}}^2\) with center zero and two sides parallel to \(\nu \in {\mathbb {S}}^1\). Moreover, we define the half-cubes

$$\begin{aligned} Q^{\nu ,\pm } = \lbrace y \in Q^{\nu } :\pm \langle \nu , y\rangle \geqq 0\rbrace . \end{aligned}$$
(2.6)

Here and in what follows, we will frequently use the notation ± to indicate that a property holds for both signs \(+\) and −. In a similar fashion, for \(x \in {\mathbb {R}}^2\) and \(\rho >0\) we define \(Q^\nu _\rho (x) := x + \rho Q^\nu \) and \(Q^{\nu ,\pm }_\rho (x) := x + \rho Q^{\nu ,\pm }\). For \(\rho = 1\), we write \(Q^\nu (x)\) instead of \(Q^\nu _1(x)\) for simplicity. For \(\varepsilon >0\) and \(Q^\nu _\rho (x)\) we introduce the notation of boundary regions

$$\begin{aligned} \partial ^\pm _\varepsilon Q^\nu _\rho (x) = x + \left\{ y \in \overline{Q^\nu _{\rho + 10\varepsilon } {\setminus } Q^\nu _{\rho - 10\varepsilon }} :\pm \langle \nu , y \rangle \geqq 5 \varepsilon \right\} ; \end{aligned}$$
(2.7)

see also Fig. 3 for an illustration. For \(\rho = 1\), we write \(\partial ^\pm _\varepsilon Q^\nu (x)\) instead of \(\partial ^\pm _\varepsilon Q^\nu _1(x)\).

The triangular lattice We define the triangular lattice as the set of points given by

$$\begin{aligned} {\mathscr {L}}:=\left\{ p+q\omega : p,q\in {\mathbb {Z}} \right\} , \end{aligned}$$

where \(\omega := \frac{1}{2}+\frac{i}{2}\sqrt{3} \in \mathbb C\).

The set of lattice isometries We denote by \({\mathbb {A}}\) the set of rotations by angles in \([0,\frac{\pi }{3})\) equipped with the metric of the one-dimensional torus, that is, \({\mathbb {A}} = {\mathbb {R}} / \frac{\pi }{3} {\mathbb {Z}}\). In a similar fashion, we introduce the set of translations \({\mathbb {T}}= {\mathbb {R}}^2 / {\mathscr {L}} = {\mathbb {C}} / {\mathscr {L}}\). We observe that each translation \(\tau \in {\mathbb {T}}\) can be represented by a vector in

$$\begin{aligned} \{\lambda _1 + \lambda _2 \omega :0 \leqq \lambda _1<1,0 \leqq \lambda _2<1 \}. \end{aligned}$$
(2.8)

We introduce the set of lattice isometries by

$$\begin{aligned} {\mathcal {Z}}:= \big ({\mathbb {A}}\times {\mathbb {T}} \times \{1\}\big ) \cup \{{\mathbf {0}}\}, \end{aligned}$$
(2.9)

where for each \(\theta \in {\mathbb {A}}\) and \(\tau \in {\mathbb {T}}\) the triple \(z = (\theta ,\tau ,1) \in {\mathcal {Z}}\) represents the rotated and translated lattice

$$\begin{aligned} {\mathscr {L}}(z) = {\mathscr {L}}(\theta ,\tau ,1) := e^{i\theta } ({\mathscr {L}} + \tau ). \end{aligned}$$

Here, the entry 1 encodes that a lattice is present. On the contrary, \({\mathbf {0}}=(0,0,0) \in {\mathbb {A}}\times {\mathbb {T}} \times \{0\} \) represents the empty set, also referred to as vacuum in what follows. We set

$$\begin{aligned} {\mathscr {L}}({\mathbf {0}}) = \emptyset . \end{aligned}$$

Note that \({\mathbb {A}} \simeq {\mathbb {S}}^1 \) and \({\mathbb {T}} \simeq {\mathbb {S}}^1\times {\mathbb {S}}^1\). Therefore, the three-dimensional set \({\mathcal {Z}}\) can naturally be embedded into \({\mathbb {R}}^7\). We endow \({\mathcal {Z}}\) with the product topology, that is, \(z_j=(\theta _j,\tau _j,1) \rightarrow z=(\theta ,\tau ,1)\) if and only if \(\theta _j \rightarrow \theta \) in \({\mathbb {A}}\) and \(\tau _j\rightarrow \tau \) in \({\mathbb {T}}\). Moreover, \(z_j \rightarrow {\mathbf {0}}\) if and only if \(z_j = {\mathbf {0}}\) for all j large enough. For a set \(A \subset {\mathbb {R}}^2\), \(z \in {\mathcal {Z}}\), and a configuration X with \(E_\varepsilon (X) <+\infty \), we say that X coincides with the lattice \( \varepsilon {\mathscr {L}}(z)\) on A, written \(X= \varepsilon {\mathscr {L}}(z)\) on A, if

$$\begin{aligned} X \cap A = (\varepsilon {\mathscr {L}}(z)) \cap A. \end{aligned}$$
(2.10)
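The definitions of \({\mathbb {T}}\), \({\mathscr {L}}(z)\), and the coincidence condition (2.10) can be made concrete as follows. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, the vacuum element \({\mathbf {0}}\) is encoded as `None`, a lattice is encoded as a pair `(theta, tau)` with `tau` a complex number, the set A is passed as a membership predicate and is assumed to lie in a disc of known radius around the origin, and equality of points is tested up to a small tolerance.

```python
import cmath
import math

OMEGA = complex(0.5, math.sqrt(3) / 2)   # lattice vector omega


def reduce_translation(tau):
    """Representative of tau in the cell (2.8) of T = C / L."""
    lam2 = tau.imag / OMEGA.imag         # tau = lam1 + lam2 * omega
    lam1 = tau.real - lam2 * OMEGA.real
    return (lam1 % 1.0) + (lam2 % 1.0) * OMEGA


def lattice(z, eps, radius):
    """Points of eps * L(z) in the closed disc of the given radius around 0."""
    if z is None:
        return []                        # L(0) is the empty set
    theta, tau = z
    tau = reduce_translation(tau)        # L + tau only depends on tau mod L
    rot = cmath.exp(1j * theta)
    # |p + q*omega|^2 >= 3/4 * max(p, q)^2 and |tau| < 2 bound the indices.
    m = math.ceil((radius / eps + 2.0) * 2.0 / math.sqrt(3))
    points = []
    for p in range(-m, m + 1):
        for q in range(-m, m + 1):
            x = eps * rot * (p + q * OMEGA + tau)
            if abs(x) <= radius:
                points.append(x)
    return points


def coincides(X, z, eps, in_A, radius, tol=1e-9):
    """Check (2.10) on A, assuming A lies in the disc of the given radius."""
    left = [x for x in X if in_A(x)]
    right = [y for y in lattice(z, eps, radius) if in_A(y)]

    def covered(P, Q):
        return all(any(abs(p - q) <= tol * eps for q in Q) for p in P)

    return covered(left, right) and covered(right, left)


z = (0.3, 0.2 + 0.1j)
X = lattice(z, 0.5, 3.0)
print(coincides(X, z, 0.5, lambda x: abs(x) < 1.0, 1.0))     # same lattice
print(coincides(X, None, 0.5, lambda x: abs(x) < 1.0, 1.0))  # vacuum
```

Note that only the translation is reduced here. Replacing \(\theta \) by \(\theta + \frac{\pi }{3}\) changes the representative of the translation as well, since \(e^{i\pi /3}({\mathscr {L}} + \tau ) = {\mathscr {L}} + \omega \tau \).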

The state space For \(A \subset {\mathbb {R}}^2\), we introduce the space of piecewise constant functions \(PC(A;{\mathcal {Z}})\) with values in \({\mathcal {Z}}\) as functions of the form

$$\begin{aligned} u = \sum \limits _{j=1}^\infty \chi _{G_j} z_j, \end{aligned}$$
(2.11)

where \(\lbrace z_j \rbrace _j \subset {\mathcal {Z}} {\setminus } \lbrace {\mathbf {0}} \rbrace \) are pairwise distinct and \( G_j \subset A\) are pairwise disjoint sets satisfying \({\mathcal {L}}^2\big (\bigcup \nolimits _{j=1}^\infty G_j\big ) < \infty \) and

$$\begin{aligned} \sum \limits _{j=1}^\infty {\mathcal {H}}^1(\partial ^* G_j) < + \infty . \end{aligned}$$
(2.12)

Here, \(\lbrace G_j\rbrace _j\) represent the grains of the polycrystal and \(\lbrace z_j\rbrace _j\) the corresponding orientation and translation of the lattice. We remark that this space can be identified with

$$\begin{aligned} PC(A;{\mathcal {Z}})=\big \{u \in SBV(A;{\mathcal {Z}}):\, \nabla u =0, \, {\mathcal {L}}^2(\lbrace u \ne {\mathbf {0}} \rbrace )< + \infty , \, {\mathcal {H}}^1(J_u) <+\infty \big \}. \end{aligned}$$
(2.13)

Here, u is a function in \( SBV(A;{\mathcal {Z}})\) in the sense that \(u \in SBV(A;{\mathbb {R}}^7)\) and u takes values in \({\mathcal {Z}}\). The jump set of u is denoted by \(J_u\). The one-sided limits of u at a jump point will be indicated by \(u^+\) and \(u^-\) in what follows, and the normal will be denoted by \(\nu _u\). We refer to [2, Definition 4.21] for details on this space. In a similar fashion, we say \(u \in PC_{\mathrm{loc}}({\mathbb {R}}^2;{\mathcal {Z}})\) if \(u|_A \in PC(A;{\mathcal {Z}})\) for all compact sets \(A \subset {\mathbb {R}}^2\).

Identification of configurations with piecewise constant functions We now relate atomistic configurations X to the state space defined above. Consider \(x \in X\cap {\mathscr {L}}\) such that \({\mathcal {N}}(x) \subset {\mathscr {L}}\). Then, we define the open lattice Voronoi cell of x by

$$\begin{aligned} V(x)= x + \frac{1}{\sqrt{3}} e^{i\pi /6} \, \mathrm{int} \big (\mathrm {conv} \{\pm 1,\pm \omega ,\pm \omega ^2\}\big ), \end{aligned}$$
(2.14)

where \(\mathrm {conv}\lbrace \cdot \rbrace \) denotes the convex hull of a point set, and int the interior. In a similar fashion, if x and the points in its neighborhood \({\mathcal {N}}_\varepsilon (x)\) lie in a scaled rotated and translated lattice \(\varepsilon {\mathscr {L}}(z)\), for \(\varepsilon >0\) and \(z = (\theta ,\tau ,1) \in {\mathcal {Z}}\), we define \(V^z_\varepsilon (x) = x + e^{i\theta } \varepsilon V(0)\). We also point out the implicit dependence on \(\tau \) here, since \(x = e^{i\theta }(v +\tau )\) for some \(v\in {\mathscr {L}}\).
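The normalization in (2.14) can be checked numerically: the cell is a regular hexagon with circumradius \(1/\sqrt{3}\), its area equals the area \(\sqrt{3}/2\) of a fundamental cell of \({\mathscr {L}}\), and the midpoints of its edges are the midpoints of the lattice bonds. The following Python sketch is only an illustration; the function names are hypothetical and points are stored as complex numbers.

```python
import cmath
import math

OMEGA = complex(0.5, math.sqrt(3) / 2)   # lattice vector omega


def voronoi_vertices(x=0j, theta=0.0, eps=1.0):
    """Vertices of the closure of x + e^{i theta} eps V(0), cf. (2.14)."""
    # The points +-1, +-omega, +-omega^2 in counterclockwise order.
    hexagon = [1, OMEGA, OMEGA ** 2, -1, -OMEGA, -OMEGA ** 2]
    scale = eps / math.sqrt(3) * cmath.exp(1j * (theta + math.pi / 6))
    return [x + scale * h for h in hexagon]


def polygon_area(vertices):
    """Shoelace formula for complex vertices listed counterclockwise."""
    n = len(vertices)
    return 0.5 * sum((vertices[i].conjugate() * vertices[(i + 1) % n]).imag
                     for i in range(n))


v = voronoi_vertices()
print(polygon_area(v), math.sqrt(3) / 2)   # the two areas agree up to rounding
print((v[5] + v[0]) / 2)                   # edge midpoint, close to 0.5
```

The second printed value is the midpoint of the edge separating the cell of 0 from the cell of the neighbor 1, that is, the midpoint of the bond \([0;1]\).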

Given a configuration X with \(E_\varepsilon (X) < +\infty \), we now identify X with a suitable function \(u\in PC({\mathbb {R}}^2;{\mathcal {Z}})\). Since \(E_\varepsilon (X) < +\infty \), we have \(\#{\mathcal {N}}_\varepsilon (x) \leqq 6\) for all \(x \in X\) with equality only if \(\lbrace x\rbrace \cup {\mathcal {N}}_\varepsilon (x) \subset e^{i \theta (x)}\varepsilon ({\mathscr {L}} + \tau (x))\) for a unique pair \((\theta (x),\tau (x))\in {\mathbb {A}} \times {\mathbb {T}}\). We set

$$\begin{aligned} z(x) = { \big (\theta (x),\tau (x),1\big ) } \in {\mathcal {Z}} \ \ \ \hbox { for all } x \in X \hbox { with } \# {\mathcal {N}}_\varepsilon (x) = 6 \end{aligned}$$

and define \(u_\varepsilon ^X:{\mathbb {R}}^2\rightarrow {\mathcal {Z}}\) by

$$\begin{aligned} u_\varepsilon ^X(x) := {\left\{ \begin{array}{ll} z(x) \text { on }V^{z(x)}_\varepsilon (x)&{} \text { if }\,\, x \in X\text { with } \#{\mathcal {N}}_\varepsilon (x)=6 ,\\ {\mathbf {0}} &{} \text { else.} \end{array}\right. } \end{aligned}$$
(2.15)

In what follows, if no confusion may arise, we write \(u_\varepsilon \) instead of \(u_\varepsilon ^X\). We note that this definition is well posed in the sense that \(V^{z(x_1)}_\varepsilon (x_1) \cap V^{z(x_2)}_\varepsilon (x_2) = \emptyset \) for all \(x_1,x_2 \in X\), \(x_1 \ne x_2\), with \(\#{\mathcal {N}}_\varepsilon (x_1) = \#{\mathcal {N}}_\varepsilon (x_2) = 6\). In fact, if this were not the case, one of the six atoms in \({\mathcal {N}}_\varepsilon (x_1)\) (forming a regular hexagon on \(\partial B_\varepsilon (x_1)\)) would have distance smaller than \(\varepsilon \) to \(x_2\). This contradicts \(E_\varepsilon (X) < + \infty \). Clearly, \(u_\varepsilon \) as defined in (2.15) lies in \(PC({\mathbb {R}}^2; {\mathcal {Z}})\).

The function \(u_\varepsilon \) for some finite energy configuration X is illustrated in Fig. 2. We point out that the translation \(\tau (x)\) induces a shift of the Voronoi cells by the vector \(\varepsilon e^{i\theta (x)} \tau (x)\). This is the reason why we call the variable \(\tau \) a micro-translation.

Fig. 2 A function \(u_\varepsilon \) defined in (2.15): the different regions \(\{u=z\}\) with \(z \ne {\mathbf {0}}\) (here illustrated in different shades of gray) are made of unions of regular hexagons. The complement of those regions is the set \(\{u={\mathbf {0}}\}\)
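The identification of atoms with lattice isometries can be sketched in a few lines. The Python code below is only an illustration under stated assumptions: the function names are hypothetical, points are stored as complex numbers, neighbors are detected up to a small relative tolerance, and the returned translation is the representative in the cell (2.8). For values of \(\theta \) or \(\tau \) close to the wrap-around of the tori \({\mathbb {A}}\) and \({\mathbb {T}}\), rounding may select a different representative, so comparisons should then use the torus metric.

```python
import cmath
import math

OMEGA = complex(0.5, math.sqrt(3) / 2)   # lattice vector omega


def reduce_translation(tau):
    """Representative of tau in the cell (2.8) of T = C / L."""
    lam2 = tau.imag / OMEGA.imag         # tau = lam1 + lam2 * omega
    lam1 = tau.real - lam2 * OMEGA.real
    return (lam1 % 1.0) + (lam2 % 1.0) * OMEGA


def identify(X, eps, tol=1e-9):
    """Return {index: (theta, tau, 1)} for all atoms with six neighbors."""
    z = {}
    for i, x in enumerate(X):
        nbrs = [y for j, y in enumerate(X)
                if j != i and abs(abs(x - y) - eps) <= tol * eps]
        if len(nbrs) != 6:
            continue                     # no Voronoi cell is attached to x
        # The six neighbors sit at angles theta + k*pi/3, so any of them
        # determines theta modulo pi/3.
        theta = cmath.phase(nbrs[0] - x) % (math.pi / 3)
        # x = eps * e^{i theta} (v + tau) with v in L gives tau modulo L.
        tau = reduce_translation(cmath.exp(-1j * theta) * x / eps)
        z[i] = (theta, tau, 1)
    return z


def patch(theta, tau, eps, k):
    """Hexagonal patch of radius k cut out of eps * e^{i theta} (L + tau)."""
    rot = cmath.exp(1j * theta)
    return [eps * rot * (p + q * OMEGA + tau)
            for p in range(-k, k + 1) for q in range(-k, k + 1)
            if abs(p + q) <= k]


X = patch(theta=0.2, tau=0.3 + 0.1j, eps=0.05, k=2)
for i, (theta, tau, flag) in identify(X, eps=0.05).items():
    print(i, round(theta, 6), round(tau.real, 6), round(tau.imag, 6), flag)
```

In this example only the seven interior atoms of the patch have six neighbors. Each of them recovers the same pair \((\theta ,\tau )\) up to rounding, while the twelve boundary atoms carry no Voronoi cell, so that \(u_\varepsilon = {\mathbf {0}}\) outside the seven interior cells.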

Convergence Let \(\{X_\varepsilon \}_\varepsilon \) be a sequence of configurations. We say that \(X_\varepsilon \rightarrow u\) in \(L^1_{\mathrm {loc}}({\mathbb {R}}^2)\) if \(u_\varepsilon \rightarrow u\) in \(L^1_{\mathrm {loc}}({\mathbb {R}}^2;{\mathcal {Z}})\), where \(u_\varepsilon \) is given by (2.15) for \(X_\varepsilon \).

2.3 Main Results

We now formulate our main results. We start with a compactness result for sequences of configurations with bounded energy. Recall the definition for convergence of configurations in Sect. 2.2.

Theorem 2.1

(Compactness) Let \(\{X_\varepsilon \}_\varepsilon \) be a sequence of configurations with

$$\begin{aligned} \sup \limits _{\varepsilon >0} E_\varepsilon (X_\varepsilon ) <+\infty . \end{aligned}$$

Then, there exists a subsequence \( \{\varepsilon _k\}_{k\in {\mathbb {N}}}\) with \(\varepsilon _k \rightarrow 0\) and a function \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) such that \(X_{\varepsilon _k} \rightarrow u\) in \(L^1_{\mathrm{loc}}({\mathbb {R}}^2)\) as \(k \rightarrow +\infty \).

For \(\varepsilon >0\) and \(\nu \in {\mathbb {S}}^1\), recall the definition of \(\partial ^\pm _\varepsilon Q^\nu _\rho \) in (2.7). Recall also the coincidence with a lattice in (2.10). The following proposition introduces the density \(\varphi :{\mathcal {Z}} \times {\mathcal {Z}} \times {\mathbb {S}}^1\rightarrow [0,+\infty )\) which appears in our continuum limiting functional (see Fig. 3 for an illustration):

Proposition 2.2

(Density) For every \(z^+,z^- \in {\mathcal {Z}}\), \(\nu \in {\mathbb {S}}^1\), \(x_0 \in \mathbb R^2\), and \(\rho >0\) there exists

$$\begin{aligned} \varphi (z^+,z^-,\nu )= \lim _{\varepsilon \rightarrow 0} \frac{1}{\rho }\min \Big \{E_\varepsilon \big (X,Q^\nu _\rho (x_0)\big ):\, X = \varepsilon {\mathscr {L}}(z^\pm ) \text { on } \partial _\varepsilon ^\pm Q^\nu _\rho (x_0) \Big \}, \end{aligned}$$
(2.16)

and this limit is independent of \(x_0\) and \(\rho \).

Fig. 3 Illustration of a competitor for the cell-problem on \(Q^\nu _\rho \) in the definition of \(\varphi \). On the light gray hatched and dark gray regions we have \(X=\varepsilon {\mathscr {L}}(z^\pm )\), respectively. We point out that the competitor is prescribed in a small neighborhood \( \partial _\varepsilon ^- Q^\nu _\rho \cup \partial _\varepsilon ^+ Q^\nu _\rho \) both inside and outside of the cube. (The thickness of the neighborhood is larger than the lattice spacing, see (2.7). Here, for illustration purposes, it is drawn with thickness \(2\varepsilon \) instead of \(10\varepsilon \).)

The limiting functional \(E :PC({\mathbb {R}}^2;{\mathcal {Z}}) \rightarrow [0,+\infty )\) is defined by

$$\begin{aligned} E(u) = \int _{J_u} \varphi (u^+(x),u^-(x),\nu _u(x))\,\mathrm {d}{\mathcal {H}}^1(x). \end{aligned}$$
(2.17)

In view of (2.13), functions in \(PC(\mathbb R^2;{\mathcal {Z}})\) lie in SBV, and therefore \(u^+\), \(u^-\), and \(\nu _u\) are well defined. The following statement shows that E can be interpreted as the effective limit of the atomistic energies \(E_\varepsilon \) in the sense of \(\Gamma \)-convergence:

Theorem 2.3

\((\Gamma \)-convergence) It holds that \(E = \Gamma (L^1_{\mathrm{loc}})\text {-}\lim _{\varepsilon \rightarrow 0} E_\varepsilon \); more precisely,

  1. (i)

    \((\Gamma \)-liminf inequality) For each \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) and each sequence \(\{X_{\varepsilon }\}_{\varepsilon }\) with \(X_{\varepsilon } \rightarrow u\) in \(L^1_{\mathrm{loc}}({\mathbb {R}}^2)\) it holds that

    $$\begin{aligned} \liminf _{\varepsilon \rightarrow 0} E_{\varepsilon }(X_{\varepsilon }) \geqq E(u). \end{aligned}$$
  2. (ii)

    \((\Gamma \)-limsup inequality) For each \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) we find configurations \(\lbrace X_{\varepsilon } \rbrace _\varepsilon \) such that \(X_{\varepsilon } \rightarrow u\) in \(L^1_{\mathrm{loc}}({\mathbb {R}}^2)\) and

    $$\begin{aligned} \lim _{\varepsilon \rightarrow 0} E_{\varepsilon }(X_{\varepsilon }) = E(u). \end{aligned}$$

Here and in the sequel, we follow the usual convention that convergence of the continuous parameter \(\varepsilon \rightarrow 0\) stands for convergence of arbitrary sequences \(\lbrace \varepsilon _k \rbrace _k\) with \(\varepsilon _k \rightarrow 0\) as \(k \rightarrow +\infty \).

Remark 2.4

(Extension to \(L^1\)) Defining \(E_\varepsilon :L^1({\mathbb {R}}^2;{\mathcal {Z}}) \rightarrow [0,+\infty ]\) by

$$\begin{aligned} E_\varepsilon (u) = {\left\{ \begin{array}{ll} E_\varepsilon (X) &{}\text {if there exists } X \text { such that } u = u_\varepsilon ^X, \\ +\infty &{}\text {otherwise,} \end{array}\right. } \end{aligned}$$

and extending E to all of \(L^1({\mathbb {R}}^2;{\mathcal {Z}})\) by setting \(E(u) = +\infty \) if \(u \in L^1({\mathbb {R}}^2;{\mathcal {Z}}) {\setminus } PC({\mathbb {R}}^2;{\mathcal {Z}})\), in view of Theorem 2.1, this indeed implies \(\Gamma (L^1_{\mathrm{loc}})\text {-}\lim _{\varepsilon \rightarrow 0} E_\varepsilon = E\).

We close this section by providing properties of the density \(\varphi \). To this end, we introduce the function \(\varphi _{\mathrm{hex}}:\mathbb R^2 \rightarrow [0,+\infty )\) defined by

$$\begin{aligned} \varphi _{\mathrm{hex}}(\nu ) = \frac{2}{\sqrt{3}} \sum \limits _{k=1}^3 |\langle \nu , \omega ^k \rangle |. \end{aligned}$$
(2.18)

Note that \(\varphi _{\mathrm{hex}}\) is a Finsler norm whose unit ball is a regular hexagon in \(\mathbb R^2\) with vertices in \(\frac{1}{2} e^{i\pi /6}\{\pm 1,\pm \omega ,\pm \omega ^2\}\), cf. [3, 20].
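The description of the unit ball can be checked numerically. The following minimal sketch assumes that \(\omega = e^{i\pi /3}\) (the definition of \(\omega \) is not restated in this section) and identifies \({\mathbb {R}}^2\) with \({\mathbb {C}}\); the helper names `inner` and `phi_hex` are introduced only for this illustration. It evaluates (2.18) at the six claimed vertices and at the midpoints of adjacent vertices, all of which should lie on the level set \(\{\varphi _{\mathrm{hex}} = 1\}\) if the unit ball is the stated hexagon.

```python
import cmath
import math

# Assumption: omega = exp(i*pi/3); R^2 is identified with the complex plane.
OMEGA = cmath.exp(1j * math.pi / 3)

def inner(a: complex, b: complex) -> float:
    """Euclidean inner product of a and b viewed as vectors in R^2."""
    return (a * b.conjugate()).real

def phi_hex(nu: complex) -> float:
    """phi_hex(nu) = (2/sqrt(3)) * sum_{k=1}^{3} |<nu, omega^k>|, cf. (2.18)."""
    return 2.0 / math.sqrt(3.0) * sum(abs(inner(nu, OMEGA ** k)) for k in (1, 2, 3))

# Claimed vertices of the unit ball: (1/2) e^{i*pi/6} {+-1, +-omega, +-omega^2}.
vertices = [0.5 * cmath.exp(1j * math.pi / 6) * s * OMEGA ** k
            for k in (0, 1, 2) for s in (1, -1)]
vertex_values = [phi_hex(v) for v in vertices]

# Midpoints of adjacent vertices lie on the boundary of the unit ball
# exactly when the boundary is a straight edge between those vertices.
ordered = sorted(vertices, key=cmath.phase)
midpoint_values = [phi_hex((ordered[j] + ordered[(j + 1) % 6]) / 2) for j in range(6)]

print(vertex_values)
print(midpoint_values)
```

Since \(\varphi _{\mathrm{hex}}\) is positively 1-homogeneous, the value 1 at the vertices corresponds to the value 2 for unit normals pointing in the vertex directions.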

Theorem 2.5

(Properties of \(\varphi )\) Let \(\varphi \) be the density given in Proposition 2.2, extended to a function defined on \({\mathcal {Z}} \times {\mathcal {Z}} \times {\mathbb {R}}^2\) which is positively 1-homogeneous in the third variable. Then \(\varphi \) satisfies the following properties:

  1. (i)

    (Solid–vacuum energy) There holds \(\varphi (z,{\mathbf {0}},\nu ) = \varphi ({\mathbf {0}},z,\nu ) = \varphi _{\mathrm {hex}}(e^{-i\theta } \nu )\) for all \(z = (\theta ,\tau ,1) \in {\mathcal {Z}}{\setminus } \lbrace {\mathbf {0}} \rbrace \) and \(\nu \in {\mathbb {S}}^1\).

  2. (ii)

    (Solid–solid energy) There exists a null-set \({\mathcal {N}}\) in \(({\mathcal {Z}}{\setminus } \lbrace {\mathbf {0}}\rbrace )^2\) (with respect to its six-dimensional Haar measure) such that for all pairs \((z^+,z^-) \in ({\mathcal {Z}}{\setminus } \lbrace {\mathbf {0}}\rbrace )^2 {\setminus } {\mathcal {N}} \), \(z^+ \ne z^-\), and \(\nu \in {\mathbb {S}}^1\) it holds that

    $$\begin{aligned} \varphi (z^+,z^-,\nu )= \varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ), \end{aligned}$$

    and for all \((z^+,z^-)\in {\mathcal {N}}\), \(z^+ \ne z^-\), and \(\nu \in {\mathbb {S}}^1\) it holds that

    $$\begin{aligned} \frac{1}{2}\varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \frac{1}{2}\varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ) \leqq \varphi (z^+,z^-,\nu )< \varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ), \end{aligned}$$

    where we write \(z^+ =(\theta ^+,\tau ^+,1)\) and \(z^- =(\theta ^-,\tau ^-,1)\).

    Moreover, there are exceptional sets \({{\mathcal {G}}_{{\mathbb {A}}}} \subset {\mathbb {A}}\) of angles and, for each \(\theta \in {{\mathcal {G}}_{{\mathbb {A}}}}\), \({{\mathcal {G}}_{{\mathbb {T}}}}(\theta ) \subset {\mathbb {R}}^2\) of translation vectors such that \({{\mathcal {G}}_{{\mathbb {A}}}}\) is countable and each \({{\mathcal {G}}_{{\mathbb {T}}}}(\theta ) \) is contained in a finite union of spheres, with

    $$\begin{aligned} {\mathcal {N}} \subset \big \{ (z^+, z^-) \in ({\mathcal {Z}}{\setminus } \lbrace {\mathbf {0}}\rbrace )^2 :\, \theta ^+-\theta ^- \in {{\mathcal {G}}_{{\mathbb {A}}}},\, e^{i\theta ^+}\tau ^+-e^{i\theta ^-}\tau ^- \in {{\mathcal {G}}_{{\mathbb {T}}}}(\theta ^+-\theta ^-)\big \}. \end{aligned}$$
  3. (iii)

    (Convexity) The mapping \(\nu \mapsto \varphi (z^+,z^-,\nu ) \) is convex for all \(z^+,z^-\in {\mathcal {Z}}\).

  4. (iv)

    (Rotational invariance) For all \(z^\pm =(\theta ^\pm ,\tau ^\pm ,1)\), \(\nu \in {\mathbb {S}}^1\), and \(\theta \in {\mathbb {A}}\) it holds that

    $$\begin{aligned} \varphi \big ((\theta ^++\theta ,\tau ^+,1),(\theta ^-+\theta ,\tau ^-,1),e^{i\theta }\nu \big ) =\varphi \big ((\theta ^+,\tau ^+,1),(\theta ^-,\tau ^-,1),\nu \big ). \end{aligned}$$
  5. (v)

    (Translational invariance) For all \(z^\pm =(\theta ^\pm ,\tau ^\pm ,1)\), \(\nu \in {\mathbb {S}}^1\), and \(\tau \in {\mathbb {T}}\) it holds that

    $$\begin{aligned} \varphi \Big (\big (\theta ^+,\tau ^++e^{-i\theta ^+}\tau ,1\big ), \big (\theta ^-,\tau ^-+e^{-i\theta ^-}\tau ,1\big ),\nu \Big )=\varphi \big ((\theta ^+,\tau ^+,1),(\theta ^-,\tau ^-,1),\nu \big ). \end{aligned}$$

We note that the interaction with vacuum, see property (i), has already been addressed in [3, 20]. A main novelty of our work lies in the characterization (ii). For explicit choices of the sets \({{\mathcal {G}}_{{\mathbb {A}}}}\) and \({{\mathcal {G}}_{{\mathbb {T}}}}(\theta )\) we refer to (6.2) and the paragraph above Lemma 7.6, respectively. In particular, (ii) states that generically the surface energy between two lattices is the same as if each of the two lattices interacted with vacuum. In this case, the continuum energy E of a function \(u = \sum \nolimits _{j=1}^\infty \chi _{G_j} z_j\) corresponds to the crystalline perimeter of the grains \(\lbrace G_j\rbrace _j\), induced by \(\varphi _{\mathrm{hex}}\). In the non-generic case \((z^+,z^-) \in {\mathcal {N}}\), the two lattices \({\mathscr {L}}(z^+)\) and \({\mathscr {L}}(z^-)\) have many touching pairs (that is, pairs of points with distance 1) which reduce the energy (2.3). Optimal interfaces for both cases for a normal vector \(\nu \) are illustrated in Fig. 4. We remark that the exact characterization of \(\varphi \) seems to be a difficult issue which is beyond the scope of the present analysis. In fact, counting the number of touching pairs depending on the relative orientation of the two lattices seems to be a non-trivial number theoretic problem, see Remark 2.6 and Fig. 5 below for some details in that direction. We remark that the properties of \({{\mathcal {G}}_{{\mathbb {A}}}}\) and \({{\mathcal {G}}_{{\mathbb {T}}}}(\theta )\) imply that \({\mathcal {N}}\) is of Hausdorff dimension at most four. Finally, note that (iv) and (v) express the fact that both the atomistic and the continuum model are frame indifferent.

More precisely, our proof in Lemma 6.2 shows that the non-degeneracy in Theorem 2.5(ii) above can be quantified: for every \(\eta > 0\) there are only finitely many differences \(\theta \) of lattice rotations and, correspondingly, finitely many spheres containing the difference of lattice shifts for which

$$\begin{aligned} \varphi (z^+,z^-,\nu ) \leqq \varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ) - \eta . \end{aligned}$$

These numbers only depend on \(\eta \). Moreover, we remark that the lower bound provided for \(\varphi \) is attained, for example, for \(z^- = (0,0,1)\), \(z^+ = (0,i,1)\), and \(\nu = i\), see Fig. 4c. (Consider \(X = \{x\in \varepsilon {\mathscr {L}}(0,0,1):\langle x, i \rangle \leqq 0\} \cup \{x\in \varepsilon {\mathscr {L}}(0,i,1):\langle x, i \rangle \geqq \varepsilon \}\) in (2.16).)
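The example attaining the lower bound can be checked by a direct count on a finite window. The sketch below is an illustration under stated assumptions, not part of the proof: we take \(\varepsilon = 1\), assume \(\omega = e^{i\pi /3}\), assume that \({\mathcal {N}}_\varepsilon (x)\) consists of the atoms at distance exactly \(\varepsilon \) from x, and assume that the energy has the form \(\frac{1}{2}\sum _x \varepsilon (6-\#{\mathcal {N}}_\varepsilon (x))\) suggested by the measures \(\mu _\varepsilon \) in Section 3.3. The helper names `lattice` and `half_energy` and the window sizes are hypothetical choices. The count compares each half-lattice facing vacuum with the combined configuration X.

```python
import math

ROW = math.sqrt(3.0) / 2.0  # vertical spacing between rows of the unit triangular lattice

def lattice(shift_y, n):
    """Points p + q*omega + i*shift_y with |p|, |q| <= n (omega = exp(i*pi/3) assumed)."""
    return [(p + 0.5 * q, ROW * q + shift_y)
            for p in range(-n, n + 1) for q in range(-n, n + 1)]

def half_energy(points, window=3.0, height=3.0, tol=1e-9):
    """(1/2) * sum of (6 - #neighbours at distance 1) over atoms with |x| <= window, |y| <= height."""
    total = 0
    for (x, y) in points:
        if abs(x) <= window and abs(y) <= height:
            nbrs = sum(1 for (a, b) in points
                       if abs(math.hypot(a - x, b - y) - 1.0) < tol)
            total += 6 - nbrs
    return 0.5 * total

# Lower half of L(0,0,1) and upper half of L(0,i,1), as in the configuration X above.
lower = [pt for pt in lattice(0.0, 12) if pt[1] <= 1e-9]        # <x, i> <= 0
upper = [pt for pt in lattice(1.0, 12) if pt[1] >= 1.0 - 1e-9]  # <x, i> >= 1

vacuum_lower = half_energy(lower)          # lower crystal facing vacuum
vacuum_upper = half_energy(upper)          # upper crystal facing vacuum
interface = half_energy(lower + upper)     # both crystals, touching along the interface

print(vacuum_lower, vacuum_upper, interface)
```

In this window each surface atom misses two neighbours when facing vacuum and only one in the combined configuration, so the interface count equals half of the sum of the two vacuum counts, in line with the lower bound in Theorem 2.5(ii).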

Fig. 4

Different scenarios of optimal interfaces for a fixed normal \(\nu \) and different lattices \({\mathscr {L}}(z^\pm )\). The dark gray and white points form the lattice \({\mathscr {L}}(z^+)\) and the lattice \({\mathscr {L}}(z^-)\), respectively. Edges are depicted between points of distance 1. (a) Two lattices \({\mathscr {L}}(z^\pm )\) for which \(\varphi \) is less than twice the interaction energy with the vacuum. (b) Two lattices \({\mathscr {L}}(z^\pm )\) for which \(\varphi \) is equal to twice the interaction energy with the vacuum. (c) Two lattices for which the lower bound in Theorem 2.5(ii) is attained.

Remark 2.6

We finally point out that for \(\theta ^+ - \theta ^- \in {{\mathcal {G}}_{{\mathbb {A}}}}\) and \(e^{i \theta ^+} \tau ^+ - e^{i \theta ^-} \tau ^- \in {{\mathcal {G}}_{{\mathbb {T}}}}(\theta ^+-\theta ^-)\) the calculation of \(\varphi \) seems to be a difficult problem. In fact, for \(e^{i(\theta ^+-\theta ^-)} =\frac{v_1}{v_2}\) with \(v_1,v_2 \in {\mathscr {L}}\) and \(|v_1|=|v_2|\), depending on the factorization of \(v_1,v_2\) in \({\mathscr {L}}\), there may be points \((x,y) \in {\mathscr {L}}(z^+) \times {\mathscr {L}}(z^-)\) such that \(x,y\notin {\mathscr {L}}(z^+) \cap {\mathscr {L}}(z^-)\) and \(|x-y|=1\). If this is the case, the relative position of two such atoms is fixed through the prime factors of \(v_1,v_2\), respectively. This leads to two major challenges in the calculation of \(\varphi \): (i) The characterization of points \((x,y) \in {\mathscr {L}}(z^+)\times {\mathscr {L}}(z^-)\) such that \(|x-y|=1\) depending on the relative orientation \(e^{i(\theta ^+-\theta ^-)}\) of the two lattices seems to be a non-trivial number theoretic problem. (ii) Even after the characterization of the set of points \((x,y) \in {\mathscr {L}}(z^+)\times {\mathscr {L}}(z^-)\) such that \(|x-y|=1\) for different normals \(\nu \) to the interface, it is not always clear whether it is energetically favorable to include such points in the construction of the optimal interface, due to their relative orientation. Such a situation is illustrated in Fig. 5.

Fig. 5

Two lattices \({\mathscr {L}}(z^\pm )\) for which \(\varphi \) is less than twice the interaction energy with vacuum. The dark gray points form the lattice \({\mathscr {L}}(z^+)\) and the white points the lattice \({\mathscr {L}}(z^-)\). The black and light gray points are those that are of distance 1 to the other lattice, as emphasized by an edge between them

The compactness and \(\Gamma \)-convergence results will be proved in Section 3. The properties of several cell formulas related to \(\varphi \), which are fundamental for the proofs, are postponed to Sections 4–7. Finally, the proofs of Proposition 2.2 and Theorem 2.5 are given in Section 7.2.

3 Proof of the Main Results

This section is devoted to the proofs of our main results. We start with some preliminary properties. Then we prove compactness and finally we address the \(\Gamma \)-convergence result.

3.1 Preliminaries

We state and prove some elementary properties of the family \({E}_\varepsilon \). Recall the representation of the energy in (2.3) and the definition of sets in (2.4).

Lemma 3.1

(Properties of \(E_\varepsilon )\) Let \(\varepsilon >0\) and let X be a configuration with \(E_\varepsilon (X) < + \infty \). Then it holds that

  1. (i)

    \(E_\varepsilon (e^{i\theta }X+\tau ,e^{i\theta }A+\tau )=E_\varepsilon (X,A)\) for all \(\theta \in [0,2\pi )\), \(\tau \in {\mathbb {R}}^2\), and \(A \subset {\mathbb {R}}^2\),

  2. (ii)

    \(E_{\lambda \varepsilon }(\lambda X,\lambda A) = \lambda E_\varepsilon (X,A)\) for all \(\lambda >0\) and \(A \subset {\mathbb {R}}^2\),

  3. (iii)

    \(E_\varepsilon (X,A)\leqq E_\varepsilon (X,B)\) for all \(A \subset B \subset {\mathbb {R}}^2\),

  4. (iv)

    \(E_\varepsilon (X,A \cup B) = E_\varepsilon (X,B)+E_\varepsilon (X,A)\) for all \(A,B \subset {\mathbb {R}}^2\) with \(A \cap B =\emptyset \),

  5. (v)

    There exists \(C>0\) such that for all \(A \subset {\mathbb {R}}^2\) there holds \(\#(X \cap A) \leqq C{\mathcal {L}}^2((A)_\varepsilon )/\varepsilon ^2\).

Proof

Proof of \(\mathrm {(i)}\): Given \(\theta \in [0,2\pi )\) and \(\tau \in {\mathbb {R}}^2\), we define \({\tilde{x}} = e^{i\theta }x + \tau \) for each \(x \in {\mathbb {R}}^2\). The statement follows by noting that \(|{\tilde{x}}-{\tilde{y}}|=|x-y|\) for all \(x,y \in {\mathbb {R}}^2\) and \({\tilde{x}} \in e^{i\theta }A + \tau \) if and only if \(x \in A\). This implies \(y \in {\mathcal {N}}_\varepsilon (x)\) if and only if \({\tilde{y}} \in {\mathcal {N}}_\varepsilon ({\tilde{x}})\).

Proof of \(\mathrm {(ii)}\): For \(\lambda >0\) and \(x \in {\mathbb {R}}^2\), we define \( x_\lambda =\lambda x\). Clearly, we have \(|x_\lambda -y_\lambda |= \lambda |x-y|\) for all \(x,y \in {\mathbb {R}}^2\) and \( x_\lambda \in \lambda A\) if and only if \(x\in A\). This implies \(y_\lambda \in {\mathcal {N}}_{\lambda \varepsilon }(x_\lambda )\) if and only if \(y \in {\mathcal {N}}_\varepsilon (x)\).

Proof of \(\mathrm {(iii)}\): This statement follows from the fact that for all configurations X with finite energy and all \(x \in X\) we have \(6-\#{\mathcal {N}}_\varepsilon (x)\geqq 0\) by (2.2).

Proof of \(\mathrm {(iv)}\): This follows from the fact that, if \(A\cap B=\emptyset \), each term of the summation on the left-hand side also occurs on the right-hand side and vice versa.

Proof of \(\mathrm {(v)}\): Since X is a configuration with finite energy, there holds \(|x-y| \geqq \varepsilon \) for all \(x,y \in X\), \(x\ne y\). Therefore, \(B_{\varepsilon /2}(x) \cap B_{\varepsilon /2}(y) = \emptyset \) for all \(x,y \in X\), \(x\ne y\). By (2.4), we obtain \(\bigcup \nolimits _{x \in X \cap A} B_{\varepsilon /2}(x) \subset (A)_\varepsilon \) and therefore

$$\begin{aligned} \pi \varepsilon ^2/4 \, \#(X\cap A) = {\mathcal {L}}^2\Big (\bigcup \limits _{x \in X \cap A} B_{\varepsilon /2}(x) \Big ) \leqq {\mathcal {L}}^2 \big ((A)_\varepsilon \big ). \end{aligned}$$

From this the claim follows with \(C= 4/\pi \). \(\square \)
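Properties (ii) and (v) lend themselves to a quick numerical sanity check on a finite patch of the triangular lattice. The sketch below rests on assumptions that are not restated in this section: the energy (2.3) is taken to be \(\frac{1}{2}\sum _{x \in X\cap A}\varepsilon (6-\#{\mathcal {N}}_\varepsilon (x))\) with \({\mathcal {N}}_\varepsilon (x)\) the atoms at distance exactly \(\varepsilon \) from x, the set \((A)_\varepsilon \) from (2.4) is taken to be the \(\varepsilon \)-neighborhood of A, and \(\omega = e^{i\pi /3}\). The set A is a disc centered at the origin, and the names `triangular_patch` and `energy` as well as all numerical values are hypothetical.

```python
import math

def triangular_patch(eps, n):
    """Finite patch {eps * (p + q*omega) : |p|, |q| <= n}, omega = exp(i*pi/3) assumed."""
    return [(eps * (p + 0.5 * q), eps * math.sqrt(3.0) / 2.0 * q)
            for p in range(-n, n + 1) for q in range(-n, n + 1)]

def energy(points, eps, radius, tol=1e-9):
    """Assumed form of (2.3): (1/2) * sum over atoms in the disc B_radius of eps * (6 - #N_eps(x))."""
    total = 0.0
    for (x, y) in points:
        if math.hypot(x, y) <= radius:
            nbrs = sum(1 for (a, b) in points
                       if abs(math.hypot(a - x, b - y) - eps) < tol * eps)
            total += 0.5 * eps * (6 - nbrs)
    return total

eps, lam, radius = 0.1, 2.5, 0.55
X = triangular_patch(eps, 6)
X_scaled = [(lam * x, lam * y) for (x, y) in X]

# Property (ii): E_{lam*eps}(lam*X, lam*A) = lam * E_eps(X, A).
e_base = energy(X, eps, radius)
e_scaled = energy(X_scaled, lam * eps, lam * radius)

# Property (v) with C = 4/pi: the eps-neighborhood of the disc has area pi*(radius+eps)^2.
count = sum(1 for (x, y) in X if math.hypot(x, y) <= radius)
bound = (4.0 / math.pi) * math.pi * (radius + eps) ** 2 / eps ** 2

print(e_base, e_scaled, count, bound)
```

The disc is chosen large enough to contain some atoms on the boundary of the patch, so that the energy is positive and the scaling check is not vacuous.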

The following scaling property will be instrumental:

Lemma 3.2

(Scaling) For \(\varepsilon >0\) and \(\lambda >0\), consider configurations \(X_\varepsilon \) satisfying \(E_\varepsilon (X_\varepsilon ) < +\infty \) together with the rescaled configurations \(\lambda X_\varepsilon \). By \(u_{\lambda \varepsilon }^{\lambda }\) and \(u_\varepsilon \) we denote the functions corresponding to \(\lambda X_\varepsilon \) and \(X_\varepsilon \), respectively, as defined in (2.15). Then, there holds

$$\begin{aligned} u_{\lambda \varepsilon }^{\lambda }(\lambda x)=u_\varepsilon (x) \text { for all }x \in {\mathbb {R}}^2. \end{aligned}$$
(3.1)

Moreover, for each bounded \(A \subset {\mathbb {R}}^2\), we have \(u_{\lambda \varepsilon }^{\lambda } \rightarrow u(\lambda ^{-1} \, \cdot )\) in \(L^1(\lambda A)\) as \(\varepsilon \rightarrow 0\) if and only if \(u_\varepsilon \rightarrow u\) in \(L^1( A)\).

Proof

We first prove (3.1). To see this, it suffices to note that \(x \in X_\varepsilon \) if and only if \(\lambda x \in \lambda X_\varepsilon \), \(\#({\mathcal {N}}_\varepsilon (x)\cap X_\varepsilon )=6\) if and only if \(\#({\mathcal {N}}_{\lambda \varepsilon }(\lambda x)\cap \lambda X_\varepsilon )=6\), and \((\{x\} \cup {\mathcal {N}}_\varepsilon (x)) \subset \varepsilon e^{i\theta }({\mathscr {L}}+\tau )\) if and only if \((\{\lambda x\} \cup {\mathcal {N}}_{\lambda \varepsilon }(\lambda x)) \subset \lambda \varepsilon e^{i\theta }({\mathscr {L}}+\tau )\) for \(\theta \in {\mathbb {A}}\) and \(\tau \in {\mathbb {T}}\). Therefore, in view of (2.15) and the definition of the Voronoi cells \(V_\varepsilon ^z(x)\) below (2.14), (3.1) holds true. The equivalence of the convergence follows by a change of variables: we set \(y=\lambda x\) and obtain

$$\begin{aligned} \lambda ^2\int _{A} |u_{ \varepsilon }(x)- u(x)| \, \mathrm {d}x =\lambda ^2 \int _{A} |u_{\lambda \varepsilon }^\lambda (\lambda x)- u(x)| \, \mathrm {d}x = \int _{\lambda A} |u_{\lambda \varepsilon }^\lambda (y)- u(\lambda ^{-1} y)| \, \mathrm {d}y \end{aligned}$$

for every bounded \(A \subset {\mathbb {R}}^2\). \(\square \)

3.2 Compactness

In this subsection we prove Theorem 2.1. As a preparation, we show the following coercivity property:

Proposition 3.3

(Coercivity) Let X be a configuration with \(E_\varepsilon (X) < +\infty \) and let \(A \subset \mathbb R^2 \) be a Borel set. Then, there exists a universal \(C>0\) such that

$$\begin{aligned} {\mathcal {H}}^1(J_u\cap {A}) \leqq CE_\varepsilon (X,(A)_\varepsilon ), \end{aligned}$$
(3.2)

where u associated to X is given by (2.15) and \((A)_\varepsilon \) is defined in (2.4).

Proof

Let \(A \subset \mathbb R^2 \) be a Borel set. Consider \(X \subset {\mathbb {R}}^2\) with \(E_\varepsilon (X) <+\infty \). In view of (2.11) and (2.15), the function u associated to X can be written in the form \(u = \sum \nolimits _{j=1}^\infty \chi _{G_j} z_j\) for pairwise distinct \(\lbrace z_j \rbrace _j \subset {\mathcal {Z}} {\setminus } \lbrace {\mathbf {0}} \rbrace \) and pairwise disjoint \(\lbrace G_j \rbrace _j \subset {\mathbb {R}}^2\). By [2, Remark 4.22] it suffices to check that

$$\begin{aligned} \sum \limits _{j=1}^\infty {\mathcal {H}}^1( \partial ^* G_j \cap {A}) \leqq C E_\varepsilon (X,(A)_\varepsilon ). \end{aligned}$$
(3.3)

Due to the construction in (2.15), each \(G_j\) is made of a finite union of regular hexagons with sidelength \(\varepsilon /\sqrt{3}\) such that at the center of each such hexagon there is an atom \(x \in X\) with \(\#{\mathcal {N}}_\varepsilon (x)=6\). If an edge of such a hexagon is contained in \(\partial ^* G_j\), then there exists a point \(y \in {\mathcal {N}}_\varepsilon (x)\) such that \(\#{\mathcal {N}}_\varepsilon (y) <6\), see Fig. 2. If the intersection of that edge with A is non-empty, then \(y \in (A)_\varepsilon \cap X\), see (2.1) and (2.4). Note that each such y is selected for at most six different edges of hexagons contained in \(\partial ^* G_j\). By (2.3), this yields

$$\begin{aligned} \sum \limits _{j \in {\mathbb {N}}} {\mathcal {H}}^1( \partial ^* G_j \cap {A}) \leqq \tfrac{6}{\sqrt{3}} \varepsilon \, \#\{ y \in X \cap (A)_\varepsilon :\#{\mathcal {N}}_\varepsilon (y) <6\} \leqq \tfrac{12}{\sqrt{3}} E_\varepsilon (X,(A)_\varepsilon ), \end{aligned}$$

where we used that each edge of the hexagon has length \(\varepsilon /\sqrt{3}\). \(\square \)

Proof of Theorem 2.1

Let \(\lbrace X_\varepsilon \rbrace _\varepsilon \) and \(\lbrace u_\varepsilon \rbrace _\varepsilon \) be given, as defined in (2.15). Recall that \({\mathcal {Z}}\) can be embedded into \({\mathbb {R}}^7\) and that it is closed and bounded, see (2.9). Therefore, for each \(B_r\), \(r \in {\mathbb {N}}\), we can use Proposition 3.3 and a compactness result for piecewise constant functions, see [2, Theorem 4.25], to find a subsequence \(\{\varepsilon _k\}_k\) and \(u^r \in PC(B_r;{\mathcal {Z}})\) such that \(u_{\varepsilon _k} \rightarrow u^r\) in measure and thus also in \(L^1(B_r;{\mathcal {Z}})\). By lower semicontinuity there holds \({\mathcal {H}}^1(J_{u^r} \cap B_r) \leqq C\) for a constant independent of r. By a diagonal argument, we obtain \(u:{\mathbb {R}}^2 \rightarrow {\mathcal {Z}}\) with \(u=u^r\) on \(B_r\) for all \(r \in \mathbb N\) such that \(u_{\varepsilon _k} \rightarrow u\) in \(L^1_{\mathrm{loc}}(\mathbb R^2;{\mathcal {Z}})\). Clearly, \({\mathcal {H}}^1(J_u)<+\infty \). Thus, to show that \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\), it remains to check that \({\mathcal {L}}^2(\lbrace u \ne {\mathbf {0}}\rbrace ) < + \infty \).

Using (3.2) with \(A={\mathbb {R}}^2\), the isoperimetric inequality on \({\mathbb {R}}^2\), \({\mathcal {L}}^2( \{u_{\varepsilon _k}\ne {\mathbf {0}}\}) <+\infty \), and the fact that \({\mathcal {L}}^2(\{u \ne {\mathbf {0}}\})\) is lower semicontinuous with respect to strong \(L^1_{\mathrm{loc}}\) convergence, we obtain

$$\begin{aligned} \big ({\mathcal {L}}^2(\{u \ne {\mathbf {0}}\})\big )^{1/2}&\leqq \liminf _{k \rightarrow +\infty } \big ({\mathcal {L}}^2(\{u_{\varepsilon _k} \ne {\mathbf {0}}\}) \big )^{1/2}\leqq \liminf _{k \rightarrow +\infty } C{\mathcal {H}}^1(\partial ^* \{u_{\varepsilon _k}\ne {\mathbf {0}}\}) \\&\leqq \liminf _{k \rightarrow +\infty } C{\mathcal {H}}^1(J_{u_{\varepsilon _k}})\leqq \liminf _{k\rightarrow +\infty } C E_{\varepsilon _k}(X_{\varepsilon _k}) <+\infty . \end{aligned}$$

This implies that \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) and concludes the proof. \(\square \)

3.3 Lower Bound

This subsection is devoted to the proof of Theorem 2.3(i). For the proof, it is instrumental to use a different cell formula. In contrast to imposing boundary conditions as in (2.16), we require \(L^1\)-convergence to the function \(u^{\nu }_{z^+,z^-} \in PC_{\mathrm{loc}}({\mathbb {R}}^2;{\mathcal {Z}})\) defined by

$$\begin{aligned} u^{\nu }_{z^+,z^-}(x) = {\left\{ \begin{array}{ll} z^+ &{} \text { if }\,\, \langle x, \nu \rangle \geqq 0, \\ z^- &{} \text { if }\,\,\langle x, \nu \rangle < 0, \end{array}\right. } \end{aligned}$$
(3.4)

for \(x \in {\mathbb {R}}^2\), \(z^+,z^- \in {\mathcal {Z}}\), and \(\nu \in {\mathbb {S}}^1\). More precisely, for \(z^+,z^- \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\) we introduce

$$\begin{aligned} \begin{aligned} \psi (z^+,z^-,\nu )&:= \inf \Big \{ \liminf _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,Q^\nu (y_\varepsilon )\big ):\, y_\varepsilon \in {\mathbb {R}}^2, \, \\&\quad \qquad \lim _{\varepsilon \rightarrow 0} \int _{Q^\nu } |u_\varepsilon (x + y_\varepsilon )- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x = 0 \Big \}, \end{aligned} \end{aligned}$$
(3.5)

where \(u_\varepsilon \) denotes the function associated to \(X_\varepsilon \), as defined in (2.15). The density \(\psi \) is related to \(\varphi \) (see (2.16)) in the following way:

Proposition 3.4

(Relation of \(\psi \) and \(\varphi )\) For all \(z^+,z^- \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\) it holds that

$$\begin{aligned} \psi (z^+,z^-,\nu )\geqq \varphi (z^+,z^-,\nu ). \end{aligned}$$

We postpone the proof of Proposition 3.4 to Sections 4–7. It will follow by combining Lemmas 4.1, 7.1, and Proposition 7.2. After a further comment about the definition of \(\psi \), we proceed with the proof of the lower bound.

Remark 3.5

(Varying cubes in the definition of \(\psi \)) We point out that, in contrast to many other cell formulas in the literature, the position of the cubes in (3.5) is not fixed but may vary along the sequence \(\varepsilon \rightarrow 0\). This general definition is necessary as the problem is not translation invariant in the variables \(z^\pm \), although the discrete energy has such a property, see Lemma 3.1(i). To see this issue, consider a sequence \(\lbrace X_\varepsilon \rbrace _\varepsilon \) contained in a fixed lattice \(X_\varepsilon \subset \varepsilon e^{i\theta }({\mathscr {L}}+\tau )\). Then, for a fixed translation \(\sigma \in \mathbb R^2\), the shifted configurations \({\tilde{X}}_\varepsilon := X_\varepsilon + \sigma \) are contained in \(\varepsilon e^{i\theta }({\mathscr {L}} + \tau _\varepsilon )\), where the translation \(\tau _\varepsilon := (\tau + e^{-i\theta }\sigma /\varepsilon )\) (modulo \({\mathscr {L}}\)) is in general different from \(\tau \) and highly oscillating. This in general implies \({\tilde{u}}_\varepsilon \ne u_\varepsilon (\cdot - \sigma )\), where \({u}_\varepsilon \) and \({\tilde{u}}_\varepsilon \) are given in (2.15). This lack of translational invariance is remedied in our approach by minimizing over all possible cell centers. Note that only a posteriori we are able to show that the cell formula \(\varphi \) is actually independent of the center, see Proposition 2.2.

Proof of Theorem 2.3(i)

Let \(\lbrace X_{\varepsilon }\rbrace _\varepsilon \) be a sequence with \(X_{\varepsilon } \rightarrow u\) in \(L^1_{\mathrm{loc}}({\mathbb {R}}^2)\) for \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\). Clearly, it suffices to treat the case

$$\begin{aligned} \sup \limits _{\varepsilon >0} E_\varepsilon (X_\varepsilon ) <+\infty . \end{aligned}$$
(3.6)

We proceed in two steps. We first identify a limiting measure associated to the discrete configurations (Step 1). Then, we proceed by a blow-up procedure for the jump part of this measure (Step 2).

Step 1: Identification of a limiting measure. We consider the family of positive measures \(\{\mu _\varepsilon \}_\varepsilon \) given by

$$\begin{aligned} \mu _\varepsilon :=\frac{1}{2} \sum \limits _{x\in X_\varepsilon } \varepsilon \left( 6-\#{\mathcal {N}}_\varepsilon (x)\right) \delta _x. \end{aligned}$$

By (2.3) we observe that for all open sets \(A \subset {\mathbb {R}}^2\) it holds that

$$\begin{aligned} |\mu _\varepsilon |(A) = \mu _\varepsilon (A) = E_\varepsilon (X_\varepsilon ,A). \end{aligned}$$
(3.7)

Therefore, by (3.6) we get \(\sup _{\varepsilon >0} |\mu _\varepsilon |({\mathbb {R}}^2) < +\infty \). Thus, as \(\mathbb R^2\) is locally compact, up to passing to a subsequence (not relabeled), there exists a positive finite Radon measure \(\mu \) such that

$$\begin{aligned} \mu _\varepsilon \overset{*}{\rightharpoonup } \mu . \end{aligned}$$
(3.8)

By the Radon–Nikodym Theorem we may decompose \(\mu \) into two mutually singular non-negative measures

$$\begin{aligned} \mu =\xi {\mathcal {H}}^1|_{J_u} +\mu _s. \end{aligned}$$

The main point is to prove

$$\begin{aligned} \xi (x_0) \geqq \psi (z^+,z^-,\nu ) \quad \text {for } {\mathcal {H}}^1\text {-almost every } x_0 \in J_u, \end{aligned}$$
(3.9)

where \(z^+\) and \(z^-\) denote the one-sided limits of u at \(x_0\) and \(\nu \) denotes the corresponding normal. (For notational convenience, the explicit dependence on u is omitted.) Once this is shown, the statement follows from (2.17), (3.7), (3.8), and Proposition 3.4. In fact,

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0} E_{\varepsilon }(X_{\varepsilon })= & {} \liminf _{\varepsilon \rightarrow 0} \mu _\varepsilon (\mathbb R^2) \geqq \mu (\mathbb R^2) \geqq \int _{J_u} \xi \, \mathrm{d} {\mathcal {H}}^1 \geqq \int _{J_u} \varphi (z^+,z^-,\nu ) \, \mathrm{d}{\mathcal {H}}^1 \\= & {} E(u). \end{aligned}$$

Step 2: Blow-up argument. It remains to prove (3.9). By the properties of SBV-functions and Radon measures we know that for \({\mathcal {H}}^1\)-almost every \(x_0 \in J_u\) it holds that

  1. (a)

    \( \displaystyle \lim _{\rho \rightarrow 0} \frac{1}{\rho ^2}\int _{Q^{\nu }_{\rho }(x_0)}|u(x)-u_{z^+,z^-}^{\nu }(x-x_0)|\, \mathrm {d}x=0, \)

  2. (b)

    \(\displaystyle \lim _{\rho \rightarrow 0} \frac{1}{\rho }{\mathcal {H}}^1\big (J_u \cap Q^\nu _{\rho }(x_0)\big )=1\),

  3. (c)

    \(\displaystyle \xi (x_0) = \lim _{\rho \rightarrow 0} \frac{\mu (Q^\nu _{\rho }(x_0))}{{\mathcal {H}}^1\big (J_u \cap Q^\nu _{\rho }(x_0)\big )}\);

see, for example, [2, Theorem 2.63, Theorem 3.78, and Remark 3.79]. Here, \(u^\nu _{z^+,z^-}\) is defined in (3.4). It suffices to prove (3.9) for all \(x_0 \in J_u\) such that (a)–(c) hold. We fix \(\rho _n \rightarrow 0\) such that \(|\mu |(\partial Q^\nu _{\rho _n}(x_0))=0\) for all \(n \in {\mathbb {N}}\). By (3.7), (3.8), (b), (c), and the Portmanteau Theorem, we get

$$\begin{aligned} \xi (x_0)&= \lim _{\rho \rightarrow 0} \frac{\mu (Q^\nu _{\rho }(x_0))}{{\mathcal {H}}^1(J_u \cap Q^\nu _{\rho }(x_0))} = \lim _{\rho \rightarrow 0}\frac{\mu (Q^\nu _{\rho }(x_0))}{\rho } = \lim _{n \rightarrow +\infty } \frac{1}{\rho _n} \lim _{\varepsilon \rightarrow 0} \mu _\varepsilon \big (Q^\nu _{\rho _n}(x_0)\big )\\&= \lim _{n\rightarrow +\infty } \frac{1}{\rho _n} \lim _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,Q^\nu _{\rho _n}(x_0)\big ). \end{aligned}$$

We introduce the configuration \( X_\varepsilon ^n := \rho _n^{-1} X_\varepsilon \) and obtain by Lemma 3.1(ii) (for \(\lambda = 1/\rho _n\))

$$\begin{aligned} \xi (x_0) = \lim _{n\rightarrow +\infty } \lim _{\varepsilon \rightarrow 0 } E_{\varepsilon /\rho _n} \big ( X^n_\varepsilon , Q^\nu (\rho ^{-1}_n x_0) \big ). \end{aligned}$$
(3.10)
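For the reader's convenience, we sketch how (3.10) arises from the preceding identity for \(\xi (x_0)\). The only ingredients are Lemma 3.1(ii) with \(\lambda = 1/\rho _n\) and the relation \(\rho _n^{-1}Q^\nu _{\rho _n}(x_0) = Q^\nu (\rho _n^{-1}x_0)\), which we assume to follow from the definition of the cubes in Section 2 (not restated here). For fixed n and \(\varepsilon \), this gives

$$\begin{aligned} \frac{1}{\rho _n} E_\varepsilon \big (X_\varepsilon ,Q^\nu _{\rho _n}(x_0)\big ) = E_{\varepsilon /\rho _n}\big (\rho _n^{-1}X_\varepsilon ,\rho _n^{-1}Q^\nu _{\rho _n}(x_0)\big ) = E_{\varepsilon /\rho _n}\big (X^n_\varepsilon ,Q^\nu (\rho _n^{-1}x_0)\big ). \end{aligned}$$

Passing first to the limit \(\varepsilon \rightarrow 0\) and then to the limit \(n \rightarrow +\infty \) yields (3.10).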

Since \(X_{\varepsilon } \rightarrow u\) in \(L^1_{\mathrm{loc}}({\mathbb {R}}^2)\), we obtain by definition that \(u_\varepsilon \rightarrow u\) in \(L^1_{\mathrm{loc}}({\mathbb {R}}^2)\), see the end of Section 2.2. By \(u_\varepsilon ^n\) we denote the function corresponding to \(X_\varepsilon ^n\). By (3.1) we have \(u^n_{\varepsilon }( x)=u_\varepsilon (\rho _n x)\) for all \(x \in {\mathbb {R}}^2\). In particular, Lemma 3.2 yields \(u_\varepsilon ^n \rightarrow u^n\) on \(Q^\nu (\rho _n^{-1} x_0)\), where \(u^n(x):= u(\rho _n x) \) for \(x \in {\mathbb {R}}^2\). By (a), change of variables, and the fact that \(u^n(x+\rho _n^{-1}x_0) = u(x_0 + \rho _nx)\) as well as \(u^{\nu }_{z^+,z^-}(x) = u^{\nu }_{z^+,z^-}(\rho _n x)\) for \(x\in {\mathbb {R}}^2\), we also get that

$$\begin{aligned}&\lim _{n\rightarrow +\infty } \int _{Q^\nu } | u^n(x+ \rho _n^{-1}x_0) - u^\nu _{z^+,z^-}(x) | \, \mathrm{d}x\\&\quad = \lim _{n\rightarrow +\infty } \frac{1}{\rho ^2_n}\int _{Q^{\nu }_{\rho _n}(x_0)}|u(x)-u_{z^+,z^-}^{\nu }(x-x_0)|\, \mathrm{d}x= 0. \end{aligned}$$

Therefore, by recalling (3.10) and \(u_\varepsilon ^n \rightarrow u^n\) on \(Q^\nu (\rho _n^{-1} x_0)\), by using a standard diagonal argument, we find an infinitesimal sequence \(\{\varepsilon (n)\}_n\) such that for \( X^n := X^n_{\varepsilon (n)}\) and \(u^n:= u^n_{\varepsilon (n)}\) we have

$$\begin{aligned} \xi (x_0) = \lim _{n\rightarrow +\infty } E_{\varepsilon _n} \big (X^n,Q^\nu (y^n) \big ), \end{aligned}$$
(3.11)

and

$$\begin{aligned} \lim _{n \rightarrow +\infty }\int _{Q^\nu } |u^n(x + y^n ) - u_{z^+,z^-}^\nu (x) | \, \mathrm{d}x= 0, \end{aligned}$$

where \(\varepsilon _n =\varepsilon (n)/\rho _n\) and \(y^n=\rho _n^{-1}x_0\). Since the sequence is admissible in (3.5), (3.11) implies \(\xi (x_0) \geqq \psi (z^+,z^-,\nu ) \). This shows (3.9) and concludes the proof. \(\square \)

3.4 Upper Bound

This subsection is devoted to the proof of Theorem 2.3(ii). The following density result will be instrumental:

Lemma 3.6

Let \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\). Then there exists a sequence \((u_n)_n \subset PC({\mathbb {R}}^2;{\mathcal {Z}})\) with \(u_n \rightarrow u\) in \(L^1(\mathbb R^2)\) and \(\limsup _{n\rightarrow +\infty } E(u_n) \leqq E(u)\) such that each \(u_n\) attains only finitely many values and has polygonal jump set, that is, \(J_{u_n}\) consists of finitely many segments.

Proof

Consider \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\). We proceed in three steps. We first show that u can be approximated by functions with finite support (Step 1). Then, we approximate with functions attaining only finitely many values (Step 2) and finally show that the jump set can be approximated by a finite number of segments (Step 3). Note that it suffices to show that for each \(\delta >0\) there exists a function \(u_\delta \) with the desired properties satisfying

$$\begin{aligned} E(u_\delta ) \leqq E(u) + \delta \ \ \ \text { and } \ \ \ \Vert u - u_\delta \Vert _{L^1(\mathbb R^2)} \leqq \delta . \end{aligned}$$
(3.12)

We prove (3.12) up to multiplication by a uniform constant \(C > 0\) that is independent of \(\delta \). Replacing \(u_\delta \) with \(u_{\delta /C}\) then yields the result.

Step 1: Reduction to finite support. We show that for every \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) and for every \(\delta >0\) there exist \(R>0\) and \(u_\delta \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) such that (3.12) is satisfied and it holds that

$$\begin{aligned} \{u_\delta \ne {\mathbf {0}}\} \subset B_R. \end{aligned}$$
(3.13)

To this end, fix \(\delta >0\). Since there holds \({\mathcal {L}}^2(\lbrace u \ne {\mathbf {0}} \rbrace ) < +\infty \), we can choose \(R'>0\) such that

$$\begin{aligned} {\mathcal {L}}^2\big (\{u\ne {\mathbf {0}}\} \cap ({\mathbb {R}}^2{\setminus } B_{R'})\big ) \leqq \delta . \end{aligned}$$
(3.14)

By the coarea formula and the previous inequality, we can select \(R \in (R',R'+1)\) such that

$$\begin{aligned} {\mathcal {H}}^1\big ( \{u\ne {\mathbf {0}}\} \cap \partial B_{R}\big )&\leqq {\mathcal {L}}^2\big (\{u\ne {\mathbf {0}}\} \cap (B_{R'+1}{\setminus } B_{R'})\big )\nonumber \\&\leqq {\mathcal {L}}^2\big (\{u\ne {\mathbf {0}}\} \cap ({\mathbb {R}}^2{\setminus } B_{R'})\big ) \leqq \delta . \end{aligned}$$
(3.15)
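The selection of \(R\) in (3.15) is an averaging argument. As a sketch, assuming that the coarea formula is applied to the function \(x \mapsto |x|\) (so that it reduces to integration in polar coordinates), one has

$$\begin{aligned} \int _{R'}^{R'+1} {\mathcal {H}}^1\big (\{u\ne {\mathbf {0}}\} \cap \partial B_{r}\big )\, \mathrm {d}r = {\mathcal {L}}^2\big (\{u\ne {\mathbf {0}}\} \cap (B_{R'+1}{\setminus } B_{R'})\big ). \end{aligned}$$

Since the interval \((R',R'+1)\) has length one, the integrand is bounded by its integral on a set of radii of positive measure, and any such radius \(R\) satisfies the first inequality in (3.15).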

Define \(u_\delta \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) by \(u_\delta = u\chi _{B_R}\). Then clearly (3.13) holds. We choose the orientation of \(\nu _{u_\delta }(x)\) for \(x \in J_u \cap \partial B_R\) such that \(u^+_\delta \) coincides with the trace of u from the interior of \(B_R\). As \(\varphi (z,{\mathbf {0}},\nu ) \leqq C\) for all \(z \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\) by Theorem 2.5(i), we use (3.15) to get

$$\begin{aligned} E(u_\delta )&= \int _{B_{R} \cap J_{u_\delta }} \varphi (u_\delta ^+,u_\delta ^-,\nu _{u_\delta })\,\mathrm {d}{\mathcal {H}}^1 + \int _{\partial B_{R} \cap \{u\ne {\mathbf {0}}\}} \varphi (u_\delta ^+,{\mathbf {0}},\nu _{u_\delta })\,\mathrm {d}{\mathcal {H}}^1 \\&\leqq \int _{B_{R} \cap J_{u}} \varphi (u^+,u^-,\nu _{u})\,\mathrm {d}{\mathcal {H}}^1 + C{\mathcal {H}}^1( \{u\ne {\mathbf {0}}\} \cap \partial B_{R}) \leqq E(u) + C\delta . \end{aligned}$$

This implies the first inequality of (3.12). To see the second inequality of (3.12), note that \(|z| \leqq C\) for all \(z\in {\mathcal {Z}}\) and therefore by (3.14)

$$\begin{aligned} \Vert u_\delta -u\Vert _{L^1({\mathbb {R}}^2)} = \Vert u_\delta -u\Vert _{L^1({\mathbb {R}}^2{\setminus } B_R)}\leqq C{\mathcal {L}}^2 \big (\{u\ne {\mathbf {0}}\} \cap ({\mathbb {R}}^2{\setminus } B_{R'} ) \big ) \leqq C\delta . \end{aligned}$$

Step 2: Reduction to functions attaining finitely many values. Consider \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\). By Step 1 we may assume that (3.13) holds for some \(R>0\), that is, \(\{u \ne {\mathbf {0}}\} \subset B_R\). For each \(\delta >0\), we prove that there exists \(u_\delta \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) such that (3.12) holds and \(u_\delta \) attains only finitely many values. Recall by (2.11) that u can be written in the form \(u = \sum \nolimits _{j=1}^\infty \chi _{G_j} z_j\) for pairwise distinct \(\lbrace z_j \rbrace _j \subset {\mathcal {Z}} {\setminus } \lbrace {\mathbf {0}} \rbrace \) and pairwise disjoint \(\lbrace G_j \rbrace _j \subset {\mathbb {R}}^2\). In view of (2.12), we can choose \(J_\delta \in {\mathbb {N}}\) sufficiently large such that

$$\begin{aligned} \sum \limits _{j=J_\delta +1}^\infty {\mathcal {H}}^1\big (\partial ^* G_j\big ) \leqq \delta /R. \end{aligned}$$
(3.16)

Note that \(G_j \subset B_R\) for all \(j \in \mathbb N\) since \(\{u \ne {\mathbf {0}}\} \subset B_R\). Due to the isoperimetric inequality on \(B_R\) along with \( {\mathcal {L}}^2(G_j) \leqq {\mathcal {L}}^2(B_R) =\pi R^2 \) for all \(j \in \mathbb N\), we obtain

$$\begin{aligned} \sum \limits _{j=J_\delta +1}^\infty {\mathcal {L}}^2(G_j) \leqq \ \sqrt{\pi R^2} \sum \limits _{j=J_\delta +1}^\infty \big ({\mathcal {L}}^2(G_j)\big )^{1/2} \leqq CR\sum \limits _{j=J_\delta +1}^\infty {\mathcal {H}}^1\big (\partial ^* G_j\big )\leqq C\delta , \end{aligned}$$
(3.17)

where \(C>0\) is a universal constant. Now we define

$$\begin{aligned} u_\delta := {\left\{ \begin{array}{ll}\displaystyle u &{}\displaystyle \text {in } \bigcup \limits _{j=1}^{J_\delta } G_j, \\ {\mathbf {0}} &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Then, by (3.17) and \(\Vert u \Vert _\infty \leqq C\) we get \(\Vert u_\delta -u\Vert _{L^1({\mathbb {R}}^2)}= \Vert u_\delta -u\Vert _{L^1(B_R)}{\leqq } C\delta \). Moreover, setting for brevity \(\Gamma := \bigcup _{j=J_\delta +1}^\infty \partial ^* G_j\) we obtain, by (3.16), that

$$\begin{aligned} E(u_\delta )&= \int _{J_{u_\delta }}\varphi (u^+_\delta ,u^-_\delta ,\nu _{u_\delta }) \, \mathrm {d}{\mathcal {H}}^1 = \int _{J_{u_\delta }\cap \Gamma }\varphi (u^+_\delta ,u^-_\delta ,\nu _{u_\delta })\,\mathrm {d} {\mathcal {H}}^1\\&\quad +\int _{J_{u_\delta }{\setminus } \Gamma }\varphi (u^+_\delta ,u^-_\delta ,\nu _{u_\delta })\, \mathrm {d}{\mathcal {H}}^1\\&\leqq C \sum \limits _{j=J_\delta +1}^\infty {\mathcal {H}}^1\big ( \partial ^* G_j \big ) +E(u)\leqq C\delta +E(u), \end{aligned}$$

where we have used \(\varphi (z_1,z_2,\nu ) \leqq C\) for all \(z_1,z_2 \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\). Therefore, (3.12) holds, and Step 2 is concluded.
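The constant in (3.17) can be tracked explicitly. As a sketch, assuming that the inequality invoked there is the sharp planar isoperimetric inequality \(4\pi {\mathcal {L}}^2(G) \leqq \big ({\mathcal {H}}^1(\partial ^* G)\big )^2\) for sets of finite perimeter and finite measure (the text does not specify which version is used), each term satisfies

$$\begin{aligned} {\mathcal {L}}^2(G_j) = \big ({\mathcal {L}}^2(G_j)\big )^{1/2} \big ({\mathcal {L}}^2(G_j)\big )^{1/2} \leqq \sqrt{\pi R^2} \cdot \frac{1}{2\sqrt{\pi }}\, {\mathcal {H}}^1\big (\partial ^* G_j\big ) = \frac{R}{2}\, {\mathcal {H}}^1\big (\partial ^* G_j\big ). \end{aligned}$$

Summing over \(j > J_\delta \) and using (3.16) then gives (3.17) with \(C = 1/2\) under this assumption.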

Step 3: Reduction to polyhedral jump sets. Consider \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\). By Steps 1–2 we can assume that u attains only finitely many values, and its support is contained in \(B_R\). By Theorem 2.5(iii) we get that the mapping \(\nu \mapsto \varphi (z_1,z_2,\nu )\) is convex and thus continuous for all \(z_1,z_2 \in {\mathcal {Z}}\). Therefore, by [9, Theorem 2.1 and Corollary 2.4] (with \(\Omega = B_R\) and \({\mathcal {Z}}\) being the range of u) we obtain a function \(u_\delta \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) with polyhedral jump set such that (3.12) is satisfied. This concludes the proof. \(\square \)

We are now in a position to prove Theorem 2.3(ii).

Fig. 6. The construction for the \(\Gamma \)-\(\limsup \) in the case where the jump set is polyhedral: the part \(\Gamma _1 \cup \Gamma _2\) of the jump set is shown. Here, \(x_1^2\) equals \(x_2^1\). The region \((M)_\delta \) is indicated by the dotted circles around the points in M. The cubes used in the construction to cover the segments \(\Gamma _1\) and \(\Gamma _2\) are also indicated.

Proof of Theorem 2.3(ii)

By Lemma 3.6 and a general density argument in the theory of \(\Gamma \)-convergence (see [8, Remark 1.29]), it suffices to construct recovery sequences for \(u \in PC({\mathbb {R}}^2;{\mathcal {Z}})\) such that u attains only finitely many values, and u has a polygonal jump set. Our goal is to prove that there exists \(\lbrace X_\varepsilon \rbrace _\varepsilon \) such that \(X_\varepsilon \rightarrow u\) in \(L^1_{\mathrm{loc}}(\mathbb R^2) \) and \(\limsup _{\varepsilon \rightarrow 0} E_\varepsilon (X_\varepsilon ) \leqq E(u)\).

Let \(J_u = \bigcup _{i=1}^N \Gamma _i =\bigcup _{i=1}^N [x_i^1;x_i^2]\), where the sets \(\Gamma _i\) are line segments between the points \(x_i^1\) and \(x_i^2\), defined in (2.5), with length \(l_i\), orientation \(\nu _i^\bot \), and normal \(\nu _i\). We can assume that the traces \((u^+,u^-) = (u^+_i,u^-_i)\) are constant along each line segment, and that two segments \(\Gamma _i\) and \(\Gamma _j\) intersect at most at endpoints of \(\Gamma _i\) and \(\Gamma _j\). Denote by M the collection of points where at least two such line segments meet. Fix \(0<\delta <\frac{1}{3} \min \{ |x-y| :x,y \in M, \, x\ne y \}\) and choose \(\rho \in (0,\delta )\) small enough such that

$$\begin{aligned} \rho < \frac{1}{\sqrt{2}}\text {dist}\Big (\Gamma _i {\setminus } \big (B_\delta (x_i^1) \cup B_\delta (x_i^2)\big ), \, \Gamma _j {\setminus } \big (B_\delta (x_j^1) \cup B_\delta (x_j^2)\big ) \Big ) \ \ \ \text { for all }i \ne j. \end{aligned}$$
(3.18)

This choice of \(\rho \) implies that \(Q^\nu _\rho (x_1) \cap Q^\nu _\rho (x_2) =\emptyset \) for all \(x_1 \in \Gamma _i {\setminus } (B_\delta (x_i^1) \cup B_\delta (x_i^2))\) and \(x_2 \in \Gamma _j {\setminus } (B_\delta (x_j^1) \cup B_\delta (x_j^2))\), \(i\ne j\). As the traces \((u^+,u^-)\) are constant on \(\Gamma _i\), it holds that

$$\begin{aligned} \int _{ \Gamma _i } \varphi (u^+,u^-,\nu _u)\,\mathrm {d}{\mathcal {H}}^1 = l_i \,\varphi (u^+_i,u^-_i,\nu _i) \ \text { for all } i \in \lbrace 1,\ldots , N\rbrace . \end{aligned}$$
(3.19)
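The factor \(1/\sqrt{2}\) in (3.18) reflects the geometry of the cubes. As a sketch, assuming that \(Q^\nu _\rho (x)\) denotes a (rotated) square of side length \(\rho \) centered at \(x\), every point of such a square has distance at most \(\rho /\sqrt{2}\) (half the diagonal) from its center. If two squares centered at \(x_1\) and \(x_2\) shared a point \(p\), then

$$\begin{aligned} |x_1 - x_2| \leqq |x_1 - p| + |p - x_2| \leqq \frac{\rho }{\sqrt{2}} + \frac{\rho }{\sqrt{2}} = \sqrt{2}\,\rho . \end{aligned}$$

This is excluded by (3.18) for centers \(x_1\), \(x_2\) on two different truncated segments, since their distance is then larger than \(\sqrt{2}\,\rho \).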

We define

$$\begin{aligned} P_i^\rho = \big \{x_i^1 + k\rho \nu ^\perp _i :\, k \in {\mathbb {N}}, \, 0 \leqq k \leqq \lfloor l_i/\rho \rfloor \big \}, \ \ \ \Gamma _i^\rho = \bigcup \limits _{x \in P_i^\rho } Q^\nu _\rho (x), \ \ \ \Gamma _\rho = \bigcup \limits _{i=1}^N \Gamma ^\rho _i \end{aligned}$$

as well as (recall (2.7))

$$\begin{aligned} H^\varepsilon {=} \bigcup _{i=1}^N \bigcup _{x \in P_i^\rho } \big (x {+} \partial ^H_\varepsilon Q^\nu _\rho \big ), \ \text {where } \ \partial _\varepsilon ^HQ^\nu _\rho := \overline{Q^\nu _{\rho + 10\varepsilon } {\setminus } Q^\nu _{\rho - 10\varepsilon }} {\setminus } \big (\partial _\varepsilon ^+ Q^\nu _\rho \cup \partial _\varepsilon ^- Q^\nu _\rho \big ). \end{aligned}$$

In view of Proposition 2.2, we can choose \(\varepsilon =\varepsilon (\rho , \delta )>0\) sufficiently small such that, for each \(x\in P_i^\rho \), we can choose a configuration \(X_\varepsilon ^x \subset \mathbb R^2\) satisfying \(X_\varepsilon ^x = \varepsilon {\mathscr {L}}(u^\pm _i) \text { on } \partial _\varepsilon ^\pm Q^\nu _\rho (x)\) and

$$\begin{aligned} E_\varepsilon (X_\varepsilon ^x, Q^\nu _\rho (x)) \leqq \rho \, \varphi (u^+_i, u^-_i, \nu _i) + \delta \rho /l_i. \end{aligned}$$
(3.20)

We introduce the configuration

$$\begin{aligned} X^\delta _\varepsilon ={\left\{ \begin{array}{ll} X_\varepsilon ^x &{}\text {in }\,\, Q^\nu _\rho (x){\setminus } ((M)_\delta \cup H^\varepsilon ), \ \text {for } x \in P_i^\rho \text { for some } i\in \lbrace 1,\ldots ,N\rbrace ,\\ \varepsilon {\mathscr {L}}(z)&{}\text {in }\,\, \{u=z\} {\setminus } ((M)_\delta \cup \Gamma _\rho ) \text { for } z \in \mathrm{Im}(u),\\ \emptyset &{}\text {in }\,\,(M)_\delta \cup H^\varepsilon ; \\ \end{array}\right. } \end{aligned}$$

see Fig. 6 for an illustration. Here, \((M)_\delta \) denotes the \(\delta \)-neighborhood of M, see (2.4), and \(\mathrm{Im}(u)\) denotes the image of u. The set \(H^\varepsilon \) is removed in order to ensure that \(E_\varepsilon (X_\varepsilon ^\delta )<+\infty \), since atoms in \(H^\varepsilon \) belonging to two adjacent cubes could otherwise violate the constraint of having mutual distance at least \(\varepsilon \). Indeed, by \(X^\delta _\varepsilon = \emptyset \) on \((M)_\delta \cup H^\varepsilon \) and the boundary conditions of \(X_\varepsilon ^x\), we get \(|x-y| \geqq \varepsilon \) for all \(x,y \in X_\varepsilon ^\delta \), \(x\ne y\), and therefore \(E_\varepsilon (X_\varepsilon ^\delta ) <+\infty \). We have \(\#{\mathcal {N}}_\varepsilon (x)=6\) for each atom \(x \in X^\delta _\varepsilon {\setminus } ((M)_{\delta +\varepsilon } \cup \Gamma _\rho )\). To see this, we take the boundary conditions of \(X_\varepsilon ^x\) and the choice of \(\rho \) in (3.18) into account. By (2.3) this implies

$$\begin{aligned} E_\varepsilon \big (X^\delta _\varepsilon , {\mathbb {R}}^2 {\setminus } ((M)_{\delta +\varepsilon } \cup \Gamma _\rho )\big ) =0. \end{aligned}$$
(3.21)

Therefore, it remains to account for the energy contribution inside the cubes \(Q^\nu _\rho (x)\), \(x\in P_i^\rho \), and the set \((M)_{\delta +\varepsilon }\). First, note that for \( {\bar{x}} \in M\) we have that

$$\begin{aligned} \#\big (X^\delta _\varepsilon \cap B_{\delta +\varepsilon }( {\bar{x}} )\big ) \leqq C\delta /\varepsilon . \end{aligned}$$
(3.22)
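The counting estimate behind (3.22) reduces to the area of a thin annulus around \({\bar{x}}\). As a sketch, assuming \(\varepsilon \leqq \delta \):

$$\begin{aligned} {\mathcal {L}}^2\big (B_{\delta +2\varepsilon }({\bar{x}}){\setminus } B_{\delta -\varepsilon }({\bar{x}})\big ) = \pi \big ((\delta +2\varepsilon )^2 - (\delta -\varepsilon )^2\big ) = 3\pi \varepsilon (2\delta + \varepsilon ) \leqq 9\pi \delta \varepsilon . \end{aligned}$$

Multiplying by \(\varepsilon ^{-2}\) yields a bound of order \(\delta /\varepsilon \).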

In fact, \((M)_\delta \cap X^\delta _\varepsilon = \emptyset \) by definition and thus \(X^\delta _\varepsilon \cap B_\delta ( {\bar{x}}) = \emptyset \). As \(E_\varepsilon (X_\varepsilon ^\delta ) <+\infty \), by Lemma 3.1(v) and a simple computation we get \(\#(X^\delta _\varepsilon \cap (B_{\delta +\varepsilon }({\bar{x}}){\setminus } B_\delta ({\bar{x}}))) \leqq C \varepsilon ^{-2} {\mathcal {L}}^2(B_{\delta +2\varepsilon }({\bar{x}}){\setminus } B_{\delta -\varepsilon }({\bar{x}})) \leqq C\delta /\varepsilon \) for a universal constant \(C>0\). This yields (3.22) and then by (2.3) we get

$$\begin{aligned} E_\varepsilon \big (X^\delta _\varepsilon , (M)_{\delta +\varepsilon }\big ) \leqq C\delta , \end{aligned}$$
(3.23)

where C depends also on \(\#M\). By definition of \(X^\delta _\varepsilon \), for \(x \in P_i^\rho \) we have that \(X_\varepsilon ^\delta = X_\varepsilon ^x\) in \(Q^\nu _{\rho +\varepsilon }(x) {\setminus } ( H^\varepsilon \cup (M)_\delta )\). As \(E_\varepsilon (X_\varepsilon ^\delta ) <+\infty \), we can employ Lemma 3.1(v) to deduce that \(\#(X_\varepsilon ^\delta \cap (H^\varepsilon )_\varepsilon \cap Q^\nu _\rho (x) ) \leqq \varepsilon ^{-2}C{\mathcal {L}}^2( ( H^\varepsilon \cap Q^\nu _\rho (x))_{2\varepsilon }) \leqq C\). Hence, by (2.3) we obtain

$$\begin{aligned} E_\varepsilon \big ( X^\delta _\varepsilon , Q^\nu _\rho (x)\big ) \leqq E_\varepsilon \big (X^x_\varepsilon ,Q^\nu _\rho (x)\big ) + C\varepsilon \end{aligned}$$
(3.24)

for all \(x \in P_i^\rho \) such that \( \mathrm {dist}(Q^\nu _\rho (x), (M)_\delta ) \geqq \varepsilon \). On the other hand, for \(x \in P_i^\rho \) such that \(\mathrm {dist}(Q^\nu _\rho (x), (M)_\delta ) <\varepsilon \), we use the estimate in (3.22) with \({\bar{x}} \in M\) such that \(\mathrm {dist}(Q^\nu _\rho (x), (M)_\delta ) = \mathrm {dist}(Q^\nu _\rho (x), B_\delta ({\bar{x}}))\) (and so \(\mathrm {dist}(Q^\nu _\rho (x), (M{\setminus }\{{\bar{x}}\})_\delta ) > \varepsilon \)) and obtain

$$\begin{aligned} E_\varepsilon \big ( X^\delta _\varepsilon ,Q^\nu _\rho (x)\big ) \leqq E_\varepsilon \big (X^x_\varepsilon ,Q^\nu _\rho (x)\big ) + C(\varepsilon +\delta ). \end{aligned}$$
(3.25)

Consequently, using (3.20), (3.24)–(3.25), and Lemma 3.1(iii), we obtain

$$\begin{aligned}&\sum \limits _{x \in P_i^\rho } E_\varepsilon \big (X^\delta _\varepsilon ,Q^\nu _\rho (x){\setminus } (M)_{\delta +\varepsilon } \big ) \leqq \sum \limits _{x \in {\tilde{P}}_i^\rho } E_\varepsilon \big (X^\delta _\varepsilon ,Q^\nu _\rho (x)\big ) \\&\quad \leqq \rho \lfloor l_i/\rho \rfloor \, \varphi (u^+_i,u^-_i,\nu _i) + C\delta +C\varepsilon /\rho , \end{aligned}$$

where we have set \({\tilde{P}}_i^\rho = \{ x \in P_i^\rho : Q^\nu _\rho (x)\not \subset (M)_{\delta +\varepsilon } \}\). Here, C depends on N and \(\# M\), but is independent of \(\varepsilon \), \(\delta \), and \(\rho \). Thus, by choosing \(\varepsilon \) small enough with respect to \(\rho \) (that is, with respect to \(\delta \)) we get by (3.19) that

$$\begin{aligned} \sum \limits _{x \in P_i^\rho } E_\varepsilon \big (X^\delta _\varepsilon ,Q^\nu _\rho (x){\setminus } (M)_{\delta +\varepsilon } \big )&\leqq l_i \,\varphi (u^+_i,u^-_i,\nu _i)+C\delta \nonumber \\&=\int _{\Gamma _i \cap J_u} \varphi (u^+,u^-,\nu _u)\,\mathrm {d}{\mathcal {H}}^1+C\delta . \end{aligned}$$
(3.26)
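The passage from the preceding estimate to (3.26) only requires absorbing the remainder terms. As a sketch, using \(\rho \lfloor l_i/\rho \rfloor \leqq l_i\), and assuming \(\varphi \geqq 0\) (as is natural for a surface energy density) and \(\varepsilon \leqq \delta \rho \):

$$\begin{aligned} \rho \lfloor l_i/\rho \rfloor \, \varphi (u^+_i,u^-_i,\nu _i) + C\delta +C\varepsilon /\rho \leqq l_i \,\varphi (u^+_i,u^-_i,\nu _i) + 2C\delta . \end{aligned}$$

This is (3.26) after renaming the constant.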

Now, by Lemma 3.1(iv), (3.21), (3.23), and (3.26) we conclude

$$\begin{aligned} E_\varepsilon (X_\varepsilon ^\delta )&\leqq \sum _{i=1}^N \sum _{x \in P_i^\rho } E_\varepsilon \big (X^\delta _\varepsilon ,Q^\nu _\rho (x){\setminus } (M)_{\delta +\varepsilon } \big ) + E_\varepsilon \big (X_\varepsilon ^\delta , (M)_{\delta +\varepsilon }\big ) \\&\quad + E_\varepsilon \big (X^\delta _\varepsilon , {\mathbb {R}}^2 {\setminus } ((M)_{\delta +\varepsilon } \cup \Gamma _\rho )\big )\\&\leqq \sum \limits _{i=1}^N \int _{\Gamma _i \cap J_u} \varphi (u^+,u^-,\nu _u)\,\mathrm {d}{\mathcal {H}}^1 + CN\delta = \int _{J_u} \varphi (u^+,u^-,\nu _u)\,\mathrm {d}{\mathcal {H}}^1+CN\delta . \end{aligned}$$

By choosing \(\delta = \delta (\varepsilon ) \rightarrow 0\) sufficiently slowly as \(\varepsilon \rightarrow 0\) we obtain \(X_\varepsilon ^{\delta (\varepsilon )} \rightarrow u\) in \(L^1_{\mathrm{loc}}(\mathbb R^2)\) (see Section 2.2 for the definition of this convergence) and

$$\begin{aligned} \limsup _{\varepsilon \rightarrow 0} E_\varepsilon (X_\varepsilon ^{\delta (\varepsilon )})&\leqq \int _{J_u} \varphi (u^+,u^-,\nu _u)\,\mathrm {d}{\mathcal {H}}^1. \end{aligned}$$

This concludes the proof. \(\square \)

To conclude the proof of the main theorems, it remains to show Proposition 2.2, Theorem 2.5, and Proposition 3.4. This is the subject of the following sections.

4 Cell Formula Part I: Relation of \(L^1\)-convergence and Boundary Values

In this first part about cell formulas, we show that the condition of \(L^1\)-convergence as given in the cell formula \(\psi \), see (3.5), can be replaced by converging boundary values. More precisely, in this section we consider \(\Phi : {\mathcal {Z}}\times {\mathcal {Z}} \times {\mathbb {S}}^1\rightarrow [0,+\infty ) \) defined by

$$\begin{aligned} \Phi (z^+,z^-,\nu )&= \min \Big \{\liminf _{\varepsilon \rightarrow 0}\inf \Big \{ E_\varepsilon (X_\varepsilon ,Q^\nu (y_\varepsilon )) :\, y_\varepsilon \in \, {\mathbb {R}}^2, \, X_\varepsilon \nonumber \\&\qquad \qquad = \varepsilon {\mathscr {L}}(z_\varepsilon ^\pm ) \text { on } \partial ^\pm _\varepsilon Q^\nu (y_\varepsilon ) \Big \} :\lbrace z^\pm _\varepsilon \rbrace _\varepsilon \subset {\mathcal {Z}} \text { with } z^\pm _\varepsilon \rightarrow z^\pm \Big \}, \end{aligned}$$
(4.1)

where the identity \(X_\varepsilon =\varepsilon {\mathscr {L}}(z_\varepsilon ^\pm )\) is defined in (2.10) and \(\partial ^\pm _\varepsilon Q^\nu (y_\varepsilon )\) in (2.7). This means that near the boundary of the cube the configuration is contained in at most two different lattices \(\varepsilon {\mathscr {L}}(z^\pm _\varepsilon )\). (Fewer are possible if \(z^\pm _\varepsilon = {\mathbf {0}}\).) We note that the minimum in (4.1) is attained by a standard diagonal sequence argument. Our aim is to prove the following statement:

Lemma 4.1

(Relation of \(\psi \) and \(\Phi )\) Let \(z^+,z^- \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\). Then

$$\begin{aligned} \begin{aligned} \psi (z^+,z^-,\nu ) \geqq \Phi (z^+,z^-,\nu ). \end{aligned} \end{aligned}$$
(4.2)

In Section 7, we will prove \(\Phi (z^+,z^-,\nu ) = \varphi (z^+,z^-,\nu )\) for all \(z^+, z^- \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\), see Lemma 7.1 and Proposition 7.2. This along with Lemma 4.1 will conclude the proof of Proposition 3.4.

As is customary in the analysis of cell formulas, the proof of Lemma 4.1 crucially relies on a cut-off argument which allows us to construct configurations attaining the boundary values. Whereas for problems on Sobolev spaces this is usually achieved by a convex combination of functions, our discrete problem is considerably more delicate. In fact, on the one hand, the system is quite flexible due to the rotational and translational invariance of the atomistic energy, cf. Lemma 3.1(i). On the other hand, the system is very rigid, as small changes in the configuration may induce a large amount of energy due to the discontinuous interaction potential, see (1.1). This calls for a refined cut-off construction.

The construction fundamentally relies on the fact that the energy of an optimal sequence in (3.5) is concentrated asymptotically arbitrarily close to the interface. (Similar properties can be observed in related phase transition problems, see for example [12, 13, 15].) As a preliminary step, we need to show that in the definition of \(\psi \) we may replace cubes by rectangles. To this end, we introduce half-open rectangles with sides parallel to \(\nu \) by

$$\begin{aligned} R^\nu _{l,h}(y) = y + \Big \{ x \in {\mathbb {R}}^2:\, -\frac{h}{2} \leqq \langle x, \nu \rangle< \frac{h}{2}, \ -\frac{l}{2} \leqq \langle x, \nu ^\bot \rangle < \frac{l}{2} \Big \}, \end{aligned}$$
(4.3)

where \(y \in {\mathbb {R}}^2\), and \(l,h >0\). We simply write \(R^\nu _{l,h}\) instead of \(R^\nu _{l,h}(y)\) if the rectangle is centered at \(y=0\). Recall the definition in (3.4).

Lemma 4.2

(Density \(\psi \) on rectangles) For all \(z^+,z^- \in {\mathcal {Z}}\), all \(\nu \in {\mathbb {S}}^1\), and all \(l,h>0\) there holds

$$\begin{aligned} \psi (z^+,z^-,\nu )&= \inf \Big \{ \liminf _{\varepsilon \rightarrow 0} \frac{1}{l} E_\varepsilon \big (X_\varepsilon ,R^\nu _{l,h}(y_\varepsilon )\big ):\, y_\varepsilon \in {\mathbb {R}}^2, \nonumber \\&\qquad \quad \lim _{\varepsilon \rightarrow 0} \int _{R^\nu _{l,h}} |u_\varepsilon (x + y_\varepsilon )- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x = 0 \Big \}. \end{aligned}$$
(4.4)

Proof

For convenience, we denote the function on the right-hand side of (4.4) in the variables \((z^+,z^-,\nu ,l,h)\) by \(\Psi \). We will use the following scaling and monotonicity properties of \(\Psi \), valid for all \(\ell ,\kappa >0\):

$$\begin{aligned}&\Psi (z^+,z^-,\nu ,\lambda \ell ,\lambda \kappa ) =\Psi (z^+,z^-,\nu ,\ell ,\kappa ) \text { for all }\lambda >0. \end{aligned}$$
(4.5)
$$\begin{aligned}&\Psi (z^+,z^-,\nu ,\ell , \kappa ) \leqq \Psi (z^+,z^-,\nu ,\ell ,\lambda \kappa ) \text { for all }\lambda \geqq 1. \end{aligned}$$
(4.6)
$$\begin{aligned}&\Psi (z^+,z^-,\nu ,\ell ,\kappa ) \leqq \Psi (z^+,z^-,\nu ,\lambda \ell ,\kappa ) \text { for all }\lambda \in {\mathbb {N}}. \end{aligned}$$
(4.7)
$$\begin{aligned}&\ell _1\Psi (z^+,z^-,\nu ,\ell _1,\kappa ) \leqq \ell _2 \Psi (z^+,z^-,\nu ,\ell _2,\kappa ) \text { for all }0<\ell _1 \leqq \ell _2. \end{aligned}$$
(4.8)

We postpone the proof of (4.5)–(4.8) to Step 3 of the proof, and first derive the statement.

Step 1: Independence of l. We start by proving the independence of the length l, that is,

$$\begin{aligned} \Psi (z^+,z^-,\nu ,l, h) = \Psi (z^+,z^-,\nu ,\mu l,h) \end{aligned}$$
(4.9)

for all \(\mu >0\). To this end, consider first \(\mu \in {\mathbb {N}}\). Using (4.5) and then (4.6) with \(\lambda = \mu \), \(\ell = l\), and \(\kappa = h /\mu \), we obtain

$$\begin{aligned} \Psi (z^+,z^-,\nu , \mu l, h) = \Psi \left( z^+,z^-,\nu , l, h/\mu \right) \leqq \Psi \left( z^+,z^-,\nu , l, h\right) . \end{aligned}$$

By (4.7) for \(\mu \in {\mathbb {N}}\) it holds that \(\Psi (z^+,z^-,\nu , \mu l, h) \geqq \Psi (z^+,z^-,\nu ,l,h)\). Combining the estimates we get

$$\begin{aligned} \Psi (z^+,z^-,\nu , \mu l, h) = \Psi (z^+,z^-,\nu ,l,h) \end{aligned}$$
(4.10)

for \(\mu \in \mathbb N\). Now substituting l with \(\frac{l}{\mu }\) in the previous equation, we obtain

$$\begin{aligned} \Psi (z^+,z^-,\nu , l, h) = \Psi (z^+,z^-,\nu ,l/\mu ,h) \end{aligned}$$
(4.11)

for all \(\mu \in {\mathbb {N}}\) and \(l >0\). Hence, due to (4.10) and (4.11), equality (4.9) holds for all \(\mu \in {\mathbb {Q}}^+\).
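For a rational factor \(\mu = p/q\) with \(p,q \in {\mathbb {N}}\), the combination of (4.10) and (4.11) can be spelled out as follows (a sketch; the integers \(p\) and \(q\) are notation introduced here for illustration):

$$\begin{aligned} \Psi \big (z^+,z^-,\nu , \tfrac{p}{q}\, l, h\big ) = \Psi \big (z^+,z^-,\nu , p \cdot \tfrac{l}{q}, h\big ) = \Psi \big (z^+,z^-,\nu , \tfrac{l}{q}, h\big ) = \Psi (z^+,z^-,\nu , l, h). \end{aligned}$$

Here the second equality is (4.10) with \(\mu = p\) and \(l/q\) in place of \(l\), and the third equality is (4.11) with \(\mu = q\).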

Now, for general \(\mu >0\), we take a sequence \(\{\mu _n\}_n \subset {\mathbb {Q}}^+\) such that \(\mu _n \leqq \mu _{n+1}\) for all \(n \in {\mathbb {N}}\) and \(\mu _n \rightarrow \mu \); in particular, \(\mu _n \leqq \mu \) for all \(n \in {\mathbb {N}}\). By (4.8) and the fact that (4.9) holds for all \(\mu \in {\mathbb {Q}}^+\), we obtain

$$\begin{aligned} \Psi (z^+,z^-,\nu , l, h) = \Psi (z^+,z^-,\nu , \mu _n l, h) \leqq \frac{\mu }{\mu _n}\Psi (z^+,z^-,\nu , \mu l, h). \end{aligned}$$

Taking \(n\rightarrow +\infty \), we obtain

$$\begin{aligned} \Psi (z^+,z^-,\nu , l, h) \leqq \Psi (z^+,z^-,\nu , \mu l, h). \end{aligned}$$
(4.12)

This yields one inequality in (4.9). Applying (4.12) for \(\lambda \) in place of \(\mu \) and \(l/\lambda \) in place of l we also get

$$\begin{aligned} \Psi (z^+,z^-,\nu , l, h)=\Psi (z^+,z^-,\nu , \lambda l/\lambda , h) \geqq \Psi (z^+,z^-,\nu , l/\lambda , h). \end{aligned}$$

If we choose \(\lambda = \mu ^{-1}\), we get the other inequality in (4.9).
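Both inequalities rest on elementary substitutions, which can be unpacked as follows (a sketch). Property (4.8) with \(\ell _1 = \mu _n l \leqq \ell _2 = \mu l\) and \(\kappa = h\) reads

$$\begin{aligned} \mu _n l\, \Psi (z^+,z^-,\nu , \mu _n l, h) \leqq \mu l\, \Psi (z^+,z^-,\nu , \mu l, h), \end{aligned}$$

which after division by \(\mu _n l\) is the inequality preceding (4.12). For the reverse direction, the choice \(\lambda = \mu ^{-1}\) gives \(l/\lambda = \mu l\), and hence

$$\begin{aligned} \Psi (z^+,z^-,\nu , l, h) \geqq \Psi (z^+,z^-,\nu , l/\lambda , h) = \Psi (z^+,z^-,\nu , \mu l, h). \end{aligned}$$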

Step 2: Independence of h. Let \(\mu >0\). By first applying (4.5) and then (4.9) we obtain

$$\begin{aligned} \Psi (z^+,z^-,\nu ,l,h) = \Psi \left( z^+,z^-,\nu ,\mu l,\mu h\right) = \Psi (z^+,z^-,\nu , l,\mu h). \end{aligned}$$

This yields the desired independence of the height h.

Step 3: Proof of (4.5)–(4.8). It remains to prove (4.5)–(4.8).

Step 3.1: Proof of (4.5). Fix \(\lambda , \ell ,\kappa >0\). Let \(X_\varepsilon \subset {\mathbb {R}}^2\) and \(y_\varepsilon \in {\mathbb {R}}^2\) be given such that \(\lim \nolimits _{\varepsilon \rightarrow 0} \int _{R^\nu _{\ell ,\kappa }} |u_\varepsilon (x + y_\varepsilon )- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x = 0\) and

$$\begin{aligned} \Psi (z^+,z^-,\nu ,\ell ,\kappa ) = \liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell }E_\varepsilon \big (X_\varepsilon ,R^\nu _{ \ell , \kappa }(y_\varepsilon )\big ). \end{aligned}$$
(4.13)

(By a standard diagonal sequence argument the infimum on the right-hand side of (4.4) is attained.) Set \(X_\varepsilon ^\lambda =\lambda X_\varepsilon \). By (3.1) we get that the corresponding functions \(u_{\lambda \varepsilon }^\lambda \), see (2.15), satisfy \(u^\lambda _{\lambda \varepsilon }( x)=u_\varepsilon (\lambda ^{-1} x)\) for all \(x \in {\mathbb {R}}^2\). The change of variables \(y = \lambda ^{-1}x\) together with the identity \(u^{\nu }_{z^+,z^-}(y) = u^{\nu }_{z^+,z^-}(\lambda y)\) implies

$$\begin{aligned}&\lim _{\varepsilon \rightarrow 0} \int _{R^\nu _{\lambda \ell ,\lambda \kappa }} |u_{\lambda \varepsilon }^\lambda (x + \lambda y_\varepsilon )- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x\\&\quad = \lim _{\varepsilon \rightarrow 0} \lambda ^2 \int _{R^\nu _{\ell ,\kappa }} |u_\varepsilon (y + y_\varepsilon )- u^{\nu }_{z^+,z^-}(y)| \, \mathrm{d}y= 0. \end{aligned}$$
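In detail, the substitution \(x = \lambda y\) maps \(R^\nu _{\ell ,\kappa }\) onto \(R^\nu _{\lambda \ell ,\lambda \kappa }\) with \(\mathrm{d}x = \lambda ^2\, \mathrm{d}y\). As a sketch based on the two identities stated before the display, the integrand transforms according to

$$\begin{aligned} u^\lambda _{\lambda \varepsilon }(\lambda y + \lambda y_\varepsilon ) = u_\varepsilon \big (\lambda ^{-1}(\lambda y + \lambda y_\varepsilon )\big ) = u_\varepsilon (y + y_\varepsilon ), \qquad u^{\nu }_{z^+,z^-}(\lambda y) = u^{\nu }_{z^+,z^-}(y). \end{aligned}$$

The second identity holds for \(\lambda >0\) under the assumption, suggested by (3.4), that \(u^{\nu }_{z^+,z^-}\) depends only on the sign of \(\langle x,\nu \rangle \).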

Using Lemma 3.1(ii) along with (4.13) and the definition of \(\Psi \), we obtain

$$\begin{aligned} \Psi (z^+,z^-,\nu , \lambda \ell ,\lambda \kappa )&\leqq \liminf _{\varepsilon \rightarrow 0} \frac{1}{\lambda \ell }E_{\lambda \varepsilon }\big (X_\varepsilon ^\lambda ,R^\nu _{\lambda \ell ,\lambda \kappa }( \lambda y_\varepsilon )\big ) \\ {}&= \liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell }E_\varepsilon \big (X_\varepsilon ,R^\nu _{ \ell , \kappa }(y_\varepsilon )\big ) = \Psi (z^+,z^-,\nu ,\ell ,\kappa ). \end{aligned}$$

By exchanging \(\lambda \) with \(\frac{1}{\lambda }\) and \(\ell ,\kappa \) with \(\lambda \ell ,\lambda \kappa \), respectively, we obtain the reverse inequality and hence (4.5).

Step 3.2: Proof of (4.6). Fix \(\lambda \geqq 1\) and \(\ell , \kappa >0\). Consider \(X_\varepsilon \subset {\mathbb {R}}^2\) and \(y_\varepsilon \in {\mathbb {R}}^2\) such that \(\lim \nolimits _{\varepsilon \rightarrow 0} \int _{R^\nu _{\ell ,\lambda \kappa }} |u_\varepsilon (x + y_\varepsilon )- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x = 0\) and

$$\begin{aligned} \Psi (z^+,z^-,\nu ,\ell ,\lambda \kappa ) = \liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell }E_\varepsilon \big (X_\varepsilon ,R^\nu _{ \ell , \lambda \kappa }(y_\varepsilon )\big ). \end{aligned}$$

By Lemma 3.1(iii) and the definition of \(\Psi \) we get

$$\begin{aligned}&\Psi (z^+,z^-,\nu ,\ell , \kappa ) \leqq \liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell }E_\varepsilon \big (X_\varepsilon ,R^\nu _{\ell , \kappa }(y_\varepsilon )\big )\leqq \liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell }E_\varepsilon \big (X_\varepsilon ,R^\nu _{\ell ,\lambda \kappa }(y_\varepsilon )\big ) \\&\quad = \Psi (z^+,z^-,\nu ,\ell ,\lambda \kappa ). \end{aligned}$$

Step 3.3: Proof of (4.7). Let \(\lambda \in {\mathbb {N}}\) and \(\ell , \kappa >0\). Consider \(X_\varepsilon \subset {\mathbb {R}}^2\) and \(y_\varepsilon \in {\mathbb {R}}^2\) such that

$$\begin{aligned} \lim \limits _{\varepsilon \rightarrow 0} \int _{R^\nu _{\lambda \ell , \kappa }} |u_\varepsilon (x + y_\varepsilon )- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x = 0 \end{aligned}$$
(4.14)

and

$$\begin{aligned} \Psi (z^+,z^-,\nu ,\lambda \ell ,\kappa ) = \liminf _{\varepsilon \rightarrow 0} \frac{1}{\lambda \ell }E_\varepsilon \big (X_\varepsilon ,R^\nu _{ \lambda \ell , \kappa }(y_\varepsilon )\big ). \end{aligned}$$
(4.15)

We decompose the half-open rectangle \(R^\nu _{ \lambda \ell , \kappa }(y_\varepsilon )\) into pairwise disjoint half-open rectangles of the form

$$\begin{aligned} R^\nu _{ \lambda \ell , \kappa }(y_\varepsilon ) = \bigcup \limits _{j=0}^{\lambda -1} R^\nu _{ \ell , \kappa } (y_j^\varepsilon ), \end{aligned}$$

where \(y_j^\varepsilon = y_\varepsilon + \frac{2 j - \lambda + 1}{2} \ell \nu ^\perp \). Now, using Lemma 3.1(iv), we derive that there exists \(j_0\) such that

$$\begin{aligned} E_\varepsilon \big (X_\varepsilon ,R^\nu _{ \ell , \kappa }(y^\varepsilon _{j_0})\big )&\leqq \frac{1}{\lambda } \sum \limits _{j=0}^{\lambda -1}E_\varepsilon \big (X_\varepsilon ,R^\nu _{\ell , \kappa }(y_j^\varepsilon ) \big ) = \frac{1}{\lambda }E_\varepsilon \big (X_\varepsilon ,R_{\lambda \ell , \kappa }^\nu (y_\varepsilon ) \big ). \end{aligned}$$
(4.16)
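The decomposition and the selection of \(j_0\) can be checked directly. As a sketch: in the direction \(\nu ^\perp \) and relative to \(y_\varepsilon \), the rectangle \(R^\nu _{\ell ,\kappa }(y_j^\varepsilon )\) covers the interval

$$\begin{aligned} \Big [ \frac{2j-\lambda +1}{2}\,\ell - \frac{\ell }{2}, \ \frac{2j-\lambda +1}{2}\,\ell + \frac{\ell }{2} \Big ) = \Big [ -\frac{\lambda \ell }{2} + j\ell , \ -\frac{\lambda \ell }{2} + (j+1)\ell \Big ), \end{aligned}$$

and these intervals tile \([-\lambda \ell /2, \lambda \ell /2)\) for \(j = 0,\ldots ,\lambda -1\). For instance, for \(\lambda = 3\) the centers are \(y_\varepsilon - \ell \nu ^\perp \), \(y_\varepsilon \), and \(y_\varepsilon + \ell \nu ^\perp \). The index \(j_0\) in (4.16) may be taken as a minimizer of \(j \mapsto E_\varepsilon (X_\varepsilon ,R^\nu _{\ell ,\kappa }(y_j^\varepsilon ))\), since a minimum never exceeds the average.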

By (4.14) and the fact that \(u^{\nu }_{z^+,z^-}(x) = u^{\nu }_{z^+,z^-}(x +t\nu ^\bot )\) for all \(x\in {\mathbb {R}}^2\) and \(t \in {\mathbb {R}}\), see (3.4), we get that \(\lim \nolimits _{\varepsilon \rightarrow 0} \int _{R^\nu _{\ell , \kappa }} |u_\varepsilon (x + y^\varepsilon _{j_0})- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x = 0\). By the definition of \(\Psi \), (4.15), and (4.16) this yields

$$\begin{aligned}&\Psi (z^+,z^-, \nu , \ell , \kappa ) \leqq \liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell } E_\varepsilon \big (X_\varepsilon ,R^\nu _{ \ell , \kappa }(y_{j_0}^\varepsilon )\big ) \leqq \liminf _{\varepsilon \rightarrow 0} \frac{1}{\lambda \ell }E_\varepsilon \big (X_\varepsilon ,R_{\lambda \ell ,\kappa }^\nu (y_\varepsilon )\big )\\&\quad = \Psi (z^+,z^-, \nu , \lambda \ell , \kappa ). \end{aligned}$$

This implies (4.7).

Step 3.4: Proof of (4.8). Let \(0<\ell _1 \leqq \ell _2\). Consider \(X_\varepsilon \subset {\mathbb {R}}^2\) and \(y_\varepsilon \in {\mathbb {R}}^2\) such that \(\lim \nolimits _{\varepsilon \rightarrow 0} \int _{R^\nu _{\ell _2, \kappa }} |u_\varepsilon (x + y_\varepsilon )- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x = 0\) and

$$\begin{aligned} \Psi (z^+,z^-,\nu ,\ell _2,\kappa ) = \liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell _2}E_\varepsilon \big (X_\varepsilon ,R^\nu _{ \ell _2 , \kappa }(y_\varepsilon )\big ). \end{aligned}$$

By using Lemma 3.1(iii) along with \(\ell _2 \geqq \ell _1\) and the definition of \(\Psi \) we get

$$\begin{aligned} \Psi (z^+,z^-,\nu ,\ell _1, \kappa )&\leqq \liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell _1}E_\varepsilon \big (X_\varepsilon ,R^\nu _{\ell _1, \kappa }(y_\varepsilon )\big ) \leqq \liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell _1}E_\varepsilon \big (X_\varepsilon ,R^\nu _{\ell _2, \kappa }(y_\varepsilon )\big ) \\ {}&= \frac{\ell _2}{\ell _1}\liminf _{\varepsilon \rightarrow 0} \frac{1}{\ell _2}E_\varepsilon \big (X_\varepsilon ,R^\nu _{\ell _2, \kappa }(y_\varepsilon )\big )= \frac{\ell _2}{\ell _1}\Psi (z^+,z^-,\nu ,\ell _2, \kappa ). \end{aligned}$$

This yields (4.8) and concludes the proof. \(\square \)

We now proceed with the proof of Lemma 4.1.

Proof of Lemma 4.1

In view of (3.5), we can choose a subsequence in \(\varepsilon \) (not relabeled) and configurations \(X_\varepsilon \subset {\mathbb {R}}^2\) and \(y_\varepsilon \in {\mathbb {R}}^2\) such that \( \lim \nolimits _{\varepsilon \rightarrow 0} \int _{Q^\nu } |u_\varepsilon (x + y_\varepsilon )- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x = 0\) and

$$\begin{aligned} \psi (z^+,z^-,\nu ) = \lim _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,Q^\nu (y_\varepsilon )\big ). \end{aligned}$$
(4.17)

We perform a refined cut-off construction and split the proof into several steps. As explained above, the construction is quite delicate due to the fact that the energy is very sensitive to small changes of the configurations. First, we use Lemma 4.2 to prove that the energy of \(X_\varepsilon \) concentrates around a strip close to the limiting interface (Step 1). This allows us to select one dominant component on each side of the interface, that is, on the upper and the lower half-cube (Step 2). Here, the notion “component” refers to a subset of a specific triangular lattice.

Our goal in the subsequent steps is to modify the configuration \(X_\varepsilon \) such that it coincides with these lattices near the boundary of the upper and lower half-cube, respectively. In Step 3, we give a precise cardinality estimate on the number of points that differ from the lattices of the two dominant components in terms of \(o(\varepsilon ^{-2})\). In Step 4, we select a “good layer” where we can modify our configuration. “Good” means here that, in that layer, the configuration coincides with the lattice of the dominant component up to \(o(\varepsilon ^{-1})\) atoms. In Step 5, we show that the energy of the configuration constructed in Step 4 asymptotically bounds the energy of the original configuration from below. Finally, in Step 6, we conclude by observing that the constructed configuration is a competitor in the definition of \(\Phi \). We will perform this construction under the assumption that in both the upper and the lower half-cube there exist (dominant) lattices. The case of vacuum calls for small adaptations which are described at the end in Step 7.

Step 1: The energy concentrates near the line \(\{\langle \nu , (x-y_\varepsilon )\rangle =0\}\). Recall (4.3). We show that for all \(\delta \in (0,1)\) it holds that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,Q^\nu (y_\varepsilon ){\setminus } R^\nu _{1,\delta }(y_\varepsilon )\big ) = 0. \end{aligned}$$
(4.18)

By Lemma 3.1(iii), Lemma 4.2, (4.17), and the fact that \(\lbrace X_\varepsilon \rbrace _\varepsilon \) is admissible in the definition of \(\psi \) on \(R^\nu _{1,\delta }\), see (4.4), we obtain

$$\begin{aligned} \psi (z^+,z^-,\nu ) \leqq \liminf _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,R^\nu _{1,\delta }(y_\varepsilon )\big ) \leqq \lim _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,Q^\nu (y_\varepsilon )\big ) = \psi (z^+,z^-,\nu ). \end{aligned}$$

Lemma 3.1(iv) then implies

$$\begin{aligned} 0&\leqq \limsup _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,Q^\nu (y_\varepsilon ){\setminus } R^\nu _{1,\delta }(y_\varepsilon )\big )\\&= \limsup _{\varepsilon \rightarrow 0} \Big ( E_\varepsilon \big (X_\varepsilon ,Q^\nu (y_\varepsilon )\big ) -E_\varepsilon \big (X_\varepsilon ,R^\nu _{1,\delta }(y_\varepsilon )\big )\Big )\\ {}&\leqq \lim _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,Q^\nu (y_\varepsilon )\big ) - \liminf _{\varepsilon \rightarrow 0}E_\varepsilon \big (X_\varepsilon ,R^\nu _{1,\delta }(y_\varepsilon )\big )=0. \end{aligned}$$

This yields (4.18) and concludes Step 1.

In order to shorten the notation, we omit the dependence on the center \(y_\varepsilon \) and simply write \(Q^\nu _\rho \) instead of \(Q^\nu _\rho (y_\varepsilon )\) for \(\rho >0\) and \(R^\nu _{1,\delta }\) instead of \(R^\nu _{1,\delta }(y_\varepsilon )\). For brevity, we also define (omitting the center \(y_\varepsilon \)) the rectangles \(P_{\delta ,\varepsilon }^\pm = Q^{\nu ,\pm }_{1-\varepsilon }{\setminus } R^\nu _{1-\varepsilon ,\delta }\), where \(Q^{\nu ,\pm }_{1-\varepsilon }\) is defined below (2.6). We will prove all auxiliary statements along the proof for the upper half-cube \(Q^{\nu ,+}\) only since the arguments for the lower one are analogous. In what follows, \(\delta \in (0,1)\) is fixed sufficiently small. Without restriction, we may suppose that \(\varepsilon \ll \delta \).

Step 2: Single dominant component in the upper and lower half. We prove that there exist sequences \(\{z^\pm _\varepsilon \}_\varepsilon \subset {\mathcal {Z}}\) such that \(z^\pm _\varepsilon \rightarrow z^\pm \) and

$$\begin{aligned} {\mathcal {L}}^2\big (\{u_\varepsilon \ne z^\pm _\varepsilon \} \cap P_{\delta ,\varepsilon }^\pm \big )\leqq C E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ), \end{aligned}$$
(4.19)

where \(C>0\) is a universal constant independent of \(\varepsilon \).

Recall by (2.11) and (2.15) that the function \(u_\varepsilon \) can be written as \(u_\varepsilon = \sum \nolimits _{j=1}^\infty \chi _{G^\varepsilon _j} z^\varepsilon _j\) for pairwise distinct \(\lbrace z^\varepsilon _j \rbrace _j \subset {\mathcal {Z}} {\setminus } \lbrace {\mathbf {0}} \rbrace \) and pairwise disjoint \(\lbrace G^\varepsilon _j \rbrace _j \subset {\mathbb {R}}^2\). By Proposition 3.3 (more precisely, see (3.3)), (2.4), and Lemma 3.1(iii) we have

$$\begin{aligned} \sum \limits _{j=1}^\infty {\mathcal {H}}^1( \partial ^* G^\varepsilon _j \cap P_{\delta ,\varepsilon }^+ ) \leqq CE_\varepsilon \big (X_\varepsilon ,(P_{\delta ,\varepsilon }^+)_\varepsilon \big ) \leqq C E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ), \end{aligned}$$
(4.20)

where in the last step we used \((P_{\delta ,\varepsilon }^+)_\varepsilon \subset Q^\nu {\setminus } R^\nu _{1,\delta /2}\). We also define the vacuum inside \(Q^\nu \) by \(G^\varepsilon _0 := Q^\nu {\setminus } \bigcup _{j=1}^\infty G^\varepsilon _j\). By the relative isoperimetric inequality (see for example [22, Theorem 2, Section 5.6.2]), there exists \(c>0\) such that for all \(j \in \mathbb N_0\) it holds that

$$\begin{aligned} \min \big \{ {\mathcal {L}}^2(G_j^\varepsilon \cap&P_{\delta ,\varepsilon }^+), {\mathcal {L}}^2( P_{\delta ,\varepsilon }^+{\setminus } G_j^\varepsilon ) \big \}\nonumber \\&\quad \leqq \min \big \{ {\mathcal {L}}^2(G_j^\varepsilon \cap P_{\delta ,\varepsilon }^+), {\mathcal {L}}^2( P_{\delta ,\varepsilon }^+{\setminus } G_j^\varepsilon ) \big \}^{1/2} {\mathcal {L}}^2(P_{\delta ,\varepsilon }^+)^{1/2} \nonumber \\&\quad \leqq c{\mathcal {H}}^1(\partial ^* G_j^\varepsilon \cap P_{\delta ,\varepsilon }^+), \end{aligned}$$
(4.21)

where we used \({\mathcal {L}}^2(P_{\delta ,\varepsilon }^+)\leqq 1\). (Note that the theorem in the reference above is stated and proved in a ball, but that the argument only relies on Poincaré inequalities, and thus easily extends to the rectangles \(P_{\delta ,\varepsilon }^+\). Since the ratio of length and width is controlled, the constant is independent of \(\delta \) and \(\varepsilon \).) Then, from (4.20), (4.21), and \(\partial ^* G^\varepsilon _0 \cap P_{\delta ,\varepsilon }^+ \subset \bigcup _{j=1}^\infty (\partial ^* G^\varepsilon _j \cap P_{\delta ,\varepsilon }^+)\) it follows that

$$\begin{aligned} \sum \limits _{j=0}^\infty \min \big \{ {\mathcal {L}}^2(G_j^\varepsilon \cap P_{\delta ,\varepsilon }^+), \, {\mathcal {L}}^2(P_{\delta ,\varepsilon }^+{\setminus } G_j^\varepsilon ) \big \} \leqq C E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned}$$
(4.22)

We now get that there is a unique dominant component, that is, there exists \(j_\varepsilon \in \mathbb N_0\) such that

$$\begin{aligned} {\mathcal {L}}^2(G_{j_\varepsilon }^\varepsilon \cap P_{\delta ,\varepsilon }^+) > \frac{1}{2}{\mathcal {L}}^2(P_{\delta ,\varepsilon }^+). \end{aligned}$$
(4.23)

In fact, assume by contradiction that this were not the case. Then, we get, for all \(j \in \mathbb N_0\),

$$\begin{aligned} \min \big \{ {\mathcal {L}}^2(G_j^\varepsilon \cap P_{\delta ,\varepsilon }^+), {\mathcal {L}}^2( P_{\delta ,\varepsilon }^+{\setminus } G_j^\varepsilon ) \big \} = {\mathcal {L}}^2(G_j^\varepsilon \cap P_{\delta ,\varepsilon }^+). \end{aligned}$$

By using (4.22) we obtain \( {\mathcal {L}}^2(P_{\delta ,\varepsilon }^+) = \sum _{j=0}^\infty {\mathcal {L}}^2(G_j^\varepsilon \cap P_{\delta ,\varepsilon }^+) \leqq C E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \) This contradicts (4.18) for \(\varepsilon \) small enough. Now (4.22) and (4.23) imply (4.19) for the choice \(z^+_\varepsilon = z_{j_\varepsilon }^\varepsilon \).
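The last two implications can be spelled out as follows. This is only a sketch which uses nothing beyond (4.18), (4.22), and (4.23); the convention \(z^\varepsilon _0 := {\mathbf {0}}\) for the vacuum component is our notation. By (4.23), the complement of the dominant component is the smaller of the two sets in the minimum, and \(\{u_\varepsilon \ne z^+_\varepsilon \} \cap P_{\delta ,\varepsilon }^+ = P_{\delta ,\varepsilon }^+ {\setminus } G^\varepsilon _{j_\varepsilon }\) since the \(z^\varepsilon _j\) are pairwise distinct. Hence,

$$\begin{aligned} {\mathcal {L}}^2\big (\{u_\varepsilon \ne z^+_\varepsilon \} \cap P_{\delta ,\varepsilon }^+\big )&= {\mathcal {L}}^2\big (P_{\delta ,\varepsilon }^+ {\setminus } G^\varepsilon _{j_\varepsilon }\big ) = \min \big \{ {\mathcal {L}}^2(G_{j_\varepsilon }^\varepsilon \cap P_{\delta ,\varepsilon }^+), \, {\mathcal {L}}^2(P_{\delta ,\varepsilon }^+{\setminus } G_{j_\varepsilon }^\varepsilon ) \big \} \\&\leqq \sum \limits _{j=0}^\infty \min \big \{ {\mathcal {L}}^2(G_j^\varepsilon \cap P_{\delta ,\varepsilon }^+), \, {\mathcal {L}}^2(P_{\delta ,\varepsilon }^+{\setminus } G_j^\varepsilon ) \big \} \leqq C E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned}$$

Concerning the contradiction argument, note that \({\mathcal {L}}^2(P_{\delta ,\varepsilon }^+)\) is bounded from below by a positive constant depending only on \(\delta \) as soon as \(\varepsilon \ll \delta \), whereas the right-hand side of (4.22) tends to zero by (4.18) applied with \(\delta /2\) in place of \(\delta \).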

To conclude this step, we note that the convergence \( \lim \nolimits _{\varepsilon \rightarrow 0} \int _{Q^\nu } |u_\varepsilon (x + y_\varepsilon )- u^{\nu }_{z^+,z^-}(x)| \, \mathrm{d}x = 0\) along with (4.23) also yields \(z^+_\varepsilon \rightarrow z^+\).

The rest of the proof is divided into two cases: (a) \(z_\varepsilon ^+ \ne {\mathbf {0}}\) and (b) \(z_\varepsilon ^+ = {\mathbf {0}}\), that is, \(X_\varepsilon \) converges to a lattice in the upper half of the cube or there is vacuum. We perform the proof for case (a). At the end of the proof (Step 7), we indicate the necessary changes to treat case (b).

Step 3: Cardinality estimate. We prove that there exists \(C>0\) such that

$$\begin{aligned} \varepsilon ^2 \#\left( \big ( \varepsilon {\mathscr {L}}(z^\pm _\varepsilon ) \triangle X_\varepsilon \big ) \cap P_{\delta ,\varepsilon }^\pm \right) \leqq CE_\varepsilon \big (X_\varepsilon , Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ), \end{aligned}$$
(4.24)

where here and in what follows \(\triangle \) denotes the symmetric difference of sets. First, consider some \(x \in (\varepsilon {\mathscr {L}}(z^+_\varepsilon ) {\setminus } X_\varepsilon ) \cap P_{\delta ,\varepsilon }^+\). Then, by the definition of \(u_\varepsilon \) in (2.15) we get

$$\begin{aligned} u_\varepsilon (y) \ne z^+_\varepsilon \text { for all } y \in B_{\varepsilon /4}(x). \end{aligned}$$
(4.25)

Indeed, otherwise we would find \(y \in B_{\varepsilon /4}(x)\) and \(x' \in X_\varepsilon \cap B_{\varepsilon /\sqrt{3}}(y)\) with \(\# {\mathcal {N}}_\varepsilon (x') = 6\) and \(\lbrace x' \rbrace \cup {\mathcal {N}}_\varepsilon (x') \subset \varepsilon {\mathscr {L}}(z^+_\varepsilon )\). The latter follows from the fact that \( V_\varepsilon ^{z^+_\varepsilon }(x') \subset B_{\varepsilon /\sqrt{3}}(x') \). In particular, we have \(x' \in \varepsilon {\mathscr {L}}(z^+_\varepsilon )\) and \(|x-x'| \leqq |x-y| + |y-x'| \leqq \varepsilon /4 + \varepsilon /\sqrt{3} < \varepsilon \). This, however, is impossible, since \(|x_1-x_2| \geqq \varepsilon \) for all \(x_1,x_2\in \varepsilon {\mathscr {L}}(z^+_\varepsilon )\), \(x_1 \ne x_2\).

On the other hand, if there exists \(x \in (X_\varepsilon {\setminus } \varepsilon {\mathscr {L}}(z^+_\varepsilon )) \cap P_{\delta ,\varepsilon }^+ \), then we find \(x_0 \in \varepsilon {\mathscr {L}}(z^+_\varepsilon ) \cap P_{\delta ,\varepsilon }^+ \) with \(|x_0-x| < \varepsilon \). Clearly, \(x_0 \notin X_\varepsilon \) by (1.1) and the fact that \(E_\varepsilon (X_\varepsilon )<+\infty \). Repeating the reasoning in (4.25) we find that

$$\begin{aligned} u_\varepsilon (y) \ne z^+_\varepsilon \text { for all } y \in B_{\varepsilon /4}(x_0). \end{aligned}$$
(4.26)

Note that, in this procedure, \(x_0\) can be chosen for at most six \(x\in X_\varepsilon \) independently of \(\varepsilon \) since \(\#(X_\varepsilon \cap B_\varepsilon (x_0))\leqq 6\) due to \(E_\varepsilon (X_\varepsilon )<+\infty \). Using (4.19), \({\mathcal {L}}^2(B_{\varepsilon /4}(x) \cap P_{\delta ,\varepsilon }^+ ) \geqq c\varepsilon ^2 \) for all \(x \in \varepsilon {\mathscr {L}}(z^+_\varepsilon ) \cap P_{\delta ,\varepsilon }^+\), and (4.25)–(4.26) we conclude

$$\begin{aligned} \varepsilon ^2 \#\left( \big ( \varepsilon {\mathscr {L}}(z^+_\varepsilon ) \triangle X_\varepsilon \big ) \cap P_{\delta ,\varepsilon }^+ \right) \leqq C {\mathcal {L}}^2 \big ( \{u_\varepsilon \ne z^+_\varepsilon \} \cap P_{\delta ,\varepsilon }^+\big ) \leqq CE_\varepsilon \big (X_\varepsilon , Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned}$$
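For the reader's convenience, the counting behind the first inequality can be sketched with explicit constants. Here \(c>0\) denotes the constant from the bound \({\mathcal {L}}^2(B_{\varepsilon /4}(x) \cap P_{\delta ,\varepsilon }^+ ) \geqq c\varepsilon ^2\), and the factor 7 is our bookkeeping, not a sharp constant. The balls \(B_{\varepsilon /4}(x)\), \(x \in \varepsilon {\mathscr {L}}(z^+_\varepsilon )\), are pairwise disjoint because distinct lattice points have distance at least \(\varepsilon \). Thus, (4.25) and (4.26) give

$$\begin{aligned} c\, \varepsilon ^2\, \#\big ( (\varepsilon {\mathscr {L}}(z^+_\varepsilon ) {\setminus } X_\varepsilon ) \cap P_{\delta ,\varepsilon }^+ \big ) \leqq \sum \limits _{x \in (\varepsilon {\mathscr {L}}(z^+_\varepsilon ) {\setminus } X_\varepsilon ) \cap P_{\delta ,\varepsilon }^+} {\mathcal {L}}^2\big (B_{\varepsilon /4}(x) \cap P_{\delta ,\varepsilon }^+\big ) \leqq {\mathcal {L}}^2 \big ( \{u_\varepsilon \ne z^+_\varepsilon \} \cap P_{\delta ,\varepsilon }^+\big ). \end{aligned}$$

Since each \(x_0 \in (\varepsilon {\mathscr {L}}(z^+_\varepsilon ) {\setminus } X_\varepsilon ) \cap P_{\delta ,\varepsilon }^+\) is selected for at most six points \(x \in X_\varepsilon {\setminus } \varepsilon {\mathscr {L}}(z^+_\varepsilon )\), we further have

$$\begin{aligned} \#\big ( (\varepsilon {\mathscr {L}}(z^+_\varepsilon ) \triangle X_\varepsilon ) \cap P_{\delta ,\varepsilon }^+ \big ) \leqq 7\, \#\big ( (\varepsilon {\mathscr {L}}(z^+_\varepsilon ) {\setminus } X_\varepsilon ) \cap P_{\delta ,\varepsilon }^+ \big ), \end{aligned}$$

so that the first inequality holds with \(C = 7/c\).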

Step 4: Cut-off construction. In this step, we construct a new configuration \(Y_\varepsilon ^+ \subset \mathbb R^2\) such that \( Y_\varepsilon ^+ = \varepsilon {\mathscr {L}}(z_\varepsilon ^+) \text { on } \partial ^+_\varepsilon Q^\nu \), see (2.7). This construction changes the configuration in the upper half-cube \(Q^{\nu ,+}\). Step 5 then shows that the energy of \(Y^+_\varepsilon \) is asymptotically equal to the one of \(X_\varepsilon \). The procedure can then be repeated on the lower half-cube. We defer this to Step 6 below.

Set \(N_\varepsilon = \left\lfloor \frac{\delta }{6\varepsilon }\right\rfloor \). (Here and in the sequel, we do not highlight the dependence on \(\delta \) to save notation.) For \(k\in \lbrace 0,\ldots ,N_\varepsilon +1\rbrace \) we let \(r_k = 1-\delta +3k\varepsilon \) and define the layers

$$\begin{aligned} S_k^\varepsilon = \big (Q^{\nu ,+}_{r_k} {\setminus } Q^{\nu ,+}_{r_{k-1}} \big ) {\setminus } R^\nu _{1,\delta }. \end{aligned}$$
(4.27)

For \(k\in \lbrace 1,\ldots ,N_\varepsilon \rbrace \) we also define the “thickened layers” \(L_{k}^\varepsilon =S_{k-1}^\varepsilon \cup S_{k}^\varepsilon \cup S_{k+1}^\varepsilon \). Our goal is to perform a transition to the lattice \(\varepsilon {\mathscr {L}}(z^+_\varepsilon )\) on one of these layers. To this end, we choose a convenient layer by an averaging argument: by (4.24) there exists \(k_\varepsilon \in \{1,\ldots ,N_\varepsilon \}\) such that

$$\begin{aligned} \#\left( ( \varepsilon {\mathscr {L}}(z_\varepsilon ^+) \triangle X_\varepsilon ) \cap L_{k_\varepsilon }^\varepsilon \right)&\leqq \frac{1}{N_\varepsilon }\sum \limits _{k=1}^{N_\varepsilon }\#\left( ( \varepsilon {\mathscr {L}}(z_\varepsilon ^+) \triangle X_\varepsilon ) \cap L_{k}^\varepsilon \right) \nonumber \\&\leqq \frac{3}{N_\varepsilon }\,\#\left( ( \varepsilon {\mathscr {L}}(z_\varepsilon ^+) \triangle X_\varepsilon ) \cap P_{\delta ,\varepsilon }^+ \right) \leqq \frac{C}{\varepsilon \delta } E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned}$$
(4.28)

Here, we used \(L_{k}^\varepsilon \subset P_{\delta ,\varepsilon }^+\) for all k and \(\varepsilon \delta \leqq CN_\varepsilon \varepsilon ^2\). The factor 3 is due to the fact that we count each strip \(S_k^\varepsilon \) at most three times. Set \(D^\varepsilon :=Q^\nu _{r_{k_\varepsilon -1}} \cup ( Q^{\nu ,-} {\setminus } R^\nu _{1,\delta })\). We now define \(Y_\varepsilon ^+\) by

$$\begin{aligned} Y_\varepsilon ^+ = {\left\{ \begin{array}{ll} \varepsilon {\mathscr {L}}(z^+_\varepsilon ) &{}\text {in }\,\, (P_{\delta ,\varepsilon }^+ {\setminus } Q^\nu _{r_{k_\varepsilon }}) \cup \partial ^+_\varepsilon Q^\nu ,\\ \emptyset &{} \text {in }\,\, (R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}}) {\setminus } (\partial ^+_\varepsilon Q^\nu \cup \partial ^-_\varepsilon Q^\nu ), \\ X_\varepsilon \cap \varepsilon {\mathscr {L}}(z^+_\varepsilon ) &{}\text {in }\,\, S_{k_\varepsilon }^\varepsilon ,\\ X_\varepsilon &{}\text {in }\,\, D^\varepsilon \cup \partial ^-_\varepsilon Q^\nu . \end{array}\right. } \end{aligned}$$
(4.29)

See Fig. 7 for an illustration of the different regions. We briefly explain the definition. In \(D^\varepsilon \cup \partial ^-_\varepsilon Q^\nu \), the configuration remains unchanged, and near the boundary of the upper half-cube it coincides with the lattice \(\varepsilon {\mathscr {L}}(z^+_\varepsilon )\). In \(S_{k_\varepsilon }^\varepsilon \), we use the intersection \(X_\varepsilon \cap \varepsilon {\mathscr {L}}(z^+_\varepsilon )\). In this sense, \(S_{k_\varepsilon }^\varepsilon \) can be understood as a transition layer. Eventually, small regions near the boundary close to the interface \(\partial Q^{\nu ,+} \cap \partial Q^{\nu ,-}\) do not contain atoms. This is convenient since in this region the energy of the original configuration possibly does not vanish. Note that the latter ensures that \(|y_1-y_2| \geqq \varepsilon \) for all \(y_1,y_2 \in Y_\varepsilon ^+, y_1 \ne y_2,\) and therefore

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+) < + \infty . \end{aligned}$$
(4.30)

Finally, we point out that \(Y_\varepsilon ^+ \not \subset Q^\nu \) due to the definition of \(\partial _\varepsilon ^\pm Q^\nu \) in (2.7), see also Fig. 3.
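The constants in the averaging step (4.28) can be made explicit. The following is a short sketch under the assumption \(\varepsilon \leqq \delta /12\), which is compatible with \(\varepsilon \ll \delta \). Since the sets \(S_k^\varepsilon \) are pairwise disjoint and each of them is contained in at most three thickened layers, for every finite set \(F \subset {\mathbb {R}}^2\) we have

$$\begin{aligned} \sum \limits _{k=1}^{N_\varepsilon } \#( F \cap L_k^\varepsilon ) \leqq 3 \sum \limits _{k=0}^{N_\varepsilon +1} \#( F \cap S_k^\varepsilon ) \leqq 3\, \#( F \cap P_{\delta ,\varepsilon }^+). \end{aligned}$$

Moreover,

$$\begin{aligned} N_\varepsilon = \left\lfloor \frac{\delta }{6\varepsilon }\right\rfloor \geqq \frac{\delta }{6\varepsilon } - 1 \geqq \frac{\delta }{12\varepsilon }, \qquad \text {and thus} \qquad \frac{3}{N_\varepsilon } \leqq \frac{36\, \varepsilon }{\delta }. \end{aligned}$$

Applying this with \(F = \varepsilon {\mathscr {L}}(z_\varepsilon ^+) \triangle X_\varepsilon \) and inserting (4.24) in the form \(\#(F \cap P_{\delta ,\varepsilon }^+) \leqq C\varepsilon ^{-2} E_\varepsilon (X_\varepsilon , Q^\nu {\setminus } R^\nu _{1,\delta /2})\), we find

$$\begin{aligned} \frac{3}{N_\varepsilon }\,\#( F \cap P_{\delta ,\varepsilon }^+ ) \leqq \frac{36\, \varepsilon }{\delta } \cdot \frac{C}{\varepsilon ^2}\, E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ) = \frac{36\, C}{\varepsilon \delta }\, E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned}$$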

Fig. 7. The different regions for \(Y_\varepsilon ^+\) inside \(Q^\nu \): dark gray region \(D^\varepsilon \cup \partial ^-_\varepsilon Q^\nu \), gray region \((R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}}){\setminus } (\partial ^+_\varepsilon Q^\nu \cup \partial ^-_\varepsilon Q^\nu )\), light gray region \(S_{k_\varepsilon }^\varepsilon \), and white region \( (P_{\delta ,\varepsilon }^+ {\setminus } Q^\nu _{r_{k_\varepsilon }}) \cup \partial ^+_\varepsilon Q^\nu \). The two dashed lines enclose the region \(R^\nu _{1,\delta }\).

Step 5: Energy estimate. In this step we show that the energy of the configuration constructed in Step 4 is asymptotically controlled by the original energy, that is,

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0} E_\varepsilon (Y_\varepsilon ^+,Q^\nu ) \leqq \liminf _{\varepsilon \rightarrow 0} E_\varepsilon (X_\varepsilon ,Q^\nu ) + C\delta \end{aligned}$$
(4.31)

for some universal \(C>0\). In order to obtain (4.31), we distinguish three regions:

$$\begin{aligned} A^\varepsilon _1= \overline{(R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}})_\varepsilon }, \ \ \ \ \ \ A^\varepsilon _2= \overline{(S_{k_\varepsilon }^\varepsilon )_\varepsilon }{\setminus } A^\varepsilon _1, \ \ \ \ \ \ A^\varepsilon _3 = Q^\nu {\setminus } (A^\varepsilon _1 \cup A^\varepsilon _2). \end{aligned}$$
(4.32)

Energy estimate on \(A^\varepsilon _1\): We claim that there exists a universal \(C>0\) such that

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _1) \leqq C\delta . \end{aligned}$$
(4.33)

In fact, due to (4.29), we have \(Y_\varepsilon ^+ \cap (R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}})= ( \varepsilon {\mathscr {L}}(z^+_\varepsilon ) \cap R^\nu _{1,\delta } \cap \partial ^+_\varepsilon Q^\nu ) \cup ( X_\varepsilon \cap R^\nu _{1,\delta } \cap \partial ^-_\varepsilon Q^\nu )\). As \({\mathcal {L}}^2((R^\nu _{1,\delta } \cap \partial ^\pm _\varepsilon Q^\nu )_\varepsilon ) \leqq C \delta \varepsilon \), see (2.7) and (4.3), by Lemma 3.1(v) we get

$$\begin{aligned} \#\big (Y_\varepsilon ^+ \cap (R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}})\big ) \leqq C\delta /\varepsilon . \end{aligned}$$
(4.34)

Here, Lemma 3.1 is applicable by (4.30). Additionally, we note that \(R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}}\) consists of two rectangles and we have \({\mathcal {H}}^1(\partial (R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}})) \leqq C\delta \). Hence, by Lemma 3.1(v) we obtain

$$\begin{aligned} \# \big ((A^\varepsilon _1\cap Y_\varepsilon ^+) {\setminus } ( R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}}) \big )&\leqq C\varepsilon ^{-2} {\mathcal {L}}^2\Big (\big (\overline{(R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}})_\varepsilon }{\setminus } ( R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}})\big )_\varepsilon \Big )\\ {}&\leqq C\varepsilon ^{-1} {\mathcal {H}}^1\big (\partial ( R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}})\big ) \leqq C\delta /\varepsilon . \end{aligned}$$

This along with (4.34) yields \(\#(A^\varepsilon _1\cap Y_\varepsilon ^+) \leqq C\delta /\varepsilon \), and therefore (4.33) follows by (2.3).
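The passage from the cardinality bound to (4.33) is a one-line computation. We sketch it under the assumption that (2.3) expresses the energy as \(E_\varepsilon (Y,A) = \frac{1}{2}\sum _{x \in Y \cap A} \varepsilon \,(6-\#{\mathcal {N}}_{\varepsilon ,Y}(x))\), which is the form suggested by (4.37). Each atom then contributes at most \(3\varepsilon \) to the energy, and therefore

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _1) = \frac{1}{2} \sum \limits _{x \in Y_\varepsilon ^+ \cap A^\varepsilon _1} \varepsilon \big (6-\#{\mathcal {N}}_{\varepsilon ,Y}(x)\big ) \leqq 3\varepsilon \, \#(A^\varepsilon _1\cap Y_\varepsilon ^+) \leqq 3\varepsilon \cdot \frac{C\delta }{\varepsilon } = 3C\delta . \end{aligned}$$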

Energy estimate on \(A^\varepsilon _2\): We prove that there exists a universal \(C>0\) such that

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _2) \leqq (1+C /\delta ) \, E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned}$$
(4.35)

First, the definition of \( L_{k_\varepsilon }^\varepsilon \) below (4.27) implies \((A^\varepsilon _2)_\varepsilon \subset L_{k_\varepsilon }^\varepsilon \). For \(x\in Y_\varepsilon ^+\), we denote the neighborhood of x with respect to \(Y_\varepsilon ^+\) by \({\mathcal {N}}_{\varepsilon ,Y}(x)\), cf. (2.1). We claim that

$$\begin{aligned} \#{\mathcal {N}}_{\varepsilon ,Y}(x) \geqq \#{\mathcal {N}}_\varepsilon (x) -6\, \#\big (\overline{B_\varepsilon (x)} \cap (X_\varepsilon {\setminus }\varepsilon {\mathscr {L}}(z^+_\varepsilon ))\big ) \ \ \ \text { for all } x\in X_\varepsilon \cap Y_\varepsilon ^+ \cap A^\varepsilon _2. \end{aligned}$$
(4.36)

In fact, if \(\overline{B_\varepsilon (x)} \cap (X_\varepsilon {\setminus }\varepsilon {\mathscr {L}}(z^+_\varepsilon ))\ne \emptyset \), the right-hand side is nonpositive since \(\#{\mathcal {N}}_\varepsilon (x)\leqq 6\), see (2.2). Since \(\#{\mathcal {N}}_{\varepsilon ,Y}(x) \geqq 0\), (4.36) follows in this case. On the other hand, if \(\overline{B_\varepsilon (x)} \cap (X_\varepsilon {\setminus }\varepsilon {\mathscr {L}}(z^+_\varepsilon ))= \emptyset \), by (4.29), we may have only increased the cardinality of the neighborhood by adding atoms in \(\varepsilon {\mathscr {L}}(z_\varepsilon ^+) {\setminus } X_\varepsilon \), that is, \(\#{\mathcal {N}}_{\varepsilon ,Y}(x) \geqq \#{\mathcal {N}}_\varepsilon (x)\). This again yields (4.36).

We split the sum into \(X_\varepsilon \cap Y_\varepsilon ^+\) and \(Y_\varepsilon ^+ {\setminus } X_\varepsilon \). By using (2.3), \(A^\varepsilon _2 \subset L_{k_\varepsilon }^\varepsilon \), Lemma 3.1(iii), and (4.36) we obtain

$$\begin{aligned}&E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _2) \leqq C\varepsilon \#\, \big \{x \in A^\varepsilon _2 \cap (Y_\varepsilon ^+ {\setminus } X_\varepsilon )\big \} + \frac{1}{2} \underset{x \in A^\varepsilon _2}{\sum _{x \in Y_\varepsilon ^+ \cap X_\varepsilon }}\varepsilon \big (6-\#{\mathcal {N}}_{\varepsilon ,Y}(x)\big ) \nonumber \\&\quad \leqq C\varepsilon \#\big \{x \in (Y_\varepsilon ^+ \cap L_{k_\varepsilon }^\varepsilon ) {\setminus } X_\varepsilon \big \} + 3 \varepsilon \quad \underset{x \in A^\varepsilon _2}{\sum _{x \in Y_\varepsilon ^+ \cap X_\varepsilon }} \# \big (\overline{B_\varepsilon (x)} \cap (X_\varepsilon {\setminus }\varepsilon {\mathscr {L}}(z_\varepsilon ^+))\big )\nonumber \\&\qquad + E_\varepsilon (X_\varepsilon , L_{k_\varepsilon }^\varepsilon ). \end{aligned}$$
(4.37)

Note, by (4.29), that \(Y_\varepsilon ^+ \subset \varepsilon {\mathscr {L}}(z^+_\varepsilon ) \cup X_\varepsilon \) in \(L_{k_\varepsilon }^\varepsilon \). Therefore, in view of (4.28), we obtain

$$\begin{aligned} \#\big \{x \in (Y_\varepsilon ^+ \cap L_{k_\varepsilon }^\varepsilon ) {\setminus } X_\varepsilon \big \} \leqq \#\big \{x \in (\varepsilon {\mathscr {L}}(z^+_\varepsilon ) \triangle X_\varepsilon ) \cap L_{k_\varepsilon }^\varepsilon \big \} \leqq \frac{C}{\varepsilon \delta }E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned}$$
(4.38)

Exploiting (4.28) once more, we get

$$\begin{aligned} \begin{aligned} \sum \limits _{x \in Y_\varepsilon ^+ \cap X_\varepsilon \cap A^\varepsilon _2} \#\big (\overline{B_\varepsilon (x)} \cap (X_\varepsilon {\setminus }\varepsilon {\mathscr {L}}(z^+_\varepsilon ))\big )&\leqq C \#\big \{x \in (\varepsilon {\mathscr {L}}(z^+_\varepsilon ) \triangle X_\varepsilon ) \cap L_{k_\varepsilon }^\varepsilon \big \}\\&\leqq \frac{C}{\varepsilon \delta }E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned} \end{aligned}$$
(4.39)

Here, the first inequality holds because \(|x_1-x_2| \geqq \varepsilon \) for \(x_1,x_2 \in X_\varepsilon \), \(x_1\ne x_2\), and \(\overline{B_\varepsilon (x)} \subset L_{k_\varepsilon }^\varepsilon \) for all \(x \in A^\varepsilon _2\). Hence, every point in \( (X_\varepsilon {\setminus }\varepsilon {\mathscr {L}}(z^+_\varepsilon )) \cap L_{k_\varepsilon }^\varepsilon \) is accounted for at most seven times in the sum. Now, using (4.37)–(4.39), \(L_{k_\varepsilon }^\varepsilon \subset Q^\nu {\setminus } R^\nu _{1,\delta }\), and Lemma 3.1(iii), we obtain (4.35).
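To make the final combination transparent, we sketch how the three terms in (4.37) add up. As usual, \(C\) denotes a generic universal constant whose value may change from line to line, and we abbreviate \(E := E_\varepsilon (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2})\), which is our shorthand. Inserting (4.38) and (4.39) into (4.37) and using \(E_\varepsilon (X_\varepsilon , L_{k_\varepsilon }^\varepsilon ) \leqq E_\varepsilon (X_\varepsilon , Q^\nu {\setminus } R^\nu _{1,\delta }) \leqq E\), we get

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _2) \leqq C\varepsilon \cdot \frac{C}{\varepsilon \delta }\, E + 3\varepsilon \cdot \frac{C}{\varepsilon \delta }\, E + E \leqq \Big (1 + \frac{C}{\delta }\Big ) E. \end{aligned}$$

Note that the factors \(\varepsilon \) and \(\varepsilon ^{-1}\) cancel exactly, which is the reason why the layer had to be chosen by the averaging argument in (4.28).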

Energy estimate on \(A^\varepsilon _3\): We claim that

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _3) \leqq E_\varepsilon (X_\varepsilon ,Q^\nu ). \end{aligned}$$
(4.40)

Recalling (4.32) we get that each \(x \in A^\varepsilon _3\cap Y_\varepsilon ^+\) lies either in \(T^\varepsilon := (P_{\delta ,\varepsilon }^+ {\setminus } Q^\nu _{r_{k_\varepsilon }}) \cup (\partial ^+_\varepsilon Q^\nu {\setminus } R^\nu _{1,\delta })\) or in \(D^\varepsilon \). If \(x \in A^\varepsilon _3\cap Y_\varepsilon ^+ \cap T^\varepsilon \), then also \(\overline{B_\varepsilon (x)} \subset T^\varepsilon \). (Here, we use the definition of \(A^\varepsilon _1\), \(A^\varepsilon _2\) and (2.7).) Then, (4.29) implies \(\#{\mathcal {N}}_{\varepsilon ,Y}(x) =6\). On the other hand, if \(x \in A^\varepsilon _3\cap Y_\varepsilon ^+ \cap D^\varepsilon \), then \(X_\varepsilon \cap \overline{B_\varepsilon (x)} = Y^+_\varepsilon \cap \overline{B_\varepsilon (x)}\), which yields \({\mathcal {N}}_{\varepsilon ,Y}(x) = {\mathcal {N}}_\varepsilon (x)\). Thus, by (2.3) and Lemma 3.1(iii),(iv) we obtain (4.40). In fact, we get

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _3)&= E_\varepsilon \big (Y_\varepsilon ^+,A^\varepsilon _3 \cap T^\varepsilon \big ) + E_\varepsilon \big (Y_\varepsilon ^+,A^\varepsilon _3 \cap D^\varepsilon \big )\\&=E_\varepsilon \big (Y_\varepsilon ^+,A^\varepsilon _3 \cap D^\varepsilon \big ) \leqq E_\varepsilon (X_\varepsilon ,Q^\nu ). \end{aligned}$$

To conclude this step of the proof, it suffices to recall that by Lemma 3.1(iv)

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,Q^\nu )=E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _1) +E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _2)+E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _3). \end{aligned}$$

Then we obtain (4.31) by (4.18), (4.33), (4.35), and (4.40).
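In more detail, adding (4.33), (4.35), and (4.40) gives, for every \(\varepsilon \ll \delta \),

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,Q^\nu ) \leqq C\delta + \Big (1+\frac{C}{\delta }\Big ) E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ) + E_\varepsilon (X_\varepsilon ,Q^\nu ). \end{aligned}$$

For fixed \(\delta \), the middle term vanishes as \(\varepsilon \rightarrow 0\) by (4.18) applied with \(\delta /2\) in place of \(\delta \), even though its prefactor blows up as \(\delta \rightarrow 0\). Taking the \(\liminf \) in \(\varepsilon \) then yields (4.31).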

Step 6: Conclusion. By repeating the cut-off construction in Step 4 on \(Q^{\nu ,-}\) for \(z_\varepsilon ^-\), we obtain a configuration \(Y_\varepsilon \) such that \(Y_\varepsilon = \varepsilon {\mathscr {L}}(z_\varepsilon ^\pm )\) on \(\partial ^\pm _\varepsilon Q^\nu (y_\varepsilon )\) and

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0} E_\varepsilon \big (Y_\varepsilon ,Q^\nu (y_\varepsilon )\big ) \leqq \liminf _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,Q^\nu (y_\varepsilon )\big ) + C\delta \end{aligned}$$
(4.41)

by (4.31), where we reinclude the center \(y_\varepsilon \) in the notation for clarification. Since \(z^\pm _\varepsilon \rightarrow z^\pm \) by Step 2, we observe by the definition of \(\Phi \) in (4.1) that

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0} E_\varepsilon (Y_\varepsilon ,Q^\nu (y_\varepsilon ))\geqq \Phi (z^+,z^-,\nu ). \end{aligned}$$

By using (4.17), (4.41) and by passing to \(\delta \rightarrow 0\), we obtain the statement of the lemma.
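Written as a single chain, the argument reads as follows, where \(C\) is the universal constant from (4.41):

$$\begin{aligned} \Phi (z^+,z^-,\nu ) \leqq \liminf _{\varepsilon \rightarrow 0} E_\varepsilon \big (Y_\varepsilon ,Q^\nu (y_\varepsilon )\big ) \leqq \lim _{\varepsilon \rightarrow 0} E_\varepsilon \big (X_\varepsilon ,Q^\nu (y_\varepsilon )\big ) + C\delta = \psi (z^+,z^-,\nu ) + C\delta . \end{aligned}$$

Since \(\delta \in (0,1)\) was arbitrary, this gives \(\Phi (z^+,z^-,\nu ) \leqq \psi (z^+,z^-,\nu )\), which is presumably the inequality asserted in the lemma.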

Step 7: Adaptions in \(\mathrm {(b)}\). To conclude the proof of the lemma, it remains to describe Steps 3–5 in the case of vacuum, that is, \(z^+_\varepsilon = {\mathbf {0}}\).

Step 3 for case \(\mathrm {(b)}\): Cardinality estimate. We prove that

$$\begin{aligned} \varepsilon ^2\#(X_\varepsilon \cap P_{\delta ,\varepsilon }^+) \leqq CE_\varepsilon (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1, \delta /2 }) \end{aligned}$$
(4.42)

for a universal \(C>0\). In fact, if \(x \in X_\varepsilon \) has \(\#{\mathcal {N}}_\varepsilon (x)=6\), then \(u_\varepsilon \ne {\mathbf {0}}\) on \(B_{\varepsilon /2}(x)\) by (2.15) and the fact that \(B_{\varepsilon /2}(x) \subset V_\varepsilon ^{z(x)}(x)\). Also note that \(B_{\varepsilon /2}(x) \cap B_{\varepsilon /2}(y) =\emptyset \) for \(x,y\in X_\varepsilon \), \(x\ne y\). Thus, by (2.3), (4.19) (with \(z^+_\varepsilon = {\mathbf {0}}\)), and Lemma 3.1(iii) we get

$$\begin{aligned}&\varepsilon ^2\#(X_\varepsilon \cap P_{\delta ,\varepsilon }^+) \leqq \varepsilon ^2\#\{x \in X_\varepsilon \cap P_{\delta ,\varepsilon }^+ : \#{\mathcal {N}}_\varepsilon (x)=6\} + \varepsilon ^2\sum \limits _{x \in X_\varepsilon \cap P_{\delta ,\varepsilon }^+}(6-\#{\mathcal {N}}_\varepsilon (x)) \\&\quad \leqq C{\mathcal {L}}^2\big (\{u_\varepsilon \ne {\mathbf {0}}\}\cap P_{\delta ,\varepsilon }^+\big )+ 2 \varepsilon E_\varepsilon \big (X_\varepsilon , Q^\nu {\setminus } R^\nu _{1,\delta /2} \big ) \leqq CE_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2} \big ), \end{aligned}$$

where we again used that \(P_{\delta ,\varepsilon }^+ \subset Q^\nu {\setminus } R^\nu _{1,\delta /2}\). This concludes Step 3 in case (b).
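The two terms in the estimate above can be traced back as follows. This is a sketch which assumes that (2.3) has the form \(E_\varepsilon (X,A) = \frac{1}{2}\sum _{x \in X \cap A} \varepsilon \,(6-\#{\mathcal {N}}_{\varepsilon }(x))\), as suggested by (4.37), and that \({\mathcal {L}}^2(B_{\varepsilon /2}(x) \cap P_{\delta ,\varepsilon }^+) \geqq c\varepsilon ^2\) for \(x \in P_{\delta ,\varepsilon }^+\). Every atom either has six neighbors or satisfies \(6-\#{\mathcal {N}}_\varepsilon (x) \geqq 1\), which explains the splitting. For the first term, the disjointness of the balls \(B_{\varepsilon /2}(x)\) yields

$$\begin{aligned} c\, \varepsilon ^2\#\{x \in X_\varepsilon \cap P_{\delta ,\varepsilon }^+ : \#{\mathcal {N}}_\varepsilon (x)=6\} \leqq {\mathcal {L}}^2\big (\{u_\varepsilon \ne {\mathbf {0}}\}\cap P_{\delta ,\varepsilon }^+\big ), \end{aligned}$$

while for the second term we have

$$\begin{aligned} \varepsilon ^2\sum \limits _{x \in X_\varepsilon \cap P_{\delta ,\varepsilon }^+}\big (6-\#{\mathcal {N}}_\varepsilon (x)\big ) = 2\varepsilon \cdot \frac{1}{2}\sum \limits _{x \in X_\varepsilon \cap P_{\delta ,\varepsilon }^+}\varepsilon \big (6-\#{\mathcal {N}}_\varepsilon (x)\big ) = 2\varepsilon \, E_\varepsilon (X_\varepsilon , P_{\delta ,\varepsilon }^+). \end{aligned}$$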

Step 4 for case \(\mathrm {(b)}\): Cut-off construction. We now explain the construction of a new configuration \(Y^+_\varepsilon \) such that \( Y_\varepsilon ^+ = {\mathbf {0}} \text { on } \partial ^+_\varepsilon Q^\nu \). Again set \(N_\varepsilon = \left\lfloor \frac{\delta }{6\varepsilon }\right\rfloor \) and define \(S_k^\varepsilon \) as in (4.27), as well as \(L_{k}^\varepsilon =S_{k-1}^\varepsilon \cup S_{k}^\varepsilon \cup S_{k+1}^\varepsilon \). Similar to (4.28), by averaging over k and using (4.42), there exists \(k_\varepsilon \in \{1,\ldots ,N_\varepsilon \}\) such that

$$\begin{aligned} \begin{aligned} \#( X_\varepsilon \cap L_{k_\varepsilon }^\varepsilon )&\leqq \frac{1}{N_\varepsilon }\sum \limits _{k=1}^{N_\varepsilon }\#( X_\varepsilon \cap L_{k}^\varepsilon ) \leqq \frac{3}{N_\varepsilon }\#(X_\varepsilon \cap P_{\delta ,\varepsilon }^+) \leqq \frac{C}{\varepsilon \delta } E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ), \end{aligned} \end{aligned}$$
(4.43)

where we again use that each strip \(S_k^\varepsilon \) is counted at most three times. We define

$$\begin{aligned} Y_\varepsilon ^+={\left\{ \begin{array}{ll} \emptyset &{} \text {in } \,\, \big ( (P_{\delta ,\varepsilon }^+ \cup R^\nu _{1,\delta }) {\setminus } (Q_{r_{k_\varepsilon }}^\nu \cup \partial _\varepsilon ^- Q^\nu ) \big ) \cup \partial _\varepsilon ^+ Q^\nu , \\ X_\varepsilon &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
(4.44)

Note that, since \(E_\varepsilon ( X_\varepsilon ) <+\infty \), we have that \(E_\varepsilon (Y_\varepsilon ^+) <+\infty \).

Step 5 for case \(\mathrm {(b)}\): Energy estimate. We again split the estimate into the three sets \(A^\varepsilon _1\), \(A^\varepsilon _2\), and \(A^\varepsilon _3\) defined in (4.32).

Energy estimate for \(A^\varepsilon _1\): We claim that there exists \(C>0\) such that

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _1) \leqq C\delta . \end{aligned}$$
(4.45)

In fact, due to (4.44), we have \(Y_\varepsilon ^+ \cap (R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon }})=X_\varepsilon \cap R^\nu _{1,\delta } \cap \partial _\varepsilon ^- Q^\nu \), where, similarly as in (4.34), \(\#(X_\varepsilon \cap R^\nu _{1,\delta } \cap \partial _\varepsilon ^- Q^\nu ) \leqq C \delta /\varepsilon \). As \(R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}}\) consists of two rectangles with \({\mathcal {H}}^1(\partial (R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}})) \leqq C\delta \) and \(Y_\varepsilon ^+\) satisfies \(E_\varepsilon (Y_\varepsilon ^+) <+\infty \), we obtain, by Lemma 3.1(v)

$$\begin{aligned} \#(A^\varepsilon _1 \cap Y_\varepsilon ^+)&= \#\big ( \big (A^\varepsilon _1 {\setminus } ( R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon }})\big ) \cap Y^+_\varepsilon \big ) + \# \big ( X_\varepsilon \cap R^\nu _{1,\delta } \cap \partial _\varepsilon ^- Q^\nu \big ) \\&\leqq C\varepsilon ^{-2} {\mathcal {L}}^2\big (\big (A^\varepsilon _1 {\setminus } (R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon }})\big )_\varepsilon \big ) + C\delta /\varepsilon \\&\leqq C\varepsilon ^{-1}{\mathcal {H}}^1\big (\partial (R^\nu _{1,\delta } {\setminus } Q^\nu _{r_{k_\varepsilon -1}})\big ) + C\delta /\varepsilon \leqq C\delta /\varepsilon . \end{aligned}$$

Then (4.45) follows by (2.3).

Energy estimate for \(A^\varepsilon _2\): We claim that there exists \(C>0\) such that

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _2) \leqq \frac{C}{\delta }E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned}$$
(4.46)

In fact, if \(x \in Y_\varepsilon ^+\cap A^\varepsilon _2\), then \(x \in X_\varepsilon \cap L_{k_\varepsilon }^\varepsilon \). Using (2.3) and (4.43) we obtain (4.46).
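Explicitly, and again assuming that by (2.3) each atom contributes at most \(3\varepsilon \) to the energy, the computation reads

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _2) \leqq 3\varepsilon \, \#(Y_\varepsilon ^+\cap A^\varepsilon _2) \leqq 3\varepsilon \, \#( X_\varepsilon \cap L_{k_\varepsilon }^\varepsilon ) \leqq 3\varepsilon \cdot \frac{C}{\varepsilon \delta }\, E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ) = \frac{3C}{\delta }\, E_\varepsilon \big (X_\varepsilon ,Q^\nu {\setminus } R^\nu _{1,\delta /2}\big ). \end{aligned}$$

In contrast to (4.35), no additive term \(E_\varepsilon (X_\varepsilon , L_{k_\varepsilon }^\varepsilon )\) is needed here, since all atoms in the transition layer are already controlled by the cardinality bound (4.43).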

Energy estimate for \(A^\varepsilon _3\): We observe that

$$\begin{aligned} E_\varepsilon (Y_\varepsilon ^+,A^\varepsilon _3) \leqq E_\varepsilon (X_\varepsilon ,Q^\nu ). \end{aligned}$$
(4.47)

Indeed, if \(x \in Y_\varepsilon ^+ \cap (Q^\nu {\setminus } (A^\varepsilon _1 \cup A^\varepsilon _2))\), then \({\mathcal {N}}_{\varepsilon ,Y}(x) = {\mathcal {N}}_\varepsilon (x)\), where the neighborhood of x with respect to \(Y_\varepsilon ^+\) is again denoted by \({\mathcal {N}}_{\varepsilon ,Y}(x)\). Therefore, (4.47) follows by (2.3) and Lemma 3.1(iii).

Summarizing, (4.45)–(4.47) and (4.18) yield

$$\begin{aligned} \liminf _{\varepsilon \rightarrow 0} E_\varepsilon (Y_\varepsilon ^+,Q^\nu ) \leqq \liminf _{\varepsilon \rightarrow 0} E_\varepsilon (X_\varepsilon ,Q^\nu ) + C\delta , \end{aligned}$$

which is the analog of (4.31). The rest of the proof (that is, Step 6) remains unchanged. \(\square \)

5 Reduction of the Problem to Subsets of Two Lattices

In the previous section, we have seen that the condition of \(L^1\)-convergence in the definition of \(\psi \) (see (3.5)) can be replaced by converging boundary values, see the definition of \(\Phi \) in (4.1). From now on, it will be convenient to express the problem with lattice spacing equal to 1. Recall (2.7) and observe that by Lemma 3.1 the cell formula for \(\Phi \) can be written as

$$\begin{aligned} \Phi (z^+,z^-,\nu )&= \min \Big \{ \liminf _{T\rightarrow +\infty } \frac{1}{T}\inf \Big \{E_1\big (X_T,Q^\nu _T(y_T)\big ):\, y_T \in \mathbb R^2, \nonumber \\&\qquad \qquad X_T = {\mathscr {L}}(z^\pm _T) \text { on } \partial ^\pm _1 Q_T^\nu (y_T) \Big \} :\lbrace z^\pm _T\rbrace _T \subset {\mathcal {Z}} \text { with } z^\pm _T \rightarrow z^\pm \Big \} \end{aligned}$$
(5.1)

for all \(z^+,z^- \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\). This section is devoted to a fundamental ingredient for the proof of the relation between \(\Phi \) and \(\varphi \), and for the properties of \(\varphi \), which will be addressed in Sections 6 and 7. We show that the minimization problem in (5.1) can be reduced to configurations that are subsets of two lattices only (or just one if either \(z^+={\mathbf {0}}\) or \(z^-={\mathbf {0}}\)). For the formulation of the lemma, we introduce two further notions: we say that a set \(Y \subset \mathbb R^2\) is connected if for each pair \(x,y \in Y\) there exists a chain \((v_1,\ldots ,v_n)\) with \(v_i \in Y\) for \(i \in \lbrace 1,\ldots ,n\rbrace \), \(v_1= x\), \(v_n= y\), and \(|v_{i+1} - v_i| = 1\) for \(i \in \lbrace 1,\ldots ,n-1\rbrace \). Moreover, given a configuration X and \(Y \subset X\), we define the boundary of Y inside \(Q^\nu _T(y)\) by

$$\begin{aligned} \partial Y = \{ x \in Y \cap Q^\nu _T(y) :\#({\mathcal {N}}(x) \cap Y) < 6\}. \end{aligned}$$
(5.2)

Lemma 5.1

(Reduction to subsets of two lattices) Let \(z^+,z^- \in {\mathcal {Z}}\), \(\nu \in {\mathbb {S}}^1\), \(y \in \mathbb R^2\), and \(T>0\). Let \(X \subset {\mathbb {R}}^2\) be a minimizer of

$$\begin{aligned} \min \Big \{E_1\big (X,Q^\nu _T(y)\big ):\ X = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1 Q_T^\nu (y) \Big \}. \end{aligned}$$
(5.3)

Then, it satisfies the following two properties:

  (i) (Subset of lattices) There holds \(X = X^+ \cup X^-\) on \(Q_T^\nu (y)\), where \(X^\pm \subset {\mathscr {L}}(z^\pm )\) and \(X^\pm \) is connected.

  (ii) (Structure of boundaries) The sets \(\partial X^+\) and \(\partial X^-\) defined in (5.2) are connected and satisfy \(\# {\mathcal {N}}(x) \leqq 5\) for all \(x \in \partial X^\pm \), as well as \(\max _{x,y \in \partial X^\pm } |x-y| \geqq T\).

Note that the minimum in (5.3) exists since \(E_1\) is lower semicontinuous, see (1.1) and (1.3), and the problem is finite dimensional. We also point out that \(X^+\cap X^- \ne \emptyset \) is possible, see for example Fig. 4, that is, the two grains described by \(X^+\) and \(X^-\) can have common atoms. Resolving this ambiguity by introducing a specific choice, the grain boundary and bonds connecting the two grains can be described in more detail.

Lemma 5.2

(Bonds between grain boundaries) Let \(X^\pm \) be the sets found in Lemma 5.1. There exist \(Y^\pm \) with \(X^\pm {\setminus } \partial X^\mp \subset Y^\pm \subset X^\pm \) such that

  (i) (Partition into grains) \(Y^+ \cup Y^- = X^+ \cup X^-\) and \(Y^+ \cap Y^- \cap Q_T^\nu (y) = \emptyset \).

  (ii) (Grain and bulk boundaries) \(\partial Y^\pm \subset \partial X^\pm \) and \(Y^\pm = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1 Q_T^\nu (y)\).

  (iii) (Neighborhood structure at grain boundary) It holds that

    $$\begin{aligned} \big | \sum \limits _{x \in \partial Y^\pm } \# ({\mathcal {N}}(x) \cap Y^\pm ) - 4 \# \partial Y^\pm \big | \leqq 2. \end{aligned}$$

We thus have that on average each boundary atom has four neighbors in the same grain. As it has at most five neighbors in the whole configuration, it has on average less than one bond connecting it to the other grain.
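As a sanity check of Lemma 5.2(iii), one can count neighbors along a perfectly flat grain boundary. The following minimal Python sketch is not part of the original argument. It assumes that the atoms of a single triangular lattice are stored in axial integer coordinates (i, j), standing for the point \(i\,(1,0) + j\,(1/2,\sqrt{3}/2)\), and it uses a hypothetical index window in place of \(Q^\nu _T(y)\). The names `NEIGHBOR_OFFSETS`, `grain`, and `window` are illustrative only.

```python
# Axial coordinates (i, j) represent the lattice point i*(1, 0) + j*(1/2, sqrt(3)/2).
# The six offsets below are the unit-distance neighbors in the triangular lattice.
NEIGHBOR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def same_grain_neighbors(atom, grain):
    """Number of nearest neighbors of `atom` that belong to `grain`."""
    i, j = atom
    return sum((i + di, j + dj) in grain for di, dj in NEIGHBOR_OFFSETS)


# Assumption: a grain filling four lattice rows j = 0..3, with a flat lower edge.
grain = {(i, j) for i in range(-10, 11) for j in range(0, 4)}

# Assumption: a window (stand-in for the cube) that excludes the lateral ends
# and the top rows, so only the flat lower edge is seen as boundary.
window = {(i, j) for (i, j) in grain if -5 <= i <= 5 and j <= 1}

# Boundary in the sense of (5.2): atoms in the window with fewer than 6 neighbors.
boundary = {a for a in window if same_grain_neighbors(a, grain) < 6}

total = sum(same_grain_neighbors(a, grain) for a in boundary)
deviation = abs(total - 4 * len(boundary))
```

In this flat configuration every boundary atom has exactly four neighbors in its own grain, so `deviation` vanishes and the bound in (iii) holds with room to spare.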

From a technical perspective, Lemma 5.1 will provide an important tool to study the properties of the cell formulas. From the physical point of view, it shows that our extremely brittle set-up, while allowing for rebonding, does not support interpolating boundary layers near cracks. Its proof will require some concepts from graph theory which will be only needed for this part of the article. For this reason, it is possible to omit the proofs of Lemmas 5.1 and 5.2 on first reading and to proceed directly with Section 6. As our graph theoretic description gives in fact a more precise picture of the geometry of grain boundaries, which is of some independent interest, we summarize these findings in Theorem 5.4 at the end of Section 5.

We now address the proof of Lemma 5.1 and start by introducing some notions from graph theory.

The bond graph: We define the bond graph of \(X\subset \mathbb R^2\) as the set of positions X with the set of bonds \(\{\{x,y\}:\, x \in X, \ y \in {\mathcal {N}}(x)\}\), where \({\mathcal {N}}(x) = {\mathcal {N}}_1(x)\) is defined in (2.1). As for configurations with finite energy \(E_1\) there holds \(\mathrm {dist}(x,X {\setminus } \{x\})\geqq 1\) for all \(x\in X\) and \(y \in {\mathcal {N}}(x)\) only if \(|x-y| = 1 < \sqrt{2}\), the bond graph is planar. Indeed, given a quadrilateral with all sides and one diagonal equal to 1, the second diagonal is \(\sqrt{3} >1\).
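The value \(\sqrt{3}\) in the last sentence can be checked directly. A quadrilateral with all sides and one diagonal equal to 1 is the union of two equilateral triangles of side 1 glued along that diagonal, so the second diagonal is twice the height of such a triangle:

$$\begin{aligned} 2\sqrt{1 - \tfrac{1}{4}} = 2 \cdot \frac{\sqrt{3}}{2} = \sqrt{3} > 1. \end{aligned}$$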

A sequence of atoms \(p=(v_1,\ldots ,v_n) \subset X\) is called a simple path in X if the atoms are distinct and \(\lbrace v_{j-1}, v_{j}\rbrace \) are bonds for \(j\in \lbrace 2,\ldots ,n\rbrace \). If \((v_1,\ldots ,v_{n-1})\) is a simple path and \(v_{n-1}\) is connected to \(v_n = v_1\) by a bond, p is a cycle in X. We say that a configuration is connected if each two atoms are joinable through a simple path. (Note that this definition is consistent with the one given before the statement of Lemma 5.1.) A bond is called acyclic if it is not contained in any cycle of the bond graph. The reduced bond graph of X is obtained by first deleting all acyclic bonds and then all atoms which are not connected to any other atom. By a face of X we always mean a face of its reduced bond graph. The boundary of a face is given by a disjoint union of cycles and by a unique cycle if the reduced bond graph is connected. Such a boundary is called a polygon and, in particular, a j-gon if it consists of \(j\in \mathbb N\) atoms.

Sub-configuration: We say that \(Z \subset X\) is a sub-configuration of X. All notions defined above are defined analogously for any sub-configuration Z of X.

Face defect: We define the face defect of a sub-configuration \(Z \subset X\) by

$$\begin{aligned} \eta (Z) = \sum \limits _{j \geqq 3} \, (j-3)f_j(Z), \end{aligned}$$
(5.4)

where \(f_j(Z)\) denotes the number of polygons with j atoms in the bond graph of Z.
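To illustrate (5.4), the following minimal Python sketch computes the face defect from a list of polygon sizes. The function name `face_defect` and the two example configurations are hypothetical and only serve as an illustration. Extracting the polygons from an actual bond graph is not attempted here.

```python
def face_defect(polygon_sizes):
    """Return eta = sum over all polygons of (j - 3), where j is the atom count."""
    if any(j < 3 for j in polygon_sizes):
        raise ValueError("every polygon needs at least 3 atoms")
    return sum(j - 3 for j in polygon_sizes)


# Hypothetical examples:
# a hexagon with its center atom present has six triangular faces,
# a hexagon with its center atom missing has one 6-gon as its only face.
filled_hexagon = [3] * 6
empty_hexagon = [6]

eta_filled = face_defect(filled_hexagon)
eta_empty = face_defect(empty_hexagon)
```

Triangles do not contribute to the defect, so the filled hexagon has defect 0, whereas the single 6-gon has defect 3. For any list of sizes, the sum of the sizes equals the defect plus three times the number of polygons, which is the second identity used later in (5.12).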

Strong connectedness: We say that a configuration Z is strongly connected if \(Z {\setminus } \lbrace x \rbrace \) is connected for every \(x \in Z\). Note that strongly connected graphs with more than two atoms coincide with their reduced bond graph: they do not contain acyclic bonds, since removing one of the atoms belonging to such a bond would disconnect the configuration.
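The definition can be tested on small examples. The sketch below is a minimal illustration under the assumption that atoms lie on a single triangular lattice and are stored in axial integer coordinates (i, j), standing for \(i\,(1,0) + j\,(1/2,\sqrt{3}/2)\). The helper names `is_connected` and `is_strongly_connected` are hypothetical, and the empty set is treated as connected by convention.

```python
from collections import deque

# Unit-distance neighbors in the triangular lattice, in axial coordinates.
NEIGHBOR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def is_connected(config):
    """Breadth-first search over bonds of length 1 (empty set counts as connected)."""
    config = set(config)
    if not config:
        return True
    start = next(iter(config))
    seen = {start}
    queue = deque([start])
    while queue:
        i, j = queue.popleft()
        for di, dj in NEIGHBOR_OFFSETS:
            nb = (i + di, j + dj)
            if nb in config and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == config


def is_strongly_connected(config):
    """Z is strongly connected if Z minus any single atom is still connected."""
    config = set(config)
    return all(is_connected(config - {x}) for x in config)


# Two triangles sharing an edge: removing any atom leaves a connected set.
rhombus = {(0, 0), (1, 0), (0, 1), (1, 1)}
# Two triangles sharing only the atom (0, 0): removing it disconnects the set.
bowtie = {(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1)}
```

The second example is connected but not strongly connected. It is the local picture of two triangles meeting in a single point that Step 2 of the proof below excludes for the maximal components.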

Maximal components: Fix \(Q^\nu _T(y)\). Let \(z^+,z^- \in {\mathcal {Z}} \) and consider \(X \subset {\mathbb {R}}^2\) such that \(X= {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1Q^\nu _T(y)\). We denote the set of strongly connected subsets of lattices by

$$\begin{aligned} {\mathcal {C}}^\pm = \big \{Z \subset X \cap {\mathscr {L}}(z^\pm ) :\, Z \cap \partial ^\pm _1 Q^\nu _T(y) \ne \emptyset , \, Z \text { is strongly connected}\big \}. \end{aligned}$$

We introduce the maximal components, denoted by \(M^\pm \), as the maximal elements in \({\mathcal {C}}^\pm \) with respect to set inclusion. These sets can be written as

$$\begin{aligned} M^\pm = \bigcup \limits _{Z \in {\mathcal {C}}^\pm } Z. \end{aligned}$$
(5.5)

Note that \(M^+=\emptyset \) or \(M^-=\emptyset \) if \(z^+={\mathbf {0}}\) or \(z^-={\mathbf {0}}\), respectively. Moreover, we point out that \(M^\pm \) are in general not subsets of \(Q^\nu _T(y)\). We illustrate \(M^\pm \cap Q^\nu _T\) in Fig. 8.

Fig. 8

A schematic picture of \(M^+ \cap Q^\nu _T(y)\), depicted in dark gray, and of \(M^- \cap Q^\nu _T(y)\), depicted in light gray. Their boundaries are illustrated in bold. We depict also a curve \(p_\gamma \) considered in Step 2 of the proof below

Lemma 5.3

(Simple paths in maximal components) Let \(\gamma = (x_1,\ldots ,x_k)\) be a simple path in X with \(x_1,x_k \in M^+\) (or both in \(M^-\)) such that \(x_2,\ldots , x_{k-1} \notin M^+ \) (or \(x_2,\ldots , x_{k-1} \notin M^- \), respectively). Then \(k \geqq 4\).

Proof

Let \(\gamma \) be as in the statement, without restriction with \(x_1,x_k \in M^+\). Recall that \(M^+ \subset {\mathscr {L}}(z^+)\). If we had \(k=3\), then we would necessarily get \(x_2 \in {\mathscr {L}}(z^+)\), as well, see Fig. 9. This, however, contradicts the choice of the maximal component \(M^+\). In fact, also \(M^+\cup \lbrace x_2 \rbrace \) would be a strongly connected set. \(\square \)

Fig. 9

The three different (up to rotation and reflection) possibilities of paths of length 3

Proof of Lemma 5.1

Without restriction we assume \(z^+ \ne z^-\). The proof strategy is as follows: we first show that X consists of at most two connected components which contain the lower and the upper part of the boundary, respectively (Step 1). We are then left with at most two connected components which contain the maximal components \(M^\pm \) defined in (5.5). Then, we prove that these components \(M^\pm \) do not contain holes. This ensures that \(\partial M^\pm \cap Q^\nu _T(y)\) are simple paths (Step 2). Finally, we show that there are no parts of X that may be connected to \(M^\pm \), but that are not subsets of the upper and lower lattice \({\mathscr {L}}(z^\pm )\) (Step 3). Steps 1–3 are proved by contradiction, that is, we suppose that X did not satisfy the above-mentioned properties and then we show that the configuration can be modified in such a way that the energy strictly decreases. Some technical estimates are given in Steps 4–5.

Fix \(z^\pm \in {\mathcal {Z}}\), \(\nu \in {\mathbb {S}}^1\), \(T>0\), and \(y \in \mathbb R^2\). Denote by \(X \subset \mathbb R^2\) a minimizer of (5.3). Without loss of generality we assume that

$$\begin{aligned} X \subset \{x \in \overline{(Q_T^\nu (y))_{ 1}} :{\mathcal {N}}(x) \cap Q_T^\nu (y) \ne \emptyset \} \cup \partial _1^+ Q_T^\nu (y) \cup \partial _1^- Q_T^\nu (y). \end{aligned}$$
(5.6)

In particular, we have \(X={\mathscr {L}}(z^\pm )\) on \(\partial _1^\pm Q_T^\nu (y)\). By \(M^\pm \) we denote its maximal upper and lower component, respectively, given by (5.5). (Recall that \(M^+=\emptyset \) or \(M^-=\emptyset \) if \(z^+={\mathbf {0}}\) or \(z^-={\mathbf {0}}\).) Without restriction we assume that \(z^\pm = (\theta ^\pm ,\tau ^\pm ,1)\). Otherwise, we apply all arguments just to the component \(z^\pm \) with \(z^\pm \ne {\mathbf {0}}\).

Step 1: X has at most two connected components in \(Q_T^\nu (y)\) and \(\#{\mathcal {N}}(x) \geqq 2\) for all \(x \in X \cap Q_T^\nu (y)\). First, we observe that the maximal components \(M^+\) and \(M^-\) are either contained in one single or in two different connected components of X. Assume by contradiction that the configuration X consists of more than the (at most two) connected components containing \(M^\pm \). Then we can remove the other connected components not containing \(M^\pm \) and obtain a new configuration which has strictly less energy and the same boundary data as X. This follows directly from the definition of the energy in (2.3).

Moreover, if there exists \( x' \in X\) such that \(\#{\mathcal {N}}( x' ) \leqq 1\), then we can consider the configuration \(X{\setminus } \{ x' \}\) to obtain a configuration with strictly less energy since, by (2.3), we have

$$\begin{aligned} E_1(X,Q^\nu _T(y))&= \frac{1}{2}\sum \limits _{x \in X\cap Q^\nu _T(y)} (6-\#{\mathcal {N}}(x)) \geqq E_1\big (X{\setminus } \lbrace x' \rbrace ,Q^\nu _T(y)\big ) + 2. \end{aligned}$$
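The energy drop in Step 1 can be checked on a toy example. The following minimal Python sketch evaluates the expression \(\frac{1}{2}\sum (6-\#{\mathcal {N}}(x))\) from the display above for a small configuration on one triangular lattice. It is an illustration under simplifying assumptions: atoms are stored in axial integer coordinates, all atoms are counted (the restriction to \(Q^\nu _T(y)\) is ignored), and the names `energy`, `hexagon`, and `dangling` are hypothetical.

```python
# Axial coordinates (i, j) represent the lattice point i*(1, 0) + j*(1/2, sqrt(3)/2).
NEIGHBOR_OFFSETS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def neighbor_count(atom, config):
    """Number of atoms of `config` at distance 1 from `atom`."""
    i, j = atom
    return sum((i + di, j + dj) in config for di, dj in NEIGHBOR_OFFSETS)


def energy(config):
    """Half the number of missing bonds, summed over all atoms of `config`."""
    return sum(6 - neighbor_count(a, config) for a in config) / 2


# A center atom with its six neighbors, plus one atom attached by a single bond.
hexagon = {(0, 0)} | set(NEIGHBOR_OFFSETS)
dangling = (2, 0)
X = hexagon | {dangling}

energy_with = energy(X)
energy_without = energy(X - {dangling})
```

Here the dangling atom has exactly one neighbor. Removing it saves its own contribution 5/2 and costs 1/2 at its former neighbor, so the energy drops by exactly 2, which is the equality case of the estimate above.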

Step 2: \(\partial M^\pm \) is a simple path. In this step, we show that each of the sets \(\partial M^\pm \) defined in (5.2) is a simple path in X joining the lateral faces of \(Q^\nu _T(y)\). More precisely, let

$$\begin{aligned} H_{\nu ^\perp ,-}^T(y)&:= \{x \in {\mathbb {R}}^2:\langle (x-y), \nu ^\perp \rangle < - T/2 \} \ \ \text { and } \ \ H_{\nu ^\perp ,+}^T(y) \\&= \{x \in {\mathbb {R}}^2:\langle (x-y), \nu ^\perp \rangle \geqq T/2 \}. \end{aligned}$$

Then there are \(v^\pm _- \in M^\pm \cap H_{\nu ^\perp ,-}^T(y)\) and \(v^\pm _+ \in M^\pm \cap H_{\nu ^\perp ,+}^T(y)\) such that \(\{v^\pm _-, v^\pm _+\} \cup \partial M^\pm \) is a simple path with first element \(v^\pm _-\) and last element \(v^\pm _+\).

To prove this, we color each (closed) equilateral triangle of sidelength 1 all of whose corners are contained in \(M^\pm \) in dark/light gray, respectively, see Fig. 8. We first show that there are no cycles in \(\partial M^\pm \). Since \(M^\pm \) is strongly connected, this also yields that the colored regions inside \(Q_T^\nu (y)\) are simply connected and that \(\partial M^\pm \) lies on the boundary of the respective colored region. Assume by contradiction that there exists a cycle \(p=(v_1,\ldots ,v_{n}) \subset M^\pm \) with \(v_{n}=v_1\). Denote by \(\mathrm {int}(p)\) the interior connected component of the curve

$$\begin{aligned} p_\gamma =\bigcup \limits _{i=1}^{n-1}[v_i; v_{i+1}]; \end{aligned}$$

see Fig. 8. Now define

$$\begin{aligned} {\tilde{X}} = {\left\{ \begin{array}{ll} {\mathscr {L}}(z^+)&{}\text {in } \,\,\mathrm {int}(p),\\ X &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Since we did not change the neighborhood of each atom \(x \in Q^\nu _T(y) {\setminus } \overline{\mathrm {int}(p)}\), we obtain by (2.3) and Lemma 3.1(iv)

$$\begin{aligned} E_1\big ({\tilde{X}},Q^\nu _T(y)\big )&= E_1\big ({\tilde{X}},\overline{\mathrm {int}(p)}\big ) + E_1\big ({\tilde{X}},Q^\nu _T(y) {\setminus } \overline{\mathrm {int}(p)}\big ) \\ {}&< E_1\big (X,\overline{\mathrm {int}(p)}\big ) + E_1\big (X,Q^\nu _T(y) {\setminus } \overline{\mathrm {int}(p)}\big ) = E_1\big (X,Q^\nu _T(y)\big ), \end{aligned}$$

where we have used that \(\#{\mathcal {N}}(x) = 6\) for all \(x \in {\tilde{X}} \cap \mathrm {int}(p)\) and that every \(x \in p\) has at least as many bonds in \({\tilde{X}}\) as in X, while for at least one \(x \in p\) the number of bonds has increased. We have constructed a configuration \({\tilde{X}}\) with strictly less energy and the same boundary data as X. This contradicts the fact that \(X \subset \mathbb R^2\) is a minimizer of (5.3), and shows that there are no such cycles in \(M^\pm \).

We next show that even the complement of each colored region inside \(Q^\nu _T(y)\) is connected. Assume for contradiction that this were not the case. Then, without restriction, there are \(v, w \in M^+ \cap H_{\nu ^\perp ,+}^T(y)\) and a simple path with first element v, last element w, and intermediate elements in \(\partial M^+\), whose bonds together with a segment in \(\partial Q_T^\nu (y)\) bound a region free of dark triangles. By the boundary conditions, we can suppose that \( 6 \geqq \langle v, \nu \rangle > \langle w, \nu \rangle \geqq -6 \), see also Fig. 8. We extend this path to a cycle p by placing additional atoms in \({\mathscr {L}}(z^+) \cap \overline{(Q_T^\nu (y))_{1}} \cap H_{\nu ^\perp ,+}^T(y)\). Our assumptions on X specified in (5.6) and Step 1 guarantee that each point in \({\mathscr {L}}(z^+)\) on or inside of p has distance at least 1 to every atom of the connected component of X that contains \(M^-\). Now let

$$\begin{aligned} {\tilde{X}} = {\left\{ \begin{array}{ll} {\mathscr {L}}(z^+)&{}\text {in } \,\,\overline{\mathrm {int}(p)},\\ X &{}\text {in } \,\,\mathbb R^2 {\setminus } \overline{\mathrm {int}(p)}, \\ \emptyset &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

In a fashion similar to before we get \(E_1({\tilde{X}},Q^\nu _T(y)) < E_1(X,Q^\nu _T(y))\), which also shows that this situation does not occur. We conclude that each \(M^\pm \) is strongly connected and both the dark and the light colored areas have connected complements relative to \(Q^\nu _T(y)\).

We claim that \(\partial M^\pm \) has to be a simple path. Assume by contradiction that this were not the case, for example, for \(M^+\). Then, since \(\partial M^+\) lies on the boundary of the region in dark gray being the union of triangles, we find \(x \in \partial M^+\) which is a corner of exactly two of these triangles and these triangles share only x as a common point, see Fig. 10. Since \(\partial M^+\) does not contain cycles, we find \(x^+,x^- \in {\mathcal {N}}(x)\) such that each path in \(M^+\) connecting \(x^+\) with \(x^-\) contains x. This, however, contradicts the strong connectedness of \(M^+\), and shows that \(\partial M^+\) is a simple path. This concludes Step 2.

Fig. 10

A point \(x \in \partial M^\pm \) that would make \(\partial M^\pm \) a non-simple path

Step 3: Comparison with subsets of the lattice. Our goal is to show that there holds \(X \subset {\mathscr {L}}(z^+) \cup {\mathscr {L}}(z^-)\). Recalling the definition of \(M^\pm \) in (5.5), it thus suffices to show that removing the connected components of \((X \cap Q^\nu _T(y)) {\setminus } (M^+ \cup M^-)\) would strictly decrease the energy, which clearly contradicts the assumption that X is a minimizer. (Recall that we have already reduced to the case that X consists of at most two connected components. Note, however, that \((X \cap Q^\nu _T(y)) {\setminus } (M^+ \cup M^-)\) might consist of more connected components.)

This will conclude the proof of the statement: it shows that the minimizer X is indeed a subset of \({\mathscr {L}}(z^+) \cup {\mathscr {L}}(z^-)\). Moreover, the property that \(\partial M^\pm \cap Q^\nu _T(y)\) are simple paths joining the lateral faces of \(Q^\nu _T(y)\) has already been addressed in Step 2. Finally, we observe that \(\# {\mathcal {N}}(x) \leqq 5\) for all \(x \in \partial M^\pm \). In fact, \(\# {\mathcal {N}}(x) = 6\) for some \(x \in \partial M^\pm \) would entail \(\lbrace x \rbrace \cup {\mathcal {N}}(x) \subset M^\pm \) as \(M^\pm \subset {\mathscr {L}}(z^\pm )\) is the maximal component. This contradicts (5.2).

Now, consider a connected component \(X'\) of \((X \cap Q^\nu _T(y)) {\setminus } (M^+ \cup M^-)\). We want to prove that

$$\begin{aligned} E_1\big (X,Q^\nu _T(y)\big ) \geqq E_1\big (X {\setminus } X',Q^\nu _T(y)\big )+1. \end{aligned}$$
(5.7)

We first introduce some further notation. By \(\Gamma ^\pm \subset \partial M^\pm \) we denote the smallest connected sets \(\Gamma ^\pm \supset {\mathcal {N}}(X') \cap M^\pm \), where we define \({\mathcal {N}}(X') := \bigcup _{x \in X'} {\mathcal {N}}(x) {\setminus } X'\). Define \(\Gamma := \Gamma ^+ \cup \Gamma ^-\) and \(X_\Gamma := X' \cup \Gamma \). Note that both \(\Gamma ^-\) and \(\Gamma ^+\) are simple paths in X since \(\partial M^\pm \) are simple paths, see Fig. 11. For \(x \in X_\Gamma \), we introduce the internal and external neighborhoods by

$$\begin{aligned} {\mathcal {N}}_i(x) = {\mathcal {N}}(x) \cap X_\Gamma , \ \ \ \ \ \ \ \ {\mathcal {N}}_e(x) = {\mathcal {N}}(x) {\setminus } X_\Gamma , \end{aligned}$$
(5.8)

that is, the set of neighbors inside and outside of \(X_\Gamma \), respectively. Note that \(X_\Gamma \) is connected. Its reduced bond graph is delimited by a finite union of disjoint cycles. We denote by \(\partial X_\Gamma \) the union of these cycles and by \(d = \# \partial X_\Gamma \) its cardinality. (The notation is unrelated to (5.2).) We further define

$$\begin{aligned}&f_j = \# j\text {-gons of }X_\Gamma , \quad f= \sum \limits _j f_j, \quad \eta = \eta (X_\Gamma ), \quad n_{\Gamma } = \#\Gamma , \quad n=\#X_\Gamma ,\nonumber \\&b_\Gamma = \#\big \{\{x,y\} :x,y \in \Gamma , \, y \in {\mathcal {N}}(x)\big \}, \ \ \ \ b=\#\big \{\{x,y\} :x,y \in X_\Gamma , \, y \in {\mathcal {N}}(x)\big \},\nonumber \\&b_{\mathrm {ac}} = \#\big \{\{x,y\} \text { acyclic}:x,y \in X_\Gamma , \, y \in {\mathcal {N}}(x)\big \}, \end{aligned}$$
(5.9)

where \(\eta \) was introduced in (5.4). Note that f corresponds to the number of faces both in the bond graph and in the reduced bond graph of \(X_\Gamma \). We will see that it holds that

$$\begin{aligned} 2+d+2b_{\mathrm {ac}}+\eta \geqq 3n_\Gamma -b_\Gamma . \end{aligned}$$
(5.10)

We defer the proof of (5.10) to Steps 4–5 below and proceed to prove (5.7).

Since in the passage from X to \(X {\setminus } X'\) the neighborhood of atoms outside \(X_\Gamma \) is left unchanged and for atoms in \(\Gamma \) the neighbors outside of \(X_\Gamma {\setminus } \Gamma \) remain, in view of (2.3), we need to check that

$$\begin{aligned} \frac{1}{2}\sum \limits _{x \in X_\Gamma }(6- \#{\mathcal {N}}(x))\geqq \frac{1}{2}\sum \limits _{x \in \Gamma } \big (6-(\#{\mathcal {N}}_e(x) + \#({\mathcal {N}}(x) \cap \Gamma ))\big )+1. \end{aligned}$$
(5.11)

We can count the faces to obtain

$$\begin{aligned} 2b-d-2b_{\mathrm {ac}}= \sum \limits _{j \geqq 3}\, jf_j = \eta + 3f. \end{aligned}$$
(5.12)

Indeed, the first identity follows from the fact that in the summation all bonds contained in the union of cycles delimiting the reduced bond graph of \(X_\Gamma \) are counted only once, the acyclic bonds are not counted, and all other cyclic bonds are counted twice. The second identity follows from (5.4). As the bond graph is planar and connected, we can apply Euler’s formula (omitting the exterior face) to get \(n-b+f =1\). Then, by (5.10) and (5.12) we derive

$$\begin{aligned} 3n - b \geqq 3n_\Gamma -b_\Gamma +1. \end{aligned}$$
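For the reader's convenience, the elementary algebra behind the last display can be spelled out. It only uses (5.12), Euler's formula in the form \(f = 1 - n + b\), and (5.10):

$$\begin{aligned} 2b - d - 2b_{\mathrm {ac}}&= \eta + 3f = \eta + 3 - 3n + 3b, \\ 3n - b&= 3 + d + 2b_{\mathrm {ac}} + \eta = 1 + \big (2 + d + 2b_{\mathrm {ac}} + \eta \big ) \geqq 1 + 3n_\Gamma - b_\Gamma . \end{aligned}$$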

By the definitions in (5.8)–(5.9) and the facts that \(\sum \nolimits _{x \in X_\Gamma } \#{\mathcal {N}}_i(x)=2b\), \(\sum _{x\in \Gamma } \#({\mathcal {N}}(x) \cap \Gamma ) = 2b_\Gamma \) this implies

$$\begin{aligned} \frac{1}{2}\sum \limits _{x \in X_\Gamma }(6- \#{\mathcal {N}}_i(x))\geqq \frac{1}{2}\sum \limits _{x \in \Gamma } \big (6- \#({\mathcal {N}}(x) \cap \Gamma )\big )+1. \end{aligned}$$
(5.13)

Now we note that \(\#{\mathcal {N}}(x) -\#{\mathcal {N}}_e(x) =\#{\mathcal {N}}_i(x)\) for \(x\in \Gamma \) and \({\mathcal {N}}(x) = {\mathcal {N}}_i(x)\) for \(x \in X_\Gamma {\setminus } \Gamma \), see (5.8). This along with (5.13) shows the desired estimate (5.11). To conclude the proof, it remains to show (5.10).
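The passage from (5.13) to (5.11) can be made explicit as follows. By the two relations just noted, \(\sum _{x \in X_\Gamma } \#{\mathcal {N}}(x) = \sum _{x \in X_\Gamma } \#{\mathcal {N}}_i(x) + \sum _{x \in \Gamma } \#{\mathcal {N}}_e(x)\), so that subtracting \(\frac{1}{2}\sum _{x \in \Gamma } \#{\mathcal {N}}_e(x)\) from both sides of (5.13) gives

$$\begin{aligned} \frac{1}{2}\sum \limits _{x \in X_\Gamma }(6- \#{\mathcal {N}}(x))&= \frac{1}{2}\sum \limits _{x \in X_\Gamma }(6- \#{\mathcal {N}}_i(x)) - \frac{1}{2}\sum \limits _{x \in \Gamma } \#{\mathcal {N}}_e(x) \\&\geqq \frac{1}{2}\sum \limits _{x \in \Gamma } \big (6- \#({\mathcal {N}}(x) \cap \Gamma )\big ) + 1 - \frac{1}{2}\sum \limits _{x \in \Gamma } \#{\mathcal {N}}_e(x) \\&= \frac{1}{2}\sum \limits _{x \in \Gamma } \big (6-(\#{\mathcal {N}}_e(x) + \#({\mathcal {N}}(x) \cap \Gamma ))\big )+1. \end{aligned}$$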

Fig. 11

The different possibilities of \(X'\) touching \(M^\pm \) corresponding to case (a) on the top left, case (b) on the top right, and case (c) in the two bottom pictures. \(M^+\) is always depicted in gray, \(M^-\) in light gray, and \(X'\) in dark gray. \(\Gamma ^+\) and \(\Gamma ^-\) are depicted by the bold black lines

Step 4: Proof of (5.10). Recall that \(\Gamma \) consists of the two simple paths \(\Gamma ^+\) and \(\Gamma ^-\). We need to distinguish three cases:

$$\begin{aligned} \text {(a) } \Gamma \text { is not connected, } \ \ \ \ \ \text {(b) } \Gamma \text { is a cycle, } \ \ \ \ \ \text {(c) } \Gamma \text { is a simple path.} \end{aligned}$$

Since \(\Gamma ^\pm \) are simple paths, and the bond graph of \(X^\prime \) is planar and connected, we see that these are all possibilities that may occur, see Fig. 11 for an illustration of the different cases. At this point, we also use that \(\Gamma ^\pm \) are the smallest connected sets with \(\Gamma ^\pm \supset {\mathcal {N}}(X') \cap M^\pm \) and \(\Gamma ^\pm \subset \partial M^\pm \), where \(\partial M^\pm \) is a simple path connecting \(H_{\nu ^\perp ,-}^T(y) \cap {\mathscr {L}}(z^\pm )\) and \(H_{\nu ^\perp ,+}^T(y) \cap {\mathscr {L}}(z^\pm )\).

First of all, we observe that

$$\begin{aligned} \text {Case (a):} \ \ n_\Gamma \leqq b_\Gamma +2, \ \ \ \ \ \ \text {Case (b):} \ \ n_\Gamma \leqq b_\Gamma , \ \ \ \ \ \ \text {Case (c):} \ \ n_\Gamma \leqq b_\Gamma +1. \end{aligned}$$
(5.14)

This is due to the fact that the bond graph of \(\Gamma \) contains \(\Gamma ^\pm \) and a simple path containing k bonds consists of \(k+1\) atoms, and in a cycle the number of bonds equals the number of atoms. (As there may be more bonds present if there are triangles in the bond graph, we get inequalities.)

Using (5.14), it suffices to prove

$$\begin{aligned} d+2b_{\mathrm {ac}}+\eta \geqq {\left\{ \begin{array}{ll} 2n_\Gamma &{}\text {in case (a)},\\ 2n_\Gamma -2 &{}\text {in case (b)},\\ 2n_\Gamma -1&{}\text {in case (c)}, \end{array}\right. } \end{aligned}$$
(5.15)

where d, \(\eta \), \(n_\Gamma \), and \(b_\mathrm {ac}\) are defined in (5.9). This will rely on the estimate

$$\begin{aligned} \eta \geqq n_\Gamma -2. \end{aligned}$$
(5.16)
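Before turning to the proofs of (5.15) and (5.16), it may help to record why (5.14) and (5.15) together imply (5.10). In each case, the bound on \(n_\Gamma - b_\Gamma \) from (5.14) is inserted into the right-hand side of (5.10):

$$\begin{aligned} \text {Case (a):} \quad 3n_\Gamma - b_\Gamma&\leqq 2n_\Gamma + 2 \leqq 2 + d + 2b_{\mathrm {ac}} + \eta , \\ \text {Case (b):} \quad 3n_\Gamma - b_\Gamma&\leqq 2n_\Gamma = 2 + (2n_\Gamma - 2) \leqq 2 + d + 2b_{\mathrm {ac}} + \eta , \\ \text {Case (c):} \quad 3n_\Gamma - b_\Gamma&\leqq 2n_\Gamma + 1 = 2 + (2n_\Gamma - 1) \leqq 2 + d + 2b_{\mathrm {ac}} + \eta . \end{aligned}$$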

We first show (5.15) in the three cases and defer the proof of (5.16) to Step 5. Observe that if a connected component \({\tilde{\Gamma }}\) of \(\Gamma \) satisfies \({\tilde{\Gamma }} \not \subset \partial X_\Gamma \), then \(\#{\tilde{\Gamma }} = 1\) and \({\tilde{\Gamma }}\) connects to \(X'\) by one acyclic bond. This follows from the observation that, whenever \(x \in {\tilde{\Gamma }}\) satisfies \(\#({\mathcal {N}}(x) \cap X_{\Gamma }) \geqq 2\), then x lies on a cycle in \(X_\Gamma \) and thus, as an element of \(\Gamma \), is contained in \(\partial X_\Gamma \).

Case (a): Suppose first \(\Gamma \subset \partial X_\Gamma \). Since \(\partial X_\Gamma \) is a disjoint union of cycles and \(\Gamma \) consists of two simple paths, we get \(\#(\partial X_\Gamma {\setminus } \Gamma ) \geqq 2\). In fact, if \(\Gamma ^+\) and \(\Gamma ^-\) intersect the same cycle of \(\partial X_\Gamma \), this follows from the fact that \(\Gamma ^+ \cup \Gamma ^-\) is not connected. If \(\Gamma ^+\) and \(\Gamma ^-\) intersect different cycles of \(\partial X_\Gamma \), it suffices to use that \(\Gamma ^\pm \) are not cycles. This shows \(d\geqq n_\Gamma + 2\). Then (5.16) implies (5.15). If \(\Gamma ^-\subset \partial X_\Gamma \), \(\Gamma ^+ \not \subset \partial X_\Gamma \), then, as before, \(\#(\partial X_\Gamma {\setminus } \Gamma ^-) \geqq 1\) and thus \(d \geqq \#\Gamma ^- + 1\). The observation below (5.16) gives \(\# \Gamma ^+=1\) and \(b_{\mathrm {ac}} \geqq 1\), so particularly \(d\geqq n_\Gamma \). Then again (5.16) implies (5.15). The case \(\Gamma ^-\not \subset \partial X_\Gamma \), \(\Gamma ^+ \subset \partial X_\Gamma \) is analogous. Finally, if \(\Gamma ^-, \Gamma ^+ \not \subset \partial X_\Gamma \), then \(n_{\Gamma } = 2\) and \(b_{\mathrm {ac}} \geqq 2\) since \(\Gamma ^-\) and \(\Gamma ^+\) cannot be connected to \(X'\) by the same (acyclic) bond. This proves (5.15).

Case (b): Since \(\Gamma \) is a cycle, we get \(\Gamma \subset \partial X_\Gamma \). Thus, we obtain \(n_\Gamma \leqq d\) and (5.16) yields (5.15).

Case (c): Suppose first that \(\Gamma \subset \partial X_\Gamma \). Since \(\Gamma \) is not a cycle and \(\partial X_\Gamma \) is a union of cycles, we get \(\# (\partial X_\Gamma {\setminus } \Gamma ) \geqq 1\). This implies \(d \geqq n_\Gamma +1\). Then (5.16) again yields (5.15). If \(\Gamma \not \subset \partial X_\Gamma \), then \(n_{\Gamma } = 1\) and \(b_{\mathrm {ac}} \geqq 1\), from which (5.15) follows.

Step 5: Proof of (5.16). It remains to check (5.16). To this end, we classify the polygons in the (reduced) bond graph of \(X_\Gamma \) in the following way: for \(k \geqq 1\), we set

$$\begin{aligned} \partial \text {-}k\text {-gon}= \{P \text { polygon in } X_\Gamma :\, \#(P \cap \Gamma ) = k \} \ \ \ \text { and } \ \ \ \partial \text {-gon} = \bigcup \limits _{k\geqq 1} \partial \text {-}k\text {-gon}, \end{aligned}$$

and define \(D_k=\#\partial \text {-}k\text {-gon}\). In order to estimate the cardinality of \(P \in \partial \text {-}k\text {-gon}\), we introduce the following condition:

$$\begin{aligned} \text { there exist} \ \ \ x_+ \in M^+ \cap P \ \ \ \text {and} \ \ \ x_- \in (M^- {\setminus } M^+) \cap P \ \ \ \text { with } \ \ \ |x_+-x_-|=1. \end{aligned}$$
(5.17)

We claim that \(\#P \geqq k+1\) always holds, and that \(\#P \geqq k+2\) whenever (5.17) does not hold.

To see the first claim we note that clearly \(\#P \geqq k\). If \(\#P = k\), then \(P \subset \Gamma \) and \(\Gamma \) is a cycle, hence \(P = \Gamma \). But then all bonds connecting \(\Gamma \) and \(X'\) are acyclic. As observed below (5.16), this entails \(\# \Gamma = 1\) which, however, is not possible in case \(\Gamma \) is a cycle.

Assume now (5.17) does not hold. First, suppose that \(P \cap \Gamma \subset M^+\) or \(P \cap \Gamma \subset M^-\). If \(k=1\), the statement \(\#P \geqq k+2\) is clear as \(\# P \geqq 3\). If \(k \geqq 2\), we can choose a simple path in P such that only the first and the last atom lie in \(M^+\) (or \(M^-\), respectively). The statement then follows from Lemma 5.3. On the other hand, if \(P \cap (M^+ {\setminus } M^-) \ne \emptyset \) and \(P \cap (M^- {\setminus } M^+) \ne \emptyset \), then there exist two simple paths contained in P joining \(M^+ {\setminus } M^-\) and \(M^- {\setminus } M^+\). Since (5.17) does not hold, each of these two paths contains an atom that is not contained in \(\Gamma \). This implies \(\#P \geqq k+2\).

We are now in a position to prove (5.16). By the definition of \(\eta \) and the cardinality estimate for \(\partial \text {-}k\text {-gons}\) we obtain

$$\begin{aligned} \begin{aligned} \eta = \sum \limits _{j \geqq 3} f_j(j-3) \geqq&\sum \limits _{k\geqq 1}\, D_k(k+2-3) -N \\ \geqq&\sum \limits _{k\geqq 1}\, D_k(k-1) - {\left\{ \begin{array}{ll} 0 &{}\text {in case (a)},\\ 2 &{}\text {in case (b)},\\ 1&{}\text {in case (c)}, \end{array}\right. } \end{aligned} \end{aligned}$$
(5.18)

where N denotes the number of \(\partial \)-gons satisfying condition (5.17). We used that: in case (a) we have \(N = 0\) since otherwise \(\Gamma \) would be connected, in case (b) the fact that \(X'\) is connected and the planarity of the bond graph imply that \(N \leqq 2\), and in case (c) we get \(N \leqq 1\) since \(\Gamma \) is a simple path. Finally, we claim that

$$\begin{aligned} \sum \limits _{k\geqq 1} D_k(k-1)\geqq {\left\{ \begin{array}{ll} n_\Gamma -2 &{}\text {in case (a)},\\ n_\Gamma &{}\text {in case (b)},\\ n_\Gamma -1&{}\text {in case (c)}, \end{array}\right. } \end{aligned}$$
(5.19)

Indeed, this follows from two facts. First, each bond between two successive atoms \(x,y \in \Gamma \) is contained in exactly one \(\partial \)-gon. Second, whenever \(P \in \partial \text {-}k\text {-gon}\), the number of bonds between atoms in \(\Gamma \cap P\) is at most \(k-1\), as otherwise \(P = \Gamma \) and \(\# P = k\), which we have excluded above. (The estimate is strict if \(\Gamma \cap P\) is not connected.) By combining (5.18)–(5.19) we obtain (5.16). This concludes the proof. \(\square \)
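For completeness, the case-by-case combination of (5.18) and (5.19) that yields (5.16) reads

$$\begin{aligned} \text {Case (a):} \quad \eta&\geqq (n_\Gamma - 2) - 0 = n_\Gamma - 2, \\ \text {Case (b):} \quad \eta&\geqq n_\Gamma - 2, \\ \text {Case (c):} \quad \eta&\geqq (n_\Gamma - 1) - 1 = n_\Gamma - 2. \end{aligned}$$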

Proof of Lemma 5.2

Without restriction we assume that \(z^+ \ne z^-\). Let \(X^\pm \) be as in the statement of Lemma 5.1, that is, \(X^\pm = M^\pm \). We define

$$\begin{aligned} Y^+&= X^+ {\setminus } (\partial X^+ \cap \partial X^-) \cup \big \{ x \in \partial X^+ \cap \partial X^- :\#({\mathcal {N}}(x) \cap X^+) \geqq \#({\mathcal {N}}(x) \cap X^-)\big \}, \\ Y^-&= X^- {\setminus } (\partial X^+ \cap \partial X^-) \cup \big \{ x \in \partial X^+ \cap \partial X^- :\#({\mathcal {N}}(x) \cap X^+) < \#({\mathcal {N}}(x) \cap X^-)\big \}. \end{aligned}$$

Proof of \(\mathrm {(i)}\). Property (i) is obviously satisfied by construction.

Proof of \(\mathrm {(ii)}\). As a preparation, let us note that, if \(x \in X^+ \cap X^-\), then \({\mathcal {N}}(x) \cap X^+ \cap X^- = \emptyset \) since \(z^+ \ne z^-\). Moreover, if \(x \in X^+ \cap X^- \cap Q^\nu _T(y) = \partial X^+ \cap \partial X^-\), then \(\#{\mathcal {N}}(x) \leqq 5\) by Lemma 5.1(ii). Since \(X^\pm \) is strongly connected, we also have \(\#({\mathcal {N}}(x) \cap X^\pm ) \geqq 2\). Our definition of \(Y^\pm \) then entails

$$\begin{aligned} x \in X^\pm {\setminus } Y^\pm \implies \#({\mathcal {N}}(x) \cap X^\pm ) = 2. \end{aligned}$$
(5.20)

This ensures that \(Y^\pm = X^\pm = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1 Q^\nu _T(y)\). Furthermore, it entails \(\partial Y^\pm \subset \partial X^\pm \). Indeed, \(y \in \partial Y^\pm {\setminus } \partial X^\pm \) would give \(\# ({\mathcal {N}}(y) \cap X^\pm ) = 6\) and \(\# ({\mathcal {N}}(y) \cap Y^\pm ) \leqq 5\), that is, there exists \(x \in X^\pm {\setminus } Y^\pm \) with \(|x-y| = 1\). But then \(\#({\mathcal {N}}(x) \cap {\mathcal {N}}(y)\cap X^\pm )=2\), which yields the contradiction \(\#({\mathcal {N}}(x) \cap X^\pm ) \geqq 3\).
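The implication (5.20) rests on a short enumeration, which the following minimal Python sketch makes explicit. It is an illustration only and uses just the facts stated above for a common atom \(x \in \partial X^+ \cap \partial X^-\): the neighbors in \(X^+\) and in \(X^-\) are disjoint, each grain contributes at least two neighbors, and there are at most five neighbors in total. The variable names are hypothetical.

```python
# Enumerate all admissible pairs (n_plus, n_minus) of neighbor counts of a common
# atom x, where n_plus = #(N(x) ∩ X+) and n_minus = #(N(x) ∩ X-).
cases = []
for n_plus in range(2, 6):
    for n_minus in range(2, 6):
        if n_plus + n_minus > 5:
            continue  # at most five neighbors in total
        # Rule from the definition of Y+ and Y-: x goes to Y+ if n_plus >= n_minus.
        assigned = "Y+" if n_plus >= n_minus else "Y-"
        cases.append((n_plus, n_minus, assigned))

# (5.20): if x is removed from a grain, it had exactly two neighbors in that grain.
implication_holds = all(
    (n_minus == 2) if assigned == "Y+" else (n_plus == 2)
    for n_plus, n_minus, assigned in cases
)
```

Only the three pairs (2, 2), (2, 3), and (3, 2) are admissible, and in each of them the grain that loses the atom contains exactly two of its neighbors.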

Proof of \(\mathrm {(iii)}\). Since \(X^\pm \) is simply connected and \(x \in \partial X^\pm {\setminus } \partial Y^\pm \) is only possible if \( \# ({\mathcal {N}}(x) \cap X^\pm ) = 2\) (see (5.20)), we get that \(\partial Y^\pm \) is a simple path connecting the lateral faces of \(Q^\nu _T(y)\). More precisely, by Step 2 of the proof of Lemma 5.1, there are \(v^\pm _- \in X^\pm \cap H_{\nu ^\perp ,-}^T(y)\) and \(v^\pm _+ \in X^\pm \cap H_{\nu ^\perp ,+}^T(y)\) such that \(\{v^\pm _-, v^\pm _+\} \cup \partial Y^\pm \) is a simple path with first element \(v^\pm _-\) and last element \(v^\pm _+\). The bonds between any two consecutive atoms in this chain form a polygonal line and we denote by \(\alpha (x)\) the (interior) angle it forms at atom x.

As the first and the last segments cross the lateral faces of \(Q^\nu _T(y)\) and \(Y^\pm \) is strongly connected, we have

$$\begin{aligned} \sum \limits _{x \in \partial Y^\pm } (\pi - \alpha (x)) \in \frac{1}{3} \{ -2\pi , -\pi , 0, \pi , 2\pi \}. \end{aligned}$$

Since \(X^\pm \) is simply connected, due to (5.20), the same holds true for \(Y^\pm \). Hence, \(\alpha (x)\) relates to the number of neighbours of x within \(Y^\pm \) by the formula

$$\begin{aligned} \alpha (x) = \frac{1}{3} \big ( \#({\mathcal {N}}(x) \cap Y^\pm ) -1 \big ) \pi . \end{aligned}$$

As a consequence we obtain

$$\begin{aligned} \Big | \sum \limits _{x \in \partial Y^\pm } \big ( \# ({\mathcal {N}}(x) \cap Y^\pm ) - 4 \big ) \Big | = \Big | \frac{3}{\pi } \sum \limits _{x \in \partial Y^\pm } \big ( \alpha (x) - \pi \big ) \Big | \leqq 2. \end{aligned}$$

This concludes the proof. \(\square \)

We summarize our main findings on the structure of grain boundaries obtained in the proof of Lemma 5.1 in the following theorem:

Theorem 5.4

(Reduction to subsets of two lattices) Let \(z^+,z^- \in {\mathcal {Z}}\), \(z^+ \ne z^-\), \(\nu \in {\mathbb {S}}^1\), \(y \in \mathbb R^2\), and \(T>0\). Let \(X \subset {\mathbb {R}}^2\) be a minimizer of

$$\begin{aligned} \min \Big \{E_1\big (X,Q^\nu _T(y)\big ):\ X = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1 Q_T^\nu (y) \Big \}. \end{aligned}$$

Then \(X = M^+ \cup M^-\) on \(Q_T^\nu (y)\), where \(M^+, M^-\) are the maximal components of \(X\), see (5.5). Coloring each (closed) equilateral triangle of side length 1, all of whose corners are contained in \(M^\pm \), in dark/light gray yields two simply connected planar regions containing \(\partial ^\pm _1 Q^\nu _T(y)\), respectively, whose boundary part inside of \(Q^\nu _T(y)\) is given by a simple path of atoms.

6 Characterization of Solid–Vacuum/Solid–Solid Interactions

This section is devoted to establishing a relation between the cell formula \(\Phi \) defined in (4.1) and the density \(\varphi _{\mathrm{hex}}\) given in (2.18). In particular, we will analyze the situation where the two lattices \({\mathscr {L}}(z^+)\) and \({\mathscr {L}}(z^-)\), which determine the admissible configurations at the boundary, allow for touching points, that is, atoms \(x^+ \in {\mathscr {L}}(z^+)\) and \(x^- \in {\mathscr {L}}(z^-)\) with \(|x^+ - x^-| =1\). We start by formulating the two results of this section.

Lemma 6.1

(Relation of \(\Phi \) and \(\varphi _{\mathrm{hex}})\) There exists a universal constant \(C>0\) such that for each \(\nu \in {\mathbb {S}}^1\) and for every sequence of centers \(\lbrace y_T\rbrace _T\) the following properties hold:

  (i) If \(z^+=(\theta ,\tau ,1) \in {\mathcal {Z}}\) and \(z^- = {\mathbf {0}}\) or if \(z^+ = {\mathbf {0}}\) and \(z^-=(\theta ,\tau ,1) \in {\mathcal {Z}}\), there holds for all \(T>0\)

    $$\begin{aligned} \Big | \frac{1}{T} \min \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ) :\, X_T = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1 Q^\nu _T(y_T)\big \} -\varphi _{\mathrm {hex}}\big (e^{-i\theta } \nu \big )\Big | \leqq C/T. \end{aligned}$$

  (ii) For all \(z^+ =(\theta ^+,\tau ^+,1)\), \(z^- =(\theta ^-,\tau ^-,1) \in {\mathcal {Z}}\) there holds for all \(T>0\)

    $$\begin{aligned}&\frac{1}{T} \min \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ) :\, X_T = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1 Q^\nu _T(y_T)\big \} \\&\quad \leqq \varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ) + C/T. \end{aligned}$$

    Moreover, if \(z^+ \ne z^-\), then also

    $$\begin{aligned}&\frac{1}{T} \min \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ) :\, X_T = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1 Q^\nu _T(y_T)\big \} \\&\quad \geqq \frac{1}{2} \varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \frac{1}{2} \varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ) - C/T. \end{aligned}$$

Note that this lemma indeed provides a relation between \(\varphi _{\mathrm{hex}}\) and the density \(\Phi \) since

$$\begin{aligned} \Phi (z^+,z^-,\nu ) \leqq \liminf _{T\rightarrow +\infty } \frac{1}{T} \min \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ) :\, X_T = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1 Q^\nu _T(y_T)\big \} \end{aligned}$$
(6.1)

for all \(z^\pm \in {\mathcal {Z}}\), \(\nu \in {\mathbb {S}}^1\), and all \(\lbrace y_T\rbrace _T\). We point out that the energy density \(\varphi _{\mathrm {hex}}\) has already been identified in [3, 20]. In our exposition, once the technical result about reduction to two lattices (see Lemma 5.1) has been achieved, the proof of Lemma 6.1(i) is rather simple compared to [20, Theorem 2.2]. In addition, this version with convergence rate is a novel result and is needed in order to prove Proposition 2.2.

The next lemma is a refinement which addresses the question under which conditions on the difference of the rotation angles \(\theta ^+ - \theta ^-\) equality holds in (ii). To formulate this statement, recall \(\omega = \frac{1}{2}+\frac{i}{2}\sqrt{3}\) from Section 2.2. We introduce the set of good angles, denoted by \({{\mathcal {G}}_{{\mathbb {A}}}}\), as the angles \(\theta \in {\mathbb {A}}\) which can be written as

$$\begin{aligned} e^{i\theta }= \frac{v_1}{v_2}, \ \ \ \text { with } v_1,v_2 \in {\mathscr {L}} {\setminus } \{0\}. \end{aligned}$$
(6.2)

Here, the quotient \(v_1/v_2\) of \(v_1,v_2 \in {\mathbb {C}}\) is understood in the sense of complex numbers; in particular, \(|v_1| = |v_2|\). In other words, such angles correspond to rotations which map one lattice point to another one. Note that \({{\mathcal {G}}_{{\mathbb {A}}}}\) is countable since \({\mathscr {L}}\) is countable. From an algebraic standpoint, \({{\mathcal {G}}_{{\mathbb {A}}}}\) consists of those angles \(\theta \) such that \(e^{i\theta }\) is a quotient of two elements of the commutative ring \({\mathscr {L}}\).
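The definition (6.2) can be explored numerically. The following is a minimal sketch, not part of the original argument: it enumerates lattice vectors \(a + b\omega \) with \(a,b \in {\mathbb {Z}}\) up to a given squared norm and collects the angles of all quotients \(v_1/v_2\) with \(|v_1| = |v_2|\). The function names, the cutoff parameter `max_norm_sq`, and the reduction of angles to \([0,\pi /3)\) (motivated by the sixfold symmetry of \({\mathscr {L}}\)) are assumptions made here for illustration; they are not taken from the definition of \({\mathbb {A}}\).

```python
import cmath
import math
from collections import defaultdict

OMEGA = complex(0.5, math.sqrt(3) / 2)  # omega = 1/2 + (i/2) sqrt(3)


def lattice_vectors(max_norm_sq):
    """Nonzero pairs (a, b) with |a + b*omega|^2 = a^2 + a*b + b^2 <= max_norm_sq."""
    # a^2 + a*b + b^2 >= (3/4) * max(a, b)^2, which bounds the search window.
    bound = int(2 * math.sqrt(max_norm_sq / 3)) + 1
    return [
        (a, b)
        for a in range(-bound, bound + 1)
        for b in range(-bound, bound + 1)
        if 0 < a * a + a * b + b * b <= max_norm_sq
    ]


def good_angles(max_norm_sq):
    """Angles theta in [0, pi/3) with e^{i theta} = v1/v2 and |v1|^2 = |v2|^2 <= max_norm_sq."""
    by_norm = defaultdict(list)
    for a, b in lattice_vectors(max_norm_sq):
        by_norm[a * a + a * b + b * b].append(a + b * OMEGA)
    sector = math.pi / 3  # illustrative reduction modulo the lattice symmetry
    angles = set()
    for vectors in by_norm.values():
        for v1 in vectors:
            for v2 in vectors:
                theta = cmath.phase(v1 / v2) % sector
                if sector - theta < 1e-9:  # fold rounding noise near pi/3 back to 0
                    theta = 0.0
                angles.add(round(theta, 9))
    return sorted(angles)
```

With the cutoff 7, the only nontrivial angles found come from the twelve lattice vectors of squared norm 7, for instance \(v_1 = 1 + 2\omega \) and \(v_2 = 2 + \omega \), whose quotient has angle \(\arccos (13/14)\). Increasing the cutoff produces more angles, in line with the fact that the bound \(C_\eta \) in Lemma 6.2 restricts which good angles can occur.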

Lemma 6.2

(Touching lattices) Let \(z^\pm =(\theta ^\pm ,\tau ^\pm ,1) \in {\mathcal {Z}}\) be such that

$$\begin{aligned} \Phi (z^+,z^-,\nu ) \leqq \varphi _{\mathrm{hex}}\big (e^{-i\theta ^-}\nu \big ) + \varphi _{\mathrm{hex}}\big (e^{-i\theta ^+} \nu \big ) - \eta \end{aligned}$$
(6.3)

for some \(\eta > 0\). Then, there exists an optimal sequence \(\lbrace X_T\rbrace _T\) for \(\Phi (z^+,z^-,\nu )\), see (5.1), such that for all \(T>0\) large enough, there holds \(X_T \subset {\mathscr {L}}(z^+_T) \cup {\mathscr {L}}(z^-_T)\), where \(z^\pm _T=(\theta ^\pm _T,\tau ^\pm _T,1) \in {\mathcal {Z}}\), and the rotation angles satisfy

$$\begin{aligned} \theta ^+_T - \theta ^-_T = \theta ^+ - \theta ^- \in {{\mathcal {G}}_{{\mathbb {A}}}} \ \ \ \text {for all }T>0. \end{aligned}$$
(6.4)

More precisely, \(e^{i(\theta ^+ - \theta ^-)} = v_1/v_2\) for lattice vectors \(v_1,v_2 \in {\mathscr {L}} {\setminus } \{0\}\) with \(|v_1|,|v_2| \leqq C_{\eta }\), where \(C_{\eta }>0\) only depends on \(\eta \).

Condition (6.3) means that the surface energy between sub-lattices of \({\mathscr {L}}(z^+)\) and \({\mathscr {L}}(z^-)\) can be strictly less than the sum of the surface energies corresponding to each lattice interacting with the vacuum. This indicates that there are many atoms (in a certain sense) in \({\mathscr {L}}(z^+)\) with distance 1 to atoms in \({\mathscr {L}}(z^-)\). Therefore, we speak of lattices which have “touching points”. The lemma shows two properties of optimal sequences: (i) they can be chosen as a subset of two lattices only, cf. also Lemma 5.1, (ii) the difference of the corresponding rotation angles is constant and lies in \({{\mathcal {G}}_{{\mathbb {A}}}}\).

We now proceed with the proofs of the two lemmas.

Proof of Lemma 6.1

For the whole proof, we fix \(\nu \in {\mathbb {S}}^1\) and a sequence of centers \(\lbrace y_T\rbrace _T\).

Proof of \(\mathrm {(i)}\). Let \(z= (\theta ,\tau ,1)\in {\mathcal {Z}}{\setminus } \lbrace {\mathbf {0}}\rbrace \). We only prove the result for \(z^+ = z\) and \(z^- = {\mathbf {0}}\) since the argument for the reflected boundary conditions is the same. We obtain the statement by proving the two inequalities separately: one by a slicing argument and the other one by an explicit construction.

Step 1: First inequality. The goal of this step is to prove

$$\begin{aligned} \frac{1}{T} \min \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ) :\, X_T = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1 Q^\nu _T(y_T)\big \} \geqq \varphi _{\mathrm {hex}}\big (e^{-i\theta } \nu \big ) -C/T. \end{aligned}$$

Consider \(X_T \subset {\mathbb {R}}^2\) satisfying \(X_T = {\mathscr {L}}(z)\) on \(\partial ^+_1 Q^\nu _T(y_T)\), \(X_T={\mathscr {L}}({\mathbf {0}}) = \emptyset \) on \(\partial ^-_1 Q^\nu _T(y_T)\), and

$$\begin{aligned} E_1\big (X_T,Q^\nu _T(y_T)\big ) = \min \big \{E_1\big ({\tilde{X}}_T,Q^\nu _T({y}_T)\big ) :\, {\tilde{X}}_T = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1Q^\nu _T({y}_T)\big \}. \end{aligned}$$
(6.5)

By Lemma 5.1, we get that \(X_T \subset {\mathscr {L}}(z) = e^{i\theta }({\mathscr {L}}+\tau )\). Recall the definition \(\omega = \frac{1}{2}+\frac{i}{2}\sqrt{3}\). We now perform a slicing argument: for \(k \in \lbrace 1,2,3\rbrace \), we define for each \(\mu \in \mathbb R\)

$$\begin{aligned} { I_k(\mu ) := \big \{\lambda e^{i\theta }\omega ^k + \mu e^{i\theta } (\omega ^k)^\perp :\, \lambda \in {\mathbb {R}} \big \} } \end{aligned}$$

that is, the line in lattice direction \(e^{i\theta }\omega ^k\) which intersects the line \(\mathbb Re^{i\theta } (\omega ^k)^\perp \) at the point \(\mu e^{i\theta }(\omega ^k)^\perp \). We set

$$\begin{aligned} {\mathcal {I}}_k =\Big \{\mu \in {\mathbb {R}} :\, I_k(\mu ) \cap {\mathscr {L}}(z) \ne \emptyset , \ I_k(\mu ) \cap [y_T - \tfrac{T}{2} \nu ^\perp ; y_T + \tfrac{T}{2} \nu ^\perp ] \ne \emptyset \Big \}. \end{aligned}$$

Due to the boundary conditions, for all but a bounded number of \(\mu \in {\mathcal {I}}_k\) (with a bound independent of both \(\nu \) and T), we find \(x \in X_T \cap I_k(\mu ) \subset {\mathscr {L}}(z)\) such that \(x+e^{i\theta } \omega ^k \notin X_T\) or \(x- e^{i\theta } \omega ^k \notin X_T\). (Note that a bounded number of lattice lines, independent of T, in direction \(e^{i\theta }\omega ^k\) and passing through \([y_T - \frac{T}{2} \nu ^\perp ; y_T + \frac{T}{2} \nu ^\perp ]\) does not intersect \(\partial ^+_1 Q^\nu _T(y_T)\).) By (2.3) this yields

$$\begin{aligned} E_1\big (X_T, Q^\nu _T(y_T)\big ) \geqq \sum \limits _{k=1}^3 \#{\mathcal {I}}_k -C \end{aligned}$$
(6.6)

for a constant \(C>0\) independent of T. It remains to estimate \(\#{\mathcal {I}}_k\). For \(\mu \in {\mathbb {R}}\) such that \(I_k(\mu )\cap {\mathscr {L}}(z)\ne \emptyset \), we get \(I_k(\mu \pm \sqrt{3}/2)\cap {\mathscr {L}}(z) \ne \emptyset \) and \(I_k(\mu ')\cap {\mathscr {L}}(z) =\emptyset \) for all \(\mu ' \in (\mu -\sqrt{3}/2,\mu +\sqrt{3}/2) {\setminus } \lbrace \mu \rbrace \). Finally, we have

$$\begin{aligned} {\mathcal {L}}^1\Big (\Pi _k\big ([y_T - \tfrac{T}{2} \nu ^\perp ; y_T + \tfrac{T}{2} \nu ^\perp ]\big )\Big ) = T \big |\langle \nu , e^{i\theta } \omega ^k \rangle \big |, \end{aligned}$$

where \(\Pi _k\) denotes the orthogonal projection onto \(\mathbb Re^{i\theta } (\omega ^k)^\perp \). We therefore obtain

$$\begin{aligned} \#{\mathcal {I}}_k \geqq \frac{2T}{\sqrt{3}} \big | \langle \nu , e^{i\theta } \omega ^k \rangle \big | -C =\frac{2T}{\sqrt{3}} \big |\langle e^{-i\theta } \nu , \omega ^k \rangle \big | -C. \end{aligned}$$
(6.7)

By (2.18) and (6.6)–(6.7) we conclude

$$\begin{aligned} \frac{1}{T} E_1\big (X_T, Q^\nu _T(y_T)\big ) \geqq \frac{2}{\sqrt{3}}\sum \limits _{k=1}^3 \big |\langle e^{-i\theta } \nu , \omega ^k \rangle \big | -C/T = \varphi _{\mathrm {hex}}\big (e^{-i\theta } \nu \big ) - C/T. \end{aligned}$$

This along with (6.5) shows the first inequality.
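The counting behind (6.7) can be checked numerically. The sketch below is an illustration under simplifying assumptions that are not part of the proof: it takes \(\theta = 0\), \(\tau = 0\), and \(y_T = 0\), uses the spacing \(\sqrt{3}/2\) between neighbouring lattice lines stated above, and evaluates \(\varphi _{\mathrm {hex}}\) through the expression \(\frac{2}{\sqrt{3}}\sum _{k=1}^3 |\langle \nu , \omega ^k \rangle |\) appearing in the last display. All function names are hypothetical.

```python
import math

OMEGA = complex(0.5, math.sqrt(3) / 2)  # omega = 1/2 + (i/2) sqrt(3)
SPACING = math.sqrt(3) / 2  # distance between neighbouring lattice lines


def inner(u, v):
    """Euclidean inner product of u, v in C, identified with R^2."""
    return (u * v.conjugate()).real


def phi_hex(nu):
    """Right-hand side expression (2/sqrt(3)) * sum_k |<nu, omega^k>|."""
    return 2 / math.sqrt(3) * sum(abs(inner(nu, OMEGA ** k)) for k in (1, 2, 3))


def count_lattice_lines(nu, T, k):
    """Number of lattice lines in direction omega^k hitting [-T/2 nu_perp, T/2 nu_perp].

    The projection of the segment onto (omega^k)_perp is an interval of length
    T * |<nu, omega^k>| centred at 0, and the lines sit at mu in SPACING * Z.
    """
    half_length = 0.5 * T * abs(inner(nu, OMEGA ** k))
    return 2 * math.floor(half_length / SPACING) + 1


def sliced_line_density(nu, T):
    """(1/T) * sum_k #I_k for the simplified setting theta = 0, tau = 0, y_T = 0."""
    return sum(count_lattice_lines(nu, T, k) for k in (1, 2, 3)) / T
```

Since each of the three counts differs from \(\frac{2T}{\sqrt{3}} |\langle \nu , \omega ^k \rangle |\) by at most 1, the values `sliced_line_density(nu, T)` and `phi_hex(nu)` differ by at most \(3/T\), which mirrors the error term \(C/T\) in the statement.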

Step 2: Second inequality. The goal of this step is to prove

$$\begin{aligned} \frac{1}{T} \min \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ) :\, X_T = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1 Q^\nu _T(y_T)\big \} \leqq \varphi _{\mathrm {hex}}\big (e^{-i\theta } \nu \big ) + C/T. \end{aligned}$$
(6.8)

This is achieved by constructing an explicit competitor for the minimization problem: we define \(X^+_T\) by

$$\begin{aligned} X^+_T = {\left\{ \begin{array}{ll} {\mathscr {L}}(z) &{}\text {in } \,\,\{x:\langle x-y_T, \nu \rangle \geqq 5\},\\ \emptyset &{}\text {otherwise,} \end{array}\right. } \end{aligned}$$
(6.9)

that is, \(X^+_T\) is a (discrete version of a) half space. We directly see that \(X^+_T = {\mathscr {L}}(z)\) on \(\partial ^+_1 Q^\nu _T(y_T)\) and \(X^+_T=\emptyset \) on \(\partial ^-_1 Q^\nu _T(y_T)\). To estimate its energy, we start by observing that for this choice of \(X^+_T\) equality holds in (6.6) with \({\mathcal {I}}_k\) as defined above, up to an error of order \(\mathrm{O}(1)\). Indeed, if \(x \in {\mathscr {L}}(z){\setminus } X^+_T\), then either \(x+ \lambda e^{i\theta } \omega ^k \notin X^+_T\) for all \(\lambda \in {\mathbb {N}}\) or \(x-\lambda e^{i\theta } \omega ^k \notin X^+_T\) for all \(\lambda \in {\mathbb {N}}\). Then, the equalities in (6.6) and (6.7) along with (2.18) yield

$$\begin{aligned} \frac{1}{T} E_1\big (X^+_T, Q^\nu _T(y_T)\big ) \leqq \frac{2}{\sqrt{3}}\sum \limits _{k=1}^3 \big | \langle e^{-i\theta } \nu , \omega ^k \rangle \big | +C/T = \varphi _{\mathrm {hex}}\big (e^{-i\theta } \nu \big ) + C/T. \end{aligned}$$
(6.10)

This shows (6.8). For purposes of the proof of (ii) below, we note that construction (6.9) with \(-\nu \) in place of \(\nu \) can be applied to obtain a configuration \(X^-_T \subset {\mathbb {R}}^2\) with \({X}^-_T = {\mathscr {L}}(z)\) on \(\partial ^-_1 Q^\nu _T(y_T)\) and \(X^-_T=\emptyset \) on \(\partial ^+_1 Q^\nu _T(y_T)\) which satisfies (6.10).

Proof of \(\mathrm {(ii)}\). Fix \(z^+ =(\theta ^+,\tau ^+,1) \in {\mathcal {Z}}\) and \(z^- =(\theta ^-,\tau ^-,1) \in {\mathcal {Z}}\). We show the first inequality by an explicit construction. The second one is obtained with the help of Lemma 5.2.

Step 1: First inequality. We define \(X_T = X_T^+ \cup X_T^-\), where

$$\begin{aligned}&X^+_T = {\left\{ \begin{array}{ll} {\mathscr {L}}(z^+) &{}\text {in } \,\,\{x:\langle x-y_T,\nu \rangle \geqq 5\},\\ \emptyset &{}\text {otherwise.} \end{array}\right. }, \\&X^-_T = {\left\{ \begin{array}{ll} {\mathscr {L}}(z^-) &{}\text {in } \,\,\{x:\langle x-y_T, \nu \rangle \leqq -5\},\\ \emptyset &{}\text {otherwise.} \end{array}\right. } \end{aligned}$$

Then, \(X_T\) clearly satisfies the boundary conditions \(X_T = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1 Q^\nu _T(y_T)\) and by repeating the reasoning in (6.10) we find that

$$\begin{aligned} \frac{1}{T}E_1\big (X_T,Q^\nu _T(y_T)\big )&= \frac{1}{T}\Big ( E_1\big (X_T^+,Q^\nu _T(y_T)\big ) + E_1\big (X_T^-,Q^\nu _T(y_T)\big ) \Big ) \\ {}&\leqq \varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ) +C/T. \end{aligned}$$

Step 2: Second inequality. Consider \(X_T \subset {\mathbb {R}}^2\) satisfying \(X_T = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1 Q^\nu _T(y_T)\) and

$$\begin{aligned} E_1\big (X_T,Q^\nu _T(y_T)\big ) = \min \big \{E_1\big ({\tilde{X}}_T,Q^\nu _T({y}_T)\big ) :\, {\tilde{X}}_T = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _1Q^\nu _T({y}_T)\big \}. \end{aligned}$$

By Lemmas 5.1 and 5.2 there holds \(X_T = X^+_T \cup X^-_T = Y^+_T {{\dot{\cup }}} Y^-_T\) on \(Q_T^\nu (y_T)\), where \(Y^\pm _T = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1 Q^\nu _T(y_T)\) and

$$\begin{aligned} \big | \sum \limits _{x \in \partial Y^\pm _T} \# ({\mathcal {N}}(x) \cap Y^\pm _T) - 4 \# \partial Y^\pm _T \big | \leqq 2. \end{aligned}$$

Since \(\#{\mathcal {N}}(x) \leqq 5\) for any \(x \in \partial Y^\pm _T\) (\(\subset \partial X^\pm _T\)), we get

$$\begin{aligned} \frac{1}{2} \sum \limits _{x \in Y^\pm _T \cap Q^\nu _T(y_T)} (6 - \#{\mathcal {N}}(x)) \geqq \frac{1}{2} \# \partial Y^\pm _T \geqq \frac{1}{4} \sum \limits _{x \in \partial Y^\pm _T} \big ( 6 - \# ({\mathcal {N}}(x) \cap Y^\pm _T) \big ) - 1/2. \end{aligned}$$
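For orientation, the second inequality in the last display can be unpacked as follows. This is only a rearrangement of the preceding estimate on \(\sum _{x \in \partial Y^\pm _T} \# ({\mathcal {N}}(x) \cap Y^\pm _T)\) and uses no additional assumption:

$$\begin{aligned} \sum \limits _{x \in \partial Y^\pm _T} \big ( 6 - \# ({\mathcal {N}}(x) \cap Y^\pm _T) \big )&= 6\, \# \partial Y^\pm _T - \sum \limits _{x \in \partial Y^\pm _T} \# ({\mathcal {N}}(x) \cap Y^\pm _T) \\&\leqq 6\, \# \partial Y^\pm _T - \big ( 4\, \# \partial Y^\pm _T - 2 \big ) = 2\, \# \partial Y^\pm _T + 2. \end{aligned}$$

Multiplying by \(1/4\) gives \(\frac{1}{4} \sum _{x \in \partial Y^\pm _T} ( 6 - \# ({\mathcal {N}}(x) \cap Y^\pm _T) ) \leqq \frac{1}{2} \# \partial Y^\pm _T + \frac{1}{2}\). The first inequality holds since each \(x \in \partial Y^\pm _T\) contributes at least 1 to the sum on the left-hand side, while the remaining summands are nonnegative as no atom has more than six neighbours.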

So observing that \(Y^\pm _T\) is a competitor in (Step 1 of) (i) above and using that \(Y^+_T \cap Y^-_T \cap Q^\nu _T(y_T) = \emptyset \), we find that

$$\begin{aligned} \frac{1}{T} E_1\big (X_T,Q^\nu _T(y_T)\big )&\geqq \frac{1}{2T} E_1\big (Y^+_T,Q^\nu _T(y_T)\big ) + \frac{1}{2T} E_1 \big (Y^-_T,Q^\nu _T(y_T)\big ) - 1/T \\&\geqq \frac{1}{2} \varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \frac{1}{2} \varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ) - C/T. \end{aligned}$$

This concludes the proof. \(\square \)

Proof of Lemma 6.2

Let \(\lbrace X_T \rbrace _T\) be an optimal sequence for \(\Phi (z^+,z^-,\nu )\) and denote by \(\lbrace y_T\rbrace _T\) the corresponding centers of the cubes. Due to Lemma 5.1, we may without restriction assume that \(X_T = X^+_T \cup X_T^-\), for sub-configurations \(X^\pm _T\) satisfying \(X_T^\pm \subset {\mathscr {L}}(z^\pm _T) \), where \(z^\pm _T = (\theta ^\pm _T,\tau ^\pm _T,1) \rightarrow z^\pm =(\theta ^\pm ,\tau ^\pm ,1)\) as \(T \rightarrow +\infty \). Moreover, the sets \(\partial X^\pm _T\) defined in (5.2) are connected, and there holds \(X_T= {\mathscr {L}}(z^\pm _T)\) on \(\partial _1^\pm Q^\nu _T(y_T)\). In what follows, we fix a subsequence (not relabeled) along which the limit below exists, such that by (6.3) we have

$$\begin{aligned} \varphi _{\mathrm{hex}}\big (e^{-i\theta ^+}\nu \big ) + \varphi _{\mathrm{hex}}\big (e^{-i\theta ^-}\nu \big )-\lim _{T\rightarrow +\infty } \frac{1}{T}E_1\big (X_T,Q^\nu _T(y_T)\big ) \geqq \eta > 0. \end{aligned}$$
(6.11)

Our strategy to show (6.4) lies in proving

$$\begin{aligned} e^{i(\theta _T^+ - \theta _T^-)} = \frac{v_{T}^+}{v_{T}^-} \ \ \ \text {with} \ \ \ v_{T}^+,v_{T}^- \in {\mathscr {L}} \ \ \ \text {satisfying} \ \ \ |v_{T}^+| = |v_{T}^-|\leqq C_\eta \end{aligned}$$
(6.12)

for all T sufficiently large, where \(C_\eta \) only depends on \(\eta \). From this estimate, the statement in (6.4) easily follows. In fact, given (6.12), since \({\mathscr {L}}\) is a discrete set and \(\theta ^\pm _T \rightarrow \theta ^\pm \), \(e^{i(\theta ^+_T - \theta ^-_T)} = v_{T}^+/v_{T}^-\) is eventually constant and we find \(\theta ^+ -\theta ^- = \theta _T^+- \theta _T^- \in {{\mathcal {G}}_{{\mathbb {A}}}}\) for all T large enough.

Let us come to the proof of (6.12). Recall by Lemma 5.1 that \(X_T\) is contained in the two components \(X_T^+\) and \(X_T^-\). We further define the set of touching points

$$\begin{aligned} {\mathcal {T}}^+_{T}&= \{x \in X_T^+:\, \exists \, y \in X_T^- \text { such that } |x-y|=1\}, \\ {\mathcal {T}}_{T}^-&= \{x \in X_T^-:\, \exists \, y \in X_T^+ \text { such that } |x-y|=1\}. \end{aligned}$$

Note that \({\mathcal {T}}_{T}^\pm \subset \bigcup _{x\in \partial X_T^\pm } (\lbrace x\rbrace \cup {\mathcal {N}}(x))\), see definition (5.2). (\({\mathcal {T}}_{T}^\pm {\setminus } \partial X_T^\pm \ne \emptyset \) is possible if \(X_T^+ \cap X_T^- \ne \emptyset \).) By (2.2) we also observe that

$$\begin{aligned} \# {\mathcal {T}}_{T}^+ / 6 \leqq \# {\mathcal {T}}_{T}^- \leqq 6\# {\mathcal {T}}_{T}^+. \end{aligned}$$
(6.13)

We start with a brief outline of the proof. Steps 1–4 are devoted to some preliminary estimates: we first show that the cardinality of the sets \( \partial X_T^\pm \) and \( {\mathcal {T}}_{T}^\pm \) scales like T by providing a lower bound for \( {\mathcal {T}}_{T}^\pm \) (Step 1) and an upper bound for \(\partial X_T^\pm \) (Step 2). Then we show that, for the majority of points in \( {\mathcal {T}}_{T}^\pm \), neighborhoods contain many points of \( \partial X_T^\pm \) (Step 3) and also elements of \( {\mathcal {T}}_{T}^\pm \) (Step 4). Based on this, we can find quadrilaterals consisting of two points in \({\mathcal {T}}^+_{T}\) and two points in \( {\mathcal {T}}^-_{T}\) where two sides have length 1 and the other two sides are parallel to lattice vectors of the form \(e^{i\theta ^+_T}w_T^+\) and \(e^{i\theta ^-_T}w_T^-\), respectively, for some \(w_T^+, w_T^- \in {\mathscr {L}}\) with controlled norm. From this, (6.12) can be derived (Step 5 and Step 6).

Step 1: Cardinality of touching points. We show \(\# {\mathcal {T}}_{T}^\pm \geqq \frac{\eta }{22}T\) for T large enough. By (2.2), (2.3), and the fact that \(X^\pm _T= {\mathscr {L}}(z^\pm _T)\) on \(\partial _1^\pm Q^\nu _T(y_T)\), we obtain

$$\begin{aligned} E_1\big (X_T,Q^\nu _T(y_T)\big )&\geqq \frac{1}{2}\sum _{x \in X_T^+ \cap Q^\nu _T(y_T)} \big (6-\#({\mathcal {N}}(x) \cap X_T^+)\big ) \\&\quad + \frac{1}{2} \sum _{x \in X_T^- \cap Q^\nu _T(y_T)} \big (6-\#({\mathcal {N}}(x) \cap X_T^-)\big ) \\&\quad \ \ \ - 3(\#{\mathcal {T}}_{T}^+ + \#{\mathcal {T}}_{T}^-), \end{aligned}$$

and therefore,

$$\begin{aligned} 3(\#{\mathcal {T}}_{T}^+ + \#{\mathcal {T}}_{T}^-) \geqq E_1\big (X_T^+,Q^\nu _T(y_T)\big ) + E_1\big (X_T^-,Q^\nu _T(y_T)\big ) - E_1\big (X_T,Q^\nu _T(y_T)\big ). \end{aligned}$$

We note by the definition of \(X_T\) that the subconfigurations \(X_T^+\) and \(X_T^-\) are competitors for the minimization problems appearing in Lemma 6.1(i). Dividing by 3T and passing to the \(\liminf \) along \(T\rightarrow +\infty \), by (6.11) we therefore conclude

$$\begin{aligned} \liminf _{T\rightarrow +\infty } \, \frac{1}{T} ( \#{\mathcal {T}}_{T}^+ + \#{\mathcal {T}}_{T}^- ) \geqq \eta /3. \end{aligned}$$

This yields \(\liminf _{T\rightarrow +\infty } \frac{1}{T}\#{\mathcal {T}}_{T}^\pm \geqq \frac{\eta }{21}\) by (6.13), and concludes Step 1.
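The passage from the sum to the individual cardinalities can be made explicit. The following short computation only uses (6.13), which gives \(\#{\mathcal {T}}_{T}^+ + \#{\mathcal {T}}_{T}^- \leqq 7\, \#{\mathcal {T}}_{T}^\pm \):

$$\begin{aligned} \liminf _{T\rightarrow +\infty } \, \frac{1}{T} \#{\mathcal {T}}_{T}^\pm \geqq \frac{1}{7} \liminf _{T\rightarrow +\infty } \, \frac{1}{T} \big ( \#{\mathcal {T}}_{T}^+ + \#{\mathcal {T}}_{T}^- \big ) \geqq \frac{1}{7} \cdot \frac{\eta }{3} = \frac{\eta }{21} > \frac{\eta }{22}. \end{aligned}$$

Since the last inequality is strict, \(\# {\mathcal {T}}_{T}^\pm \geqq \frac{\eta }{22}T\) holds for all T large enough, as claimed at the beginning of Step 1.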

Step 2: A priori bound on the length of the boundaries. We claim that for \(T > 0\) large enough the boundaries \(\partial X_T^\pm \subset Q^\nu _T(y_T)\) (cf. (5.2)) satisfy

$$\begin{aligned} \# ( \partial X_T^+ \cup \partial X_T^- ) \leqq 8T. \end{aligned}$$
(6.14)

In fact, by Lemma 5.1(ii) there holds \(\#{\mathcal {N}}(x) \leqq 5\) for all \(x \in \partial X_T^\pm \) and therefore for T sufficiently large we get by (2.3), (6.11), and the fact that \(\Vert \varphi _{\mathrm{hex}} \Vert _{L^\infty ({\mathbb {S}}^1)} = 2 \) (see (2.18))

$$\begin{aligned} \# ( \partial X_T^+ \cup \partial X_T^- )&\leqq \sum \limits _{x \in X_T \cap Q_T^\nu (y_T)} (6-\#{\mathcal {N}}(x)) = 2\, E_1\big (X_T,Q^\nu _T(y_T)\big ) \\&\leqq 2T \big ( \varphi _{\mathrm{hex}}\big (e^{-i\theta ^+}\nu \big ) + \varphi _{\mathrm{hex}}\big (e^{-i\theta ^-}\nu \big ) \big )\leqq 8T. \end{aligned}$$

Step 3: Atomic density lower bound for \(\partial X_T^\pm \). We claim that there exists a universal \(0< c <1\) such that, for all \(T> r \geqq 1\), we have

$$\begin{aligned} \#\big ( \partial X_T^\pm \cap B_r(x) \big ) \geqq c r \quad \text {for all }x \in \mathbb R^2 \text { with } \mathrm {dist}(x, \partial X_T^\pm ) \leqq 1. \end{aligned}$$
(6.15)

To prove this estimate we assume without restriction that \(T > 3r\). Due to Lemma 5.1(ii), \(\partial X^\pm _T\) is connected and \(\partial X_T^\pm {\setminus } B_r(x) \ne \emptyset \). Therefore, there has to exist a simple path in \(\partial X^\pm _T\) that connects some atom in \(\partial X_T^\pm {\setminus } B_r(x)\) with an atom in \(\overline{B_1(x)}\) and has at least cr atoms inside \(B_r(x)\).

Step 4: Bounded gap between points in \({\mathcal {T}}_{T}^\pm \). Given \(R>0\), we introduce the set of R-isolated points by

$$\begin{aligned} {\mathcal {I}}^\pm _{T,R} := \big \{ x\in {\mathcal {T}}_T^\pm :\, B_R(x) \cap {\mathcal {T}}_T^\pm \subset \overline{B_2(x)} \big \}. \end{aligned}$$
(6.16)

We claim that there exists a universal \({\bar{c}} > 0\) such that for \(R={\bar{c}}/\eta \) and all T sufficiently large

$$\begin{aligned} \# {\mathcal {I}}^\pm _{T,R} \leqq \# {\mathcal {T}}_T^\pm /2. \end{aligned}$$
(6.17)

To see this, note that due to (6.14), (6.15) for \(r=R/2\) (use that \( \mathrm {dist}(x, \partial X_T^\pm ) \leqq 1\) for all \(x \in {\mathcal {T}}_T^\pm \)) and Step 1 we have

$$\begin{aligned} \# {\mathcal {I}}^\pm _{T,R} \leqq \frac{2}{cR} \sum \limits _{x \in {\mathcal {I}}^\pm _{T,R}} \#\big (\partial X_T^\pm \cap B_{R/2}(x) \big ) \leqq \frac{C}{cR} \#\partial X_T^\pm \leqq \frac{C}{cR}T \leqq \frac{C}{c {\bar{c}}} \# {\mathcal {T}}_{T}^\pm , \end{aligned}$$

where \(C>0\) denotes a universal constant which may vary from inequality to inequality. Here, in the second inequality we accounted for possible multiple counting by using that, due to the definition of \({\mathcal {I}}^\pm _{T,R}\), the intersection \(B_{R/2}(x) \cap B_{R/2}(y)\), \(x,y \in {\mathcal {I}}^\pm _{T,R}\), can be non-empty only if \(|x-y| \leqq 2\). The assertion follows if \({\bar{c}}\) is chosen big enough.
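To see how large \({\bar{c}}\) has to be, one may track the constants explicitly. In the sketch below, \(C_0\) denotes the universal constant in the third inequality of the last display (a name introduced here only for illustration), and we use \(\# {\mathcal {T}}_{T}^\pm \geqq \frac{\eta }{22}T\) from Step 1 together with \(R = {\bar{c}}/\eta \):

$$\begin{aligned} \frac{C_0}{cR}\, T = \frac{C_0\, \eta }{c\, {\bar{c}}}\, T \leqq \frac{C_0\, \eta }{c\, {\bar{c}}} \cdot \frac{22}{\eta }\, \# {\mathcal {T}}_{T}^\pm = \frac{22\, C_0}{c\, {\bar{c}}}\, \# {\mathcal {T}}_{T}^\pm . \end{aligned}$$

Hence (6.17) holds as soon as \({\bar{c}} \geqq 44\, C_0 / c\). Note that \(\eta \) cancels, so that \({\bar{c}}\) can indeed be chosen universal.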

Step 5: Bounded gap between pairs of points having the same relative position. We choose two arbitrary lattice vectors \(\xi _1,\xi _2\) satisfying \(e^{-i\theta ^-_T}\xi _1, e^{-i\theta _T^+}\xi _2 \in B_{2R} \cap ({\mathscr {L}}{\setminus } \{0\})\) with \(R>0\) given by Step 4. Define

$$\begin{aligned} {\mathcal {D}}^{\xi _1,\xi _2}_{T}&=\big \{(x_1,y_1) \in {\mathcal {T}}_{T}^- \times {\mathcal {T}}_{T}^-:\, \text {there exist }x_2,y_2 \in {\mathcal {T}}_{T}^+ \text { such that} \\&\ \ \quad |x_1 - x_2| = 1, \, |y_1 - y_2| = 1 \text { and } x_1-y_1=\xi _1,\, x_2-y_2=\xi _2 \big \}. \end{aligned}$$

The set consists of pairs \((x_1,y_1)\) in \({\mathcal {T}}_T^-\) whose difference is \(\xi _1\) and which have corresponding neighbors in \({\mathcal {T}}_T^+\) with difference \(\xi _2\).

We observe by (6.16) that for \(x_1 \in {\mathcal {T}}_{T}^- {\setminus } {\mathcal {I}}^-_{T,R}\) we find \(\xi _1 \in B_R \cap e^{i\theta _T^-} {\mathscr {L}}\) with \(|\xi _1| > 2\) and \(y_1 \in {\mathcal {T}}_{T}^-\) such that \(x_1 - y_1 = \xi _1\). We denote the corresponding neighbors in \({\mathcal {T}}_{T}^+\) by \(x_2\) and \(y_2\), respectively. Since \(x_2,y_2 \in e^{i\theta _T^+} ({\mathscr {L}}+ \tau _T^+ )\) and \(|x_1-x_2| = |y_1 - y_2| =1\), we find \(\xi _2 \in B_{2R} \cap e^{i\theta _T^+}{\mathscr {L}}\) such that \(x_2 - y_2 = \xi _2\). Here we used \(|\xi _2| \leqq |\xi _1| + 2 \leqq R + 2 < 2R\), where \(R > 2\) follows from the existence of \(\xi _1\). Moreover, \(\xi _2 \ne 0\) since \(|\xi _2| \geqq |\xi _1| - 2 > 0\). This discussion along with (6.17) implies

$$\begin{aligned} \frac{1}{2} \#{\mathcal {T}}_{T}^- \leqq \# \big ( {\mathcal {T}}_T^-{\setminus } {\mathcal {I}}^-_{T,R}\big ) \leqq \sum \limits _{(\xi _1,\xi _2)} \# {\mathcal {D}}^{\xi _1,\xi _2}_{T}, \end{aligned}$$
(6.18)

where the sum runs over all pairs \((\xi _1,\xi _2)\) with \(e^{-i\theta _T^-}\xi _1, e^{-i\theta _T^+}\xi _2 \in B_{2R} \cap ({\mathscr {L}}{\setminus } \{0\})\). Choose \((\zeta _1^T,\zeta _2^T) \in (B_{2R} \cap e^{i\theta _T^-}({\mathscr {L}}{\setminus } \{0\})) \times (B_{2R} \cap e^{i\theta _T^+}({\mathscr {L}}{\setminus } \{0\}))\) such that \(\#{\mathcal {D}}^{\zeta _1^T,\zeta _2^T}_{T} \geqq \#{\mathcal {D}}^{\xi _1,\xi _2}_{T}\) for all \((\xi _1,\xi _2) \in (B_{2R} \cap e^{i\theta _T^-}({\mathscr {L}}{\setminus } \{0\})) \times (B_{2R} \cap e^{i\theta _T^+}({\mathscr {L}}{\setminus } \{0\}))\). Then, (6.18) and the fact that the number of pairs \(e^{-i\theta _T^-}\xi _1,e^{-i\theta _T^+}\xi _2 \in B_{2R} \cap ({\mathscr {L}}{\setminus } \{0\})\) is controlled by \(CR^4\) yield

$$\begin{aligned} \#{\mathcal {T}}_{T}^- \leqq CR^4 \, \# {\mathcal {D}}^{\zeta ^T_1,\zeta ^T_2}_{T} \end{aligned}$$
(6.19)

for a universal \(C > 0\). We write \({\mathcal {D}}^{\zeta _1^T,\zeta _2^T}_{T} =\{(x_{j}^T,y_{j}^T)\}_{j=1}^{M_T}\) for some \(M_T \in {\mathbb {N}}\). We claim that there is a universal \(c' > 0\) such that for \(\varrho = c' \eta ^{-5}\)

$$\begin{aligned} \text {there exist }j,k,l \in \lbrace 1,\ldots ,M_T\rbrace \text { pairwise distinct such that }x_k^T,x_l^T \in B_\varrho (x_j^T). \end{aligned}$$
(6.20)

Assume, on the contrary, that \(\varrho \) is such that each \(B_{\varrho }(x_k^T) {\setminus } \lbrace x_k^T \rbrace \) contains at most one point of \(\{x_{j}^T\}_{j=1}^{M_T}\). Then, it is elementary to see that we can choose \(\lbrace {\tilde{x}}_j^T \rbrace _{j=1}^{\lceil M_T/2 \rceil } \subset \{x_{j}^T\}_{j=1}^{M_T}\) such that \(B_{\varrho /2}({\tilde{x}}^T_j) \cap B_{\varrho /2}({\tilde{x}}^T_k) = \emptyset \) for \(j,k \in \lbrace 1,\ldots , \lceil M_T/2 \rceil \rbrace \), \(j\ne k\). This along with (6.14), (6.15), and \(2\lceil M_T/2 \rceil \geqq \#{\mathcal {D}}^{\zeta _1^T,\zeta _2^T}_{T}\) implies

$$\begin{aligned} \#{\mathcal {D}}^{\zeta _1^T,\zeta _2^T}_{T}&\leqq 2\Big \lceil \frac{M_T}{2} \Big \rceil \leqq \frac{4}{c\varrho } \sum _{j=1}^{\lceil M_T/2 \rceil } \#\big (\partial X_T^- \cap B_{\varrho /2}({\tilde{x}}_j^T) \big ) \leqq \frac{4}{c\varrho } \# \partial X_T^- \leqq \frac{32T}{c\varrho }. \end{aligned}$$

From (6.19), \(\# {\mathcal {T}}_{T}^- \geqq \frac{\eta }{22}T\) (see Step 1), and the choice \(R = {\bar{c}}/\eta \) in Step 4 we then get \(\varrho \leqq c' \eta ^{-5}/2\) for a universal \(c' > 0\). The assertion of (6.20) is thus guaranteed for \(\varrho = c' \eta ^{-5}\). This concludes Step 5.
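The exponent \(-5\) can be traced through the constants. The computation below is a sketch combining (6.19), the bound \(\# {\mathcal {T}}_{T}^- \geqq \frac{\eta }{22}T\) from Step 1, the previous display, and \(R = {\bar{c}}/\eta \); here \(C\) is the universal constant from (6.19):

$$\begin{aligned} \frac{\eta }{22}\, T \leqq \# {\mathcal {T}}_{T}^- \leqq C R^4 \, \# {\mathcal {D}}^{\zeta ^T_1,\zeta ^T_2}_{T} \leqq C\, \frac{{\bar{c}}^{\,4}}{\eta ^4} \cdot \frac{32\, T}{c\, \varrho } \quad \Longrightarrow \quad \varrho \leqq \frac{704\, C\, {\bar{c}}^{\,4}}{c}\, \eta ^{-5}. \end{aligned}$$

Thus one admissible choice is \(c' = 1408\, C\, {\bar{c}}^{\,4}/c\): four powers of \(\eta ^{-1}\) stem from the number of pairs \((\xi _1,\xi _2)\), and one from the density of touching points.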

Step 6: Conclusion. We denote the three atoms identified in (6.20) by \(x_1^1, x_1^2, x_1^3\) (for convenience, we use a different notation and labeling), and denote by \(y_1^1,y_1^2,y_1^3\) the corresponding points such that \((x^j_1,y^j_1) \in {\mathcal {D}}^{\zeta _1^T,\zeta _2^T}_T\) for \(j\in \lbrace 1,2,3\rbrace \). In particular, recall that

$$\begin{aligned} |x^1_1-x^2_1|, \ \ |x^1_1-x^3_1|, \ \ |x^2_1-x^3_1|\leqq 2 \varrho . \end{aligned}$$
(6.21)

By the definition of \({\mathcal {D}}^{\zeta _1^T,\zeta _2^T}_T\), there exist \((x^1_2,y^1_2),(x^2_2,y^2_2),(x^3_2,y^3_2)\) such that \(|x^j_1-x^j_2|=|y^j_1-y^j_2|=1\), \(\zeta _1^T=x^j_1-y^j_1\), and \(\zeta _2^T=x^j_2-y^j_2\) for \(j \in \lbrace 1,2,3 \rbrace \). For each j, the four points \(\{x^j_1,x^j_2,y^j_2,y^j_1\}\) form a quadrilateral (possibly self-intersecting) with two edges of length one and two edges oriented in \(\zeta ^T_1\) and \(\zeta ^T_2\), respectively. We distinguish two cases: (a) \(\zeta _1^T=\zeta _2^T\) and (b) \(\zeta _1^T\ne \zeta _2^T\).

Case \(\mathrm {(a)}\): We have that \(x^1_1-y^1_1=x^1_2-y^1_2\), where \(x^1_1-y^1_1=e^{i\theta _T^-} v_1\) and \(x^1_2-y^1_2= e^{i \theta _T^+}v_2\) for \(v_1,v_2 \in ({\mathscr {L}}{\setminus } \lbrace 0 \rbrace ) \cap B_{2R}\). Then \(e^{i\theta _T^-} v_1 = e^{i\theta _T^+} v_2\) and thus (6.12) holds for \(v^+_T =v_1\) and \(v^-_T = v_2\) with \(|v^+_T|, |v^-_T| \leqq 2 R = 2 {\bar{c}}/\eta \).

Case \(\mathrm {(b)}\): Note that two of the three quadrilaterals \(\{x^j_1,x^j_2,y^j_2,y^j_1\}\), \(j \in \lbrace 1,2,3\rbrace \), are necessarily translates of each other. In fact, there are only two different quadrilaterals (up to translation) with fixed order of the sides, prescribed side-length 1 of two opposite edges, and prescribed length and orientation of the other two edges, see Fig. 12.

Fig. 12 The two possible quadrilaterals in Step 6, where \(\xi _1,\xi _2\) are given distinct vectors and \(\nu _{1,1}, \nu _{2,1}, \nu _{1,2}, \nu _{2,2}\) denote the possible sides of length 1
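The claim that only two quadrilaterals are possible can be verified by a short computation. In the sketch below, \(u = x_2 - x_1\) and \(w = y_2 - y_1\) denote the two sides of length 1 and \(d = \xi _2 - \xi _1 \ne 0\); these names are introduced here only for illustration. From \(\xi _1 = x_1 - y_1\) and \(\xi _2 = x_2 - y_2\) we get \(u - w = d\), and the conditions \(|u| = |w| = 1\) yield

$$\begin{aligned} 1 = |u - d|^2 = 1 - 2 \langle u, d \rangle + |d|^2 \quad \Longrightarrow \quad \langle u, d \rangle = \frac{|d|^2}{2} \quad \Longrightarrow \quad u = \frac{d}{2} \pm \sqrt{1 - \frac{|d|^2}{4}}\; \frac{d^\perp }{|d|}. \end{aligned}$$

Hence, for prescribed \(\xi _1 \ne \xi _2\), there are at most two admissible choices of \(u\) (exactly one if \(|d| = 2\)), and \(w = u - d\) is then determined. Among three quadrilaterals with the same \(\xi _1\) and \(\xi _2\), at least two therefore share the same \(u\) and \(w\), that is, they are translates of each other.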

Without restriction, assume that the quadrilaterals for \(j=1\) and \(j=2\) are translates of each other. Then we get \(x_1^1 - x_2^1 =x_1^2 - x_2^2\). We write \(x_1^j = e^{i \theta _T^-} (b_1^j + \tau _T^-)\) and \(x_2^j = e^{i \theta ^+_T} (b_2^j + \tau ^+_T)\) for suitable \(b^j_1, b^j_2 \in {\mathscr {L}}\) for \(j \in \lbrace 1,2 \rbrace \). (Note that the lattice vectors depend on T which we do not include in the notation for convenience.) Then \(x_1^1 - x_2^1 =x_1^2 - x_2^2\) implies \(e^{i \theta _T^-} (b_1^1 - b_1^2) = e^{i \theta ^+_T} (b_2^1 - b_2^2)\). Since \(x_1^1 \ne x_1^2\) we have \(b_1^1 - b_1^2 \ne 0\) and thus also \(b_2^1 - b_2^2 \ne 0\), and therefore,

$$\begin{aligned} e^{i (\theta ^+_T - \theta ^-_T)} = \frac{b_1^1 - b_1^2}{b_2^1 - b_2^2}. \end{aligned}$$

Due to (6.21), we obtain \(|b_1^1 - b_1^2| = |x_1^1 - x_1^2|\leqq 2\varrho \) and, since \(|b_1^1 - b_1^2| = |b_2^1 - b_2^2|\), also \(|b_2^1 - b_2^2| \leqq 2\varrho \). As we clearly also have \(b_1^1 - b_1^2, b_2^1 - b_2^2 \in {\mathscr {L}}\), we derive that (6.12) holds for \(v_{T}^+:= b_1^1 - b_1^2\) and \(v_{T}^- := b_2^1 - b_2^2\) with \(|v^+_T|, |v^-_T| \leqq 2 \varrho = 2c'\eta ^{-5}\). As explained below (6.12), (6.12) implies (6.4), and therefore the proof is concluded. \(\square \)

7 Cell Formula Part II: Relation of Converging and Fixed Boundary Values

In this final section about cell formulas we show that converging boundary conditions as in the cell formula \(\Phi \), see (4.1), can be replaced by fixed boundary values. Moreover, we show Proposition 2.2 and the properties of \(\varphi \) stated in Theorem 2.5. We introduce the auxiliary function

$$\begin{aligned} {\bar{\varphi }}(z^+,z^-,\nu )&:= \liminf _{T \rightarrow +\infty } \frac{1}{T} \inf \big \{E_{1}\big (X_T,Q^\nu _{T}(y_T)\big ) :\, y_T \in \mathbb R^2, \nonumber \\&\ \qquad \qquad \qquad \qquad \quad \ X_T = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _{1}Q^\nu _{T}(y_T) \big \} \end{aligned}$$
(7.1)

for \(z^\pm \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\). The main goal of this section is to prove the following two statements:

Lemma 7.1

For each \(z^+, z^- \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\) it holds that

$$\begin{aligned} \Phi (z^+,z^-,\nu ) = {\bar{\varphi }}(z^+,z^-,\nu ). \end{aligned}$$
(7.2)

Moreover, for \(z^\pm = (\theta ^\pm ,\tau ^\pm ,1) \in {\mathcal {Z}}\) with \(\{(x,y) \in {\mathscr {L}}(z^+) \times {\mathscr {L}}(z^-) :\, |x - y| = 1 \} = \emptyset \), we have \( {\bar{\varphi }}(z^+,z^-,\nu ) = {\varphi }_{\mathrm{hex}}\big (e^{-i\theta ^+}\nu \big ) + {\varphi }_{\mathrm{hex}}\big (e^{-i\theta ^-}\nu \big ).\)

Proposition 7.2

For every \(z^+,z^- \in {\mathcal {Z}}\), \(\nu \in {\mathbb {S}}^1\), and every sequence \(\lbrace y_T\rbrace _T \subset \mathbb R^2\) there exists

$$\begin{aligned} {\bar{\varphi }}(z^+,z^-,\nu ) = \lim _{T\rightarrow +\infty }\frac{1}{T}\min \left\{ E_1\big (X_T,Q^\nu _T(y_T)\big ):\, X_T = {\mathscr {L}}(z^\pm ) \text { on } \partial _1^\pm Q^\nu _T(y_T) \right\} \end{aligned}$$
(7.3)

and is independent of \(\lbrace y_T\rbrace _T\). In particular, we get \(\varphi \equiv {\bar{\varphi }}\), and the statement of Proposition 2.2 holds.

We point out that Lemma 7.1, Proposition 7.2, and Lemma 4.1 conclude the proof of Proposition 3.4. Section 7.1 is devoted to the proof of Lemma 7.1. Afterwards, in Section 7.2, we show Proposition 7.2 (which particularly yields Proposition 2.2) and we prove further properties of the density \(\varphi \) stated in Theorem 2.5. Then, all proofs of our main results announced in Section 2.3 are concluded.

7.1 Converging and Fixed Boundary Values

This subsection is devoted to the proof of Lemma 7.1. By definition it is clear that \(\Phi (z^+,z^-,\nu ) \leqq {\bar{\varphi }}(z^+,z^-,\nu )\) for all \(z^+,z^- \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\). To see (7.2), it therefore suffices to prove the opposite inequality

$$\begin{aligned} \Phi (z^+,z^-,\nu ) \geqq {\bar{\varphi }}(z^+,z^-,\nu ). \end{aligned}$$
(7.4)

Moreover, we observe that if \(z^+ = {\mathbf {0}}\) or \(z^- = {\mathbf {0}}\), then Lemma 6.1(i) and the continuity of \(\varphi _{\mathrm{hex}}\) imply \(\Phi (z^+,z^-,\nu ) = {\bar{\varphi }}(z^+,z^-,\nu ) = \varphi _{\mathrm {hex}}(e^{-i\theta } \nu )\), where \(\theta \) is the angle corresponding to \(z^+\) or \(z^-\), respectively. Therefore, it suffices to treat the case \(z^\pm = (\theta ^\pm ,\tau ^\pm ,1) \in {\mathcal {Z}}\). To this end, it is crucial that converging boundary values as in (4.1) can be replaced by fixed ones. We split the analysis into two steps by first addressing the rotations and then the translations. We start with the rotations. In view of Lemma 6.2, we may without restriction assume that \(\theta ^+ - \theta ^- \in {{\mathcal {G}}_{{\mathbb {A}}}}\) since otherwise \(\Phi (z^+,z^-,\nu ) \geqq {\varphi }_{\mathrm{hex}}(e^{-i\theta ^+}\nu ) + {\varphi }_{\mathrm{hex}}(e^{-i\theta ^-}\nu )\) and (7.4) follows from Lemma 6.1(ii). Lemma 6.2 already implies that the difference of rotations \(\theta ^+_T - \theta ^-_T\) is constant in T. The next lemma shows that also \(\theta _T^+\) and \(\theta _T^-\) can be chosen to be constant.

Lemma 7.3

(Fixed rotations) Consider \(z_T^\pm =(\theta _T^\pm ,\tau ^\pm _T,1) \in {\mathcal {Z}}\) such that \(\theta _T^+-\theta _T^-= \theta ^+-\theta ^-\) for all \(T>0\) for some \(\theta ^+,\theta ^-\in {\mathbb {A}}\) and \(\theta _T^\pm \rightarrow \theta ^\pm \). Let \(\nu \in {\mathbb {S}}^1\). Then, there holds

$$\begin{aligned}&\liminf _{T \rightarrow +\infty } \frac{1}{T} \inf \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ) :\, y_T \in {\mathbb {R}}^2, \, X_T= {\mathscr {L}}(z^\pm _T) \text { on } \partial ^\pm _1 Q^\nu _T(y_T) \big \} \\&\quad \geqq \liminf _{T \rightarrow +\infty } \frac{1}{T} \inf \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ) :\, y_T \in {\mathbb {R}}^2, \, X_T = {\mathscr {L}}({\hat{z}}^\pm _T) \text { on } \partial ^\pm _1 Q^\nu _T(y_T) \big \}, \end{aligned}$$

where \({\hat{z}}^\pm _T := (\theta ^\pm ,\tau ^\pm _T,1)\).

We defer the proof and proceed with the properties of translations. Again consider \(z^\pm =(\theta ^\pm ,\tau ^\pm ,1) \in {\mathcal {Z}}\) with \(\theta ^+ - \theta ^- \in {{\mathcal {G}}_{{\mathbb {A}}}}\). Recall by (6.2) that there holds \(e^{i(\theta ^+ - \theta ^-)}= \frac{v_1}{v_2}\) for \(v_1,v_2 \in {\mathscr {L}}\cap {\mathbb {C}}\) with \(|v_1|=|v_2|\). We consider the coincidence site lattice

$$\begin{aligned} e^{i\theta ^+}{\mathscr {L}} \cap e^{i\theta ^-}{\mathscr {L}} = \{ ja +kb:j,k\in {\mathbb {Z}} \}, \end{aligned}$$
(7.5)

where \(a,b \in e^{i\theta ^+}{\mathscr {L}} \cap e^{i\theta ^-}{\mathscr {L}} \) are spanning vectors of minimal length. Then, for later purposes, we define the fundamental parallelogram of \(e^{i\theta ^+}{\mathscr {L}} \cap e^{i\theta ^-}{\mathscr {L}}\) by

$$\begin{aligned} {P}_{\theta ^+,\theta ^-} = \big \{ \lambda _1a+\lambda _2b : \, 0\leqq \lambda _1< 1, \, 0 \leqq \lambda _2 < 1\big \}. \end{aligned}$$
(7.6)
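To make the coincidence site lattice in (7.5) concrete, the following Python sketch computes it numerically in one example. It assumes that \({\mathscr {L}}\) is the unit triangular lattice \(\{p + q\omega : p,q \in {\mathbb {Z}}\}\) with \(\omega = e^{i\pi /3}\), and it chooses \(\theta ^- = 0\) and \(e^{i\theta ^+} = v_1/v_2\) with \(v_1 = 2+\omega \), \(v_2 = 1+2\omega \). These choices and all function names are illustrative and are not taken from the text.

```python
import cmath
import math

# Assumption: the reference lattice is the unit triangular lattice p + q*omega.
OMEGA = cmath.exp(1j * math.pi / 3)


def lattice_point(p, q):
    """Return the lattice vector p + q*omega as a complex number."""
    return p + q * OMEGA


def in_lattice(z, tol=1e-9):
    """Check whether z = p + q*omega for integers p, q (up to tol)."""
    q = z.imag / (math.sqrt(3) / 2)
    p = z.real - q / 2
    return abs(p - round(p)) < tol and abs(q - round(q)) < tol


def coincidence_points(rotation, n=6):
    """Points of rotation*L (coefficients in [-n, n]) that also lie in L."""
    points = []
    for p in range(-n, n + 1):
        for q in range(-n, n + 1):
            z = rotation * lattice_point(p, q)
            if in_lattice(z):
                points.append(z)
    return points


# Two lattice vectors of equal length: |v1|^2 = |v2|^2 = 7.
v1 = lattice_point(2, 1)
v2 = lattice_point(1, 2)
rotation = v1 / v2  # unit complex number e^{i(theta^+ - theta^-)}

csl = coincidence_points(rotation)
shortest = min(abs(z) for z in csl if abs(z) > 1e-9)
print("rotation angle (degrees):", math.degrees(cmath.phase(rotation)))
print("shortest nonzero coincidence vector has squared length", round(shortest**2, 9))
```

In this example the shortest nonzero coincidence vectors have length \(\sqrt{7}\), so spanning vectors \(a,b\) as in (7.5) can be chosen with \(|a| = |b| = \sqrt{7}\).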

We will use the following uniform closedness property of the set of touching points between sequences of translates of two perfect lattices.

Lemma 7.4

(Closedness of touching points) Consider \(z_n^\pm = (\theta ^\pm ,\tau ^\pm _n,1) \in {\mathcal {Z}}\) for \(n \in \mathbb N\) and \(z^\pm = (\theta ^\pm ,\tau ^\pm ,1) \in {\mathcal {Z}}\) such that \(\theta ^+ -\theta ^-\in {{\mathcal {G}}_{{\mathbb {A}}}}\) and \(\tau ^\pm _n \rightarrow \tau ^\pm \). For \(x \in {\mathscr {L}}(z^+)\), \(y \in {\mathscr {L}}(z^-)\) we set

$$\begin{aligned} x_n^+ = x + e^{i\theta ^+}(\tau ^+_n - \tau ^+) \in {\mathscr {L}}(z^+_n), \qquad y_n^- = y + e^{i\theta ^-}(\tau ^-_n - \tau ^-) \in {\mathscr {L}}(z^-_n). \end{aligned}$$

Then, there is an \(n_0 \in \mathbb N\) such that for all \(n \geqq n_0\) and all \(x \in {\mathscr {L}}(z^+)\), \(y \in {\mathscr {L}}(z^-)\) the following implications are verified:

$$\begin{aligned}&\mathrm{(i)} \ \ |x - y|< 1 \implies |x_n^+ - y^-_n| < 1 \ \ \ \ \text {and} \\&\mathrm{(ii)} \ \ |x - y|> 1 \implies |x_n^+ - y^-_n| > 1. \end{aligned}$$

In particular, \(|x_n^+ - y^-_n| =1 \) for some \(n \geqq n_0\) implies \(|x-y|=1\).

We again defer the proof and now proceed with the proof of Lemma 7.1.

Proof of Lemma 7.1

Let \(z^+,z^- \in {\mathcal {Z}}\), \(\nu \in {\mathbb {S}}^1\). Recalling the discussion at the beginning of the subsection, we note that it suffices to show inequality (7.4). Moreover, we can assume that \(z^\pm = (\theta ^\pm ,\tau ^\pm ,1)\) and that \(\theta ^+ - \theta ^- \in {{\mathcal {G}}_{{\mathbb {A}}}}\).

Let \(\lbrace X_T\rbrace _T\) be an optimal sequence for \(\Phi \) with corresponding centers \(\lbrace y_T\rbrace _T\) of the cubes, that is,

$$\begin{aligned} \liminf _{T\rightarrow +\infty } \frac{1}{T}E_1\big (X_T,Q^\nu _T(y_T)\big ) = \Phi (z^+,z^-,\nu ) <+\infty . \end{aligned}$$
(7.7)

By applying Lemma 6.2, we can suppose that \(X_T = X_T^+ \cup X_T^-\) with \(X_T^\pm \subset {\mathscr {L}}(z_T^\pm )\) and \(X_T = {\mathscr {L}}(z_T^\pm )\) on \(\partial _1^\pm Q_T^\nu (y_T)\), where \(z^\pm _T = (\theta _T^\pm ,\tau ^\pm _T,1) \rightarrow z^\pm \). By (6.4) and Lemma 7.3 we can also assume that \(\theta _T^\pm = \theta ^\pm \) for all T. We distinguish the two cases (a) \(\{(x,y) \in {\mathscr {L}}(z^+) \times {\mathscr {L}}(z^-) :\, |x - y| = 1 \} = \emptyset \) and (b) \(\{(x,y) \in {\mathscr {L}}(z^+) \times {\mathscr {L}}(z^-) :\, |x - y| = 1 \} \ne \emptyset \).

Case \(\mathrm {(a)}\): \(\{(x,y) \in {\mathscr {L}}(z^+) \times {\mathscr {L}}(z^-) :\, |x - y| = 1 \} = \emptyset \). By Lemma 7.4 we can assume that \(\{(x,y) \in {\mathscr {L}}(z^+_T) \times {\mathscr {L}}(z^-_T) :\, |x - y| = 1 \} = \emptyset \) for all T. Thus, we get \({\mathcal {N}}(x) \cap X_T^- = \emptyset \) for all \(x \in X_T^+\) and vice versa. Therefore, by (2.3) we obtain

$$\begin{aligned} \Phi (z^+,z^-,\nu )&= \liminf _{T\rightarrow +\infty } \frac{1}{T}E_1\big (X_T,Q^\nu _T(y_T)\big ) = \liminf _{T\rightarrow +\infty } \Big (\frac{1}{T}E_1\big (X^+_T,Q^\nu _T(y_T)\big )\\&\quad +\frac{1}{T}E_1\big (X^-_T,Q^\nu _T(y_T)\big ) \Big ). \end{aligned}$$

Note that \(X_T^\pm = {\mathscr {L}}(z^\pm _T)={\mathscr {L}}(\theta ^\pm ,\tau ^\pm _T,1)\) on \(\partial ^\pm _1 Q^\nu _T(y_T)\) and \(X_T^\pm =\emptyset \) on \(\partial ^\mp _1 Q^\nu _T(y_T)\). By Lemma 6.1(i), the energy on each of the sublattices \(X^+_T\) and \(X^-_T\) can be estimated separately, and we obtain

$$\begin{aligned} \Phi (z^+,z^-,\nu ) \geqq {\varphi }_{\mathrm{hex}}\big (e^{-i\theta ^+}\nu \big ) + {\varphi }_{\mathrm{hex}}\big (e^{-i\theta ^-}\nu \big ). \end{aligned}$$
(7.8)

Then, Lemma 6.1(ii) and (7.1) imply \(\Phi (z^+,z^-,\nu ) \geqq {\bar{\varphi }}(z^+,z^-,\nu )\) and \({\bar{\varphi }}(z^+,z^-,\nu ) = {\varphi }_{\mathrm{hex}}\big (e^{-i\theta ^+}\nu \big ) + {\varphi }_{\mathrm{hex}}\big (e^{-i\theta ^-}\nu \big )\). This concludes the proof of (7.4) in case (a). We also point out that the property stated below (7.2) holds. (In case \(\theta ^+ - \theta ^- \notin {{\mathcal {G}}_{{\mathbb {A}}}}\), (7.8) is immediate from (6.3).)
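For the reader's convenience, the conclusion of case (a) can be written as a single chain. The sketch reads Lemma 6.1(ii) as providing the upper bound \({\bar{\varphi }}(z^+,z^-,\nu ) \leqq {\varphi }_{\mathrm{hex}}(e^{-i\theta ^+}\nu ) + {\varphi }_{\mathrm{hex}}(e^{-i\theta ^-}\nu )\), which is an assumption about how that lemma enters here, and recalls \(\Phi \leqq {\bar{\varphi }}\) from the beginning of the subsection:

$$\begin{aligned} \Phi (z^+,z^-,\nu ) \leqq {\bar{\varphi }}(z^+,z^-,\nu ) \leqq {\varphi }_{\mathrm{hex}}\big (e^{-i\theta ^+}\nu \big ) + {\varphi }_{\mathrm{hex}}\big (e^{-i\theta ^-}\nu \big ) \leqq \Phi (z^+,z^-,\nu ), \end{aligned}$$

where the last inequality is (7.8). Hence all three quantities coincide.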

Case \(\mathrm {(b)}\): \(\{(x,y) \in {\mathscr {L}}(z^+) \times {\mathscr {L}}(z^-) :\, |x - y| = 1 \} \ne \emptyset \). Our goal is to construct a new competitor \({\tilde{X}}_T = {\tilde{X}}_T^+ \cup {\tilde{X}}_T^-\) such that \({\tilde{X}}_T^\pm \subset {\mathscr {L}}(z^\pm )\), \({\tilde{X}}^\pm _T = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1 Q_{T+22}^\nu (y_T)\), and

$$\begin{aligned} E_1\big ({\tilde{X}}_T,Q^\nu _{T+22}(y_T)\big ) \leqq E_1\big (X_T,Q^\nu _T(y_T)\big )+C. \end{aligned}$$
(7.9)

Once this is established, by (7.1) and (7.7) we clearly get

$$\begin{aligned} \Phi (z^+,z^-,\nu )&= \liminf _{T\rightarrow +\infty } \frac{1}{T} E_1\big (X_T,Q^\nu _T(y_T)\big ) \geqq \liminf _{T \rightarrow +\infty } \frac{1}{T+ 22 }E_1\big ({\tilde{X}}_T,Q^\nu _{T+ 22 }(y_T)\big ) \\&\geqq {\bar{\varphi }}(z^+,z^-,\nu ). \end{aligned}$$
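The first inequality in the preceding display can be spelled out as follows. Assuming that \(E_1 \geqq 0\) on finite-energy configurations (which is the case if (2.3) counts missing neighbors and no atom has more than six of them), estimate (7.9) yields

$$\begin{aligned} \frac{1}{T+22} E_1\big ({\tilde{X}}_T,Q^\nu _{T+22}(y_T)\big ) \leqq \frac{1}{T+22} \Big ( E_1\big (X_T,Q^\nu _T(y_T)\big )+C \Big ) \leqq \frac{1}{T} E_1\big (X_T,Q^\nu _T(y_T)\big ) + \frac{C}{T}, \end{aligned}$$

and \(C/T \rightarrow 0\) as \(T \rightarrow +\infty \). The second inequality in the display then follows since \({\tilde{X}}_T\) is an admissible competitor in (7.1) on the cell \(Q^\nu _{T+22}(y_T)\).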

To construct \({\tilde{X}}_T\), we first extend \(X_T\) to \({\hat{X}}_T\) by

$$\begin{aligned} {\hat{X}}_T = {\left\{ \begin{array}{ll} X_T &{}\text {on }\,\, Q^\nu _{T+10}(y_T){\setminus } A_T, \\ {\mathscr {L}}(z^\pm _T) &{}\text {on }\,\, \{x:\pm \langle \nu ,x-y_T\rangle \geqq 2\} \cap \big ( Q^\nu _{T+34}(y_T){\setminus } (Q^\nu _{T+10}(y_T) \cup A_T)\big ), \\ \emptyset &{}\text {on }\,\, A_T \cup \big ( {\mathbb {R}}^2 {\setminus } Q^\nu _{T+34}(y_T)\big ), \end{array}\right. } \end{aligned}$$
(7.10)

where \(A_T=Q^\nu _{10}(y_T+(T/2)\nu ^\perp )\cup Q^\nu _{10}(y_T-(T/2)\nu ^\perp ) \cup (\{x:|\langle \nu ,x-y_T\rangle |< 2\} {\setminus } Q^\nu _{T+10}(y_T))\). By definition, we get \(E_1({\hat{X}}_T)<+\infty \) since \(|x-y| \geqq 1\) for all \(x,y \in {\hat{X}}_T\), \(x\ne y\). Note that we can write \({\hat{X}}_T = {\hat{X}}_T^+ {{\dot{\cup }}} {\hat{X}}_T^-\), where \({\hat{X}}_T^\pm \subset {\mathscr {L}}(z^\pm _T)\) and \({\hat{X}}^\pm _T = {\mathscr {L}}(z^\pm _T)\) on \(\partial ^\pm _1 Q_{T+22}^\nu (y_T)\). We claim that

$$\begin{aligned} E_1\big ({\hat{X}}_T,Q^\nu _{T+32}(y_T)\big ) \leqq E_1\big (X_T,Q^\nu _T(y_T)\big ) +C. \end{aligned}$$
(7.11)

In fact, if there exists \(x\in {\hat{X}}_T \cap Q^\nu _{T}(y_T)\) such that \(\#({\mathcal {N}}(x) \cap {\hat{X}}_T) < \#({\mathcal {N}}(x) \cap X_T)\), then necessarily \(x\in (A_T)_1\cap Q^\nu _{T}(y_T)\). However, \({\mathcal {L}}^2((A_T\cap Q^\nu _{T}(y_T))_{2}) \leqq C\) and therefore, due to Lemma 3.1(v), we get

$$\begin{aligned} \#\big \{x \in {\hat{X}}_T \cap Q^\nu _{T}(y_T) :\, \#({\mathcal {N}}(x) \cap {\hat{X}}_T) < \#({\mathcal {N}}(x) \cap X_T)\big \} \leqq C. \end{aligned}$$
(7.12)

In a similar fashion, if \(x\in {\hat{X}}_T\cap (Q^\nu _{T+32}(y_T) {\setminus } Q^\nu _{T}(y_T))\) satisfies \( \#({\mathcal {N}}(x) \cap {\hat{X}}_T) < 6\), then necessarily \( x\in (A_T)_1 \cap Q^\nu _{T+32}(y_T)\). Thus, again by Lemma 3.1(v), the number of atoms in \(Q^\nu _{T+32}(y_T) {\setminus } Q^\nu _{T}(y_T)\) with fewer than six neighbors is bounded independently of T. This along with (7.12) and (2.3) yields (7.11).

Let us now define \({\tilde{X}}_T\). We recall the notation in (2.4) and define \({\tilde{X}}_T = {\tilde{X}}_T^+ \cup {\tilde{X}}_T^-\) by

$$\begin{aligned} {\tilde{X}}^+_T= \big ({\hat{X}}_T^+ + e^{i\theta ^+}(\tau ^+ -\tau ^+_T)\big ), \ \ \ \ \ \ {\tilde{X}}^-_T= \big ({\hat{X}}_T^- + e^{i\theta ^-} (\tau ^- -\tau ^-_T)\big ). \end{aligned}$$

For convenience, we denote the atoms of \({\hat{X}}_T\) by \(\lbrace x^j_T\rbrace _j\) and the corresponding atoms of \({\tilde{X}}_T\) by \(\lbrace {\tilde{x}}^j_T\rbrace _j\), that is, \({\tilde{x}}^j_T=x^j_T+e^{i\theta ^\pm }(\tau ^\pm -\tau ^\pm _T)\) if \(x^j_T\in {\hat{X}}_T^\pm \). By (7.10) and the choice of \({\hat{X}}_T\), it is obvious that \({\tilde{X}}_T^\pm \subset {\mathscr {L}}(z^\pm )\) and \({\tilde{X}}^\pm _T = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1 Q^\nu _{T+ 22}(y_T)\) for T large enough. Here, the extension \({\hat{X}}_T= {\mathscr {L}}(z_T^\pm )\) on \(\{x:\pm \langle \nu ,x-y_T\rangle \geqq 2\} \cap (Q^\nu _{T+34}(y_T){\setminus } (Q^\nu _{T+10}(y_T) \cup A_T))\) is crucial in order to ensure that these boundary conditions hold for \({\tilde{X}}_T\). (The value 2 is for definiteness only. Every value less than 5 works, provided T is sufficiently large.) To show (7.9), we prove

$$\begin{aligned} E_1\big ({\tilde{X}}_T,Q^\nu _{T+22}(y_T)\big ) \leqq E_1\big ({\hat{X}}_T,Q^\nu _{T+32}(y_T)\big ). \end{aligned}$$

Then, the result follows from (7.11). To this end, we need to check the following for large T:

$$\begin{aligned} \mathrm{(i)} \ \ |x^j_T-x^k_T| =1\implies |{\tilde{x}}^j_T - {\tilde{x}}^k_T|=1, \ \ \ \ \text {and} \ \ \ \ \mathrm{(ii)} \ \ |{\tilde{x}}^j_T-{\tilde{x}}^k_T|\geqq 1 \text { for all }j,k, j\ne k. \end{aligned}$$
(7.13)

In fact, due to (7.13)(ii), \({\tilde{X}}_T\) is a configuration with finite energy. Moreover, (7.13)(i) shows that \(x^k_T \in {\mathcal {N}}(x^j_T)\) implies \({\tilde{x}}^k_T \in {\mathcal {N}}({\tilde{x}}^j_T)\), and therefore the energy can only decrease, see (2.3).
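The monotonicity used here can be illustrated by a small Python sketch. It assumes that the energy in (2.3) is, up to a prefactor not reproduced here, the total number of missing neighbors \(\sum _x (6 - \#{\mathcal {N}}(x))\), where neighbors are atoms at distance exactly one; the function names and the two test configurations are hypothetical.

```python
import cmath
import math


def missing_neighbors(points, tol=1e-9):
    """Sum over all atoms of 6 minus the number of atoms at distance 1."""
    total = 0
    for i, x in enumerate(points):
        neighbors = sum(
            1
            for j, y in enumerate(points)
            if j != i and abs(abs(x - y) - 1.0) < tol
        )
        total += 6 - neighbors
    return total


def min_distance(points):
    """Smallest distance between two distinct atoms."""
    return min(
        abs(x - y)
        for i, x in enumerate(points)
        for j, y in enumerate(points)
        if i < j
    )


# A center atom with its six nearest neighbors in the triangular lattice.
hexagon = [0j] + [cmath.exp(1j * k * math.pi / 3) for k in range(6)]

# The same atoms, but one outer atom is pushed outwards and loses its bonds.
perturbed = list(hexagon)
perturbed[1] = 1.5 + 0j

print(missing_neighbors(perturbed), missing_neighbors(hexagon))
```

Mapping the atoms of `perturbed` back to those of `hexagon` keeps every existing pair at distance one and keeps all distances at least one, which mirrors (7.13)(i) and (ii); accordingly, the count of missing neighbors does not increase.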

Let us finally check (7.13). If both atoms are in \({\hat{X}}_T^-\) or \({\hat{X}}_T^+\), then it is clear by the definition of \({\tilde{X}}_T\) that \(x^j_T-x^k_T = {\tilde{x}}^j_T-{\tilde{x}}^k_T\), which gives (i) and (ii) due to (7.7) and (7.11). Otherwise, if \(x^j_T \in {\hat{X}}_T^-\) and \(x^k_T \in {\hat{X}}_T^+\) or vice versa, (i) follows from Lemma 7.4, whereas (ii) follows from Lemma 7.4(i), (7.7) and (7.11). \(\square \)

To conclude the proof of Lemma 7.1, it remains to give the proofs of Lemmas 7.3 and 7.4.

Proof of Lemma 7.3

Let \(z_T^\pm =(\theta _T^\pm ,\tau _T^\pm ,1) \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\) be given as in the statement.

Step 1: Rotation to boundary conditions with fixed rotation angles. Choose \({\tilde{y}}_T \in {\mathbb {R}}^2\) and \({\tilde{X}}_T \subset {\mathbb {R}}^2\) satisfying \({\tilde{X}}_T = {\mathscr {L}}(z^\pm _T)\) on \(\partial ^\pm _1 Q^\nu _T({\tilde{y}}_T)\) such that

$$\begin{aligned} E_1\big ({\tilde{X}}_T,Q^\nu _T({\tilde{y}}_T)\big )&\leqq \inf \left\{ E_1\big ({X}_T,Q^\nu _T({y}_T)\big ) :\, {y}_T \in {\mathbb {R}}^2,\, {X}_T \right. \nonumber \\&=\left. {\mathscr {L}}(z^\pm _T) \text { on } \partial ^\pm _1 Q^\nu _T(y_T) \right\} + 1/T. \end{aligned}$$
(7.14)

We define \({X}^{\mathrm{rot}}_T :=e^{i(\theta ^+-\theta _T^+)}{\tilde{X}}_T\), \(\nu _T:=e^{i(\theta ^+-\theta ^+_T)}\nu \), \(y_T^{\mathrm{rot}} := e^{i(\theta ^+-\theta ^+_T)}{\tilde{y}}_T\), and \({\hat{z}}^\pm _T := (\theta ^\pm ,\tau ^\pm _T,1)\). Then, by Lemma 3.1(i) and \(\theta _T^+ - \theta _T^- = \theta ^+-\theta ^-\) for all T, there holds \(X^{\mathrm{rot}}_T ={\mathscr {L}}({\hat{z}}^\pm _T)\) on \(\partial ^\pm _1 Q^{\nu _T}_T(y_T^{\mathrm{rot}})\) and

$$\begin{aligned}&E_1\big ({\tilde{X}}_T,Q^{\nu }_T({\tilde{y}}_T)\big ) = E_1\big (X^{\mathrm{rot}}_T, Q^{\nu _T}_T(y_T^{\mathrm{rot}})) \\&\quad \geqq \inf \big \{E_1\big ({X}_T,Q^{\nu _T}_T({y}_T)\big ) :\, {y}_T \in {\mathbb {R}}^2, \, {X}_T={\mathscr {L}}({\hat{z}}^\pm _T) \text { on } \partial ^\pm _1 Q^{\nu _T}_T(y_T)\big \} \end{aligned}$$

for all \(T>0\). Therefore, in view of (7.14), to show the statement it suffices to prove

$$\begin{aligned}&\liminf _{T \rightarrow +\infty } \frac{1}{T} \inf \big \{E_1\big (X_T,Q^{\nu _T}_T(y_T)\big ) :\, y_T \in {\mathbb {R}}^2, \, X_T = {\mathscr {L}}({\hat{z}}^\pm _T) \text { on } \partial ^\pm _1 Q^{\nu _T}_T(y_T) \big \}\nonumber \\&\quad \geqq \liminf _{T \rightarrow +\infty } \frac{1}{T} \inf \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ) :\, y_T \in {\mathbb {R}}^2, \, X_T = {\mathscr {L}}({\hat{z}}^\pm _T) \text { on } \partial ^\pm _1 Q^\nu _T(y_T) \big \}. \end{aligned}$$
(7.15)

Note that the difference of the two formulas lies only in the fact that \(\nu \) is replaced by \(\nu _T\), where \(\nu _T \rightarrow \nu \) as \(T \rightarrow +\infty \).

Step 2: Proof of (7.15). Fix \(\delta >0\) and let \(T >0\) be sufficiently large such that \(|\nu _T-\nu | <\delta \). We choose \({\tilde{y}}_T \in {\mathbb {R}}^2\) and \({\tilde{X}}_T \subset {\mathbb {R}}^2\) satisfying \({\tilde{X}}_T = {\mathscr {L}}({\hat{z}}^\pm _T)\) on \(\partial ^\pm _1 Q^{\nu _T}_T({\tilde{y}}_T)\) such that

$$\begin{aligned} E_1\big ({\tilde{X}}_T,Q^{\nu _T}_T({\tilde{y}}_T)\big )&\leqq \inf \big \{E_1\big (X_T,Q^{\nu _T}_T(y_T)\big ) :\, y_T \in {\mathbb {R}}^2, \nonumber \\&\qquad \quad \ \ X_T={\mathscr {L}}({\hat{z}}^\pm _T) \text { on } \partial ^\pm _1 Q^{\nu _T}_T(y_T) \big \} +\delta . \end{aligned}$$
(7.16)

Recall (2.4) and (2.5). We set \(T_\delta = (1+2\delta )T\) and define

$$\begin{aligned} A^\delta _T&=\left( \left[ {\tilde{y}}_T - \frac{T}{2}\nu _T^{\bot }; {\tilde{y}}_T - \frac{T_\delta }{2}\nu ^{\bot } \right] \cup \left[ {\tilde{y}}_T+\frac{T}{2}\nu ^{\bot }_T; {\tilde{y}}_T+ \frac{T_\delta }{2}\nu ^{\bot } \right] \right) _{\kappa T\delta }\\&\qquad \quad {\setminus } \Big (\partial ^+_1 Q^\nu _{T_\delta }({\tilde{y}}_T) \cup \partial ^-_1 Q^\nu _{T_\delta }({\tilde{y}}_T)\Big ), \end{aligned}$$

where \(\kappa > 1\) is chosen sufficiently large later. We define the configuration \({\hat{X}}_T \subset {\mathbb {R}}^2\) by

$$\begin{aligned} {\hat{X}}_T = {\left\{ \begin{array}{ll} {\tilde{X}}_T &{}\text {in } \,\,Q^{\nu _T}_T({\tilde{y}}_T),\\ \emptyset &{}\text {in } \,\,A^\delta _T {\setminus } Q^{\nu _T}_{T}({\tilde{y}}_T),\\ {\mathscr {L}}({\hat{z}}^\pm _T) &{}\text {in } \,\,\{x:\pm \langle \nu , (x-{\tilde{y}}_T)\rangle \geqq 5\} {\setminus } \big ( A^\delta _T\cup Q^{\nu _T}_T({\tilde{y}}_T)\big ). \end{array}\right. } \end{aligned}$$
(7.17)

Here, \(\kappa >1\) is chosen large enough (independently of T) such that \(|x-y| \geqq 1\) for all \(x,y \in {\hat{X}}_T\), \(x\ne y\). In principle, \(|x-y|<1\) may occur for points \(x \in {\tilde{X}}_T \cap Q^{\nu _T}_T({\tilde{y}}_T)\) and \(y \in {\mathbb {R}}^2 {\setminus } Q^{\nu _T}_T({\tilde{y}}_T)\) if \(x \in Q^{\nu _T}_T({\tilde{y}}_T) {\setminus } Q^{\nu _T}_{T-2}({\tilde{y}}_T)\), \(\pm \langle \nu _T, (x-{\tilde{y}}_T)\rangle \geqq -5\) and \(\pm \langle \nu , (y-{\tilde{y}}_T) \rangle \leqq - 5 \), but for \(\kappa \) big enough such pairs of points are contained in \(A^\delta _T\).

We note that \(\partial ^\pm _1 Q^\nu _{T_\delta }({\tilde{y}}_T) \cap Q^\nu _T({\tilde{y}}_T) = \emptyset \) for T large enough since \(\nu _T \rightarrow \nu \) as \(T\rightarrow +\infty \). Thus, by construction we get \({\hat{X}}_T = {\mathscr {L}}({\hat{z}}^\pm _T)\) on \(\partial ^\pm _1 Q^\nu _{T_\delta }({\tilde{y}}_T)\) for T sufficiently large. Therefore, we obtain

$$\begin{aligned}&\inf \big \{E_1\big (X_T,Q^{\nu }_{T_\delta }({y}_T)\big ) :\, y_T \in {\mathbb {R}}^2, \, X_T={\mathscr {L}}({\hat{z}}^\pm _T) \text { on } \partial ^\pm _1 Q^\nu _{T_\delta }(y_T)\big \}\nonumber \\&\quad \leqq E_1\big ({\hat{X}}_T,Q^\nu _{T_\delta }({\tilde{y}}_T)\big ). \end{aligned}$$
(7.18)

We claim that

$$\begin{aligned} E_1\big ({\hat{X}}_T,Q^\nu _{T_\delta }({\tilde{y}}_T)\big )\leqq E_1\big ({\tilde{X}}_T,Q^{\nu _T}_T({\tilde{y}}_T)\big ) + C \kappa \delta T \end{aligned}$$
(7.19)

for a universal \(C>0\). We defer the proof of this estimate to Step 3 below and conclude the proof of (7.15). Dividing (7.19) by \(T_\delta \) and letting \(T \rightarrow +\infty \), we derive

$$\begin{aligned} \liminf _{T \rightarrow +\infty }\frac{1}{T_\delta } E_1\big ({\hat{X}}_T,Q^\nu _{T_\delta }({\tilde{y}}_T)\big )&\leqq \liminf _{T \rightarrow +\infty }\frac{1}{T}E_1\big ({\tilde{X}}_T,Q^{\nu _T}_T({\tilde{y}}_T)\big ) + C \kappa \delta . \end{aligned}$$

This along with (7.16) and (7.18), and the fact that \(\delta >0\) was arbitrary shows (7.15). It thus remains to prove (7.19).
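The division step leading to the last display can be made explicit. Since \(T_\delta = (1+2\delta )T \geqq T\), and assuming \(E_1 \geqq 0\) on finite-energy configurations, estimate (7.19) gives

$$\begin{aligned} \frac{1}{T_\delta } E_1\big ({\hat{X}}_T,Q^\nu _{T_\delta }({\tilde{y}}_T)\big ) \leqq \frac{1}{T_\delta } E_1\big ({\tilde{X}}_T,Q^{\nu _T}_T({\tilde{y}}_T)\big ) + \frac{C \kappa \delta T}{(1+2\delta )T} \leqq \frac{1}{T} E_1\big ({\tilde{X}}_T,Q^{\nu _T}_T({\tilde{y}}_T)\big ) + C \kappa \delta . \end{aligned}$$

The additive error \(\delta \) in (7.16) contributes only \(\delta /T \rightarrow 0\) after division by T.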

Step 3: Proof of (7.19). We divide the proof into the two estimates

$$\begin{aligned}&E_1\big ({\hat{X}}_T,Q^{\nu _T}_{T}({\tilde{y}}_T)\big )\leqq E_1\big ({\tilde{X}}_T,Q^{\nu _T}_{T}({\tilde{y}}_T)\big ) + C\kappa \delta T,\hbox { and} \end{aligned}$$
(7.20)
$$\begin{aligned}&E_1\big ({\hat{X}}_T,Q^{\nu }_{T_\delta }({\tilde{y}}_T) {\setminus } Q^{\nu _T}_{T}({\tilde{y}}_T)\big ) \leqq C \kappa \delta T, \end{aligned}$$
(7.21)

for a universal \(C>0\). Clearly, (7.20)–(7.21) and Lemma 3.1(iv) imply (7.19). We first prove (7.20). Recall by (7.17) and the boundary values of \({\tilde{X}}_T\) that \({\hat{X}}_T = {\tilde{X}}_T\) in \( \overline{Q^{\nu _T}_{T+2}({\tilde{y}}_T)} {\setminus } (A^\delta _T {\setminus } Q^{\nu _T}_T({\tilde{y}}_T))\). Thus, \(x \in Q^{\nu _T}_T({\tilde{y}}_T)\) can have fewer neighbors in \({\hat{X}}_T\) than in \({\tilde{X}}_T\) only if \(x \in (A^\delta _T)_1\cap (Q^{\nu _T}_T({\tilde{y}}_T) {\setminus } Q^{\nu _T}_{T-2}({\tilde{y}}_T))\). As \(\mathrm {diam}(A^\delta _T) \leqq C\kappa \delta T\) and therefore \({\mathcal {L}}^2(( (A^\delta _T)_1 \cap (Q^{\nu _T}_T({\tilde{y}}_T) {\setminus } Q^{\nu _T}_{T-2}({\tilde{y}}_T)))_{1}) \leqq C\kappa \delta T\), this implies by Lemma 3.1(v) that at most \(C\kappa \delta T\) atoms \(x \in Q^{\nu _T}_T({\tilde{y}}_T)\) have fewer neighbors in \({\hat{X}}_T\) than in \({\tilde{X}}_T\). This shows (7.20) by (2.3). To see (7.21), again due to (7.17), all atoms \(x \in {\hat{X}}_T \cap (Q^{\nu }_{T_\delta }({\tilde{y}}_T) {\setminus } ( Q^{\nu _T}_{T}({\tilde{y}}_T) \cup (A^\delta _T)_1))\) have six neighbors. Hence, their energy contribution is zero. As \({\hat{X}}_T = \emptyset \) in \(A^\delta _T {\setminus } Q^{\nu _T}_{T}({\tilde{y}}_T) \) and \({\mathcal {L}}^2(( {( A^\delta _T)_1} {\setminus } A^\delta _T)_1) \leqq C \kappa \delta T\), this implies, as before, that

$$\begin{aligned} \#\left( {\hat{X}}_T \cap \big ( {( A^\delta _T)_1} \cap Q^{\nu }_{T_\delta }({\tilde{y}}_T) \big ) {\setminus } Q^{\nu _T}_{T}({\tilde{y}}_T)\right) \leqq C {\mathcal {L}}^2(({( A^\delta _T)_1} {\setminus } A^\delta _T)_1)\leqq C \kappa \delta T. \end{aligned}$$

Again, in view of (2.3), this implies (7.21), and concludes the proof. \(\square \)

Proof of Lemma 7.4

Suppose first that \(y \in {P}_{\theta ^+,\theta ^-}\) with \({P}_{\theta ^+,\theta ^-}\) defined in (7.6). Then (i) follows from \(x^+_n \rightarrow x\), \(y^-_n \rightarrow y\), and the observation that there are only finitely many pairs \((x,y) \in {\mathscr {L}}(z^+) \times ({P}_{\theta ^+,\theta ^-} \cap {\mathscr {L}}(z^-))\) with \(|x-y|<1\). The same argument applies to show that (ii) holds true for all pairs \((x,y) \in (({P}_{\theta ^+,\theta ^-})_3 \cap {\mathscr {L}}(z^+)) \times ({P}_{\theta ^+,\theta ^-} \cap {\mathscr {L}}(z^-))\) for large n. Choosing n so big that \(|\tau ^\pm _n-\tau ^\pm | < 1\) also gives (ii) for all \((x,y) \in {\mathscr {L}}(z^+) \times ({P}_{\theta ^+,\theta ^-} \cap {\mathscr {L}}(z^-))\).

Now, consider a general \(y\in {\mathscr {L}}(z^-)\). One finds \(v \in e^{i\theta ^+}{\mathscr {L}} \cap e^{i\theta ^-}{\mathscr {L}}\) such that \(y-v \in {P}_{\theta ^+,\theta ^-}\). The assertion then follows by applying the special case described above to \(x - v\) and \(y -v\), and by observing that \((x-v)^+_n = x^+_n-v\) and \((y-v)^-_n = y^-_n-v\). Finally, the implication \(|x_n^+ - y^-_n| =1 \Rightarrow |x-y|=1\) follows from (i) and (ii) by contraposition. \(\square \)
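One way to make the uniformity in this proof quantitative is a gap argument. The constant \(\delta _0\) below is notation introduced here for illustration and does not appear in the text, and we use the representation \({\mathscr {L}}(z^\pm ) = e^{i\theta ^\pm }({\mathscr {L}} + \tau ^\pm )\). Both sets \({\mathscr {L}}(z^+)\) and \({\mathscr {L}}(z^-)\) are invariant under translations by vectors of the coincidence site lattice (7.5), and only finitely many pairs \((x,y)\) with \(y \in {P}_{\theta ^+,\theta ^-}\) satisfy \(|x-y| \leqq 2\). Therefore,

$$\begin{aligned} \delta _0 := \inf \big \{ \big | |x-y| - 1 \big | :\, x \in {\mathscr {L}}(z^+), \ y \in {\mathscr {L}}(z^-), \ |x-y| \ne 1 \big \} > 0. \end{aligned}$$

By the triangle inequality, we have

$$\begin{aligned} \big | |x_n^+ - y_n^-| - |x-y| \big | \leqq \big | e^{i\theta ^+}(\tau ^+_n - \tau ^+) - e^{i\theta ^-}(\tau ^-_n - \tau ^-) \big | \leqq |\tau ^+_n - \tau ^+| + |\tau ^-_n - \tau ^-|, \end{aligned}$$

so (i) and (ii) hold for all pairs \((x,y)\) simultaneously as soon as the right-hand side is smaller than \(\delta _0\).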

7.2 Well Definedness and Properties of the Energy Density \(\varphi \)

This final subsection is devoted to the proofs of Proposition 7.2 and Theorem 2.5. Our proofs in this subsection follow standard strategies. Due to the discrete character of our model, however, careful constructions are needed. As a preliminary step, we show that in (7.1) the sequence \(T\rightarrow +\infty \) can be chosen independently of the centers of the cells.

Proposition 7.5

For each \(z^+,z^- \in {\mathcal {Z}}\) and \(\nu \in {\mathbb {S}}^1\) there exists a sequence \(\{T_j\}_j\) such that \(T_j \rightarrow +\infty \) as \(j \rightarrow +\infty \) and for all \(\{y_j\}_j \subset {\mathbb {R}}^2\) it holds that

$$\begin{aligned} \frac{1}{T_j} \min \big \{E_{1}\big (X,Q^\nu _{T_j}(y_j)\big ) :\, X = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _{1}Q^\nu _{T_j}(y_j) \big \}\leqq {\bar{\varphi }}(z^+,z^-,\nu ) + \eta _j, \end{aligned}$$

where \(\lbrace \eta _j\rbrace _j \subset (0,+\infty )\) is a null sequence which depends on \(z^\pm \) and \(\nu \), but is independent of \(\{y_j\}_j\).

Proof

First, if \(z^+ ={\mathbf {0}}\) or \(z^-={\mathbf {0}}\), the statement follows from Lemma 6.1(i) and the definition of \({\bar{\varphi }}\) in (7.1) for any sequence \(\lbrace T_j\rbrace _j\). Now consider \(z^\pm = (\theta ^\pm , \tau ^\pm ,1)\). If \(\theta ^+-\theta ^- \notin {{\mathcal {G}}_{{\mathbb {A}}}}\), the statement follows from Lemma 6.1(ii), (6.1), and Lemma 6.2 for any sequence \(\lbrace T_j\rbrace _j\). Therefore, it remains to treat the case \(\theta ^+-\theta ^- \in {{\mathcal {G}}_{{\mathbb {A}}}}\).

Consider a sequence \(S_j \rightarrow +\infty \), \(\{x_j\}_j\subset {\mathbb {R}}^2\), and configurations \(\lbrace X_j\rbrace _j \subset {\mathbb {R}}^2\) satisfying \(X_j = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _{1}Q^\nu _{S_j}(x_j)\) such that

$$\begin{aligned} {\bar{\varphi }}(z^+,z^-,\nu ) = \lim _{j\rightarrow +\infty } \frac{1}{S_j}E_{1}\big (X_j,Q^\nu _{S_j}(x_j)\big ). \end{aligned}$$
(7.22)

By Lemma 5.1 it is not restrictive to assume that \(X_j \subset {\mathscr {L}}(z^\pm )\) for all \(j \in {\mathbb {N}}\). Our goal is to find a sequence \(l_j \rightarrow 1\) such that for all \(\lbrace y_j\rbrace _j\) there are configurations \(\lbrace {\tilde{X}}_j \rbrace _j\subset \mathbb R^2\) satisfying \({\tilde{X}}_j ={\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _{1} Q^\nu _{l_jS_j}(y_j)\) such that

$$\begin{aligned} E_{1} \big ({\tilde{X}}_j,Q^\nu _{l_jS_j}(y_j)\big ) \leqq E_{1}\big (X_j,Q^\nu _{S_j}(x_{j})\big ) + C \end{aligned}$$
(7.23)

for a constant \(C>0\) only depending on \(z^\pm \) and \(\nu \). Once this is achieved, we obtain the statement as follows: we introduce the sequence \(T_j := l_jS_j\), divide (7.23) by \(T_j\), and use (7.22) to get

$$\begin{aligned}&\frac{1}{T_j} \min \big \{E_{1}\big (X,Q^\nu _{T_j}(y_j)\big ) :\, X = {\mathscr {L}}(z^\pm ) \text { on } \partial ^\pm _{1}Q^\nu _{T_j}(y_j) \big \}\\&\quad \leqq \frac{1}{T_j} E_{1} \big ({\tilde{X}}_j,Q^\nu _{l_jS_j}(y_j)\big )\\&\quad \leqq \frac{1}{l_jS_j} E_{1}\big (X_j,Q^\nu _{S_j}(x_j)\big ) + \frac{C}{T_j} \\&\quad \leqq {\bar{\varphi }}(z^+,z^-,\nu ) + \eta _j, \end{aligned}$$

where \(\lbrace \eta _j\rbrace _j\) is a null sequence only depending on \(z^+,z^-,\nu \), and \(\lbrace T_j\rbrace _j\), but independent of the centers \(\lbrace y_j\rbrace _j\).
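A possible explicit choice of the null sequence, which is not stated in the text, is

$$\begin{aligned} \eta _j := \Big | \frac{1}{l_jS_j} E_{1}\big (X_j,Q^\nu _{S_j}(x_j)\big ) - {\bar{\varphi }}(z^+,z^-,\nu ) \Big | + \frac{C}{T_j}. \end{aligned}$$

It is positive since \(C>0\), it tends to zero by (7.22), \(l_j \rightarrow 1\), and \(T_j \rightarrow +\infty \), and it involves only \(X_j\), \(S_j\), \(l_j\), and C, but not the centers \(\lbrace y_j\rbrace _j\).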

Consider any sequence of centers \(\lbrace y_j\rbrace _j\). We now construct \({\tilde{X}}_j\) and confirm (7.23). We choose \({\bar{y}}_j \in ({\mathscr {L}}(z^+) \cap {\mathscr {L}}(z^-))+x_j\) such that \(|y_j-{\bar{y}}_j| \leqq \kappa \), where \(\kappa := |a| + |b| + 5\) only depends on the spanning vectors \(a, b\) in (7.5), but is independent of j. Let \(l_j := 1 + 4\kappa /S_j\). We set

$$\begin{aligned} A_j = \left( \left[ {\overline{y}}_j-\frac{S_j}{2}\nu ^\perp ;y_j-\frac{l_jS_j}{2}\nu ^\perp \right] \right) _{4\kappa } \cup \left( \left[ {\overline{y}}_j+\frac{S_j}{2}\nu ^\perp ;y_j+\frac{l_jS_j}{2}\nu ^\perp \right] \right) _{4\kappa }. \end{aligned}$$

Note that \(\partial ^\pm _{1}Q^\nu _{l_jS_j}(y_j) \cap Q^\nu _{S_j}({\bar{y}}_j) = \emptyset \) since \(S_jl_j - S_j = 4\kappa \), \(|y_j-{\bar{y}}_j| \leqq \kappa \), and \(\kappa \geqq 5\). We define \({\tilde{X}}_j \subset {\mathbb {R}}^2\) by

$$\begin{aligned} {\tilde{X}}_j = {\left\{ \begin{array}{ll} X_j + {\bar{y}}_j-x_j &{}\text {in } \,\,Q^\nu _{S_j} ({\bar{y}}_j) {\setminus } A_j,\\ \emptyset &{}\text {in } \,\,A_j {\setminus } \big ( \partial ^+_{1}Q^\nu _{l_jS_j}(y_j) \cup \partial ^-_{1}Q^\nu _{l_jS_j}(y_j)\big ),\\ {\mathscr {L}}(z^\pm ) &{}\text {in } \,\,\big ( \{\pm \langle \nu , x- y_j \rangle \geqq 5\} {\setminus } \big (A_j \cup Q^\nu _{S_j}( {\bar{y}}_j)\big )\big ) \cup \partial ^\pm _{1} Q^\nu _{l_jS_j}(y_j). \end{array}\right. } \end{aligned}$$
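The disjointness \(\partial ^\pm _{1}Q^\nu _{l_jS_j}(y_j) \cap Q^\nu _{S_j}({\bar{y}}_j) = \emptyset \) noted before this definition can be checked by elementary arithmetic. The sketch assumes that \(Q^\nu _T(y)\) is the open square of side length T centered at y with sides parallel to \(\nu \) and \(\nu ^\perp \), and it writes \(|\cdot |_\infty \) for the maximum norm in the coordinates \((\nu ,\nu ^\perp )\), a notation introduced here. For \(x \in Q^\nu _{S_j}({\bar{y}}_j)\) we get

$$\begin{aligned} |x - y_j|_\infty \leqq |x - {\bar{y}}_j|_\infty + |{\bar{y}}_j - y_j| < \frac{S_j}{2} + \kappa = \frac{l_jS_j}{2} - \kappa , \end{aligned}$$

since \(l_jS_j/2 = S_j/2 + 2\kappa \). Hence, x has distance larger than \(\kappa \geqq 5\) from \(\partial Q^\nu _{l_jS_j}(y_j)\) and cannot lie in the boundary layers \(\partial ^\pm _{1}Q^\nu _{l_jS_j}(y_j)\), provided these are contained in the \(\kappa \)-neighborhood of \(\partial Q^\nu _{l_jS_j}(y_j)\), as we assume here.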

By definition, \({\tilde{X}}_j \) attains the correct boundary conditions, and therefore it remains to check (7.23). First, as \(x_j - {\bar{y}}_j \in {\mathscr {L}}(z^+) \cap {\mathscr {L}}(z^-)\) and \(X_j = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _{1}Q^\nu _{S_j}(x_j)\), we observe that \({\tilde{X}}_j = {\mathscr {L}}(z^\pm )\) on \((\partial ^\pm _{1} Q^\nu _{S_j}({\bar{y}}_j) \cap Q^\nu _{S_j}({\bar{y}}_j)) {\setminus } A_j\). This along with the definition of \(A_j\) implies \(|x-y|\geqq 1\) for all \( x, y \in {\tilde{X}}_j\), \(x\ne y\), and thus \(E_{1} \big ({\tilde{X}}_j,Q^\nu _{l_jS_j}(y_j)\big ) < +\infty \). Moreover, by Lemma 3.1(i) we obtain

$$\begin{aligned} E_{1}\big ({\tilde{X}}_j,Q^\nu _{S_j}({\bar{y}}_j)\big ) \leqq E_{1}\big (X_j,Q^\nu _{S_j}(x_j)\big ) + C. \end{aligned}$$
(7.24)

Here, the extra term \(C>0\) is due to the fact that we take into account the interactions of points \(x \in {\tilde{X}}_j \cap Q^\nu _{S_j}({\bar{y}}_j) \cap (A_j)_1\). Since \({\mathcal {L}}^2((A_j)_2) \leqq C_\kappa \) for \(C_\kappa \) depending only on \(\kappa \) and \(E_1({\tilde{X}}_j) <+\infty \), by Lemma 3.1(v), the cardinality of these points can be controlled by \(C_\kappa \). Then, by (2.3) we indeed get (7.24). Additionally, it holds that

$$\begin{aligned} E_1\Big ({\tilde{X}}_j,Q^\nu _{l_jS_j}(y_j) {\setminus } Q^\nu _{S_j}({\bar{y}}_j)\Big ) \leqq C, \end{aligned}$$
(7.25)

where C again only depends on \(\kappa \). In fact, all points \(x \in {\tilde{X}}_j \cap (Q^\nu _{l_jS_j}(y_j) {\setminus } Q^\nu _{S_j}({\bar{y}}_j))\) with \(\mathrm {dist}(x,A_j) > 1\) satisfy \(\#{\mathcal {N}}(x) =6\) and therefore they do not contribute to the energy. Again due to Lemma 3.1(v), the cardinality of \(x \in {\tilde{X}}_j\) with \(\mathrm {dist}(x,A_j) \leqq 1\) can be estimated by \(C_\kappa \). This gives (7.25). Now, (7.24)–(7.25) along with Lemma 3.1(iv) imply (7.23). This concludes the proof. \(\square \)

Proof of Proposition 7.2

We first show that, once (7.3) has been established, the result in Proposition 2.2 follows. Indeed, given \(x_0 \in \mathbb R^2\) and \(\rho >0\), estimate (2.16) readily follows from (7.3) for the sequence of centers \(y_T = (T/\rho )x_0\) and a scaling argument, see Proposition 3.1(ii) for \(\varepsilon =\rho /T\), \(\lambda =T/\rho \), and \(A = Q^\nu _\rho (x_0)\).

It remains to prove (7.3). Let \(z^\pm \in {\mathcal {Z}}\), \(\nu \in {\mathbb {S}}^1\), and a sequence \(\lbrace y_T\rbrace _T \subset \mathbb R^2\) be given. In view of the definition of \({\bar{\varphi }}\), see (7.1), it suffices to show that

$$\begin{aligned} \limsup _{T \rightarrow +\infty }\frac{1}{T}\min \big \{E_1\big (X_T,Q^\nu _T(y_T)\big ):\, X_T = {\mathscr {L}}(z^\pm ) \text { on } \partial _1^\pm Q^\nu _T(y_T) \big \}\leqq {\bar{\varphi }}(z^+,z^-,\nu ). \end{aligned}$$
(7.26)

Step 1: Comparison via construction. Consider \(1 \ll S \ll T\). Without restriction, we can assume that \(S \in \lbrace T_j\rbrace _j\), where \(\lbrace T_j\rbrace _j\) is the sequence identified in Proposition 7.5. For simplicity, if \(S = T_j\), we will write \(\eta _S\) instead of \(\eta _{T_j}\) for the null sequence given by Proposition 7.5. Define \(N_{S,T}:= \lfloor T/S\rfloor \). For \(j \in \lbrace 1,\ldots , N_{S,T} \rbrace \) we set \(x_j = y_T + (-T/2- S/2 + j S)\nu ^\perp \). We choose \(X_j \subset {\mathbb {R}}^2\) such that \(X_j = {\mathscr {L}}(z^\pm )\) on \(\partial _1^\pm Q^\nu _S(x_j)\) and

$$\begin{aligned} E_1\big (X_j,Q^\nu _S(x_j)\big )&=\min \left\{ E_1\big (X,Q^\nu _S(x_j)\big ) :\, X = {\mathscr {L}}(z^\pm ) \text { on } \partial _1^\pm Q^\nu _S(x_j) \right\} \nonumber \\&\leqq S\big ({\bar{\varphi }}(z^+,z^-,\nu ) + \eta _S\big ), \end{aligned}$$
(7.27)

where the inequality follows from Proposition 7.5. For \(j=1,\ldots , N_{S,T}\), we introduce the set \(A_j = Q^\nu _{10}(x_j+(S/2)\nu ^\perp ) \cup Q^\nu _{10}(x_j-(S/2)\nu ^\perp )\) and let \(X_T\) be defined by

$$\begin{aligned} X_T = {\left\{ \begin{array}{ll} X_j &{}\text {in } \,\, Q^\nu _S(x_j) {\setminus } A_j , \ j\in \lbrace 1,\ldots ,N_{S,T}\rbrace , \\ \emptyset &{}\displaystyle \text {in } \,\,\{x:\, |\langle \nu , x - y_T \rangle | < 5\} {\setminus } Q^*,\\ {\mathscr {L}}(z^\pm ) &{}\displaystyle \text {in }\,\,\{x:\,\pm \langle \nu , x - y_T \rangle \geqq 5\}{\setminus } Q^*, \end{array}\right. } \end{aligned}$$

where for brevity we have set \(Q^* := \bigcup \nolimits _{j=1}^{N_{S,T}} (Q^\nu _S(x_j){\setminus } A_j)\). Note that \(X_T = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1 Q^\nu _T(y_T)\). For an illustration of the construction, we refer to Fig. 13. We will show that

$$\begin{aligned} E_1\big (X_T,Q^\nu _T(y_T)\big )\leqq \lfloor T/S\rfloor \, S\big ({\bar{\varphi }}(z^+,z^-,\nu ) + \eta _S\big ) + CT/S + CS \end{aligned}$$
(7.28)

for a universal constant \(C>0\). Once this is achieved, we divide by T, take first the \(\limsup \) as \(T\rightarrow +\infty \), and then the limit as \(S\rightarrow +\infty \) (with S chosen from the sequence \(\lbrace T_j\rbrace _j\) given by Proposition 7.5). As \(\eta _S\rightarrow 0\), this yields (7.26) and thus the statement of the proposition.
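As a sketch of this limiting procedure, dividing (7.28) by \(T\) gives

$$\begin{aligned} \frac{1}{T} E_1\big (X_T,Q^\nu _T(y_T)\big ) \leqq \frac{\lfloor T/S\rfloor \, S}{T}\big ({\bar{\varphi }}(z^+,z^-,\nu ) + \eta _S\big ) + \frac{C}{S} + \frac{CS}{T}. \end{aligned}$$

For fixed \(S\), we have \(\lfloor T/S\rfloor \, S/T \rightarrow 1\) and \(CS/T \rightarrow 0\) as \(T\rightarrow +\infty \). Since \(X_T\) is admissible for the minimum problem in (7.26), the \(\limsup \) in (7.26) is therefore bounded by \({\bar{\varphi }}(z^+,z^-,\nu ) + \eta _S + C/S\). Both error terms vanish as \(S \rightarrow +\infty \) along \(\lbrace T_j\rbrace _j\).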

Fig. 13

Illustration of the construction for the existence of the limit on the left as well as the convexity in the third variable on the right. On the white region \(X_T= {\mathscr {L}}(z^-)\), on the light gray region \(X_T= {\mathscr {L}}(z^+)\), and on the dark gray region \(X_T=\emptyset \). The dark gray cubes, that are cut out in order to ensure that \(X_T\) has finite energy, are illustrated on the left, but they are also present in the construction on the right. In the gray cubes, we set \(X_T\) equal to the minimizer with boundary conditions \({\mathscr {L}}(z^\pm )\). For illustration purposes, we suppose that \(w = 0\) in (7.34).

Step 2: Proof of (7.28). It remains to prove (7.28). First, by construction, the definition of \(A_j\), and the boundary values of the configurations \(X_j\), we get \(|x-y|\geqq 1\) for all \( x, y \in X_T\), \(x\ne y\), and therefore \(E(X_T) <+\infty \). By Lemma 3.1(iv) and (7.27) it holds that

$$\begin{aligned}&E_1\big (X_T,Q^\nu _T(y_T)\big ) = \sum \limits _{j=1}^{N_{S,T}} E\big (X_T,Q^\nu _S(x_j)\big ) + E\Big (X_T, Q^\nu _T(y_T) {\setminus } \bigcup \limits _{j=1}^{N_{S,T}} Q^\nu _S(x_j)\Big ) \nonumber \\&\quad \leqq \lfloor T/S\rfloor \, \Big ( S\big ({\bar{\varphi }}(z^+,z^-,\nu ) + \eta _S\big ) + C\Big ) + E\Big (X_T, Q^\nu _T(y_T) {\setminus } \bigcup \limits _{j=1}^{N_{S,T}} Q^\nu _S(x_j)\Big ). \end{aligned}$$
(7.29)

Here, the addend C in the brackets is due to the fact that there may be \(x \in X_T \cap Q^\nu _S(x_j)\) with more neighbors in \(X_j\) than in \(X_T\). This, however, can only occur for atoms \(x \in Q^\nu _S(x_j)\) such that \(x\in (\partial Q^\nu _S(x_j))_6 \cap (\{y:\,\langle y-x_j, \nu \rangle =0\})_6\). Since \(E(X_T) <+\infty \), we can apply Lemma 3.1(v) and get that their cardinality is controlled by some universal constant C.

It remains to estimate the energy outside the union of the smaller cubes. We claim that

$$\begin{aligned} E\Big (X_T, Q^\nu _T (y_T) {\setminus } \bigcup \limits _{j=1}^{N_{S,T}} Q^\nu _S(x_j)\Big )\leqq C S. \end{aligned}$$
(7.30)

To see this, note that an atom \(x \in X_T \cap (Q^\nu _T(y_T) {\setminus } \bigcup _{j=1}^{N_{S,T}} Q^\nu _S(x_j)) \) can contribute to the energy only if \(|\langle x - y_T, \nu \rangle | \leqq 6\). Since \(E(X_T) <+\infty \), applying Lemma 3.1(v), we obtain

$$\begin{aligned} \#\Big \{x \in X_T \cap \Big ( Q^\nu _T(y_T) {\setminus } \bigcup \limits _{j=1}^{N_{S,T}} Q^\nu _S(x_j)\Big ):\, |\langle x - y_T, \nu \rangle | \leqq 6 \Big \}&\leqq C\left( T - S \left\lfloor T/S \right\rfloor \right) \\&\leqq CS, \end{aligned}$$

where \(T - S \lfloor T/S \rfloor \) controls the length of the rightmost dark gray region in the left part of Fig. 13. In view of (2.3), this implies (7.30). Combining (7.29) and (7.30) we obtain (7.28), which concludes the proof. \(\square \)
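The last inequality in the cardinality estimate rests only on an elementary property of the floor function. As a short check:

$$\begin{aligned} 0 \leqq \frac{T}{S} - \left\lfloor \frac{T}{S} \right\rfloor < 1 \quad \Longrightarrow \quad 0 \leqq T - S \left\lfloor \frac{T}{S} \right\rfloor = S\Big ( \frac{T}{S} - \left\lfloor \frac{T}{S} \right\rfloor \Big ) < S. \end{aligned}$$

In particular, the uncovered strip has length less than \(S\) in direction \(\nu ^\perp \), independently of \(T\).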

As a final preparation for the proof of Theorem 2.5, we characterize the translations of lattices with touching points. To this end, we introduce the following notation: for given \(\theta = \theta ^+ -\theta ^- \in {{\mathcal {G}}_{{\mathbb {A}}}}\), we say \(e^{i\theta ^+} \tau ^+ - e^{i\theta ^-} \tau ^-\) is a good translation and write \(e^{i\theta ^+} \tau ^+ - e^{i\theta ^-} \tau ^- \in {\mathcal {G}}_{{\mathbb {T}}}(\theta )\), whenever \((\tau ^+, \tau ^-) \in {\mathbb {T}}^2\) are such that there exist \(x \in {\mathscr {L}}(\theta ^+,\tau ^+,1)\) and \(y \in {\mathscr {L}}(\theta ^-,\tau ^-,1)\) with \(|x - y| = 1\). (By rotational invariance this does indeed only depend on the difference \(\theta = \theta ^+ -\theta ^-\).)

Lemma 7.6

(Properties of translations) Suppose that \(\theta = \theta ^+ -\theta ^-\in {{\mathcal {G}}_{{\mathbb {A}}}}\). Then \({\mathcal {G}}_{{\mathbb {T}}}(\theta )\) is contained in a finite union (of arcs) of spheres of radius 1, namely

$$\begin{aligned} {\mathcal {G}}_{{\mathbb {T}}}(\theta ) \subset \bigcup \limits _{x', y'} \partial B_1(y' - x'), \end{aligned}$$

where the union is taken over all \(x' \in e^{i\theta ^+} {\mathscr {L}} \cap ({P}_{\theta ^+,\theta ^-})_5\) and \(y' \in e^{i\theta ^-} {\mathscr {L}} \cap {P}_{\theta ^+,\theta ^-}\), where \({P}_{\theta ^+,\theta ^-}\) is the fundamental parallelogram defined in (7.6). (Recall also notation (2.4).)

Proof

Consider \(x \in {\mathscr {L}}(\theta ^+,\tau ^+,1)\) and \(y \in {\mathscr {L}}(\theta ^-,\tau ^-,1)\) with \(|x - y| = 1\). We find a shifting vector \(v \in e^{i\theta ^+} {\mathscr {L}} \cap e^{i\theta ^-} {\mathscr {L}}\) such that \(y' := y - v - e^{i\theta ^-} \tau ^- \in e^{i\theta ^-} {\mathscr {L}} \cap {P}_{\theta ^+,\theta ^-}\). By defining \(x' := x - v - e^{i\theta ^+} \tau ^+ \in e^{i\theta ^+} {\mathscr {L}}\) we clearly get

$$\begin{aligned} 1 = |y-x| = \big | \big (y' -x'\big ) - \big (e^{i\theta ^+} \tau ^+ - e^{i\theta ^-} \tau ^-\big )\big |. \end{aligned}$$

The latter identity along with \(|\tau ^\pm | \leqq \sqrt{3} < 2\) (see (2.8)) yields \(x' \in e^{i\theta ^+} {\mathscr {L}} \cap ({P}_{\theta ^+,\theta ^-})_5\) as well as \(e^{i\theta ^+} \tau ^+ - e^{i\theta ^-} \tau ^- \in \partial B_1(y' - x')\). \(\square \)
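The two conclusions of the proof can be spelled out as follows. This is only a sketch, and it assumes that \(({P}_{\theta ^+,\theta ^-})_5\) in notation (2.4) denotes the neighborhood of \({P}_{\theta ^+,\theta ^-}\) of width 5. Inserting \(y = y' + v + e^{i\theta ^-} \tau ^-\) and \(x = x' + v + e^{i\theta ^+} \tau ^+\), the shift \(v\) cancels and we find

$$\begin{aligned} y - x&= \big (y' - x'\big ) - \big (e^{i\theta ^+} \tau ^+ - e^{i\theta ^-} \tau ^-\big ), \\ |x' - y'|&\leqq |x-y| + |\tau ^+| + |\tau ^-| \leqq 1 + 2\sqrt{3} < 5. \end{aligned}$$

The first line, together with \(|y-x| = 1\), places \(e^{i\theta ^+} \tau ^+ - e^{i\theta ^-} \tau ^-\) on the circle \(\partial B_1(y' - x')\). The second line, together with \(y' \in {P}_{\theta ^+,\theta ^-}\), gives \(\mathrm {dist}(x', {P}_{\theta ^+,\theta ^-}) < 5\).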

We close this subsection with the proof of Theorem 2.5.

Proof of Theorem 2.5

Proof of (i),(ii). The proof of (i) follows from the definition of \(\varphi \) and Lemma 6.1(i). For (ii), we use Lemma 6.1(ii) to obtain the inequality

$$\begin{aligned} \frac{1}{2} \varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \frac{1}{2}\varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ) \leqq \varphi (z^+,z^-,\nu ) \leqq \varphi _{\mathrm {hex}}\big (e^{-i\theta ^+} \nu \big )+ \varphi _{\mathrm {hex}}\big (e^{-i\theta ^-} \nu \big ) \end{aligned}$$
(7.31)

for all \((z^+,z^-) \in ({\mathcal {Z}}{\setminus } \lbrace {\mathbf {0}}\rbrace )^2\), \(z^+ \ne z^-\). By Lemma 6.2, \(\varphi = \Phi \) (see Lemma 7.1 and Proposition 7.2), Lemma 7.1, and the definition of \({\mathcal {G}}_{{\mathbb {T}}}(\theta ^+-\theta ^-)\), the inequality in (7.31) can be strict only if \(\theta ^+-\theta ^- \in {{\mathcal {G}}_{{\mathbb {A}}}}\) and \(e^{i\theta ^+}\tau ^+-e^{i\theta ^-}\tau ^- \in {{\mathcal {G}}_{{\mathbb {T}}}}(\theta ^+-\theta ^-)\). Clearly, \({{\mathcal {G}}_{{\mathbb {A}}}} \subset {\mathbb {A}}\) is countable, see (6.2), and \({{\mathcal {G}}_{{\mathbb {T}}}}(\theta ^+-\theta ^-) \subset {\mathbb {R}}^2\) is contained in a finite union of spheres by Lemma 7.6.

Proof of (iii). Let \(\nu _1,\nu _2 \in {\mathbb {S}}^1\), \(\lambda \in (0,1)\). Our goal is to prove

$$\begin{aligned} \varphi (z^+,z^-,\lambda \nu _1 +(1-\lambda )\nu _2) \leqq \lambda \varphi (z^+,z^-,\nu _1)+(1-\lambda )\varphi (z^+,z^-,\nu _2). \end{aligned}$$
(7.32)

Assume that \(\lambda \nu _1 +(1-\lambda )\nu _2\ne 0\) (otherwise the statement is trivial) and define \(\nu = \frac{\lambda \nu _1 +(1-\lambda )\nu _2}{|\lambda \nu _1 +(1-\lambda )\nu _2|} \in {\mathbb {S}}^1\). By the positive 1-homogeneity of \(\varphi \), (7.32) is equivalent to

$$\begin{aligned} \varphi (z^+,z^-,\nu ) \leqq \lambda _1\varphi (z^+,z^-,\nu _1)+\lambda _2\varphi (z^+,z^-,\nu _2), \end{aligned}$$
(7.33)

where \(\lambda _1 = \frac{\lambda }{|\lambda \nu _1 +(1-\lambda )\nu _2|}, \lambda _2 = \frac{1-\lambda }{|\lambda \nu _1 +(1-\lambda )\nu _2|} >0\). In the following, we will prove (7.33).
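The equivalence of (7.32) and (7.33) can be checked directly. In the following sketch, the symbol \(\mu := |\lambda \nu _1 +(1-\lambda )\nu _2| > 0\) is introduced only for illustration. Positive 1-homogeneity in the third variable gives

$$\begin{aligned} \varphi \big (z^+,z^-,\lambda \nu _1 +(1-\lambda )\nu _2\big ) = \varphi (z^+,z^-,\mu \nu ) = \mu \, \varphi (z^+,z^-,\nu ), \end{aligned}$$

so dividing (7.32) by \(\mu \) yields

$$\begin{aligned} \varphi (z^+,z^-,\nu ) \leqq \frac{\lambda }{\mu }\,\varphi (z^+,z^-,\nu _1) + \frac{1-\lambda }{\mu }\,\varphi (z^+,z^-,\nu _2) = \lambda _1\varphi (z^+,z^-,\nu _1)+\lambda _2\varphi (z^+,z^-,\nu _2). \end{aligned}$$

Note also that \(\nu = \lambda _1 \nu _1 + \lambda _2 \nu _2\) by the definition of \(\lambda _1\) and \(\lambda _2\).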

Step 1: Convexity via construction. We construct competitors for the problem \(\varphi (z^+,z^-,\nu )\), and refer to Fig. 13 for an illustration. Fix \(n \in {\mathbb {N}}\) such that \(\lambda _1,\lambda _2 \leqq n/2\). Let \(1 \ll S \ll T\). As before, we assume that \(S \in \lbrace T_j\rbrace _j\), where \(\lbrace T_j\rbrace _j\) is the sequence identified in Proposition 7.5. For simplicity, if \(S = T_j\), we will write \(\eta _S\) instead of \(\eta _{T_j}\) for the null sequence given by Proposition 7.5.

Define \( N_j(S,T) := \left\lfloor \lambda _j(T-(10n+5)S)/(nS) \right\rfloor \) for \(j \in \lbrace 1,2\rbrace \). In the following, the indices are always chosen as \(j \in \lbrace 1,2\rbrace \), \(i\in \lbrace 0,\ldots ,N_j(S,T)\rbrace \), and \(k\in \lbrace 0,\ldots ,n-1\rbrace \) without further notice. As usual, the orthonormal vectors to \(\nu ,\nu _1,\nu _2\) obtained by clockwise rotation by \(\pi /2\) are denoted by \(\nu ^\bot ,\nu _1^\bot ,\nu _2^\bot \), respectively. From \(\nu = \lambda _1 \nu _1 + \lambda _2 \nu _2\) and the definition of \(N_j(S,T)\) we get

$$\begin{aligned} N_1(S,T) \nu _1^\bot + N_2(S,T) \nu _2^\bot = M \nu ^\bot - w, \end{aligned}$$
(7.34)

where \(M = (T-(10n+5)S)/(nS)\) and \(w = \alpha _1 \nu _1^\bot + \alpha _2 \nu _2^\bot \) for suitable \(0 \leqq \alpha _1, \alpha _2 < 1\), in particular, \(|w| \leqq 2\). We set

$$\begin{aligned}&x_i^{1,k} =\big (-T/2+ 5S + S(M+10)k \big ) \, \nu ^\perp + i\,S\nu _1^\perp , \\&x_i^{2,k} =x_{N_1(S,T)}^{1,k} + 5S\nu ^\perp + i\,S\nu _2^\perp , \end{aligned}$$

and let \(X_i^{j,k} \subset {\mathbb {R}}^2\) be defined as a minimizer of the problem

$$\begin{aligned} \begin{aligned} \min \left\{ E_1\big (X,Q^{\nu _j}_S(x_i^{j,k})\big ) :\, X = {\mathscr {L}}(z^\pm ) \text { on } \partial _1^\pm Q^{\nu _j}_S(x_i^{j,k}) \right\} . \end{aligned} \end{aligned}$$
(7.35)

We recall notation (2.4)–(2.5) and define

$$\begin{aligned} U&= \big ([- \tfrac{T}{2}\nu ^\perp ; x_0^{1,0}]\big )_{\kappa } \cup \bigcup _{k=0}^{n-1} \big ([x_{N_1(S,T)}^{1,k}; x_0^{2,k}]\big )_{\kappa } \cup \bigcup _{k=0}^{n-2} \big ([x_{N_2(S,T)}^{2,k}; x_0^{1,k+1}]\big )_{\kappa } \\&\quad \cup \big ([ x_{N_2(S,T)}^{2,n-1}; \tfrac{T}{2}\nu ^\perp ]\big )_{\kappa }, \end{aligned}$$

where \(\kappa >1\) is chosen later. Note that U consists of \(2n+1\) tubular neighborhoods of segments whose maximal length is bounded by CS. (Apart from the segment \([ x_{N_2(S,T)}^{2,n-1}; \tfrac{T}{2}\nu ^\perp ]\), this follows directly from the choice of the points \(x_i^{j,k}\) and (7.34). For \([ x_{N_2(S,T)}^{2,n-1}; \tfrac{T}{2}\nu ^\perp ]\), it follows from \(x_{N_2(S,T)}^{2,n-1} = (-T/2 + S(M+10)n)\, \nu ^\perp - S w = (T/2 - 5S)\, \nu ^\perp - Sw\), where \(|w| \leqq 2\).) We also observe that \(Q^\nu _T {\setminus } (\bigcup _{i,j,k} Q^{\nu _j}_S(x_i^{j,k}) \cup U)\) consists of two connected components. The connected component intersecting \(\partial ^+_1 Q^\nu _T\) is denoted by \(A^+\) and the other one is denoted by \(A^-\). Note that the cubes \(Q^{\nu _j}_S(x_i^{j,k})\) do not intersect \(\partial ^\pm _1 Q^\nu _T\). We introduce the sets \(A_i^{j,k} = Q^{\nu _j}_{10}( x_i^{j,k} + (S/2) \nu ^\perp _j )\cup Q^{\nu _j}_{10}( x_i^{j,k} - (S/2) \nu ^\perp _j )\) and let \(X_T\) be defined by

$$\begin{aligned} X_T = {\left\{ \begin{array}{ll} X_i^{j,k} &{}\text {in } \,\,Q^{\nu _j}_S(x_i^{j,k}){\setminus } A_i^{j,k}, \\ \emptyset &{}\text {in } \,\, \left( U {\setminus }\left( \bigcup _{i,j,k} Q^{\nu _j}_S(x_i^{j,k}) \cup \partial ^-_1 Q^{\nu }_T\cup \partial ^+_1 Q^\nu _T\right) \right) \cup \bigcup _{i,j,k}A_i^{j,k} ,\\ {\mathscr {L}}(z^\pm ) &{}\text {in } \,\, A^\pm \cup \partial ^\pm _1 Q^\nu _T. \end{array}\right. } \end{aligned}$$
(7.36)

For an illustration of the sets and the configuration \(X_T\) we refer to Fig. 13. Clearly, we have \(X_T = {\mathscr {L}}(z^\pm )\) on \(\partial ^\pm _1 Q^\nu _T\).
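The identity (7.34) and the formula for the endpoint \(x_{N_2(S,T)}^{2,n-1}\) used in the construction can be verified by direct computation. The following is a sketch in which we read the "suitable" coefficients as the fractional parts \(\alpha _j := \lambda _j M - N_j(S,T) \in [0,1)\). Since rotation by \(\pi /2\) is linear, \(\nu ^\bot = \lambda _1 \nu _1^\bot + \lambda _2 \nu _2^\bot \), and thus

$$\begin{aligned} N_1(S,T) \nu _1^\bot + N_2(S,T) \nu _2^\bot&= M\big (\lambda _1 \nu _1^\bot + \lambda _2 \nu _2^\bot \big ) - \big (\alpha _1 \nu _1^\bot + \alpha _2 \nu _2^\bot \big ) = M \nu ^\bot - w, \\ x_{N_2(S,T)}^{2,n-1}&= \big (-T/2 + 10S + S(M+10)(n-1)\big )\, \nu ^\perp + S\big (N_1(S,T) \nu _1^\perp + N_2(S,T)\nu _2^\perp \big ) \\&= \big (-T/2 + S(M+10)n\big )\, \nu ^\perp - Sw, \\ S(M+10)n&= T - (10n+5)S + 10nS = T - 5S. \end{aligned}$$

Here \(|w| \leqq \alpha _1 + \alpha _2 < 2\). Consequently, the last segment \([ x_{N_2(S,T)}^{2,n-1}; \tfrac{T}{2}\nu ^\perp ]\) has length \(|5S\nu ^\perp + Sw| \leqq 7S\).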

Step 2: Energy estimate on \(X_T\). We now estimate the energy of \(X_T\). First, due to the boundary conditions \(X_i^{j,k} = {\mathscr {L}}(z^\pm )\) on \(\partial _1^\pm Q^{\nu _j}_S(x_i^{j,k})\), one can check that for \(\kappa \) big enough there holds \(|x-y| \geqq 1\) for all \(x,y \in X_T\), \(x\ne y\) and therefore \(E_1(X_T)<+\infty \). We now prove the following two sub-estimates:

$$\begin{aligned} E_1\Big (X_T, \big ( A^+ \cup A^- \cup \partial ^+_1 Q^\nu _T\cup \partial ^-_1 Q^\nu _T \big ) \cap Q^\nu _T \Big )\leqq CnS \end{aligned}$$
(7.37)

and

$$\begin{aligned}&E_1\left( X_T,\bigcup \limits _{i,j,k} Q^{\nu _j}_S(x_i^{j,k}) \cup \big (U {\setminus } (\partial ^-_1 Q^{\nu }_T\cup \partial ^+_1 Q^\nu _T) \big )\right) \nonumber \\&\quad \leqq \sum \limits _{j=1}^2\frac{\lambda _jT}{S} \left( S\big ({\varphi }(z^+,z^-, \nu _j) + \eta _S\big ) +C\right) , \end{aligned}$$
(7.38)

where \(\lbrace \eta _S \rbrace _S\) denotes a sequence with \(\eta _S \rightarrow 0\) as \(S \rightarrow +\infty \).

Proof of (7.37)

For \(x \in X_T\cap (A^+ \cup A^- \cup \partial ^+_1 Q^\nu _T\cup \partial ^-_1 Q^\nu _T ) \cap Q^\nu _T\) such that \(\mathrm {dist}(x, U) >1\), there holds \(\#{\mathcal {N}}(x)=6\). This follows from the boundary conditions of \(X_i^{j,k}\) on every cube \(Q^{\nu _j}_S(x_i^{j,k})\) and the fact that \(X_T={\mathscr {L}}(z^\pm )\) in \(A^\pm \cup \partial ^\pm _1 Q^\nu _T\). Therefore, in order to obtain (7.37), it suffices to estimate the cardinality of the atoms \(x\in X_T\) lying in \((U)_1\). As U consists of \(2n+1\) tubular neighborhoods of segments whose length is bounded by CS, we get \({\mathcal {L}}^2((U)_2) \leqq CnS\). Therefore, employing Lemma 3.1(v), we obtain \(\#(X_T \cap (U)_1 )\leqq CnS\). By (2.3) this implies (7.37). \(\square \)

Proof of (7.38)

In view of (7.36), in order to obtain (7.38), it suffices to estimate the energy contribution of atoms in \(\bigcup _{i,j,k} (Q^{\nu _j}_S(x_i^{j,k}){\setminus } A_i^{j,k})\). For each triple \((i,j,k)\), it holds that \(X_T= {\mathscr {L}}(z^\pm )\) on

$$\begin{aligned} \big (\partial Q^{\nu _j}_S(x_i^{j,k})\big )_5 {\setminus } \big \{x:\pm \langle x-x_i^{j,k}, \nu _j\rangle \leqq C \kappa \big \}, \end{aligned}$$

with a constant \(C > 0\) only depending on \(\nu _1, \nu _2\) and \(\nu \). This shows that the cardinality of \(X_T \cap Q^{\nu _j}_S(x_i^{j,k}) \cap ((A_i^{j,k})_1 \cup (U)_1)\), which contains all atoms \(x \in X_T \cap Q^{\nu _j}_S(x_i^{j,k})\) for which possibly \(\#({\mathcal {N}}(x)\cap X_T) < \#({\mathcal {N}}(x)\cap X_i^{j,k})\), is uniformly controlled due to Lemma 3.1(v). We thus obtain \(E\big (X_T,Q^{\nu _j}_S(x_{i}^{j,k})\big ) \leqq E\big (X_i^{j,k},Q^{\nu _j}_S(x_{i}^{j,k})\big ) + C\) by (2.3). Thus, using (7.35), Propositions 7.2 and 7.5 we get

$$\begin{aligned} E\big (X_T,Q^{\nu _j}_S(x_{i}^{j,k})\big ) \leqq E\big (X_i^{j,k},Q^{\nu _j}_S(x_{i}^{j,k})\big ) + C \leqq S\big ({\varphi }(z^+,z^-,\nu _j) + \eta _S\big ) +C. \end{aligned}$$
(7.39)

For \(j \in \lbrace 1,2\rbrace \), we find that

$$\begin{aligned}&\# \big \{(i,k) :\, i=0,\ldots ,N_j(S,T), \, k=0,\ldots ,n-1 \big \} \\&\quad = n\bigg (\bigg \lfloor \frac{\lambda _j(T-(10n+5)S) }{nS} \bigg \rfloor + 1 \bigg ) \leqq \frac{\lambda _j T}{S}. \end{aligned}$$

This, along with (7.39), yields (7.38).
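The counting bound can be traced as follows. This is a sketch with \(M = (T-(10n+5)S)/(nS)\) as in (7.34), using only \(\lfloor t \rfloor \leqq t\):

$$\begin{aligned} n\big (\lfloor \lambda _j M \rfloor + 1 \big ) \leqq n \lambda _j M + n = \frac{\lambda _j T}{S} - (10n+5)\lambda _j + n. \end{aligned}$$

The right-hand side is at most \(\lambda _j T/S\) whenever \((10n+5)\lambda _j \geqq n\), for instance when \(\lambda _j \geqq 1/10\). For smaller \(\lambda _j\), this computation by itself only gives the weaker bound \(\lambda _j T/S + n\). As far as this sketch goes, the \(n\) additional cubes would then contribute at most \(n\big (S({\varphi }(z^+,z^-,\nu _j) + \eta _S) + C\big )\) to (7.38). This term does not depend on \(T\) and hence vanishes after division by \(T\) in the limit \(T \rightarrow +\infty \).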

Step 3: Conclusion. Noting that

$$\begin{aligned} \min \left\{ E_1(X,Q^{\nu }_T) :\, X = {\mathscr {L}}(z^\pm ) \text { on } \partial _1^\pm Q^{\nu }_T \right\} \leqq E_1(X_T,Q^\nu _T), \end{aligned}$$

and using (7.37)–(7.38) as well as Lemma 3.1(iv), we have

$$\begin{aligned}&\min \left\{ E_1(X,Q^{\nu }_T) :X = {\mathscr {L}}(z^\pm ) \text { on } \partial _1^\pm Q^{\nu }_T \right\} \leqq \lambda _1 T\big ({\varphi }(z^+,z^-,\nu _1) + \eta _S\big ) +C\lambda _1 T /S \nonumber \\&\quad \ \ \ + \lambda _2T\big ({\varphi }(z^+,z^-,\nu _2) + \eta _S\big ) + C\lambda _2T/S + CnS. \end{aligned}$$

Dividing by T, letting first \(T\rightarrow +\infty \), and then \(S\rightarrow +\infty \), we obtain (7.33) by Proposition 7.2, where we also use \(\eta _S \rightarrow 0\). This concludes the proof of (iii). \(\square \)
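In more detail, the last step can be sketched as follows. Dividing the previous estimate by \(T\) gives

$$\begin{aligned} \frac{1}{T}\min \left\{ E_1(X,Q^{\nu }_T) :\, X = {\mathscr {L}}(z^\pm ) \text { on } \partial _1^\pm Q^{\nu }_T \right\} \leqq \sum \limits _{j=1}^2 \lambda _j\big ({\varphi }(z^+,z^-,\nu _j) + \eta _S\big ) + \frac{C(\lambda _1 + \lambda _2)}{S} + \frac{CnS}{T}. \end{aligned}$$

For fixed \(S\) and \(n\), the left-hand side converges to \(\varphi (z^+,z^-,\nu )\) as \(T \rightarrow +\infty \) by Proposition 7.2, while \(CnS/T \rightarrow 0\). The remaining error terms \((\lambda _1+\lambda _2)\eta _S\) and \(C(\lambda _1 + \lambda _2)/S\) vanish as \(S \rightarrow +\infty \), which leaves exactly (7.33).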

Proof of (iv)

Let \(z^\pm =(\theta ^\pm ,\tau ^\pm ,1)\), \(\nu \in {\mathbb {S}}^1\), and \(\theta \in {\mathbb {A}}\). Our goal is to prove

$$\begin{aligned}&\varphi \big ((\theta ^++\theta ,\tau ^+,1),(\theta ^-+\theta ,\tau ^-,1),e^{i\theta } \nu \big )\nonumber \\&\quad =\varphi \big ((\theta ^+,\tau ^+,1),(\theta ^-,\tau ^-,1),\nu \big ). \end{aligned}$$
(7.40)

Due to Proposition 7.2, for every \(T>0\) we can choose \(X_T \subset {\mathbb {R}}^2\), such that \(X_T ={\mathscr {L}}((\theta ^\pm ,\tau ^\pm ,1))\) on \(\partial _1^\pm Q^\nu _T\) and such that

$$\begin{aligned} \lim _{T\rightarrow +\infty } \frac{1}{T} E_1\big (X_T,Q^\nu _T\big ) = \varphi \big ((\theta ^+,\tau ^+,1),(\theta ^-,\tau ^-,1),\nu \big ). \end{aligned}$$
(7.41)

We set \(X_T^\theta = e^{i\theta }X_T\). Then \(X_T^\theta = {\mathscr {L}}((\theta ^\pm +\theta ,\tau ^\pm ,1))\) on \(\partial _1^\pm Q^{\nu _\theta }_T\), where \(\nu _\theta = e^{i\theta }\nu \). Applying Proposition 7.2, Lemma 3.1(i), and (7.41), we obtain

$$\begin{aligned}&\varphi ((\theta ^++\theta ,\tau ^+,1),(\theta ^-+\theta ,\tau ^-,1),e^{i\theta }\nu )\leqq \liminf _{T\rightarrow +\infty } \frac{1}{T} E_1(X_T^\theta ,Q^{\nu _\theta }_T) \\&\quad = \lim _{T\rightarrow +\infty } \frac{1}{T} E_1(X_T,Q^{\nu }_T) = \varphi ((\theta ^+,\tau ^+,1),(\theta ^-,\tau ^-,1),\nu ). \end{aligned}$$

This implies one inequality in (7.40). The other inequality follows by repeating the argument for \(({\tilde{\theta }}^\pm ,\tau ^\pm ,1)= (\theta ^\pm +\theta ,\tau ^\pm ,1)\), \({\tilde{\nu }} = e^{i\theta }\nu \), and \({\tilde{\theta }}=-\theta \). This concludes the proof of (iv). \(\square \)
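The boundary condition for \(X_T^\theta \) can be checked under two assumptions. The first is the convention \({\mathscr {L}}((\theta ,\tau ,1)) = e^{i\theta }({\mathscr {L}} + \tau )\), which is consistent with the usage in the proof of Lemma 7.6. The second is that \(Q^\nu _T\) is the cube centered at the origin with sides parallel to \(\nu \) and \(\nu ^\perp \). Then

$$\begin{aligned} e^{i\theta } {\mathscr {L}}((\theta ^\pm ,\tau ^\pm ,1))&= e^{i\theta } e^{i\theta ^\pm }\big ({\mathscr {L}} + \tau ^\pm \big ) = {\mathscr {L}}((\theta ^\pm +\theta ,\tau ^\pm ,1)), \\ e^{i\theta } Q^\nu _T&= Q^{e^{i\theta }\nu }_T = Q^{\nu _\theta }_T. \end{aligned}$$

The rotation therefore maps admissible configurations for \((z^+,z^-,\nu )\) to admissible configurations for the rotated data. The equality of the energies in the chain of inequalities above is the invariance under rotations from Lemma 3.1(i).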

Proof of (v)

Let \(z^\pm =(\theta ^\pm ,\tau ^\pm ,1)\), \(\nu \in {\mathbb {S}}^1\), and \(\tau \in {\mathbb {T}}\). Our goal is to prove

$$\begin{aligned}&\varphi \Big ( \big (\theta ^+,\tau ^++e^{-i\theta ^+}\tau ,1\big ), \big (\theta ^-,\tau ^-+e^{-i\theta ^-}\tau ,1\big ),\nu \Big )\nonumber \\&\quad =\varphi \big ((\theta ^+,\tau ^+,1),(\theta ^-,\tau ^-,1),\nu \big ). \end{aligned}$$
(7.42)

Due to Proposition 7.2, for every \(T>0\) we can choose \(X_T \subset {\mathbb {R}}^2\), such that \(X_T ={\mathscr {L}}((\theta ^\pm ,\tau ^\pm ,1))\) on \(\partial _1^\pm Q^\nu _T\) and such that (7.41) holds. We set \(X_T^\tau = X_T+\tau \). Then \(X_T^\tau = {\mathscr {L}}((\theta ^\pm ,\tau ^\pm +e^{-i\theta ^{\pm }}\tau ,1))\) on \(\partial _1^\pm Q^{\nu }_T(\tau )\). Applying Proposition 7.2, Lemma 3.1(i), and (7.41), we get

$$\begin{aligned}&\varphi \big (\big (\theta ^+,\tau ^++e^{-i\theta ^+}\tau ,1\big ), \big (\theta ^-,\tau ^-+e^{-i\theta ^-}\tau ,1\big ),\nu \big )\\&\quad \leqq \liminf _{T\rightarrow +\infty } \frac{1}{T} E_1(X_T^\tau ,Q^{\nu }_T(\tau )) = \lim _{T\rightarrow +\infty } \frac{1}{T} E_1(X_T,Q^{\nu }_T) \\&\quad = \varphi \big ((\theta ^+,\tau ^+,1),(\theta ^-,\tau ^-,1),\nu \big ). \end{aligned}$$

This yields one inequality of (7.42). The other inequality follows by repeating the argument for \(({\theta }^\pm ,{\tilde{\tau }}^\pm ,1) = (\theta ^\pm ,\tau ^\pm +e^{-i\theta ^\pm }\tau ,1)\) and \({\tilde{\tau }}=-\tau \). This concludes the proof of (v). \(\square \)
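The form of the shifted boundary condition can be checked in the same spirit. This is a sketch that assumes the convention \({\mathscr {L}}((\theta ,\tau ,1)) = e^{i\theta }({\mathscr {L}} + \tau )\), consistent with the proof of Lemma 7.6, and that identifies translations which differ by a lattice vector:

$$\begin{aligned} {\mathscr {L}}((\theta ^\pm ,\tau ^\pm ,1)) + \tau&= e^{i\theta ^\pm }\big ({\mathscr {L}} + \tau ^\pm \big ) + \tau = e^{i\theta ^\pm }\big ({\mathscr {L}} + \tau ^\pm + e^{-i\theta ^\pm }\tau \big ) \\&= {\mathscr {L}}((\theta ^\pm ,\tau ^\pm +e^{-i\theta ^{\pm }}\tau ,1)), \\ Q^\nu _T + \tau&= Q^\nu _T(\tau ). \end{aligned}$$

The factor \(e^{-i\theta ^\pm }\) appears because the translation parameter is measured in the unrotated lattice coordinates, whereas the shift \(\tau \) acts after the rotation.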