1 Introduction

We consider the Maxwell–Dirac system

$$\begin{aligned} \left\{ \begin{aligned}&(-i\gamma ^\mu \partial _\mu + M) \psi = A_\mu \gamma ^\mu \psi , \\&\square A_\mu = \overline{\psi } \gamma _\mu \psi , \end{aligned} \right. \end{aligned}$$
(1)

on the Minkowski space-time \(\mathbb {R}^{1+d}\) for space dimensions \(d \le 3\). This fundamental model from relativistic field theory describes the interaction of an electron with its self-induced electromagnetic field. Our interest here is in the Cauchy problem with prescribed initial data at time \(t=0\),

$$\begin{aligned} \psi (0,x) = \psi _0(x), \quad A_\mu (0,x) = a_\mu (x), \quad \partial _t A_\mu (0,x) = b_\mu (x), \end{aligned}$$
(2)

and the question of local or global solvability, which has received some attention in recent years; see [1, 4, 7, 14, 15, 18] for the case of one space dimension and [3, 5, 6, 8, 9, 11, 12, 13, 16] for higher dimensions, and the references therein.

The unknowns are the spinor field \(\psi =\psi (t,x)\), taking values in \(\mathbb {C}^N\) (\(N=2\) for \(d=1,2\); \(N=4\) for \(d=3\)), and the real-valued potentials \(A_\mu =A_\mu (t,x)\), \(\mu =0,1,\dots ,d\). The constant \(M \ge 0\) is the mass.

The equations are written in covariant form on \(\mathbb {R}^{1+d} = \mathbb {R}_t \times \mathbb {R}_x^d\) with the Minkowski metric \((g^{\mu \nu }) = \mathrm {diag}(1,-1,\dots ,-1)\) and coordinates \((x_\mu )\), where \(x_0=t\) is the time and \(x=(x_1,\dots ,x_d)\) is the spatial position. Greek indices range over \(0,1,\dots ,d\), Latin indices over \(1,\dots ,d\), and repeated upper and lower indices are implicitly summed over these ranges. We write \(\partial _\mu = \frac{\partial }{\partial x_\mu }\), so \(\partial _0=\partial _t\) is the time derivative, \(\nabla = (\partial _1,\dots ,\partial _d)\) is the spatial gradient, and \(\square = \partial ^\mu \partial _\mu = \partial _t^2-\Delta \) is the d’Alembertian. The \(N \times N\) Dirac matrices \(\gamma ^\mu \) are required to satisfy

$$\begin{aligned} \gamma ^\mu \gamma ^\nu + \gamma ^\nu \gamma ^\mu = 2 g^{\mu \nu } I, \qquad (\gamma ^0)^* = \gamma ^0, \qquad (\gamma ^j)^* = - \gamma ^j. \end{aligned}$$
(3)

We denote by \(\psi ^*\) the complex conjugate transpose, and write \(\overline{\psi }= \psi ^*\gamma ^0\).

Key features of the Maxwell–Dirac system are the gauge invariance, the scaling invariance and the conservation laws, which we now recall.

Firstly, there is a \(\mathrm {U}(1)\) gauge invariance

$$\begin{aligned} \psi \longrightarrow e^{i\chi }\psi , \qquad A_\mu \longrightarrow A_\mu + \partial _\mu \chi , \end{aligned}$$

for any real-valued \(\chi (t,x)\). This gauge freedom allows us to impose a gauge condition on the potentials. The particular form (1) of the Maxwell–Dirac system appears when the Lorenz gauge condition \(\partial ^\mu A_\mu = 0\) is imposed, that is,

$$\begin{aligned} \partial _t A_0 = \nabla \cdot \mathbf{A}, \end{aligned}$$
(4)

where \(\mathbf{A} = (A_1,\dots ,A_d)\). Since this gauge condition reduces to certain constraints on the data (2), we did not include it in (1). In addition to the obvious constraint, there is a constraint coming from the Gauss law (implied by (4) and the second equation in (1))

$$\begin{aligned} \nabla \cdot \mathbf{E} = \vert \psi \vert ^2, \end{aligned}$$

where \(\mathbf{E} = \nabla A_0 - \partial _t \mathbf{A}\) is the electric field. Thus, the data constraints are

$$\begin{aligned} b_0 = \sum _{j=1}^d \partial _j a_j, \qquad \sum _{j=1}^d \partial _j (\partial _j a_0 - b_j) = \vert \psi _0 \vert ^2. \end{aligned}$$
(5)

If these are satisfied (in some ball), then a solution of (1), (2) will also satisfy the Lorenz gauge condition (4) (in the cone of dependence over the ball).

Secondly, the system is invariant under the rescaling, in the case \(M=0\),

$$\begin{aligned} \psi (t,x) \longrightarrow \lambda ^{3/2} \psi (\lambda t, \lambda x), \qquad A_\mu (t,x) \longrightarrow \lambda A_\mu (\lambda t, \lambda x) \qquad (\lambda > 0). \end{aligned}$$

For Sobolev data \((\psi _0,a_\mu ,b_\mu ) \in H^s(\mathbb {R}^d) \times H^r(\mathbb {R}^d) \times H^{r-1}(\mathbb {R}^d)\), the scale-invariant regularity (for the homogeneous Sobolev norms, to be precise) is \(s=s_c(d)=\frac{d-3}{2}\) and \(r=r_c(d)=\frac{d-2}{2}\). By the usual heuristic one does not expect well-posedness below this regularity.
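
Indeed, writing \(\psi _{0,\lambda }(x) = \lambda ^{3/2} \psi _0(\lambda x)\) and \(a_{\mu ,\lambda }(x) = \lambda a_\mu (\lambda x)\), a change of variables on the Fourier side gives

$$\begin{aligned} \left\| \psi _{0,\lambda } \right\| _{\dot{H}^s} = \lambda ^{\frac{3}{2}+s-\frac{d}{2}} \left\| \psi _0 \right\| _{\dot{H}^s}, \qquad \left\| a_{\mu ,\lambda } \right\| _{\dot{H}^r} = \lambda ^{1+r-\frac{d}{2}} \left\| a_\mu \right\| _{\dot{H}^r}, \end{aligned}$$

and these exponents vanish exactly for \(s = \frac{d-3}{2}\) and \(r = \frac{d-2}{2}\); the datum \(b_\mu \), which rescales as \(\lambda ^2 b_\mu (\lambda x)\), gives the same exponent at regularity \(r-1\).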

Thirdly, we consider conservation laws. While the Maxwell–Dirac system does have a conserved energy, which is roughly speaking at the level of \(H^{1/2}\) for the spinor, this energy does not have a definite sign, so it is difficult to see how to make use of it to prove global existence. On the other hand, one has the conservation of charge

$$\begin{aligned} \int _{\mathbb {R}^d} \vert \psi (t,x) \vert ^2 \, dx = \int _{\mathbb {R}^d} \vert \psi (0,x) \vert ^2 \, dx, \end{aligned}$$
(6)

which plays a key role in all the known global existence results for large data. We will refer to solutions at this regularity, that is, with \(t \mapsto \psi (t,\cdot )\) a continuous map into \(L^2(\mathbb {R}^d)\), as charge class solutions. It should be noted that the charge regularity \(s=0\) coincides with the scaling-critical regularity \(s_c(d) = \frac{d-3}{2}\) when \(d=3\). Thus, the Maxwell–Dirac system is charge-critical in three space dimensions, and charge-subcritical in dimensions \(d=1,2\).

The first global result for (1), (2) was obtained by Chadam [4], in one space dimension, for data \((\psi _0,a_\mu ,b_\mu ) \in H^1(\mathbb {R}) \times H^1(\mathbb {R}) \times L^2(\mathbb {R})\). Chadam first proved local existence and uniqueness, and was able to extend the solution globally by proving an a priori bound on the \(H^1(\mathbb {R}) \times H^1(\mathbb {R}) \times L^2(\mathbb {R})\) norm of the solution via a clever boot-strap argument making use of the conservation of charge (6). But to be able to prove global existence with a more direct use of the conservation of charge, in any dimension, a natural strategy is to try to prove local existence of charge class solutions.

We proceed to recall what is known about local and global well-posedness in the charge class.

Starting with one space dimension, we note that Bournaveas [2] proved global charge-class existence for the related Dirac-Klein-Gordon system, but the argument relies on a null structure in Dirac-Klein-Gordon which is not present in Maxwell–Dirac. Bachelot [1] gave another proof that does not rely on null structure and applies also to Maxwell–Dirac; similar results have been obtained in [14, 17, 18].

In the charge-critical three space-dimensional case, local well-posedness remains an open question in the charge class, but has been proved almost down to that regularity by D’Ancona, Foschi and Selberg [5]; see also [3, 13] for earlier local results at higher regularity, and [9, 12, 16] for small-data global results. The existence of stationary solutions was proved in [8].

In two space dimensions, global well-posedness in the charge class was proved by D’Ancona and Selberg [6].

To summarise, in the charge-subcritical dimensions \(d=1,2\), there is global well-posedness in the charge class, and in the charge-critical dimension \(d=3\), local well-posedness holds almost down to the charge regularity. Our aim here is to show that these results are sharp (or almost sharp, for \(d=3\)), by proving ill-posedness below the charge regularity. This result is somewhat surprising in the subcritical cases, and in particular for \(d=1\). Indeed, it should be noted that in dimensions \(d=2,3\), the proof of local existence at or near the charge regularity is quite involved and requires a subtle null structure that was uncovered in [5]. By contrast, the proof in the case \(d=1\) is elementary (see Sect. 7) and does not require this null structure. It was therefore expected that, by exploiting the latter, one should be able to go below the charge regularity. But our result shows that this is not possible, which means that the null structure is not helpful in the case \(d=1\).

We remark that our proof of ill-posedness works also in dimensions \(d \ge 4\), but then the critical regularity is above the charge, so this is not really of much interest. In dimensions \(d \ge 4\), global existence and modified scattering for data with small scaling-critical norm has been proved in [11].

We now state our main results.

2 Main results

The following notation is used. For \(1 \le p \le \infty \), \(L^p(\mathbb {R}^d)\) denotes the standard Lebesgue space. For \(s \in \mathbb {R}\), \(H^s(\mathbb {R}^d)\) is the Sobolev space \((1-\Delta )^{s/2} L^2(\mathbb {R}^d)\). For an open set U in \(\mathbb {R}^d\) or \(\mathbb {R}_t \times \mathbb {R}_x^d\), \(\mathcal D'(U)\) is the space of distributions on U. We write

$$\begin{aligned} X_0&= H^s(\mathbb {R}^d) \text { for some } s< 0, \text { or } L^p(\mathbb {R}^d) \text { for some } 1 \le p < 2, \\ B&= \text {the open unit ball in } \mathbb {R}^d, \text { centred at the origin}, \\ K&= \text {the cone of dependence over } B, \\ K_T&= K \cap \left( [0,T] \times \mathbb {R}^d \right) , \text { for } T > 0. \end{aligned}$$

Thus, \(K = \{ (t,x) \in \mathbb {R}\times \mathbb {R}^d :0 \le t< 1,\; \vert x \vert < 1-t \}\). The interior of the truncated cone \(K_T\) will be denoted \(\mathrm {Int}(K_T)\).

We will use the following facts concerning \(C^\infty \) solutions of (1), which follow from the general theory for semilinear wave equations. Assume we are given data (2) belonging to \(C^\infty (\mathbb {R}^d)\). Then there exists a corresponding \(C^\infty \) solution \((\psi ,A_\mu )\) of (1) on an open subset U of \([0,\infty ) \times \mathbb {R}^d\) containing the Cauchy hypersurface \(\{0\} \times \mathbb {R}^d\). Moreover, we may assume that U is causal, in the sense that for every point \((t,x)\) in U, the cone of dependence \(K^{(t,x)}\), with vertex \((t,x)\) and base in \(\{0\} \times \mathbb {R}^d\), is contained in U. The solution in the cone \(K^{(t,x)}\) is uniquely determined by the data in the base of the cone. By the uniqueness, and since the union of two causal sets is again causal, there exists a maximal solution of the type described above, and we call this the maximal \(C^\infty \) forward evolution of the given data.

In the first version of our ill-posedness result, we take vanishing data for the potentials.

Theorem 1

(Ill-posedness I) In space dimensions \(d \le 3\), the Cauchy problem (1), (2) is ill-posed for data

$$\begin{aligned} \psi _0 \in X_0, \qquad a_\mu = b_\mu = 0 \qquad (\mu =0,\dots ,d). \end{aligned}$$

More precisely, there exists \(\psi _0^{\mathrm {bad}} \in X_0 \setminus L^2(\mathbb {R}^d)\) such that for any \(T > 0\) and any neighbourhood \(\Omega _0\) of \(\psi _0^{\mathrm {bad}}\) in \(X_0\), there fails to exist a continuous map

$$\begin{aligned} S :\Omega _0 \longrightarrow \mathcal D'\left( \mathrm {Int}(K_T) \right) , \qquad \psi _0 \longmapsto S[\psi _0] = (\psi ,A_\mu ), \end{aligned}$$

with the property that if \(\psi _0 \in \Omega _0 \cap C_c^\infty (\mathbb {R}^d)\), then \(S[\psi _0]\) is \(C^\infty \) in \(K_T\) and solves (1) there, with initial data \(\psi _0\) and \(a_\mu =b_\mu =0\) in B.

This result applies to (1), (2) without regard to the data constraints (5), which of course are not compatible with the assumption \(a_\mu =b_\mu =0\). We next state an alternative version of the result, which allows us to take the constraints into account. In fact, Theorem 1 is an immediate consequence of the following more precise result.

Theorem 2

(Ill-posedness II) Let \(d \le 3\). There exist \(\psi _0^{\mathrm {bad}} \in X_0 \setminus L^2(\mathbb {R}^d)\) and \(\psi _{0,\varepsilon }, a_{\mu ,\varepsilon }, b_{\mu ,\varepsilon } \in C_c^\infty (\mathbb {R}^d)\) for each \(\varepsilon > 0\), such that

  1. (i)

    \(\psi _{0,\varepsilon } \rightarrow \psi _0^{\mathrm {bad}}\) in \(X_0\) as \(\varepsilon \rightarrow 0\).

  2. (ii)

    The maximal \(C^\infty \) forward evolution \((\psi _\varepsilon ,A_{\mu ,\varepsilon })\) of the data \((\psi _{0,\varepsilon }, a_{\mu ,\varepsilon }, b_{\mu ,\varepsilon })\) exists throughout the cone K.

  3. (iii)

    There exists \(T > 0\) such that, as \(\varepsilon \rightarrow 0\), \(A_{0,\varepsilon }(t,x) \rightarrow \infty \) uniformly in any compact subset of \(K_T \cap \{ (t,x) :\vert x \vert < t \}\).

Moreover, we can choose the \(a_{\mu ,\varepsilon }, b_{\mu ,\varepsilon }\) so that either

$$\begin{aligned} a_{\mu ,\varepsilon } = b_{\mu ,\varepsilon } = 0 \qquad \text { for } \mu = 0,\dots ,d, \end{aligned}$$
(7)

or

$$\begin{aligned} b_{0,\varepsilon } = \sum _{j=1}^d \partial _j a_{j,\varepsilon }, \qquad \sum _{j=1}^d \partial _j \left( \partial _j a_{0,\varepsilon } - b_{j,\varepsilon } \right) = \vert \psi _{0,\varepsilon } \vert ^2 \qquad \text { in }B. \end{aligned}$$
(8)

Here, if we choose the alternative (8), then \(a_{\mu ,\varepsilon }, b_{\mu ,\varepsilon }\) do not have limits in the sense of distributions on B as \(\varepsilon \rightarrow 0\). This is not a deficiency of our construction, but is necessarily so, as our next result shows. The following theorem essentially says that the Gauss law for the initial data is ill-posed when we are below the charge regularity.

Theorem 3

(Ill-posedness of constraints) There exists \(\psi _0^{\mathrm {bad}} \in X_0 \setminus L^2(\mathbb {R}^d)\) such that for any neighbourhood \(\Omega _0\) of \(\psi _0^{\mathrm {bad}}\) in \(X_0\), there do not exist continuous maps

$$\begin{aligned} I_\mu , J_\mu :\Omega _0 \longrightarrow \mathcal D'(B) \end{aligned}$$

with the property that if \(\psi _0 \in \Omega _0 \cap C_c^\infty (\mathbb {R}^d)\), then

$$\begin{aligned} a_\mu := I_\mu [\psi _0], \qquad b_\mu := J_\mu [\psi _0] \qquad (\mu =0,\dots ,d) \end{aligned}$$

satisfy the constraint equations (5) in B.

We conclude this section with a brief outline of the key steps in the proof of Theorem 2.

Step 1. We prove global well-posedness in the charge class for (1), (2) in the case where the data only depend on a single coordinate, say \(x_1\).

Step 2. To define the data \(\psi _{0,\varepsilon }, a_{\mu ,\varepsilon }, b_{\mu ,\varepsilon } \in C_c^\infty (\mathbb {R}^d)\), we start with functions of \(x_1\) and cut off smoothly outside the unit ball B. The corresponding maximal \(C^\infty \) forward evolution \((\psi _\varepsilon ,A_{\mu ,\varepsilon })\) exists in the entire cone K, by Step 1, and depends only on t and \(x_1\) there.

Step 3. Using a null form estimate and a boot-strap argument we prove that there exists \(T > 0\) such that \(A_{j,\varepsilon }\), \(j=2,\dots ,d\), are uniformly bounded in \(K_T\). A further boot-strap argument then yields a lower bound on \(\vert \psi _\varepsilon \vert \) in \( K_T \cap \{ (t,x) :0< t < x_1 \}. \)

Step 4. Letting \(\varepsilon \rightarrow 0\), we show that \(A_{0,\varepsilon }(t,x) \rightarrow \infty \) uniformly in any compact subset of \( K_T \cap \{ (t,x) :\vert x \vert < t \}, \) completing the proof of Theorem 2. In fact, we prove this in the larger set \(K_T \cap \{ (t,x) :\vert x_1 \vert < t \}\).

The remainder of this paper is organised as follows. In Sect. 3 we state the well-posedness result (Step 1), whose elementary proof is deferred until Sect. 7. In Sect. 4, we choose a particular representation of the Dirac matrices in dimensions \(d \le 3\), write out the Maxwell–Dirac system in terms of the components of the spinor, and prove a null form estimate in one space dimension. In Sect. 5 we specify the data (Step 2), and Sect. 6 contains the proof of ill-posedness (Steps 3 and 4).

3 Well-posedness for one-dimensional data

We start by stating the result described in Step 1, the well-posedness in the case where the data only depend on the single coordinate \(x_1\):

$$\begin{aligned} \psi (0,x) = \psi _0(x_1), \quad A_\mu (0,x) = a_\mu (x_1), \quad \partial _t A_\mu (0,x) = b_\mu (x_1). \end{aligned}$$
(9)

Then the solution of (1) will depend only on t and \(x_1\). Indeed, if \((\psi ,A_\mu )\) does not depend on \(x_2,\dots ,x_d\), then (1) is equivalent to

$$\begin{aligned} \left\{ \begin{aligned} (- i\gamma ^0 \partial _t - i\gamma ^1 \partial _1 + M) \psi&= \left( A_0 \gamma ^0 + A_1 \gamma ^1 + \dots + A_d \gamma ^d \right) \psi , \\ (\partial _t^2 - \partial _1^2) A_0&= \psi ^* \psi , \\ (\partial _t^2 - \partial _1^2) A_1&= - \psi ^* \gamma ^0 \gamma ^1 \psi , \\&\ \,\vdots \\ (\partial _t^2 - \partial _1^2) A_d&= - \psi ^* \gamma ^0 \gamma ^d \psi , \end{aligned} \right. \end{aligned}$$
(10)

and this is the system we will solve, with the initial condition (9).

There is conservation of charge, for sufficiently regular solutions:

$$\begin{aligned} \int _{\mathbb {R}} \vert \psi (t,x_1) \vert ^2 \, dx_1 = \int _{\mathbb {R}} \vert \psi (0,x_1) \vert ^2 \, dx_1. \end{aligned}$$
(11)

Indeed, premultiplying the Dirac equation in (10) by \(i\overline{\psi }=i\psi ^*\gamma ^0\), taking real parts, and using the fact that M and the \(A_\mu \) are real, and that \(\gamma ^0\) and \(\gamma ^0\gamma ^j\) are hermitian, one obtains the conservation law \(\partial _t \rho + \partial _1 j = 0\), where \(\rho = \psi ^*\psi = \vert \psi \vert ^2\) and \(j = \psi ^*\gamma ^0\gamma ^1\psi \). Integration then gives (11).
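
Spelled out, using \((\gamma ^0)^2 = I\), the hermiticity properties in (3) and the Dirac equation in (10), the computation reads

$$\begin{aligned} \partial _t \rho + \partial _1 j&= 2{{\,\mathrm{Re}\,}}\left( \psi ^* \partial _t \psi + \psi ^* \gamma ^0 \gamma ^1 \partial _1 \psi \right) = 2{{\,\mathrm{Re}\,}}\left( i \psi ^* \gamma ^0 \left( -i\gamma ^0 \partial _t - i\gamma ^1 \partial _1 \right) \psi \right) \\&= 2{{\,\mathrm{Re}\,}}\left( i \psi ^* \gamma ^0 \left( -M + A_0 \gamma ^0 + \dots + A_d \gamma ^d \right) \psi \right) = 0, \end{aligned}$$

the last step because the scalars \(\psi ^* \gamma ^0 \psi \) and \(\psi ^* \gamma ^0 \gamma ^\mu \psi \) are real.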

We now state the global well-posedness result in the charge class. The \(a_\mu \), \(\mu =0,\dots ,d\), will be taken in the space \(AC(\mathbb {R})\) with norm

$$\begin{aligned} \left\| f \right\| _{AC(\mathbb {R})} = \left\| f \right\| _{L^\infty (\mathbb {R})} + \left\| f' \right\| _{L^1(\mathbb {R})}. \end{aligned}$$

Thus, \(AC(\mathbb {R})\) is the space of absolutely continuous functions \(f :\mathbb {R}\rightarrow \mathbb {C}\) with bounded variation (cf. Corollary 3.33 in [10]), and \(AC_{\mathrm {loc}}(\mathbb {R})\) is the space of locally absolutely continuous functions.

Theorem 4

In any space dimension d, the Maxwell–Dirac system (1) is globally well-posed for one-dimensional data (9) with the regularity

$$\begin{aligned} (\psi _0,a,b) \in \mathfrak X_0 := L^2(\mathbb {R};\mathbb {C}^N) \times AC(\mathbb {R};\mathbb {R}^{d+1}) \times L^1(\mathbb {R};\mathbb {R}^{d+1}), \end{aligned}$$

where \(a = (a_0,\dots ,a_d)\) and \(b=(b_0,\dots ,b_d)\). That is, for any \(T > 0\), there is a unique solution

$$\begin{aligned} (\psi ,A,\partial _t A) \in C([0,T];\mathfrak X_0), \qquad A=(A_0,\dots ,A_d), \end{aligned}$$

depending only on t and \(x_1\). The solution has the following properties:

  1. (i)

    The data-to-solution map is continuous from \(\mathfrak X_0\) to \(C([0,T];\mathfrak X_0)\).

  2. (ii)

    Higher regularity persists. That is, if \(J \in \mathbb N\) and \(\partial _1^j (\psi _0,a_\mu ,b_\mu ) \in \mathfrak X_0\) for \(j \le J\), then \(\partial _t^j \partial _1^k (\psi ,A_\mu ,\partial _t A_\mu ) \in C([0,T];\mathfrak X_0)\) for \(j+k \le J\).

  3. (iii)

    If the data are \(C^\infty \), then so is the solution.

  4. (iv)

    The conservation of charge (11) holds.

  5. (v)

    If the data constraints (5) are satisfied for \(x_1\) in an interval I, then the Lorenz gauge condition \(\partial _t A_0 = \partial _1 A_1\) is satisfied in the cone of dependence over I.

In particular, taking \(d=1\), this result provides an alternative to the charge-class results from [1, 18], with a stronger form of well-posedness and, at the same time, a much simpler proof; this elementary proof is given in Sect. 7. We use iteration to prove local existence, and to close the estimates we only rely on the energy inequality for the Dirac equation and an estimate for the wave equation deduced from the d’Alembert representation.

4 The Dirac matrices and a null form estimate

In this section we specify our choice of the Dirac matrices, in dimensions \(d \le 3\). We do this in such a way that the Dirac equation in (10), when written in terms of the spinor components, has a form which makes it easy to work with. Recall that the spinor has \(N=2\) components in space dimensions \(d=1,2\), and \(N=4\) components when \(d=3\). We write

$$\begin{aligned} \psi = \begin{pmatrix} u \\ v \end{pmatrix}, \end{aligned}$$

where \(u,v\) are \(\mathbb {C}\)-valued for \(d=1,2\) and \(\mathbb {C}^2\)-valued for \(d=3\).

4.1 Space dimension \(d=1\)

We choose

$$\begin{aligned} \gamma ^0 = \left( \begin{matrix} 0 &{} 1 \\ 1 &{}0 \end{matrix} \right) , \qquad \gamma ^1= \left( \begin{matrix} 0 &{} -1 \\ 1 &{} 0 \end{matrix} \right) . \end{aligned}$$

Then (3) is satisfied, and (10) becomes

$$\begin{aligned} \left\{ \begin{aligned} (\partial _t + \partial _x)u&= i (A_0 + A_1) u -iMv, \\ (\partial _t - \partial _x)v&= i (A_0 - A_1) v -iMu, \\ (\partial _t^2-\partial _x^2) A_0&= \vert u \vert ^2 + \vert v \vert ^2, \\ (\partial _t^2-\partial _x^2) A_1&= - \vert u \vert ^2 + \vert v \vert ^2. \end{aligned} \right. \end{aligned}$$
(12)

Since \(A_0\), \(A_1\) are real valued, the first two equations imply

$$\begin{aligned} \left\{ \begin{aligned} (\partial _t + \partial _x)\vert u \vert ^2&= -2M {{\,\mathrm{Im}\,}}\left( \overline{v} u \right) , \\ (\partial _t - \partial _x)\vert v \vert ^2&= 2M {{\,\mathrm{Im}\,}}\left( \overline{v} u \right) . \end{aligned} \right. \end{aligned}$$
(13)

4.2 Dimension \(d=2\)

We choose

$$\begin{aligned} \gamma ^0 = \left( \begin{matrix} 0 &{} 1 \\ 1 &{}0 \end{matrix} \right) , \qquad \gamma ^1= \left( \begin{matrix} 0 &{} -1 \\ 1 &{} 0 \end{matrix} \right) , \qquad \gamma ^2= \left( \begin{matrix} i &{} 0 \\ 0 &{} -i \end{matrix} \right) . \end{aligned}$$

Then (3) is satisfied, and (10) becomes, writing \(x=x_1\) for simplicity,

$$\begin{aligned} \left\{ \begin{aligned} (\partial _t + \partial _x)u&= i (A_0 + A_1) u + A_2 v -iMv, \\ (\partial _t - \partial _x)v&= i (A_0 - A_1) v - A_2 u -iMu, \\ (\partial _t^2-\partial _x^2) A_0&= \vert u \vert ^2 + \vert v \vert ^2, \\ (\partial _t^2-\partial _x^2) A_1&= - \vert u \vert ^2 + \vert v \vert ^2, \\ (\partial _t^2-\partial _x^2) A_2&= - 2 {{\,\mathrm{Im}\,}}(u \overline{v}). \end{aligned} \right. \end{aligned}$$
(14)

Then we also have

$$\begin{aligned} \left\{ \begin{aligned} (\partial _t + \partial _x)\vert u \vert ^2&= 2A_2 {{\,\mathrm{Re}\,}}\left( \overline{v} u \right) -2M {{\,\mathrm{Im}\,}}\left( \overline{v} u \right) , \\ (\partial _t - \partial _x)\vert v \vert ^2&= - 2A_2 {{\,\mathrm{Re}\,}}\left( \overline{v} u \right) + 2M {{\,\mathrm{Im}\,}}\left( \overline{v} u \right) . \end{aligned} \right. \end{aligned}$$
(15)

4.3 Dimension \(d=3\)

The \(4 \times 4\) Dirac matrices are, in \(2 \times 2\) block form,

$$\begin{aligned} \gamma ^0 = \left( \begin{matrix} 0 &{}\quad I \\ I &{}\quad 0 \end{matrix} \right) , \qquad \gamma ^1= \left( \begin{matrix} 0 &{}\quad -I \\ I &{}\quad 0 \end{matrix} \right) , \qquad \gamma ^2= \left( \begin{matrix} \rho &{}\quad 0 \\ 0 &{}\quad -\rho \end{matrix} \right) , \qquad \gamma ^3= \left( \begin{matrix} \kappa &{}\quad 0 \\ 0 &{}\quad -\kappa \end{matrix} \right) , \end{aligned}$$

where I is the \(2 \times 2\) identity matrix and \(\rho \), \(\kappa \) must satisfy

$$\begin{aligned} \rho ^* = - \rho , \qquad \rho ^2 = -I, \qquad \kappa ^* = -\kappa , \qquad \kappa ^2 = -I, \qquad \rho \kappa + \kappa \rho = 0. \end{aligned}$$

Then (3) is satisfied. For example, we can choose

$$\begin{aligned} \rho = \left( \begin{matrix} 0 &{} -1 \\ 1 &{} 0 \end{matrix} \right) , \qquad \kappa = \left( \begin{matrix} i &{} 0 \\ 0 &{} -i \end{matrix} \right) . \end{aligned}$$

Then (10) reads (with \(x=x_1\))

$$\begin{aligned} \left\{ \begin{aligned} (\partial _t + \partial _x)u&= i (A_0 + A_1) u - i A_2 \rho v - i A_3 \kappa v -iMv, \\ (\partial _t - \partial _x)v&= i (A_0 - A_1) v + i A_2 \rho u + i A_3 \kappa u -iMu, \\ (\partial _t^2-\partial _x^2) A_0&= \vert u \vert ^2 + \vert v \vert ^2, \\ (\partial _t^2-\partial _x^2) A_1&= - \vert u \vert ^2 + \vert v \vert ^2, \\ (\partial _t^2-\partial _x^2) A_2&= - 2 {{\,\mathrm{Re}\,}}(v^* \rho u), \\ (\partial _t^2-\partial _x^2) A_3&= - 2 {{\,\mathrm{Re}\,}}(v^* \kappa u), \end{aligned} \right. \end{aligned}$$
(16)

where \(u,v\) are now \(\mathbb {C}^2\)-valued. Then also

$$\begin{aligned} \left\{ \begin{aligned} (\partial _t + \partial _x)\vert u \vert ^2&= 2A_2 {{\,\mathrm{Im}\,}}\left( v^* \rho u \right) + 2A_3 {{\,\mathrm{Im}\,}}\left( v^* \kappa u \right) - 2M {{\,\mathrm{Im}\,}}\left( v^* u \right) , \\ (\partial _t - \partial _x)\vert v \vert ^2&= - 2A_2 {{\,\mathrm{Im}\,}}\left( v^* \rho u \right) - 2A_3 {{\,\mathrm{Im}\,}}\left( v^* \kappa u \right) + 2M {{\,\mathrm{Im}\,}}\left( v^* u \right) . \end{aligned} \right. \end{aligned}$$
(17)
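
The relations (3), and the conditions on \(\rho \) and \(\kappa \), can be checked mechanically. The following short numpy script — a sanity check only, not part of any proof — verifies them for the matrices chosen in this section, for \(d=1,2,3\):

```python
import numpy as np

def gammas(d):
    """The Dirac matrices chosen in Sect. 4, for d = 1, 2, 3."""
    g0 = np.array([[0, 1], [1, 0]], dtype=complex)
    g1 = np.array([[0, -1], [1, 0]], dtype=complex)
    g2 = np.array([[1j, 0], [0, -1j]])
    if d == 1:
        return [g0, g1]
    if d == 2:
        return [g0, g1, g2]
    # d = 3: 4x4 matrices in 2x2 block form, with rho, kappa as in the example
    rho, kappa = np.array([[0, -1], [1, 0]], dtype=complex), g2
    Z, I2 = np.zeros((2, 2), dtype=complex), np.eye(2)
    return [np.block([[Z, I2], [I2, Z]]), np.block([[Z, -I2], [I2, Z]]),
            np.block([[rho, Z], [Z, -rho]]), np.block([[kappa, Z], [Z, -kappa]])]

for d in (1, 2, 3):
    g = gammas(d)
    N = len(g[0])
    metric = [1.0] + d * [-1.0]  # diag(1, -1, ..., -1)
    for mu in range(d + 1):
        for nu in range(d + 1):
            anti = g[mu] @ g[nu] + g[nu] @ g[mu]
            assert np.allclose(anti, (2 * metric[mu] if mu == nu else 0) * np.eye(N))
        # (gamma^0)* = gamma^0 and (gamma^j)* = -gamma^j
        assert np.allclose(g[mu].conj().T, g[mu] if mu == 0 else -g[mu])
    print(f"d = {d}: relations (3) hold")
```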

4.4 A null form estimate

When we move from \(d=1\) to \(d=2\) or \(d=3\), the decisive difference is that we pick up the additional fields \(A_2, A_3\). These fields will be better behaved than \(A_0, A_1\), since the right-hand sides of the corresponding equations in (14) and (16) are null forms: they contain a product of \(v^*\) and u, which propagate in transverse directions. This fact will be exploited through the following crucial estimate (which fails for the products uu and \(u \overline{u}\)).

We use the following notation. For \(x \in \mathbb {R}\) and \(t > 0\), let \(K^{(t,x)}\) denote the backward cone with vertex at (tx), that is,

$$\begin{aligned} K^{(t,x)} = \left\{ (s,y) \in \mathbb {R}^2 :0< s< t, \;\; x-t+s< y < x+t-s \right\} . \end{aligned}$$
(18)

Lemma 1

(Null form estimate) Consider a system of the form

$$\begin{aligned} (\partial _t + \partial _x)u&= F(t,x), \qquad u(0,x) = f(x), \\ (\partial _t - \partial _x)v&= G(t,x), \qquad v(0,x) = g(x), \end{aligned}$$

where \(x \in \mathbb {R}\), \(t > 0\), and the functions are \(\mathbb {C}\)-valued. For the solution (uv) we have the estimate, for all \(X \in \mathbb {R}\) and \(T > 0\),

$$\begin{aligned}&\iint _{K^{(T,X)}} \vert uv \vert \, dx \, dt \\&\quad \le \left( \left\| f \right\| _{L^1} + \int _0^T \left\| F(t) \right\| _{L^1} \, dt \right) \left( \left\| g \right\| _{L^1} + \int _0^T \left\| G(t) \right\| _{L^1} \, dt \right) . \end{aligned}$$

Proof

Integrating, we have

$$\begin{aligned} u(t,x)&= f(x-t) + \int _0^t F(s,x-t+s) \, ds, \\ v(t,x)&= g(x+t) + \int _0^t G(s,x+t-s) \, ds. \end{aligned}$$

Taking absolute values, we see that for \(0 \le t \le T\),

$$\begin{aligned} \vert u(t,x) \vert&\le \mu (x-t) := \vert f(x-t) \vert + \int _0^T \vert F(s,x-t+s) \vert \, ds, \\ \vert v(t,x) \vert&\le \nu (x+t) := \vert g(x+t) \vert + \int _0^T \vert G(s,x+t-s) \vert \, ds. \end{aligned}$$

By Fubini’s theorem it is then obvious that

$$\begin{aligned} \iint _{K^{(T,X)}} \vert uv \vert \, dx \, dt \le \left\| \mu \right\| _{L^1(\mathbb {R})} \left\| \nu \right\| _{L^1(\mathbb {R})}. \end{aligned}$$

But

$$\begin{aligned} \left\| \mu \right\| _{L^1(\mathbb {R})} \le \left\| f \right\| _{L^1(\mathbb {R})} + \int _0^T \left\| F(t) \right\| _{L^1(\mathbb {R})} \, dt, \end{aligned}$$

and similarly for \(\nu \), so we get the desired estimate.
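
As a crude numerical illustration of Lemma 1 (not used anywhere in the proofs), take \(F = G = 0\), so that \(u(t,x) = f(x-t)\) and \(v(t,x) = g(x+t)\), and compare the two sides by quadrature; the profiles f, g below are arbitrary choices:

```python
import numpy as np

f = lambda x: np.exp(-x**2)        # sample L^1 profiles
g = lambda x: 1.0 / (1.0 + x**2)

T, X, h = 1.0, 0.3, 1e-3           # cone parameters and mesh size

lhs = 0.0
for t in np.arange(0.0, T, h):     # integrate |u v| over the cone K^{(T,X)}
    y = np.arange(X - T + t, X + T - t, h)
    lhs += np.sum(np.abs(f(y - t) * g(y + t))) * h * h

y = np.arange(-100.0, 100.0, h)
rhs = np.sum(np.abs(f(y))) * h * np.sum(np.abs(g(y))) * h   # ||f||_1 ||g||_1
print(f"{lhs:.4f} <= {rhs:.4f}")
```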

5 Data for ill-posedness

In this section we specify the data that are used to prove Theorem 2.

Choose a cut-off \(\chi \in C_c^\infty (\mathbb {R})\) with \(0 \le \chi \le 1\) and \(\chi =1\) on \([-1,1]\). Let \(\varepsilon > 0\). For the spinor datum and its approximations, which are \(\mathbb {C}^N\)-valued, we then take

$$\begin{aligned} \psi _0^{\mathrm {bad}}(x)&= \chi (x_1) \cdots \chi (x_d) \begin{pmatrix} f(x_1) \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \\ \psi _{0,\varepsilon }(x)&= \chi (x_1) \cdots \chi (x_d) \begin{pmatrix} f_\varepsilon (x_1) \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \end{aligned}$$
(19)

where

$$\begin{aligned} f(x_1) = \frac{1}{\vert x_1 \vert ^{1/2}}, \qquad f_\varepsilon (x_1) = \frac{1}{(\varepsilon ^2+x_1^2)^{1/4}}. \end{aligned}$$
(20)

Thus, \(\chi f_\varepsilon \in C_c^\infty (\mathbb {R})\), and for \(1 \le p < 2\) we have \(\chi f \in L^p(\mathbb {R}) \setminus L^2(\mathbb {R})\), since \(\int _0^1 x_1^{-p/2} \, dx_1\) is finite precisely when \(p < 2\); moreover,

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \left\| \chi f_\varepsilon - \chi f \right\| _{L^p(\mathbb {R})} = 0. \end{aligned}$$

By the Hardy-Littlewood-Sobolev inequality, we then conclude that \(\chi f \in H^s(\mathbb {R})\) for \(s < 0\), and that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \left\| \chi f_\varepsilon - \chi f \right\| _{H^s(\mathbb {R})} = 0. \end{aligned}$$

It follows that \(\psi _{0,\varepsilon } \in C_c^\infty (\mathbb {R}^d)\), \(\psi _0^{\mathrm {bad}} \in X_0 \setminus L^2(\mathbb {R}^d)\), and

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \left\| \psi _{0,\varepsilon } - \psi _0^{\mathrm {bad}} \right\| _{X_0} = 0, \end{aligned}$$

where as before \(X_0\) denotes either \(H^s(\mathbb {R}^d)\), \(s < 0\), or \(L^p(\mathbb {R}^d)\), \(1 \le p < 2\).
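
These claims are also easy to test numerically. A minimal sketch (with a sharp cutoff standing in for the smooth \(\chi \), which does not affect the integrability issue at \(x_1 = 0\)):

```python
import numpy as np

h = 1e-5
x = np.arange(-1.0, 1.0, h) + h / 2     # midpoint grid, avoids x = 0
f = np.abs(x) ** -0.5

for eps in (1e-1, 1e-2, 1e-3, 1e-4):
    f_eps = (eps**2 + x**2) ** -0.25
    lp = (np.sum(np.abs(f_eps - f) ** 1.5) * h) ** (1 / 1.5)  # L^{3/2} error
    l2 = np.sqrt(np.sum(f_eps**2) * h)                        # L^2 norm
    print(f"eps = {eps:.0e}: L^(3/2) error {lp:.4f}, L^2 norm {l2:.2f}")
# The L^(3/2) error tends to 0, while the L^2 norm diverges like |log eps|^(1/2).
```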

Next, we choose the data \(a_{\mu ,\varepsilon }, b_{\mu ,\varepsilon } \in C_c^\infty (\mathbb {R}^d)\). The first alternative is to take vanishing data

$$\begin{aligned} a_{0,\varepsilon } = \dots = a_{d,\varepsilon } = 0, \qquad b_{0,\varepsilon } = \dots = b_{d,\varepsilon } = 0, \end{aligned}$$
(21)

as in (7). The second alternative is to ensure that the constraints in (8) are satisfied. For this, we take all the data to vanish except \(b_{1,\varepsilon }\), so the constraints reduce to

$$\begin{aligned} - \partial _1 b_{1,\varepsilon } = \vert \psi _{0,\varepsilon } \vert ^2 = \frac{1}{\sqrt{\varepsilon ^2+x_1^2}} \quad \text {in }B. \end{aligned}$$

Integrating this, we obtain

$$\begin{aligned} \left\{ \begin{aligned}&a_{0,\varepsilon } = \dots = a_{d,\varepsilon } = 0, \qquad b_{0,\varepsilon } = b_{2,\varepsilon } = \dots = b_{d,\varepsilon } = 0, \\&b_{1,\varepsilon }(x) = - \chi (x_1) \cdots \chi (x_d) \log \left( x_1 + \sqrt{\varepsilon ^2 + x_1^2} \right) , \end{aligned} \right. \end{aligned}$$
(22)

which satisfies (8).
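
Indeed, since \(\chi = 1\) on \([-1,1]\), we have \(b_{1,\varepsilon }(x) = -\log ( x_1 + \sqrt{\varepsilon ^2+x_1^2} )\) for \(x \in B\), and

$$\begin{aligned} -\partial _1 b_{1,\varepsilon }(x) = \frac{1 + \frac{x_1}{\sqrt{\varepsilon ^2+x_1^2}}}{x_1 + \sqrt{\varepsilon ^2+x_1^2}} = \frac{1}{\sqrt{\varepsilon ^2+x_1^2}} = \vert \psi _{0,\varepsilon }(x) \vert ^2 \qquad \text {for } x \in B. \end{aligned}$$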

6 Proof of ill-posedness

We start by proving Theorem 2, which implies Theorem 1. Theorem 3 is proved at the end of this section.

Let \(d \le 3\), choose the Dirac matrices as in Sect. 4, and define the data \( (\psi _{0,\varepsilon },a_{\mu ,\varepsilon },b_{\mu ,\varepsilon }) \in C_c^\infty (\mathbb {R}^d) \) by (19), (20), and either (21) or (22). Since the data depend only on \(x_1\) in B, it follows from Theorem 4 that their maximal \(C^\infty \) forward evolution \( (\psi _\varepsilon ,A_{\mu ,\varepsilon }) \) exists throughout the cone K over B, and depends only on t and \(x_1\) there. Indeed, we can apply Theorem 4 with the data restricted to \(x_2=\dots =x_d=0\).

We now claim that for \(T > 0\) sufficiently small, the following holds for \(\varepsilon > 0\):

$$\begin{aligned} \vert A_{j,\varepsilon }(t,x) \vert \le 1 \quad \text {in }K_T, \text { for } 2 \le j \le d, \end{aligned}$$
(23)

and

$$\begin{aligned} \vert \psi _\varepsilon (t,x) \vert ^2 \ge \frac{1}{2} \vert f_\varepsilon (x_1-t) \vert ^2 \quad \text {in }K_T \cap \{ (t,x) :0< t < x_1 \}. \end{aligned}$$
(24)

Moreover,

$$\begin{aligned} A_{0,\varepsilon }(t,x) \ge c(Q) \vert \log \varepsilon \vert \quad \text {on any compact set } Q \subset K_T \cap \{ (t,x) :\vert x_1 \vert < t \}, \end{aligned}$$
(25)

for all sufficiently small \(\varepsilon > 0\), and some constant \(c(Q) > 0\) depending only on Q.

Once (25) is obtained, Theorem 2 is proved. The plan is now as follows: first, we prove that (24) implies (25), then we prove (23), and finally we prove (24).

Since (23)–(25) are restricted to the cone K, where the solution depends only on t and \(x_1\), it suffices to prove them for \(x_2=\dots =x_d=0\). For the remainder of this section we therefore restrict to \(x_2=\dots =x_d=0\). The solution then exists for all \(t \ge 0\) and \(x_1 \in \mathbb {R}\), by Theorem 4. To simplify the notation we also write \(x=x_1\).

6.1 Proof that (24) \(\implies \) (25)

Since \((\partial _t^2 - \partial _x^2) A_{0,\varepsilon } = \vert \psi _\varepsilon \vert ^2\) with vanishing initial data, we have by d’Alembert’s formula

$$\begin{aligned} A_{0,\varepsilon }(t,x) = \frac{1}{2} \iint _{K^{(t,x)}} \vert \psi _\varepsilon \vert ^2 \, dy \, ds = \frac{1}{2} \int _0^t \int _{x-t+s}^{x+t-s} \vert \psi _\varepsilon (s,y) \vert ^2 \, dy \, ds, \end{aligned}$$

with notation as in (18). Take \(\vert x \vert< t < T \ll 1\) and restrict the integration to the cone \(K^{(t,x)} \cap \{ (s,y) :s < y \}\). Assuming (24) holds, we thus obtain

$$\begin{aligned} A_{0,\varepsilon }(t,x)&\ge \frac{1}{2} \int _0^{\frac{x+t}{2}} \int _{s}^{x+t-s} \vert \psi _\varepsilon (s,y) \vert ^2 \, dy \, ds \\&\ge \frac{1}{4} \int _0^{\frac{x+t}{2}} \int _{s}^{x+t-s} \frac{1}{\sqrt{\varepsilon ^2+(y-s)^2}} \, dy \, ds \\&\ge \frac{1}{4} \int _0^{\frac{x+t}{2}} \int _{s}^{x+t-s} \frac{1}{\varepsilon +y-s} \, dy \, ds \\&= \frac{x+t}{8} (-\log \varepsilon ) + \frac{1}{8} (\varepsilon +x+t) \left( \log (\varepsilon +x+t) - 1 \right) - \frac{1}{8} \varepsilon (\log \varepsilon -1), \end{aligned}$$

and (25) follows.
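
The evaluation of the last double integral can be checked symbolically; here is a small sympy sketch (the inner y-integral is computed by hand, and the factor \(\frac{1}{4}\) is suppressed):

```python
import sympy as sp

s, eps, a = sp.symbols('s epsilon a', positive=True)  # a stands for x + t

# inner y-integral over (s, a-s): primitive of 1/(eps + y - s) is log(eps + y - s)
inner = sp.log(eps + a - 2 * s) - sp.log(eps)
outer = sp.integrate(inner, (s, 0, a / 2))
claimed = (a / 2) * (-sp.log(eps)) \
    + ((eps + a) * (sp.log(eps + a) - 1) - eps * (sp.log(eps) - 1)) / 2
print(sp.simplify(outer - claimed))   # 0; multiplying by 1/4 gives the text's bound
```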

6.2 Proof of (23)

This is only relevant in dimensions \(d = 2,3\). We give the proof for \(d=2\), and comment on \(d=3\) at the end.

Assume now \(d=2\); the system is then as in (14):

$$\begin{aligned} \left\{ \begin{aligned} (\partial _t + \partial _x)u_\varepsilon&= i (A_{0,\varepsilon } + A_{1,\varepsilon }) u_\varepsilon + A_{2,\varepsilon } v_\varepsilon -iMv_\varepsilon , \\ (\partial _t - \partial _x)v_\varepsilon&= i (A_{0,\varepsilon } - A_{1,\varepsilon }) v_\varepsilon - A_{2,\varepsilon } u_\varepsilon -iMu_\varepsilon , \\ (\partial _t^2-\partial _x^2) A_{0,\varepsilon }&= \vert u_\varepsilon \vert ^2 + \vert v_\varepsilon \vert ^2, \\ (\partial _t^2-\partial _x^2) A_{1,\varepsilon }&= - \vert u_\varepsilon \vert ^2 + \vert v_\varepsilon \vert ^2, \\ (\partial _t^2-\partial _x^2) A_{2,\varepsilon }&= - 2 {{\,\mathrm{Im}\,}}(u_\varepsilon \overline{v_\varepsilon }), \end{aligned} \right. \end{aligned}$$
(26)

with data

$$\begin{aligned} u_\varepsilon (0,x) = \chi f_\varepsilon (x) = \frac{\chi (x)}{(\varepsilon ^2+x^2)^{1/4}}, \qquad v_\varepsilon (0,x) = 0, \end{aligned}$$

and either (21) or (22) (with \(x=x_1\) and \(x_2=0\)). The solution exists globally and is \(C^\infty \), by Theorem 4.

We want to prove (23). This will follow if we can prove that for \(T > 0\) sufficiently small,

$$\begin{aligned} \sup _{(t,x) \in [0,T] \times \mathbb {R}} \vert A_{2,\varepsilon }(t,x) \vert \le 1 \end{aligned}$$
(27)

for all \(\varepsilon > 0\).

By d’Alembert’s formula, since \(a_{2,\varepsilon } = b_{2,\varepsilon } = 0\),

$$\begin{aligned} A_{2,\varepsilon }(t,x) = \iint _{K^{(t,x)}} {{\,\mathrm{Im}\,}}(u_\varepsilon \overline{v_\varepsilon })(s,y) \, dy \, ds, \end{aligned}$$
(28)

where \(K^{(t,x)}\) denotes the backward cone (18).

The idea is now to apply Lemma 1 to the first two equations in (26). But first we need to integrate out the terms involving \((A_{0,\varepsilon } \pm A_{1,\varepsilon })\). Define \(\phi _{+,\varepsilon }\), \(\phi _{-,\varepsilon }\) by

$$\begin{aligned} (\partial _t + \partial _x) \phi _{+,\varepsilon }&= A_{0,\varepsilon }+A_{1,\varepsilon }, \qquad \phi _{+,\varepsilon }(0,x) = 0, \\ (\partial _t - \partial _x) \phi _{-,\varepsilon }&= A_{0,\varepsilon }-A_{1,\varepsilon }, \qquad \phi _{-,\varepsilon }(0,x) = 0, \end{aligned}$$

that is,

$$\begin{aligned} \phi _{+,\varepsilon }(t,x)&= \int _0^t (A_{0,\varepsilon }+A_{1,\varepsilon })(s,x-t+s) \, ds, \\ \phi _{-,\varepsilon }(t,x)&= \int _0^t (A_{0,\varepsilon }-A_{1,\varepsilon })(s,x+t-s) \, ds. \end{aligned}$$

Then from the first two equations in (26) we get

$$\begin{aligned} (\partial _t + \partial _x)(e^{-i\phi _{+,\varepsilon }} u_\varepsilon )&= e^{-i\phi _{+,\varepsilon }}[ A_{2,\varepsilon } v_\varepsilon -iMv_\varepsilon ], \\ (\partial _t - \partial _x)(e^{-i\phi _{-,\varepsilon }} v_\varepsilon )&= e^{-i\phi _{-,\varepsilon }}[- A_{2,\varepsilon } u_\varepsilon -iMu_\varepsilon ], \end{aligned}$$
(29)
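
Here (29) follows from the product rule: \((\partial _t + \partial _x)(e^{-i\phi _{+,\varepsilon }} u_\varepsilon ) = e^{-i\phi _{+,\varepsilon }} \left[ (\partial _t + \partial _x) u_\varepsilon - i (A_{0,\varepsilon } + A_{1,\varepsilon }) u_\varepsilon \right] \), and by the first equation in (26) the bracket equals \(A_{2,\varepsilon } v_\varepsilon - iMv_\varepsilon \); the second line is verified in the same way.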

so by (28) and Lemma 1,

$$\begin{aligned} \left\| A_{2,\varepsilon }(t) \right\| _{L^\infty }&\le \sup _{x \in \mathbb {R}} \iint _{K^{(t,x)}} \vert u_\varepsilon \vert \vert v_\varepsilon \vert \, dy \, ds \\&= \sup _{x \in \mathbb {R}} \iint _{K^{(t,x)}} \vert e^{-i\phi _{+,\varepsilon }} u_\varepsilon \vert \vert e^{-i\phi _{-,\varepsilon }} v_\varepsilon \vert \, dy \, ds \\&\le \left( \left\| \chi f_\varepsilon \right\| _{L^1} + \int _0^t (M + \left\| A_{2,\varepsilon }(s) \right\| _{L^\infty })\left\| v_\varepsilon (s) \right\| _{L^1} \, ds \right) \\&\quad \times \left( \int _0^t (M + \left\| A_{2,\varepsilon }(s) \right\| _{L^\infty })\left\| u_\varepsilon (s) \right\| _{L^1} \, ds \right) . \end{aligned}$$
(30)

To control the \(L^1\) norms of \(u_\varepsilon (t)\) and \(v_\varepsilon (t)\), we use again (29), which implies

$$\begin{aligned} (e^{-i\phi _{+,\varepsilon }} u_\varepsilon )(t,x)&= \chi f_\varepsilon (x-t) + \int _0^t \left( e^{-i\phi _{+,\varepsilon }}[ A_{2,\varepsilon } v_\varepsilon -iMv_\varepsilon ]\right) (s,x-t+s) \, ds, \\ (e^{-i\phi _{-,\varepsilon }} v_\varepsilon )(t,x)&= \int _0^t \left( e^{-i\phi _{-,\varepsilon }}[- A_{2,\varepsilon } u_\varepsilon -iMu_\varepsilon ] \right) (s,x+t-s) \, ds. \end{aligned}$$

Take \(L^1\) norms in x to get

$$\begin{aligned} \left\| u_\varepsilon (t) \right\| _{L^1}&\le \left\| \chi f_\varepsilon \right\| _{L^1} + \int _0^t (M + \left\| A_{2,\varepsilon }(s) \right\| _{L^\infty })\left\| v_\varepsilon (s) \right\| _{L^1} \, ds, \\ \left\| v_\varepsilon (t) \right\| _{L^1}&\le \int _0^t (M + \left\| A_{2,\varepsilon }(s) \right\| _{L^\infty })\left\| u_\varepsilon (s) \right\| _{L^1} \, ds. \end{aligned}$$

Adding these and applying Grönwall’s inequality yields

$$\begin{aligned} \left\| u_\varepsilon (t) \right\| _{L^1} + \left\| v_\varepsilon (t) \right\| _{L^1} \le \left\| \chi f_\varepsilon \right\| _{L^1} e^{\int _0^t (M + \left\| A_{2,\varepsilon }(s) \right\| _{L^\infty }) \, ds}. \end{aligned}$$
(31)
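
(Here, and again in (43) below, Grönwall's inequality is used in the form: if h is continuous and \(h(t) \le a + \int _0^t k(s) h(s) \, ds\) with \(k \ge 0\), then \(h(t) \le a \, e^{\int _0^t k(s) \, ds}\).)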

Observing that

$$\begin{aligned} \left\| \chi f_\varepsilon \right\| _{L^1} \le C := \int _{\mathbb {R}} \frac{\vert \chi (x) \vert }{\vert x \vert ^{1/2}} \, dx < \infty , \end{aligned}$$

and defining the continuous function \(g_\varepsilon :[0,\infty ) \rightarrow [0,\infty )\) by

$$\begin{aligned} g_\varepsilon (t) = \sup _{0 \le s \le t} \left\| A_{2,\varepsilon }(s) \right\| _{L^\infty }, \end{aligned}$$

we conclude from (30) and (31) that

$$\begin{aligned} g_\varepsilon (t) \le C^2\left( 1 + t (M + g_\varepsilon (t)) e^{t (M + g_\varepsilon (t))}\right) \left( t (M + g_\varepsilon (t)) e^{t (M + g_\varepsilon (t))}\right) . \end{aligned}$$
(32)
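
Here we used that \(g_\varepsilon \) is non-decreasing, so that by (31) and the elementary inequality \(e^a - 1 \le a e^a\) for \(a \ge 0\),

$$\begin{aligned} \int _0^t (M + \left\| A_{2,\varepsilon }(s) \right\| _{L^\infty }) \left\| u_\varepsilon (s) \right\| _{L^1} \, ds \le C \int _0^t (M + g_\varepsilon (t)) e^{s(M + g_\varepsilon (t))} \, ds \le C \, t (M + g_\varepsilon (t)) e^{t(M + g_\varepsilon (t))}, \end{aligned}$$

and similarly with \(u_\varepsilon \) replaced by \(v_\varepsilon \).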

We now use a boot-strap argument to show that there exists a \(\delta > 0\), depending only on C and M, such that for \(0 \le t \le \delta \),

$$\begin{aligned} g_\varepsilon (t) \le 1. \end{aligned}$$
(33)

If (33) holds for some \(t > 0\), then by (32) we have

$$\begin{aligned} g_\varepsilon (t) \le C^2 \alpha (t), \end{aligned}$$
(34)

where the increasing function \(\alpha :[0,\infty ) \rightarrow [0,\infty )\) is defined by

$$\begin{aligned} \alpha (t) = \left( 1 + t (M + 1) e^{t (M + 1)}\right) \left( t (M + 1) e^{t (M + 1)}\right) . \end{aligned}$$

Since \(\alpha (0) = 0\), there exists \(\delta > 0\), depending only on M and C, such that

$$\begin{aligned} C^2 \alpha (\delta ) \le \frac{1}{2}. \end{aligned}$$
(35)

By a continuity argument it now follows that (33) holds for all \(t \in [0,\delta ]\). Indeed, since \(g_\varepsilon (0) = 0\), then (33) certainly holds for sufficiently small \(t > 0\). And if (33) holds on some interval \([0,T] \subset [0,\delta ]\), then by (34) and (35) we have in fact \(g_\varepsilon (t) \le 1/2\) on that interval, so (33) holds on a slightly larger interval.
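
For concreteness, a δ satisfying (35) is easily computed numerically for given values of C and M; a minimal sketch (illustration only, with arbitrary sample constants):

```python
import math

def alpha(t, M):
    """The increasing function alpha(t) from the boot-strap argument."""
    e = t * (M + 1) * math.exp(t * (M + 1))
    return (1 + e) * e

def delta(C, M, steps=60):
    """Bisect for (approximately) the largest t with C^2 * alpha(t) <= 1/2."""
    lo, hi = 0.0, 1.0
    while C**2 * alpha(hi, M) <= 0.5:   # ensure the bracket contains the threshold
        hi *= 2.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if C**2 * alpha(mid, M) <= 0.5 else (lo, mid)
    return lo

print(delta(C=2.0, M=1.0))   # a valid delta for these sample constants
```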

This concludes the proof of (23) for \(d=2\). For \(d=3\) the same proof goes through with some obvious changes. Indeed, the system (16) has essentially the same structure as (14), and in particular the equations for \(A_2, A_3\) have null forms on the right-hand side. Thus, we obtain

$$\begin{aligned} \sup _{(t,x) \in [0,T] \times \mathbb {R}} \left( \vert A_{2,\varepsilon }(t,x) \vert + \vert A_{3,\varepsilon }(t,x) \vert \right) \le 1 \end{aligned}$$
(36)

for \(T > 0\) sufficiently small.

6.3 Proof of (24)

Since we restrict to \(x_2=\dots =x_d=0\) and write \(x=x_1\), (24) reduces to the claim that, for \(T > 0\) sufficiently small,

$$\begin{aligned} \vert u_\varepsilon (t,x) \vert ^2 \ge \frac{1}{2} \vert f_\varepsilon (x-t) \vert ^2 \quad \text {for }0< t< x< 1-t \text { and }t < T. \end{aligned}$$
(37)

We do the proof for \(d=2\), and comment on \(d=1\) and \(d=3\) at the end.

Assuming \(d=2\), we use (15). Thus,

$$\begin{aligned} (\partial _t + \partial _x) \vert u_\varepsilon \vert ^2&= F_\varepsilon := 2A_{2,\varepsilon } {{\,\mathrm{Re}\,}}(u_\varepsilon \overline{v_\varepsilon }) - 2M {{\,\mathrm{Im}\,}}(u_\varepsilon \overline{v_\varepsilon }), \\ (\partial _t - \partial _x) \vert v_\varepsilon \vert ^2&= G_\varepsilon := - 2A_{2,\varepsilon } {{\,\mathrm{Re}\,}}(u_\varepsilon \overline{v_\varepsilon }) + 2M {{\,\mathrm{Im}\,}}(u_\varepsilon \overline{v_\varepsilon }), \end{aligned}$$

and therefore

$$\begin{aligned} \vert u_\varepsilon (t,x) \vert ^2&= \vert \chi f_\varepsilon (x-t) \vert ^2 + \int _0^t F_\varepsilon (s,x-t+s) \, ds, \end{aligned}$$
(38)
$$\begin{aligned} \vert v_\varepsilon (t,x) \vert ^2&= \int _0^t G_\varepsilon (s,x+t-s) \, ds. \end{aligned}$$
(39)

By (27) and the elementary inequality \(2\vert u_\varepsilon \vert \vert v_\varepsilon \vert \le \vert u_\varepsilon \vert ^2+\vert v_\varepsilon \vert ^2\), for \(T > 0\) sufficiently small we have

$$\begin{aligned} \vert F_\varepsilon \vert ,\vert G_\varepsilon \vert \le (M+1)(\vert u_\varepsilon \vert ^2+\vert v_\varepsilon \vert ^2) \quad \text {in }[0,T] \times \mathbb {R}, \end{aligned}$$
(40)

hence

$$\begin{aligned} \vert u_\varepsilon (t,x) \vert ^2&\le \vert \chi f_\varepsilon (x-t) \vert ^2 + (M+1)\int _0^t (\vert u_\varepsilon \vert ^2+\vert v_\varepsilon \vert ^2)(s,x-t+s) \, ds, \end{aligned}$$
(41)
$$\begin{aligned} \vert v_\varepsilon (t,x) \vert ^2&\le (M+1) \int _0^t (\vert u_\varepsilon \vert ^2+\vert v_\varepsilon \vert ^2)(s,x+t-s) \, ds \end{aligned}$$
(42)

for \(t \in [0,T]\), \(x \in \mathbb {R}\).

The idea is now to apply a boot-strap argument. For \(\rho \in (0,1-2T)\), define

$$\begin{aligned} B_{\rho ,\varepsilon }(s) = \sup _{\rho + s \le y \le 1-s} \left( \vert u_\varepsilon (s,y) \vert ^2+\vert v_\varepsilon (s,y) \vert ^2 \right) \qquad (0 \le s \le T). \end{aligned}$$

If \(\rho + t \le x \le 1-t\), the integrands in (41), (42) are bounded by \(B_{\rho ,\varepsilon }(s)\), so

$$\begin{aligned} \vert u_\varepsilon (t,x) \vert ^2 + \vert v_\varepsilon (t,x) \vert ^2 \le \frac{1}{\sqrt{\varepsilon ^2 + (x-t)^2}} + 2(M+1)\int _0^t B_{\rho ,\varepsilon }(s) \, ds. \end{aligned}$$

Taking the supremum over \(x \in [\rho +t,1-t]\) gives

$$\begin{aligned} B_{\rho ,\varepsilon }(t) \le \frac{1}{\sqrt{\varepsilon ^2 + \rho ^2}} + 2(M+1)\int _0^t B_{\rho ,\varepsilon }(s) \, ds. \end{aligned}$$

By Grönwall’s inequality we conclude that

$$\begin{aligned} B_{\rho ,\varepsilon }(t) \le \frac{1}{\sqrt{\varepsilon ^2 + \rho ^2}} e^{2(M+1)t} \le \frac{3}{\sqrt{\varepsilon ^2 + \rho ^2}} \quad \text {for }t \in [0,T], \end{aligned}$$
(43)

assuming \(T > 0\) is so small that \(2(M+1)T < 1\).

Combining (43), (38) and (40), we obtain, for \(\rho > 0\), \(x \in [\rho +t,1-t]\) and \(t \le T\),

$$\begin{aligned} \vert u_\varepsilon (t,x) \vert ^2&\ge \vert \chi f_\varepsilon (x-t) \vert ^2 - (M+1)\int _0^t (\vert u_\varepsilon \vert ^2+\vert v_\varepsilon \vert ^2)(s,x-t+s) \, ds \\&\ge \frac{1}{\sqrt{\varepsilon ^2 + (x-t)^2}} - (M+1) \int _0^t B_{\rho ,\varepsilon }(s) \, ds \\&\ge \frac{1}{\sqrt{\varepsilon ^2 + (x-t)^2}} - \frac{3(M+1)t}{\sqrt{\varepsilon ^2 + \rho ^2}}, \end{aligned}$$

where we also used the fact that \(\chi =1\) on \([-1,1]\). Choosing \(\rho = x-t\) and assuming \(T > 0\) so small that \(6(M+1)T < 1\), we obtain the claimed inequality (37).

This completes the proof of (24) for \(d=2\). The proof for \(d=1,3\) works out the same way, but instead of (15) we use either (13) or (17), and in the case \(d=3\) we use (36) instead of (27).

6.4 Proof of Theorem 3

Define \(\psi _0^{\mathrm {bad}} \in X_0 \setminus L^2(\mathbb {R}^d)\) as in Sect. 5. Assume there exist (i) a neighbourhood \(\Omega _0\) of \(\psi _0^{\mathrm {bad}}\) in \(X_0\), and (ii) continuous maps

$$\begin{aligned} I_\mu , J_\mu :\Omega _0 \longrightarrow \mathcal D'(B), \end{aligned}$$

such that if \(\psi _0 \in \Omega _0 \cap C_c^\infty (\mathbb {R}^d)\), then defining

$$\begin{aligned} a_\mu = I_\mu [\psi _0], \qquad b_\mu = J_\mu [\psi _0] \qquad (\mu =0,\dots ,d), \end{aligned}$$

the constraint equations (5) are satisfied in B.

We will show that these assumptions lead to a contradiction. Define \(\psi _{0,\varepsilon } \in C_c^\infty (\mathbb {R}^d)\) as in Sect. 5. Then \(\psi _{0,\varepsilon } \rightarrow \psi _0^{\mathrm {bad}}\) in \(X_0\) as \(\varepsilon \rightarrow 0\), so \(\psi _{0,\varepsilon }\) belongs to \(\Omega _0\) for all \(\varepsilon > 0\) small enough, and we may define

$$\begin{aligned} a_{\mu ,\varepsilon } = I_\mu [\psi _{0,\varepsilon }], \qquad b_{\mu ,\varepsilon } = J_\mu [\psi _{0,\varepsilon }] \qquad (\mu =0,\dots ,d). \end{aligned}$$

By assumption, these fields satisfy the constraints (5) in B, so in particular

$$\begin{aligned} \sum _{j=1}^d \partial _j (\partial _j a_{0,\varepsilon } - b_{j,\varepsilon }) = \vert \psi _{0,\varepsilon } \vert ^2 \quad \text {in }B. \end{aligned}$$

By the assumed continuity of the maps \(I_\mu , J_\mu \), the left hand side must converge in \(\mathcal D'(B)\) as \(\varepsilon \rightarrow 0\). But the right hand side equals

$$\begin{aligned} \vert \psi _{0,\varepsilon }(x) \vert ^2 = \frac{1}{\sqrt{\varepsilon ^2 + x_1^2}} \quad \text {for }x \in B, \end{aligned}$$

and this function does not have a limit in the sense of distributions on B as \(\varepsilon \rightarrow 0\): its integral against a nonnegative test function which equals 1 near a point of the hyperplane \(\{x_1 = 0\}\) diverges like \(\vert \log \varepsilon \vert \). This contradiction completes the proof.

7 Proof of well-posedness

In this section we prove Theorem 4. To ease the notation we write \(x=x_1\) throughout. To prove local existence we use an iteration and rely only on the following elementary estimates.

7.1 Linear estimates

Firstly, for the Dirac equation

$$\begin{aligned} (- i\gamma ^0 \partial _t - i\gamma ^1 \partial _x + M) \psi = F(t,x), \qquad \psi (0,x) = \psi _0(x), \end{aligned}$$

we shall use the energy inequality, for \(t > 0\),

$$\begin{aligned} \left\| \psi (t) \right\| _{L^2(\mathbb {R})} \le \left\| \psi _0 \right\| _{L^2(\mathbb {R})} + \int _0^t \left\| F(s) \right\| _{L^2(\mathbb {R})} \, ds. \end{aligned}$$
(44)

This is proved as follows. By approximation, we may assume that \(\psi _0\) and F are smooth and compactly supported in x. Premultiplying the equation by \(i\overline{\psi }=i\psi ^*\gamma ^0\) and taking real parts yields \(\partial _t \rho + \partial _x j = 2{{\,\mathrm{Re}\,}}(i\psi ^*\gamma ^0 F)\), where \(\rho =\psi ^*\psi \) and \(j=\psi ^*\gamma ^0\gamma ^1\psi \). Integration in x gives

$$\begin{aligned} \frac{d}{dt} \int _{\mathbb {R}} \vert \psi \vert ^2 \, dx = 2{{\,\mathrm{Re}\,}}\int _{\mathbb {R}} i\psi ^* \gamma ^0 F \, dx \le 2 \left\| \psi (t) \right\| _{L^2} \left\| F(t) \right\| _{L^2}, \end{aligned}$$

which implies (44).

Secondly, for the wave equation

$$\begin{aligned} \square u = G(t,x), \qquad u(0,x) = f(x), \qquad \partial _t u(0,x) = g(x), \end{aligned}$$

we shall use the estimates, for \(t > 0\),

$$\begin{aligned} \left\| u(t) \right\| _{L^\infty (\mathbb {R})}&\le \left\| f \right\| _{L^\infty (\mathbb {R})} + \left\| g \right\| _{L^1(\mathbb {R})} + \int _0^t \left\| G(s) \right\| _{L^1(\mathbb {R})} \, ds, \end{aligned}$$
(45)
$$\begin{aligned} \left\| \partial _x u(t) \right\| _{L^1(\mathbb {R})}&\le \left\| f' \right\| _{L^1(\mathbb {R})} + \left\| g \right\| _{L^1(\mathbb {R})} + \int _0^t \left\| G(s) \right\| _{L^1(\mathbb {R})} \, ds, \end{aligned}$$
(46)
$$\begin{aligned} \left\| \partial _t u(t) \right\| _{L^1(\mathbb {R})}&\le \left\| f' \right\| _{L^1(\mathbb {R})} + \left\| g \right\| _{L^1(\mathbb {R})} + \int _0^t \left\| G(s) \right\| _{L^1(\mathbb {R})} \, ds, \end{aligned}$$
(47)

which are immediate from d’Alembert’s formula,

$$\begin{aligned} u(t,x) = \frac{f(x+t) + f(x-t)}{2} + \frac{1}{2} \int _{x-t}^{x+t} g(y) \, dy + \frac{1}{2} \int _0^t \int _{x-(t-s)}^{x+t-s} G(s,y) \, dy \, ds. \end{aligned}$$

Adding (45)–(47) gives

$$\begin{aligned} \left\| u(t) \right\| _{AC} + \left\| \partial _t u(t) \right\| _{L^1} \le 3 \left( \left\| f \right\| _{AC(\mathbb {R})} + \left\| g \right\| _{L^1} + \int _0^t \left\| G(s) \right\| _{L^1} \, ds \right) . \end{aligned}$$
(48)
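
The estimates (45)–(48) are easy to test numerically from d'Alembert's formula; a rough quadrature sketch of (45), with arbitrary sample data f, g, G:

```python
import numpy as np

f = lambda x: np.exp(-x**2)
g = lambda x: np.sin(x) * np.exp(-np.abs(x))
G = lambda t, x: np.cos(3 * t) * np.exp(-x**2)

dx = dt = 1e-2
x = np.arange(-20.0, 20.0, dx)
prim_g = np.cumsum(g(x)) * dx                   # primitive of g on the grid

def u(t):
    """d'Alembert solution at time t, evaluated on the grid x."""
    val = 0.5 * (f(x + t) + f(x - t))
    val += 0.5 * (np.interp(x + t, x, prim_g) - np.interp(x - t, x, prim_g))
    for s in np.arange(0.0, t, dt):             # Duhamel term
        prim_G = np.cumsum(G(s, x)) * dx
        val += 0.5 * dt * (np.interp(x + t - s, x, prim_G)
                           - np.interp(x - (t - s), x, prim_G))
    return val

T = 1.0
lhs = np.max(np.abs(u(T)))
rhs = (np.max(np.abs(f(x))) + np.sum(np.abs(g(x))) * dx
       + sum(np.sum(np.abs(G(s, x))) * dx * dt for s in np.arange(0.0, T, dt)))
print(f"{lhs:.4f} <= {rhs:.4f}")
```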

7.2 The local result

With the above linear estimates, it is now an easy matter to prove the local well-posedness of (10) by iteration, for data with the regularity \(\psi _0 \in L^2(\mathbb {R})\), \(a_\mu \in AC(\mathbb {R})\) and \(b_\mu \in L^1(\mathbb {R})\). Indeed, applying the energy inequality (44) to the Dirac equation in (10), we use the trivial bilinear estimate

$$\begin{aligned} \int _0^T \left\| A_\mu \gamma ^\mu \psi (s) \right\| _{L^2} \, ds \le CT \left\| A \right\| _{C([0,T];L^\infty )} \left\| \psi \right\| _{C([0,T];L^2)}, \end{aligned}$$

where \(C([0,T];L^p)\) is equipped with the sup norm in time. Moreover, applying (48) to the wave equations in (10) we use the equally trivial bilinear bound

$$\begin{aligned} \int _0^T \left\| \psi ^* \gamma ^0 \gamma ^\mu \psi (s) \right\| _{L^1} \, ds \le CT \left\| \psi \right\| _{C([0,T];L^2)}^2. \end{aligned}$$
(49)

By a standard contraction argument, which we do not repeat here, one now obtains local well-posedness with a time of existence \(T > 0\) determined by

$$\begin{aligned} CT\left( \left\| \psi _0 \right\| _{L^2} + \sum _{\mu =0}^d \left\| a_\mu \right\| _{AC} + \sum _{\mu =0}^d \left\| b_\mu \right\| _{L^1} \right) \le 1, \end{aligned}$$

where C is a universal constant. This proves Theorem 4 for such T. Next, we show that the results extend globally.

7.3 The global result

To extend the local result globally in time, it suffices to obtain an a priori bound on the solution \((\psi ,A,\partial _t A)(t)\) in \(L^2(\mathbb {R}) \times AC(\mathbb {R}) \times L^1(\mathbb {R})\). For \(\psi \), this bound is directly provided by the conservation of charge, (11). The latter also provides the necessary bound for \((A,\partial _t A)\), via the linear estimate (48) and the bilinear estimate (49). This concludes the proof of Theorem 4.