1 Introduction

The theory of causal fermion systems is a recent approach to fundamental physics (see the basics in Sect. 2, the reviews [11, 12, 16], the textbook [10] or the website [1]). In this approach, spacetime and all objects therein are described by a measure \(\rho \) on a set \({\mathcal{F}}\) of linear operators of rank at most 2n on a Hilbert space \(({\mathcal{H}}, \langle .|. \rangle _{{\mathcal{H}}})\). The physical equations are formulated via the so-called causal action principle, a nonlinear variational principle where an action \({\mathcal{S}}\) is minimized under variations of the measure \(\rho \). If the Hilbert space \({\mathcal{H}}\) is finite-dimensional, the set \({\mathcal{F}}\) is a locally compact topological space. Making essential use of this fact, it was shown in [9] that the causal action principle is well defined and that minimizers exist. Moreover, as is worked out in detail in [15], the interior of \({\mathcal{F}}\) (consisting of the so-called regular points; see Definition 3.1) has a smooth manifold structure. Taking these structures as the starting point, causal variational principles were formulated and studied as a mathematical generalization of the causal action principle, where an action of the form

$$\begin{aligned} {\mathcal{S}}= \int _{{\mathcal{F}}} \mathrm{d}\rho (x) \int _{{\mathcal{F}}} \mathrm{d}\rho (y)\, {\mathcal{L}}(x,y) \end{aligned}$$

is minimized for a given lower-semicontinuous Lagrangian \({\mathcal{L}}: {\mathcal{F}}\times {\mathcal{F}}\rightarrow {\mathbb{R}}^+_0\) on an (in general non-compact) manifold \({\mathcal{F}}\) under variations of \(\rho \) within the class of regular Borel measures, keeping the total volume \(\rho ({\mathcal{F}})\) fixed. We refer the reader interested in causal variational principles to [19, Sections 1 and 2] and the references therein.

This article is devoted to the case that the Hilbert space \({\mathcal{H}}\) is infinite-dimensional and separable. While the finite-dimensional setting seems suitable for describing physical spacetime on a fundamental level (where spacetime can be thought of as being discrete on a microscopic length scale usually associated with the Planck length), an infinite-dimensional Hilbert space arises in mathematical extrapolations where spacetime is continuous and has infinite volume. Most notably, infinite-dimensional Hilbert spaces come up in the examples of causal fermion systems describing Minkowski space (see [10, Section 1.2] or [26]) or a globally hyperbolic Lorentzian manifold (see for example [11]), and they are also needed for analyzing the limiting case of a classical interaction (the so-called continuum limit; see [10, Section 1.5.2 and Chapters 3-5]). A workaround to avoid infinite-dimensional analysis is to restrict attention to locally compact variations, as is done in [14, Section 2.3]. Nevertheless, in view of the importance of the examples and physical applications, it is a task of growing significance to analyze causal fermion systems systematically in the infinite-dimensional setting. It is the objective of this paper to put this analysis on a sound mathematical basis.

We now outline the main points of our constructions and explain our main results. Extending methods and results in [15] to the infinite-dimensional setting, we endow the set of all regular points of \({\mathcal{F}}\) with the structure of a Banach manifold (see Definition 3.1 and Theorem 3.4). To this end, we construct an atlas formed of so-called symmetric wave charts (see Definition 3.3). We also show that the Hilbert–Schmidt norm on finite-rank operators on \({\mathcal{H}}\) gives rise to a Fréchet-smooth Riemannian metric on this Banach manifold. More precisely, in Theorems 3.11 and 3.12, we prove that \({\mathcal{F}}^{\mathrm{reg}}\) is a smooth Banach submanifold of the Hilbert space \({\mathscr{S}}({\mathcal{H}})\) of selfadjoint Hilbert–Schmidt operators, with the Riemannian metric given by

$$\begin{aligned} g_x \, :\, T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} \times T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} \rightarrow {\mathbb{R}},\qquad g_x(A,B) := {{\,\mathrm{tr}\,}}(AB) . \end{aligned}$$

In order to introduce higher derivatives at a regular point \(p \in {\mathcal{F}}\), our strategy is to always work in the distinguished symmetric wave chart around this point. This has the advantage that we can avoid the analysis of differentiability properties under coordinate transformations. The remaining difficulty is that the causal Lagrangian \({\mathcal{L}}\) and other derived functions are not differentiable. Instead, directional derivatives exist only in certain directions. In general, these directions do not form a vector space. As a consequence, the derivative is not a linear mapping, and the usual product and chain rules cease to hold. On the other hand, these computation rules are needed in the applications, and it is often sensible to assume that they do hold. This motivates our strategy of looking for a vector space on which the function under consideration is differentiable. Clearly, in this way, we lose information on the differentiability in certain directions which do not lie in such a vector space. But this shortcoming is outweighed by the benefit that we can avoid the subtleties of non-smooth analysis, which, at least for most applications we have in mind, would be impractical and inappropriately technical. Naturally, we want the subspace to be as large as possible, and moreover, it should be defined canonically without making any arbitrary choices. These requirements lead us to the notion of expedient subspaces (see Definition 4.2). In general, the expedient subspace is neither dense nor closed. On these expedient subspaces, the function is Gâteaux differentiable, the derivative is a linear mapping, and higher derivatives are multilinear.

The differential calculus on expedient subspaces is compatible with the chain rule in the following sense: If f is locally Hölder continuous and \(\gamma \) is a smooth curve whose derivatives up to sufficiently high order lie in the expedient differentiable subspace of f, then the composition \(f \circ \gamma \) is differentiable and the chain rule holds (see Proposition 4.4), i.e.,

$$\begin{aligned} (f\circ \gamma )'(t_0) = D^{{\mathcal{E}}}f|_{x_0}\, \gamma '(t_0) , \end{aligned}$$

where the index \({\mathcal{E}}\) denotes the derivative on the expedient subspace. We also prove a chain rule for higher derivatives (see Proposition 4.5). The requirement of Hölder continuity is a crucial assumption needed in order to control the error term of the linearization. The most general statement is Theorem 5.8 where Hölder continuity is required only on a subspace which contains the curve \(\gamma \) locally.

We also work out how the differential calculus on expedient subspaces applies to the setting of causal fermion systems. In order to establish the chain rule, we prove that the causal Lagrangian is indeed locally Hölder continuous with uniform Hölder exponent (Theorem 5.1), and we analyze how the Hölder constant depends on the base point (Theorem 5.3). Moreover, we prove that for all \(x,y \in {\mathcal{F}}\), there is a neighborhood \(U\subseteq {\mathcal{F}}\) of y with (see (5.9))

$$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) | \le c(n, y) \Vert x\Vert ^2\, \Vert \tilde{y}-y\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{y}\in U \end{aligned}$$

(where 2n is the maximal rank of the operators in \({\mathcal{F}}\)). Relying on these results, we can generalize the jet formalism as introduced in [17] for causal variational principles to the infinite-dimensional setting (Sect. 5.2). We also work out the chain rule for the Lagrangian (Theorem 5.6) and for the function \(\ell \) obtained by integrating one of the arguments of the Lagrangian (Theorem 5.9),

$$\begin{aligned} \ell (x) = \int _M {\mathcal{L}}(x,y)\, \mathrm{d}\rho (y) - \mathfrak {s}\end{aligned}$$
(1.1)

(where \(\mathfrak {s}\) is a positive constant).

The paper is organized as follows. Section 2 provides the necessary preliminaries on causal fermion systems and infinite-dimensional analysis. In Sect. 3, an atlas of symmetric wave charts is constructed, and it is shown that this atlas endows the regular points of \({\mathcal{F}}\) with the structure of a Fréchet-smooth Banach manifold. Moreover, it is shown that the Hilbert–Schmidt norm induces a Fréchet-smooth Riemannian metric. In Sect. 4, the differential calculus on expedient subspaces is developed. In Sect. 5, this differential calculus is applied to causal fermion systems. Appendix 1 gives some more background information on the Fréchet derivative. Finally, Appendix 2 provides details on what the Riemannian metric looks like in different charts.

We finally point out that in order to address a coherent readership, concrete applications of our methods and results for example to physical spacetimes have not been included here. The example of causal fermion systems in Minkowski space will be worked out separately in [25].

2 Preliminaries

2.1 Causal fermion systems and the causal action principle

We now recall the basic definitions of a causal fermion system and the causal action principle.

Definition 2.1

(causal fermion system) Given a separable complex Hilbert space \({\mathcal{H}}\) with scalar product \(\langle .|. \rangle _{{\mathcal{H}}}\) and a parameter \(n \in {\mathbb{N}}\) (the “spin dimension”), we let \({\mathcal{F}}\subseteq \mathrm{L}({\mathcal{H}})\) be the set of all selfadjoint operators on \({\mathcal{H}}\) of finite rank, which (counting multiplicities) have at most n positive and at most n negative eigenvalues. On \({\mathcal{F}}\), we are given a positive measure \(\rho \) (defined on a \(\sigma \)-algebra of subsets of \({\mathcal{F}}\)), the so-called universal measure. We refer to \(({\mathcal{H}}, {\mathcal{F}}, \rho )\) as a causal fermion system.

A causal fermion system describes a spacetime together with all structures and objects therein. In order to single out the physically admissible causal fermion systems, one must formulate physical equations. To this end, we impose that the universal measure should be a minimizer of the causal action principle, which we now introduce.

For any \(x, y \in {\mathcal{F}}\), the product xy is an operator of rank at most 2n. However, in general, it is no longer a selfadjoint operator because \((xy)^* = yx\), and this is different from xy unless x and y commute. As a consequence, the eigenvalues of the operator xy are in general complex. We denote these eigenvalues counting algebraic multiplicities by \(\lambda ^{xy}_1, \ldots , \lambda ^{xy}_{2n} \in {\mathbb{C}}\) (more specifically, denoting the rank of xy by \(k \le 2n\), we choose \(\lambda ^{xy}_1, \ldots , \lambda ^{xy}_{k}\) as all the nonzero eigenvalues and set \(\lambda ^{xy}_{k+1}, \ldots , \lambda ^{xy}_{2n}=0\)). We introduce the Lagrangian and the causal action by

$$\begin{aligned} {\text{Lagrangian:}} \qquad {\mathcal{L}}(x,y)&= \frac{1}{4n} \sum _{i,j=1}^{2n} \left( \left|\lambda ^{xy}_i \right| - \left|\lambda ^{xy}_j \right| \right)^2 \end{aligned}$$
(2.1)
$$\begin{aligned} {\text{causal action:}} \qquad {\mathcal{S}}(\rho )&= \iint _{{\mathcal{F}}\times {\mathcal{F}}} {\mathcal{L}}(x,y)\, d\rho (x)\, d\rho (y) . \end{aligned}$$
(2.2)
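
As a minimal illustration (our own worked example, not taken from the cited references), consider the smallest spin dimension \(n=1\), where the product xy has at most two nonzero eigenvalues. In the double sum in (2.1), the diagonal terms vanish and the two off-diagonal terms coincide, so that

$$\begin{aligned} {\mathcal{L}}(x,y) = \frac{1}{4} \sum _{i,j=1}^{2} \left( \left|\lambda ^{xy}_i \right| - \left|\lambda ^{xy}_j \right| \right)^2 = \frac{1}{2} \left( \left|\lambda ^{xy}_1 \right| - \left|\lambda ^{xy}_2 \right| \right)^2 . \end{aligned}$$

Hence, in this case, the Lagrangian vanishes precisely when the two eigenvalues have the same absolute value (for instance, if they form a complex conjugate pair), whereas for a product of rank one (i.e., \(\lambda ^{xy}_2 = 0\)) one obtains \({\mathcal{L}}(x,y) = |\lambda ^{xy}_1|^2/2\).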

The causal action principle is to minimize \({\mathcal{S}}\) by varying the measure \(\rho \) under the following constraints:

$$\begin{aligned}&{\text{volume constraint:}} \qquad \rho ({\mathcal{F}}) = \text{const} \quad \;\; \end{aligned}$$
(2.3)
$$\begin{aligned}&{\text{trace constraint:}} \qquad \int _{{\mathcal{F}}} {{\,\mathrm{tr}\,}}(x)\, \mathrm{d}\rho (x) = \text{const} \end{aligned}$$
(2.4)
$$\begin{aligned}&{\text{boundedness constraint:}} \qquad \iint _{{\mathcal{F}}\times {\mathcal{F}}} |xy|^2 \, d\rho (x)\, d\rho (y) \le C , \end{aligned}$$
(2.5)

where C is a given parameter, \({{\,\mathrm{tr}\,}}\) denotes the trace of a linear operator on \({\mathcal{H}}\), and the absolute value of xy is the so-called spectral weight,

$$\begin{aligned} |xy| := \sum _{j=1}^{2n} \left|\lambda ^{xy}_j \right| . \end{aligned}$$

This variational principle is mathematically well posed if \({\mathcal{H}}\) is finite-dimensional. For the existence theory and the analysis of general properties of minimizing measures, we refer to [3, 8, 9]. In the existence theory, one varies in the class of regular Borel measures (with respect to the topology on \(\mathrm{L}({\mathcal{H}})\) induced by the operator norm), and the minimizing measure is again in this class. With this in mind, here, we always assume that

$$\begin{aligned} \rho \ \text{is a regular Borel measure}. \end{aligned}$$

Let \(\rho \) be a minimizing measure. Spacetime is defined as the support of this measure,

$$\begin{aligned} M := {{\,\mathrm{supp}\,}}\rho . \end{aligned}$$

Thus, the spacetime points are selfadjoint linear operators on \({\mathcal{H}}\). These operators contain a lot of additional information which, if interpreted correctly, gives rise to spacetime structures like causal and metric structures, spinors and interacting fields. We refer the interested reader to [10, Chapter 1].

The only results on the structure of minimizing measures which will be needed here concern the treatment of the trace constraint and the boundedness constraint. As a consequence of the trace constraint, for any minimizing measure \(\rho \), the local trace is constant in spacetime, i.e., there is a real constant \(c \ne 0\) such that (see [10, Proposition 1.4.1])

$$\begin{aligned} {{\,\mathrm{tr}\,}}x = c \qquad \text{for all}\,x \in M . \end{aligned}$$

Restricting attention to operators with fixed trace, the trace constraint (2.4) is equivalent to the volume constraint (2.3) and may be disregarded. The boundedness constraint, on the other hand, can be treated with a Lagrange multiplier. Indeed, as is made precise in [3, Theorem 1.3], for every minimizing measure \(\rho \), there is a Lagrange multiplier \(\kappa >0\) such that \(\rho \) is a local minimizer of the causal action with the Lagrangian replaced by

$$\begin{aligned} {\mathcal{L}}_{\kappa }(x,y) := {\mathcal{L}}(x,y) + \kappa \, |xy|^2 , \end{aligned}$$

leaving out the boundedness constraint.

2.2 Fréchet and Gâteaux derivatives

We now recall a few basic concepts from the differential calculus on normed vector spaces. In what follows, we let \((E, \Vert .\Vert _E)\) and \((F, \Vert .\Vert _F)\) be real normed vector spaces. The most common concept is that of the Fréchet derivative.

Definition 2.2

Let \(U \subseteq E\) be open and \(f : U \rightarrow F\) be an F-valued function on U. The function f is Fréchet-differentiable in \(x_0 \in U\) if there is a bounded linear mapping \(A \in \mathrm{L}(E, F)\) such that

$$\begin{aligned} f(x) = f(x_0) + A\, (x-x_0) + r(x) , \end{aligned}$$

where the error term \(r : U \rightarrow F\) goes to zero faster than linearly, i.e.,

$$\begin{aligned} \lim _{x \rightarrow x_0, x \ne x_0} \frac{\Vert r(x)\Vert _F}{\Vert x-x_0\Vert _E} = 0 . \end{aligned}$$

The linear operator A is the Fréchet derivative, also denoted by \(Df|_{x_0}\). A function is Fréchet-differentiable in U if it is Fréchet-differentiable at every point of U.

The Fréchet derivative is uniquely defined. Moreover, the concept can be iterated to define higher derivatives. Indeed, if f is differentiable in U, its derivative Df is a mapping

$$\begin{aligned} Df \, :\, U \rightarrow \mathrm{L}(E,F) . \end{aligned}$$

Since \(\mathrm{L}(E,F)\) is a normed vector space (with the operator norm), we can apply Definition 2.2 once again to define the second derivative at a point \(x_0\) by

$$\begin{aligned} D^2f|_{x_0} = D\left( Df \right) \big|_{x_0} \;\in \; \mathrm{L}\left( E, \mathrm{L}(E,F) \right) . \end{aligned}$$

The second derivative can also be viewed as a bilinear mapping from \(E \times E\) to F,

$$\begin{aligned} D^2f|_{x_0} : E \times E \rightarrow F,\qquad D^2f|_{x_0}(u,v) := \left( D\left( Df \right) \big|_{x_0} u \right) v . \end{aligned}$$

It is by definition bounded, meaning that there is a constant \(c>0\) such that

$$\begin{aligned} \big \Vert D^2f|_{x_0}(u,v) \big \Vert _F \le c \, \Vert u\Vert _E\, \Vert v\Vert _E \qquad \text{for all}\,u, v \in E. \end{aligned}$$

By iteration, one obtains similarly the Fréchet derivatives of order \(p \in {\mathbb{N}}\) as multilinear operators

$$\begin{aligned} D^pf|_{x_0} : \underbrace{E \times \cdots \times E}_{p\ \mathrm{factors}} \rightarrow F . \end{aligned}$$

A function is Fréchet-smooth on U if it is Fréchet-differentiable to every order.
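
As a simple illustration of these notions (a standard example, stated here under the additional assumptions that E is a real Hilbert space with scalar product \(\langle .|. \rangle _E\) and that \(F={\mathbb{R}}\)), consider the function \(f(x) = \Vert x\Vert _E^2\). Expanding the scalar product gives

$$\begin{aligned} f(x) = f(x_0) + 2\, \langle x_0 \,|\, x-x_0 \rangle _E + \Vert x-x_0\Vert _E^2 , \end{aligned}$$

so that \(Df|_{x_0} u = 2\, \langle x_0 | u \rangle _E\) and \(r(x) = \Vert x-x_0\Vert _E^2\), which indeed goes to zero faster than linearly. Since the mapping \(x_0 \mapsto Df|_{x_0}\) is itself bounded and linear, differentiating once more yields

$$\begin{aligned} D^2f|_{x_0}(u,v) = 2\, \langle u \,|\, v \rangle _E , \end{aligned}$$

which by the Schwarz inequality is bounded with constant \(c=2\). All derivatives of order three and higher vanish, so that f is Fréchet-smooth.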

Lemma 2.3

If the function \(f : U \subseteq E \rightarrow F\) is p times Fréchet-differentiable in \(x_0 \in U\), then its \(p^{\mathrm{th}}\) Fréchet derivative is symmetric, i.e., for any \(u_1, \ldots , u_p \in E\) and any permutation \(\sigma \in {{\mathcal{S}}}_p\),

$$\begin{aligned} D^pf|_{x_0}\left(u_1, \ldots , u_p \right) = D^pf|_{x_0} \left(u_{\sigma (1)}, \ldots , u_{\sigma (p)} \right) . \end{aligned}$$

We omit the proof, which can be found for example in [5, Section 4.4]. For the Fréchet derivative, most concepts familiar from the finite-dimensional setting carry over immediately. In particular, the composition of Fréchet-differentiable functions is again Fréchet-differentiable. Moreover, the chain and product rules hold. We refer for the details to [5, Sections 2.2 and 2.3] and [6, Chapter 8] and Appendix 1.

A weaker concept of differentiability which we will use here is Gâteaux differentiability.

Definition 2.4

Let \(U \subseteq E\) be open and \(f : U \rightarrow F\) be an F-valued function on U. The function f is Gâteaux differentiable in \(x_0 \in U\) in the direction \(u \in E\) if the limit of the difference quotient exists,

$$\begin{aligned} d_u f(x_0) := \lim _{h \rightarrow 0, h \ne 0} \frac{f(x_0+h u) - f(x_0)}{h} . \end{aligned}$$

The resulting vector \(d_u f(x_0) \in F\) is the Gâteaux derivative.

By definition, the Gâteaux derivative is homogeneous of degree one, i.e.,

$$\begin{aligned} d_{\lambda u} f(x_0) = \lambda \, d_u f(x_0) \qquad \text{for all}\,\lambda \in {\mathbb{R}}. \end{aligned}$$

Moreover, if f is Fréchet-differentiable in \(x_0\), then it is also Gâteaux differentiable in any direction \(u \in E\) and

$$\begin{aligned} d_u f(x_0) = Df|_{x_0} u . \end{aligned}$$

However, the converse is not true because, even if the Gâteaux derivatives exist for any \(u \in E\), it is in general not possible to represent them by a bounded linear operator. As a consequence, the chain and product rules in general do not hold for Gâteaux derivatives. We shall come back to this issue in Sect. 5.
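
The failure of linearity can already be seen in a standard finite-dimensional example (included here only as an illustration; it is not specific to causal fermion systems). Let \(E={\mathbb{R}}^2\), \(F={\mathbb{R}}\) and

$$\begin{aligned} f(x_1,x_2) := \frac{x_1^2\, x_2}{x_1^2+x_2^2} \quad \text{for}\,(x_1,x_2) \ne 0 ,\qquad f(0,0) := 0 . \end{aligned}$$

Since \(f(h u) = h\, f(u)\) for all \(h \in {\mathbb{R}}\) and \(u \in {\mathbb{R}}^2\), the Gâteaux derivative at the origin exists in every direction and is given by \(d_u f(0) = f(u)\). For the unit vectors \(e_1=(1,0)\) and \(e_2=(0,1)\), this gives

$$\begin{aligned} d_{e_1} f(0) = 0 ,\qquad d_{e_2} f(0) = 0 ,\qquad \text{but} \qquad d_{e_1+e_2} f(0) = \frac{1}{2} . \end{aligned}$$

Thus \(u \mapsto d_u f(0)\) is homogeneous of degree one but not additive, and consequently f is not Fréchet-differentiable at the origin.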

2.3 Banach manifolds

We recall the basic definition of a smooth Banach manifold (for more details see for example [29, Chapter 73]).

Definition 2.5

Let B be a Hausdorff topological space and \((E, \Vert .\Vert _E)\) a Banach space. A chart \((U, \phi )\) is a pair consisting of an open subset \(U \subseteq B\) and a homeomorphism \(\phi \) of U to an open subset \(V := \phi (U)\) of E, i.e.,

$$\begin{aligned} \phi \, :\, U \overset{\mathrm{open}}{\subseteq } B \rightarrow V \overset{\mathrm{open}}{\subseteq } E . \end{aligned}$$

A smooth atlas \({{\mathcal{A}}} = ( \phi _i, U_i, E)_{i \in I}\) is a collection of charts (for a general index set I) with the properties that the domains of the charts cover B,

$$\begin{aligned} B = \bigcup _{i \in I} U_i \end{aligned}$$

and that for any \(i, j \in I\), the transition map

$$\begin{aligned} \phi _j \circ \phi _i^{-1} \, :\, \phi _i \left( U_i \cap U_j \right) \subseteq E \rightarrow \phi _j \left( U_i \cap U_j \right) \end{aligned}$$

is Fréchet-smooth. Two atlases \(( \phi _i, U_i, E)_{i \in I}\) and \(( \psi _j, V_j, E)_{j \in J}\) are called equivalent if all the transition maps \(\psi _j \circ \phi _i^{-1}\) and \(\phi _i \circ \psi _j^{-1}\) are Fréchet-smooth. We denote the corresponding equivalence class by \([{\mathcal{A}}]\). The union of the charts of all atlases in \([{\mathcal{A}}]\) is called the maximal atlas \({\mathcal{A}}_{\mathrm{max}}\). The triple \((B, E, {{\mathcal{A}}})\) is referred to as a smooth Banach manifold with differentiable structure provided by \({\mathcal{A}}_{\mathrm{max}}\).

Definition 2.6

Just as in the case of finite-dimensional manifolds, we call a function \(f: U\subseteq A \rightarrow B\) between two smooth Banach manifolds \((A, E, {\mathcal{A}})\) and \((B, G, {\mathcal{B}})\) (with \(U\subseteq A\) open) n-times (Fréchet) differentiable (resp. smooth) if for all combinations of charts \(\phi _a:U_a \rightarrow V_a\) and \(\phi _b: U_b \rightarrow V_b\) of some (and thus all) atlases \(\tilde{{\mathcal{A}}}\) in \([{\mathcal{A}}]\), respectively, \(\tilde{{\mathcal{B}}}\) in \([{\mathcal{B}}]\), the mapping \(\phi _b \circ f\circ \phi _a^{-1}: V_a\rightarrow V_b\) is n-times (Fréchet) differentiable (resp. smooth).

3 Smooth Banach manifold structure of \({\mathcal{F}}^{\mathrm{reg}}\)

In the definition of causal fermion systems, the number of positive or negative eigenvalues of the operators in \({\mathcal{F}}\) can be strictly smaller than n. This is important because it makes \({\mathcal{F}}\) a closed subspace of \(\mathrm{L}({\mathcal{H}})\) (with respect to the norm topology), which in turn is crucial for the general existence results for minimizers of the causal action principle (see [9] or [18]). However, in most physical examples in Minkowski space or in a Lorentzian spacetime, all the operators in M do have exactly n positive and exactly n negative eigenvalues. This motivates the following definition (see also [10, Definition 1.1.5]).

Definition 3.1

An operator \(x \in {\mathcal{F}}\) is said to be regular if it has the maximal possible rank, i.e., \(\dim x({\mathcal{H}}) = 2n\). Otherwise, the operator is called singular. A causal fermion system is regular if all its spacetime points are regular.

In what follows, we restrict attention to regular causal fermion systems. Moreover, it is convenient to also restrict attention to all those operators in \({\mathcal{F}}\) which are regular,

$$\begin{aligned} {\mathcal{F}}^{\mathrm{reg}} := \big \{ x \in {\mathcal{F}}\,|\, x\ \text{is regular} \big \} . \end{aligned}$$

\({\mathcal{F}}^{\mathrm{reg}}\) is a dense open subset of \({\mathcal{F}}\) (again with respect to the norm topology on \(\mathrm{L}({\mathcal{H}})\)).

3.1 Wave charts and symmetric wave charts

We now choose specific charts and prove that the resulting atlas endows \({\mathcal{F}}^{\mathrm{reg}}\) with the structure of a smooth Banach manifold (see Definition 2.5). In the finite-dimensional setting, these charts were introduced in [15]. We now recall their definition and generalize the constructions to the infinite-dimensional setting.

Given \(x \in {\mathcal{F}}^{\mathrm{reg}}\), we denote the image of x by \(I:=x({\mathcal{H}})\). We consider I as a 2n-dimensional Hilbert space with the scalar product induced from \(\langle .|. \rangle _{{\mathcal{H}}}\). Denoting its orthogonal complement by \(J:=I^{\perp }\), we obtain the orthogonal sum decomposition

$$\begin{aligned} {\mathcal{H}}= I \oplus J . \end{aligned}$$

This also gives rise to a corresponding decomposition of operators, like for example

$$\begin{aligned} \mathrm{L}({\mathcal{H}}, I) = \mathrm{L}(I,I) \oplus \mathrm{L}(J,I). \end{aligned}$$
(3.1)

Given an operator \(\psi \in \mathrm{L}({\mathcal{H}}, I)\), we denote its adjoint by \(\psi ^{\dagger } \in \mathrm{L}(I, {\mathcal{H}})\); it is defined by the relation

$$\begin{aligned} \langle u \,|\, \psi \,v \rangle _I = \langle \psi ^{\dagger } \,u \,|\, v \rangle _{{\mathcal{H}}} \qquad \text{for all}\,u \in I\ \text{and}\,v \in {\mathcal{H}}. \end{aligned}$$

We now form the operator

$$\begin{aligned} R_x(\psi ) := \psi ^{\dagger } \,x\, \psi \in \mathrm{L}({\mathcal{H}}) . \end{aligned}$$
(3.2)

By construction, this operator is symmetric and has at most n positive and at most n negative eigenvalues. Therefore, it is an operator in \({\mathcal{F}}\). Using (3.1), we conclude that \(R_x\) is a mapping

$$\begin{aligned} R_x \, :\, \mathrm{L}(I,I) \oplus \mathrm{L}(J,I) \rightarrow {\mathcal{F}}. \end{aligned}$$
(3.3)
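
For orientation, the mapping (3.3) can be written out in components. The following sketch uses notation that is our own at this point: we let \(\psi _I := \psi |_I \in \mathrm{L}(I,I)\) and \(\psi _J := \psi |_J \in \mathrm{L}(J,I)\) be the two components of \(\psi \) in the decomposition (3.1), with adjoints \(\psi _I^{\dagger } : I \rightarrow I\) and \(\psi _J^{\dagger } : I \rightarrow J\). Then, in a block matrix representation with respect to \({\mathcal{H}}= I \oplus J\),

$$\begin{aligned} R_x(\psi ) = \begin{pmatrix} \psi _I^{\dagger }\, x\, \psi _I &{} \psi _I^{\dagger }\, x\, \psi _J \\ \psi _J^{\dagger }\, x\, \psi _I &{} \psi _J^{\dagger }\, x\, \psi _J \end{pmatrix} . \end{aligned}$$

In particular, choosing \(\psi _I = \mathrm{id}_I\) and \(\psi _J = 0\), only the upper left block \(x|_I\) remains. Since x vanishes on \(J = I^\perp \), this is the operator x itself.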

Before going on, it is useful to rewrite the operator \(R_x(\psi )\) in a slightly different way. On I, one can also introduce the indefinite inner product

$$\begin{aligned} \prec .|. \succ _x \, :\, S_x \times S_x \rightarrow {\mathbb{C}}, \qquad \prec u | v \succ _x = -\langle u | x v \rangle _{{\mathcal{H}}} , \end{aligned}$$
(3.4)

referred to as the spin inner product. For conceptual clarity, we denote I endowed with the spin inner product by \((S_x, \prec .|. \succ _x)\) and refer to it as the spin space at x (for more details on the spin spaces, we refer for example to [10, Section 1.1]). It is an indefinite inner product space of signature \((n,n)\). We denote the adjoint with respect to the spin inner product by a star. More specifically, for a linear operator \(A \in \mathrm{L}(S_x)\), the adjoint is defined by

$$\begin{aligned} \prec \phi \,|\, A\, \tilde{\phi } \succ _x = \prec A^* \,\phi \,|\, \tilde{\phi } \succ _x \qquad \text{for all}\,\phi , \tilde{\phi } \in S_x. \end{aligned}$$

Using again the definition of the spin inner product (3.4), we can rewrite this equation as

$$\begin{aligned} -\langle \phi \,|\, X\,A \tilde{\phi } \rangle _{{\mathcal{H}}} = -\langle A^* \phi \,|\,X \tilde{\phi } \rangle _{{\mathcal{H}}} , \end{aligned}$$

where we introduced the short notation

$$\begin{aligned} X := x|_{S_x} \, :\, S_x \rightarrow S_x . \end{aligned}$$
(3.5)

Taking adjoints in the Hilbert space \({\mathcal{H}}\) gives

$$\begin{aligned} -\langle X^{-1}\, A^{\dagger }\,X \phi \,|\,X \tilde{\phi } \rangle _{{\mathcal{H}}} = -\langle A^* \phi \,|\,X \tilde{\phi } \rangle _{{\mathcal{H}}} \end{aligned}$$

(note that the operator X is invertible because \(S_x\) is by definition its image). We thus obtain the relation

$$\begin{aligned} A^* = X^{-1}\, A^{\dagger }\,X . \end{aligned}$$
(3.6)
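
To illustrate the relation (3.6) in the simplest case (our own example with assumed data), let \(n=1\) and suppose that, in an orthonormal basis of I, the operator X is represented by the matrix \(\mathrm{diag}(1,-1)\). With the convention that the scalar product is antilinear in its first argument, the spin inner product then reads \(\prec u | v \succ _x = -\overline{u_1}\, v_1 + \overline{u_2}\, v_2\), which has signature (1, 1). For a general operator A, the relation (3.6) gives

$$\begin{aligned} A = \begin{pmatrix} a &{} b \\ c &{} d \end{pmatrix} \qquad \Longrightarrow \qquad A^* = X^{-1}\, A^{\dagger }\, X = \begin{pmatrix} \overline{a} &{} -\overline{c} \\ -\overline{b} &{} \overline{d} \end{pmatrix} . \end{aligned}$$

Hence, in this example, A coincides with its adjoint with respect to the spin inner product if and only if \(a, d \in {\mathbb{R}}\) and \(c = -\overline{b}\). Such operators form a real vector space of dimension four.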

Using such transformations, one readily verifies that identifying the image of \(\psi \) with a subspace of \(S_x\), the right side of (3.2) can be written as \(-\psi ^* \psi \) (for details, see [15, Lemma 2.2]). Thus, with this identification, the operator \(R_x\) can be written instead of (3.2) and (3.3) in the equivalent form

$$\begin{aligned} R_x \, :\, \mathrm{L}(I,S_x) \oplus \mathrm{L}(J,S_x) \rightarrow {\mathcal{F}}, \qquad R_x(\psi ) = -\psi ^* \psi , \end{aligned}$$
(3.7)

where \(\psi ^*\) is the adjoint with respect to the corresponding inner products, i.e.,

$$\begin{aligned} \prec \phi \,|\, \psi \,u \succ _x = \langle \psi ^* \phi \,|\, u \rangle _{{\mathcal{H}}} \qquad \text{for}\,u \in {\mathcal{H}}\ \text{and}\,\phi \in S_x. \end{aligned}$$

We want to use the operator \(R_x\) in order to construct local parametrizations of \({\mathcal{F}}^{\mathrm{reg}}\). The main difficulty is that the operator \(R_x\) is not injective. For an explanation of this point in the context of local gauge freedom, we refer to [15]. Here, we merely explain how to arrange that \(R_x\) becomes injective. We let \(\mathrm{Symm}(S_x) \subseteq \mathrm{L}(S_x)\) be the real vector space of all operators A on \(S_x\) which are symmetric with respect to the spin inner product, i.e.,

$$\begin{aligned} \prec \phi | A \tilde{\phi } \succ _x = \prec A \phi | \tilde{\phi } \succ _x \qquad \text{for all}\,\phi , \tilde{\phi } \in S_x. \end{aligned}$$

We now restrict the operator \(R_x\) in (3.3) and (3.7) to

$$\begin{aligned} R_x^{\mathrm{symm}} := R_x|_{\mathrm{Symm}(S_x) \oplus \mathrm{L}(J,S_x)} \, :\, \mathrm{Symm}(S_x) \oplus \mathrm{L}(J,S_x) \rightarrow {\mathcal{F}},\qquad R_x^{\mathrm{symm}}(\psi ) = -\psi ^* \psi . \end{aligned}$$
(3.8)

We write the direct sum decomposition as

$$\begin{aligned} \psi = \psi _I + \psi _J \qquad \text{with} \qquad \psi _I \in \text{Symm}(S_x),\; \psi _J \in \mathrm{L}(J,S_x). \end{aligned}$$

Extending the analysis in [15, Section 6.1] to the infinite-dimensional setting, one finds that this mapping is a local parametrization of \({\mathcal{F}}^{\mathrm{reg}}\):

Theorem 3.2

There is an open neighborhood \(W_x\) of \((\mathrm{id}_{S_x}, 0) \in \mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)\) such that the restriction of \(R_x^{\mathrm{symm}}\) maps to an open subset \(\Omega _x :=R_x^{\mathrm{symm}}(W_x)\) of \({\mathcal{F}}^{\mathrm{reg}}\),

$$\begin{aligned} R_x^{\mathrm{symm}}|_{W_x} \, :\, W_x \rightarrow \Omega _x \overset{\mathrm{open}}{\subseteq } {\mathcal{F}}^{\mathrm{reg}} , \end{aligned}$$

and is a homeomorphism to its image (always with respect to the topology induced by the operator norm on \(\mathrm{L}({\mathcal{H}})\)).

Proof

The estimate

$$\begin{aligned}&\Vert R_x^{\mathrm{symm}}(\psi ) - R_x^{\mathrm{symm}}(\tilde{\psi }) \Vert _{\mathrm{L}({\mathcal{H}})} \nonumber \\&\quad = \big \Vert \psi ^* \psi - \tilde{\psi }^* \tilde{\psi } \big \Vert _{\mathrm{L}({\mathcal{H}})} \le \big \Vert \psi ^* \psi - \psi ^* \tilde{\psi } \big \Vert _{\mathrm{L}({\mathcal{H}})} +\big \Vert \psi ^* \tilde{\psi } - \tilde{\psi }^* \tilde{\psi } \big \Vert _{\mathrm{L}({\mathcal{H}})} \nonumber \\&\quad \le \Vert \psi ^*\Vert _{\mathrm{L}({\mathcal{H}})} \, \big \Vert \psi - \tilde{\psi } \big \Vert _{\mathrm{L}({\mathcal{H}})} + \big \Vert \psi ^* - \tilde{\psi }^* \big \Vert _{\mathrm{L}({\mathcal{H}})}\, \Vert \tilde{\psi }\Vert _{\mathrm{L}({\mathcal{H}})} \end{aligned}$$
(3.9)

shows that \(R_x^{\mathrm{symm}}\) is continuous. Since the point \(R_x^{\mathrm{symm}}(\mathrm{id}_{S_x}, 0)=x \in {\mathcal{F}}^{\mathrm{reg}}\) is regular, by continuity, we may choose an open neighborhood \(W_x\) of \((\mathrm{id}_{S_x}, 0)\) such that \(R_x^{\mathrm{symm}}\) maps \(W_x\) to \({\mathcal{F}}^{\mathrm{reg}}\).

In order to show that \(R_x^{\mathrm{symm}}\) is bijective, we begin with the formula for \(\phi _x\) as derived in [15, Proposition 6.6], which will turn out to be the inverse of \(R_x^{\mathrm{symm}}\). It has the form

$$\begin{aligned} \phi _x(y) = \left( P(x,x)^{-1}\, A_{xy}\, P(x,x)^{-1} \right)^{-\frac{1}{2}} \, P(x,x)^{-1}\, P(x,y) \, \Psi (y) \;\in \; \mathrm{L}({\mathcal{H}}, S_x) , \end{aligned}$$
(3.10)

where \(P(x,y)\) (the kernel of the fermionic projector) and \(A_{xy}\) (the closed chain) are defined by

$$\begin{aligned} P(x,y) := \pi _x y|_{S_y} \, :\, S_y \rightarrow S_x ,\qquad A_{xy} := P(x,y)\, P(y,x) \, :\, S_x \rightarrow S_x . \end{aligned}$$
(3.11)

Our task is to show that for a sufficiently small open neighborhood \(\Omega _x\) of x, this formula defines a continuous mapping

$$\begin{aligned} \phi _x \, :\, \Omega _x \subseteq {\mathcal{F}}^{\mathrm{reg}} \rightarrow \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, S_x) , \end{aligned}$$

and that the compositions

$$\begin{aligned} \phi _x \circ R_x^{\mathrm{symm}}|_{W_x} \qquad \text{and} \qquad R_x^{\mathrm{symm}} \circ \phi _x \end{aligned}$$
(3.12)

are both the identity (showing that \(\phi _x\) is indeed the inverse of \(R_x^{\mathrm{symm}}\)).

In preparation, we rewrite the formula (3.10) as

$$\begin{aligned} \phi _x(y) = \left(X^{-1} \,\pi _x y\pi _y x X^{-1}\right)^{-\frac{1}{2}}X^{-1} \,\pi _x y \pi _y = \left(X^{-1} \, \pi _x y |_{S_x}\right)^{-\frac{1}{2}}X^{-1}\,\pi _x y , \end{aligned}$$
(3.13)

where we again used the notation (3.5). Choosing \(y=x\), the operator \(X^{-1} \,\pi _x y |_{S_x}\) is the identity on \(S_x\). We first choose an open neighborhood \(\tilde{\Omega }_x\) of x so small that for any \(y \in \tilde{\Omega }_x\),

$$\begin{aligned} \big \Vert \mathrm{id}_{S_x}-X^{-1}\pi _x y |_{S_x} \big \Vert _{\mathrm{L}({\mathcal{H}})}< \frac{1}{2} . \end{aligned}$$
(3.14)

Then, the square root as well as the inverse square root of \(A=X^{-1}\pi _x y |_{S_x}\) are well defined for all \(y\in \tilde{\Omega }_x\) by the respective power series,

$$\begin{aligned} A^{\frac{1}{2}}:=\sum _{n=0}^{\infty } (-1)^n \left( {\begin{array}{c}1/2\\ n\end{array}}\right) (\mathrm{id}_{S_x}-A)^n\;,\;\;\; A^{-\frac{1}{2}}:=\sum _{n=0}^{\infty } (-1)^n \left( {\begin{array}{c}-1/2\\ n\end{array}}\right) (\mathrm{id}_{S_x}-A)^n\;, \end{aligned}$$

with the generalized binomial coefficients given for \(\beta \in \mathbb{R}\) and \(n\in \mathbb{N}_0\) by

$$\begin{aligned} \left( {\begin{array}{c}\beta \\ n\end{array}}\right) := \left\{ \begin{array}{cl} \displaystyle \frac{1}{n!}\;\beta \cdot (\beta - 1) \cdots (\beta -n+1)\quad &{} \text{if}\,n>0 \\ 1\quad &{}\text{if}\,n=0 \end{array}\right. \end{aligned}$$

as for both power series the radius of convergence equals one. Moreover, note that all square roots, inverse square roots, etc., appearing in the following are well defined as they are always applied to operators within their radius of convergence. We conclude that the mapping \(\phi _x\) is well defined and continuous on \(\tilde{\Omega }_x\). Now, by possibly shrinking \(W_x\), we can arrange that \(\Omega _x:=R_x^{\mathrm{symm}}(W_x)\) lies in \(\tilde{\Omega }_x\). Note that it now suffices to show that \(\phi _x|_{\Omega _x}\) is the inverse of \(R_x^{\mathrm{symm}}|_{W_x}\), because then the set \(\Omega _x=(\phi _x|_{\tilde{\Omega }_x})^{-1}(W_x)\) is open.
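The binomial series above can be tried out numerically. The following is only a minimal sketch under simplifying assumptions: the operator A is replaced by a hypothetical real \(2\times 2\) matrix close to the identity, and the helper names `binom`, `matmul` and `power_series` are illustrative and do not appear in the paper.

```python
# Illustrative sketch: partial sums of the binomial series for A^{1/2} and
# A^{-1/2}, for a 2x2 real matrix A with ||id - A|| < 1/2.

def binom(beta, n):
    """Generalized binomial coefficient; the empty product gives 1 for n = 0."""
    result = 1.0
    for k in range(n):
        result *= (beta - k) / (k + 1)
    return result

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def power_series(a, beta, terms=60):
    """Partial sum of sum_n (-1)^n binom(beta, n) (id - a)^n."""
    ident = [[1.0, 0.0], [0.0, 1.0]]
    d = [[ident[i][j] - a[i][j] for j in range(2)] for i in range(2)]
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = ident  # (id - a)^0
    for n in range(terms):
        c = (-1) ** n * binom(beta, n)
        total = [[total[i][j] + c * power[i][j] for j in range(2)]
                 for i in range(2)]
        power = matmul(power, d)
    return total

# Hypothetical example matrix; the Frobenius norm of id - A is 0.25.
A = [[1.2, 0.1], [-0.05, 0.9]]
root = power_series(A, 0.5)
inv_root = power_series(A, -0.5)
square = matmul(root, root)        # should reproduce A up to rounding
product = matmul(root, inv_root)   # should reproduce the identity up to rounding
```

Since all partial sums are polynomials in \(\mathrm{id}-A\), they commute with each other, which is why the two consistency checks at the end are meaningful.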

In order to verify that \(\phi _x\) maps into \(\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)\), we restrict \(\phi _x(y)\) to \(S_x\),

$$\begin{aligned} \phi _x(y) \big|_I=\left( \left(X^{-1} \,\pi _x \,y \big|_{S_x}\right)^{-\frac{1}{2}}X^{-1} \,\pi _x \,y \right)\Big|_I \nonumber \\= \left(X^{-1} \,\pi _x \,y \big|_{S_x}\right)^{-1/2} X^{-1} \,\pi _x \,y|_{S_x} =\left(X^{-1} \,\pi _x \,y \,\pi _x \big|_{S_x}\right)^{\frac{1}{2}} . \end{aligned}$$
(3.15)

A direct computation using (3.6) shows that the operator \(X^{-1}\pi _x y \pi _x|_{S_x}\) and hence also its square root are symmetric on \(S_x\).

It remains to compute the compositions in (3.12). First,

$$\begin{aligned} \phi _x \circ R_x^{\mathrm{symm}}(\psi )&= \phi _x(\psi ^{\dagger }X\psi ) =(X^{-1}\underbrace{\pi _x \psi ^{\dagger }X}_{=\psi _I^{\dagger }X} \underbrace{\psi |_{S_x}}_{\psi _I})^{-\frac{1}{2}}X^{-1} \underbrace{\pi _x \psi ^{\dagger }X}_{=\psi _I^{\dagger }X}\psi \\&= \left( \underbrace{ X^{-1}\psi _I^{\dagger }X}_{=\psi _I} \psi _I\right)^{-\frac{1}{2}}\underbrace{ X^{-1}\psi _I^{\dagger }X}_{=\psi _I} \psi = \left(\psi _I^2 \right)^{-\frac{1}{2}}\,\psi _I\,\psi =\psi , \end{aligned}$$

where in the last line, we applied (3.6) and used that \(\psi _I\) is symmetric on \(S_x\). Moreover,

$$\begin{aligned} R_x^{\mathrm{symm}} \circ \phi _x(y)&= \phi _x(y)^{\dagger } X \phi _x(y) \\&= y\,\pi _x\, X^{-1}\,\left(\pi _x y \pi _x\,X^{-1}\, \right)^{-\frac{1}{2}} \,X\, \left(X^{-1}\,\pi _x y \pi _x|_{S_x} \right)^{-\frac{1}{2}} X^{-1}\,\pi _x \,y . \end{aligned}$$

Since the spectral calculus is invariant under similarity transformations, we know that for any invertible operator B on \(S_x\),

$$\begin{aligned} X^{-1} B^{-\frac{1}{2}} X = \left( X^{-1} B X \right)^{-\frac{1}{2}} . \end{aligned}$$

Hence,

$$\begin{aligned} R_x^{\mathrm{symm}} \circ \phi _x(y)&= y\,\pi _x\, \left(X^{-1} \,\pi _x y \pi _x|_{S_x}\right)^{-\frac{1}{2}}\,\left(X^{-1} \,\pi _x y \pi _x|_{S_x} \right)^{-\frac{1}{2}} X^{-1}\,\pi _x \,y \\&= y\,\pi _x\, \left(X^{-1}\,\pi _x y \pi _x|_{S_x}\right)^{-1}\,X^{-1}\,\pi _x \,y \\&= y\,\pi _x\, \left( \pi _x \,y \pi _x |_{S_x} \right)^{-1}\, \pi _x \,y = y\,x \left( \pi _x \,y x|_{S_x} \right)^{-1}\, \pi _x \,y \\&= y\,P(y,x)\,\left( P(x,y)\, P(y,x) \right)^{-1} \, P(x,y) = y \end{aligned}$$

(note that \(P(x,y) : S_y \rightarrow S_x\) is invertible in view of (3.14)). This concludes the proof. \(\square \)
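The similarity invariance \(X^{-1} B^{-\frac{1}{2}} X = ( X^{-1} B X )^{-\frac{1}{2}}\) used in the proof above can be illustrated numerically. The sketch below assumes real \(2\times 2\) matrices and evaluates the inverse square root by the binomial series; the matrices `X` and `B` as well as the helper names are hypothetical choices made for illustration only.

```python
# Illustrative sketch: check X^{-1} B^{-1/2} X = (X^{-1} B X)^{-1/2} for 2x2 matrices.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(m):
    """Inverse of a 2x2 matrix with non-vanishing determinant."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def inv_sqrt(a, terms=80):
    """Partial sum of sum_n c_n (id - a)^n with c_n = (-1)^n binom(-1/2, n)."""
    ident = [[1.0, 0.0], [0.0, 1.0]]
    d = [[ident[i][j] - a[i][j] for j in range(2)] for i in range(2)]
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = ident
    c = 1.0  # c_0 = 1
    for n in range(terms):
        total = [[total[i][j] + c * power[i][j] for j in range(2)]
                 for i in range(2)]
        power = matmul(power, d)
        c *= (n + 0.5) / (n + 1)  # recursion c_{n+1} = c_n (n + 1/2) / (n + 1)
    return total

X = [[2.0, 1.0], [0.0, 1.0]]   # hypothetical invertible matrix
B = [[1.1, 0.2], [0.1, 0.9]]   # hypothetical matrix; spectral radius of id - B is below 1
lhs = matmul(matmul(inverse(X), inv_sqrt(B)), X)
rhs = inv_sqrt(matmul(matmul(inverse(X), B), X))
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

The check works because \((\mathrm{id} - X^{-1} B X)^n = X^{-1} (\mathrm{id}-B)^n X\) for every n, so the two series agree term by term.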

The mapping \(\phi _x\), which already appeared in the proof of the previous lemma, can also be introduced abstractly to define the chart.

Definition 3.3

Setting

$$\begin{aligned} \phi _x := \big( R_x^{\mathrm{symm}} \big|_{W_x} \big)^{-1} \, :\, \Omega _x \rightarrow \mathrm{Symm}(S_x) \oplus \mathrm{L}(J,S_x) , \end{aligned}$$

we obtain a chart \((\phi _x, \Omega _x)\), referred to as the symmetric wave chart about the point \(x \in {\mathcal{F}}^{\mathrm{reg}}\).

We remark that more general charts can be obtained by restricting \(R_x\) to another subspace of \(\mathrm{L}(I,S_x) \oplus \mathrm{L}(J,S_x)\), i.e., in generalization of (3.8),

$$\begin{aligned} R_x^E := R_x|_{E \oplus \mathrm{L}(J,S_x)} \, :\, E \oplus \mathrm{L}(J,S_x) \rightarrow {\mathcal{F}},\qquad R(\psi ) = -\psi ^* \psi , \end{aligned}$$

where E is a subspace of \(\mathrm{L}(S_x)\) which has the same dimension as \(\mathrm{Symm}(S_x)\). The resulting charts \(\phi ^E_x\) are obtained by composition with a unitary operator \(U_x\) on \(S_x\), i.e.,

$$\begin{aligned} \phi ^E_x = U_x \circ \phi _x \qquad \text{with} \qquad U_x \in \mathrm{U}(S_x) \end{aligned}$$

(for details and the connection to local gauge transformations, see [15, Section 6.1]). Since linear transformations are irrelevant for the question of differentiability, in what follows, we may restrict attention to symmetric wave charts.

3.2 A Fréchet smooth atlas

The goal of this section is to prove that the symmetric wave charts \((\phi _x, \Omega _x)\) form a smooth atlas of \({\mathcal{F}}^{\mathrm{reg}}\).

Theorem 3.4

(Symmetric wave atlas) The collection of all symmetric wave charts on \({\mathcal{F}}^{\mathrm{reg}}\) defines a Fréchet-smooth atlas of \({\mathcal{F}}^{\mathrm{reg}}\), endowing \({\mathcal{F}}^{\mathrm{reg}}\) with the structure of a smooth Banach manifold (see Definition 2.5).

Proof

We first verify that for any \(x\in {\mathcal{F}}^{\mathrm{reg}}\), the vector space \(\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)\) together with the operator norm of \(\mathrm{L}({\mathcal{H}},I)=\mathrm{L}({\mathcal{H}},S_x)\) is a Banach space. To this end, we note that this vector space coincides with the kernel of the mapping \(\psi \mapsto (X^{-1}\psi ^{\dagger } \pi _x X - \psi |_I)\) on \(\mathrm{L}({\mathcal{H}}, I)\). Since this mapping is continuous on \(\mathrm{L}({\mathcal{H}},I)\) (as one verifies by an estimate similar to (3.9)), its kernel is closed. As a consequence, the vector space \(\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x)\) is a closed subspace of \(\mathrm{L}({\mathcal{H}},I)\) and thus indeed a Banach space.

We saw in Theorem 3.2 that for any \(x \in {\mathcal{F}}^{\mathrm{reg}}\), \((\phi _x, \Omega _x)\) defines a chart on \({\mathcal{F}}^{\mathrm{reg}}\). Since the \(\Omega _x\) clearly cover \({\mathcal{F}}^{\mathrm{reg}}\), it remains to show that all transition mappings are Fréchet-smooth. To this end, we first note that, for any \(x,y \in {\mathcal{F}}^{\mathrm{reg}}\) and \(\psi \in \phi _x(\Omega _x \cap \Omega _y)\),

$$\begin{aligned} \phi _y \circ \phi _x^{-1}(\psi ) = \phi _y \left(\psi ^{\dagger } \,X\, \psi \right) = \left(Y^{-1} \,\pi _y\, \psi ^{\dagger } \,X\, \psi |_{S_y}\right)^{-\frac{1}{2}} \,Y^{-1} \,\pi _y\,\psi ^{\dagger }\, X\, \psi . \end{aligned}$$

Next, we define the mappings

$$\begin{aligned}&B_{xy}: \mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x) \rightarrow \mathrm{L}({\mathcal{H}},S_y), \quad \psi \mapsto Y^{-1} \,\pi _y \,\psi ^{\dagger } \,X\, \psi \;,\\&\tilde{B}_{xy}: \mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x) \rightarrow \mathrm{L}(S_y), \quad \psi \mapsto Y^{-1} \,\pi _y \,\psi ^{\dagger } \,X\, \psi |_{S_y}\;,\\&W: B_{\frac{1}{2}}(0)\subseteq \mathrm{L}(S_y) \rightarrow \mathrm{L}(S_y), \quad B\mapsto (\mathrm{id}_{S_y}-B)^{-\frac{1}{2}} = \sum _{n=0}^{\infty }(-1)^n \left( {\begin{array}{c}-1/2\\ n\end{array}}\right) \,B^n \end{aligned}$$

(where the radius of the ball \(B_{1/2}(0)\) is taken with respect to the operator norm).

Recall that in the proof of Theorem 3.2 (more precisely (3.14)), we chose \(\Omega _y\) so small that \(\Vert \mathrm{id}_{S_y}- Y^{-1}\pi _yz|_{S_y} \Vert <1/2\) for any \(z\in \Omega _y\). Thus, since for any \(\psi \in \phi _x(\Omega _x \cap \Omega _y)\) we have \(\psi ^{\dagger }X\psi =\phi ^{-1}_x(\psi )\in \Omega _y\), we obtain \(\tilde{B}_{xy}(\phi _x(\Omega _x \cap \Omega _y)) \subseteq B_{1/2}(\mathrm{id}_{S_y})\). Therefore, we can write the transition mapping \(\phi _y \circ \phi _x^{-1}\) as

$$\begin{aligned} \phi _y \circ \phi _x^{-1}(\psi ) = W\left(\mathrm{id}_{S_y} -\tilde{B}_{xy}(\psi )\right) \circ B_{xy}(\psi ). \end{aligned}$$

Note that, for the Fréchet derivative, we consider all vector spaces here as real Banach spaces, but still with the canonical operator norm induced by \(\Vert .\Vert _{{\mathcal{H}}}\). In view of the chain rule for Fréchet derivatives (for details, see Lemma 6.2 in Appendix 1) and the properties of the Fréchet derivative in Lemma 6.1 in Appendix 1, it remains to show that the mappings W, \(B_{xy}\) and \(\tilde{B}_{xy}\) are Fréchet-smooth (note that the composition of \(\mathbb{R}\)-linear mappings is also Fréchet-smooth as it defines a bounded \(\mathbb{R}\)-bilinear map, and the map \(\mathrm{L}(S_y)\ni B \mapsto \mathrm{id}_{S_y}-B\in \mathrm{L}(S_y)\) is clearly Fréchet-smooth as well). For W, this is clear due to [21, pp. 40–42] (note that \(\mathrm{L}(S_y)\) is a finite-dimensional unital Banach algebra). Moreover, the mappings \(B_{xy}\) and \(\tilde{B}_{xy}\) are obtained by evaluating bounded \(\mathbb{R}\)-bilinear maps on the diagonal \((\psi ,\psi )\) and are thus Fréchet-smooth. \(\square \)

3.3 The tangent bundle

Having endowed \({\mathcal{F}}^{\mathrm{reg}}\) with a canonical smooth Banach manifold structure, the next step is to consider its tangent bundle. For finite-dimensional manifolds, the tangent space can be defined either by equivalence classes of curves or by derivations, and these two definitions coincide (see for example [24, Chapter 2]). In infinite dimensions, however, this is no longer the case: In general, the derivation-tangent vectors (usually called operational tangent vectors) form a larger class than the curve-tangent vectors (called kinematic tangent vectors). There might even be operational tangent vectors that depend on higher-order derivatives of the inserted function (while the kinematic tangent vectors interpreted as directional derivatives only involve the first derivatives); for details on such issues, see for example [22, Sections 28 and 29] or [2, pp. 3–6]. It turns out that for the applications we have in mind, it is preferable to define tangent vectors as equivalence classes of curves. Indeed, as we shall see, with this definition, the usual computation rules remain valid. More specifically, the tangent vectors of \({\mathcal{F}}^{\mathrm{reg}}\) are compatible with the Fréchet derivative, and each fiber of the corresponding tangent bundle can be identified with the underlying Banach space

$$\begin{aligned} V_x :=\mathrm{Symm}(S_x)\oplus \mathrm{L}(J,S_x) \end{aligned}$$

with respect to the chart \(\phi _x\).

Following [22, p. 284], we begin with the abstract definition of the (kinematic) tangent bundle, which makes it easier to see the topological structure. Afterward, we will show that this notion indeed agrees with equivalence classes of curves. Given \(x' \in {\mathcal{F}}^{\mathrm{reg}}\), we consider the set \(\Omega _{x'} \times V_{x'} \times \{x'\}\) (endowed with the topology inherited from the direct sum of Banach spaces). We take the disjoint union

$$\begin{aligned} \bigcup \limits _{x' \in {\mathcal{F}}^{\mathrm{reg}}} \Omega _{x'} \times V_{x'} \times \{x'\} \end{aligned}$$

and introduce the equivalence relation

$$\begin{aligned} (x,\mathbf{v },x') \sim (y,\mathbf{w },y') \qquad \Longleftrightarrow \qquad x=y \quad \mathrm{and} \quad (\phi _{x'}\circ \phi _{y'}^{-1})'|_{\phi _{y'}(x)}\mathbf{w }= \mathbf{v }. \end{aligned}$$

For clarity, we point out that the first entry represents the point of the Banach manifold \({\mathcal{F}}^{\mathrm{reg}}\), whereas the third entry labels the chart.

Definition 3.5

We define the tangent bundle \(T{\mathcal{F}}^{\mathrm{reg}}\) as the quotient space with respect to this equivalence relation,

$$\begin{aligned} T{\mathcal{F}}^{\mathrm{reg}} := \left(\bigcup \limits _{x' \in {\mathcal{F}}^{\mathrm{reg}}} \Omega _{x'} \times V_{x'} \times \{x'\} \right)\Big / \sim . \end{aligned}$$

The canonical projection is given by

$$\begin{aligned} \pi : T{\mathcal{F}}^{\mathrm{reg}} \rightarrow {\mathcal{F}}^{\mathrm{reg}}\;,\;\;\;\pi ([x,\mathbf{v },x']) = x. \end{aligned}$$

For every \(x \in {\mathcal{F}}^{\mathrm{reg}}\), the tangent space at x is defined by

$$\begin{aligned} T_x{\mathcal{F}}^{\mathrm{reg}}:=\pi ^{-1}(x). \end{aligned}$$

Note that each \(T_x{\mathcal{F}}^{\mathrm{reg}}\) has a canonical vector space structure in the following sense: Since all equivalence classes in \(T_x{\mathcal{F}}^{\mathrm{reg}}\) have a representative of the form \([x,\mathbf{v },x]\), this representative can be identified with \(\mathbf{v }\in V_x\). In this way, we obtain an identification of \(T_x{\mathcal{F}}^{\mathrm{reg}}\) with \(V_x\).

The tangent bundle is again a Banach manifold, as we now explain. For any \(x \in {\mathcal{F}}^{\mathrm{reg}}\), the mapping

$$\begin{aligned} (\phi _x, D\phi _x): \pi ^{-1}(\Omega _x) \rightarrow W_x \times V_x\;, \;\;\;[y,\mathbf{v },z] \mapsto \left(\phi _x(y), D\left( \phi _x \circ \phi _z^{-1} \right) \big|_{\phi _z(y)}\mathbf{v }\right) \end{aligned}$$

has the inverse

$$\begin{aligned} (\phi _x, D\phi _x)^{-1}: W_x \times V_x \rightarrow \pi ^{-1}(\Omega _x)\;,\;\;\; (\psi ,\mathbf{v }) \mapsto [\phi _x^{-1}(\psi ),\mathbf{v },x] . \end{aligned}$$

On \(T{\mathcal{F}}^{\mathrm{reg}}\), we choose the coarsest topology with the property that the natural projections of these mappings to \(W_x\) and \(V_x\) are both continuous (where on \(W_x\) and \(V_x\), we choose the topology induced by the norm topology of \(\mathrm{L}({\mathcal{H}})\)). With this topology, the mapping \((\phi _x, D\phi _x)\) defines a chart of \(T{\mathcal{F}}^{\mathrm{reg}}\). For any \((\psi ,\mathbf{v })\in (\phi _y, D\phi _y)\left(\pi ^{-1}(\Omega _x)\cap \pi ^{-1}(\Omega _y)\right)\), the transition mappings are given by

$$\begin{aligned} (\phi _x, D\phi _x) \circ (\phi _y, D\phi _y)^{-1}(\psi , \mathbf{v })&= (\phi _x, D\phi _x)([\phi _y^{-1}(\psi ), \mathbf{v },y])\\&= \left((\phi _x \circ \phi _y^{-1})(\psi ), D \left(\phi _x \circ \phi _y^{-1} \right) \big|_{\psi }\mathbf{v }\right) . \end{aligned}$$

Proposition 3.6

\(T{\mathcal{F}}^{\mathrm{reg}}\) is again a Banach manifold.

Proof

We need to show that transition maps are Fréchet-smooth. This is clear for the first component because the transition mappings \(\phi _x \circ \phi _y^{-1}\) are Fréchet-smooth and fiberwise linear. The second component can be considered as the composition of the insertion map

$$\begin{aligned} \mathrm{L}(V_y,V_x)\times V_y \ni (A,\mathbf{v }) \mapsto A(\mathbf{v })\in V_x \end{aligned}$$

(which is obviously continuous and bilinear and thus Fréchet-smooth; for details, see Lemma 6.1 in Appendix 1) with the mapping \(W_y\times V_y \ni (\psi ,\mathbf{v }) \mapsto ((\phi _x \circ \phi _y^{-1})'|_{\psi },\mathbf{v })\in \mathrm{L}(V_y,V_x)\times V_y\), which is Fréchet-smooth due to the Fréchet-smoothness of the transition mappings. \(\square \)

In what follows, we will sometimes use the notation

$$\begin{aligned} D\phi _x([y,\mathbf{v },z]) := D\left( \phi _x \circ \phi _z^{-1} \right) \big|_{\phi _z(y)} \,\mathbf{v }\qquad \forall x\in {\mathcal{F}}^{\mathrm{reg}},\; [y,\mathbf{v },z]\in \pi ^{-1}(\Omega _x)\;, \end{aligned}$$

which also clarifies the independence of the choice of representatives.

Lemma 3.7

For any \(x \in {\mathcal{F}}^{\mathrm{reg}}\), the mapping

$$\begin{aligned} \psi _x: \Omega _x \times V_x \rightarrow \pi ^{-1}(\Omega _x)\;,\qquad (y, \mathbf{v }) \mapsto [y,\mathbf{v },x] \end{aligned}$$

is a local trivialization.

Proof

We need to verify the properties of a local trivialization. Clearly, the mapping \(\pi \circ \psi _x\) is the projection to the first component, and for fixed \(y \in \Omega _x\), the mapping \(\mathbf{v }\mapsto \psi _x(y,\mathbf{v })=[y,\mathbf{v },x]=[y, (\phi _y \circ \phi _x^{-1})'|_{\phi _x(y)}\mathbf{v },y]\) corresponds to \(\mathbf{v }\mapsto (\phi _y \circ \phi _x^{-1})'|_{\phi _x(y)}\mathbf{v }\) (by the identification of \(T_y{\mathcal{F}}^{\mathrm{reg}}\) with \(V_y\) from before), which is an isomorphism of vector spaces in view of Lemma 6.1 (vi). \(\square \)

To summarize, the Banach manifold \({\mathcal{F}}^{\mathrm{reg}}\) has similar properties as in the finite-dimensional case.

We now explain how the above definition of tangent vectors relates to the equivalence classes of curves (following [22, p. 285]):

Remark 3.8

(equivalence classes of curves) On curves \(\gamma , \tilde{\gamma } \in C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}})\), we consider the equivalence relation \(\gamma \sim \tilde{\gamma }\) defined by the conditions that \(\gamma (0) = \tilde{\gamma }(0)\) and that in a chart \(\phi _x\) with \(\gamma (0) \in \Omega _x\), the relation \((\phi _x\circ \gamma )'|_0= (\phi _x\circ \tilde{\gamma })'|_0\) holds. Note that if the last relation holds in one chart, then it also holds in any other chart \(\phi _y\) with \(\gamma (0)\in \Omega _y\) because, due to the chain rule,

$$\begin{aligned} (\phi _y\circ \gamma )'|_0&= (\phi _y \circ \phi _x^{-1} \circ \phi _x \circ \gamma )'|_0 = (\phi _y \circ \phi _x^{-1})'|_{\phi _x (\gamma (0))}(\phi _x\circ \gamma )'|_0\\&= (\phi _y \circ \phi _x^{-1})'|_{\phi _x (\gamma (0))}(\phi _x\circ \tilde{\gamma })'|_0 =(\phi _y \circ \phi _x^{-1} \circ \phi _x \circ \tilde{\gamma })'|_0 =(\phi _y\circ \tilde{\gamma })'|_0. \end{aligned}$$

Now we can identify \(C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}}) / \sim \) with \(T{\mathcal{F}}^{\mathrm{reg}}\) via the mapping

$$\begin{aligned} C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}}) / \sim&\rightarrow T{\mathcal{F}}^{\mathrm{reg}} \nonumber \\ {[}\gamma ]&\mapsto \Big [\gamma (0),\, (\phi _{\gamma (0)}\circ \gamma )'|_0 ,\, \gamma (0)\Big ]\;, \end{aligned}$$
(3.16)

which is bijective with inverse (for details, see [22, p. 285])

$$\begin{aligned} {[}x,\mathbf{v },x'] \mapsto \Big [t\mapsto \phi _{x'}^{-1}\left(\phi _{x'}(x) +t \,\xi _{\mathbf{v }}(t)\,\mathbf{v }\right) \Big ]\;, \end{aligned}$$

where \(\xi _{\mathbf{v }} \in C_0^{\infty }(\mathbb{R})\) is a smooth cutoff function with \(0\le \xi _{\mathbf{v }} \le 1\), \(\mathrm{supp}(\xi _{\mathbf{v }})\subseteq (-\varepsilon ,\varepsilon )\) and \(\xi _{\mathbf{v }}|_{(-\varepsilon /2,\varepsilon /2)}\equiv 1\), with \(\varepsilon >0\) chosen so small that

$$\begin{aligned} B_{\varepsilon \Vert \mathbf{v }\Vert }\left( \phi _{x'}(x) \right) \subseteq W_{x'} . \end{aligned}$$

Note that in (3.16), the tangent vector at \(\gamma (0)\) was expressed in the specific chart \((\phi _{\gamma (0)}, \Omega _{\gamma (0)})\). However, the tangent vector can also be represented in another chart as follows. Let \(x \in {\mathcal{F}}^{\mathrm{reg}}\) and \([x,\mathbf{v },z] \in T_x{\mathcal{F}}^{\mathrm{reg}}\) be arbitrary. We say that a curve \(\gamma \in C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}})\) represents \([x,\mathbf{v },z]\) if in one chart \(\phi _y\) with \(x \in \Omega _y\) (and thus in any chart, as one can show using the chain rule just as before) it holds that

$$\begin{aligned} {[}x,\mathbf{v },z] = [\gamma (0),(\phi _y \circ \gamma )'|_0, y]. \end{aligned}$$
(3.17)

In order to show independence of y, let \(w\in {\mathcal{F}}^{\mathrm{reg}}\) with \(x \in \Omega _w\). Then,

$$\begin{aligned} (\phi _w\circ \gamma )'|_0 = (\phi _w \circ \phi _y^{-1} \circ \phi _y \circ \gamma )'|_0 = (\phi _w \circ \phi _y^{-1})'|_{\phi _y(x)}(\phi _y\circ \gamma )'|_0\;, \end{aligned}$$

and thus,

$$\begin{aligned} {[}\gamma (0), (\phi _w\circ \gamma )'|_0, w] =[\gamma (0),(\phi _y \circ \gamma )'|_0, y] = [x,\mathbf{v },z] . \end{aligned}$$

Hence, if (3.17) holds in one chart, it also holds in any other chart around x. \(\square \)

Remark 3.9

(directional derivatives) Let \(\gamma \in C^{\infty }(\mathbb{R},{\mathcal{F}}^{\mathrm{reg}})\) be a curve that represents \([x,\mathbf{v },z]\). We define the directional derivative of a Fréchet-differentiable function \(f: {\mathcal{F}}^{\mathrm{reg}} \rightarrow \mathbb{R}\) at x in the direction \([x,\mathbf{v },z]\) as

$$\begin{aligned} D_{[x,\mathbf{v },z]}f|_x := \frac{\mathrm{d}}{\mathrm{d}t}(f\circ \gamma )|_{t=0}. \end{aligned}$$

This definition is independent of the choice of the curve \(\gamma \). Indeed, for any chart \(\phi _w\) around x, we have

$$\begin{aligned}&\frac{\mathrm{d}}{\mathrm{d}t}(f\circ \gamma )|_{t=0} = (f\circ \phi _w^{-1} \circ \phi _w \circ \phi _z^{-1} \circ \phi _z \circ \gamma )'(0) \\&\quad = D(f\circ \phi _w^{-1})|_{\phi _w(x)}\, D(\phi _w \circ \phi _z^{-1})|_{\phi _z(x)}\, (\phi _z \circ \gamma )'(0) \\&\quad =D(f\circ \phi _w^{-1})|_{\phi _w(x)}\, D(\phi _w \circ \phi _z^{-1})|_{\phi _z(x)} \,\mathbf{v }=D(f\circ \phi _w^{-1})|_{\phi _w(x)}\, D\phi _w([x,\mathbf{v },z]) . \end{aligned}$$

\(\square \)
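The independence of the directional derivative from the representing curve can be illustrated in a finite-dimensional toy setting. The sketch below assumes the manifold \(\mathbb{R}^2\) with the identity chart; the function `f` and the curves `gamma1`, `gamma2` are hypothetical examples that share the same base point and the same velocity at \(t=0\), and the derivative is approximated by a central difference quotient.

```python
import math

# Illustrative sketch on R^2 with the identity chart (not the setting of the paper).

def f(p):
    """A smooth real-valued test function."""
    x, y = p
    return x * x + 3.0 * y

def gamma1(t):
    """Straight line through (1, 0) with velocity (1, 2)."""
    return (1.0 + t, 2.0 * t)

def gamma2(t):
    """Curved path with the same base point and the same velocity at t = 0."""
    return (1.0 + math.sin(t), 2.0 * t + t * t)

def derivative_at_zero(curve, h=1e-5):
    """Central difference approximation of d/dt f(curve(t)) at t = 0."""
    return (f(curve(h)) - f(curve(-h))) / (2.0 * h)

d1 = derivative_at_zero(gamma1)
d2 = derivative_at_zero(gamma2)
# Both values approximate Df|_(1,0) applied to (1, 2), i.e. 2*1*1 + 3*2 = 8.
```

The two curves differ at second order in t, which does not affect the first derivative of \(f \circ \gamma \) at \(t=0\).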

We close this subsection with one last definition:

Definition 3.10

(Tangent vector fields) A tangent vector field on a Banach manifold is, similar to the finite-dimensional case, a Fréchet-smooth map \(\mathbf{v }: {\mathcal{F}}^{\mathrm{reg}} \rightarrow T{\mathcal{F}}^{\mathrm{reg}}\) such that \(\mathbf{v }(x) \in T_x{\mathcal{F}}^{\mathrm{reg}}\) (i.e. \(\pi (\mathbf{v }(x)) = x\)) for all \(x \in {\mathcal{F}}^{\mathrm{reg}}\). We denote the set of all tangent vector fields of \({\mathcal{F}}^{\mathrm{reg}}\) by \(\Gamma ({\mathcal{F}}^{\mathrm{reg}},T{\mathcal{F}}^{\mathrm{reg}})\).

We note that, according to this definition, multiplying a vector field by a Fréchet-smooth real-valued function gives again a vector field. In other words, the space of all tangent vector fields forms a module over the ring of Fréchet-smooth functions from \({\mathcal{F}}^{\mathrm{reg}}\) to \({\mathbb{R}}\).

3.4 A Riemannian metric

In this section, we show that the Hilbert–Schmidt scalar product gives rise to a canonical Riemannian metric on \({\mathcal{F}}^{\mathrm{reg}}\). For the constructions, it is most convenient to recover \({\mathcal{F}}^{\mathrm{reg}}\) as a Banach submanifold of the real Hilbert space \({\mathscr{S}}({\mathcal{H}})\) of all selfadjoint Hilbert–Schmidt operators on \({\mathcal{H}}\) endowed with the scalar product (\({\mathscr{S}}\) because of the second Schatten class; for details, see [7, Section XI.6])

$$\begin{aligned} \langle A, B\rangle _{{\mathscr{S}}({\mathcal{H}})} := {{\,\mathrm{tr}\,}}\left(A B \right) . \end{aligned}$$

Theorem 3.11

\({\mathcal{F}}^{\mathrm{reg}}\) is a smooth Fréchet submanifold of \({\mathscr{S}}({\mathcal{H}})\) in the following sense. Given \(x \in {\mathcal{F}}^{\mathrm{reg}}\), we choose \(\psi _0 \in \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I)\) with \(x = -\psi _0^* \psi _0\). Then, the mapping

$$\begin{aligned} \mathscr{R} \, :\, \left( \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I) \right) \oplus {\mathscr{S}}(J)&\rightarrow {\mathscr{S}}({\mathcal{H}}) \\ (\;\;\psi \;\;,\;\; B\;\;)\,&\mapsto -\psi ^* \psi + \begin{pmatrix} 0 &{} 0 \\ 0 &{} B \end{pmatrix} \end{aligned}$$

(where the last matrix denotes a block operator on \({\mathcal{H}}=I \oplus J\)) is a local Fréchet-diffeomorphism at \((\psi _0, 0)\). Its local inverse takes the form

$$\begin{aligned} \Phi := ({\mathscr{R}}|_{\hat{W}})^{-1} \, :\, {\mathscr{S}}({\mathcal{H}}) \cap \hat{\Omega }_x&\rightarrow \hat{W} \subseteq \left( \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I) \right) \oplus {\mathscr{S}}(J) \\ E&\mapsto \bigg ( \phi _x(\pi _x E), \pi _J \left( E + \left(\phi _x(\pi _x E) \right)^* \phi _x(\pi _x E) \right) \Big|_J \bigg ) , \end{aligned}$$

where \(\hat{W} = W_x \oplus {\mathscr{S}}(J)\), \(\hat{\Omega }_x:=\mathscr{R}(\hat{W})=\Omega _x+{\mathscr{S}}(J)\) (with \(W_x\) and \(\Omega _x\) as in Theorem 3.2), and \(\phi _x(\pi _x E)\) is defined in analogy to (3.13) by

$$\begin{aligned} \phi _x \left(\pi _x E \right) := \left(X^{-1} \,\pi _x E |_{S_x} \right)^{-\frac{1}{2}}X^{-1}\,\pi _x E \,\in \, \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I) \end{aligned}$$

(the fact that this maps to the symmetric operators on \(S_x\) is verified as in (3.15)).

Proof

A direct computation shows that \({\mathscr{R}}\) and \(\Phi \) are inverses of each other: In order to compute \(\mathscr{R}\circ \Phi \), we use the block operator notation

$$\begin{aligned} E= \begin{pmatrix} E_{II} &{} E_{IJ} \\ E_{JI} &{} E_{JJ} \end{pmatrix} \in {\mathscr{S}}({\mathcal{H}}) \cap \hat{\Omega }_x . \end{aligned}$$

Then, there exist operators \(\tilde{E}_J, \hat{E}_J \in {\mathscr{S}}(J)\) such that \(E_{JJ}=\tilde{E}_J+\hat{E}_J\), and the operator

$$\begin{aligned} \tilde{E}:= \begin{pmatrix} E_{II} &{} E_{IJ} \\ E_{JI} &{} \tilde{E}_J \end{pmatrix} \end{aligned}$$

is contained in \(\Omega _x\). Note that \(\pi _x E = \pi _x\tilde{E}\) and therefore \(-\phi _x(\pi _x E)^*\phi _x(\pi _x E)=\tilde{E}\). We conclude that

$$\begin{aligned} \mathscr{R}\circ \Phi (E) = \begin{pmatrix} E_{II} &{} E_{IJ} \\ E_{JI} &{} \tilde{E}_J+E_{JJ}-\tilde{E}_J \end{pmatrix} =E . \end{aligned}$$

In order to compute \(\Phi \circ \mathscr{R}\), we take \((\psi , B)\in \hat{W}\) arbitrary and note that, due to the definition of \(\phi _x\) in (3.13) and Theorem 3.2, we have

$$\begin{aligned} \phi _x(\pi _x \mathscr{R}(\psi ,B)) = \phi _x(-\pi _x\psi ^*\psi ) = \phi _x(-\psi ^*\psi )=\psi \end{aligned}$$

(note that the first two mappings \(\phi _x\) are the ones defined in this theorem, whereas the third mapping is the one from (3.13)). We thus obtain

$$\begin{aligned} \Phi (\mathscr{R}(\psi ,B)) = \left(\psi , \, \pi _J\left(-\psi ^*\psi +\begin{pmatrix} 0 &{} 0 \\ 0 &{} B \end{pmatrix} + \psi ^*\psi \right)\big|_J \right) = (\psi , B). \end{aligned}$$

Next, the mappings \({\mathscr{R}}\) and \(\Phi \) are Fréchet-smooth because for operators of finite rank (namely rank at most 2n), the operator norm is equivalent to the Hilbert–Schmidt norm. Indeed, for an operator \(A \, :\, {\mathcal{H}}\rightarrow I\) mapping to a finite-dimensional Hilbert space I,

$$\begin{aligned} \Vert A\Vert ^2 \le \Vert A^{\dagger } A \Vert \le {{\,\mathrm{tr}\,}}(A^{\dagger } A) =\Vert A\Vert _{{\mathscr{S}}({\mathcal{H}}, I)}^2 \le \dim (I)\, \Vert A\Vert ^2 . \end{aligned}$$

This concludes the proof. \(\square \)
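The norm equivalence used in the proof above can be checked on a small example. The sketch below assumes a hypothetical real \(2 \times 3\) matrix `A`, viewed as an operator from \(\mathbb{R}^3\) to a two-dimensional space I. The squared operator norm is computed as the largest eigenvalue of the \(2\times 2\) Gram matrix \(A A^{T}\), using the closed formula for symmetric \(2\times 2\) matrices.

```python
import math

# Illustrative sketch: compare operator norm and Hilbert-Schmidt norm of a 2x3 matrix.
A = [[1.0, 2.0, 0.0],
     [0.0, 1.0, -1.0]]   # hypothetical example

# Entries of the symmetric Gram matrix G = A A^T.
g11 = sum(a * a for a in A[0])
g22 = sum(a * a for a in A[1])
g12 = sum(a * b for a, b in zip(A[0], A[1]))

# Largest eigenvalue of a symmetric 2x2 matrix: mean of the diagonal plus the
# radius sqrt(((g11 - g22) / 2)^2 + g12^2).
mean = (g11 + g22) / 2.0
radius = math.sqrt(((g11 - g22) / 2.0) ** 2 + g12 ** 2)
op_norm_sq = mean + radius      # squared operator norm of A
hs_norm_sq = g11 + g22          # squared Hilbert-Schmidt norm, tr(A A^T)
dim_I = 2                       # dimension of the target space

chain_holds = op_norm_sq <= hs_norm_sq <= dim_I * op_norm_sq
```

For this matrix the Gram matrix has diagonal entries 5 and 2 and off-diagonal entry 2, so the squared operator norm is 6 and the squared Hilbert–Schmidt norm is 7, in agreement with the chain of inequalities.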

We consider a smooth curve \(\gamma \) through x whose velocity in a chart \(\phi _y\) with \(x \in \Omega _y\) is \(\mathbf{v }\),

$$\begin{aligned} \gamma \, :\, (-\delta , \delta ) \rightarrow {\mathcal{F}}^{\mathrm{reg}} \qquad \text{with} \qquad \gamma (0)=x \quad \text{and} \quad \frac{\mathrm{d}}{\mathrm{d}\tau } \left(\phi _y \circ \gamma (\tau ) \right)\big|_{\tau =0} = \mathbf{v }\in V_y . \end{aligned}$$

The corresponding equivalence class defines a tangent vector \([x,\mathbf{v },y] \in T_x {\mathcal{F}}^{\mathrm{reg}}\). On the other hand, considering \(\gamma \) as a curve in \({\mathscr{S}}\), it has the tangent vector

$$\begin{aligned} \frac{\mathrm{d} \gamma (\tau )}{\mathrm{d}\tau } \Big|_{\tau =0} \in {\mathscr{S}}. \end{aligned}$$

In the chart \(\phi _x\) and setting \(\psi _0 = \phi _x(x)\), the curve is parametrized by \(\psi (\tau ) := \phi _x \circ \gamma (\tau )\) with

$$\begin{aligned} \gamma (\tau ) = \phi _x^{-1} \circ \psi (\tau )=-\psi (\tau )^* \psi (\tau ) \end{aligned}$$

and thus

$$\begin{aligned} \frac{\mathrm{d} \gamma (\tau )}{\mathrm{d}\tau } \Big|_{\tau =0} = D\phi _x^{-1}|_{\psi _0} \mathbf{v }= -\mathbf{v }^* \psi _0 - \psi _0^* \mathbf{v }\qquad \text{with} \qquad \mathbf{v }\in V_x . \end{aligned}$$

As \(\psi _0=\phi _x(x)=\pi _x\), a direct computation (for details, see the proof of Lemma 7.6 in Appendix 2) shows that the map \(V_x \ni \mathbf{v }\mapsto -\mathbf{v }^*\psi _0-\psi ^*_0\mathbf{v }= -\mathbf{v }^*\pi _x-\pi ^*_x \mathbf{v }\) is injective. This makes it possible to write the tangent space as

$$\begin{aligned} T_x {\mathcal{F}}^{\mathrm{reg}} \simeq T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} := \big \{ -\psi ^* \psi _0 - \psi _0^* \psi \,\big|\, \psi \in \mathrm{Symm}(S_x) \oplus \mathrm{L}(J, I) \big \} \subseteq {\mathscr{S}}({\mathcal{H}}) . \end{aligned}$$
(3.18)

Theorem 3.12

Using the identification (3.18), the mapping

$$\begin{aligned} g_x \, :\, T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} \times T_x^{{\mathscr{S}}} {\mathcal{F}}^{\mathrm{reg}} \rightarrow {\mathbb{R}}, \qquad g_x(A,B) := {{\,\mathrm{tr}\,}}(AB) \end{aligned}$$

defines a Fréchet-smooth Riemannian metric on \({\mathcal{F}}^{\mathrm{reg}}\). Moreover, the topology on \({\mathcal{F}}^{\mathrm{reg}}\) induced by the operator norm coincides with the topology induced by the Riemannian metric.

Proof

This follows immediately because \(g_x\) is the restriction of the Hilbert space scalar product to the smooth Fréchet submanifold \({\mathcal{F}}^{\mathrm{reg}}\). \(\square \)

We finally remark that the symmetric wave charts are related to Gaussian charts (see the formulas in [15, Sections 5 and 6.2], which apply to the infinite-dimensional case as well). Detailed computations for the Riemannian metric in symmetric wave charts are given in Appendix 2.

4 Differential calculus on expedient subspaces

If all functions arising in the analysis were Fréchet-smooth, all the methods and notions from the finite-dimensional setting could be adapted in a straightforward way to the infinite-dimensional setting. However, this procedure is not sufficient for our purposes, because the Lagrangian is not Fréchet-smooth. Therefore, we need to develop a differential calculus on Banach spaces for functions which are only Hölder continuous. Clearly, in general, such functions are not even Fréchet-differentiable, but the Gâteaux derivative may exist in certain directions. The disadvantage of Gâteaux derivatives is that the differentiable directions in general do not form a vector space. As a consequence, the usual computation rules like the linearity of the derivative or the chain and product rules cease to hold. Our strategy for preserving the usual computation rules is to work on suitable linear subspaces of the star-shaped set of all Gâteaux-differentiable directions, referred to as the expedient differentiable subspace.

4.1 The expedient differentiable subspaces

In this section, E and F denote Banach spaces.

Definition 4.1

Let \(U \subseteq E\) be open and \(f : U \rightarrow F\) an F-valued function. Moreover, let V be a subspace of E. The function f is k times V-differentiable at \(x_0 \in U\) if for every finite-dimensional subspace \(H \subseteq V\), the restriction of f to the affine subspace \(H+x_0\) denoted by

$$\begin{aligned} g^H : H \rightarrow F ,\qquad g^H(h) = f(x_0+h) \end{aligned}$$

is k-times continuously differentiable at \(h=0\). If this condition holds, the subspace V is called k-admissible at \(x_0\).

Thus, a function f is once V-differentiable at \(x_0\) if for every finite-dimensional subspace \(H \subseteq V\), for every \(h_0\) in a small neighborhood of the origin,

$$\begin{aligned} g^H(h) = g^H(h_0) + Dg^H|_{h_0} (h-h_0) + o(h-h_0) \qquad \text{for all}\,h \in H , \end{aligned}$$

and if \(Dg^H|_{h_0}\) is continuous in the variable \(h_0\) at \(h_0=0\). Equivalently, choosing a basis \(e_1, \ldots , e_L\) of H, this condition can be stated as the requirement that all partial derivatives

$$\begin{aligned} \frac{\partial }{\partial \alpha _i} g^H \left( \alpha _1 e_1 + \cdots + \alpha _L e_L \right) \end{aligned}$$

exist and are continuous at \(\alpha _1,\ldots , \alpha _L=0\). The higher differentiability of \(g^H\) can be defined inductively or, equivalently, by demanding that all partial derivatives up to the order k, i.e., all the functions

$$\begin{aligned} \frac{\partial ^p}{\partial \alpha _{i_1} \cdots \partial \alpha _{i_p}} g^H(\alpha _1 e_1 + \cdots + \alpha _L e_L) \end{aligned}$$

with \(i_1,\ldots , i_p \in \{1,\ldots , L\}\) and \(p \le k\), exist and are continuous at \(\alpha _1,\ldots , \alpha _L=0\).

An admissible subspace V is maximal if there are no admissible proper extensions \(\tilde{V} \supsetneq V\). The existence of maximal admissible subspaces is guaranteed by Zorn’s lemma, but maximal subspaces are in general not unique. In order to obtain a canonical subspace, we take the intersection of all maximal admissible subspaces:

Definition 4.2

The expedient k-differentiable subspace \({\mathcal{E}}^k(f,x_0)\) of f at \(x_0\) is defined as the intersection

$$\begin{aligned} {\mathcal{E}}^k(f,x_0) := \bigcap \big \{ V \,\big|\, V \subseteq E\ k\text{-admissible at}\,x_0\ \text{and maximal} \big \} . \end{aligned}$$
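
To illustrate why the intersection is taken, consider the following elementary example (our own illustration, with \(E={\mathbb{R}}^2\), \(F={\mathbb{R}}\) and \(k=1\) chosen purely for simplicity; it is not taken from the applications). For

$$\begin{aligned} f(a,b) := \sqrt{|a\,b|} , \qquad x_0 = (0,0) , \end{aligned}$$

the restriction of f to each coordinate axis vanishes identically, so that both \({\mathbb{R}}\, e_1\) and \({\mathbb{R}}\, e_2\) are 1-admissible at \(x_0\). Along any other line through the origin, spanned by a vector \((a,b)\) with \(a b \ne 0\), one has

$$\begin{aligned} f(t a, t b) = |t|\, \sqrt{|a\,b|} , \end{aligned}$$

which is not differentiable at \(t=0\). Hence no other line is admissible, and neither is \({\mathbb{R}}^2\), because it contains such lines as finite-dimensional subspaces. The two coordinate axes are therefore distinct maximal admissible subspaces, and

$$\begin{aligned} {\mathcal{E}}^1(f,x_0) = {\mathbb{R}}\, e_1 \cap {\mathbb{R}}\, e_2 = \{0\} . \end{aligned}$$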

Since the expedient differentiable subspace is again admissible at \(x_0\), we obtain a corresponding derivative as follows. Given \(k\in {\mathbb{N}}\) and vectors \(h_1, \ldots , h_k \in {\mathcal{E}}^k(f,x_0)\), we choose H as a finite-dimensional subspace of \({\mathcal{E}}^k(f,x_0)\) which contains these vectors. We set

$$\begin{aligned} D^{k,{\mathcal{E}}} f|_{x_0}(h_1,\ldots , h_k) := D^k g^H|_0(h_1, \ldots , h_k) \end{aligned}$$
(4.1)

(where again \(g^H(h):=f(x_0+h)\)).

Lemma 4.3

This procedure defines \(D^{k,{\mathcal{E}}} f|_{x_0}\) canonically as a symmetric, multilinear mapping

$$\begin{aligned} D^{k,{\mathcal{E}}} f|_{x_0} \, :\, \underbrace{{\mathcal{E}}^k(f,x_0) \times \cdots \times {\mathcal{E}}^k(f,x_0)}_{k\ \mathrm{factors}} \rightarrow F . \end{aligned}$$

Proof

In order to show that \(D^{k,{\mathcal{E}}} f|_{x_0}\) is well defined, let H and \(\tilde{H}\) be two finite-dimensional subspaces of \({\mathcal{E}}^k(f,x_0)\) which contain the vectors \(h_1, \ldots , h_k\). Then, expressing the total derivatives in terms of partial derivatives, it follows that

$$\begin{aligned} D^k g^H|_0(h_1, \ldots , h_k)&= \frac{\partial ^k}{\partial \alpha _1 \cdots \partial \alpha _k} f(x_0 + \alpha _1 h_1 + \cdots + \alpha _k h_k) \Big|_{\alpha _1=\cdots =\alpha _k=0} \\&= D^k g^{\tilde{H}}|_0(h_1, \ldots , h_k) . \end{aligned}$$

This shows that the definition (4.1) does not depend on the choice of H.

The symmetry and homogeneity follow immediately from the corresponding properties of \(D^k g^H\) in (4.1). In order to prove additivity, we let \(h_1, \ldots , h_k \in {\mathcal{E}}^k(f,x_0)\) and \(\tilde{h}_1, \ldots , \tilde{h}_k \in {\mathcal{E}}^k(f,x_0)\). We let H be the span of all these vectors and use that the corresponding operator \(D^k g^H|_0\) in (4.1) applied to \(h_1+\tilde{h}_1, \ldots , h_k +\tilde{h}_k\) is multilinear. \(\square \)

Note that the operator \(D^{k,{\mathcal{E}}} f|_{x_0}\) is in general not bounded. Moreover, \({\mathcal{E}}^k(f,x_0)\) will in general be neither a closed nor a dense subspace of E.

4.2 Derivatives along smooth curves

We now analyze under which assumptions directional derivatives exist. To this end, we let I be an interval and \(\gamma : I \rightarrow E\) a smooth curve (here, the notions of Fréchet and Gâteaux smoothness coincide). Moreover, let \(t_0 \in I\) with \(x_0:=\gamma (t_0) \in U\) and \(U\subseteq E\) open. Given a function \(f : U \rightarrow F\), we consider the composition

$$\begin{aligned} f \circ \gamma \, :\, I \rightarrow F . \end{aligned}$$

Proposition 4.4

(chain rule) Assume that f is locally Hölder continuous at \(x_0\), meaning that there is a neighborhood \(V \subseteq U\) of \(x_0\) as well as constants \(\alpha , c>0\) such that

$$\begin{aligned} \Vert f(x) - f(x') \Vert _F \le c\, \Vert x-x'\Vert _E^{\alpha } \qquad \text{for all}\,x,x' \in V. \end{aligned}$$
(4.2)

Moreover, assume that all the derivatives of \(\gamma \) at \(t_0\) up to the order

$$\begin{aligned} p := \bigg \lceil \frac{1}{\alpha } \bigg \rceil \end{aligned}$$
(4.3)

(where \(\lceil \cdot \rceil \) is the ceiling function) lie in the expedient differentiable subspace at \(x_0\), i.e.,

$$\begin{aligned} \gamma ^{(n)}(t_0) \in {\mathcal{E}}(f,x_0) \qquad \text{for all}\,n \in \{1, \ldots , p\} . \end{aligned}$$

Then, the function \(f \circ \gamma \) is differentiable at \(t_0\) and

$$\begin{aligned} (f\circ \gamma )'(t_0) = D^{{\mathcal{E}}}f|_{x_0}\, \gamma '(t_0) . \end{aligned}$$

Proof

We consider the polynomial approximation of \(\gamma \)

$$\begin{aligned} \gamma _p(t) := \sum _{n=0}^p \frac{\gamma ^{(n)}(t_0)}{n!}\, (t-t_0)^n . \end{aligned}$$
(4.4)

By assumption, this curve lies in the affine subspace \({\mathcal{E}}(f,x_0)+x_0\). Using that the restriction of f to this subspace is continuously differentiable, it follows that

$$\begin{aligned} (f\circ \gamma _p)'(t_0) = D^{{\mathcal{E}}} f|_{x_0}\, \gamma '(t_0) . \end{aligned}$$

It remains to control the error term of the polynomial approximation. Using that f is locally Hölder continuous, we know that

$$\begin{aligned} \big \Vert (f\circ \gamma )(t) - (f\circ \gamma _p)(t) \big \Vert _F \le c\, \Vert \gamma (t) - \gamma _p(t)\Vert _E^{\alpha } . \end{aligned}$$

Using that \(\gamma \) is smooth, it follows that

$$\begin{aligned} \big \Vert (f\circ \gamma )(t) - (f\circ \gamma _p)(t) \big \Vert _F \le \big \Vert o \left( (t-t_0)^p \right) \big \Vert _E^{\alpha } = o \left( (t-t_0)^{\alpha p} \right) . \end{aligned}$$
(4.5)

According to (4.3), we know that \(\alpha p \ge 1\). Therefore, the error term is of the order \(o(t-t_0)\), which shows that also the function \(t\mapsto (f\circ \gamma )(t) - (f\circ \gamma _p)(t)\) is differentiable with vanishing derivative. This proves the desired result. \(\square \)
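
The role of the order p in (4.3) can be seen in a simple example (an illustration under our own choice \(E={\mathbb{R}}^2\), \(F={\mathbb{R}}\), not taken from the applications). The function

$$\begin{aligned} f(a,b) := a + \sqrt{|b|} \end{aligned}$$

is locally Hölder continuous at \(x_0=(0,0)\) with \(\alpha = 1/2\), so that \(p=2\). Its restriction to any line through the origin other than the first coordinate axis is not differentiable at the origin, and thus \({\mathcal{E}}(f,x_0) = {\mathbb{R}}\, e_1\) with \(D^{{\mathcal{E}}} f|_{x_0} (h,0) = h\). Consider the two curves

$$\begin{aligned} \gamma (t) = (t, t^2) , \qquad \tilde{\gamma }(t) = (t, t^3) , \qquad t_0 = 0 . \end{aligned}$$

Their first derivatives agree, \(\gamma '(0) = \tilde{\gamma }'(0) = (1,0) \in {\mathcal{E}}(f,x_0)\), but \(\gamma ''(0) = (0,2) \not \in {\mathcal{E}}(f,x_0)\), whereas \(\tilde{\gamma }''(0) = (0,0) \in {\mathcal{E}}(f,x_0)\). Correspondingly,

$$\begin{aligned} (f\circ \gamma )(t) = t + |t| , \qquad (f\circ \tilde{\gamma })(t) = t + |t|^{\frac{3}{2}} . \end{aligned}$$

The first function is not differentiable at \(t=0\), while the second is differentiable with \((f\circ \tilde{\gamma })'(0) = 1 = D^{{\mathcal{E}}} f|_{x_0}\, \tilde{\gamma }'(0)\), in agreement with the proposition.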

This result immediately generalizes to higher derivatives:

Proposition 4.5

(higher order chain rule) Let \(q \in {\mathbb{N}}\). Assume that f is locally Hölder continuous at \(x_0\) (see (4.2)). Moreover, assume that all the derivatives of \(\gamma \) at \(t_0\) up to the order

$$\begin{aligned} p := \bigg \lceil \frac{q}{\alpha } \bigg \rceil \end{aligned}$$
(4.6)

lie in the expedient differentiable subspace at \(x_0\), i.e.,

$$\begin{aligned} \gamma ^{(n)}(t_0) \in {\mathcal{E}}^q(f,x_0) \qquad \text{for all}\,n \in \{1, \ldots , p\} . \end{aligned}$$

Then, the function \(f \circ \gamma \) is q-times differentiable at \(t_0\), and the derivative can be computed with the usual product and chain rules (formula of Faà di Bruno).

Proof

We again consider f along the polynomial approximation \(\gamma _p\) (4.4) of the curve \(\gamma \). By assumption, this curve lies in a finite-dimensional subspace of the affine space

$$\begin{aligned} {\mathcal{E}}^q(f,x_0)+x_0 \;\subset \; E . \end{aligned}$$

Using that the restriction of f to this subspace is q times continuously differentiable at \(x_0\), we know that \(f\circ \gamma _p\) is q times continuously differentiable at \(t=t_0\), and the derivatives can be computed with the formula of Faà di Bruno,

$$\begin{aligned} (f\circ \gamma _p)^{(q)}(t_0)&= D^{q,{\mathcal{E}}} f|_{x_0} \left(\gamma '(t_0), \ldots , \gamma '(t_0) \right) \\&\quad + \frac{q(q-1)}{2}\, D^{q-1,{\mathcal{E}}} f|_{x_0} \left(\gamma ''(t_0), \gamma '(t_0), \ldots , \gamma '(t_0) \right) + \cdots . \end{aligned}$$

Using (4.5) and (4.6), we conclude that

$$\begin{aligned} (f\circ \gamma )(t) - (f\circ \gamma _p)(t) = o \left( (t-t_0)^q \right) . \end{aligned}$$

It follows that also this function is q-times differentiable and that all its derivatives vanish. This concludes the proof. \(\square \)
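
For orientation, the lowest-order instances of the formula of Faà di Bruno read as follows. This is the standard expansion, written here for the restriction \(g^H(h)=f(x_0+h)\) to a finite-dimensional subspace H which contains the vectors \(\gamma ^{(n)}(t_0)\) with \(n \le p\), and assuming the hypotheses of the proposition for the respective q:

$$\begin{aligned} (f\circ \gamma )'(t_0)&= D g^H|_0 \left( \gamma '(t_0) \right) \\ (f\circ \gamma )''(t_0)&= D^2 g^H|_0 \left( \gamma '(t_0), \gamma '(t_0) \right) + D g^H|_0 \left( \gamma ''(t_0) \right) \\ (f\circ \gamma )'''(t_0)&= D^3 g^H|_0 \left( \gamma '(t_0), \gamma '(t_0), \gamma '(t_0) \right) + 3\, D^2 g^H|_0 \left( \gamma ''(t_0), \gamma '(t_0) \right) + D g^H|_0 \left( \gamma '''(t_0) \right) . \end{aligned}$$

Note that the number of derivatives of \(\gamma \) entering the assumption may exceed the number entering the formula: for \(q=2\) and \(\alpha =1/2\), one needs \(p = \lceil 2/\alpha \rceil = 4\) derivatives of \(\gamma \) in the expedient subspace, although only \(\gamma '(t_0)\) and \(\gamma ''(t_0)\) appear in the second derivative.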

5 Application to causal fermion systems in infinite dimensions

5.1 Local Hölder continuity of the causal Lagrangian

The goal of this section is to prove the following result.

Theorem 5.1

The Lagrangian is locally Hölder continuous in the sense that for all \(x,y_0 \in {\mathcal{F}}\) there is a neighborhood \(U \subseteq {\mathcal{F}}\) of \(y_0\) and a constant \(c>0\) such that

$$\begin{aligned} \left| {\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) \right| \le c\, \Vert y-\tilde{y}\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,y,\tilde{y} \in U, \end{aligned}$$
(5.1)

where n is the spin dimension. Moreover, the integrand of the boundedness constraint is locally Hölder continuous in the sense that

$$\begin{aligned} \left| |x y|^2 - |x\tilde{y}|^2 \right| \le c\, \Vert y-\tilde{y}\Vert ^{\frac{1}{2n}} \qquad \text{for all}\,y,\tilde{y} \in U. \end{aligned}$$
(5.2)

We begin with a preparatory lemma.

Lemma 5.2

(Hölder continuity of roots) Let

$$\begin{aligned} {{\mathcal{P}}}(\lambda ) := \lambda ^g + c_{g-1}\, \lambda ^{g-1} + \cdots + c_0 = \prod _{i=1}^g (\lambda - \lambda _i) \end{aligned}$$

be a complex monic polynomial of degree g with roots \(\lambda _1, \ldots , \lambda _g\). Then, there are constants \(C, \varepsilon >0\) such that any complex monic polynomial \(\tilde{{\mathcal{P}}}(\lambda ) =\lambda ^g + \tilde{c}_{g-1}\, \lambda ^{g-1} + \cdots + \tilde{c}_0\) of degree g which is close to \({{\mathcal{P}}}\) in the sense that

$$\begin{aligned} \Vert \tilde{{\mathcal{P}}} - {{\mathcal{P}}}\Vert := \max _{\ell \in \{0,\ldots ,g-1\} } \left| \tilde{c}_{\ell } - c_{\ell } \right| < \varepsilon \end{aligned}$$

can be written as \(\tilde{{\mathcal{P}}}(\lambda ) = \prod _{i=1}^g (\lambda - \tilde{\lambda }_i)\) with

$$\begin{aligned} |\lambda _i - \tilde{\lambda }_i| \le C\, \Vert \tilde{{\mathcal{P}}} - {{\mathcal{P}}} \Vert ^{\frac{1}{p_i}} \qquad \text{for all}\,i=1,\ldots , g, \end{aligned}$$

where \(p_i\) is the multiplicity of the root \(\lambda _i\).

This lemma is proven in a more general context in [4, Theorem 2]. To keep the presentation self-contained, we here give a simple proof based on Rouché’s theorem:

Proof of Lemma 5.2

After the rescaling \(\lambda \rightarrow \nu \lambda \) and \(\lambda _i \rightarrow \nu \lambda _i\) with \(\nu >0\), we can assume that all the roots \(\lambda _i\) are in the unit ball. Then, the polynomial \(\Delta {{\mathcal{P}}} := \tilde{{\mathcal{P}}} - {{\mathcal{P}}}\) is bounded in the ball of radius two by

$$\begin{aligned} |\Delta {{\mathcal{P}}}(\lambda )| \le g\,2^g\, \Vert \Delta {{\mathcal{P}}}\Vert \qquad \text{for all}\,\lambda \ \text{with}\,|\lambda | \le 2. \end{aligned}$$
(5.3)

We denote the minimal distance of distinct roots by

$$\begin{aligned} D := \min _{\lambda _i \ne \lambda _j} |\lambda _i - \lambda _j| . \end{aligned}$$

Since there is a finite number of roots, it clearly suffices to prove the lemma for one of them. Given \(i \in \{1, \ldots , g\}\), we choose

$$\begin{aligned} \delta = \bigg ( \frac{g\,2^{2g-p_i+1}}{D^{g-p_i}} \, \Vert \Delta {{\mathcal{P}}} \Vert \bigg )^{\frac{1}{p_i}} . \end{aligned}$$
(5.4)

Next, we choose \(\varepsilon \) so small that \(\delta <D/2\). We consider the ball \(\Omega = B_{\delta }(\lambda _i)\). Then, for any \(\lambda \in \partial \Omega \), the polynomial \({{\mathcal{P}}}\) satisfies the bound

$$\begin{aligned} |{{\mathcal{P}}}(\lambda ) | \ge (D/2)^{g-p_i} \,\delta ^{p_i} \ge g\,2^{g+1}\, \Vert \Delta {{\mathcal{P}}} \Vert > |\Delta {{\mathcal{P}}}(\lambda ) | , \end{aligned}$$

where we used (5.4) and (5.3). Therefore, Rouché’s theorem (see for example [27, Theorem 10.36]) implies that the polynomials \({{\mathcal{P}}}\) and \(\tilde{{\mathcal{P}}}\) have the same number of roots in the ball \(\Omega \). Thus, after a suitable ordering of the roots,

$$\begin{aligned} |\lambda _i - \tilde{\lambda }_i| \le \delta . \end{aligned}$$

Using (5.4) gives the result. \(\square \)
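
That the exponent \(1/p_i\) cannot be improved in general is seen from the following standard example (given here only as an illustration). For \(0< \eta < \varepsilon \), consider

$$\begin{aligned} {{\mathcal{P}}}(\lambda ) = \lambda ^g , \qquad \tilde{{\mathcal{P}}}(\lambda ) = \lambda ^g - \eta . \end{aligned}$$

Then \({{\mathcal{P}}}\) has the single root \(\lambda =0\) of multiplicity g, and \(\Vert \tilde{{\mathcal{P}}} - {{\mathcal{P}}}\Vert = \eta \). The roots of \(\tilde{{\mathcal{P}}}\) are \(\tilde{\lambda }_k = \eta ^{\frac{1}{g}}\, e^{2 \pi i k/g}\) with \(k=1,\ldots , g\), so that

$$\begin{aligned} |\lambda _k - \tilde{\lambda }_k| = \eta ^{\frac{1}{g}} = \Vert \tilde{{\mathcal{P}}} - {{\mathcal{P}}} \Vert ^{\frac{1}{g}} \qquad \text{for all}\,k=1,\ldots , g. \end{aligned}$$

For instance, for \(g=2\) and \(\eta = 10^{-4}\), the double root at the origin splits into the two roots \(\pm 10^{-2}\).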

Proof of Theorem 5.1

Let \(x, y \in {\mathcal{F}}\). Since both operators x and y vanish on the orthogonal complement of the span of their images, \(J :=\text{span}(S_x, S_y)\), it suffices to compute the eigenvalues on the finite-dimensional subspace J. Choosing an orthonormal basis of \(S_x=x({\mathcal{H}})\) and extending it to an orthonormal basis of J, the matrix \(x y|_J- \lambda {\mathbb {1}}_J\) has the block matrix form

$$\begin{aligned} \begin{pmatrix} x y \pi _x - \lambda {\mathbb {1}} &{} * \\ 0 &{} -\lambda {\mathbb {1}} \end{pmatrix} . \end{aligned}$$

Therefore, its characteristic polynomial is given by

$$\begin{aligned} \det \nolimits _J (x y- \lambda {\mathbb {1}}_J) = (-\lambda )^{\dim J -\dim x({\mathcal{H}})} \det \nolimits _{x({\mathcal{H}})} \left(x y \pi _x - \lambda {\mathbb {1}}_{x({\mathcal{H}})} \right) . \end{aligned}$$

This consideration shows that it suffices to analyze the operators \(x y \pi _x\) and similarly \(x \tilde{y} \pi _x\) on the finite-dimensional Hilbert space \(x({\mathcal{H}})\). We denote the corresponding characteristic polynomials by \({{\mathcal{P}}}\) and \(\tilde{{\mathcal{P}}}\), respectively. They are monic polynomials of degree \(g:= \dim x({\mathcal{H}})\). The difference of these polynomials can be estimated in terms of operator norms on \(\mathrm{L}({\mathcal{H}})\) as follows,

$$\begin{aligned} \Vert \tilde{{\mathcal{P}}} - {{\mathcal{P}}}\Vert \le c\left(g, \Vert x\Vert , \Vert y\Vert \right) \, \big \Vert x \tilde{y} \pi _x - x y \pi _x \big \Vert \le c'\left(g, \Vert x\Vert , \Vert y\Vert \right) \, \big \Vert \tilde{y}-y \big \Vert , \end{aligned}$$

valid for all \(\tilde{y}\) with \(\Vert \tilde{y}\Vert \le 2 \,\Vert y\Vert \). According to Lemma 5.2, for sufficiently small \(\Vert y-\tilde{y}\Vert \), the eigenvalues of these matrices can be arranged to satisfy the inequalities

$$\begin{aligned} |\lambda _i - \tilde{\lambda }_i| \le C\, \Vert \tilde{{\mathcal{P}}} -{{\mathcal{P}}} \Vert ^{\frac{1}{p_i}} \le C'\left(x, y \right) \, \big \Vert \tilde{y}-y \big \Vert ^{\frac{1}{p_i}} . \end{aligned}$$

In order to prove (5.2), we consider the estimate

$$\begin{aligned} \left| |x y|^2 - |x \tilde{y}|^2 \right|&\le \sum _{i=1}^g \left| |\lambda _i|^2 -|\tilde{\lambda }_i|^2 \right| \nonumber \\&\le \sum _{i=1}^g |\lambda _i - \tilde{\lambda }_i| \, \left( |\lambda _i| +|\tilde{\lambda }_i| \right) \le \tilde{C}(x, y) \, \big \Vert \tilde{y}-y \big \Vert ^{\frac{1}{g}} \end{aligned}$$
(5.5)

and use that \(g \le 2n\).

It remains to prove (5.1). In the case \(g<2n\), a simple estimate similar to (5.5) gives the result. In the remaining case \(g=2n\), using the abbreviation \(\Delta \lambda _i := \tilde{\lambda }_i - \lambda _i\), we obtain

$$\begin{aligned} \left| {\mathcal{L}}(x, \tilde{y}) - {\mathcal{L}}(x,y) \right|&\le \frac{1}{g} \sum _{i,j=1}^g \left| |\tilde{\lambda }_i - \tilde{\lambda }_j |^2 -|\lambda _i - \lambda _j |^2 \right| \\&\le \frac{1}{g} \sum _{i,j=1}^g \left( 2\, |\Delta \lambda _i - \Delta \lambda _j |\,|\lambda _i - \lambda _j | + |\Delta \lambda _i - \Delta \lambda _j |^2 \right) \\&\le c_2(x, y) \sum _{i,j=1}^g \left( \big \Vert \tilde{y}-y \big \Vert ^{\min \left(\frac{1}{p_i}, \frac{1}{p_j} \right) }\,|\lambda _i -\lambda _j | + \big \Vert \tilde{y}-y \big \Vert ^{\frac{2}{g}} \right) \\&\le c_3(x, y) \sum _{i,j=1}^g \left( \big \Vert \tilde{y}-y \big \Vert ^{\frac{1}{g-1}} + \big \Vert \tilde{y}-y \big \Vert ^{\frac{2}{g}} \right) , \end{aligned}$$

where in the last step we used that whenever \(\lambda _i \ne \lambda _j\), the multiplicities of both roots are at most \(g-1\). The inequality

$$\begin{aligned} \frac{2}{g} = \frac{1}{n} \ge \frac{1}{2n-1} = \frac{1}{g-1} , \end{aligned}$$

yields the desired Hölder inequality with exponent \(1/(2n-1)\). Finally, it is clear from the construction that the constant depends continuously on y. This concludes the proof. \(\square \)

In the case of spin dimension one, the Lagrangian is Lipschitz continuous, in agreement with the findings in [20]. If the spin dimension is larger, one still has Hölder continuity, but the Hölder exponent becomes smaller if the spin dimension is increased. This can be understood from the fact that the higher the spin dimension is, the higher the degeneracies of the eigenvalues of xy can be.
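
For concreteness, the Hölder exponents in (5.1) and (5.2) take the following values for small spin dimensions (a direct evaluation of the formulas above):

$$\begin{aligned} \begin{array}{c|ccc} n &amp; 1 &amp; 2 &amp; 3 \\ \hline \frac{1}{2n-1} &amp; 1 &amp; \frac{1}{3} &amp; \frac{1}{5} \\ \frac{1}{2n} &amp; \frac{1}{2} &amp; \frac{1}{4} &amp; \frac{1}{6} \end{array} \end{aligned}$$

Inserting \(\alpha = 1/(2n-1)\) into (4.3) gives \(p = 2n-1\), so that the chain rule of Proposition 4.4 applied to the Lagrangian involves derivatives of the curve up to the orders \(p=1, 3, 5\) for \(n=1,2,3\), respectively.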

We next prove a global Hölder continuity result.

Theorem 5.3

(Global Hölder continuity) There is a constant c(n) which depends only on the spin dimension such that for all \(x,y \in {\mathcal{F}}\) with \(y \ne 0\), there is a neighborhood \(U\subseteq {\mathcal{F}}\) of y with

$$\begin{aligned} | {\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) | \le c(n) \,\Vert y\Vert ^{2-\frac{1}{2n-1}} \, \Vert x\Vert ^2\, \Vert \tilde{y}-y\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{y}\in U. \end{aligned}$$
(5.6)

Proof

Without loss of generality, we can assume that \(x \ne 0\). Moreover, using that both sides of the inequality (5.6) have the same scaling behavior under the rescaling

$$\begin{aligned} x \rightarrow \frac{x}{\Vert x\Vert }\;, \quad y \rightarrow \frac{y}{\Vert y\Vert } \;, \quad \tilde{y} \rightarrow \frac{\tilde{y}}{\Vert y\Vert }\;, \end{aligned}$$

it suffices to consider the case that \(\Vert x\Vert =\Vert y\Vert =1\).

Next, choosing a fixed 4n-dimensional subspace \(I \subseteq {\mathcal{H}}\), we can always find a unitary transformation \(U: {\mathcal{H}}\rightarrow {\mathcal{H}}\) such that \(UxU^{-1}({\mathcal{H}}), UyU^{-1}({\mathcal{H}}) \subseteq I\). Since the Lagrangian and the operator norms are invariant under such joint unitary transformations (as they leave the eigenvalues of xy invariant), we can assume that both x and y map into the fixed finite-dimensional subspace I.

After these transformations, the operators x and y can be considered as operators in \(\mathrm{L}(I)\). Therefore, they lie in the compact set \(\overline{B_1(0)} \subseteq \mathrm{L}(I)\). Since the Hölder constant for the local Hölder continuity depends continuously on x and y, a compactness argument shows that we can choose the Hölder constant uniformly in x and y: As the previous arguments show, the local Hölder constant can be written as a continuous function \(c: \mathrm{L}(I)\times \mathrm{L}(I) \rightarrow {\mathbb{R}}^+,\, (x,y) \mapsto c(x,y)\). Since \(\overline{B_1(0)} \times \overline{B_1(0)} \subseteq \mathrm{L}(I)\times \mathrm{L}(I)\) is compact, the local Hölder constant function c is bounded on this set by a constant \(c_{\mathrm{max}}>0\), which can then be taken as the desired global Hölder constant. \(\square \)
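
The scaling argument at the beginning of the proof can be made explicit as follows. This sketch uses that the Lagrangian is homogeneous of degree two in each argument, \({\mathcal{L}}(a x, b y) = a^2 b^2\, {\mathcal{L}}(x,y)\) for \(a,b>0\), a property of the causal Lagrangian which is not restated in this section. Setting \(\hat{x} := x/\Vert x\Vert \), \(\hat{y} := y/\Vert y\Vert \), \(\hat{\tilde{y}} := \tilde{y}/\Vert y\Vert \) and \(\alpha := 1/(2n-1)\), the inequality for the normalized operators,

$$\begin{aligned} | {\mathcal{L}}(\hat{x},\hat{y}) - {\mathcal{L}}(\hat{x},\hat{\tilde{y}}) | \le c(n) \, \Vert \hat{\tilde{y}}-\hat{y}\Vert ^{\alpha } , \end{aligned}$$

multiplied by \(\Vert x\Vert ^2\, \Vert y\Vert ^2\), gives

$$\begin{aligned} | {\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) | \le c(n) \,\Vert x\Vert ^2\, \Vert y\Vert ^2 \left( \frac{\Vert \tilde{y}-y\Vert }{\Vert y\Vert } \right) ^{\alpha } = c(n) \,\Vert y\Vert ^{2-\alpha } \, \Vert x\Vert ^2\, \Vert \tilde{y}-y\Vert ^{\alpha } , \end{aligned}$$

which is (5.6).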

Remark 5.4

  1. (1)

    Since the Lagrangian is symmetric, Theorem 5.3 also gives rise to global Hölder continuity with respect to the other argument. Thus, for all \(x,y \in {\mathcal{F}}\) with \(x \ne 0\), there is a neighborhood \(U \subseteq {\mathcal{F}}\) of x such that

    $$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(\tilde{x},y)| \le c(n) \Vert x\Vert ^{2-\frac{1}{2n-1}} \Vert y\Vert ^2 \Vert \tilde{x}-x\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{x}\in U. \end{aligned}$$
    (5.7)
  2. (2)

    As explained in the proof of Theorem 5.1, the Lagrangian \({\mathcal{L}}(x,y)\) depends only on the nonzero eigenvalues of xy and these coincide with the eigenvalues of \(xy\pi _x\). Thus, denoting

    $$\begin{aligned} J:=\mathrm{span}(S_x,S_{\tilde{x}}) , \end{aligned}$$

    we immediately obtain the following strengthened version of (5.7): Every \(x\ne 0\) has a neighborhood \(U \subset {\mathcal{F}}\) such that the inequality

    $$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(\tilde{x},y)|&= |{\mathcal{L}}(x,\pi _J \,y\,\pi _J) -{\mathcal{L}}(\tilde{x},\pi _J \,y\, \pi _J)| \nonumber \\&\le c(n)\, \Vert x\Vert ^{2-\frac{1}{2n-1}} \,\Vert \pi _J \,y\, \pi _J\Vert ^2 \,\Vert \tilde{x} -x\Vert ^{\frac{1}{2n-1}} , \end{aligned}$$
    (5.8)

    holds for all \(\tilde{x} \in U\) and all \(y \in {\mathcal{F}}\). This estimate will be needed for the proof of the chain rule for the integrated Lagrangian \(\ell \) in Theorem 5.9.

  3. (3)

    In the case \(y=0\), a direct estimate of the eigenvalues shows that one has Hölder continuity with the improved exponent two,

    $$\begin{aligned} \left| {\mathcal{L}}(x,\tilde{y}) \right| \le c(n) \, \Vert x\Vert ^2\, \Vert \tilde{y}\Vert ^2 . \end{aligned}$$

    This inequality can be combined with the result of Theorem 5.3 to obtain the statement that for all \(x,y \in {\mathcal{F}}\) there is a neighborhood \(U\subseteq {\mathcal{F}}\) of y with

    $$\begin{aligned} | {\mathcal{L}}(x,y) - {\mathcal{L}}(x,\tilde{y}) | \le c(n, y) \Vert x\Vert ^2\, \Vert \tilde{y}-y\Vert ^{\frac{1}{2n-1}} \qquad \text{for all}\,\tilde{y}\in U. \end{aligned}$$
    (5.9)

    Likewise, (5.8) generalizes to

    $$\begin{aligned} |{\mathcal{L}}(x,y) - {\mathcal{L}}(\tilde{x},y)| \le c(n,x) \,\Vert \pi _J \,y\, \pi _J\Vert ^2 \,\Vert \tilde{x}-x\Vert ^{\frac{1}{2n-1}} . \end{aligned}$$
    (5.10)

    This inequality will be used in the proof of Theorem 5.9.

\(\square \)

5.2 Definition of jet spaces

For the analysis of causal variational principles, the jet formalism was developed in [17]; see also [13, Section 2]. We now generalize the definition of the jet spaces to causal fermion systems in the infinite-dimensional setting. Our method is to work with the expedient subspaces, where for convenience derivatives at x are always computed in the corresponding chart \(\phi _x\). For example, for analyzing the differentiability of a real-valued function f at a point \(x \in {\mathcal{F}}^{\mathrm{reg}}\), we consider the composition

$$\begin{aligned} f \circ \phi _x^{-1} \, :\, \Omega _x \subseteq \mathrm{Symm}(S_x) \oplus \mathrm{L}(J,I) \rightarrow {\mathbb{R}}. \end{aligned}$$

We introduce \(\Gamma ^{\mathrm{diff}}_{\rho }\) as the linear space of all vector fields for which the directional derivative of the function \(\ell \) exists in the sense of expedient subspaces (see Definition 4.2),

$$\begin{aligned} \Gamma ^{\mathrm{diff}}_{\rho } =\Big \{ \mathbf{u }\in C^{\infty }(M, T{\mathcal{F}}^{\mathrm{reg}}) \;\big|\; \mathbf{u }(x) \in {\mathcal{E}}\left( \ell \circ \phi _x^{-1}, \phi _x(x) \right)\ \text{for all}\,x \in M \Big \} . \end{aligned}$$

This gives rise to the jet space

$$\begin{aligned} \mathfrak {J}^{\mathrm{diff}}_{\rho } := C^{\infty }(M, {\mathbb{R}}) \oplus \Gamma ^{\mathrm{diff}}_{\rho } \;\subseteq \; \mathfrak {J}_{\rho } . \end{aligned}$$
(5.11)

We choose a linear subspace \(\mathfrak {J}^{\mathrm{test}}_{\rho } \subseteq \mathfrak {J}^{\mathrm{diff}}_{\rho }\) with the property that its scalar and vector components are both vector spaces,

$$\begin{aligned} \mathfrak {J}^{\mathrm{test}}_{\rho } = C^{\mathrm{test}}(M, {\mathbb{R}}) \oplus \Gamma ^{\mathrm{test}}_{\rho } \;\subseteq \; \mathfrak {J}^{\mathrm{diff}}_{\rho } , \end{aligned}$$
(5.12)

and the scalar component is nowhere trivial in the sense that

$$\begin{aligned} \text{for all}\,x \in M\ \text{there is}\,a \in C^{\mathrm{test}}(M, {\mathbb{R}})\ \text{with}\,a(x) \ne 0. \end{aligned}$$

It is convenient to consider a pair \(\mathfrak {u}:= (a, \mathbf{u })\) consisting of a real-valued function a on M and a vector field \(\mathbf{u }\) on \(T{\mathcal{F}}^{\mathrm{reg}}\) along M and to denote the combination of multiplication and directional derivative by

$$\begin{aligned} \nabla _{\mathfrak {u}} \ell (x) := a(x)\, \ell (x) + \left(D_{\mathbf{u }} \ell \right)(x) . \end{aligned}$$
(5.13)

For the Lagrangian, being a function of two variables \(x,y \in {\mathcal{F}}^{\mathrm{reg}}\), we always work in charts \(\phi _x\) and \(\phi _y\), giving rise to the mapping

$$\begin{aligned} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right) ={\mathcal{L}}\left(\phi _x^{-1}(.), \phi _y^{-1}(.) \right) \, :\, \Omega _x \times \Omega _y \subseteq E \rightarrow {\mathbb{R}}, \end{aligned}$$
(5.14)

where E is the Cartesian product of Banach spaces

$$\begin{aligned} E := \left( \mathrm{Symm}(S_x) \oplus \mathrm{L}(J_x,I_x) \right) \times \left( \mathrm{Symm}(S_y) \oplus \mathrm{L}(J_y,I_y) \right) \end{aligned}$$

with the norm

$$\begin{aligned} \Vert (\psi _x, \psi _y)\Vert _E := \max \left( \Vert \psi _x \Vert _{\mathrm{L}({\mathcal{H}})}, \Vert \psi _y\Vert _{\mathrm{L}({\mathcal{H}})} \right) \end{aligned}$$

(where the subscripts x and y clarify the dependence on the base points, i.e., \(I_x = x({\mathcal{H}})\), \(J_x = I_x^{\perp } \subseteq {\mathcal{H}}\) and similarly at y). We denote partial derivatives acting on the first and second arguments by subscripts 1 and 2, respectively. Throughout this paper, we use the following conventions for partial derivatives and jet derivatives:

\({\blacktriangleright}\):

Partial and jet derivatives with an index \(i \in \{ 1,2 \}\), as for example in (5.15), only act on the respective variable of the function \({\mathcal{L}}\). This implies, for example, that the derivatives commute,

$$\begin{aligned} \nabla _{1,\mathfrak {v}} \nabla _{1,\mathfrak {u}} {\mathcal{L}}(x,y) = \nabla _{1,\mathfrak {u}} \nabla _{1,\mathfrak {v}} {\mathcal{L}}(x,y) . \end{aligned}$$
\({\blacktriangleright}\):

The partial or jet derivatives which do not carry an index act as partial derivatives on the corresponding argument of the Lagrangian. This implies, for example, that

$$\begin{aligned} \nabla _{\mathfrak {u}} \int _{{\mathcal{F}}} \nabla _{1,\mathfrak {v}} \, {\mathcal{L}}(x,y) \, \mathrm{d}\rho (y) =\int _{{\mathcal{F}}} \nabla _{1,\mathfrak {u}} \nabla _{1,\mathfrak {v}}\, {\mathcal{L}}(x,y) \, \mathrm{d}\rho (y) . \end{aligned}$$

Definition 5.5

For any \(\ell \in {\mathbb{N}}_0 \cup \{\infty \}\), the jet space \(\mathfrak {J}_{\rho }^{\ell } \subseteq \mathfrak {J}_{\rho }\) is defined as the vector space of test jets with the following properties:

  1. (i)

    The directional derivatives up to order \(\ell \) exist in the sense that

    $$\begin{aligned} \mathfrak {J}^{\ell }_{\rho }&\subseteq \Big \{ (b,\mathbf{v }) \in \mathfrak {J}_{\rho } \,\Big|\, \left(\mathbf{v }(x), \mathbf{v }(y) \right) \in \Gamma ^{\ell }_{\rho }(x,y) \\&\qquad \text{for all}\,y \in M\ \text{and}\,x\ \text{in an open neighborhood of}\,M \subseteq {\mathcal{F}}^{\mathrm{reg}}\Big \} , \end{aligned}$$

    where

    $$\begin{aligned} \Gamma ^{\ell }_{\rho }(x,y) := {\mathcal{E}}^{\ell } \left({\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right), \left(\phi _x(x), \phi _y(y) \right) \right) . \end{aligned}$$

    The higher jet derivatives are defined by using (5.13) and multiplying out, keeping in mind that the partial derivatives act only on the Lagrangian, i.e.,

    $$\begin{aligned}&\nabla ^{p, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))} \left( \left(\mathfrak {v}_1(x), \mathfrak {v}_1(y) \right), \ldots , \left(\mathfrak {v}_p(x), \mathfrak {v}_p(y) \right) \right) \\&\quad := D^{p, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))} \left( \left(\mathbf{v }_1(x), \mathbf{v }_1(y) \right), \ldots , \left(\mathbf{v }_p(x), \mathbf{v }_p(y) \right) \right) \\&\qquad \; + \left(b_1(x)+b_1(y) \right) \, D^{p-1, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))}\\&\qquad \qquad \times \left( \left(\mathbf{v }_2(x), \mathbf{v }_2(y) \right), \ldots , \left(\mathbf{v }_p(x), \mathbf{v }_p(y) \right) \right) \\&\qquad \; + \left(b_2(x)+b_2(y) \right)\, D^{p-1, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\big|_{(\phi _x(x), \phi _y(y))} \\&\qquad \qquad \times \left( \left(\mathbf{v }_1(x), \mathbf{v }_1(y) \right), \left(\mathbf{v }_3(x), \mathbf{v }_3(y) \right), \ldots , \left(\mathbf{v }_p(x), \mathbf{v }_p(y) \right) \right) \\&\qquad \; + \cdots + \left(b_1(x)+b_1(y) \right) \cdots \left(b_p(x)+b_p(y) \right)\, {\mathcal{L}}(x,y). \end{aligned}$$
  2. (ii)

    The functions

    $$\begin{aligned}&\left( \nabla _{1, \mathfrak {v}_1} + \nabla _{2, \mathfrak {v}_1} \right) \cdots \left( \nabla _{1, \mathfrak {v}_p} + \nabla _{2, \mathfrak {v}_p} \right) {\mathcal{L}}(x,y) \nonumber \\&\qquad := \nabla ^{p, {\mathcal{E}}} {\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right) \big|_{(\phi _x(x), \phi _y(y))} \left( \left(\mathfrak {v}_1(x), \mathfrak {v}_1(y) \right), \ldots , \left(\mathfrak {v}_p(x), \mathfrak {v}_p(y) \right) \right) \end{aligned}$$
    (5.15)

    are \(\rho \)-integrable in the variable y, giving rise to locally bounded functions in x. More precisely, these functions are in the space

    $$\begin{aligned} L^{\infty }_{\mathrm{loc}}\left( M, L^1\left(M, d\rho (y) \right); d\rho (x) \right) . \end{aligned}$$
  3. (iii)

    Integrating the expression (5.15) in y over M with respect to the measure \(\rho \), the resulting function g (defined for all x in an open neighborhood of M) is continuously differentiable in the direction of every jet \(\mathfrak {u}\in \mathfrak {J}^{\mathrm{test}}_{\rho }\), i.e.,

    $$\begin{aligned} \Gamma ^{\mathrm{test}}_x \subseteq {\mathcal{E}}(g, x) \qquad \text{for all}\,x \in M. \end{aligned}$$
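
To make the multiplying-out in item (i) explicit, we write down the cases \(p=1\) and \(p=2\), obtained by specializing the general formula. Here we use abbreviations introduced only for this illustration, namely \(\hat{{\mathcal{L}}} := {\mathcal{L}}\circ (\phi _x^{-1} \times \phi _y^{-1})\), \(\beta _i := b_i(x)+b_i(y)\) and \(\mathbf{w }_i := (\mathbf{v }_i(x), \mathbf{v }_i(y))\), with all derivatives evaluated at \((\phi _x(x), \phi _y(y))\):

$$\begin{aligned} \nabla ^{1, {\mathcal{E}}} \hat{{\mathcal{L}}}&= D^{1, {\mathcal{E}}} \hat{{\mathcal{L}}} \left( \mathbf{w }_1 \right) + \beta _1\, {\mathcal{L}}(x,y) \\ \nabla ^{2, {\mathcal{E}}} \hat{{\mathcal{L}}}&= D^{2, {\mathcal{E}}} \hat{{\mathcal{L}}} \left( \mathbf{w }_1, \mathbf{w }_2 \right) + \beta _1\, D^{1, {\mathcal{E}}} \hat{{\mathcal{L}}} \left( \mathbf{w }_2 \right) + \beta _2\, D^{1, {\mathcal{E}}} \hat{{\mathcal{L}}} \left( \mathbf{w }_1 \right) + \beta _1 \beta _2\, {\mathcal{L}}(x,y) . \end{aligned}$$

The first line is the two-point analog of (5.13).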

5.3 Derivatives of \({\mathcal{L}}\) and \(\ell \) along smooth curves

In this section, we use the chain rule in Proposition 4.4 in order to differentiate the Lagrangian \({\mathcal{L}}\) and the function \(\ell \) along smooth curves.

Theorem 5.6

Let \(\gamma _1\) and \(\gamma _2\) be two smooth curves in \({\mathcal{F}}^{\mathrm{reg}}\),

$$\begin{aligned} \gamma _1, \gamma _2 \in C^{\infty }((-\delta , \delta ), {\mathcal{F}}^{\mathrm{reg}}) . \end{aligned}$$

Setting \(x=\gamma _1(0)\) and \(y=\gamma _2(0)\), we assume that the tangent vectors up to the order \(p=2n-1\) denoted by

$$\begin{aligned} \mathbf{v }_1^{(1)}&:= (\phi _x \circ \gamma _1)'(0) , \ldots , \, \mathbf{v }_1^{(p)} := (\phi _x \circ \gamma _1)^{(p)}(0) \\ \mathbf{v }_2^{(1)}&:= (\phi _y \circ \gamma _2)'(0) , \ldots , \, \mathbf{v }_2^{(p)} := (\phi _y \circ \gamma _2)^{(p)}(0) \end{aligned}$$

are in the expedient differentiable subspace of the Lagrangian, i.e.,

$$\begin{aligned} \left( \mathbf{v }^{(1)}_1, \mathbf{v }^{(1)}_2 \right), \ldots , \left( \mathbf{v }^{(p)}_1, \mathbf{v }^{(p)}_2 \right) \in \Gamma _{\rho }(x,y) . \end{aligned}$$

Then, the function \({\mathcal{L}}(\gamma _1(\tau ), \gamma _2(\tau ))\) is \(\tau \)-differentiable at \(\tau =0\) and the chain rule holds, i.e.,

$$\begin{aligned} \frac{\mathrm{d}}{\mathrm{d}\tau } {\mathcal{L}}\left(\gamma _1(\tau ), \gamma _2(\tau ) \right)\big|_{\tau =0}&= D^{{\mathcal{E}}} \left({\mathcal{L}}\circ \left(\phi _x^{-1} \times \phi _y^{-1}\right)\right) \big|_{(\phi _x(x), \phi _y(y))} \left(\mathbf{v }_1^{(1)}, \mathbf{v }_2^{(1)} \right) \\&\equiv \left( D_{1, \gamma _1'(0)} + D_{2, \gamma _2'(0)} \right) {\mathcal{L}}(x,y) . \end{aligned}$$

Proof

We again consider the Lagrangian in the charts \(\phi _x\) and \(\phi _y\), (5.14). In order to show that this function is locally Hölder continuous on E, we begin with the estimate

$$\begin{aligned}&\left| {\mathcal{L}}\left(\phi ^{-1}_x(\tilde{\psi }_x), \phi ^{-1}_y(\tilde{\psi }_y) \right) -{\mathcal{L}}(x,y) \right| \\&\quad \le \left| {\mathcal{L}}\left( \phi ^{-1}_x(\tilde{\psi }_x), \phi ^{-1}_y(\tilde{\psi }_y) \right) - {\mathcal{L}}\left(\phi ^{-1}_x(\tilde{\psi }_x), y \right) \right| + \left| {\mathcal{L}}\left(\phi ^{-1}_x(\tilde{\psi }_x), y \right) - {\mathcal{L}}(x,y) \right| \\&\quad \le c \left( \Vert \phi ^{-1}_x(\tilde{\psi }_x) - x\Vert ^{\alpha }_{\mathrm{L}({\mathcal{H}})} + \Vert \phi ^{-1}_y(\tilde{\psi }_y) - y\Vert ^{\alpha }_{\mathrm{L}({\mathcal{H}})} \right) . \end{aligned}$$

Noting that the function

$$\begin{aligned} \phi _x^{-1}(\tilde{\psi }_x) = -\tilde{\psi }_x^* \tilde{\psi }_x \end{aligned}$$

is bilinear and therefore Fréchet-smooth, it follows that

$$\begin{aligned}&\left| {\mathcal{L}}\left(\phi ^{-1}_x(\tilde{\psi }_x), \phi ^{-1}_y(\tilde{\psi }_y) \right) -{\mathcal{L}}(x,y) \right| \\&\quad \le c C \left( \Vert \tilde{\psi }_x - \psi _x \Vert ^{\alpha }_{\mathrm{L}({\mathcal{H}}, I)} +\Vert \tilde{\psi }_y - \psi _y\Vert ^{\alpha }_{\mathrm{L}({\mathcal{H}}, I)} \right) \le 2 c C \,\big \Vert (\tilde{\psi }_x, \tilde{\psi }_y) -(\psi _x, \psi _y) \big \Vert ^{\alpha }_E , \end{aligned}$$

where \(\psi _x := \phi _x(x)\) and \(\psi _y := \phi _y(y)\). This proves local Hölder continuity on E. Applying Proposition 4.4 gives the result. \(\square \)

We remark that using Proposition 4.5, the above method could be generalized in a straightforward manner to higher derivatives.

Definition 5.7

We call \(\ell \) Hölder continuous with Hölder exponent \(\alpha \) along a smooth curve \(\gamma : I \rightarrow {\mathcal{F}}\) (with I an open interval) if for any \(t_0 \in I\) with \(x_0 = \gamma (t_0)\) there exist a subspace \(E_0 \subseteq \mathrm{Symm}(S_{x_0}) \oplus \mathrm{L}(J_{x_0},I_{x_0})\) and \(\delta >0\) such that the mapping

$$\begin{aligned} \gamma _{x_0}: (t_0-\delta , t_0+\delta ) \rightarrow E_0, \quad t \mapsto \phi _{x_0}\circ \gamma (t) - ({\mathbb {1}},0)\;, \end{aligned}$$

is well defined and locally Hölder continuous with Hölder exponent \(\alpha \).

Theorem 5.8

Let \(\gamma : I \rightarrow {\mathcal{F}}\) be a smooth curve and \(\ell \) Hölder continuous along \(\gamma \) with Hölder exponent \(\alpha \). For \(t_0 \in I\) with \(x_0 = \gamma (t_0)\), we set

$$\begin{aligned} \ell _{x_0}: E_0 \rightarrow {\mathbb{R}},\quad \ell _{x_0}(x) =\ell \circ \phi _{x_0}^{-1} \left(x+({\mathbb {1}},0) \right) . \end{aligned}$$

If for any \(t_0\in I\), the derivatives of \(\gamma _{x_0}\) up to the order \(p:=\lceil q/\alpha \rceil \) lie in the expedient differentiable subspace at \(x_0\), i.e.,

$$\begin{aligned} (\gamma _{x_0})^{(n)}(t_0) \in {\mathcal{E}}^q\left(\ell _{x_0}, 0\right) \quad \mathrm{for\;all\;}n\in \{1, \dots , p\}\;, \end{aligned}$$

then the function \(\ell \circ \gamma = \ell _{x_0} \circ \gamma _{x_0}\) is q-times differentiable at \(t_0\). Moreover, the usual product and chain rules hold for \(\ell _{x_0} \circ \gamma _{x_0}\).

Proof

Applying Proposition 4.5 to \(\ell _{x_0}\) and \(\gamma _{x_0}\) yields the claim, because the assumptions of that proposition are fulfilled by hypothesis. \(\square \)
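
To get a feeling for the order \(p=\lceil q/\alpha \rceil \) in Theorem 5.8, one can insert the Hölder exponent \(\alpha = \frac{1}{2n-1}\) that appears in Theorem 5.9. The value \(n=2\) used below is chosen purely for illustration and is not prescribed by the text:

$$\begin{aligned} p = \left\lceil \frac{q}{\alpha } \right\rceil = q\,(2n-1) \;, \qquad \text{so for } n=2:\quad \alpha = \tfrac{1}{3} \;,\quad p = 3 \text{ for } q=1 \;,\quad p = 6 \text{ for } q=2 \;. \end{aligned}$$

In words, the smaller the Hölder exponent, the more derivatives of \(\gamma _{x_0}\) must lie in the expedient differentiable subspace in order to obtain a given order of differentiability of \(\ell \circ \gamma \).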

We now give a sufficient condition which ensures that \(\ell \) is Hölder continuous along \(\gamma \). This condition needs to be verified in the applications; see for example [25].

Theorem 5.9

Let \(\gamma \) be a smooth curve in \({\mathcal{F}}\) with

$$\begin{aligned} \int _M \big \Vert P(\gamma (\tau ), y) \big \Vert ^4 \, \big \Vert Y^{-1} \big \Vert ^2 \,\mathrm{d}\rho (y) < C \qquad \text{for all}\,\tau \in (-\delta ,\delta ) , \end{aligned}$$

where \(P(x,y)\) is again the kernel of the fermionic projector (3.11) and Y is (similar to (3.5)) the invertible operator

$$\begin{aligned} Y := y|_{S_y} \, :\, S_y \rightarrow S_y . \end{aligned}$$

Then the integrated Lagrangian \(\ell \) defined by (1.1) is Hölder continuous along \(\gamma \) with Hölder exponent \(\frac{1}{2n-1}\).

Proof

The idea of the proof is to integrate the estimate (5.10) over M. To this end, it is crucial to estimate the factor \(\Vert \pi _J y \pi _J \Vert \). We let \((\tilde{\phi }_i)_{i=1, \dots , m}\) be an orthonormal basis of J and denote the orthogonal projection onto \(\mathrm{span}(\tilde{\phi }_i)\) by \(\pi _i\). Since on the finite-dimensional vector space L(J) all norms are equivalent, we can work with the Hilbert–Schmidt norm of \(\pi _J y\pi _J\), i.e., for a suitable constant \(C=C(n)\),

$$\begin{aligned} \Vert \pi _J \,y\, \pi _J \Vert ^2 = \Vert \pi _J \,y\,Y^{-1}\,y \pi _J \Vert ^2 \le \Vert \pi _J \,y\Vert ^2 \,\Vert Y^{-1}\Vert ^2 \, \Vert y \,\pi _J \Vert ^2 = \Vert \pi _J \,y\Vert ^4 \,\Vert Y^{-1}\Vert ^2 , \end{aligned}$$

where in the last step, we used that the norm of an operator is the same as the norm of its adjoint. Combining this inequality with the estimate

$$\begin{aligned} \big \Vert \pi _J \,y \,\psi \big \Vert ^2&\le \left( \big \Vert \pi _x \,y \,\psi \big \Vert +\big \Vert \pi _{\tilde{x}} \,y \,\psi \big \Vert \right)^2 \le 2 \,\big \Vert \pi _x \,y \,\psi \big \Vert ^2 + 2\,\big \Vert \pi _{\tilde{x}} \,y\, \psi \big \Vert ^2\;, \end{aligned}$$

we obtain

$$\begin{aligned} \Vert \pi _J \,y\, \pi _J \Vert ^2&\le 2\,C(n) \,\left( \big \Vert \pi _x \,y\, \psi \big \Vert ^2 + \big \Vert \pi _{\tilde{x}} \,y\, \psi \big \Vert ^2 \right)^2 \,\big \Vert Y^{-1} \big \Vert ^2 \\&\le 4\, C(n) \,\big \Vert Y^{-1} \big \Vert ^2 \,\left( \big \Vert \pi _x \,y\, \psi \big \Vert ^4 + \big \Vert \pi _{\tilde{x}} \,y\, \psi \big \Vert ^4 \right) \\&= 4\, C(n) \,\big \Vert Y^{-1} \big \Vert ^2 \, \left( \big \Vert P(x,y) \, \psi \big \Vert ^4 + \big \Vert P(\tilde{x},y) \,\psi \big \Vert ^4 \right) . \end{aligned}$$

Using this estimate when integrating (5.10) over M and noting that \(\phi _{x}^{-1}\) is locally Lipschitz (since it is Fréchet-smooth) yields the claim. \(\square \)
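
Two elementary facts enter the above estimates without comment, and the following sketch spells them out. It assumes, as is standard for causal fermion systems but not restated in this section, that y is symmetric, so that \(\ker y = S_y^\perp \). Decomposing \(u \in {\mathcal{H}}\) as \(u = u_\parallel + u_\perp \) with \(u_\parallel \in S_y\) and \(u_\perp \in S_y^\perp \) (a notation introduced here only for illustration), one finds

$$\begin{aligned} y\,u = Y u_\parallel \;, \qquad Y^{-1}\, y\, u = u_\parallel \;, \qquad y\, Y^{-1}\, y\, u = y\, u_\parallel = y\, u \;, \end{aligned}$$

which gives the identity \(y = y\,Y^{-1}\,y\) used in the first step of the estimate of \(\Vert \pi _J \,y\, \pi _J \Vert ^2\). The numerical factors come from the inequalities, valid for all \(a, b \ge 0\),

$$\begin{aligned} (a+b)^2&= a^2 + 2ab + b^2 \le 2\,a^2 + 2\,b^2 \;, \\ \left( a^2+b^2 \right)^2&= a^4 + 2\,a^2 b^2 + b^4 \le 2\,a^4 + 2\,b^4 \;, \end{aligned}$$

where the first line uses \(2ab \le a^2+b^2\) and the second uses \(2a^2b^2 \le a^4+b^4\). Applied with \(a = \Vert \pi _x \,y\, \psi \Vert \) and \(b = \Vert \pi _{\tilde{x}} \,y\, \psi \Vert \), the second inequality accounts for the step from \(2\,C(n)\) to \(4\,C(n)\).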