1 Introduction

The Benjamin–Ono (BO) equation on the line reads as

$$\begin{aligned} \partial _t u = \mathrm {H} \partial _x^2 u - \partial _x (u^2), \qquad (t,x) \in {\mathbb {R}}\times {\mathbb {R}}, \end{aligned}$$
(1.1)

where u is real-valued and \( \mathrm {H} =-i\mathrm {sign}(\mathrm {D}): L^2({\mathbb {R}}) \rightarrow L^2({\mathbb {R}})\) denotes the Hilbert transform, \(\mathrm {D}=-i \partial _x\),

$$\begin{aligned} \widehat{\mathrm {H}f}(\xi ) = -i \mathrm {sign}(\xi ){\hat{f}}(\xi ) , \qquad \forall f \in L^2({\mathbb {R}}). \end{aligned}$$
(1.2)

Here \(\mathrm {sign}(\pm \xi )=\pm 1\) for all \(\xi > 0\), \(\mathrm {sign}(0)=0\), and \({\hat{f}} \in L^2({\mathbb {R}})\) denotes the Fourier–Plancherel transform of \(f \in L^2({\mathbb {R}})\). We adopt the convention \(L^p({\mathbb {R}})=L^p({\mathbb {R}}, {\mathbb {C}})\). Its \({\mathbb {R}}\)-subspace consisting of all real-valued \(L^p\)-functions is denoted by \(L^p({\mathbb {R}}, {\mathbb {R}})\) throughout this paper. Equipped with the inner product \((f, g) \in L^2({\mathbb {R}}) \times L^2({\mathbb {R}}) \mapsto \langle f, g \rangle _{L^2} = \int _{{\mathbb {R}}}f(x) \overline{g(x)}\mathrm {d}x \in {\mathbb {C}}\), \(L^2({\mathbb {R}})\) is a \({\mathbb {C}}\)-Hilbert space. Derived by Benjamin [3] and Ono [17], the BO equation (1.1) describes the evolution of weakly nonlinear internal long waves in a two-layer fluid. Equation (1.1) is globally well-posed in every Sobolev space \(H^s({\mathbb {R}}, {\mathbb {R}})\); see, e.g., Tao [23] for \(s\ge 1\) and Ionescu–Kenig [10] for \(s\ge 0\). On appropriate Sobolev spaces, equation (1.1) can be written in Hamiltonian form

$$\begin{aligned} \partial _t u = \partial _x \nabla _u E(u), \qquad E(u)= \frac{1}{2} \langle |\mathrm {D}|u, u \rangle _{H^{-\frac{1}{2}}, H^{\frac{1}{2}}} - \frac{1}{3}\int _{{\mathbb {R}}} u^3, \end{aligned}$$
(1.3)

where \(\nabla _u E(u)\) denotes the \(L^2({\mathbb {R}})\)-gradient of E, \(\partial _x\) is the Gardner–Faddeev–Zakharov Poisson structure and \(X_E(u)= \partial _x\nabla _u E(u)\) is the Hamiltonian vector field of E with respect to the Poisson structure \(\partial _x\). Since \(\partial _x = - \partial _x^*\) is an unbounded operator on \(L^2({\mathbb {R}}, {\mathbb {R}})\) with domain \(H^1({\mathbb {R}}, {\mathbb {R}})\) and range given by

$$\begin{aligned} \mathcal {W}:=\partial _x \left( H^1({\mathbb {R}}, {\mathbb {R}}) \right) = \{u \in L^2({\mathbb {R}}, {\mathbb {R}}): \int _{{\mathbb {R}}}\tfrac{|{\hat{u}}(\xi )|^2}{|\xi |^2}\mathrm {d}\xi <+\infty \}, \end{aligned}$$
(1.4)

its inverse \(\partial _x^{-1}: \mathcal {W} \rightarrow H^1({\mathbb {R}}, {\mathbb {R}})\) is a symplectic structure on \(\mathcal {W}\). A 2-covector \(\varvec{\omega } \in \varvec{\Lambda }^2(\mathcal {W}^*)\) is defined by \(\varvec{\omega }(h_1, h_2)= \langle h_1, \partial _x^{-1} h_2\rangle _{L^2}\), \(\forall h_1, h_2 \in \mathcal {W}\). Under appropriate conditions on the functionals F and G, \(\varvec{\omega }\) is the symplectic form corresponding to the Gardner bracket, which is defined by

$$\begin{aligned} \{F, G\}(u):=\langle \partial _x \nabla _u F(u), \nabla _u G(u) \rangle _{L^2}. \end{aligned}$$
(1.5)
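As an illustration of (1.5), the following formal computation shows that the \(L^2\)-norm Poisson-commutes with the energy E of (1.3). It is only a sketch for smooth, rapidly decaying real-valued u; the functional \(P(u):=\frac{1}{2}\int _{{\mathbb {R}}}u^2\) is a name introduced here for the example, and the Plancherel normalization \(\langle f, g\rangle _{L^2}=\frac{1}{2\pi }\int _{{\mathbb {R}}}{\hat{f}}\overline{{\hat{g}}}\) is assumed. Since \(\nabla _u P(u)=u\), \(\nabla _u E(u)=|\mathrm {D}|u-u^2\) and \(\partial _x |\mathrm {D}| = \mathrm {H}\partial _x^2\) has Fourier symbol \(i\xi |\xi |\),

$$\begin{aligned} \{E, P\}(u) = \langle \mathrm {H}\partial _x^2 u, u\rangle _{L^2} - \int _{{\mathbb {R}}} \partial _x(u^2)\, u \,\mathrm {d}x = \frac{i}{2\pi }\int _{{\mathbb {R}}} \xi |\xi |\, |{\hat{u}}(\xi )|^2\mathrm {d}\xi - \frac{2}{3}\int _{{\mathbb {R}}}\partial _x(u^3)\,\mathrm {d}x = 0. \end{aligned}$$

The first integral vanishes because \(|{\hat{u}}|^2\) is even for real-valued u while \(\xi |\xi |\) is odd, and the second is the integral of an exact derivative. This is consistent with the conservation of the \(L^2\)-norm along the flow of (1.1).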

The goal of this paper is to show the complete integrability of equation (1.1) when restricted to every multi-soliton manifold. Recall the scaling and translation invariances of equation (1.1): if \(u=u(t,x)\) is a solution, so is the function \(u_{c, y} : (t,x) \mapsto c u(c^2 t, c(x-y))\). A smooth solution \(u=u(t,x)\) is called a solitary wave of (1.1) if there exists \(\mathcal {R} \in C^{\infty }({\mathbb {R}})\) solving the nonlocal elliptic equation

$$\begin{aligned} \mathrm {H}\mathcal {R}' + \mathcal {R} - \mathcal {R}^2=0, \qquad \mathcal {R}(x) >0 \end{aligned}$$
(1.6)

such that \(u(t,x)=\mathcal {R}_c(x-y-ct)\), where \(\mathcal {R}_c(x)=c \mathcal {R}(cx)\), for some \(c>0\) and \(y \in {\mathbb {R}}\). Amick and Toland [2] showed that the unique (up to translation) solution of (1.6) is given by

$$\begin{aligned} \mathcal {R}(x)= \tfrac{2}{1+x^2}, \qquad \forall x \in {\mathbb {R}}. \end{aligned}$$
(1.7)
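That (1.7) solves (1.6) can be checked by hand. The sketch below assumes only the standard consequence of (1.2) that \(\mathrm {H}\) acts as multiplication by \(-i\) on \(L^2\)-functions whose Fourier transform is supported in \([0,+\infty )\) (such as \(\frac{1}{x+i}\), which extends holomorphically to the upper half-plane) and by \(+i\) on their complex conjugates.

$$\begin{aligned} \mathcal {R}(x)&= \frac{i}{x+i} - \frac{i}{x-i}, \qquad \mathrm {H}\mathcal {R}(x) = \frac{1}{x+i}+\frac{1}{x-i} = \frac{2x}{1+x^2}, \\ \mathrm {H}\mathcal {R}'(x)&= \left( \mathrm {H}\mathcal {R}\right) '(x) = \frac{2-2x^2}{(1+x^2)^2}, \qquad \mathcal {R}(x) - \mathcal {R}(x)^2 = \frac{2(1+x^2)-4}{(1+x^2)^2} = \frac{2x^2-2}{(1+x^2)^2}. \end{aligned}$$

Adding the last two expressions gives \(\mathrm {H}\mathcal {R}' + \mathcal {R} - \mathcal {R}^2=0\), and \(\mathcal {R}>0\) is clear. By scaling, \(\mathcal {R}_c\) then satisfies \(\mathrm {H}\mathcal {R}_c' + c\mathcal {R}_c - \mathcal {R}_c^2=0\), which is what makes \(\mathcal {R}_c(x-y-ct)\) a solution of (1.1).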

Definition 1.1

For any positive integer \(N \in {\mathbb {N}}_+ :={\mathbb {Z}}\bigcap (0,+\infty )\), the set \(\mathcal {U}_N\) is defined as follows,

$$\begin{aligned}&\mathcal {U}_N:=\{u \in L^2({\mathbb {R}}, {\mathbb {R}}) : u(x)=\sum _{j=1}^N c_j\mathcal {R}(c_j(x-x_j)), c_j >0, \quad x_j \in {\mathbb {R}},\nonumber \\&\quad \forall 1\le j\le N\}. \end{aligned}$$
(1.8)

A function \(u \in \mathcal {U}_N\) is called an N-soliton of the BO equation (1.1). The set of translation–scaling parameters of u is given by \(\mathbf {P}(u):=\{x_1-c_1^{-1}i, x_2-c_2^{-1}i, \ldots , x_N-c_N^{-1}i\}\) and \(\mathbf {m}(z)\) denotes the multiplicity of \(z \in \mathbf {P}(u)\) in the expression of u in (1.8). As a consequence, \(u(x)=\sum _{z \in \mathbf {P}(u)} \frac{-2 \mathbf {m}(z)\mathrm {Im}z}{(x-\mathrm {Re}z)^2 +(\mathrm {Im}z)^2}\) and a polynomial characterization of each N-soliton is given as follows,

$$\begin{aligned} u(x)= \sum _{z \in \mathbf {P}(u)}\mathrm {Im} \tfrac{2 \mathbf {m}(z)}{z-x} = -2 \mathrm {Im} \tfrac{Q_u'(x)}{Q_u(x)}, \quad Q_u (X) := \prod _{z \in \mathbf {P}(u)} (X- z)^{\mathbf {m}(z)}, \end{aligned}$$
(1.9)

where \(Q_u \in {\mathbb {C}}[X]\) is called the characteristic polynomial of u, for every \(u \in \mathcal {U}_N\).
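For instance, when \(N=1\) the identity (1.9) reduces to a one-line computation. Here \(u(x)=c_1\mathcal {R}(c_1(x-x_1))\), \(\mathbf {P}(u)=\{z\}\) with \(z=x_1-c_1^{-1}i\), \(\mathbf {m}(z)=1\) and \(Q_u(X)=X-z\), so that

$$\begin{aligned} -2\mathrm {Im}\frac{Q_u'(x)}{Q_u(x)} = -2\mathrm {Im}\frac{1}{x-x_1+c_1^{-1}i} = -2\mathrm {Im}\frac{x-x_1-c_1^{-1}i}{(x-x_1)^2+c_1^{-2}} = \frac{2c_1}{1+c_1^2(x-x_1)^2} = c_1\mathcal {R}(c_1(x-x_1)). \end{aligned}$$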

The set \(\mathcal {U}_N\) is in one-to-one correspondence with the set \(\mathcal {V}_N\) that consists of all polynomials of degree N with leading coefficient 1, whose roots are contained in the lower half plane \({\mathbb {C}}_- = \{z \in {\mathbb {C}} : \mathrm {Im}z <0\}\). Moreover, the bijection \(u \in \mathcal {U}_N \mapsto Q_u \in \mathcal {V}_N\) provides the real analytic structure on \(\mathcal {U}_N\).

Proposition 1.2

Equipped with the subspace topology of the \({\mathbb {R}}\)-Hilbert space \(L^2({\mathbb {R}}, {\mathbb {R}})\), the subset \(\mathcal {U}_N\) is a simply connected, real analytic, embedded submanifold of \(L^2({\mathbb {R}}, {\mathbb {R}})\) and \(\dim _{{\mathbb {R}}}\mathcal {U}_N =2N\). For every \(u\in \mathcal {U}_N\), the tangent space to \(\mathcal {U}_N\) at u is given by \(\mathcal {T}_u (\mathcal {U}_N) = \bigoplus _{z \in \mathbf {P}(u)} ({\mathbb {R}}^{\mathbf {m}(z)}( \mathrm {Re}\varvec{\phi }_{z})\bigoplus {\mathbb {R}}^{\mathbf {m}(z)}( \mathrm {Im}\varvec{\phi }_{z}))\), where \(\varvec{\phi }_{z}(x):=( x-z)^{-2}\), \(\forall z \in \mathbf {P}(u)\subset {\mathbb {C}}_-\).

Given \(u\in \mathcal {U}_N\), we have \(\int _{{\mathbb {R}}}u=2\pi N\), so the tangent space \(\mathcal {T}_u(\mathcal {U}_N)\) is included in an auxiliary space

$$\begin{aligned} \mathcal {T}:=\{h \in L^2({\mathbb {R}}, (1+x^2)\mathrm {d}x): h({\mathbb {R}})\subset {\mathbb {R}}, \quad {\hat{h}}(0)=0\}. \end{aligned}$$
(1.10)

Hardy’s inequality yields that \(\mathcal {T}\) is contained in the auxiliary space \(\mathcal {W}\) given by (1.4). Hence \(\mathcal {W}\bigcap L^2(x^2 \mathrm {d}x) = \mathcal {T}\). We define a real analytic 2-form \(\omega : u \in \mathcal {U}_N \mapsto \varvec{\omega }\in \varvec{\Lambda }^2(\mathcal {W}^*)\), i.e.

$$\begin{aligned}&\omega _u (h_1, h_2) =\frac{i}{2\pi } \int _{{\mathbb {R}}} \frac{{\hat{h}}_1(\xi ) \overline{{\hat{h}}_2(\xi )}}{\xi }\mathrm {d}\xi = - \mathrm {Im} \int _0^{+\infty } \frac{{\hat{h}}_1(\xi )\overline{{\hat{h}}_2(\xi )}}{\pi \xi }\mathrm {d}\xi , \nonumber \\&\qquad \quad \forall h_1, h_2 \in \mathcal {T}_u(\mathcal {U}_N). \end{aligned}$$
(1.11)

Then we show that \(\omega \) establishes the symplectic structure, which corresponds to the Gardner bracket (1.5), on the N-soliton manifold \(\mathcal {U}_N\) defined by (1.8).

Proposition 1.3

Endowed with \(\omega \) in (1.11), the real analytic manifold \((\mathcal {U}_N, \omega )\) is a symplectic manifold. For any smooth function \(f : \mathcal {U}_N \rightarrow {\mathbb {R}}\), let \(X_f \in \mathfrak {X}(\mathcal {U}_N)\) denote its Hamiltonian vector field. Then

$$\begin{aligned} X_{f}(u) = \partial _x \nabla _u f(u) \in \mathcal {T}_u (\mathcal {U}_N), \qquad \forall u \in \mathcal {U}_N. \end{aligned}$$
(1.12)

The Gardner bracket in (1.5) coincides with the Poisson bracket associated to the symplectic form \(\omega \), i.e. for another smooth function \(g: \mathcal {U}_N \rightarrow {\mathbb {R}}\), we have \( \omega _u(X_f(u), X_g(u))=\{f,g\}(u)\), \(\forall u \in \mathcal {U}_N\).

The following result indicates the global well-posedness of the BO equation (1.1) on the manifold \(\mathcal {U}_N\).

Proposition 1.4

For every \(N \in {\mathbb {N}}_+\), the manifold \(\mathcal {U}_N\) is invariant under the BO flow.

Remark 1.5

Since \(\mathcal {U}_N \subset H^{\infty }({\mathbb {R}}, {\mathbb {R}}) \bigcap L^2({\mathbb {R}}, x^2 \mathrm {d}x)\) with \(H^{\infty }({\mathbb {R}}, {\mathbb {R}}) := \bigcap _{s\ge 0} H^s({\mathbb {R}}, {\mathbb {R}})\), the energy functional E in (1.3) is well defined on \(\mathcal {U}_N\). So equations (1.1) and (1.3) are equivalent on \(\mathcal {U}_N\).

Inspired by the construction of Birkhoff coordinates of the space-periodic BO equation in Gérard–Kappeler [8], we want to establish the generalized action–angle coordinates of (1.1) on \(\mathcal {U}_N\). Let

$$\begin{aligned} \Omega _N:=\{(r_1, r_2, \ldots , r_N) \in {\mathbb {R}}^N : r_1< r_2< \cdots< r_N <0 \} \end{aligned}$$
(1.13)

denote the subset of actions. For any \(j,k =1,2, \ldots , N\), the Kronecker symbol is denoted by \(\delta _{kj}\), i.e. \(\delta _{kj}=1\) if \(j=k\); \(\delta _{kj}=0\), if \(j\ne k\). The main result of this paper is stated as follows.

Theorem 1

There exists a real analytic diffeomorphism

$$\begin{aligned} \Phi _N : u \in \mathcal {U}_N \mapsto \left( I_1(u), I_2(u),\ldots , I_N(u); \gamma _1(u), \gamma _2(u),\ldots , \gamma _N(u)\right) \in \Omega _N \times {\mathbb {R}}^N\nonumber \\ \end{aligned}$$
(1.14)

such that the following statements hold:

   (i) The Poisson brackets (1.5) between the coordinate functions are well defined and

$$\begin{aligned} \{I_j, I_k\} = 0 , \quad \{I_j, \gamma _k\} = \delta _{kj}, \quad \{\gamma _j, \gamma _k\} =0 \quad \mathrm {on} \quad \mathcal {U}_N, \quad \forall j, k =1,2, \ldots , N.\nonumber \\ \end{aligned}$$
(1.15)

   (ii) The energy functional E defined in (1.3), when expressed in the coordinate functions, is given by

$$\begin{aligned} E(u)= -\frac{1}{2\pi } \sum _{j=1}^N I_j(u)^2, \qquad \forall u \in \mathcal {U}_N. \end{aligned}$$

The coordinates \(\{I_j\}_{1\le j \le N}\) are referred to as actions and \(\{\gamma _j\}_{1\le j \le N}\) as (generalized) angles.

Corollary 1.6

When expressed in the generalized action–angle coordinates \(I_j\), \(\gamma _j\), \(1\le j\le N\), the restriction of the BO equation (1.1) to \(\mathcal {U}_N\) reads as

$$\begin{aligned} \partial _t \left( I_j\circ u\right) (t)=\{E, I_j\}(u(t)) =0, \quad \partial _t \left( \gamma _j\circ u\right) (t) = \{E, \gamma _j\}(u(t))= \mathbf {k}_j(u(t)), \quad \forall t\in {\mathbb {R}}, \end{aligned}$$
(1.16)

where \(\mathbf {k}_j := - \tfrac{I_j}{\pi }\) is referred to as the j-th frequency and \(u : t \in {\mathbb {R}} \mapsto u(t) \in \mathcal {U}_N\) solves equation (1.1). As a consequence, \(I_j \circ u(t)= I_j \circ u(0)\) and \(\gamma _j \circ u(t)= \gamma _j \circ u(0) + (\mathbf {k}_j\circ u (0))t\). For any \(\mathbf {r} \in \Omega _N\), \(\Phi _N^{-1}(\{\mathbf {r}\}\times {\mathbb {R}}^N)\) is a Lagrangian submanifold that is invariant under the flow of (1.1).
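The two brackets in (1.16) follow formally from Theorem 1. The sketch below assumes that the bracket obeys the chain rule in the coordinates \((I_k, \gamma _k)_{1\le k\le N}\), as a Poisson bracket on a finite-dimensional symplectic manifold does. Using \(\frac{\partial E}{\partial I_k} = -\frac{I_k}{\pi }\) and \(\frac{\partial E}{\partial \gamma _k}=0\) from statement (ii), together with (1.15),

$$\begin{aligned} \{E, \gamma _j\} = \sum _{k=1}^N \frac{\partial E}{\partial I_k}\{I_k, \gamma _j\} = -\frac{I_j}{\pi } = \mathbf {k}_j, \qquad \{E, I_j\} = \sum _{k=1}^N \frac{\partial E}{\partial I_k}\{I_k, I_j\} = 0. \end{aligned}$$

Since \(I_j<0\) on \(\mathcal {U}_N\), each frequency \(\mathbf {k}_j\) is positive.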

Remark 1.7

For any \(j =1,2, \ldots , N\), the frequency \(\mathbf {k}_j : \mathcal {U}_N \rightarrow (0,+\infty )\) is a linear function of the action \(I_j\). Hence the motions of the angles are completely decoupled.

Remark 1.8

The image of actions \(\Omega _N\) is a noncompact convex polytope. As a consequence, the manifold \(\mathcal {U}_N\) can be interpreted as the universal covering of the manifold of N-gap potentials \(U_N^{\mathbb {T}}\) for the Benjamin–Ono equation on the torus \(\mathbb {T}:={\mathbb {R}} /2\pi {\mathbb {Z}}\), which is introduced in theorem 7.1 of Gérard–Kappeler [8],

$$\begin{aligned} U_N^{\mathbb {T}} := \{v = 2\mathrm {Re}h \in L^2(\mathbb {T}, {\mathbb {R}}) : \quad h(y)= -e^{iy} \tfrac{{\mathfrak {Q}}'(e^{iy})}{{\mathfrak {Q}} (e^{iy})}, \quad {\mathfrak {Q}} \in {\mathbb {C}}_N^+[X]\}, \end{aligned}$$
(1.17)

where \({\mathbb {C}}_N^+[X]\) consists of all polynomials \({\mathfrak {Q}}\in {\mathbb {C}}[X]\) of degree N with leading coefficient 1, whose roots are contained in the annulus \({\mathscr {A}}:= \{z \in {\mathbb {C}} : |z|>1\}\). Since the fundamental group of \(U_N^{\mathbb {T}}\) is \(({\mathbb {Z}},+)\), the manifold \(U_N^{\mathbb {T}}\) is mapped real bi-analytically onto \( \mathcal {U}_N /{\mathbb {Z}}\). See Remark 1.13 for a comparison between Theorem 1 and Theorem 7.1 of [8].

A precise description of \(\Phi _N\) is given in Definition 5.1 and Theorem 5.2. In order to establish the link between the action–angle coordinates and the translation–scaling parameters of an N-soliton, we introduce the inverse spectral matrix associated to \(\Phi _N\), denoted by \(M : u \in \mathcal {U}_N \mapsto (M_{kj}(u) )_{1\le k, j\le N}\in {\mathbb {C}}^{N\times N}\), where

$$\begin{aligned}&M_{jj}(u):=\gamma _j(u) + \tfrac{\pi i}{I_j(u)}, \quad \forall 1\le j \le N; \quad M_{kj}(u):= \tfrac{2 \pi i}{I_k(u)- I_j(u)} \sqrt{\tfrac{ I_k(u) }{I_j(u)}}, \nonumber \\&\quad \forall 1\le j\ne k\le N. \end{aligned}$$
(1.18)

Proposition 1.9

Given \(u \in \mathcal {U}_N\), the polynomial \(Q_u\) in (1.9) is the characteristic polynomial of the inverse spectral matrix \(M(u) \in {\mathbb {C}}^{N\times N}\) defined by (1.18). As a consequence, an N-soliton is expressed by \(u(x)=\sum _{j=1}^N c_j \mathcal {R} (c_j(x- x_j))\) if and only if its translation–scaling parameters \(\{x_j-c_j^{-1}i\}_{1\le j\le N} \subset {\mathbb {C}}_-\) are eigenvalues with corresponding multiplicities of the matrix M(u), whose coefficients are expressed in terms of the action–angle coordinates \((I_j(u), \gamma _j(u))_{1\le j\le N} \in \Omega _N \times {\mathbb {R}}^N\).

Proposition 1.9 is restated with more details in Theorem 4.8, Proposition 5.4 and Corollary 5.5, which together give a spectral characterization of the N-soliton manifold \(\mathcal {U}_N\) and establish a spectral connection between the inverse spectral matrix \(M(u) \in {\mathbb {C}}^{N\times N}\) and the Lax operator \(L_u\), which is given in Definition 2.2, of the BO equation (1.1), for any \(u \in \mathcal {U}_N\). Then an explicit expression of solutions of equation (1.1) on the multi-soliton manifolds can be deduced by using Corollary 1.6 and Proposition 1.9.

Corollary 1.10

If \(u : t \in {\mathbb {R}} \mapsto u(t) \in \mathcal {U}_N \) solves equation (1.1) such that \(u(0)=u_0\), then for any \((t,x)\in {\mathbb {R}} \times {\mathbb {R}}\), we have

$$\begin{aligned} u(t,x)= u (t,x; u_0) = 2\mathrm {Im} \langle \left( M(u_0) - (x+ \tfrac{t}{\pi }\mathfrak {V}(u_0))\right) ^{-1} X(u_0), Y(u_0)\rangle _{{\mathbb {C}}^N},\qquad \end{aligned}$$
(1.19)

where the inner product of \({\mathbb {C}}^N\) is \(\langle X, Y\rangle _{{\mathbb {C}}^N} = X^T \overline{Y}\); and \(\forall u \in \mathcal {U}_N\), the matrix M(u) is given by (1.18), the matrix \(\mathfrak {V}(u) \in {\mathbb {C}}^{N\times N}\) and the vectors \(X(u), Y(u) \in {\mathbb {C}}^N\) are defined by

$$\begin{aligned} \begin{aligned} \sqrt{2\pi } X(u)^T&= (\sqrt{|I_1(u)|}, \sqrt{|I_2(u)|}, \ldots , \sqrt{|I_N(u)|}), \\ \qquad \sqrt{2\pi }^{-1} Y(u)^T&= (\sqrt{|I_1(u)|^{-1}}, \sqrt{|I_2(u)|^{-1}}, \ldots , \sqrt{|I_N(u)|^{-1}}), \end{aligned}\qquad \mathfrak {V}(u ) = \left( {\begin{matrix} I_1(u ) \\ &{} I_2(u ) \\ &{} &{} \ddots \\ &{} &{} &{} I_N(u ) \end{matrix}}\right) . \end{aligned}$$
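To see how (1.19) works in the simplest case, take \(N=1\). This is only an illustrative reduction, and the abbreviations \(c:=\mathbf {k}_1(u_0)=-\frac{I_1(u_0)}{\pi }>0\) and \(\gamma _1:=\gamma _1(u_0)\) are used for the example. Then \(M(u_0)=\gamma _1+\frac{\pi i}{I_1(u_0)} = \gamma _1 - c^{-1}i\), \(\frac{t}{\pi }\mathfrak {V}(u_0) = -ct\) and \(\langle X(u_0), Y(u_0)\rangle _{{\mathbb {C}}^1}=1\), so that

$$\begin{aligned} u(t,x) = 2\mathrm {Im}\frac{1}{\gamma _1+ct-x-c^{-1}i} = 2\mathrm {Im}\frac{\gamma _1+ct-x+c^{-1}i}{(x-\gamma _1-ct)^2+c^{-2}} = \frac{2c}{1+c^2(x-\gamma _1-ct)^2} = \mathcal {R}_c(x-\gamma _1-ct). \end{aligned}$$

This is the solitary wave of speed c with initial position \(\gamma _1\), in agreement with (1.6) and (1.7).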

One application of the explicit formula (1.19) is to describe the asymptotic behavior of the multi-soliton solutions of the BO equation (1.1).

Corollary 1.11

Given \(u_0 \in \mathcal {U}_N\), we set \(u_{\infty }(t,x)=u_{\infty }(t,x; u_0):=\sum _{j=1}^N \mathcal {R}_{\mathbf {k}_j(u_0)}(x- \gamma _j(u_0) - \mathbf {k}_j(u_0)t)\), where \(\mathcal {R}_c(x)=\frac{2c}{1+c^2 x^2}\) and \(\mathbf {k}_j = - \frac{I_j}{\pi }\). If \(u : t \in {\mathbb {R}} \mapsto u(t) \in \mathcal {U}_N \) solves (1.1) with \(u(0)=u_0\), then

   (i) for any \(R>0\), we have \(\lim _{t \rightarrow \pm \infty }\Vert u(t)-u_{\infty }(t)\Vert _{L^2(-R,R)}=0\);

   (ii) for any \(x \in {\mathbb {R}}\), we have \(\lim _{t \rightarrow \pm \infty }\frac{u(t,x)}{u_{\infty }(t,x)}=1\).

When \(t \rightarrow \pm \infty \), the N-soliton solutions of equation (1.1) can be approximated asymptotically by the superposition of N solitons, where the j-th soliton starts from the point \(\gamma _j(u_0)\) and moves with constant velocity \(\mathbf {k}_j(u_0)\) and constant scaling parameter \(\mathbf {k}_j(u_0)\). We refer to Matsuno [14] and the references therein for another expression of multi-soliton solutions, the soliton interactions, the nonlinear superposition principle and other asymptotic behaviors of solutions of equation (1.1), which are studied by using Hirota’s bilinear transformation, the pole expansion and the Bäcklund transformation. However, it still remains to solve an algebraic equation (see for instance Proposition 1.9 or formula (3.266) in section \(\mathbf {3.3}\) of Matsuno [14]) by radicals in order to express the velocity and scaling parameter \(\mathbf {k}_j(u_0)\) and the starting point \(\gamma _j(u_0)\) of the asymptotic approximation \(u_{\infty }(u_0)\) in terms of the translation–scaling parameters, with corresponding multiplicities, of the initial datum \(u_0 \in \mathcal {U}_N\). Compared with Matsuno [14], we give a precise and explicit expression of the velocity and scaling parameter \(\mathbf {k}_j(u_0) =-\frac{I_j(u_0)}{\pi }\) of \(u_{\infty }(u_0)\), thanks to the min-max formula (4.8) and Definition 5.1.

Remark 1.12

When \(N=1\), formula (1.19) has been established in Benjamin [3], Ono [17] and Amick–Toland [2]. Moreover, if \(u : t\in {\mathbb {R}} \mapsto u(t) \in \mathcal {U}_1\) solves the BO equation (1.1) with \(u(0,x)=\frac{2c_1}{c_1^2(x-x_1)^2 +1}\) for some \(x_1 \in {\mathbb {R}}\) and \(c_1>0\), then \(u_{\infty }(t,x)= u(t,x) =\tfrac{2c_1}{c_1^2(x-(x_1+c_1 t))^2 +1}\), \(\forall (t, x) \in {\mathbb {R}}^2\).
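The case \(N=1\) also gives a consistency check of the energy formula in Theorem 1 (ii). The computation below is a sketch: the identification \(I_1 = -\pi c_1\) is read off from \(\mathbf {k}_1=-\frac{I_1}{\pi }\) and the speed \(c_1\) appearing in Remark 1.12, and the values \(\int _{{\mathbb {R}}}\mathcal {R}^2 = 2\pi \), \(\int _{{\mathbb {R}}}\mathcal {R}^3 = 3\pi \) are the standard integrals of \((1+x^2)^{-2}\) and \((1+x^2)^{-3}\). Since \(|\mathrm {D}|\mathcal {R} = \mathrm {H}\mathcal {R}' = \mathcal {R}^2-\mathcal {R}\) by (1.6),

$$\begin{aligned} E(\mathcal {R})&= \frac{1}{2}\left( \int _{{\mathbb {R}}}\mathcal {R}^3 - \int _{{\mathbb {R}}}\mathcal {R}^2\right) - \frac{1}{3}\int _{{\mathbb {R}}}\mathcal {R}^3 = \frac{\pi }{2} - \pi = -\frac{\pi }{2}, \\ E(\mathcal {R}_{c_1}(\cdot -x_1))&= c_1^2 E(\mathcal {R}) = -\frac{\pi c_1^2}{2} = -\frac{1}{2\pi }(-\pi c_1)^2 = -\frac{1}{2\pi } I_1^2. \end{aligned}$$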

1.1 Notation

Before outlining the construction of action–angle coordinates, we introduce some notation used in this paper. The indicator function of a subset \(A \subset X\) is denoted by \(\mathbf {1}_A\), i.e. \(\mathbf {1}_A(x) =1\) if \(x \in A\) and \(\mathbf {1}_A(x) =0\) if \(x \in X\backslash A\). Recall that \(\mathrm {H} : L^2({\mathbb {R}}) \rightarrow L^2({\mathbb {R}})\) denotes the Hilbert transform given by (1.2). Set \(\mathrm {Id}_{L^2( {\mathbb {R}})} (f)=f\), for every \(f \in L^2({\mathbb {R}})\). Let \(\Pi : L^2({\mathbb {R}}) \rightarrow L^2({\mathbb {R}})\) denote the Szegő projector, defined by

$$\begin{aligned} \Pi := \tfrac{\mathrm {Id}_{L^2( {\mathbb {R}})}+i\mathrm {H}}{2} \Leftrightarrow \widehat{\Pi f}(\xi )= \mathbf {1}_{[0,+\infty )}(\xi ) {\hat{f}}(\xi ), \quad \forall \xi \in {\mathbb {R}}, \quad \forall f \in L^2({\mathbb {R}}). \end{aligned}$$
(1.20)

If \(\mathfrak {O}\) is an open subset of \({\mathbb {C}}\), we denote by \(\mathrm {Hol}(\mathfrak {O})\) the set of all holomorphic functions on \(\mathfrak {O}\). Let the upper half-plane and the lower half-plane be denoted by \({\mathbb {C}}_+ = \{z \in {\mathbb {C}} : \mathrm {Im}z >0\}\) and \({\mathbb {C}}_- = \{z \in {\mathbb {C}} : \mathrm {Im}z <0\}\) respectively. For every \(p \in (0, +\infty ]\), we denote by \(L^p_+\) the Hardy space on \({\mathbb {C}}_+\), which is defined by \(L^p_+ =L^p_+ ({\mathbb {R}}): = \{g \in \mathrm {Hol}({\mathbb {C}}_+) : \Vert g\Vert _{L^p_+}<+\infty \}\), where

$$\begin{aligned} \Vert g\Vert _{L^p_+} = \sup _{y >0} \left( \int _{{\mathbb {R}}}|g(x+iy)|^p\mathrm {d}x\right) ^{\frac{1}{p}}, \qquad \mathrm {if} \quad p \in (0, +\infty ), \end{aligned}$$
(1.21)

and \(\Vert g\Vert _{L^{\infty }_+}=\sup _{z \in {\mathbb {C}}_+}|g(z)|\). A function \(g \in L^{\infty }_+\) is called an inner function if \(|g |=1\) on \({\mathbb {R}}\). When \(p=2\), the Paley–Wiener theorem yields the identification between \(L^2_+\) and \(\Pi [L^2({\mathbb {R}})]\):

$$\begin{aligned} L^2_+ = \mathcal {F}^{-1}[ L^2(0, +\infty )] = \{f \in L^2({\mathbb {R}}) : \mathrm {supp} {\hat{f}} \subset [0, +\infty )\} = \Pi (L^2({\mathbb {R}})), \end{aligned}$$

where \(\mathcal {F} : f\in L^2({\mathbb {R}}) \mapsto {\hat{f}}\in L^2({\mathbb {R}})\) denotes the Fourier–Plancherel transform. Similarly, we set \(L^2_-= (\mathrm {Id}_{L^2( {\mathbb {R}})} - \Pi )( L^2({\mathbb {R}}))\). Let the filtered Sobolev spaces be denoted as \(H^s_+:=L^2_+ \bigcap H^s({\mathbb {R}})\) and \(H^s_-:=L^2_- \bigcap H^s({\mathbb {R}})\), for every \(s\ge 0\). We set \(H^{\infty }({\mathbb {R}}, {\mathbb {R}}):= \bigcap _{s \ge 0} H^s ({\mathbb {R}}, {\mathbb {R}})\).
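As an example of (1.20) and (1.21), one can project the profile \(\mathcal {R}(x)=\frac{2}{1+x^2}\) of (1.7). The sketch below uses the standard Hilbert transform pair \(\mathrm {H}\mathcal {R}(x)=\frac{2x}{1+x^2}\), which is consistent with the convention (1.2); the letter g is introduced only for this example.

$$\begin{aligned} \Pi \mathcal {R}(x)&= \frac{\mathcal {R}(x)+i\mathrm {H}\mathcal {R}(x)}{2} = \frac{1+ix}{1+x^2} = \frac{1}{1-ix} = \frac{i}{x+i} =: g(x), \\ \Vert g\Vert _{L^2_+}^2&= \sup _{y>0}\int _{{\mathbb {R}}}\frac{\mathrm {d}x}{x^2+(1+y)^2} = \sup _{y>0}\frac{\pi }{1+y} = \pi = \frac{1}{2}\Vert \mathcal {R}\Vert _{L^2}^2. \end{aligned}$$

Here g extends to \(g(z)=\frac{i}{z+i} \in \mathrm {Hol}({\mathbb {C}}_+)\), and the last equality reflects the fact that \(\Vert \Pi u\Vert _{L^2}^2 = \frac{1}{2}\Vert u\Vert _{L^2}^2\) for real-valued u.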

The domain of definition of an unbounded operator \(\mathcal {A}\) on some Hilbert space \(\mathcal {E}\) is denoted by \(\mathbf {D}(\mathcal {A}) \subset \mathcal {E}\). Given another operator \(\mathcal {B}\) on \(\mathbf {D}(\mathcal {B}) \subset \mathcal {E}\) such that \(\mathcal {A}(\mathbf {D}(\mathcal {A}))\subset \mathbf {D}(\mathcal {B})\) and \(\mathcal {B}(\mathbf {D}(\mathcal {B})) \subset \mathbf {D}(\mathcal {A})\), their Lie bracket is an operator defined on \(\mathbf {D}(\mathcal {A}) \bigcap \mathbf {D}(\mathcal {B}) \subset \mathcal {E}\), which is given by \([\mathcal {A}, \mathcal {B}]:=\mathcal {A} \mathcal {B} - \mathcal {B}\mathcal {A}\). If the operator \(\mathcal {A}\) is self-adjoint, we let \(\sigma (\mathcal {A})\) denote its spectrum, \(\sigma _{\mathrm {pp}}(\mathcal {A})\) the set of its eigenvalues and \(\sigma _{\mathrm {cont}}(\mathcal {A})\) its continuous spectrum. Then \(\sigma _{\mathrm {cont}}(\mathcal {A}) \bigcup \overline{\sigma _{\mathrm {pp}}(\mathcal {A})}= \sigma (\mathcal {A}) \subset {\mathbb {R}}\). Given two \({\mathbb {C}}\)-Hilbert spaces \(\mathcal {E}_1\) and \(\mathcal {E}_2\), let \(\mathfrak {B}(\mathcal {E}_1, \mathcal {E}_2)\) denote the \({\mathbb {C}}\)-Banach space that consists of all bounded \({\mathbb {C}}\)-linear transformations \(\mathcal {E}_1 \rightarrow \mathcal {E}_2\), equipped with the uniform norm. We set \(\mathfrak {B}(\mathcal {E}_1):=\mathfrak {B}(\mathcal {E}_1, \mathcal {E}_1)\).

All manifolds introduced in this paper are smooth manifolds without boundary. Given a smooth manifold \(\mathbf {M}\) of real dimension N, let \(C^{\infty }(\mathbf {M})\) denote the set of all smooth functions \(f : \mathbf {M} \rightarrow {\mathbb {R}}\) and the set of all smooth vector fields is denoted by \(\mathfrak {X}(\mathbf {M})\). The tangent (resp. cotangent) space to \(\mathbf {M}\) at \(p \in \mathbf {M}\) is denoted by \(\mathcal {T}_p(\mathbf {M})\) (resp. \(\mathcal {T}_p^*(\mathbf {M})\)). Given \(k\in {\mathbb {N}}_+\), the \({\mathbb {R}}\)-vector space of smooth k-forms on \(\mathbf {M}\) is denoted by \(\varvec{\Omega }^k(\mathbf {M})\). Given an \({\mathbb {R}}\)-vector space \(\mathbb {V}\), we denote by \(\varvec{\Lambda }^k(\mathbb {V}^*)\) the vector space of all k-covectors on \(\mathbb {V}\). Given a smooth covariant tensor field \(\mathbf {A}\) on \(\mathbf {M}\) and \(X\in \mathfrak {X}(\mathbf {M})\), the Lie derivative of \(\mathbf {A}\) with respect to X is denoted by \(\mathscr {L}_X (\mathbf {A})\), which is also a smooth tensor field on \(\mathbf {M}\). If \(\mathbf {N}\) is another smooth manifold, \(\mathbf {F} : \mathbf {N} \rightarrow \mathbf {M}\) is a smooth map and \(\mathbf {A}\) is a smooth covariant k-tensor field on \(\mathbf {M}\), the pullback of \(\mathbf {A}\) by \(\mathbf {F}\), denoted by \(\mathbf {F}^*\mathbf {A}\), is a smooth k-tensor field on \(\mathbf {N}\) that is defined by \(\forall p \in \mathbf {N}\), \(\forall j =1,2, \ldots , k\),

$$\begin{aligned}&(\mathbf {F}^* \mathbf {A})_p (v_1,v_2, \ldots , v_k)= \mathbf {A}_{\mathbf {F}(p)} \left( \mathrm {d}\mathbf {F}(p)(v_1), \mathrm {d}\mathbf {F}(p)(v_2), \ldots , \mathrm {d}\mathbf {F}(p)(v_k) \right) , \nonumber \\&\quad \forall v_j \in \mathcal {T}_p(\mathbf {N}). \end{aligned}$$
(1.22)

Given a positive integer N, let \({\mathbb {C}}_{\le N-1}[X]\) denote the \({\mathbb {C}}\)-vector space of all polynomials with complex coefficients whose degree is no greater than \(N-1\) and \({\mathbb {C}}_N[X]= {\mathbb {C}}_{\le N}[X] \backslash {\mathbb {C}}_{\le N-1}[X]\) consists of all polynomials of degree exactly N. Given \(Q \in {\mathbb {C}}_N[X]\), we set \(\overline{Q}(X):=\sum _{j=0}^{N}\overline{a}_j X^j\), if \(Q(X)=\sum _{j=0}^{N}a_j X^j\). We set \({\mathbb {R}}_+=[0,+\infty )\), \( {\mathbb {R}}_+^*=(0,+\infty )\) and \({\mathbb {C}}^*= {\mathbb {C}}\backslash \{0\}\). Let \(D(z,r) \subset {\mathbb {C}}\) denote the open disc of radius \(r>0\), whose center is \(z \in {\mathbb {C}}\) and its boundary is denoted by \(\mathscr {C}(z, r)= \partial D(z, r)=\{\eta \in {\mathbb {C}} : |\eta - z| =r\}\).

1.2 Organization of the paper

The construction of action–angle coordinates for the BO equation (1.3) on \(\mathcal {U}_N\) mainly relies on the Lax pair formulation \(\partial _t L_u = [B_u, L_u]\), discovered by Nakamura [15] and Bock–Kruskal [4]. Section 2 is dedicated to the spectral analysis of the Lax operator \(L_u : h \in H^1_+ \mapsto -i \partial _x h - \Pi (uh) \in L^2_+\) given by Definition 2.2 for general symbol \(u \in L^2({\mathbb {R}}, {\mathbb {R}})\), where \(\Pi \) denotes the Szegő projector given in (1.20) and the Hardy space \(L^2_+\) is given in (1.21). Then \(L_u\) is an unbounded self-adjoint operator on \(L^2_+\) that is bounded from below, with essential spectrum \(\sigma _{\mathrm {ess}}(L_u)=[0, +\infty )\). In addition, if \(u \in L^2({\mathbb {R}}, x^2\mathrm {d}x)\bigcap L^2({\mathbb {R}}, {\mathbb {R}})\), every eigenvalue of \(L_u\) is negative and simple, thanks to the following identity,

$$\begin{aligned} \lambda |{\hat{\varphi }}(0)|^2 = -2\pi \Vert \varphi \Vert _{L^2}^2, \quad \mathrm {if} \quad \lambda \in {\mathbb {R}} \quad \mathrm {and} \quad \varphi \in \mathrm {Ker}(\lambda -L_u), \end{aligned}$$
(1.23)

which was first found by Wu [24] in the case \(\lambda <0\). Then we introduce a generating functional which encodes the entire BO hierarchy,

$$\begin{aligned} \mathcal {H}_{\lambda }(u)= \langle (L_u+ \lambda )^{-1} \Pi u , \Pi u\rangle _{L^2}, \qquad \mathrm {if} \quad \lambda \in {\mathbb {C}}\backslash \sigma (-L_u), \end{aligned}$$
(1.24)

in Definition 2.14. It provides a sequence of conservation laws controlling every Sobolev norm.
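For a concrete instance of these objects, take \(u=\mathcal {R}\) from (1.7) and \(\varphi (x):=\frac{1}{x+i} \in H^1_+\). The following sketch assumes the Fourier normalization \({\hat{f}}(\xi )=\int _{{\mathbb {R}}}f(x)e^{-ix\xi }\mathrm {d}x\), reads \({\hat{\varphi }}(0)\) as the limit \(\xi \rightarrow 0^+\), and uses that \(\Pi \) removes the partial fractions whose poles lie in \({\mathbb {C}}_+\); these conventions are chosen to be consistent with (1.2), (1.11) and (1.20).

$$\begin{aligned} \mathcal {R}\varphi&= \frac{2}{(x+i)^2(x-i)} = -\frac{1}{2}\cdot \frac{1}{x-i} + \frac{1}{2}\cdot \frac{1}{x+i} + \frac{i}{(x+i)^2}, \qquad \Pi (\mathcal {R}\varphi ) = \frac{1}{2}\cdot \frac{1}{x+i} + \frac{i}{(x+i)^2}, \\ L_{\mathcal {R}}\varphi&= -i\varphi ' - \Pi (\mathcal {R}\varphi ) = \frac{i}{(x+i)^2} - \frac{1}{2}\cdot \frac{1}{x+i} - \frac{i}{(x+i)^2} = -\frac{1}{2}\varphi , \\ {\hat{\varphi }}(\xi )&= -2\pi i e^{-\xi }\mathbf {1}_{(0,+\infty )}(\xi ), \qquad \Vert \varphi \Vert _{L^2}^2 = \pi , \qquad -\frac{1}{2}|{\hat{\varphi }}(0)|^2 = -2\pi ^2 = -2\pi \Vert \varphi \Vert _{L^2}^2. \end{aligned}$$

So \(\lambda =-\frac{1}{2}\) is an eigenvalue of \(L_{\mathcal {R}}\) and the identity (1.23) holds for it. Moreover, since \(\Pi \mathcal {R} = \frac{i}{x+i} = i\varphi \), the generating functional (1.24) evaluates to \(\mathcal {H}_{\lambda }(\mathcal {R}) = \frac{\Vert \varphi \Vert _{L^2}^2}{\lambda -\frac{1}{2}} = \frac{\pi }{\lambda -\frac{1}{2}}\) for \(\lambda \in {\mathbb {C}}\backslash \sigma (-L_{\mathcal {R}})\).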

In Sect. 3, we study the shift semigroup \((S(\eta )^*)_{\eta \ge 0}\) acting on the Hardy space \(L^2_+\), where \(S(\eta )f = e_{\eta } f\) and \(e_{\eta }(x)=e^{i\eta x}\). Then a weak version of the Lax Theorem 3.2, which is stated as Lemma 3.3, can be obtained by solving a linear differential equation with constant coefficients. Every N-dimensional subspace of \(L^2_+\) that is invariant under the infinitesimal generator \(G = i \frac{\mathrm {d}}{\mathrm {d}\eta }\big |_{\eta =0^+} S(\eta )^*\) is of the form \(\frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\), for some monic polynomial \(Q \in {\mathbb {C}}_N[X]\) whose roots are contained in the lower half-plane \({\mathbb {C}}_-\).

In Sect. 4, the real analytic structure and symplectic structure of the N-soliton subset \(\mathcal {U}_N\) are first established. Then we continue the spectral analysis of \(L_u\), \(\forall u \in \mathcal {U}_N\). The Lax operator \(L_u\) has N simple eigenvalues \(\lambda _1^u<\lambda _2^u< \cdots<\lambda _N^u <0\) and the Hardy space \(L^2_+\) splits as

$$\begin{aligned}&L^2_+ = \mathscr {H}_{\mathrm {cont}}(L_u) \bigoplus \mathscr {H}_{\mathrm {pp}}(L_u), \quad \mathscr {H}_{\mathrm {cont}}(L_u) = \mathscr {H}_{\mathrm {ac}}(L_u) = \Theta _u L^2_+, \nonumber \\&\quad \mathscr {H}_{\mathrm {pp}}(L_u) = \tfrac{{\mathbb {C}}_{\le N-1}[X]}{Q_u}, \end{aligned}$$
(1.25)

where \(Q_u\) denotes the characteristic polynomial of u given by (1.9) and \(\Theta _u = \tfrac{\overline{Q}_u}{Q_u}\) is an inner function on the upper half-plane \({\mathbb {C}}_+\). Proposition 1.9 is proved by identifying M(u) in (1.18) as the matrix of the restriction \(G|_{\mathscr {H}_{\mathrm {pp}}(L_u)}\) associated to the spectral basis \(\{\varphi _1^u, \varphi _2^u, \ldots , \varphi _N^u\}\), where \(\varphi _j^u \in \mathrm {Ker}(\lambda _j^u-L_u)\) such that \(\Vert \varphi _j^{u}\Vert _{L^2} =1\) and \(\int _{{\mathbb {R}}}u \varphi _j^u >0\). The generating function \(\mathcal {H}_{\lambda }\) in (1.24) can be identified as the Borel–Cauchy transform of the spectral measure of \(L_u\) associated to \(\Pi u\), which yields the invariance of \(\mathcal {U}_N\) under the BO flow in \(H^{\infty }({\mathbb {R}}, {\mathbb {R}})\). Hence (1.3) is a globally well-posed Hamiltonian system on \(\mathcal {U}_N\).
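The splitting (1.25) can be made explicit for \(N=1\) and \(u=\mathcal {R}\), i.e. \(x_1=0\) and \(c_1=1\) in Definition 1.1. This is only an illustrative sketch of the orthogonality of the two summands. In this case \(\mathbf {P}(u)=\{-i\}\), \(Q_u(X)=X+i\), \(\overline{Q}_u(X)=X-i\), \(\Theta _u(x)=\frac{x-i}{x+i}\) and \(\mathscr {H}_{\mathrm {pp}}(L_u)={\mathbb {C}}\frac{1}{x+i}\). For every \(f \in L^2_+\),

$$\begin{aligned} \left\langle \Theta _u f, \tfrac{1}{x+i}\right\rangle _{L^2} = \int _{{\mathbb {R}}}\frac{x-i}{x+i} f(x)\frac{1}{x-i}\mathrm {d}x = \int _{{\mathbb {R}}}\frac{f(x)}{x+i}\mathrm {d}x = \left\langle f, \tfrac{1}{x-i}\right\rangle _{L^2} = 0, \end{aligned}$$

because \(\frac{1}{x-i}\) belongs to \(L^2_-\), which is orthogonal to \(L^2_+\). Note also that \(|\Theta _u|=1\) on \({\mathbb {R}}\) and that \(\Theta _u\) is bounded and holomorphic on \({\mathbb {C}}_+\), as required of an inner function.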

Section 5 is dedicated to completing the proof of theorem 1. The generalized angle variables are the real parts of the diagonal elements of the matrix M(u), i.e. \(\gamma _j : u \in \mathcal {U}_N \mapsto \mathrm {Re} \langle G \varphi _j^u, \varphi _j^u \rangle _{L^2} \in {\mathbb {R}}\) and the action variables are \(I_j : u \in \mathcal {U}_N \mapsto 2\pi \lambda _j^u \in {\mathbb {R}}\). Thanks to the Lax pair formulation \(\mathrm {d}L(u) (X_{\mathcal {H}_{\lambda }}(u))=[B_u^{\lambda }, L_u]\), where \(L : u \in \mathcal {U}_N \mapsto L_u \in \mathfrak {B}(H^1_+, L^2_+)\) is \({\mathbb {R}}\)-affine and \(B^{\lambda }_u\) is some skew-adjoint operator on \(L^2_+\), we have \(2\pi \{\lambda _j, \gamma _k\}= \delta _{kj}\) and \(\{\lambda _j, \lambda _k\}= 0\) on \(\mathcal {U}_N\), \(1\le j , k \le N\). Then \(\Phi _N : \mathcal {U}_N \rightarrow \Omega _N \times {\mathbb {R}}^N\) is a real analytic immersion. The diffeomorphism property of \(\Phi _N\) is given by Hadamard’s global inverse function theorem. Finally, we show that \(\Phi _N : (\mathcal {U}_N, \omega ) \rightarrow (\Omega _N \times {\mathbb {R}}^N, \nu )\) is a symplectomorphism by restricting \( \omega - \Phi _N^*\nu \) to a special Lagrangian submanifold \(\Lambda _N:=\bigcap _{j=1}^N \gamma _j^{-1}(0) \subset \mathcal {U}_N\). Corollary 1.11 is proved in Sect. 6.

Remark 1.13

(Comparison with Gérard–Kappeler [8]) The BO equation on \(\mathbb {T}={\mathbb {R}} /2\pi {\mathbb {Z}}\) reads as

$$\begin{aligned} \partial _t v= \mathrm {H}^{\mathbb {T}} \partial _x^2 v - \partial _x (v^2), \quad (t,x)\in {\mathbb {R}} \times \mathbb {T}, \end{aligned}$$
(1.26)

where \(\mathrm {H}^{\mathbb {T}}\) denotes the Hilbert transform on \(L^2(\mathbb {T}, {\mathbb {C}})\) that is defined by \(\mathrm {H}^{\mathbb {T}}f(x)= -i \sum _{|n| \ge 1} \frac{|n|}{n} {\hat{f}}(n) e^{inx}\), \(\forall f= \sum _{n\in {\mathbb {Z}}}{\hat{f}}(n) e^{inx} \in L^2(\mathbb {T}, {\mathbb {C}})\). It can be written in Hamiltonian form on appropriate Sobolev spaces

$$\begin{aligned} \partial _t v = \partial _x \nabla _v E^{\mathbb {T}}(v), \quad E^{\mathbb {T}}(v)= \frac{1}{2\pi }\int _0^{2\pi } \left( \frac{1}{2} (|\partial _x |^{\frac{1}{2}}v(x))^2 -\frac{1}{3}v(x)^3 \right) \mathrm {d}x, \end{aligned}$$
(1.27)

where \(X_{E^{\mathbb {T}}}(v) =\partial _x \nabla _v E^{\mathbb {T}}(v)\) is the Hamiltonian vector field of \(E^{\mathbb {T}}\) with respect to the symplectic form

$$\begin{aligned}&\omega ^{\mathbb {T}}(f_1,f_2) = \sum _{|n|\ge 1} \frac{i}{2\pi n}{\hat{f}}_1(n) \overline{{\hat{f}}_2(n)}, f_j = \sum _{|n|\ge 1}{\hat{f}}_j(n)e^{inx} \in L^2_{r,0}(\mathbb {T}):=\{v \in L^2(\mathbb {T}, {\mathbb {R}}) : {\hat{v}}(0)=0\}.\nonumber \\ \end{aligned}$$
(1.28)

The global Birkhoff coordinates for (1.26) on \(L^2_{r,0}(\mathbb {T})\), described in theorem 1.1 of [8], are denoted by

$$\begin{aligned} \varvec{\zeta }^{\mathbb {T}} : v \in L^2_{r,0}(\mathbb {T}) \mapsto (\zeta _n(v))_{n\ge 1} \in \mathfrak {h}_+^{\frac{1}{2}}, \end{aligned}$$
(1.29)

where \(\mathfrak {h}_+^{\frac{1}{2}} := \{(z_n)_{n \in {\mathbb {N}}_+} \subset {\mathbb {C}} : \Vert (z_n)_{n \in {\mathbb {N}}_+}\Vert _{\frac{1}{2}}^2 := \sum _{n\ge 1}|n||z_n|^2 <+\infty \}\) is a weighted \(\ell ^2\)-sequence space. Thanks to theorem 7.1 of [8], the N-gap potential manifold \(U_N^{\mathbb {T}}\) defined by (1.17) is a connected, real analytic, symplectic submanifold of \((L^2_{r,0}(\mathbb {T}), \omega ^{\mathbb {T}})\) given by (1.28) and \(U_N^{\mathbb {T}}\) is characterized by

$$\begin{aligned} U_N^{\mathbb {T}} = \{v \in L^2_{r,0}(\mathbb {T})\quad : \quad \zeta _N(v) \ne 0, \quad \zeta _j(v)=0, \quad \forall j > N\}. \end{aligned}$$
(1.30)

So it is invariant under the flow of equation (1.26) and \(\dim _{{\mathbb {R}}} U_N^{\mathbb {T}} = 2N\). Let \({\tilde{\nu }} := i \sum _{j=1}^N \mathrm {d}z_j \wedge \mathrm {d}\overline{z}_j\) denote the canonical symplectic form on \({\mathbb {C}}^{N-1} \times {\mathbb {C}}^*\). The restriction of complex Birkhoff coordinates \(\varvec{\zeta }^{\mathbb {T}}\) given by (1.29), to the manifold \(U_N^{\mathbb {T}}\) establishes a real analytic diffeomorphism

$$\begin{aligned} \varvec{\zeta }_N^{\mathbb {T}} := \left( \varvec{\zeta }^{\mathbb {T}}\right) \big |_{U_N^{\mathbb {T}}} : v \in U_N^{\mathbb {T}} \mapsto (\zeta _1(v), \zeta _2(v), \ldots , \zeta _N(v)) \in {\mathbb {C}}^{N-1} \times {\mathbb {C}}^*, \end{aligned}$$
(1.31)

such that \(\varvec{\zeta }_N^{\mathbb {T}}\) preserves the symplectic structure, i.e. \(\left( \varvec{\zeta }_N^{\mathbb {T}} \right) ^* {\tilde{\nu }} = \omega ^{\mathbb {T}}\), and the energy functional \(E^{\mathbb {T}}\) in (1.27), when expressed in the coordinate functions, is given by

$$\begin{aligned} E^{\mathbb {T}}(v) = \sum _{n=1}^N n^2|\zeta _n(v)|^2 - \sum _{n=1}^N \left( \sum _{k=n}^N|\zeta _k(v)|^2\right) ^2, \qquad \forall v \in U_N^{\mathbb {T}}. \end{aligned}$$
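
As a quick illustration of this formula (a worked instance added here for orientation, not taken from [8]), the cases \(N=1\) and \(N=2\) read

$$\begin{aligned} N=1:\quad&E^{\mathbb {T}}(v) = |\zeta _1(v)|^2 - |\zeta _1(v)|^4, \\ N=2:\quad&E^{\mathbb {T}}(v) = |\zeta _1(v)|^2 + 4|\zeta _2(v)|^2 - \left( |\zeta _1(v)|^2 + |\zeta _2(v)|^2\right) ^2 - |\zeta _2(v)|^4. \end{aligned}$$

In both cases the energy depends only on the moduli \(|\zeta _n(v)|^2\), as one expects from Birkhoff coordinates.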

The generating functional defined in (1.24) plays a key role in proving the local diffeomorphism property and the symplectomorphism property of action–angle\(/\)Birkhoff map in both theorem 1 of this paper and theorem 7.1 of [8]. The real analytic structure of \(\mathcal {U}_N\) (resp. \(U_N^{\mathbb {T}}\)) is constructed by establishing a real analytic embedding from an open subset of \({\mathbb {C}}^N\) to \(L^2({\mathbb {R}}, {\mathbb {R}})\) (resp. \(L^2(\mathbb {T}, {\mathbb {R}})\)) with range given by \(\mathcal {U}_N\) (resp. \(U_N^{\mathbb {T}}\)). A real analytic covering map from the N-soliton manifold \(\mathcal {U}_N\) to the N-gap potential manifold \(U_N^{\mathbb {T}}\) is established in remark 1.8. However, the construction of the action–angle map \(\Phi _N\) in (1.14) is quite different from the construction of the Birkhoff map \(\varvec{\zeta }^{\mathbb {T}}\) in [8].

\(\mathbf {1}.\) The symplectic form \(\omega ^{\mathbb {T}}\) given by (1.28) is well defined on \(L^2_{r,0}(\mathbb {T})\), which is an \({\mathbb {R}}\)-Hilbert space that contains every manifold \(U_N^{\mathbb {T}}\). So \(U_N^{\mathbb {T}}\) is a symplectic submanifold of \((L^2_{r,0}, \omega ^{\mathbb {T}})\). The BO equation on the torus (1.26), when restricted to \(U_N^{\mathbb {T}}\), is interpreted as an integrable subsystem of equation (1.26) on \((L^2_{r,0}, \omega ^{\mathbb {T}})\). On the other hand, in the space non-periodic regime, we do not know whether there exists a large submanifold of \(L^2({\mathbb {R}}, {\mathbb {R}})\), denoted by \(\mathfrak {L}\), such that \(\mathfrak {L}\) contains every multi-soliton manifold \(\mathcal {U}_N\), \(\mathfrak {L}\) is invariant under the flow of (1.1), and there exist action–angle coordinates for the BO equation (1.1) on \(\mathfrak {L}\), whose restriction to \(\mathcal {U}_N\) is \(\Phi _N\) given in (1.14). Evidently, \(\mathfrak {L}\) cannot be chosen as \( \mathcal {W}= \partial _x (H^1({\mathbb {R}}, {\mathbb {R}}))\) given by (1.4), because \(\mathcal {U}_N \bigcap \mathcal {W}=\emptyset \). However, the 2-covector \(\varvec{\omega } : (h_1, h_2) \in \mathcal {W}^2 \mapsto \langle h_1, \partial _x^{-1} h_2\rangle _{L^2({\mathbb {R}})}\) is defined on \( \mathcal {W}\). The extension of the symplectic form \(\omega \in \varvec{\Omega }^2(\mathcal {U}_N)\), which is defined by (1.11), to the manifold \(\mathfrak {L}\) would be the major difficulty for constructing action–angle coordinates of the BO equation (1.1) on \(\mathfrak {L}\). Since \(\mathcal {U}_N \bigcap \mathcal {W}=\emptyset \), we have to use Cartan’s formula (4.2) in order to prove the closedness of the 2-form \(\omega : u\in \mathcal {U}_N \mapsto \omega _u= \varvec{\omega } \in \varvec{\Lambda }^2(\mathcal {W}^*)\), which may not be interpreted as a pullback of \(\varvec{\omega }\).
Moreover, the simple connectedness of \(\mathcal {U}_N\) is established by a special property of the Viète map (4.1).

\(\mathbf {2}.\) In any case, the Lax operator for the BO equation is self-adjoint and bounded from below. The spectrum of the Lax operator \(L_v^{\mathbb {T}}\) in the space-periodic regime consists of a sequence of simple eigenvalues \(\sigma (L_v^{\mathbb {T}}) = \{\lambda _0^{\mathbb {T}}(v)< \lambda _1^{\mathbb {T}}(v) < \cdots \} \subset {\mathbb {R}}\) and the gap between each two of them is at least 1. Then the n th action variable is defined by \(|\zeta _n(v)|^2 := \lambda _n^{\mathbb {T}}(v) - \lambda _{n-1}^{\mathbb {T}}(v)-1\) in [8], \(\forall n \ge 1\). However, in order to prove the simplicity and negativeness of eigenvalues of the Lax operator \(L_u\) in Definition 2.2 for the BO equation on the line (1.1), we have to introduce the auxiliary identity (1.23). The action variables for equation (1.1) on \(\mathcal {U}_N\) are actually the eigenvalues of \(2 \pi L_u\), \(\forall u \in \mathcal {U}_N\).
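
The relation between the periodic eigenvalues and the actions of [8] can be made explicit by a telescoping sum. Summing the definition \(|\zeta _k(v)|^2 = \lambda _k^{\mathbb {T}}(v) - \lambda _{k-1}^{\mathbb {T}}(v)-1\) over \(k=1, \ldots , n\) gives (a step spelled out here for convenience)

$$\begin{aligned} \lambda _n^{\mathbb {T}}(v) = \lambda _0^{\mathbb {T}}(v) + n + \sum _{k=1}^n |\zeta _k(v)|^2, \qquad \forall n \ge 1, \end{aligned}$$

which makes visible that two consecutive eigenvalues are at distance at least 1, with equality exactly when the corresponding coordinate \(\zeta _k(v)\) vanishes.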

\(\mathbf {3}.\) The shift operator \(S^{\mathbb {T}} : f\in L^2_+(\mathbb {T}) \mapsto e^{ix} f(x)\in L^2_+(\mathbb {T})\) and its adjoint are bounded operators on the Hardy space \(L^2_+(\mathbb {T}):= \Pi ^{\mathbb {T}}(L^2 (\mathbb {T}, {\mathbb {C}}))\), where \(\Pi ^{\mathbb {T}} : \sum _{n \in {\mathbb {Z}}}g_n e^{inx} \in L^2 (\mathbb {T}, {\mathbb {C}}) \mapsto \sum _{n \ge 0}g_n e^{inx} \in L^2 (\mathbb {T}, {\mathbb {C}})\) denotes the Szegő projector on \(L^2 (\mathbb {T}, {\mathbb {C}})\). So both the inverse formula for \(v \in L^2(\mathbb {T}, {\mathbb {R}})\), which is denoted by formula (4.5) in [8], and the spectral characterization of \(U_N^{\mathbb {T}}\), which is given by formula (1.30) of this paper and (7.2) in [8], can be directly obtained by computing the 0 th Fourier mode of each eigenfunction of the space-periodic Lax operator \(L_v^{\mathbb {T}}\) without using Beurling’s theorem that characterizes all the shift-invariant subspaces of \(L^2_+(\mathbb {T}) = \mathscr {H}_{\mathrm {pp}}(L_v^{\mathbb {T}})\). On the other hand, in the case of the BO equation on the line (1.1), the Lax operator \(L_u\) in definition 2.2 has not only eigenvalues but also continuous spectrum. In order to determine the characteristic polynomial \(Q_u\) in (1.9) and prove the spectral characterization Theorem 4.8 for \(\mathcal {U}_N\), we have to do the spectral decomposition (1.25) and identify each spectral subspace as the corresponding closed shift-invariant (also called translation-invariant) subspace by both introducing the two shift semigroups \((S(\eta ))_{\eta \ge 0}\), \((S(\eta )^*)_{\eta \ge 0}\), and using Lax’s scalar representation Theorem 3.2 or its special case stated as Lemma 3.3. 
In fact, \(\forall u \in \mathcal {U}_N\), the spectral subspace \(\mathscr {H}_{\mathrm {ac}}(L_u)\) is invariant under \((S(\eta ))_{\eta \ge 0}\); the spectral subspace \(\mathscr {H}_{\mathrm {pp}}(L_u)\) is invariant under \((S(\eta )^*)_{\eta \ge 0}\) and \(\dim _{{\mathbb {C}}}\mathscr {H}_{\mathrm {pp}}(L_u) =N\). Since the infinitesimal generator \(G = i \frac{\mathrm {d}}{\mathrm {d}\eta }\big |_{\eta =0^+} S(\eta )^*\) is an unbounded, densely defined operator on \(L^2_+ ({\mathbb {R}})\), given by (3.2), we study its restriction to the N-dimensional spectral subspace \(\mathscr {H}_{\mathrm {pp}}(L_u)\). Then Lemma 3.3 yields that \(Q_u(X)= \det (X- G|_{\mathscr {H}_{\mathrm {pp}}(L_u)})\).

1.3 Related work

Besides the global well-posedness problem of the BO equation (1.1), various properties of its multi-soliton solutions have been investigated in detail. Both the solitary waves for (1.1) and the internal periodic waves for (1.26) are completely classified in Amick–Toland [2]. The \(H^1\)-orbital stability of double solitons of (1.1) is obtained in Neves–Lopes [16]. In Dobrokhotov–Krichever [6], the multi-phase solutions (periodic multi-solitons) for (1.26) are constructed by finite zone integration and they have also established an inversion formula for multi-phase solutions. Compared with their work, we give a geometric description of the inverse spectral transform by proving the real bi-analyticity and the symplectomorphism property of the action–angle map \(\Phi _N\) given by (1.14). Furthermore, the inverse spectral formula \(u = - 2 \mathrm {Im} \frac{Q_u'}{Q_u}\) with \(Q_u(x) =\det (x - G|_{\mathscr {H}_{\mathrm {pp}}(L_u)})=\det (x - M(u))\) provides a spectral connection between the Lax operator \(L_u\) and the operator \(G|_{\mathscr {H}_{\mathrm {pp}}(L_u)}\), \(\forall u \in \mathcal {U}_N\).

Concerning the investigation of the integrability of the BO equations (1.1) and (1.26), besides the discovery of their Lax pair structures, we mention the pioneering work of Ablowitz–Fokas [1], Coifman–Wickerhauser [5], Kaup–Matsuno [12] and Wu [24, 25] on the direct and inverse scattering transform of (1.1). Equations (1.1) and (1.26) both admit an infinite hierarchy of conservation laws that control every \(H^s\)-norm of the solutions, see [1] and [5] for the case \(2s\in {\mathbb {N}}\), see Talbut [22] for the case \(-\frac{1}{2}< s < 0\) and for conservation laws controlling Besov norms, etc. In the space-periodic regime, Gérard and Kappeler have shown in [8] that (1.26) admits global Birkhoff coordinates on \(L^2_{r,0}(\mathbb {T})\), see also remark 1.13 for the comparison between [8] and theorem 1 of this paper. We point out that both the Korteweg–de Vries (KdV) equation on \(\mathbb {T}\) (see Kappeler–Pöschel [11]) and the cubic defocusing Schrödinger (dNLS) equation on \(\mathbb {T}\) (see Grébert–Kappeler [9]) admit global Birkhoff coordinates. The theory of finite-dimensional Hamiltonian systems is transferred to the BO, KdV and dNLS equations on \(\mathbb {T}\) through the submanifolds of finite-gap potentials, which are introduced in order to solve the periodic KdV initial value problem. Moreover, the cubic Szegő equations both on \(\mathbb {T}\) (see Gérard–Grellier [7]) and on \({\mathbb {R}}\) (see Pocovnicu [18]) admit global (generalized) action–angle coordinates on all finite-rank generic rational function manifolds, denoted respectively by \(\mathcal {M}(N)_{\mathrm {gen}}^{\mathbb {T}}\) and \(\mathcal {M}(N)_{\mathrm {gen}}^{{\mathbb {R}}}\). A real analytic covering map can be established from \(\mathcal {M}(N)_{\mathrm {gen}}^{{\mathbb {R}}}\) to \(\mathcal {M}(N)_{\mathrm {gen}}^{\mathbb {T}}\).
Moreover, the cubic Szegő equations both on \(\mathbb {T}\) and on \({\mathbb {R}}\) have inverse spectral formulas which permit the Szegő flows to be expressed explicitly in terms of time-variables and initial data without using action–angle coordinates. The shift semigroup \((S(\eta )^*)_{\eta \ge 0}\) and its infinitesimal generator G are also used in [18].

Remark 1.14

The BO equation (1.1) can be interpreted as a Schrödinger-type equation, which is filtered by the Szegő projector \(\Pi : L^2({\mathbb {R}}) \rightarrow L^2_+\). If \(u : t\in {\mathbb {R}} \mapsto u(t) \in H^2({\mathbb {R}}, {\mathbb {R}})\) solves (1.1) and \(w : t \in {\mathbb {R}} \mapsto w(t):=\Pi (u(t)) \in H^2_+\), then equation (1.1) reads as an NLS–Szegő equation

$$\begin{aligned} i \partial _t w - \partial _x^2 w + i \partial _x (w^2 + 2 \Pi (|w|^2)) =0, \qquad (t,x)\in {\mathbb {R}}\times {\mathbb {R}}. \end{aligned}$$
(1.32)

We refer to Sun [20, 21] for the long-time and asymptotic behavior of other NLS–Szegő equations.
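
The passage from (1.1) to (1.32) can be sketched as follows. The computation assumes that \(\Pi \) commutes with \(\partial _x\) and that \(\Pi \mathrm {H} = -i \Pi \) on \(L^2({\mathbb {R}})\), which is consistent with (1.2), and it uses \(u = w + \overline{w}\), \(\Pi (w^2)=w^2\) and \(\Pi (\overline{w}^2)=0\):

$$\begin{aligned} \partial _t w&= \Pi \mathrm {H} \partial _x^2 u - \partial _x \Pi (u^2) = -i \partial _x^2 w - \partial _x \Pi \left( w^2 + 2|w|^2 + \overline{w}^2\right) \\&= -i \partial _x^2 w - \partial _x \left( w^2 + 2\Pi (|w|^2)\right) . \end{aligned}$$

Multiplying both sides by \(i\) and collecting all terms on the left gives (1.32).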

2 The Lax Operator

This section is dedicated to studying the Lax operator \(L_u\) in the Lax pair formulation of the BO equation (1.1). Then we describe the location of its spectrum and revisit the simplicity of its eigenvalues. At last, we introduce a generating functional \(\mathcal {H}_{\lambda }\) which encodes the entire BO hierarchy. The equation \(\partial _t u = \partial _x \nabla _u \mathcal {H}_{\lambda }(u)\) also enjoys a Lax pair structure. Now, we recall a basic fact concerning unitarily equivalent self-adjoint operators.

Proposition 2.1

Let \(\mathcal {E}_1\) and \(\mathcal {E}_2\) be two Hilbert spaces, let \(\mathcal {A}\) be a self-adjoint operator defined on \(\mathbf {D}(\mathcal {A}) \subset \mathcal {E}_1\) and \(\mathcal {B}\) be a self-adjoint operator defined on \(\mathbf {D}(\mathcal {B}) \subset \mathcal {E}_2\). Both \(\mathcal {A}\) and \(\mathcal {B}\) have spectral decompositions

$$\begin{aligned} \mathcal {E}_1 = \mathscr {H}_{\mathrm {ac}}(\mathcal {A}) \bigoplus \mathscr {H}_{\mathrm {sc}}(\mathcal {A}) \bigoplus \mathscr {H}_{\mathrm {pp}}(\mathcal {A}), \qquad \mathcal {E}_2 = \mathscr {H}_{\mathrm {ac}}(\mathcal {B}) \bigoplus \mathscr {H}_{\mathrm {sc}}(\mathcal {B}) \bigoplus \mathscr {H}_{\mathrm {pp}}(\mathcal {B}).\nonumber \\ \end{aligned}$$
(2.1)

If \(\mathcal {A}\) and \(\mathcal {B}\) are unitarily equivalent i.e. there exists a unitary operator \(\mathcal {U}: \mathcal {E}_1 \rightarrow \mathcal {E}_2\) such that

$$\begin{aligned} \mathcal {B} \mathcal {U} = \mathcal {U} \mathcal {A} , \qquad \mathbf {D}(\mathcal {B})=\mathcal {U}\mathbf {D}(\mathcal {A}), \end{aligned}$$
(2.2)

then \(\sigma _{\mathrm {xx}}(\mathcal {A})=\sigma _{\mathrm {xx}}(\mathcal {B})\) and \(\mathcal {U}\mathscr {H}_{\mathrm {xx}}(\mathcal {A})=\mathscr {H}_{\mathrm {xx}}(\mathcal {B})\), for every \(\mathrm {xx} \in \{\mathrm {ac}, \mathrm {sc}, \mathrm {pp}\}\). Moreover, for every bounded Borel function \(f : {\mathbb {R}} \rightarrow {\mathbb {C}}\), \(f(\mathcal {A})\) is a bounded operator on \(\mathcal {E}_1\), \(f(\mathcal {B})\) is a bounded operator on \(\mathcal {E}_2\), and we have \(f(\mathcal {B}) = \mathcal {U} f(\mathcal {A}) \mathcal {U}^*\).
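
As a sanity check of the last identity in the simplest case, one may take the resolvent function \(f(t)=(t-z)^{-1}\) for a fixed \(z \in {\mathbb {C}} \backslash {\mathbb {R}}\) (a choice made here only for illustration). Given \(g \in \mathcal {E}_2\), set \(h := (\mathcal {A}-z)^{-1}\mathcal {U}^* g \in \mathbf {D}(\mathcal {A})\). Then (2.2) gives \(\mathcal {U}h \in \mathbf {D}(\mathcal {B})\) and

$$\begin{aligned} (\mathcal {B}-z)\mathcal {U}h = \mathcal {U}(\mathcal {A}-z)h = \mathcal {U}\mathcal {U}^* g = g, \qquad \text {so} \qquad (\mathcal {B}-z)^{-1} = \mathcal {U} (\mathcal {A}-z)^{-1} \mathcal {U}^*. \end{aligned}$$

The general statement for bounded Borel functions is not reproved by this computation; it only illustrates how (2.2) is used.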

2.1 Spectral analysis I

In this subsection, we study the essential spectrum and discrete spectrum of the Lax operator \(L_u\). The spectral analysis of \(L_u\), in the case where u is a multi-soliton as in definition 1.1, will be continued in Sect. 4.2.

Definition 2.2

Given \(u \in L^2({\mathbb {R}}, {\mathbb {R}})\), its associated Lax operator \(L_u\) is an unbounded operator on \(L^2_+\), given by \(L_u:= \mathrm {D}- T_u\), where \(\mathrm {D}: h \in H^1_+ \mapsto -i \partial _x h \in L^2_+\) and \(T_u\) denotes the Toeplitz operator of symbol u, defined by \(T_u : h \in H^1_+ \mapsto \Pi (uh) \in L^2_+\), where the Szegő projector \(\Pi : L^2 ({\mathbb {R}})\rightarrow L^2_+\) is given by (1.20). If \(u\in H^1({\mathbb {R}}, {\mathbb {R}})\) in addition, we define \(B_u:=i(T_{|\mathrm {D}|u} - T_u^2) \in \mathfrak {B}(H^1_+,L^2_+)\).

Both \(\mathrm {D}\) and \(T_u\) are densely defined symmetric operators on \(L^2_+\) and \(\Vert T_u(h) \Vert _{L^2} \le \Vert u\Vert _{L^2} \Vert h\Vert _{L^{\infty }}\), for every \(h \in H^1_+\) and \(u\in L^2({\mathbb {R}}, {\mathbb {R}})\). Moreover, the Fourier–Plancherel transform implies that \(\mathrm {D}\) is a self-adjoint operator on \(L^2_+\), whose domain of definition is \(H^1_+\).

Proposition 2.3

If \(u\in L^2({\mathbb {R}}, {\mathbb {R}})\), then \(L_u\) is an unbounded self-adjoint operator on \(L^2_+\), whose domain of definition is \(\mathbf {D}(L_u)=H^1_+\). Moreover, \(L_u\) is bounded from below. The essential spectrum of \(L_u\) is \(\sigma _{\mathrm {ess}}(L_u)=\sigma _{\mathrm {ess}}(\mathrm {D})=[0,+\infty )\) and its pure point spectrum satisfies \(\sigma _{\mathrm {pp}}(L_u) \subset [- \tfrac{ \Vert u\Vert _{L^2}^2 }{4C^4}, +\infty )\), where \(C=\inf _{f \in H^1_+ \backslash \{0\}}\frac{\Vert |\mathrm {D}|^{\frac{1}{4}}f\Vert _{L^2}}{\Vert f\Vert _{L^4}}\) denotes the Sobolev constant.

Proof

For every \(h\in L^2_+\), let \(\mu ^{\mathrm {D}}_h\) denote the spectral measure of \(\mathrm {D}\) associated to h, then we have \(\langle f(\mathrm {D})h, h \rangle _{L^2} = \int _0^{+\infty } f(\xi ) \frac{|{\hat{h}}(\xi )|^2}{2\pi } \mathrm {d}\xi \), so \(\mathrm {d}\mu ^{\mathrm {D}}_h(\xi ) = \frac{ \mathbf {1}_{[0,+\infty )}(\xi )|{\hat{h}}(\xi )|^2}{2\pi } \mathrm {d}\xi \). Thus \(\sigma (\mathrm {D})=\sigma _{\mathrm {ess}}(\mathrm {D})=\sigma _{\mathrm {ac}}(\mathrm {D}) =[0, +\infty )\). If \(u\in L^2({\mathbb {R}}, {\mathbb {R}})\), we claim that \(\mathcal {P}_u:=T_u \circ (\mathrm {D}+i)^{-1}\) is a Hilbert–Schmidt operator on \(L^2_+\).

In fact, let \(\mathscr {F} : h\in L^2_+ \mapsto \frac{{\hat{h}}}{\sqrt{2\pi }}\in L^2 ({\mathbb {R}}_+^*)\) denote the renormalized Fourier–Plancherel transform, then \(\mathcal {A}_u := \mathscr {F} \circ \mathcal {P}_u \circ \mathscr {F}^{-1}\) is an operator on \(L^2( {\mathbb {R}}_+^*)\). Then we have \(\mathcal {A}_u g(\xi ) = \int _{0}^{+\infty } K_u(\xi , \eta ) g(\eta )\mathrm {d}\eta \), where \(K_u(\xi , \eta ):=\frac{{\hat{u}}(\xi -\eta )}{2\pi (\eta +i)}\), \(\forall \xi , \eta \in {\mathbb {R}}_+^*\). So \(\Vert \mathcal {A}_u\Vert _{\mathcal {HS}(L^2({\mathbb {R}}_+^*))} \le \Vert K_u\Vert _{L^2({\mathbb {R}}_+^* \times {\mathbb {R}}_+^*)} \le \frac{\Vert u\Vert _{L^2}}{2}\). Since \(\mathcal {P}_u\) is unitarily equivalent to \(\mathcal {A}_u\), we have \(\Vert \mathcal {P}_u\Vert _{\mathcal {HS}( L^2_+)}^2=\sum _{\lambda \in \sigma (\mathcal {P}_u)} \lambda ^2 = \sum _{\lambda \in \sigma (\mathcal {A}_u)} \lambda ^2=\Vert \mathcal {A}_u\Vert ^2_{\mathcal {HS}(L^2({\mathbb {R}}_+^*))} \le \frac{\Vert u\Vert _{L^2}^2}{4}\).
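
The bound on the kernel \(K_u\) can be checked by a direct computation. The following is a sketch, assuming the Plancherel normalization \(\Vert {\hat{u}}\Vert _{L^2}^2 = 2\pi \Vert u\Vert _{L^2}^2\), which is the one consistent with the formula for \(\mu ^{\mathrm {D}}_h\) above:

$$\begin{aligned} \Vert K_u\Vert _{L^2({\mathbb {R}}_+^* \times {\mathbb {R}}_+^*)}^2 = \int _0^{+\infty }\int _0^{+\infty } \frac{|{\hat{u}}(\xi -\eta )|^2}{4\pi ^2 (1+\eta ^2)} \mathrm {d}\xi \mathrm {d}\eta \le \frac{\Vert {\hat{u}}\Vert _{L^2}^2}{4\pi ^2} \int _0^{+\infty } \frac{\mathrm {d}\eta }{1+\eta ^2} = \frac{2\pi \Vert u\Vert _{L^2}^2}{4\pi ^2} \cdot \frac{\pi }{2} = \frac{\Vert u\Vert _{L^2}^2}{4}. \end{aligned}$$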

Then the symmetric operator \(T_u\) is relatively compact with respect to \(\mathrm {D}\) and Weyl’s essential spectrum theorem (Theorem XIII.14 of Reed–Simon [19]) yields that \(\sigma _{\mathrm {ess}}(L_u)=\sigma _{\mathrm {ess}}(\mathrm {D})\) and \(L_u\) is self-adjoint with \(\mathbf {D}(L_u)=\mathbf {D}(\mathrm {D})=H^1_+\). Moreover, \(|\langle T_u f, f \rangle _{L^2}| = |\int _{{\mathbb {R}}}u|f|^2|\le \Vert u\Vert _{L^2} \Vert f\Vert _{L^4}^2 \le C^{-2} \Vert u\Vert _{L^2} \Vert f\Vert _{L^2} \Vert |\mathrm {D}|^{\frac{1}{2}}f\Vert _{L^2}\) holds by Sobolev embedding \(\Vert f\Vert _{L^4} \le C^{-1} \Vert |\mathrm {D}|^{\frac{1}{4}}f\Vert _{L^2}\), for every \(f \in H^1_+\). Then \(L_u\) is bounded from below, precisely \(\langle L_u f, f \rangle _{L^2} = \Vert |\mathrm {D}|^{\frac{1}{2}}f\Vert _{L^2}^2 -\langle T_u f, f \rangle _{L^2} \ge - \tfrac{\Vert u\Vert _{L^2}^2 \Vert f\Vert _{L^2}^2}{4 C^4}\). When \(\lambda < - \tfrac{ \Vert u\Vert _{L^2}^2 }{4C^4}\), the map \(L_u - \lambda : H^1_+ \rightarrow L^2_+\) is injective. Hence \(\sigma _{\mathrm {pp}}(L_u) \subset [- \tfrac{ \Vert u\Vert _{L^2}^2 }{4C^4}, +\infty )\).
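
The two inequalities used above can be spelled out as follows. The notation \(a := \Vert |\mathrm {D}|^{\frac{1}{2}}f\Vert _{L^2}\) and \(b := C^{-2}\Vert u\Vert _{L^2}\Vert f\Vert _{L^2}\) is introduced here only for this computation. The interpolation step is the Cauchy–Schwarz inequality on the Fourier side, and the lower bound follows by completing the square:

$$\begin{aligned} \Vert |\mathrm {D}|^{\frac{1}{4}}f\Vert _{L^2}^2&= \frac{1}{2\pi }\int _0^{+\infty } \xi ^{\frac{1}{2}} |{\hat{f}}(\xi )|^2 \mathrm {d}\xi \le \Vert f\Vert _{L^2} \Vert |\mathrm {D}|^{\frac{1}{2}}f\Vert _{L^2}, \\ \langle L_u f, f \rangle _{L^2}&\ge a^2 - ab = \left( a - \tfrac{b}{2}\right) ^2 - \tfrac{b^2}{4} \ge - \tfrac{b^2}{4} = - \tfrac{\Vert u\Vert _{L^2}^2 \Vert f\Vert _{L^2}^2}{4 C^4}. \end{aligned}$$

In particular, if \(L_u \varphi = \lambda \varphi \) with \(\varphi \ne 0\), then \(\lambda \Vert \varphi \Vert _{L^2}^2 = \langle L_u \varphi , \varphi \rangle _{L^2}\) forces \(\lambda \ge - \tfrac{\Vert u\Vert _{L^2}^2}{4C^4}\).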

\(\square \)

Proposition 2.4

Assume that \(u\in L^2({\mathbb {R}}, (1+x^2)\mathrm {d}x)\) and u is real-valued. For every \(\lambda \in {\mathbb {R}}\) and \(\varphi \in \mathrm {Ker}(\lambda -L_u)\), we have \(\widehat{u \varphi } \in C^1({\mathbb {R}}) \bigcap H^1({\mathbb {R}})\) and the following identity holds,

$$\begin{aligned} |\langle u, \varphi \rangle _{L^2}|^2 = -2\pi \lambda \Vert \varphi \Vert _{L^2}^2. \end{aligned}$$
(2.3)

Thus \(\sigma _{\mathrm {pp}}(L_u) \subset (-\infty , 0)\) and for every \(\lambda \in \sigma _{\mathrm {pp}}(L_u)\), we have

$$\begin{aligned}&\mathrm {Ker}(\lambda -L_u) \subset \{\varphi \in H^1_+ : {\hat{\varphi }}_{|{\mathbb {R}}_+} \in C^1({\mathbb {R}}_+) \bigcap H^{1}({\mathbb {R}}_+) \quad \mathrm {and} \nonumber \\&\quad \xi \mapsto \xi [ {\hat{\varphi }}(\xi )+ \partial _{\xi } {\hat{\varphi }}(\xi )] \in L^2({\mathbb {R}}_+)\}. \end{aligned}$$
(2.4)

Before the proof of proposition 2.4, we recall a lemma concerning the regularity of convolutions.

Lemma 2.5

For any \(p \in (1,+\infty )\), we have \(W^{m, p}({\mathbb {R}}) * W^{n, \frac{p}{p-1}}({\mathbb {R}}) \subset C^{m+n}({\mathbb {R}})\bigcap W^{m+n, +\infty }({\mathbb {R}})\), \(\forall m,n \in {\mathbb {N}}\). For every \(f \in W^{m, p}({\mathbb {R}}) * W^{n, \frac{p}{p-1}}({\mathbb {R}})\), we have \(\lim _{|x|\rightarrow +\infty }\partial _x^{\alpha }f(x)=0\), \(\forall \alpha =0,1, \ldots , m+n\).
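
For instance, the case used in the proof of proposition 2.4 is \(p=2\), \(m=1\), \(n=0\). Spelled out as an illustration, for \(f \in H^1({\mathbb {R}})\) and \(g \in L^2({\mathbb {R}})\), the Cauchy–Schwarz inequality gives

$$\begin{aligned} (f*g)' = f' * g, \qquad \Vert f*g\Vert _{L^{\infty }} \le \Vert f\Vert _{L^2}\Vert g\Vert _{L^2}, \qquad \Vert (f*g)'\Vert _{L^{\infty }} \le \Vert f'\Vert _{L^2}\Vert g\Vert _{L^2}, \end{aligned}$$

so that \(f*g \in C^1({\mathbb {R}}) \bigcap W^{1,+\infty }({\mathbb {R}})\), and both \(f*g\) and \((f*g)'\) tend to 0 at infinity.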

Remark 2.6

Identity (2.3) was first found by Wu [24] in the case \(\lambda <0\). We show that (2.3) still holds in the case \(\lambda \ge 0\). Hence the operator \(L_u\) has no eigenvalues in \([0,+\infty )\).

Proof of proposition 2.4

We choose \(u\in L^2({\mathbb {R}}, (1+x^2)\mathrm {d}x)\) such that \(u({\mathbb {R}})\subset {\mathbb {R}}\), \(\lambda \in {\mathbb {R}}\) and \(\varphi \in L^2_+\) such that \(L_u (\varphi )= \lambda \varphi \). Applying the Fourier–Plancherel transform, we obtain

$$\begin{aligned} \widehat{u\varphi }(\xi ) \mathbf {1}_{[0,+\infty )} (\xi ) =(\xi -\lambda ){\hat{\varphi }}(\xi )=:g_{\lambda }(\xi ). \end{aligned}$$
(2.5)

Since \({\hat{u}}\in H^1({\mathbb {R}})\) and \({\hat{\varphi }} \in L^2({\mathbb {R}})\), their convolution \(\widehat{u\varphi } =\frac{1}{2\pi }{\hat{u}} * {\hat{\varphi }} \in C^1({\mathbb {R}})\bigcap C_0({\mathbb {R}})\), where \(C_0({\mathbb {R}})\) denotes the uniform closure of \(C_c({\mathbb {R}})\) with respect to the \(L^{\infty }({\mathbb {R}})\)-norm, by Lemma 2.5. We claim that if \(\lambda <0\), then \({\hat{\varphi }} \in C^1({\mathbb {R}}_+)\); if \(\lambda \ge 0\), then \({\hat{\varphi }} \in C({\mathbb {R}}_+)\bigcap C^1({\mathbb {R}}_+ \backslash \{\lambda \})\).

In fact, if \(\lambda \ge 0\), we have \(g_{\lambda }(\lambda )=0\). Otherwise, \(\lambda \) would be a singular point of \({\hat{\varphi }}\) that prevents \({\hat{\varphi }}\) from being an \(L^2\) function on \({\mathbb {R}}_+\), because \(\xi \mapsto \frac{1}{\xi -\lambda } \notin L^2({\mathbb {R}}_+)\). By using the fact \(g_{\lambda } \in C^1({\mathbb {R}}_+)\) (\(g_{\lambda }\) is right differentiable at \(\xi =0\) and the derivative \(g_{\lambda }'\) is right continuous at \(\xi =0\)), we have

$$\begin{aligned} {\hat{\varphi }}(\xi )= \frac{g_{\lambda }(\xi )-g_{\lambda }(\lambda )}{\xi -\lambda }\rightarrow {\left\{ \begin{array}{ll} g'_{\lambda }(\lambda ), \quad \quad \;\;\mathrm {if} \quad \lambda >0;\\ g'_{\lambda }(0^+), \qquad \mathrm {if} \quad \lambda =0;\\ \end{array}\right. } \end{aligned}$$

when \(\xi \rightarrow \lambda \). So \({\hat{\varphi }} \in C({\mathbb {R}}_+)\) and \(\lim _{\xi \rightarrow +\infty } {\hat{\varphi }}(\xi )=0\). Then we differentiate (2.5) with respect to \(\xi \) to get

$$\begin{aligned} -\tfrac{i}{2\pi } (\widehat{xu}* {\hat{\varphi }})(\xi )=g'_{\lambda }(\xi )= (\widehat{u\varphi })'(\xi ) = {\hat{\varphi }}(\xi ) + (\xi -\lambda )({\hat{\varphi }})'(\xi ), \qquad \forall \xi \in [0,+\infty )\backslash \{\lambda \}.\nonumber \\ \end{aligned}$$
(2.6)

Thus we have

$$\begin{aligned} \tfrac{\mathrm {d}}{\mathrm {d}\xi }[(\xi -\lambda )|{\hat{\varphi }}(\xi )|^2]= & {} |{\hat{\varphi }}(\xi )|^2 + 2\mathrm {Re}[((\xi -\lambda )({\hat{\varphi }})'(\xi ))\overline{{\hat{\varphi }}}(\xi ) ]\nonumber \\= & {} 2\mathrm {Re}[(\widehat{u\varphi })'(\xi )\overline{{\hat{\varphi }}}(\xi )] - |{\hat{\varphi }}(\xi )|^2. \end{aligned}$$
(2.7)

\(\bullet \) When \(\lambda <0\), it suffices to use the Plancherel formula \(\int _0^{+\infty } (\widehat{u\varphi })'(\xi )\overline{{\hat{\varphi }}}(\xi ) \mathrm {d}\xi = -2\pi i\int _{{\mathbb {R}}}x u(x)| \varphi (x) |^2 \mathrm {d}x\) and to integrate equation (2.7) on \([0, +\infty )\). Since \((\xi -\lambda )|{\hat{\varphi }}(\xi )|^2 = \widehat{u\varphi }(\xi ) \overline{{\hat{\varphi }}}(\xi )\rightarrow 0\), as \(\xi \rightarrow +\infty \), we have \(\lambda |{\hat{\varphi }}(0)|^2 = \int _0^{+\infty } \frac{\mathrm {d}}{\mathrm {d}\xi }[(\xi -\lambda )|{\hat{\varphi }}(\xi )|^2] \mathrm {d}\xi = 4\pi \mathrm {Im} \int _{{\mathbb {R}}}x u(x)| \varphi (x) |^2 \mathrm {d}x - \int _0^{+\infty }|{\hat{\varphi }}(\xi )|^2 \mathrm {d}\xi = -2\pi \Vert \varphi \Vert _{L^2({\mathbb {R}})}^2\).

\(\bullet \) When \(\lambda > 0\), \({\hat{\varphi }}\) may fail to be differentiable at \(\xi =\lambda \). We replace the integral \(\int _0^{+\infty }\) by the two integrals \(\int _0^{\lambda -\epsilon }\) and \(\int _{\lambda + \epsilon }^{+\infty }\), for some \(\epsilon \in (0,\lambda )\). We set \(\mathcal {I}(\epsilon ):= \lambda |{\hat{\varphi }}(0)|^2 - \epsilon |{\hat{\varphi }}(\lambda -\epsilon )|^2- \epsilon |{\hat{\varphi }}(\lambda +\epsilon )|^2\), then \(\mathcal {I}(\epsilon ) = 2\mathrm {Re} \left( \int _0^{+\infty } (\widehat{u\varphi })'(\xi )\overline{{\hat{\varphi }}}(\xi ) \mathrm {d}\xi - \int _{\lambda -\epsilon }^{\lambda + \epsilon } (\widehat{u\varphi })'(\xi )\overline{{\hat{\varphi }}}(\xi ) \mathrm {d}\xi \right) - \int _0^{+\infty }|{\hat{\varphi }}(\xi )|^2 \mathrm {d}\xi + \int _{\lambda -\epsilon }^{\lambda + \epsilon }|{\hat{\varphi }}(\xi )|^2 \mathrm {d}\xi \). Thanks to the continuity of \({\hat{\varphi }}\) on \({\mathbb {R}}_+\), we have \(\lambda |{\hat{\varphi }}(0)|^2=\lim _{\epsilon \rightarrow 0^{+}}\mathcal {I}(\epsilon ) = -2\pi \Vert \varphi \Vert _{L^2({\mathbb {R}})}^2\).

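\(\bullet \) When \(\lambda =0\), we use the same idea and integrate (2.7) over the interval \([\epsilon , +\infty )\), for some \(\epsilon >0\). Then \(\mathcal {J}(\epsilon ):= - \epsilon |{\hat{\varphi }}( \epsilon )|^2 = 2\mathrm {Re} \int _{\epsilon }^{+\infty } (\widehat{u\varphi })'(\xi )\overline{{\hat{\varphi }}}(\xi ) \mathrm {d}\xi - \int _{\epsilon }^{+\infty }|{\hat{\varphi }}(\xi )|^2 \mathrm {d}\xi \) and \(\mathcal {J}(\epsilon ) \rightarrow 0\), as \(\epsilon \rightarrow 0^+\), by the continuity of \({\hat{\varphi }}\) at 0. Since the right-hand side converges to \(-2\pi \Vert \varphi \Vert _{L^2({\mathbb {R}})}^2\) by the same Plancherel formula as in the case \(\lambda <0\), we get \(-2\pi \Vert \varphi \Vert _{L^2({\mathbb {R}})}^2 = 0 = \lambda |{\hat{\varphi }}(0)|^2\).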

So we always have \(-2\pi \Vert \varphi \Vert _{L^2({\mathbb {R}})}^2=\lambda |{\hat{\varphi }}(0)|^2\), if \(\varphi \in \mathrm {Ker}(\lambda -L_u)\). As a consequence \(L_u\) has only negative eigenvalues, if the real-valued function \(u\in L^2({\mathbb {R}}, (1+x^2) \mathrm {d}x)\). Finally we use \(\widehat{u\varphi }(0)=-\lambda {\hat{\varphi }}(0)\) to get identity (2.3). If \(\lambda \in \sigma _{\mathrm {pp}}(L_u)\) and \(\varphi \in \mathrm {Ker}(\lambda - L_u) \backslash \{0\}\), we want to prove that

$$\begin{aligned} \xi \mapsto (1+|\xi |)\partial _{\xi }{\hat{\varphi }}(\xi ) \in L^2(0, +\infty ). \end{aligned}$$
(2.8)

In fact, since \(\varphi \in H^1_+ \hookrightarrow L^{\infty }({\mathbb {R}})\) and \(u \in L^2({\mathbb {R}}, (1+x^2)\mathrm {d}x)\), we have \(\widehat{u \varphi }=\frac{{\hat{u}}*{\hat{\varphi }}}{2\pi } \in H^1({\mathbb {R}})\). Formula (2.5) yields that \(\xi \mapsto (|\lambda | + \xi ){\hat{\varphi }}(\xi ) \in L^2({\mathbb {R}})\) and we have \({\hat{\varphi }} \in L^1({\mathbb {R}})\). The hypothesis \(u \in L^2({\mathbb {R}}, x^2 \mathrm {d}x)\) implies that the convolution term \( \widehat{xu}*{\hat{\varphi }} \in L^2({\mathbb {R}})\). Since \(\lambda <0\), we obtain (2.8) by using formula (2.6). \(\quad \square \)
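
The step of the proof above that leads from \(\lambda |{\hat{\varphi }}(0)|^2 = -2\pi \Vert \varphi \Vert _{L^2}^2\) to identity (2.3) can be written out as follows. This is a sketch assuming the convention \({\hat{f}}(\xi ) = \int _{{\mathbb {R}}} f(x) e^{-ix\xi }\mathrm {d}x\), which matches the factor \(\frac{1}{2\pi }\) in \(\widehat{u\varphi } =\frac{1}{2\pi }{\hat{u}} * {\hat{\varphi }}\). Since u is real-valued, \(\langle u, \varphi \rangle _{L^2} = \overline{\widehat{u\varphi }(0)}\), and (2.5) at \(\xi =0\) gives \(\widehat{u\varphi }(0)=-\lambda {\hat{\varphi }}(0)\), so

$$\begin{aligned} |\langle u, \varphi \rangle _{L^2}|^2 = |\widehat{u\varphi }(0)|^2 = \lambda ^2 |{\hat{\varphi }}(0)|^2 = \lambda \cdot \left( \lambda |{\hat{\varphi }}(0)|^2 \right) = -2\pi \lambda \Vert \varphi \Vert _{L^2}^2. \end{aligned}$$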

Corollary 2.7

Assume that \(u\in L^2({\mathbb {R}}, (1+x^2)\mathrm {d}x)\) and u is real-valued. Then every eigenvalue of \(L_u\) is simple. If \(u \in L^{\infty }({\mathbb {R}})\) in addition, then \(\sigma _{\mathrm {pp}}(L_u)\) is a finite subset of \([- \tfrac{ \Vert u\Vert _{L^2}^2 }{4C^4}, 0)\).

Proof

Fix \(\lambda \in \sigma _{\mathrm {pp}}(L_u)\) and set \(V_{\lambda }=\mathrm {Ker}(\lambda -L_u)\), then \(\mathrm {dim}_{{\mathbb {C}}}(V_{\lambda }) \ge 1\). We define a linear form \( A:V_{\lambda } \rightarrow {\mathbb {C}}\) such that \(A(\varphi ):= \int _{{\mathbb {R}}}u\varphi \). Then identity (2.3) yields that \(\mathrm {Ker}(A)=\{0\}\). Thus we have \(V_{\lambda } \cong V_{\lambda }/\mathrm {Ker}(A)\cong \mathrm {Im}(A) \hookrightarrow {\mathbb {C}}\). So \(\mathrm {dim}_{{\mathbb {C}}}(V_{\lambda }) = 1\). When \(u \in L^{\infty }({\mathbb {R}})\) in addition, the finiteness of \(\sigma _{\mathrm {pp}}(L_u)\bigcap (-\infty , 0)\) is given by Theorem 1.2 of Wu [24]. \(\quad \square \)
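
The injectivity of the linear form A used in the proof above can be made explicit. Since u is real-valued, \(\langle u, \varphi \rangle _{L^2} = \overline{A(\varphi )}\), so for \(\varphi \in V_{\lambda }\) with \(A(\varphi )=0\), identity (2.3) together with \(\lambda <0\) gives

$$\begin{aligned} 0 = |A(\varphi )|^2 = |\langle u, \varphi \rangle _{L^2}|^2 = -2\pi \lambda \Vert \varphi \Vert _{L^2}^2 \quad \Longrightarrow \quad \varphi = 0. \end{aligned}$$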

2.2 Lax pair formulation

We recall some known results of global well-posedness of the BO equation on the line.

Proposition 2.8

(Tao [23], Ionescu–Kenig [10], etc.). Given \(s\ge 0\), the Fréchet space \(C({\mathbb {R}}, H^s({\mathbb {R}}))\) is endowed with the topology of uniform convergence on every compact subset of \({\mathbb {R}}\). There exists a unique continuous mapping \(u_0 \in H^s({\mathbb {R}}) \mapsto u\in C({\mathbb {R}}, H^s({\mathbb {R}}))\) such that u solves the BO equation (1.1) with initial datum \(u(0)=u_0\).

Proposition 2.9

(Ablowitz–Fokas [1], Coifman–Wickerhauser [5], etc.). For every \(n \in {\mathbb {N}}:={\mathbb {Z}}\bigcap [0,+\infty )\), if \(u_0 \in H^{\frac{n}{2}}({\mathbb {R}}, {\mathbb {R}})\) and \(u : t\in {\mathbb {R}} \mapsto u(t) \in H^{\frac{n}{2}}({\mathbb {R}}, {\mathbb {R}})\) solves equation (1.1) with initial datum \(u(0)=u_0\), then we have \(\mathcal {C}(\Vert u_0\Vert _{H^{\frac{n}{2}}}):=\sup _{t\in {\mathbb {R}}}\Vert u(t)\Vert _{H^{\frac{n}{2}}} <+\infty \).

When \(u \in H^2({\mathbb {R}}, {\mathbb {R}})\), the Toeplitz operators \(T_{|\mathrm {D}|u}\) and \(T_{ u}\) are bounded both on \(L^2_+\) and on \(H^1_+\). So \(B_u\) is a bounded skew-adjoint operator both on \(L^2_+\) and on \(H^1_+\).

Proposition 2.10

Let \(u : t\in {\mathbb {R}} \mapsto u(t)\in H^2({\mathbb {R}}, {\mathbb {R}})\) denote the unique solution of equation (1.1), then

$$\begin{aligned} \partial _t L_{u(t)}= [B_{u(t)}, L_{u(t)}] \in \mathfrak {B}(H^1_+, L^2_+), \qquad \forall t \in {\mathbb {R}}. \end{aligned}$$
(2.9)

The proof of proposition 2.10 can be found in Gérard–Kappeler [8], Wu [24] etc. In order to make this paper self-contained, we recall it here.

Proof

Since \(\frac{\mathrm {d}}{\mathrm {d}t}(L\circ u)(t) = -T_{\partial _t u(t)} = -T_{ \mathrm {H}\partial _x^2 u(t)- \partial _x \left( u(t)^2 \right) }\), it suffices to prove \([B_u, L_u]+T_{ \mathrm {H}\partial _x^2 u - \partial _x \left( u^2 \right) }=0\) for every \(u \in H^2({\mathbb {R}}, {\mathbb {R}})\). In fact, we have \({\hat{u}}(-\xi ) = \overline{{\hat{u}}(\xi )}\), \(u = \Pi u + \overline{\Pi u}\) and \(|\mathrm {D}| u = \mathrm {D} \Pi u - \mathrm {D} \overline{\Pi u}\). Since \(T_u\) and \(B_u\) are bounded both as operators \(L^2_+ \rightarrow L^2_+\) and as operators \(H^1_+ \rightarrow H^1_+\), we have

$$\begin{aligned} \begin{aligned} {[}B_u, L_u]f =&- \Pi (f \partial _x |\mathrm {D}| u) + i \Pi [u\Pi (f |\mathrm {D}|u) - |\mathrm {D}|u \Pi (uf)] + \Pi [\partial _x u \Pi (uf) + u \Pi (f \partial _x u)] \\ =&-\Pi (f \mathrm {H} \partial _x^2 u) + \mathcal {I}_1 + \mathcal {I}_2 \in L^2_+, \end{aligned} \end{aligned}$$
(2.10)

for every \(f \in H^1_+\), where the terms \(\mathcal {I}_1\) and \(\mathcal {I}_2\) are given by

$$\begin{aligned} \begin{aligned} \mathcal {I}_1 :=&i \Pi [u\Pi (f |\mathrm {D}|u) - |\mathrm {D}|u \Pi (uf)] \\ =&\Pi [f \overline{\Pi u} \partial _x \Pi u + f \Pi u \partial _x \overline{\Pi u}] - \Pi u \Pi (f \partial _x \overline{\Pi u}) -\Pi (f \overline{\Pi u}) \partial _x \Pi u + \Pi [\Pi (f \overline{\Pi u})\partial _x \overline{\Pi u} - \overline{\Pi u}\Pi (f \partial _x \overline{\Pi u})],\\ \mathcal {I}_2 :=&\Pi [\partial _x u \Pi (uf) + u \Pi (f \partial _x u)] = \Pi (f \overline{\Pi u}) \partial _x \Pi u+ \Pi u \Pi (f \partial _x \overline{\Pi u}) + \Pi (\overline{\Pi u} \Pi (f \partial _x \overline{\Pi u})) \\&\qquad + 2 f \Pi u \partial _x \Pi u + \Pi [f \Pi u \partial _x \overline{\Pi u} + f \overline{\Pi u} \partial _x \Pi u + \Pi (f \overline{\Pi u}) \partial _x \overline{\Pi u}]. \end{aligned} \end{aligned}$$

Since \(\partial _x \overline{\Pi u} \in L^2_-\), we have \(\Pi [ \Pi (f \overline{\Pi u}) \partial _x \overline{\Pi u}] = \Pi [ f \overline{\Pi u} \partial _x \overline{\Pi u}]\). Thus,

$$\begin{aligned} \mathcal {I}_1 + \mathcal {I}_2 = 2 f \Pi u \partial _x \Pi u + 2\Pi [f \Pi u \partial _x \overline{\Pi u} + f \overline{\Pi u} \partial _x \Pi u + \Pi (f \overline{\Pi u})\partial _x \overline{\Pi u} ] = \Pi [f\partial _x (u^2)] \in H^1_+.\nonumber \\ \end{aligned}$$
(2.11)

Formulas (2.10) and (2.11) yield that \([B_u, L_u]f = \Pi [f(\partial _x( u^2) - \mathrm {H} \partial _x^2 u)]\). Thus equation (2.9) holds along the evolution of equation (1.1). \(\quad \square \)

Remark 2.11

As indicated in Gérard–Kappeler [8], there are many choices of the operator \(B_u\). We can replace \(B_u\) by any operator of the form \(B_u + P_u\) such that \(P_u\) is a skew-adjoint operator commuting with \(L_u\). For instance, we set \(C_u:=B_u+ i L_u^2\) and we obtain \(C_u = i\mathrm {D}^2 - 2i \mathrm {D} T_u +2i T_{\mathrm {D} \Pi u}\). So \((L_u, C_u)\) is also a Lax pair of the BO equation (1.1). The advantage of the operator \(B_u = i(T_{|\mathrm {D} |u}- T_u^2)\) is that \(B_u : L^2_+ \rightarrow L^2_+\) is bounded if u is sufficiently regular, for instance if \(u \in H^2({\mathbb {R}}, {\mathbb {R}})\).
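As a check of the formula for \(C_u\), here is a short sketch of the computation, assuming \(u \in H^2({\mathbb {R}}, {\mathbb {R}})\) and applying both sides to functions in \(H^2_+\). It uses only \(L_u = \mathrm {D} - T_u\), the Leibniz rule \([\mathrm {D}, T_u] = T_{\mathrm {D} u}\), i.e. \(T_u \mathrm {D} = \mathrm {D} T_u - T_{\mathrm {D}u}\), and the identities \(u = \Pi u + \overline{\Pi u}\), \(|\mathrm {D}| u = \mathrm {D} \Pi u - \mathrm {D} \overline{\Pi u}\):

$$\begin{aligned} \begin{aligned} C_u&= i(T_{|\mathrm {D}|u} - T_u^2) + i(\mathrm {D}^2 - \mathrm {D} T_u - T_u \mathrm {D} + T_u^2) = i \mathrm {D}^2 - i \mathrm {D} T_u - i T_u \mathrm {D} + i T_{|\mathrm {D}|u} \\&= i \mathrm {D}^2 - 2i \mathrm {D} T_u + i T_{\mathrm {D}u + |\mathrm {D}|u} = i \mathrm {D}^2 - 2i \mathrm {D} T_u + 2i T_{\mathrm {D} \Pi u}. \end{aligned} \end{aligned}$$

The last equality holds because \(\mathrm {D}u + |\mathrm {D}|u = (\mathrm {D}\Pi u + \mathrm {D}\overline{\Pi u}) + (\mathrm {D}\Pi u - \mathrm {D}\overline{\Pi u}) = 2\mathrm {D}\Pi u\).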

Let \(U : t \mapsto U(t) \in \mathfrak {B}(L^2_+):= \mathfrak {B}(L^2_+, L^2_+)\) denote the unique solution of the following equation

$$\begin{aligned} U'(t)=B_{u(t)} U(t), \qquad U(0)=\mathrm {Id}_{L^2_+}, \end{aligned}$$
(2.12)

where \(u : t\in {\mathbb {R}} \mapsto u(t)\in H^2({\mathbb {R}}, {\mathbb {R}})\) denotes the unique solution of equation (1.1). The system (2.12) is globally well-posed in \(\mathfrak {B}(L^2_+)\), thanks to Proposition 2.9 and the following estimate

$$\begin{aligned} \Vert B_u (h)\Vert _{L^2} \lesssim (\Vert u\Vert _{H^2}+\Vert u\Vert _{H^1}^2)\Vert h\Vert _{L^2}, \qquad \forall h \in L^2_+, \quad \forall u\in H^2({\mathbb {R}}, {\mathbb {R}}). \end{aligned}$$

Since \(B_u^*=-B_u\), the operator U(t) is unitary for every \(t \in {\mathbb {R}}\). Thus, the Lax pair formulation (2.9) of the BO equation (1.1) is equivalent to \(L_{u(t)} = U(t) L_{u(0)} U(t)^* \in \mathfrak {B}(H^1_+, L^2_+)\). On the one hand, the spectrum of \(L_u\) is invariant under the BO flow. On the other hand, there exists a sequence of conservation laws controlling every Sobolev norm \(H^{\frac{n}{2}}({\mathbb {R}})\), \(n\ge 0\). Furthermore, the Lax operator in the Lax pair formulation is not unique. If \(f \in L^{\infty }({\mathbb {R}})\) and p is a polynomial with complex coefficients, then we have \( f(L_{u(t)}) = U(t) f(L_{u(0)}) U(t)^* \in \mathfrak {B}(L^2_+)\) and \(p(L_{u(t)}) = U(t) p(L_{u(0)}) U(t)^* \in \mathfrak {B}(H^{N}_+, L^2_+)\), where N is the degree of the polynomial p.
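The link between the Lax equation \(\partial _t L_u = [B_u, L_u]\) and the conjugation formula can be sketched as follows. The computation is formal: we assume \(u(t) \in H^2({\mathbb {R}}, {\mathbb {R}})\) and do not discuss domains. Since \(U' = B_u U\) and \(B_u^* = -B_u\), we have \((U^*)' = -U^* B_u\), hence

$$\begin{aligned} \frac{\mathrm {d}}{\mathrm {d}t}\left( U(t)^* L_{u(t)} U(t) \right) = U(t)^* \left( -B_{u(t)} L_{u(t)} + \partial _t L_{u(t)} + L_{u(t)} B_{u(t)} \right) U(t) = U(t)^* \left( \partial _t L_{u(t)} - [B_{u(t)}, L_{u(t)}] \right) U(t). \end{aligned}$$

The right-hand side vanishes for every t exactly when \(\partial _t L_u = [B_u, L_u]\), in which case \(U(t)^* L_{u(t)} U(t) = L_{u(0)}\).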

Proposition 2.12

Given \(n \in {\mathbb {N}}\), let \(u: t \in {\mathbb {R}} \mapsto u(t) \in H^{\frac{n}{2}}({\mathbb {R}}, {\mathbb {R}})\) solve equation (1.1) and set

$$\begin{aligned} E_n (u) := \langle L_u^n \Pi u, \Pi u \rangle _{H^{-\frac{n}{2}}, H^{\frac{n}{2}}}. \end{aligned}$$
(2.13)

Then \(E_n (u(t)) = E_n (u(0))\), for every \(t \in {\mathbb {R}}\). In particular, \(E_1 = E\) on \(H^{\frac{1}{2}}({\mathbb {R}},{\mathbb {R}})\), where the energy functional E is given by (1.3).

In order to prove Proposition 2.12, we need the following result.

Proposition 2.13

If \(u : t\in {\mathbb {R}} \mapsto u(t) \in H^2({\mathbb {R}}, {\mathbb {R}})\) solves the BO equation (1.1), then we have

$$\begin{aligned} \partial _t \Pi u(t) = B_{u(t)}(\Pi u(t)) + i L_{u(t)}^2(\Pi u(t)) \in L^2_+. \end{aligned}$$
(2.14)

Proof

For every \(u\in H^2({\mathbb {R}}, {\mathbb {R}})\), \(B_u\) is a bounded operator on both \(L^2_+\) and \(H^1_+\), \(\Pi u \in \mathbf {D}(L_u)=H^1_+\). We have \({\hat{u}}(-\xi ) = \overline{{\hat{u}}(\xi )}\), \(u = \Pi u + \overline{\Pi u}\) and \(|\mathrm {D}| u = \mathrm {D} \Pi u -\mathrm {D}\overline{\Pi u}\). Since \(\mathrm {D} \overline{\Pi u} \in L^2_-\), we have \(\Pi (\Pi u \mathrm {D} \overline{\Pi u}) = \Pi (u \mathrm {D} \overline{\Pi u})\). Thus the following two formulas hold,

$$\begin{aligned} \begin{aligned} B_u(\Pi u)&= i(T_{|\mathrm {D}|u} - T_u^2)(\Pi u) = i (\Pi u )(\mathrm {D} \Pi u) - i \Pi (u \mathrm {D}\overline{\Pi u}) - i T_u^2(\Pi u)\\&= \Pi u \partial _x \Pi u - \Pi (u \partial _x \overline{\Pi u}) - i T_u^2(\Pi u),\\ i L_u^2(\Pi u)&= i \mathrm {D}^2 \Pi u - i T_u(\mathrm {D} \Pi u) -i \mathrm {D} \circ T_u(\Pi u) + i T_u^2(\Pi u) \\&= - i \partial _x^2 \Pi u - T_u(\partial _x \Pi u) - \partial _x [T_u(\Pi u)] + i T_u^2(\Pi u). \end{aligned} \end{aligned}$$

Then \(B_u(\Pi u)+i L_u^2(\Pi u) = - i \partial _x^2 \Pi u-2 \Pi [\Pi u \partial _x \Pi u + \Pi u\partial _x \overline{\Pi u} + \overline{\Pi u} \partial _x \Pi u]\). Finally we replace u by u(t), where \(u:t \in {\mathbb {R}} \mapsto u(t) \in H^2({\mathbb {R}}, {\mathbb {R}})\) solves equation (1.1) to obtain (2.14). \(\quad \square \)
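To see why the right-hand side of the identity for \(B_u(\Pi u)+i L_u^2(\Pi u)\) coincides with \(\partial _t \Pi u\), one can apply \(\Pi \) to equation (1.1). The following sketch uses \(\Pi \mathrm {H} = -i \Pi \), which follows from (1.2), and the fact that \(\Pi (\overline{\Pi u} \partial _x \overline{\Pi u}) = 0\), because this product has no positive frequencies:

$$\begin{aligned} \partial _t \Pi u = \Pi (\mathrm {H} \partial _x^2 u) - 2\Pi (u \partial _x u) = - i \partial _x^2 \Pi u - 2 \Pi [\Pi u \partial _x \Pi u + \Pi u \partial _x \overline{\Pi u} + \overline{\Pi u} \partial _x \Pi u]. \end{aligned}$$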

Proof of Proposition 2.12

It suffices to prove the statement in the case \(u_0 \in H^{\infty }({\mathbb {R}}, {\mathbb {R}})\); the general case then follows from a density argument and the continuity of the flow map \(u_0 \in H^s({\mathbb {R}}) \mapsto u \in C([-T,T]; H^s({\mathbb {R}}))\) in Proposition 2.8, for all \(T>0\) and \(s \ge 0\). We choose \(u=u(t) \in H^{\infty }({\mathbb {R}},{\mathbb {R}}) = \bigcap _{s \ge 0} H^{s}({\mathbb {R}},{\mathbb {R}})\), so \(L_u^n \Pi u\), \(\partial _t \Pi u\) and \( \partial _t(L_u^n )\Pi u =[B_u, L_u^n ]\Pi u\) are in \(H^{\infty }({\mathbb {R}}, {\mathbb {C}})\). Thus \(\partial _t E_n(u) = 2 \mathrm {Re} \langle L_u^n \Pi u, \partial _t \Pi u \rangle _{L^2} + \langle \partial _t (L_u^n) \Pi u, \Pi u \rangle _{L^2}\). Since \(B_u + i L_u^2\) is skew-adjoint, we have \(2 \mathrm {Re} \langle L_u^n \Pi u, \partial _t \Pi u \rangle _{L^2} = \langle [ L_u^n , B_u + i L_u^2 ] \Pi u, \Pi u \rangle _{L^2} = \langle [ L_u^n , B_u] \Pi u, \Pi u \rangle _{L^2}\) by (2.14). Since \(( L_u^n , B_u)\) is also a Lax pair of (1.1), we have \(\partial _t E_n(u) = \langle ( [ L_u^n , B_u ]+\partial _t (L_u^n) )\Pi u, \Pi u \rangle _{L^2} = 0\). In the case \(n=1\), we assume that \(u\in H^1({\mathbb {R}}, {\mathbb {R}})\). Since \(u=\Pi u + \overline{\Pi u}\), \(|\mathrm {D}|u=\mathrm {D}\Pi u - \mathrm {D} \overline{\Pi u}\) and \(\int _{{\mathbb {R}}}(\Pi u)^3 = 0\), we have \(\langle |\mathrm {D}| u, u\rangle _{L^2} = 2 \langle \mathrm {D}\Pi u, \Pi u \rangle _{L^2}\) and \(\int _{{\mathbb {R}}}u^3 = 3 \int _{{\mathbb {R}}}( \Pi u + \overline{\Pi u} )|\Pi u|^2= 3 \int _{{\mathbb {R}}}u |\Pi u|^2\). \(\quad \square \)
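For the reader's convenience, the two identities at the end of the proof combine into \(E_1 = E\) as follows. This is a sketch for \(u \in H^1({\mathbb {R}}, {\mathbb {R}})\), using \(L_u = \mathrm {D} - T_u\) and \(\langle T_u \Pi u, \Pi u \rangle _{L^2} = \langle u \Pi u, \Pi u \rangle _{L^2} = \int _{{\mathbb {R}}} u |\Pi u|^2\):

$$\begin{aligned} E_1(u) = \langle \mathrm {D} \Pi u, \Pi u \rangle _{L^2} - \langle T_u \Pi u, \Pi u \rangle _{L^2} = \frac{1}{2} \langle |\mathrm {D}| u, u \rangle _{L^2} - \int _{{\mathbb {R}}} u |\Pi u|^2 = \frac{1}{2} \langle |\mathrm {D}| u, u \rangle _{L^2} - \frac{1}{3} \int _{{\mathbb {R}}} u^3 = E(u). \end{aligned}$$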

2.3 The generating functional

We introduce a new conservation law of the BO equation (1.1) that encodes the entire BO hierarchy.

Definition 2.14

Given \(u \in L^2({\mathbb {R}}, {\mathbb {R}}), \lambda \in {\mathbb {C}} \backslash \sigma (-L_u)\), the generating functional of equation (1.1) is defined by \(\mathcal {H}_{\lambda }(u)= \langle (L_u +\lambda )^{-1} \Pi u , \Pi u \rangle _{L^2}\). The subset \(\mathcal {X}:=\{(\lambda , u)\in {\mathbb {R}}\times L^2({\mathbb {R}}, {\mathbb {R}}) : 4C^4\lambda > \Vert u\Vert _{L^2}^2\} \) is open in \({\mathbb {R}}\times L^2({\mathbb {R}}, {\mathbb {R}})\), where the Sobolev constant is given by \(C=\inf _{f \in H^1_+ \backslash \{0\}}\tfrac{\Vert |\mathrm {D}|^{\frac{1}{4}}f\Vert _{L^2}}{\Vert f\Vert _{L^4}}\).

Since \(\sigma (L_u)\subset [-\tfrac{\Vert u\Vert _{L^2}^2}{4 C^4},+\infty )\), the map \((\lambda ,u)\in \mathcal {X} \mapsto \mathcal {H}_{\lambda }(u)= \langle (L_u +\lambda )^{-1} \Pi u , \Pi u \rangle _{L^2} \in {\mathbb {R}}\) is real analytic.

Proposition 2.15

Let \(u : t \in {\mathbb {R}} \mapsto u(t) \in H^{\infty }({\mathbb {R}}, {\mathbb {R}})\) denote the solution of the BO equation (1.1) and let \(\lambda \in {\mathbb {C}} \backslash \sigma (-L_{u(0)})\). Then \(\mathcal {H}_{\lambda }(u(t))= \mathcal {H}_{\lambda }(u(0))\), for every \(t \in {\mathbb {R}}\).

Proof

Let \(u : t \in {\mathbb {R}} \mapsto u(t) \in H^{\infty }({\mathbb {R}}, {\mathbb {R}})\) solve equation (1.1). Since \(\sigma (-L_{u(t)})=\sigma (-L_{u(0)})\) by Proposition 2.1, the operator \(\lambda + L_{u(t)} \in \mathfrak {B}(H^1_+, L^2_+)\) is invertible and we have

$$\begin{aligned} \partial _t \mathcal {H}_{\lambda } (u) = 2 \mathrm {Re} \langle (L_u +\lambda )^{-1} \Pi u, \partial _t \Pi u \rangle _{L^2} - \langle (L_u +\lambda )^{-1} \partial _t L_u (L_u +\lambda )^{-1} \Pi u, \Pi u \rangle _{L^2}.\nonumber \\ \end{aligned}$$
(2.15)

Formula (2.14) yields that

$$\begin{aligned}&2 \mathrm {Re} \langle (L_u +\lambda )^{-1} \Pi u, \partial _t \Pi u \rangle _{L^2} \\&\quad = \langle [(L_u +\lambda )^{-1}, B_u + i L_u^2 ] \Pi u, \Pi u \rangle _{L^2} = \langle [(L_u +\lambda )^{-1}, B_u] \Pi u, \Pi u \rangle _{L^2}, \\&\langle [(L_u +\lambda )^{-1}, B_u] \Pi u, \Pi u \rangle _{L^2} \\&\quad = \langle (L_u +\lambda )^{-1} [B_u, L_u +\lambda ] (L_u +\lambda )^{-1} \Pi u, \Pi u \rangle _{L^2}. \end{aligned}$$

Then \(\partial _t L_u = [B_u, L_u]\) yields that \(\partial _t \mathcal {H}_{\lambda }(u(t))=0\). \(\quad \square \)
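The cancellation in the last step of this proof can be written out explicitly. In the sketch below, \(R := (L_u + \lambda )^{-1}\) is a shorthand introduced here only for readability. First, the resolvent commutator identity reads

$$\begin{aligned} {[}R, B_u] = R B_u - B_u R = R \left( B_u (L_u + \lambda ) - (L_u + \lambda ) B_u \right) R = R [B_u, L_u + \lambda ] R = R [B_u, L_u] R. \end{aligned}$$

Inserting this into (2.15) gives

$$\begin{aligned} \partial _t \mathcal {H}_{\lambda }(u) = \langle R [B_u, L_u] R \Pi u, \Pi u \rangle _{L^2} - \langle R (\partial _t L_u) R \Pi u, \Pi u \rangle _{L^2} = \langle R \left( [B_u, L_u] - \partial _t L_u \right) R \Pi u, \Pi u \rangle _{L^2} = 0. \end{aligned}$$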

Given \((\lambda , u)\in \mathcal {X}\), there exists a neighbourhood of u in \(L^2({\mathbb {R}},{\mathbb {R}})\), denoted by \(\mathcal {V}_u\) such that the restriction \(\mathcal {H}_{\lambda } : v \in \mathcal {V}_u \mapsto \mathcal {H}_{\lambda }(v) \in {\mathbb {R}}\) can be expressed by power series. Then the Fréchet derivative of \(\mathcal {H}_{\lambda }\) at u is given by \(\mathrm {d} \mathcal {H}_{\lambda }(u)(h) = \langle w_{\lambda } , \Pi h \rangle _{L^2} + \overline{\langle w_{\lambda } , \Pi h \rangle }_{L^2} + \langle T_h w_{\lambda } , w_{\lambda } \rangle _{L^2} = \langle h, w_{\lambda } + \overline{w}_{\lambda } + |w_{\lambda } |^2 \rangle _{L^2}\), \(\forall h \in L^2({\mathbb {R}}, {\mathbb {R}})\), where \(w_{\lambda } \in H^1_+\) is defined by \(w_{\lambda }\equiv w_{\lambda }(u) \equiv w_{\lambda }(x, u)= [(L_u +\lambda )^{-1} \circ \Pi ] u(x)\), \(\forall x \in {\mathbb {R}}\). So

$$\begin{aligned} \nabla _u \mathcal {H}_{\lambda }(u)=|w_{\lambda }(u) |^2 +w_{\lambda }(u) + \overline{w}_{\lambda }(u). \end{aligned}$$
(2.16)
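The formula for \(\mathrm {d} \mathcal {H}_{\lambda }(u)(h)\) can be recovered by the following sketch, in which \(R := (L_u + \lambda )^{-1}\) is a shorthand introduced here. We assume that \(\lambda \) is real and that \(L_u\) is self-adjoint, so that \(R^* = R\). Since \(u \mapsto L_u = \mathrm {D} - T_u\) is affine, the derivative of \(u \mapsto (L_u + \lambda )^{-1}\) in the direction h is \(R T_h R\). With \(w_{\lambda } = R \Pi u\), this gives

$$\begin{aligned} \begin{aligned} \mathrm {d} \mathcal {H}_{\lambda }(u)(h)&= \langle R \Pi h, \Pi u \rangle _{L^2} + \langle R \Pi u, \Pi h \rangle _{L^2} + \langle R T_h R \Pi u, \Pi u \rangle _{L^2} \\&= \overline{\langle w_{\lambda }, \Pi h \rangle }_{L^2} + \langle w_{\lambda }, \Pi h \rangle _{L^2} + \langle T_h w_{\lambda }, w_{\lambda } \rangle _{L^2}. \end{aligned} \end{aligned}$$

For real-valued h, we have \(\langle w_{\lambda }, \Pi h \rangle _{L^2} = \int _{{\mathbb {R}}} h w_{\lambda } = \langle h, \overline{w}_{\lambda } \rangle _{L^2}\) and \(\langle T_h w_{\lambda }, w_{\lambda } \rangle _{L^2} = \int _{{\mathbb {R}}} h |w_{\lambda }|^2\), which is consistent with (2.16).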

Given \((\lambda , u_0) \in \mathcal {X}\) fixed, we consider the following equation

$$\begin{aligned} \partial _t u = \partial _x \nabla _u \mathcal {H}_{\lambda }(u) = \partial _x \left( |w_{\lambda }(u) |^2 +w_{\lambda }(u) + \overline{w}_{\lambda }(u)\right) , \qquad u(0)=u_0. \end{aligned}$$
(2.17)

There exists an open subset \(\mathcal {V}_{u_0}\) of \(L^2({\mathbb {R}}, {\mathbb {R}})\) such that \(v \in \mathcal {V}_{u_0} \mapsto \partial _x \left( |w_{\lambda }(v) |^2 +w_{\lambda }(v) + \overline{w}_{\lambda }(v)\right) \in L^2({\mathbb {R}}, {\mathbb {R}})\) is real analytic and \(u_0 \in \mathcal {V}_{u_0}\). Hence equation (2.17) admits a unique local \(L^2({\mathbb {R}}, {\mathbb {R}})\)-solution by the Cauchy–Lipschitz theorem.

Remark 2.16

In Sect. 4, we show that \(u \in \mathcal {U}_N \mapsto \partial _x \nabla _u f(u) \in \mathcal {T}_u(\mathcal {U}_N)\) is exactly the Hamiltonian vector field of the smooth function \(f : \mathcal {U}_N \rightarrow {\mathbb {R}}\) with respect to the symplectic form \(\omega \) on the N-soliton manifold \(\mathcal {U}_N\) defined in (1.11).

Proposition 2.17

Given \((\lambda , u_0) \in \mathcal {X}\) fixed, there exists \(\varepsilon >0\) such that \((\lambda , u(t)) \in \mathcal {X}\), for every \(t \in (-\varepsilon , \varepsilon )\), where \(u : t \in (-\varepsilon , +\varepsilon ) \mapsto u(t) \in L^2({\mathbb {R}}, {\mathbb {R}})\) solves (2.17) with initial datum \(u(0)=u_0\). Then

$$\begin{aligned} \partial _t L_{u(t)}= [B_{u(t) }^{\lambda }, L_{u(t)}] , \quad \mathrm {where} \quad B_{v }^{\lambda }:=i(T_{w_{\lambda }(v)}T_{\overline{w}_{\lambda }(v)} +T_{w_{\lambda }(v)}+ T_{\overline{w}_{\lambda }(v)} ), \quad \mathrm {if} \quad (\lambda , v)\in \mathcal {X}.\nonumber \\ \end{aligned}$$
(2.18)

Remark 2.18

For every \(u \in H^{\infty }({\mathbb {R}}, {\mathbb {R}})\) and \(\epsilon \in (0, \frac{4 C^4}{\Vert u\Vert _{L^2}^2})\), we set \(\tilde{\mathcal {H}}_{\epsilon }(u):=\frac{1}{\epsilon }\mathcal {H}_{\frac{1}{\epsilon }}(u)\) and \({\tilde{B}}_{\epsilon ,u} :=\frac{1}{\epsilon }B_u^{\frac{1}{\epsilon }}\). Recalling that \(E_n(u)=\langle L_u^n \Pi u, \Pi u\rangle _{L^2}\), we have the following Taylor expansion

$$\begin{aligned} \tilde{\mathcal {H}}_{\epsilon }(u) =\sum _{n=0}^{K}(- \epsilon )^n E_n(u) - (-\epsilon )^K \langle (L_u + \tfrac{1}{\epsilon })^{-1} \Pi u, L_u^{K+1} \Pi u\rangle _{L^2}, \quad \forall K \in {\mathbb {N}}.\qquad \end{aligned}$$
(2.19)

Then Proposition 2.17 leads to a Lax pair formulation for the equations corresponding to the conservation laws in the BO hierarchy, \(\partial _t L_u = [\frac{\mathrm {d}^n}{\mathrm {d}\epsilon ^n}\Big |_{\epsilon =0}{\tilde{B}}_{\epsilon ,u}, L_u]\), where now u evolves according to the Hamiltonian flow of \(E_n = (-1)^n\frac{\mathrm {d}^n}{\mathrm {d}\epsilon ^n}\big |_{\epsilon =0}\tilde{\mathcal {H}}_{\epsilon }\) with respect to the Gardner–Faddeev–Zakharov Poisson structure. In the case \(n=1\), we have \(E_1=E\) and \(B_u = \frac{\mathrm {d} }{\mathrm {d}\epsilon }\big |_{\epsilon =0}{\tilde{B}}_{\epsilon ,u}\).

Before proving Proposition 2.17, we introduce the Hankel operators of symbols in \(L^2({\mathbb {R}}) \bigcup L^{\infty }({\mathbb {R}})\). They are used to calculate the commutators of Toeplitz operators. We notice that the Hankel operators are \({\mathbb {C}}\)-anti-linear and the Toeplitz operators are \({\mathbb {C}}\)-linear. For every symbol \(v\in L^2({\mathbb {R}}) \bigcup L^{\infty }({\mathbb {R}})\), its associated Hankel operator is defined by \(H_v(h)=T_{\overline{h}}v = \Pi (v \overline{h})\), \(\forall h\in H^1_+\). If \(v \in L^{\infty }({\mathbb {R}})\), then \(H_v : L^2_+ \rightarrow L^2_+\) is a bounded operator. If \(v \in L^2({\mathbb {R}})\), then \(H_v\) may be an unbounded operator on \(L^2_+\) whose domain of definition contains \(H^1_+\). For any \(b\in H^1({\mathbb {R}}), h\in H^1_+\), we have \(\Vert T_b (h)\Vert _{H^1} + \Vert H_b (h)\Vert _{H^1} \lesssim \Vert b\Vert _{H^1} \Vert h\Vert _{H^1}\), so both \(T_b\) and \(H_b\) are bounded on \(L^2_+\) and on \(H^1_+\).

Lemma 2.19

For every \(v, w \in L^2_+\bigcap L^{\infty }({\mathbb {R}})\) and \(u \in L^2({\mathbb {R}})\), we have

$$\begin{aligned}{}[T_v, T_{\overline{w}}] =-H_v \circ H_w \in \mathfrak {B}(L^2_+). \end{aligned}$$
(2.20)

If \(w \in H^1_+\) in addition, then we have \(T_u(w) \in L^2_+\) and

$$\begin{aligned} H_{T_{u}w}= T_{w} \circ H_{\Pi u} + H_w \circ T_{\overline{u}} = T_u \circ H_w + H_{\Pi u} \circ T_{\overline{w}}\in \mathfrak {B}(H^1_+, L^2_+). \end{aligned}$$
(2.21)

Proof

For every \(v, w \in L^2_+\bigcap L^{\infty }({\mathbb {R}})\) and \(h\in L^2_+\), we have \(\overline{w}h = \Pi (\overline{w}h) + \overline{\Pi (w\overline{h} )} \in L^2({\mathbb {R}})\). Thus, we have \([T_v, T_{\overline{w}}]h=\Pi (v\Pi (\overline{w}h) - \overline{w}\Pi (vh) ) = \Pi ( v\overline{w}h - v\overline{\Pi (w\overline{h})}- v\overline{w} h ) =-\Pi ( v\overline{\Pi (w\overline{h})}) = - H_v \circ H_w (h) \in L^2_+\). Given \(u \in L^2({\mathbb {R}})\) and \(w \in H^1_+\), for every \(h\in H^1_+\), we have \(w \overline{h} =\Pi (w\overline{h} ) + \overline{\Pi ( \overline{w}h )} \in H^1({\mathbb {R}})\) and \(H_{w}(h), T_{\overline{w}} (h) \in H^1_+\). So \(\Pi (u\overline{\Pi ( \overline{w} h)}) =\Pi (\overline{\Pi ( \overline{w} h)} \Pi u)=H_{\Pi u} \circ T_{\overline{w}} (h) \in L^2_+\) and we have

$$\begin{aligned} H_{T_{u}w}(h)= & {} \Pi (\Pi (uw)\overline{h}) = \Pi ( uw \overline{h}) = \Pi ( u \Pi (w \overline{h}) + u\overline{\Pi ( \overline{w} h)}) \\= & {} (T_u \circ H_w + H_{\Pi u} \circ T_{\overline{w}} ) (h)\in L^2_+. \end{aligned}$$

Similarly, we have \(u \overline{h} =\Pi (u\overline{h} ) + \overline{\Pi ( \overline{u}h )} \in L^2({\mathbb {R}})\) and \(\Pi (u \overline{h}) = \Pi (\overline{h}\Pi u) = H_{\Pi u}(h) \in L^2_+\). Thus, we have \(H_{T_{u}w}(h) = \Pi ( w u\overline{h}) = \Pi ( w \Pi (u \overline{h}) + w\overline{\Pi ( \overline{u} h)}) = (T_w \circ H_{\Pi u} + H_{w} \circ T_{\overline{u}} ) (h)\in L^2_+\). \(\quad \square \)

Lemma 2.20

Given \((\lambda , u) \in \mathcal {X}\) as in Definition 2.14, set \(w_{\lambda }(u)= \left( L_u+\lambda \right) ^{-1} \circ \Pi (u) \in H^1_+\), then

$$\begin{aligned}{}[\mathrm {D} - T_u, T_{w_\lambda (u)} T_{\overline{w}_\lambda (u)} + T_{w_\lambda (u)} + T_{\overline{w}_\lambda (u)}] = T_{\mathrm {D} [|w_{\lambda }(u)|^2 + w_{\lambda }(u) + \overline{w}_{\lambda }(u)]}\in \mathfrak {B}(H^1_+ , L^2_+).\nonumber \\ \end{aligned}$$
(2.22)

Proof

We use the abbreviation \(w_{\lambda }:=w_{\lambda }(u) \in H^1_+\), so that \(\overline{w}_{\lambda } \in H^1_-\). If \(f^+, g^+ \in H^1_+\) and \(f^-, g^- \in H^1_-\), then \([T_{f^+}, T_{g^+}]=[T_{f^-}, T_{g^-}]=0\), because for every \(h \in L^2_+\), we have \(T_{f^+}[T_{g^+}(h)] = f^+ g^+ h = T_{g^+}[T_{f^+}(h)]\) and \(T_{f^-}[T_{g^-}(h)] =\Pi (f^- \Pi (g^- h )) = \Pi (f^- g^- h ) =\Pi (g^- \Pi (f^- h )) = T_{g^-}[T_{f^-}(h)]\). Since \(\Pi u \in L^2_+\) and \(\overline{\Pi u} \in L^2_-\), we use Leibniz’s rule and formula (2.20) to obtain that

$$\begin{aligned}{}[\mathrm {D} - T_u, T_{w_\lambda } + T_{\overline{w}_\lambda }]= & {} T_{\mathrm {D} w_\lambda } + T_{\mathrm {D} \overline{w}_\lambda } - [T_u, T_{w_\lambda }] - [T_u, T_{\overline{w}_\lambda }]\\= & {} T_{\mathrm {D} w_\lambda } + T_{\mathrm {D} \overline{w}_\lambda } - [T_{\overline{\Pi u}}, T_{w_\lambda }] - [T_{\Pi u}, T_{\overline{w}_\lambda }]\nonumber \\= & {} T_{\mathrm {D} w_\lambda } + T_{\mathrm {D} \overline{w}_\lambda } - H_{w_{\lambda }}H_{\Pi u} + H_{\Pi u}H_{w_{\lambda }} .\nonumber \end{aligned}$$
(2.23)

Similarly, formula (2.20) implies that

$$\begin{aligned}{}[T_u, T_{w_\lambda } T_{\overline{w}_\lambda }]= & {} [T_u, T_{w_\lambda }] T_{\overline{w}_\lambda } + T_{w_\lambda } [T_u, T_{\overline{w}_\lambda }]\nonumber \\= & {} [T_{\overline{\Pi u}}, T_{w_\lambda }] T_{\overline{w}_\lambda } + T_{w_\lambda } [T_{\Pi u}, T_{\overline{w}_\lambda }] = H_{w_\lambda } H_{\Pi u} T_{\overline{w}_\lambda } - T_{w_\lambda } H_{\Pi u} H_{w_\lambda }.\nonumber \\ \end{aligned}$$
(2.24)

For every \(h \in H^1_+\), since \(\overline{w}_\lambda , \mathrm {D} \overline{w}_\lambda \in L^2_-\), we have

$$\begin{aligned}{}[\mathrm {D}, T_{\overline{w}_\lambda } T_{w_\lambda }]h= & {} [\mathrm {D}, T_{\overline{w}_\lambda }] T_{w_\lambda }h +T_{\overline{w}_\lambda }[\mathrm {D}, T_{w_\lambda }] h \\= & {} T_{\mathrm {D} \overline{w}_\lambda } (T_{w_\lambda }h ) +T_{\overline{w}_\lambda } (T_{\mathrm {D} w_\lambda }h) \\= & {} \Pi [\mathrm {D} \overline{w}_\lambda \Pi (w_\lambda h) + \overline{w}_\lambda \Pi (\mathrm {D} w_\lambda h) ] = \Pi [ (w_\lambda \mathrm {D} \overline{w}_\lambda +\overline{w}_\lambda \mathrm {D} w_\lambda )h ] \in L^2_+. \end{aligned}$$

So \([\mathrm {D}, T_{\overline{w}_\lambda } T_{w_\lambda }] = T_{\mathrm {D} |w _\lambda |^2 } \in \mathfrak {B}(H^1_+, L^2_+)\). We use formula (2.20) and Leibniz’s rule to obtain that

$$\begin{aligned}{}[\mathrm {D}, T_{w_\lambda }T_{\overline{w}_\lambda }] = [\mathrm {D}, T_{\overline{w}_\lambda } T_{w_\lambda }] - [\mathrm {D}, H_{w_{\lambda }}^2] = T_{\mathrm {D} |w_\lambda |^2 } - H_{\mathrm {D} {w_\lambda }} H_{w_\lambda } +H_{w_\lambda } H_{\mathrm {D} {w_\lambda }}\qquad \end{aligned}$$
(2.25)
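The commutator \([\mathrm {D}, H_{w_{\lambda }}^2]\) appearing in (2.25) can be computed as in the following sketch, for h sufficiently regular. The Hankel operator is anti-linear and \(\mathrm {D} \overline{h} = - \overline{\mathrm {D} h}\), so the Leibniz rule takes the form

$$\begin{aligned} \mathrm {D} H_{w_{\lambda }}(h) = \Pi (\mathrm {D} w_{\lambda } \overline{h}) + \Pi (w_{\lambda } \mathrm {D} \overline{h}) = H_{\mathrm {D} w_{\lambda }}(h) - H_{w_{\lambda }}(\mathrm {D} h), \qquad \mathrm {i.e.} \qquad \mathrm {D} H_{w_{\lambda }} = H_{\mathrm {D} w_{\lambda }} - H_{w_{\lambda }} \mathrm {D}. \end{aligned}$$

Applying this relation twice yields

$$\begin{aligned} \mathrm {D} H_{w_{\lambda }}^2 = H_{\mathrm {D} w_{\lambda }} H_{w_{\lambda }} - H_{w_{\lambda }} \mathrm {D} H_{w_{\lambda }} = H_{\mathrm {D} w_{\lambda }} H_{w_{\lambda }} - H_{w_{\lambda }} H_{\mathrm {D} w_{\lambda }} + H_{w_{\lambda }}^2 \mathrm {D}, \end{aligned}$$

so \([\mathrm {D}, H_{w_{\lambda }}^2] = H_{\mathrm {D} w_{\lambda }} H_{w_{\lambda }} - H_{w_{\lambda }} H_{\mathrm {D} w_{\lambda }}\), while \(T_{w_\lambda }T_{\overline{w}_\lambda } = T_{\overline{w}_\lambda } T_{w_\lambda } - H_{w_{\lambda }}^2\) by (2.20).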

Recall that \(w_\lambda = (\lambda + L_u)^{-1} \Pi u\), then we have

$$\begin{aligned} \mathrm {D}w_{\lambda } = T_u (w_{\lambda }) -\lambda w_{\lambda } + \Pi u. \end{aligned}$$
(2.26)

The formulas (2.21) and (2.26) imply the following two identities,

$$\begin{aligned} \begin{aligned} H_{\mathrm {D} w_{\lambda }} - T_{w_{\lambda }} H_{\Pi u}&= H_{T_u w_{\lambda } } -\lambda H_{w_{\lambda }} + H_{\Pi u} - T_{w_{\lambda }} H_{\Pi u} = H_{w_{\lambda }} T_u -\lambda H_{w_{\lambda }} + H_{\Pi u},\\ H_{\mathrm {D} w_{\lambda }} -H_{\Pi u} T_{\overline{w}_{\lambda }}&=H_{T_u w_{\lambda } } -\lambda H_{w_{\lambda }} + H_{\Pi u} - H_{\Pi u} T_{\overline{w}_{\lambda }} =T_u H_{w_{\lambda }}-\lambda H_{w_{\lambda }} + H_{\Pi u}.\\ \end{aligned}\nonumber \\ \end{aligned}$$
(2.27)

We use formulas (2.24), (2.25) and (2.27) to get the following formula

$$\begin{aligned} \begin{aligned}&[\mathrm {D} - T_u , T_{w_\lambda }T_{\overline{w}_\lambda }] = T_{\mathrm {D} |w_\lambda |^2 } -( H_{\mathrm {D} {w_\lambda }}-T_{w_\lambda } H_{\Pi u}) H_{w_\lambda } +H_{w_{\lambda }} (H_{\mathrm {D} {w_\lambda }} - H_{\Pi u} T_{\overline{w}_\lambda })\\ =&T_{\mathrm {D} |w_\lambda |^2 } -(H_{w_{\lambda }} T_u H_{w_\lambda } -\lambda H_{w_{\lambda }}^2 + H_{\Pi u}H_{w_\lambda } ) + (H_{w_{\lambda }} T_u H_{w_{\lambda }}-\lambda H_{w_{\lambda }}^2 + H_{w_{\lambda }} H_{\Pi u})\\ =&T_{\mathrm {D} |w_\lambda |^2 } - H_{\Pi u}H_{w_\lambda } + H_{w_{\lambda }} H_{\Pi u}. \end{aligned}\nonumber \\ \end{aligned}$$
(2.28)

At last, we combine formulas (2.23) and (2.28) to obtain formula (2.22). \(\quad \square \)
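Explicitly, the Hankel terms in (2.23) and (2.28) cancel pairwise when the two formulas are added:

$$\begin{aligned} \begin{aligned}&[\mathrm {D} - T_u, T_{w_\lambda } T_{\overline{w}_\lambda } + T_{w_\lambda } + T_{\overline{w}_\lambda }] \\&\quad = \left( T_{\mathrm {D} |w_\lambda |^2 } - H_{\Pi u}H_{w_\lambda } + H_{w_{\lambda }} H_{\Pi u} \right) + \left( T_{\mathrm {D} w_\lambda } + T_{\mathrm {D} \overline{w}_\lambda } - H_{w_{\lambda }}H_{\Pi u} + H_{\Pi u}H_{w_{\lambda }} \right) \\&\quad = T_{\mathrm {D} [|w_{\lambda }|^2 + w_{\lambda } + \overline{w}_{\lambda }]}. \end{aligned} \end{aligned}$$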

End of the proof of Proposition 2.17

For every \(u : t \mapsto u(t)\in L^2({\mathbb {R}}, {\mathbb {R}})\) solving equation (2.17), we have \(\frac{\mathrm {d}}{\mathrm {d}t} (L\circ u)(t) = - T_{\partial _t u(t)} = -i T_{\mathrm {D}\left( w_{\lambda }(u(t)) \overline{w}_{\lambda }(u(t)) + w_{\lambda }(u(t)) + \overline{w}_{\lambda }(u(t)) \right) }\). Consequently, the Lax equation (2.18) is obtained by identity (2.22) in Lemma 2.20. \(\quad \square \)

3 The Action of the Shift Semigroup

In this section, we introduce the semigroup of shift operators \((S(\eta )^*)_{\eta \ge 0}\) acting on the Hardy space \(L^2_+\) and classify all finite-dimensional translation-invariant subspaces of \(L^2_+\). For every \(\eta \ge 0\), we define the operator \(S(\eta ): L^2_+ \rightarrow L^2_+\) such that \(S(\eta )f = e_{\eta }f\), where \(e_{\eta }(x)=e^{i\eta x}\). Then, its adjoint is given by \(S(\eta )^* = T_{e_{-\eta }}\). We have \(S(\eta )^* \circ L_u \circ S(\eta ) = L_u + \eta \mathrm {Id}_{L^2_+}\), \(\forall \eta \ge 0\). Since \(\Vert S(\eta )^*\Vert _{\mathfrak {B}(L^2_+)}=\Vert S(\eta )\Vert _{\mathfrak {B}(L^2_+)}=1\), \((S(\eta )^*)_{\eta \ge 0}\) is a contraction semigroup. Let \(-i G\) denote its infinitesimal generator, i.e. \(Gf = i \frac{\mathrm {d}}{\mathrm {d}\eta } \big |_{\eta =0^+} S(\eta )^* f \in L^2_+\), \(\forall f \in \mathbf {D}(G)\), where

$$\begin{aligned} \begin{aligned} \mathbf {D}(G):=&\{f \in L^2_+ : {\hat{f}}_{|{\mathbb {R}}_+} \in H^1(0, +\infty )\}, \end{aligned} \end{aligned}$$
(3.1)

because \(\lim _{\epsilon \rightarrow 0} \Vert \frac{\psi -\tau _{\epsilon }\psi }{\epsilon } - \partial _x \psi \Vert _{L^2(0, +\infty )} =0\), where \(\tau _{\epsilon }\psi (x)= \psi (x-\epsilon )\) and \( \psi \in H^1(0, +\infty )\). By Morrey’s inequality, every function \(f \in \mathbf {D}(G)\) has a bounded Hölder continuous Fourier transform on \((0, +\infty )\), and the Sobolev extension operator yields the existence of \({\hat{f}}(0^+):=\lim _{\xi \rightarrow 0^+}{\hat{f}}(\xi )\). The operator G is densely defined and closed. The Fourier transform of Gf is given by

$$\begin{aligned} \widehat{Gf}(\xi ) = i \partial _{\xi }{\hat{f}}(\xi ) , \qquad \forall f \in \mathbf {D}(G), \qquad \forall \xi >0. \end{aligned}$$
(3.2)

The Hille–Yosida theorem implies that \((-\infty , 0) \subset \rho (iG)\) and \(\Vert (G-\lambda i)^{-1}\Vert _{\mathfrak {B}(L^2_+)} \le \lambda ^{-1}\), \(\forall \lambda >0\).
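The resolvent bound can also be checked by hand on the Fourier side; the following is a sketch based only on (3.1) and (3.2). Given \(g \in L^2_+\) and \(\lambda >0\), the equation \((G - \lambda i) f = g\) reads \(\partial _{\xi } {\hat{f}} = \lambda {\hat{f}} - i {\hat{g}}\) on \((0, +\infty )\), and its unique solution in \(L^2(0, +\infty )\) is

$$\begin{aligned} {\hat{f}}(\xi ) = i \int _{\xi }^{+\infty } e^{-\lambda (\zeta - \xi )} {\hat{g}}(\zeta ) \mathrm {d}\zeta , \qquad \forall \xi > 0, \end{aligned}$$

because the homogeneous solutions \(\xi \mapsto c e^{\lambda \xi }\), \(c \ne 0\), are not square integrable. This is a convolution of \({\hat{g}}\) with the kernel \(s \mapsto e^{\lambda s} \mathbf {1}_{(-\infty , 0)}(s)\), whose \(L^1\)-norm equals \(\lambda ^{-1}\), so Young's inequality gives \(\Vert {\hat{f}}\Vert _{L^2(0, +\infty )} \le \lambda ^{-1} \Vert {\hat{g}}\Vert _{L^2(0, +\infty )}\). Moreover \(\partial _{\xi } {\hat{f}} = \lambda {\hat{f}} - i {\hat{g}} \in L^2(0, +\infty )\), so \(f \in \mathbf {D}(G)\).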

Lemma 3.1

For every \(b \in L^2({\mathbb {R}})\bigcap L^{\infty }({\mathbb {R}})\), we have \(T_b (\mathbf {D}(G)) \subset \mathbf {D}(G)\) and the following identity

$$\begin{aligned}{}[G, T_{b}] \varphi = \tfrac{i{\hat{\varphi }}(0^+)}{2\pi }\Pi b \end{aligned}$$
(3.3)

holds for every \(\varphi \in \mathbf {D}(G)\).

Proof

For every \(\eta >0\) and \(\varphi \in \mathbf {D}(G)\), both \(S(\eta )^*\) and \(T_b\) are bounded operators on \(L^2_+\), so we have \(\tfrac{1}{\eta }\left( [ S(\eta )^* , T_b]\varphi \right) ^{\wedge }(\xi ) = \frac{1}{2\pi \eta }\left( {\hat{b}}*{\hat{\varphi }}(\xi +\eta ) - {\hat{b}}*(\mathbf {1}_{{\mathbb {R}}_+}( \tau _{-\eta }{\hat{\varphi }}))(\xi ) \right) = \frac{1}{2\pi \eta } \int _{\xi }^{\xi +\eta } {\hat{b}}(\zeta ) {\hat{\varphi }}(\xi + \eta - \zeta ) \mathrm {d}\zeta \), \(\forall \xi >0\), where \(\tau _{-\eta }{\hat{\varphi }} (x)={\hat{\varphi }} (x+\eta )\), \(\forall x \in {\mathbb {R}}\). Then we change the variable \(\zeta = \xi + t \eta \), for \(0\le t \le 1\),

$$\begin{aligned} \tfrac{1}{\eta }\left( [ S(\eta )^* - \mathrm {Id}_{L^2_+} , T_b]\varphi \right) ^{\wedge }(\xi )= \frac{1}{2\pi } \int _{0}^{1} {\hat{b}}(\xi + t\eta ) {\hat{\varphi }}((1-t)\eta ) \mathrm {d}t= a_{\eta }\widehat{ b}(\xi ) + \widehat{\phi _{\eta }}(\xi ), \quad \forall \xi >0, \end{aligned}$$
(3.4)

where \(a_{\eta } := \frac{1}{2\pi } \int _{0}^{1} {\hat{\varphi }}((1-t)\eta ) \mathrm {d}t \in {\mathbb {C}}\) and \(\phi _{\eta } \in L^2_+\) such that \(\widehat{\phi _{\eta }}(\xi ) := \frac{1}{2\pi } \int _{0}^{1} [{\hat{b}}(\xi + t\eta ) - {\hat{b}}(\xi )]{\hat{\varphi }}((1-t)\eta ) \mathrm {d}t\), \(\forall \xi > 0\). Since \({\hat{\varphi }}|_{{\mathbb {R}}_+} \in H^1(0, +\infty )\), \({\hat{\varphi }}\) is bounded and \(\lim _{\eta \rightarrow 0^+}{\hat{\varphi }}(\eta ) = {\hat{\varphi }}(0^+)\), Lebesgue’s dominated convergence theorem yields that \(\lim _{\eta \rightarrow 0^+} a_{\eta } = \frac{{\hat{\varphi }}(0^+)}{2\pi }\). Since \(b \in L^2({\mathbb {R}})\), we have \(\lim _{\epsilon \rightarrow 0} \Vert \tau _{\epsilon }{\hat{b}} - {\hat{b}} \Vert _{L^2} = 0\). So \(\Vert \phi _{\eta }\Vert _{L^2}^2 \lesssim \Vert {\hat{\varphi }}\Vert _{L^{\infty }}^2\int _{0}^{1} \int _{0}^{+\infty }|{\hat{b}}(\xi + t\eta ) - {\hat{b}}(\xi )|^2\mathrm {d}\xi \mathrm {d}t = \Vert {\hat{\varphi }}\Vert _{L^{\infty }}^2\int _{0}^{1} \Vert \tau _{-t\eta }{\hat{b}} - {\hat{b}} \Vert _{L^2}^2\mathrm {d}t \rightarrow 0\), if \(\eta \rightarrow 0^+\). Thus (3.4) implies that \(\tfrac{1}{\eta }[ S(\eta )^* - \mathrm {Id}_{L^2_+} , T_b]\varphi = a_{\eta } \Pi b + \phi _{\eta } \rightarrow \frac{{\hat{\varphi }}(0^+)}{2\pi } \Pi b\) in \(L^2_+\), when \(\eta \rightarrow 0^+\). Since \(\varphi \in \mathbf {D}(G)\) and \(T_b\) is bounded, we have \(\tfrac{i}{\eta } T_b [ ( S(\eta )^* - \mathrm {Id}_{L^2_+} )\varphi ] \rightarrow ( T_b G )\varphi \) in \(L^2_+\), when \(\eta \rightarrow 0^+\). Consequently, \(\tfrac{i}{\eta } ( S(\eta )^* - \mathrm {Id}_{L^2_+} )(T_b \varphi ) \rightarrow ( T_b G )\varphi + \frac{i{\hat{\varphi }}(0^+)}{2\pi }\Pi b\) in \(L^2_+\) when \(\eta \rightarrow 0^+\). So \(T_b \varphi \in \mathbf {D}(G)\) and (3.3) holds. \(\quad \square \)

The following scalar representation theorem, discovered by Lax in [13], allows one to classify all translation-invariant subspaces of the Hardy space \(L^2_+\); it plays the same role as Beurling’s theorem in the case of the Hardy space on the circle.

Theorem 3.2

(Lax) Every nonempty closed subspace of \(L^2_+\) that is invariant under the semigroup of shift operators \((S(\eta ))_{\eta \ge 0}\) is of the form \(\Theta L^2_+\), where \(\Theta \) is a holomorphic function on the upper-half plane \({\mathbb {C}}_+ = \{z \in {\mathbb {C}} : \mathrm {Im}z >0\}\). We have \(|\Theta (z)|\le 1\), for all \(z \in {\mathbb {C}}_+\) and \(|\Theta (x)|=1\), \(\forall x \in {\mathbb {R}}\). Moreover, \(\Theta \) is uniquely determined up to multiplication by a complex constant of absolute value 1.

The following lemma, a weak version of Theorem 3.2, classifies all finite-dimensional subspaces that are invariant under the semigroup \((S(\eta )^*)_{\eta \ge 0}\).

Lemma 3.3

Let M be a subspace of \(\mathbf {D}(G)\subset L^2_+\) of finite dimension \(N =\dim _{{\mathbb {C}}} M \ge 1\) such that \(G(M) \subset M\). Then there exists a unique monic polynomial \(Q \in {\mathbb {C}}_{N}[X]\) such that \(Q^{-1}(0) \subset {\mathbb {C}}_-\) and \(M=\frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\). Moreover, Q is the characteristic polynomial of the operator \(G|_{M}\).

Proof

We set \({\hat{M}}=\{{\hat{f}} \in L^2(0,+\infty ) : f \in M\}\), then \(\dim _{{\mathbb {C}}}{\hat{M}}=N\). Since \(\widehat{Gf}=i\partial _{\xi }{\hat{f}}\) on \({\mathbb {R}}\backslash \{0\}\), the restriction \(G|_M\) is unitarily equivalent to \(i\partial _{\xi }|_{{\hat{M}}}\) by the renormalized Fourier–Plancherel transformation. So the characteristic polynomial \(Q \in {\mathbb {C}}_N [X]\) of \(i\partial _{\xi }|_{{\hat{M}}}\) is well defined. Let \(\{\beta _1, \beta _2, \ldots , \beta _n\} \subset {\mathbb {C}}\) denote the distinct roots of Q and let \(m_j\) denote the multiplicity of \(\beta _j\). Then \(\sum _{j=1}^n m_j=N\) and there exist \(c_0, c_1, \ldots , c_{N-1} \in {\mathbb {C}}\) such that \(Q(z) = \det (z - i \partial _{\xi }|_{{\hat{M}}})= \prod _{j=1}^n (z - \beta _j)^{m_j} = z^N + \sum _{k=0}^{N-1}c_k z^k\). The Cayley–Hamilton theorem implies that \(Q(i\partial _{\xi } ) = 0\) on the subspace \({\hat{M}}\). If \(\psi \in {\hat{M}} \subset L^2(0,+\infty )\), then \(\psi \) is a weak solution of the following differential equation

$$\begin{aligned} i^{-N}Q(-\mathrm {D})\psi =\partial _{\xi }^N \psi + \sum _{k=0}^{N-1} i^{k-N}c_k \partial _{\xi }^k \psi =0 \quad \mathrm {on} \quad (0,+\infty ), \qquad \psi \equiv 0 \quad \mathrm {on} \quad (-\infty , 0), \end{aligned}$$
(3.5)

where \(\mathrm {D} = - i\partial _{\xi }\). The differential operator \(Q(-\mathrm {D})\) is elliptic on the open interval \((0,+\infty )\) i.e. the symbol of the principal part of \(Q(-\mathrm {D})\), denoted by \( a _{Q} : (x,\xi ) \in (0,+\infty )\times {\mathbb {R}} \mapsto (-\xi )^N\), does not vanish except for \(\xi =0\). So \(\psi \) is a smooth function on \((0,+\infty )\). The solution space

$$\begin{aligned} \mathrm {Sol}(3.5) =\mathrm {Span}_{{\mathbb {C}}}\{{\hat{f}}_{j,l}\}_{0\le l \le m_j-1, 1\le j \le n}, \qquad {\hat{f}}_{j,l}(\xi )= \xi ^l e^{-i \beta _j \xi } \mathbf {1}_{{\mathbb {R}}_+}, \end{aligned}$$
(3.6)

has complex dimension \(\sum _{j=1}^n m_j=N\), so we have \(\mathrm {Sol}(3.5)={\hat{M}} \subset L^2(0,+\infty )\) and \(\mathrm {Im}\beta _j <0\), \(\forall j=1,2, \ldots , n\). At last, we have \(M=\mathrm {Span}_{{\mathbb {C}}}\{f_{j,l}\}_{0\le l \le m_j-1, 1\le j \le n} =\frac{{\mathbb {C}}_{\le N-1}[X]}{Q} \), where \(f_{j,l}(x)= \frac{l !}{2\pi [(-i)(x- \beta _j)]^{l+1}}\), \(\forall x \in {\mathbb {R}}\). The uniqueness is obtained by identifying all the roots. \(\quad \square \)
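The explicit expression of \(f_{j,l}\) used at the end of this proof comes from inverting the Fourier transform in (3.6). As a sketch, with the convention \({\hat{f}}(\xi ) = \int _{{\mathbb {R}}} f(x) e^{-ix\xi } \mathrm {d}x\) and the classical formula \(\int _0^{+\infty } \xi ^l e^{-s \xi } \mathrm {d}\xi = \frac{l!}{s^{l+1}}\), valid for \(\mathrm {Re}\, s > 0\), applied to \(s = -i(x - \beta _j)\), for which \(\mathrm {Re}\, s = - \mathrm {Im} \beta _j > 0\), we get

$$\begin{aligned} f_{j,l}(x) = \frac{1}{2\pi } \int _0^{+\infty } \xi ^l e^{-i \beta _j \xi } e^{i x \xi } \mathrm {d}\xi = \frac{1}{2\pi } \int _0^{+\infty } \xi ^l e^{i (x - \beta _j) \xi } \mathrm {d}\xi = \frac{l !}{2\pi [(-i)(x- \beta _j)]^{l+1}}, \qquad \forall x \in {\mathbb {R}}. \end{aligned}$$

Each \(f_{j,l}\) is thus a rational function with a single pole of order \(l+1\) at \(\beta _j\), and the partial fraction decomposition identifies their span with \(\frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\).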

Lemma 3.4

For every monic polynomial \(Q \in {\mathbb {C}}_N[X]\) such that \(Q^{-1}(0)\subset {\mathbb {C}}_-\), the associated inner function is defined by \(\Theta =\Theta _Q =\frac{\overline{Q}}{Q}\). The following identity holds for every \(\varphi \in \frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\),

$$\begin{aligned} {\hat{\varphi }}(\xi )=\langle S(\xi )^*\varphi , 1- \Theta \rangle _{L^2}, \quad \forall \xi >0. \end{aligned}$$
(3.7)

In particular, \({\hat{\varphi }}(0^+)=\langle \varphi , 1- \Theta \rangle _{L^2}\).

Proof

Formula (3.6) yields that \(\frac{{\mathbb {C}}_{\le N-1}[X]}{Q} \subset \mathbf {D}(G)\), \(G(\frac{{\mathbb {C}}_{\le N-1}[X]}{Q}) \subset \frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\) and \({\hat{\varphi }} \in C^1({\mathbb {R}}_+^*)\), for any \(\varphi \in \frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\). Set \(\varphi =\frac{P}{Q}\), for some \(P \in {\mathbb {C}}_{\le N-1}[X]\), then we have \(\overline{\Theta } \varphi = \frac{Q}{\overline{Q}} \frac{P}{Q} = \frac{P}{\overline{Q}} \in L^2_-\). Since \(Q(X)= \prod _{j=1}^N (X-\beta _j)\), \(\mathrm {Im} \beta _j<0\), we have \(\Theta (x)= 1 + 2i \sum _{j=1}^N \frac{\mathrm {Im} \beta _j}{x-\beta _j}+ \mathcal {O}(\frac{1}{x^2})\), when \(x\rightarrow +\infty \), so \(1-\Theta \in L^2_+\). As a consequence, we have \({\hat{\varphi }}(\xi ) = \int _{{\mathbb {R}}}\varphi (y)(1-\overline{\Theta (y)})e^{-iy \xi } \mathrm {d}y = \langle S(\xi )^* \varphi , 1-\Theta \rangle _{L^2}\), \(\forall \xi >0\). \(\quad \square \)
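As an illustration of Lemma 3.4, consider the simplest case \(N=1\); the choice of \(\beta \) and \(\varphi \) below is only an example. Let \(\beta \in {\mathbb {C}}_-\), \(Q(X) = X - \beta \) and \(\varphi (x) = \frac{1}{x - \beta }\). By (3.6) with \(l=0\), we have \({\hat{\varphi }}(\xi ) = -2\pi i e^{-i \beta \xi }\) for \(\xi > 0\), so \({\hat{\varphi }}(0^+) = -2\pi i\). On the other hand, \(\Theta (x) = \frac{x - \overline{\beta }}{x - \beta }\) and \(1 - \Theta (x) = \frac{-2i \mathrm {Im} \beta }{x - \beta }\), hence

$$\begin{aligned} \langle \varphi , 1 - \Theta \rangle _{L^2} = \int _{{\mathbb {R}}} \frac{1}{x - \beta } \cdot \frac{2i \mathrm {Im} \beta }{x - \overline{\beta }} \mathrm {d}x = 2 i \mathrm {Im} \beta \int _{{\mathbb {R}}} \frac{\mathrm {d}x}{|x - \beta |^2} = 2 i \mathrm {Im} \beta \cdot \frac{\pi }{|\mathrm {Im} \beta |} = -2\pi i, \end{aligned}$$

in agreement with \({\hat{\varphi }}(0^+)=\langle \varphi , 1- \Theta \rangle _{L^2}\).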

4 The Manifold of Multi-solitons

This section is dedicated to a geometric description of every multi-soliton subset given in definition 1.1. Then we give a spectral characterization for the real analytic symplectic manifold \(\mathcal {U}_N\) in order to prove the global well-posedness of the BO equation (1.1) on \(\mathcal {U}_N\).

4.1 Differential structure and symplectic structure

The real analytic structure of \(\mathcal {U}_N\) is constructed at first.

Proof of proposition 1.2

The set \(\mathcal {V}_N:=\{Q\in {\mathbb {C}}_N[X] : Q^{-1}(0) \subset {\mathbb {C}}_-, \lim _{x\rightarrow +\infty } \frac{Q(x)}{x^N}=1\}\) is identified as \( \mathbf {V}({\mathbb {C}}_-^N)\), where \(\mathbf {V} : (\beta _1, \beta _2, \ldots , \beta _N) \in {\mathbb {C}}^N \mapsto (a_0, a_1, \ldots , a_{N-1}) \in {\mathbb {C}}^N\) denotes the Viète map, defined by

$$\begin{aligned} \prod _{j=1}^N (X- \beta _j) = \sum _{k=0}^{N-1} a_k X^k + X^N. \end{aligned}$$
(4.1)

Recall that \(\mathbf {V} : {\mathbb {C}}^N \rightarrow {\mathbb {C}}^N\) is a quotient map that is both open and closed. For any open simply connected subset \(A \subset {\mathbb {C}}^N\), if A is saturated with respect to \(\mathbf {V}\) and \(A \bigcap \Delta \ne \emptyset \) with \(\Delta :=\{(\beta , \beta , \ldots , \beta ) \in {\mathbb {C}}^N : \beta \in {\mathbb {C}}\}\), then \(\mathbf {V}(A)\) is an open simply connected subset of \({\mathbb {C}}^N\). With the subspace topology of \({\mathbb {C}}^N\) and the Hermitian form \( \mathfrak {H}_{{\mathbb {C}}^N}(X,Y) = X^T \overline{Y}\), the subset \((\mathbf {V}({\mathbb {C}}_-^N), \mathfrak {H}_{{\mathbb {C}}^N})\) is a simply connected Kähler manifold of complex dimension N. The map \(\Gamma _N : (a_0, a_1, \ldots , a_{N-1}) \in \mathbf {V}({\mathbb {C}}_-^N) \mapsto \Pi u = i\frac{Q'}{Q} \in L^2_+\), where Q is given by \(Q(X)=\sum _{k=0}^{N-1} a_k X^k + X^N\), is both a holomorphic immersion and a topological embedding. So \(\Pi (\mathcal {U}_N)= \Gamma _N \circ \mathbf {V}({\mathbb {C}}_-^N)\) is an embedded complex analytic submanifold of \(L^2_+\) and \(\dim _{{\mathbb {C}}}(\Pi (\mathcal {U}_N))=N\). The map \(\Gamma _N : \mathbf {V}({\mathbb {C}}_-^N) \rightarrow \Pi (\mathcal {U}_N)\) is a biholomorphism and \(\mathcal {T}_{\Pi u} (\Pi (\mathcal {U}_N)) = \bigoplus _{z \in \mathbf {P}(u)} {\mathbb {C}}^{\mathbf {m}(z)} \varvec{\phi }_{z}\), where \(\varvec{\phi }_{z}(x)=( x-z)^{-2}\), \(\forall z \in \mathbf {P}(u)\), \(\forall u \in \mathcal {U}_N\). The proof is completed by using the isometry property of the \({\mathbb {R}}\)-linear isomorphism \(\sqrt{2} \Pi : u \in L^2({\mathbb {R}}, {\mathbb {R}}) \mapsto \sqrt{2} \Pi u \in L^2_+\). In fact, we have \(2\mathrm {Re}|_{L^2_+} = (\Pi _{|L^2({\mathbb {R}}, {\mathbb {R}})})^{-1}\) and \(\Vert u\Vert _{L^2} = \sqrt{2} \Vert \Pi u\Vert _{L^2}\). \(\quad \square \)
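For orientation, the maps \(\mathbf {V}\) and \(\Gamma _N\) can be written out explicitly for small N. The display below is a hedged illustration only; it uses nothing beyond (4.1), the formula \(\Pi u = i\frac{Q'}{Q}\) and the identity \(u = 2\mathrm {Re}(\Pi u)\) recalled at the end of the proof, with \(\beta , \beta _1, \beta _2 \in {\mathbb {C}}_-\) arbitrary.

$$\begin{aligned} N=2:&\quad \mathbf {V}(\beta _1, \beta _2) = (a_0, a_1) = \left( \beta _1 \beta _2, -(\beta _1+\beta _2)\right) , \qquad \Gamma _2(a_0,a_1)(x) = i\frac{2x+a_1}{x^2+a_1 x + a_0} = \frac{i}{x-\beta _1} + \frac{i}{x-\beta _2}, \\ N=1:&\quad \Pi u(x) = \frac{i}{x-\beta }, \qquad u(x) = \frac{i}{x-\beta } - \frac{i}{x-\overline{\beta }} = \frac{-2\, \mathrm {Im}\beta }{(x-\mathrm {Re}\beta )^2 + (\mathrm {Im}\beta )^2}, \qquad \int _{{\mathbb {R}}} u = 2\pi . \end{aligned}$$

In the case \(N=1\), the function u is positive because \(\mathrm {Im}\beta <0\), and each pole contributes \(2\pi \) to \(\int _{{\mathbb {R}}} u\), so a general \(u \in \mathcal {U}_N\) has total mass \(2\pi N\).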

We set \(\mathcal {E}:= L^2({\mathbb {R}}, {\mathbb {R}}) \bigcap L^2({\mathbb {R}}, x^2\mathrm {d}x)\), \(\mathcal {E}_{c} : = \{u \in \mathcal {E} : \int _{{\mathbb {R}}}u=c\}\), \(\forall c\in {\mathbb {R}}\). Then we have \(\mathcal {U}_N \subset \mathcal {E}_{2\pi N}\), \(\mathcal {T}_u(\mathcal {U}_N) \subset \mathcal {E}_0 = \mathcal {T}\) defined in (1.10), \(\forall u \in \mathcal {U}_N\). Moreover, \(\mathcal {T}\) is included in \(\mathcal {W} = \partial _x (H^1({\mathbb {R}}, {\mathbb {R}}))\), which is defined in (1.4), thanks to the following lemma.

Lemma 4.1

(Hardy). For every \(f \in H^1({\mathbb {R}})\) such that \(f(0)=0\), we have \(\int _{{\mathbb {R}}} \frac{|f(x)|^2}{|x|^2} \mathrm {d} x \le 4 \Vert \partial _x f\Vert _{L^2}^2\).

So the 2-form \(\omega \) in (1.11) is well defined. Then we show that \(\omega \) is a real analytic symplectic form on \(\mathcal {U}_N\).

Proof of proposition 1.3

Given any smooth vector field \(X \in \mathfrak {X}(\mathcal {U}_N)\), let \(X \lrcorner \omega \in \varvec{\Omega }^1(\mathcal {U}_N)\) denote the interior multiplication by X, i.e. \((X\lrcorner \omega ) (Y) = \omega (X,Y)\), for every \(Y \in \mathfrak {X}(\mathcal {U}_N)\). The first step is to prove that \(\mathrm {d}\omega =0\) on \(\mathcal {U}_N\) by using Cartan’s formula:

$$\begin{aligned} \mathscr {L}_X \omega = X\lrcorner (\mathrm {d}\omega ) + \mathrm {d}( X\lrcorner \omega ). \end{aligned}$$
(4.2)

Let \(\phi \) denote the smooth maximal flow of X. If t is sufficiently close to 0, then \(\phi _t : u\in \mathcal {U}_N \mapsto \phi (t,u) \in \mathcal {U}_N\) is a local diffeomorphism by the fundamental theorem on flows. For any \(u \in \mathcal {U}_N\), \(h_1, h_2 \in \mathcal {T}_u(\mathcal {U}_N)\), we compute the Lie derivative of \(\omega \) with respect to X,

$$\begin{aligned} \begin{aligned} (\mathscr {L}_X \omega )_u(h_1,h_2) =&\lim _{t \rightarrow 0} \tfrac{\omega _{\phi _t(u)} (\mathrm {d} \phi _t(u) h_1, \mathrm {d} \phi _t(u) h_2)-\omega _{u} ( h_1, h_2)}{t} \\ =&\lim _{t \rightarrow 0}\varvec{\omega } \left( \tfrac{ \mathrm {d} \phi _t(u) h_1 - h_1}{t}, \mathrm {d} \phi _t(u) h_2 \right) + \lim _{t \rightarrow 0}\varvec{\omega } \left( h_1, \tfrac{ \mathrm {d} \phi _t(u) h_2 - h_2}{t} \right) . \end{aligned} \end{aligned}$$

So \((\mathscr {L}_X \omega )_u(h_1,h_2) = \left( h_1 \omega (X , h_2)\right) (u) - \left( h_2 \omega (X , h_1)\right) (u)\). We choose a smooth local chart \((V, x^i)\) for \(\mathcal {U}_N\) such that \(u \in V\) and the tangent vector \(h_k\) has the coordinate expression \(h_k= \sum _{j=1}^{2N} h_k^{(j)}\frac{\partial }{\partial x^j}\big |_u\), for some \(h_k^{(j)} \in {\mathbb {R}}\), \(j=1,2, \ldots , 2N\) and \(k=1,2\). The tangent vector \(h_k\) can be identified as some locally constant vector field \(Y_k \in \mathfrak {X}(\mathcal {U}_N)\), which is defined by \(Y_k : v \in V \mapsto \sum _{j=1}^{2N} h_k^{(j)}\frac{\partial }{\partial x^j}\big |_v \in \mathcal {T}_v(\mathcal {U}_N)\), \(Y_k :u \mapsto (Y_k)_u= h_k\), \(\forall k=1,2\). Then the vector field \([Y_1, Y_2]\) vanishes in the open subset V. The exterior derivative of the 1-form \(\beta = X \lrcorner \omega \) is computed as \(\mathrm {d}\beta (Y_1,Y_2) = Y_1 \left( \beta (Y_2)\right) - Y_2 \left( \beta (Y_1)\right) - \beta ([Y_1,Y_2])\). Thus \(\left( \mathrm {d}( X \lrcorner \omega )\right) _u (h_1,h_2) = (\mathscr {L}_X \omega )_u(h_1,h_2)\). Then Cartan’s formula (4.2) yields that \( X \lrcorner (\mathrm {d}\omega )=0\). Since \(X \in \mathfrak {X}(\mathcal {U}_N)\) is arbitrary, we have \(\mathrm {d}\omega =0\).

Given \(u\in \mathcal {U}_N\), we claim that the linear map \(\Upsilon ^{\omega }_u : h \in \mathcal {T}_u(\mathcal {U}_N) \mapsto h\lrcorner \omega _u \in \mathcal {T}_u^*(\mathcal {U}_N)\) is injective.

In fact, for any \(h \in \mathrm {Ker}\Upsilon ^{\omega }_u\), we define \(h^{\sharp }:= 2 \mathrm {Re}(i \Pi h) \in \mathcal {T}_u(\mathcal {U}_N)\). Then the second expression of (1.11) yields that \(0=(h\lrcorner \omega _u) (h^{\sharp })= \int _0^{+\infty } \frac{|{\hat{h}}(\xi )|^2}{\pi \xi }\mathrm {d}\xi \) and hence \(h=2\mathrm {Re} \circ \Pi (h)=0\). So \(\omega \) is nondegenerate and it is a real analytic symplectic form on \(\mathcal {U}_N\). For any smooth function \(f : \mathcal {U}_N \rightarrow {\mathbb {R}}\), its Hamiltonian vector field \(X_f \in \mathfrak {X}(\mathcal {U}_N)\) is given by \(X_f(u):=-(\Upsilon ^{\omega }_u)^{-1} (\mathrm {d}f(u))\). Since \(\mathrm {d}f(u)(h)= \langle h, \nabla _u f(u) \rangle _{L^2} = \frac{i}{2\pi } \int _{{\mathbb {R}}}\frac{{\hat{h}}(\xi )}{\xi } \overline{i \xi (\nabla _u f(u))^{\wedge }(\xi )}\mathrm {d}\xi \), \(\forall h \in \mathcal {T}_u(\mathcal {U}_N)\), formula (1.12) is obtained. \(\quad \square \)

Corollary 4.2

Endowed with Hermitian form \(\mathfrak {H}\), which is defined by \(\mathfrak {H}_{\Pi u}(h_1, h_2) := \int _0^{+\infty }\frac{{\hat{h}}_1(\xi ) \overline{{\hat{h}}_2(\xi )}}{\pi \xi } \mathrm {d}\xi \), \(\forall h_1, h_2 \in \mathcal {T}_{\Pi u}(\Pi (\mathcal {U}_N))\), \(\forall u \in \mathcal {U}_N\), \((\Pi (\mathcal {U}_N), \mathfrak {H})\) is a Kähler manifold and \(\omega = - \Pi ^* (\mathrm {Im} \mathfrak {H})\).

4.2 Spectral analysis II

We continue to study the spectrum of the Lax operator \(L_u\) introduced in Definition 2.2. The general case \(u\in \mathcal {E} = L^2({\mathbb {R}}, {\mathbb {R}}) \bigcap L^2({\mathbb {R}}, x^2 \mathrm {d}x)\) has been studied in Sect. 2.1. We restrict our study to the case \(u\in \mathcal {U}_N\) in this subsection. The operator \(L_u\) has the following spectral decomposition

$$\begin{aligned} L^2_+ = \mathscr {H}_{\mathrm {ac}}(L_u) \bigoplus \mathscr {H}_{\mathrm {sc}}(L_u) \bigoplus \mathscr {H}_{\mathrm {pp}}(L_u). \end{aligned}$$
(4.3)

Let \(Q_u\) denote the characteristic polynomial of u given by (1.9) and let \(\Theta _u := \Theta _{Q_u}=\frac{\overline{Q}_u}{Q_u}\) denote the inner function on \({\mathbb {C}}_+\) associated to \(Q_u\). We have \(S(\eta )[ \Theta _u h]= \Theta _u [S(\eta ) h]\), \(\forall h \in L^2_+\), so \(\Theta _u L^2_+\) is a closed subspace of \(L^2_+\) that is invariant under the semigroup \((S(\eta ))_{\eta \ge 0}\) introduced in Section 3. Set \(K_{\Theta _u}:=( \Theta _u L^2_+)^{\perp }\). Thus,

$$\begin{aligned} L^2_+ = \Theta _u L^2_+ \bigoplus K_{\Theta _u}, \quad S(\eta )^* ( K_{\Theta _u}) \subset K_{\Theta _u} \quad \mathrm {and} \quad G(\mathbf {D}(G) \bigcap K_{\Theta _u}) \subset K_{\Theta _u}, \end{aligned}$$
(4.4)

where G is defined in (3.2). The following proposition identifies the subspaces in (4.3) and (4.4).

Proposition 4.3

If \(u \in \mathcal {U}_N\), then \(L_u\) has exactly N simple negative eigenvalues and we have

$$\begin{aligned} \mathscr {H}_{\mathrm {ac}}(L_u) = \Theta _u L^2_+, \qquad \mathscr {H}_{\mathrm {sc}}(L_u)=\{0\}, \qquad \mathscr {H}_{\mathrm {pp}}(L_u) = K_{\Theta _u} = \tfrac{{\mathbb {C}}_{\le N-1}[X]}{Q_u}. \end{aligned}$$
(4.5)

Proof

Fix \(u\in \mathcal {U}_N\), we use abbreviation \(Q:=Q_u\) and \(\Theta :=\Theta _u\). The first step is to prove \(K_{\Theta } = \frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\). In fact, \(\forall h \in L^2_+\) and \(f = \frac{P}{Q} \in \frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\), for some \(P \in {\mathbb {C}}_{\le N-1}[X]\), we have \(\langle f, \Theta h\rangle _{L^2} = \langle \frac{P}{\overline{Q}}, h\rangle _{L^2}\). Since \(\overline{Q}(x)=\prod _{j=1}^N (x- \overline{\beta }_j)\) with \(\mathrm {Im}(\beta _j)<0\), the meromorphic function \(\frac{P}{\overline{Q}}\) has poles in \({\mathbb {C}}_+\), so \(\frac{P}{\overline{Q}} \in L^2_-\). Thus \(\langle f, \Theta h\rangle _{L^2} = \langle \frac{P}{\overline{Q}}, h\rangle _{L^2} =0\). Thus \(\frac{{\mathbb {C}}_{\le N-1}[X]}{Q} \subset (\Theta L^2_+)^{\perp } =K_{\Theta }\). Conversely, if \(f \in K_{\Theta }\), then \(\langle \Theta ^{-1}f, h \rangle _{L^2} =\langle f, \Theta h \rangle _{L^2} = 0\), for every \(h \in L^2_+\). Thus \(g:= \frac{Q}{\overline{Q}}f \in L^2_-\). It suffices to prove that \(P:=Q f = \overline{Q}g \in {\mathbb {C}}[X]\). In fact, \(\widehat{Qf}= Q (i \partial _{\xi }) {\hat{f}}\) and \(\mathrm {supp}({\hat{f}}) \subset [0,+\infty ) \Rightarrow \mathrm {supp}(\widehat{Qf}) \subset [0,+\infty )\). Similarly, we have \(\mathrm {supp}((\overline{Q}g)^{\wedge }) \subset (-\infty , 0]\). So \(\mathrm {supp}({\hat{P}})\subset \{0\}\) and P is a polynomial. Since \(f = \frac{P}{Q} \in L^2({\mathbb {R}})\), we have \(\deg P \le N-1\). So \(K_{\Theta } \subset \frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\subset K_{\Theta }\). The second step is to show that

$$\begin{aligned} L_u(\Theta h)= \Theta \mathrm {D} h, \qquad \forall h \in L^2_+. \end{aligned}$$
(4.6)

In fact, we have \(\frac{{\mathbb {C}}_{\le N-1}[X]}{Q} \subset L^2_+\), \(\Theta = \frac{\overline{Q}}{Q}\) and \(\frac{\mathrm {D} \Theta }{\Theta }=\frac{\mathrm {D} \overline{Q}}{\overline{Q}} - \frac{\mathrm {D} Q}{Q} = i \frac{Q'}{Q} - i \frac{\overline{Q}'}{\overline{Q}} = \Pi u + \overline{\Pi u} = u\) on \({\mathbb {R}}\), then \(L_u (\Theta h) =(\mathrm {D}-T_u)(\Theta h)= \Theta \mathrm {D} h + h \left( \mathrm {D} \Theta - i \tfrac{Q'}{Q} \Theta + i \tfrac{\overline{Q}'}{Q} \right) = \Theta \mathrm {D}h + h \Theta \left( \tfrac{\mathrm {D} \Theta }{\Theta } - i \tfrac{Q'}{Q} + i \tfrac{\overline{Q}'}{\overline{Q}} \right) = \Theta \mathrm {D} h\). Recall that \(L_u = L_u^*\), so we have \(L_u(K_{\Theta }) \subset K_{\Theta }\). Since \(\dim _{{\mathbb {C}}}K_{\Theta } = N\), Corollary 2.7 yields that the Hermitian matrix \(L_{u|{K_{\Theta }}}\) has exactly N distinct eigenvalues. Hence \(K_{\Theta } \subset \mathscr {H}_{\mathrm {pp}}(L_u)\).

We set \(U_{\Theta } : L^2_+ \rightarrow \Theta L^2_+\) such that \(U_{\Theta } h = \Theta h\). Then \(U_{\Theta }^{-1}=U_{\Theta }^* : g \in \Theta L^2_+ \mapsto \Theta ^{-1} g \in L^2_+\), i.e. \(U_{\Theta } : L^2_+ \rightarrow \Theta L^2_+\) is unitary. Moreover, we have \(U_{\Theta }(H^1_+)=\Theta H ^1_+ = H^1_+ \bigcap \Theta L^2_+\). Formula (4.6) yields that \(U_{\Theta }[\mathbf {D}(\mathrm {D})] =\Theta H ^1_+ = H^1_+ \bigcap \Theta L^2_+ = \mathbf {D}( L_{u|\Theta L^2_+} )\) and \(U_{\Theta }^* L_{u|\Theta L^2_+} U_{\Theta } = \mathrm {D}\). For every bounded Borel function \(f : {\mathbb {R}} \rightarrow {\mathbb {C}}\), we have \(f(L_{u |\Theta L^2_+} )U_{\Theta } = U_{\Theta } f( \mathrm {D})\) by proposition 2.1. Let \(\mu _{\psi }=\mu _{\psi }^{L_u}\) denote the spectral measure of \(L_u\) associated to \(\psi \in L^2_+\). Then \(\int _{{\mathbb {R}}}f(\xi ) \mathrm {d}\mu _{\Theta h}(\xi )=\langle f(L_u)U_{\Theta }h, U_{\Theta }h \rangle _{L^2} = \langle f( \mathrm {D}) h, h \rangle _{L^2} =\frac{1}{2\pi }\int _0^{+\infty } f(\xi ) |{\hat{h}}(\xi )|^2 \mathrm {d}\xi \), \(\forall h \in L^2_+\). So \(2 \pi \mathrm {d}\mu _{\Theta h}(\xi ) = \mathbf {1}_{{\mathbb {R}}_+}|{\hat{h}}(\xi )|^2 \mathrm {d}\xi \). The measure \(\mu _{\Theta h}\) is absolutely continuous with respect to the Lebesgue measure on \({\mathbb {R}}\). Thus \(\Theta L^2_+ \subset \mathscr {H}_{\mathrm {ac}}(L_u) \subset \mathscr {H}_{\mathrm {cont}}(L_u) = ( \mathscr {H}_{\mathrm {pp}}(L_u))^{\perp } \subset \Theta L^2_+\) and (4.5) is obtained. We have \(\mathrm {supp}(\mu _{\Theta h}) \subset [0,+\infty )\). For any \(\xi > 0\), there exists \(h\in L^2_+\bigcap L^1(\mathbb {R})\) such that \({\hat{h}}(\xi ) \ne 0\). So we have \(\sigma _{\mathrm {ess}}(L_u)=\sigma _{\mathrm {cont}}(L_u)=\sigma _{\mathrm {ac}}(L_u)= [0,+\infty )\). \(\quad \square \)
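To make Proposition 4.3 concrete, here is a worked example for \(N=1\). It is a sketch under assumptions that are not restated in this section: we take \(\Pi u(x) = \frac{i}{x-\beta }\) with \(\mathrm {Im}\beta <0\), we assume that the Toeplitz operator acts as \(T_u h = \Pi (uh)\), and we use that \(\frac{1}{x-\overline{\beta }} \in L^2_-\). The space \(K_{\Theta }\) is then spanned by \(\varphi (x)=\frac{1}{x-\beta }\).

$$\begin{aligned} \mathrm {D}\varphi&= \frac{i}{(x-\beta )^2}, \\ u \varphi&= \frac{i}{(x-\beta )^2} - \frac{i}{(x-\beta )(x-\overline{\beta })} = \frac{i}{(x-\beta )^2} - \frac{1}{2\, \mathrm {Im}\beta }\left( \frac{1}{x-\beta } - \frac{1}{x-\overline{\beta }}\right) , \\ T_u \varphi&= \Pi (u\varphi ) = \frac{i}{(x-\beta )^2} - \frac{1}{2\, \mathrm {Im}\beta } \varphi , \\ L_u \varphi&= \mathrm {D}\varphi - T_u \varphi = \frac{1}{2\, \mathrm {Im}\beta } \varphi . \end{aligned}$$

So in this example \(L_u\) has the single negative eigenvalue \(\frac{1}{2 \mathrm {Im}\beta }\) with eigenfunction \(\varphi \), in agreement with \(\mathscr {H}_{\mathrm {pp}}(L_u) = \frac{{\mathbb {C}}_{\le 0}[X]}{Q_u}\).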

Definition 4.4

For every \(u \in \mathcal {U}_N\), we have the following spectral decomposition of \(L_u\):

$$\begin{aligned} \sigma (L_u)=\sigma _{\mathrm {ac}}(L_u) \bigcup \sigma _{\mathrm {sc}}(L_u) \bigcup \sigma _{\mathrm {pp}}(L_u), \qquad \mathrm {where} \quad \sigma _{\mathrm {ac}}(L_u) = [0, +\infty ), \quad \sigma _{\mathrm {sc}}(L_u) =\emptyset \end{aligned}$$
(4.7)

and \(\sigma _{\mathrm {pp}}(L_u) = \{\lambda _1^u,\lambda _2^u, \ldots , \lambda _N^u\}\) consists of all eigenvalues of \(L_u\). Proposition 2.3 yields that \(L_u\) is bounded from below and \(-\frac{\Vert u\Vert _{L^2}^2}{4 C^4} \le \lambda _1^u< \cdots< \lambda _N^u < 0\) with \(C=\inf _{f \in H^1_+ \backslash \{0\}}\frac{\Vert |\mathrm {D}|^{\frac{1}{4}}f\Vert _{L^2}}{\Vert f\Vert _{L^4}}\).

Hence the min-max principle (Theorem XIII.1 of Reed–Simon [19]) yields that

$$\begin{aligned} \lambda _n^u = \sup _{\dim _{{\mathbb {C}}}F=n-1 }\mathfrak {I}(F, L_u), \qquad \mathfrak {I}(F, L_u)=\inf \{\langle L_u h, h\rangle _{L^2} : h \in H^1_+ \bigcap F^{\perp }, \Vert h\Vert _{L^2}=1\} \end{aligned}$$
(4.8)

where, in the above supremum, F ranges over all subspaces of \(L^2_+\) of complex dimension \(n - 1\), \(1 \le n\le N\). When \(n\ge N+1\), \(\sup _{\dim _{{\mathbb {C}}}F=n - 1 }\mathfrak {I}(F, L_u) =\inf \sigma _{\mathrm {ess}}(L_u)=0\). Given \(j=1,2, \ldots , N\), Proposition 2.4 and Corollary 2.7 yield that there exists a unique function \(\varphi _j : u\in \mathcal {U}_N \mapsto \varphi _j^{u} \in \mathscr {H}_{\mathrm {pp}}(L_u)\) such that

$$\begin{aligned} \mathrm {Ker}(\lambda _j^u - L_u)= {\mathbb {C}} \varphi _j^u , \qquad \Vert \varphi _j^{u}\Vert _{L^2} =1, \qquad \langle \varphi _j^{u}, u \rangle _{L^2} = \sqrt{2\pi |\lambda _j^u |}, \end{aligned}$$
(4.9)

for every \(j=1, 2, \ldots , N\). Then \(\{\varphi _1^u, \varphi _2^u, \ldots , \varphi _N^u\}\) is an orthonormal basis of the subspace \(\mathscr {H}_{\mathrm {pp}}(L_u)\). Before proving the real analyticity of each eigenvalue, we show its continuity at first.

Lemma 4.5

For every \(j=1,2,\ldots , N\), the j-th eigenvalue \(\lambda _j: u \in \mathcal {U}_N\mapsto \lambda _j^u \in {\mathbb {R}}\) is Lipschitz continuous on every compact subset of \(\mathcal {U}_N\).

Proof

For every \(f\in H^1({\mathbb {R}})\), the Sobolev embedding \(\Vert f\Vert _{L^4} \le C^{-1} \Vert |\mathrm {D}|^{\frac{1}{4}} f\Vert _{L^2}\) yields that \(\forall u, v \in \mathcal {U}_N\),

$$\begin{aligned}&\big | \langle L_u h, h\rangle _{L^2} - \langle L_v h, h\rangle _{L^2}\big | \le \Vert u-v\Vert _{L^2}\Vert h\Vert _{L^4}^2 \le C^{-2} \Vert u-v\Vert _{L^2} \Vert |\mathrm {D}|^{\frac{1}{2}} h\Vert _{L^2} \Vert h\Vert _{L^2},\nonumber \\&\quad \forall h \in H^1_+. \end{aligned}$$
(4.10)

Given \(j=1,2, \ldots , N\) and a subspace \(F \subset L^2_+\) whose complex dimension is \(j-1\), we choose a function \(h \in F^{\perp } \bigcap \bigoplus _{k=1}^{j} \mathrm {Ker}(\lambda _k^u - L_u) \subset H^1_+\) such that \(\Vert h\Vert _{L^2}=1\). We have \(h = \sum _{k=1}^j h_k \varphi _k^u\) for some \(h_k \in {\mathbb {C}}\). Then \(\langle L_u h, h\rangle _{L^2} = \sum _{k=1}^j |h_k|^2 \lambda _k^u \le \lambda _j^u<0\), because \(\lambda _k^u < \lambda _{k+1}^u\). We have the following estimate

$$\begin{aligned} \Vert |\mathrm {D}|^{\frac{1}{2}} h\Vert _{L^2}^2= & {} \langle \mathrm {D}h, h \rangle _{L^2} = \langle L_u h, h\rangle _{L^2} + \langle u h, h\rangle _{L^2} \le \lambda _j^u + \Vert u\Vert _{L^2}\Vert h\Vert _{L^4}^2 \nonumber \\\le & {} C^{-2} \Vert u\Vert _{L^2} \Vert |\mathrm {D}|^{\frac{1}{2}} h\Vert _{L^2} \Vert h\Vert _{L^2}. \end{aligned}$$
(4.11)

So estimates (4.10) and (4.11) yield that \(\langle L_v h, h\rangle _{L^2} \le \lambda _j^u +C^{-4} \Vert u\Vert _{L^2} \Vert u-v\Vert _{L^2}\). Since F is arbitrary, the max–min formula (4.8) implies that \(|\lambda _j^u - \lambda _j^v|\le C^{-4} (\Vert u\Vert _{L^2}+ \Vert v\Vert _{L^2})\Vert u-v\Vert _{L^2}\). Every compact subset \(K \subset \mathcal {U}_N\) is bounded in \(L^2({\mathbb {R}},{\mathbb {R}})\). Hence \(u \in K \mapsto \lambda _j^u \in {\mathbb {R}}\) is Lipschitz continuous. \(\quad \square \)
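The passage from the pointwise estimate to the Lipschitz bound can be spelled out as follows. This is only an expanded sketch of the step "since F is arbitrary" above, with h chosen as in the proof (so \(h \in H^1_+ \bigcap F^{\perp }\), \(\Vert h\Vert _{L^2}=1\)) and F ranging over the subspaces of \(L^2_+\) of complex dimension \(j-1\).

$$\begin{aligned} \mathfrak {I}(F, L_v)&\le \langle L_v h, h\rangle _{L^2} \le \lambda _j^u + C^{-4} \Vert u\Vert _{L^2} \Vert u-v\Vert _{L^2}, \\ \lambda _j^v&= \sup _{\dim _{{\mathbb {C}}}F=j-1}\mathfrak {I}(F, L_v) \le \lambda _j^u + C^{-4} \Vert u\Vert _{L^2} \Vert u-v\Vert _{L^2}, \\ \lambda _j^u&\le \lambda _j^v + C^{-4} \Vert v\Vert _{L^2} \Vert u-v\Vert _{L^2} \qquad (\mathrm {roles} \ \mathrm {of} \ u \ \mathrm {and} \ v \ \mathrm {exchanged}), \\ |\lambda _j^u - \lambda _j^v|&\le C^{-4} \max (\Vert u\Vert _{L^2}, \Vert v\Vert _{L^2}) \Vert u-v\Vert _{L^2} \le C^{-4} (\Vert u\Vert _{L^2}+ \Vert v\Vert _{L^2})\Vert u-v\Vert _{L^2}. \end{aligned}$$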

Proposition 4.6

For every \(j=1,2,\ldots , N\), the j-th eigenvalue \(\lambda _j: u \in \mathcal {U}_N\mapsto \lambda _j^u \in {\mathbb {R}}\) is real analytic.

Its proof is based on Kato’s perturbation theory for linear operators.

Proof

For every \(u \in \mathcal {U}_N\), let \(\mathbb {P}^{j}_{u}\) denote the Riesz projector of the eigenvalue \(\lambda _j^u\). Then there exists \(\epsilon _0>0\) such that the family of closed discs \(\{\overline{D}(\lambda _j^u, \epsilon _0)\}_{1\le j\le N}\bigcup \{\overline{D}(0, \epsilon _0)\}\) is mutually disjoint and for every \(j, k =1,2, \ldots , N\) and any closed path \(\varvec{\Gamma }_j^u\) (piecewise \(C^1\) closed curve) in \( D(\lambda _j^u, \epsilon _0)\) with respect to which the eigenvalue \(\lambda _j^u\) has winding number 1, we have

$$\begin{aligned} \mathbb {P}^{j}_{u} = \frac{1}{2\pi i} \oint _{\varvec{\Gamma }_j^u} (\zeta - L_u)^{-1} \mathrm {d}\zeta , \qquad \mathbb {P}^{j}_{u} \circ \mathbb {P}^{j}_{u}=\mathbb {P}^{j}_{u},\qquad \mathbb {P}^{j}_{u} \varphi _k^u = \delta _{kj}\varphi _k^u \end{aligned}$$
(4.12)

by Theorem XII.5 of Reed–Simon [19]. We choose \(\varvec{\Gamma }_j^u\) to be the counterclockwise-oriented circle \(\mathscr {C}(\lambda _j^u, \epsilon )\) in (4.12) for some \( \epsilon \in (0, \epsilon _0)\). We claim that \(\mathrm {Im}\mathbb {P}^{j}_{u} = \mathrm {Ker}(\lambda _j^u - L_u)={\mathbb {C}}\varphi _j^u \).

It suffices to show that \(\mathbb {P}^{j}_{u}|_{\mathscr {H}_{\mathrm {ac}}(L_u)}=0\). In fact the operator \(\mathbb {P}^{j}_{u} = g_{\lambda _j^u}(L_u)\) is self-adjoint and bounded, where the bounded Borel function \(g_{\lambda } : {\mathbb {R}} \rightarrow {\mathbb {R}}\) is given by

$$\begin{aligned} g_{\lambda }(x):= \frac{1}{2\pi i} \oint _{\mathscr {C}(\lambda , \epsilon )} (\zeta - x)^{-1} \mathrm {d}\zeta = \mathbf {1}_{(\lambda -\epsilon , \lambda +\epsilon )}(x), \qquad \mathrm {a}.\mathrm {e}. \quad \mathrm {on} \quad {\mathbb {R}}, \end{aligned}$$

for every \(\lambda \in {\mathbb {R}}\). Since \(\mathbb {P}^{j}_{u}(\mathscr {H}_{\mathrm {pp}}(L_u)) \subset {\mathbb {C}}\varphi _j^u \subset \mathscr {H}_{\mathrm {pp}}(L_u)\), we have \(\mathbb {P}^{j}_{u}(\mathscr {H}_{\mathrm {ac}}(L_u)) \subset \mathscr {H}_{\mathrm {ac}}(L_u) \). Let \(\mu _{\psi }=\mu _{\psi }^{L_u}\) denote the spectral measure of \(L_u\) associated to \(\psi \in \mathscr {H}_{\mathrm {ac}}(L_u) \), whose support is included in \([0, +\infty )\) by (4.7), so \(\langle \mathbb {P}^{j}_{u} \psi , \psi \rangle _{L^2} = \frac{1}{2\pi i} \oint _{\mathscr {C}(\lambda _j^u, \epsilon )} \langle (\zeta - L_u)^{-1}\psi , \psi \rangle _{L^2} \mathrm {d}\zeta = \frac{1}{2\pi i}\int _0^{+\infty }\left( \oint _{\mathscr {C}(\lambda _j^u, \epsilon )} (\zeta - \xi )^{-1} \mathrm {d}\zeta \right) \mathrm {d}\mu _{\psi }(\xi )=0\). Set \({\tilde{\psi }}=\mathbb {P}^{j}_{u} \psi \in \mathscr {H}_{\mathrm {ac}}(L_u)\), then \(\Vert {\tilde{\psi }}\Vert _{L^2}^2 = \langle \mathbb {P}^{j}_{u} {\tilde{\psi }}, {\tilde{\psi }}\rangle _{L^2} =0\). So the claim is obtained.
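The two contour integrals used in this claim follow from the residue theorem. The following display is an illustrative expansion of that step. The only input beyond the text is the elementary remark that the disjointness of \(\overline{D}(\lambda _j^u, \epsilon _0)\) and \(\overline{D}(0, \epsilon _0)\) forces \(|\lambda _j^u| > 2\epsilon _0\).

$$\begin{aligned} \frac{1}{2\pi i} \oint _{\mathscr {C}(\lambda , \epsilon )} \frac{\mathrm {d}\zeta }{\zeta - x}&= {\left\{ \begin{array}{ll} 1, &{} |x-\lambda |<\epsilon , \\ 0, &{} |x-\lambda |>\epsilon , \end{array}\right. } \\ \xi \ge 0 \ \Longrightarrow \ |\xi - \lambda _j^u|&\ge |\lambda _j^u|> 2\epsilon _0 > \epsilon \ \Longrightarrow \ \oint _{\mathscr {C}(\lambda _j^u, \epsilon )} \frac{\mathrm {d}\zeta }{\zeta - \xi } = 0. \end{aligned}$$

The first line gives \(g_{\lambda } = \mathbf {1}_{(\lambda -\epsilon , \lambda +\epsilon )}\) away from the two points \(\lambda \pm \epsilon \), and the second line is the reason why the integral against \(\mathrm {d}\mu _{\psi }\) vanishes.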

For every fixed \(j=1,2, \ldots , N\), we have \(\lambda ^u_j= \mathrm {Tr}(L_u \circ \mathbb {P}^{j}_{u})\). Since every eigenvalue \(\lambda _k : v \in \mathcal {U}_N \mapsto \lambda _k^v \in {\mathbb {R}}\) is continuous, there exists an open subset \(\mathcal {V}\subset \mathcal {U}_N\) containing u such that \(\sup _{v \in \mathcal {V}}\sup _{1\le k \le N}|\lambda ^v_k-\lambda ^u_k| < \frac{\epsilon _0}{3}\). We set \(\epsilon =\frac{2\epsilon _0}{3}\), then \(\lambda _j^v \in D(\lambda _j^u, \epsilon ) \backslash \overline{D}(\lambda _k^u, \epsilon _0) \), for every \(v \in \mathcal {V}\) and \(k \ne j\). For example, in the next picture, the dashed circles denote respectively \(\mathscr {C}(\lambda _j^u, \epsilon _0)\) and \(\mathscr {C}(\lambda _k^u, \epsilon _0)\); the smaller circles denote respectively \(\mathscr {C}(\lambda _j^u, \epsilon )\) and \(\mathscr {C}(\lambda _k^u, \epsilon )\) with \(j<k\). The segments inside small circles denote the possible positions of \(\lambda _j^v\) and \(\lambda _k^v\).

figure a

Then \(\sigma (L_v) \bigcap D(\lambda _j^u, \epsilon _0)=\{\lambda _j^v\}\) and \(\mathscr {C}(\lambda _j^u, \epsilon )\) is a closed path in \(D(\lambda _j^u, \epsilon _0)\) with respect to which \(\lambda _j^v\) has winding number 1. Thus,

$$\begin{aligned} \mathbb {P}^{j}_{v} = \frac{1}{2\pi i} \oint _{\mathscr {C}(\lambda _j^u, \epsilon )} (\zeta - L_v)^{-1} \mathrm {d}\zeta , \qquad \lambda _j^v = \mathrm {Tr}(L_v\circ \mathbb {P}^{j}_{v} ), \qquad \forall v \in \mathcal {V}. \end{aligned}$$
(4.13)

Since \(v \in \mathcal {V} \mapsto L_v \in \mathfrak {B}(H^1_+, L^2_+)\) is \({\mathbb {R}}\)-affine and \(\mathbf {i} : \mathcal {A} \in \mathfrak {B}_{\mathfrak {I}}(H^1_+, L^2_+) \mapsto \mathcal {A}^{-1} \in \mathfrak {B}(L^2_+, H^1_+)\) is complex analytic, where \(\mathfrak {B}_{\mathfrak {I}}(H^1_+, L^2_+)\subset \mathfrak {B}(H^1_+, L^2_+)\) denotes the open subset of all bijective bounded \({\mathbb {C}}\)-linear transformations \(H^1_+ \rightarrow L^2_+\), we have the real analyticity of the following map

$$\begin{aligned} (\zeta , v) \in \left( D(\lambda _j^u, \frac{3}{4}\epsilon _0) \backslash \overline{D}(\lambda _j^u, \frac{1}{2}\epsilon _0) \right) \times \mathcal {V} \mapsto (\zeta - L_v)^{-1} \in \mathfrak {B}(L^2_+, H^1_+). \end{aligned}$$
(4.14)

Hence the maps \(\mathbb {P}^{j} : v \in \mathcal {V} \mapsto \mathbb {P}^{j}_{v} \in \mathfrak {B}(L^2_+, H^1_+)\) and \(\lambda _j : v \in \mathcal {V} \mapsto \mathrm {Tr}(L_v\circ \mathbb {P}^{j}_{v} ) \in {\mathbb {R}}\) are both real analytic by composing (4.13) and (4.14). \(\quad \square \)
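One way to see the analyticity of (4.14) in the variable v concretely is through a Neumann series. This is a standard device that is not spelled out in the argument above, so the following should be read as a hedged sketch. It assumes that \(T_u h = \Pi (uh)\), so that \(L_v - L_u = -T_{v-u}\), and that v is close enough to u for the series to converge.

$$\begin{aligned} \zeta - L_v&= \left( \mathrm {Id} + T_{v-u} (\zeta - L_u)^{-1}\right) (\zeta - L_u), \\ (\zeta - L_v)^{-1}&= (\zeta - L_u)^{-1} \sum _{k=0}^{+\infty } (-1)^k \left( T_{v-u} (\zeta - L_u)^{-1}\right) ^k, \qquad \mathrm {if} \quad \Vert T_{v-u} (\zeta - L_u)^{-1}\Vert _{\mathfrak {B}(L^2_+)} < 1. \end{aligned}$$

Each term of the series is homogeneous of degree k in \(v-u\), so the resolvent is given locally by a convergent power series in \(v-u\). Integrating over \(\mathscr {C}(\lambda _j^u, \epsilon )\) as in (4.13) then produces a power series for \(\mathbb {P}^{j}_{v}\).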

Recall that \(\mathscr {H}_{\mathrm {pp}}(L_u)= \frac{{\mathbb {C}}_{\le N-1}[X]}{Q_u}\subset \mathbf {D}(G)\) is given by (3.6), \(\forall u \in \mathcal {U}_N\). We have the following consequence.

Corollary 4.7

For every \(j=1,2, \ldots , N\), both the map \(\varphi _j : u \in \mathcal {U}_N \mapsto \varphi _j^u \in H^1_+\) and the map \(\mho _j : u \in \mathcal {U}_N \mapsto \langle G \varphi _j^u, \varphi _j^u \rangle _{L^2} \in {\mathbb {C}}\) are real analytic.

Proof

Given \(u,v \in \mathcal {U}_N\), we have \(\mathbb {P}_v^j \varphi _j^u = \langle \varphi _j^u, \varphi _j^v \rangle _{L^2}\varphi _j^v\). Since the Riesz projector \(\mathbb {P}^{j} : v \in \mathcal {U}_N \mapsto \mathbb {P}^{j}_{v} \in \mathfrak {B}(L^2_+, H^1_+)\) is real analytic by the proof of Proposition 4.6 and \(\Vert \mathbb {P}_u^j \varphi _j^u\Vert _{L^2}=1\), there exists a neighbourhood of u, denoted by \(\mathcal {V}\), such that \(\Vert \mathbb {P}_v^j \varphi _j^u\Vert _{L^2} > \frac{1}{2}\) for every \(v \in \mathcal {V}\) and \(\mathbb {P}^{j} : v \in \mathcal {V} \mapsto \mathbb {P}^{j}_{v} \in \mathfrak {B}(L^2_+, H^1_+)\) can be expressed by power series. Then we have \(\varphi _j^v = \frac{\mathbb {P}_v^j \varphi _j^u}{ \langle \varphi _j^u, \varphi _j^v \rangle _{L^2}}\) and \(\mho _j(v)= \frac{ \langle G \circ \mathbb {P}_v^j( \varphi _j^u), \mathbb {P}_v^j( \varphi _j^u) \rangle _{L^2}}{\Vert \mathbb {P}_v^j( \varphi _j^u)\Vert _{L^2}^2}\). Hence the restriction \(\mho _j : v \in \mathcal {V} \mapsto \Vert \mathbb {P}_v^j( \varphi _j^u)\Vert _{L^2}^{-2} \langle G \circ \mathbb {P}_v^j( \varphi _j^u), \mathbb {P}_v^j( \varphi _j^u) \rangle _{L^2} \in {\mathbb {C}}\) is real analytic. Since (4.9) yields that \(\langle \mathbb {P}_v^j \varphi _j^u, v\rangle _{L^2} = \sqrt{-2\pi \lambda _j^v} \langle \varphi _j^u, \varphi _j^v \rangle _{L^2}\), the restriction \(\varphi _j : v \in \mathcal {V} \mapsto \frac{\sqrt{-2\pi \lambda _j^v}}{\langle \mathbb {P}_v^j \varphi _j^u, v\rangle _{L^2}} \mathbb {P}_v^j \varphi _j^u \in H^1_+\) is real analytic. \(\quad \square \)

4.3 Characterization theorem

This subsection is dedicated to proving the following spectral characterization theorem for multi-solitons.

Theorem 4.8

Given \(N \in {\mathbb {N}}_+\), a function u belongs to \(\mathcal {U}_N\) if and only if \(u\in L^2({\mathbb {R}}, (1+x^2)\mathrm {d}x)\) is real-valued, \(\dim _{{\mathbb {C}}}\mathscr {H}_{\mathrm {pp}}(L_u) = N\) and \(\Pi u \in \mathscr {H}_{\mathrm {pp}}(L_u)\). Moreover, we have the following inverse formula

$$\begin{aligned} \Pi u(x) = i \det (x - G|_{\mathscr {H}_{\mathrm {pp}}(L_u)})^{-1} \tfrac{\mathrm {d}}{\mathrm {d}x} \left( \det (x - G|_{\mathscr {H}_{\mathrm {pp}}(L_u)})\right) , \qquad \forall x \in {\mathbb {R}}. \end{aligned}$$
(4.15)

The direct implication is given by Proposition 4.3. Before proving the converse implication of Theorem 4.8, we need to prove the invariance of \(\mathscr {H}_{\mathrm {pp}}(L_u)\) under G, if \(u \in L^2({\mathbb {R}}, (1+x^2)\mathrm {d}x)\) is real-valued, \(\Pi u \in \mathscr {H}_{\mathrm {pp}}(L_u)\) and \(\dim _{{\mathbb {C}}}\mathscr {H}_{\mathrm {pp}}(L_u)=N \ge 1\). We give another version of the commutator formula (see also Lemma 3.1).

Lemma 4.9

Let \(u \in L^2({\mathbb {R}}, (1+x^2)\mathrm {d}x)\) be real-valued and let \(\varphi \in \mathrm {Ker}(\lambda - L_u)\) for some \(\lambda \in \sigma _{\mathrm {pp}}(L_u)\). Then we have \(\varphi , T_u\varphi , L_u\varphi \in \mathbf {D}(G)\) and

$$\begin{aligned} \begin{aligned}&[G, T_{ u}] \varphi = \tfrac{i{\hat{\varphi }}(0^+)}{2\pi }\Pi u , \qquad [G, L_{ u}] \varphi =i\varphi -\tfrac{i{\hat{\varphi }}(0^+)}{2\pi }\Pi u .\\ \end{aligned} \end{aligned}$$
(4.16)

Proof

In Proposition 2.4, we have shown that \(\widehat{u \varphi } \in H^1({\mathbb {R}})\), so \((T_u \varphi )^{\wedge } =\widehat{u \varphi } \mathbf {1}_{{\mathbb {R}}_+} \in H^1(0,+\infty )\) and \(T_u \varphi \in \mathbf {D}(G)\). So \(G\varphi \in H^1_+ = \mathbf {D}(L_u)=\mathbf {D}(T_u)\). Moreover, \({\hat{\varphi }}\) is right-continuous at \(\xi =0^+\) and \({\hat{\varphi }} \in C^1(0, +\infty )\). Let \(\partial ^{w}_{\xi } {\hat{\varphi }} \) denote the weak derivative of \({\hat{\varphi }} \) and let \(\delta _0\) denote the Dirac measure with support \(\{0\}\); then \(\partial ^{w}_{\xi } {\hat{\varphi }} = \mathbf {1}_{{\mathbb {R}}_+^* } \frac{\mathrm {d}}{\mathrm {d}\xi }{\hat{\varphi }} + {\hat{\varphi }}(0^+) \delta _0\) and \(\partial _{\xi }({\hat{u}}*{\hat{\varphi }})= \partial ^{w}_{\xi }({\hat{u}}*{\hat{\varphi }}) ={\hat{u}}*\partial ^{w}_{\xi } {\hat{\varphi }}\) by Lemma 2.5. Since \({\hat{\varphi }} = \mathbf {1}_{{\mathbb {R}}_+^* } {\hat{\varphi }}\) a.e. in \({\mathbb {R}}\) and \({\hat{u}} \in H^1({\mathbb {R}})\), we have \({\hat{u}}* \widehat{G \varphi }(\xi )= {\hat{u}}*[\mathbf {1}_{{\mathbb {R}}_+^* } \widehat{G\varphi }](\xi )\), for every \(\xi >0\) and \(([G, T_u]\varphi )^{\wedge }(\xi ) = \frac{i}{2\pi } \partial _{\xi } ({\hat{u}}* {\hat{\varphi }})( \xi ) - \frac{i}{2\pi } {\hat{u}} * [\mathbf {1}_{{\mathbb {R}}_+^* } \frac{\mathrm {d}}{\mathrm {d}\xi }{\hat{\varphi }} ](\xi ) = \frac{i}{2\pi } {\hat{\varphi }}(0^+) \widehat{u}(\xi )\). The first formula of (4.16) is obtained. Since \(L_u= \mathrm {D}-T_u\), we claim that \(\mathrm {D} \varphi \in \mathbf {D}(G)\). In fact, \(\partial _{\xi } (\mathrm {D} \varphi )^{\wedge } (\xi ) = {\hat{\varphi }}(\xi ) + \xi \partial _{\xi } {\hat{\varphi }}(\xi )\), \(\forall \xi >0\). Thus (2.4) implies that \(\widehat{\mathrm {D}\varphi }\in H^1(0, +\infty )\). Then \(\left( [G, \mathrm {D}]\varphi \right) ^{\wedge }(\xi ) = i \partial _\xi (\xi {\hat{\varphi }})(\xi ) - \xi \cdot i \partial _{\xi } {\hat{\varphi }}(\xi ) = i {\hat{\varphi }}(\xi )\), \(\forall \xi >0\). So we have \([\partial _x, G]= \mathrm {Id}_{L^2_+}\). The second formula of (4.16) holds. \(\quad \square \)
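The second identity in (4.16) can be tested on the one-soliton case. The computation below is a sketch under assumptions: \(\Pi u(x) = \frac{i}{x-\beta }\) with \(\mathrm {Im}\beta <0\), the function \(\varphi (x) = \frac{1}{x-\beta }\) is an eigenfunction of \(L_u\) for some eigenvalue \(\lambda \) (as Proposition 4.3 gives for \(N=1\)), \({\hat{\varphi }}(\xi ) = -2\pi i e^{-i\beta \xi }\) for \(\xi >0\), and G acts as \(i\partial _{\xi }\) on the Fourier side on \((0,+\infty )\), as used in the proof.

$$\begin{aligned} (G\varphi )^{\wedge }(\xi )&= i \partial _{\xi } {\hat{\varphi }}(\xi ) = \beta {\hat{\varphi }}(\xi ), \quad \forall \xi >0, \qquad \mathrm {so} \quad G\varphi = \beta \varphi , \\ [G, L_u]\varphi&= G(\lambda \varphi ) - L_u(\beta \varphi ) = \lambda \beta \varphi - \beta \lambda \varphi = 0, \\ i\varphi - \tfrac{i{\hat{\varphi }}(0^+)}{2\pi }\Pi u&= i \varphi - \tfrac{i (-2\pi i)}{2\pi } \cdot \tfrac{i}{x-\beta } = i\varphi - i \varphi = 0. \end{aligned}$$

Both sides vanish, so (4.16) is consistent in this example. The constant \({\hat{\varphi }}(0^+) = -2\pi i\) is exactly what is needed for the two terms on the right-hand side to cancel.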

Proposition 4.10

If \(u \in L^2({\mathbb {R}}, (1+x^2)\mathrm {d}x)\) is real-valued, \(\dim _{{\mathbb {C}}} \mathscr {H}_{\mathrm {pp}}(L_u) =N \ge 1\) and \(\Pi u \in \mathscr {H}_{\mathrm {pp}}(L_u)\), then we have \(\mathscr {H}_{\mathrm {pp}}(L_u) \subset \mathbf {D}(G)\) and \(G (\mathscr {H}_{\mathrm {pp}}(L_u)) \subset \mathscr {H}_{\mathrm {pp}}(L_u)\).

Proof

There exists an orthonormal basis of the vector space \( \mathscr {H}_{\mathrm {pp}}(L_u)\), denoted by \(\{\psi _1, \psi _2, \ldots , \psi _N\}\), such that \(L_u \psi _j = \lambda _j \psi _j\), where \(\sigma _{\mathrm {pp}}(L_u)=\{\lambda _1, \lambda _2, \ldots , \lambda _N\} \subset (-\infty , 0)\) and \(\lambda _j < \lambda _{j+1}\). Since (2.4) implies that \(\mathscr {H}_{\mathrm {pp}}(L_u) \subset G^{-1}(H^1_+) \bigcap \mathbf {D}(G)\), formula (4.16) gives that \(f_j:=[L_u, G]\psi _j = -i \psi _j + \frac{i {\hat{\psi }}_j(0^+)}{2\pi } \Pi u \in \mathscr {H}_{\mathrm {pp}}(L_u)\). So we have \(\langle f_j, \psi _j\rangle _{L^2} = \langle G \psi _j, L_u\psi _j\rangle _{L^2} - \langle G L_u \psi _j, \psi _j\rangle _{L^2} = \lambda _j (\langle G \psi _j, \psi _j \rangle _{L^2} -\langle G \psi _j, \psi _j \rangle _{L^2})=0\). For every \(j=1,2,\ldots , N\), we set \(g_j:= \sum _{1\le k \le N, k\ne j} \frac{\langle f_j, \psi _k\rangle _{L^2}}{\lambda _k -\lambda _j} \psi _k\). Since \(f_j = \sum _{1\le k \le N, k\ne j} \langle f_j, \psi _k\rangle _{L^2} \psi _k\), we have \((L_u - \lambda _j)g_j =f_j = (L_u - \lambda _j)G \psi _j\). Then \(G \psi _j - g_j \in \mathrm {Ker}(L_u- \lambda _j) = {\mathbb {C}} \psi _j\) and \(G \psi _j \in g_j + {\mathbb {C}} \psi _j \subset \mathscr {H}_{\mathrm {pp}}(L_u)\). We conclude by \(\mathscr {H}_{\mathrm {pp}}(L_u) = \mathrm {Span}_{{\mathbb {C}}}\{\psi _1, \psi _2, \ldots , \psi _N\}\). \(\quad \square \)

We now prove the converse direction of Theorem 4.8 and give the explicit formula of \(Q_u\).

End of the proof of theorem 4.8

\(\Leftarrow \): Proposition 4.10 yields that \(G(\mathscr {H}_{\mathrm {pp}}(L_u)) \subset \mathscr {H}_{\mathrm {pp}}(L_u)\). Let Q denote the characteristic polynomial of the operator \(G|_{\mathscr {H}_{\mathrm {pp}}(L_u)}\), then we have \(\mathscr {H}_{\mathrm {pp}}(L_u) = \frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\) by Lemma 3.3. So \(\Pi u = \frac{\mathrm {P}_0}{Q}\), for some \(\mathrm {P}_0 \in {\mathbb {C}}[X]\) such that \(\deg \mathrm {P}_0 \le N-1\). It remains to show that \(\mathrm {P}_0=iQ'\).

In fact, we have \(L_u (\frac{P}{Q})= (\mathrm {D}-T_{\frac{\mathrm {P}_0}{Q}} -T_{\frac{\overline{\mathrm {P}_0}}{\overline{Q}}})(\frac{P}{Q})= \frac{\mathrm {D}P}{Q} - \Pi (\frac{\overline{\mathrm {P}}_0 P}{\overline{Q}Q}) + \frac{(i Q'-\mathrm {P}_0)P }{Q^2} \in \frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\), for every \(P \in {\mathbb {C}}_{\le N-1}[X]\), thanks to the invariance of \(\mathscr {H}_{\mathrm {pp}}(L_u)\) under \(L_u\). Partial-fraction decomposition implies that \(\Pi (\frac{\overline{\mathrm {P}}_0 P}{\overline{Q}Q}) \in \frac{{\mathbb {C}}_{\le N-1}[X]}{Q}\). So \(\frac{(i Q'-\mathrm {P}_0) P }{Q } \in {\mathbb {C}}_{\le N-1}[X]\) for every \(P \in {\mathbb {C}}_{\le N-1}[X]\). Choosing \(P=\mathbf {1}\), we see that \(Q\) divides \(iQ' - \mathrm {P}_0\); since \(\deg (iQ' - \mathrm {P}_0) \le N-1 < N = \deg Q\), we have \( \mathrm {P}_0 = iQ'\), so \(u \in \mathcal {U}_N\). Since \(Q \in {\mathbb {C}}_N[X]\) is monic and \(Q^{-1}(0) \subset {\mathbb {C}}_-\), we have \(Q_u(x)=Q(x)=\det (x-G|_{\mathscr {H}_{\mathrm {pp}}(L_u)})\). \(\quad \square \)
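The identity \(\Pi u = \frac{iQ'}{Q}\) with \(Q^{-1}(0) \subset {\mathbb {C}}_-\) can be sanity-checked numerically. The following is only a minimal sketch: the function names and the sample roots are hypothetical, and the closed form used for comparison (a sum of Lorentzian profiles) is not stated in this section but follows from \(\frac{Q'}{Q}(x) = \sum _j \frac{1}{x - z_j}\) and \(u = \Pi u + \overline{\Pi u}\) for real-valued \(u\).

```python
def poly_and_derivative(x, roots):
    """Evaluate the monic polynomial Q(x) = prod (x - z) and Q'(x) by the product rule."""
    q, dq = 1 + 0j, 0j
    for z in roots:
        dq = dq * (x - z) + q
        q = q * (x - z)
    return q, dq

def pi_u(x, roots):
    """Pi u(x) = i Q'(x) / Q(x)."""
    q, dq = poly_and_derivative(x, roots)
    return 1j * dq / q

def u_from_roots(x, roots):
    """u = Pi u + conj(Pi u) = 2 Re(Pi u) for real-valued u."""
    return 2 * pi_u(x, roots).real

def lorentzian_sum(x, roots):
    """Sum of 2*eta / ((x - a)^2 + eta^2) over roots z = a - i*eta with eta > 0."""
    return sum(2 * (-z.imag) / ((x - z.real) ** 2 + z.imag ** 2) for z in roots)

# Hypothetical sample: two roots in the open lower half-plane.
sample_roots = [0.5 - 1j, -2 - 0.25j]
sample_value = u_from_roots(0.0, sample_roots)
```

In this sketch the real parts of the roots play the role of translation parameters and the imaginary parts the role of scaling parameters.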

We refer to Proposition 4.10 and formula (4.4) to see the invariance of \(\mathscr {H}_{\mathrm {pp}}(L_u) \subset \mathbf {D}(G)\) under G, \(\forall u \in \mathcal {U}_N\). The translation–scaling parameters of u can be identified as the spectrum of \(G|_{\mathscr {H}_{\mathrm {pp}}(L_u)}\). The matrix representation of \(G|_{\mathscr {H}_{\mathrm {pp}}(L_u)}\) with respect to the orthonormal basis \(\{\varphi _1^u, \varphi _2^u, \ldots , \varphi _N^u\}\) is given in Proposition 5.4.

4.4 The invariance under the Benjamin–Ono flow

Proposition 1.4 is proved in this subsection. First, we show the invariance of the property \(x \mapsto xu(x) \in L^2({\mathbb {R}})\) under the BO flow. Then the spectral characterization Theorem 4.8 is used to establish the global well-posedness of the Hamiltonian system (1.3) on \(\mathcal {U}_N\).

Lemma 4.11

If \(u_0 \in H^{2}({\mathbb {R}}, {\mathbb {R}}) \bigcap L^2({\mathbb {R}}, x^2 \mathrm {d}x)\) and \(u=u(t,x)\) solves the BO equation (1.1) with initial datum \(u(0)=u_0\), then \(u(t) \in L^2({\mathbb {R}}, x^2 \mathrm {d}x)\), for every \(t \in {\mathbb {R}}\).

Remark 4.12

This result can be strengthened by replacing the assumption \(u_0 \in H^{2}({\mathbb {R}}, {\mathbb {R}})\) by the weaker assumption \(u_0 \in H^{\frac{3}{2} +}({\mathbb {R}}, {\mathbb {R}}) = \bigcup _{s > \frac{3}{2}}H^s({\mathbb {R}}, {\mathbb {R}})\), because one can construct a conservation law of (1.1), which controls the \(H^s\)-norm of the solution, \(\forall s>-\frac{1}{2}\), by using the method of perturbation of determinants. We refer to Talbut [22] for details. Lemma 4.11 as stated suffices to prove Proposition 1.4.

Before proving Lemma 4.11, we need some commutator estimates.

Lemma 4.13

For a general locally Lipschitz function \(\chi : {\mathbb {R}} \rightarrow {\mathbb {R}}\) such that \(\partial _x \chi , \partial _x^3 \chi , \partial _x^5 \chi \in L^1({\mathbb {R}})\), we have the following commutator estimates

$$\begin{aligned} \begin{aligned}&\Vert [|\mathrm {D}|, \chi ]g\Vert _{L^2} + \Vert [\partial _x, \chi ]g\Vert _{L^2} \lesssim (\Vert \partial _x \chi \Vert _{L^1}\Vert \partial _x^3 \chi \Vert _{L^1})^{\frac{1}{2}} \Vert g\Vert _{L^2}, \qquad \qquad \forall g \in L^2({\mathbb {R}}),\\&\Vert |\mathrm {D}|[\partial _x , \chi ]g\Vert _{L^2} \lesssim (\Vert \partial _x \chi \Vert _{L^1}\Vert \partial _x^3 \chi \Vert _{L^1})^{\frac{1}{2}} \Vert \partial _x g\Vert _{L^2} + (\Vert \partial _x \chi \Vert _{L^1}\Vert \partial _x^5 \chi \Vert _{L^1})^{\frac{1}{2}} \Vert g\Vert _{L^2}, \qquad \qquad \forall g \in H^1({\mathbb {R}}). \end{aligned} \end{aligned}$$
(4.17)

Proof

Since \(2\pi \big |\left( [|\mathrm {D}|, \chi ]g \right) ^{\wedge } (\xi ) \big | \le \int _{\eta \in {\mathbb {R}}}\big ||\xi | - |\eta | \big | |{\hat{\chi }}(\xi - \eta )| |{\hat{g}}(\eta )| \mathrm {d}\eta \le |\widehat{\partial _x \chi }| * |{\hat{g}}|(\xi )\), Young’s convolution inequality yields that \(\Vert [|\mathrm {D}|, \chi ]g\Vert _{L^2} \lesssim \Vert \widehat{\partial _x \chi }\Vert _{L^1} \Vert g \Vert _{L^2}\). We set \(\mathcal {R}_1=\Vert \partial _x \chi \Vert _{L^1}^{-\frac{1}{2}}\Vert \partial _x^3 \chi \Vert _{L^1}^{\frac{1}{2}}\), then \(\Vert \widehat{\partial _x \chi }\Vert _{L^1} \le \Vert \widehat{\partial _x \chi }\Vert _{L^{\infty }}\int _{|\xi |\le \mathcal {R}_1}\mathrm {d}\xi + \int _{|\xi |> \mathcal {R}_1} \frac{\Vert \widehat{\partial _x^3 \chi }\Vert _{L^{\infty }}}{|\xi |^2}\mathrm {d}\xi \lesssim \Vert \partial _x \chi \Vert _{L^{1}}\mathcal {R}_1 + \frac{\Vert \partial _x^3 \chi \Vert _{L^1}}{\mathcal {R}_1} = 2(\Vert \partial _x \chi \Vert _{L^1}\Vert \partial _x^3 \chi \Vert _{L^1})^{\frac{1}{2}}\). Similarly, we have \(\Vert [\partial _x, \chi ]g\Vert _{L^2} \lesssim \Vert \widehat{\partial _x \chi }\Vert _{L^1} \Vert g \Vert _{L^2} \lesssim (\Vert \partial _x \chi \Vert _{L^1}\Vert \partial _x^3 \chi \Vert _{L^1})^{\frac{1}{2}} \Vert g \Vert _{L^2}\), so the first inequality of (4.17) is obtained. Since \(2\pi \big |\left( |\mathrm {D}|[\partial _x , \chi ]g \right) ^{\wedge } (\xi ) \big | \le |\xi |\int _{\eta \in {\mathbb {R}}} |\xi - \eta | |{\hat{\chi }}(\xi - \eta )| |{\hat{g}}(\eta )| \mathrm {d}\eta \le |\widehat{\partial _x^2 \chi }| * |{\hat{g}}|(\xi ) + |\widehat{\partial _x \chi }| * |\widehat{\partial _x g}|(\xi )\), then \(\Vert |\mathrm {D}|[\partial _x , \chi ]g\Vert _{L^2} \lesssim \Vert \widehat{\partial _x^2 \chi }\Vert _{L^1} \Vert g \Vert _{L^2} + \Vert \widehat{\partial _x \chi }\Vert _{L^1} \Vert \partial _x g \Vert _{L^2}\). 
We set \(\mathcal {R}_2 :=\Vert \partial _x \chi \Vert _{L^1}^{-\frac{1}{4}}\Vert \partial _x^5 \chi \Vert _{L^1}^{\frac{1}{4}}\), then \(\Vert \widehat{\partial _x^2 \chi }\Vert _{L^1} \le \Vert \widehat{\partial _x \chi }\Vert _{L^{\infty }}\int _{|\xi |\le \mathcal {R}_2}|\xi |\mathrm {d}\xi + \int _{|\xi |> \mathcal {R}_2} \frac{\Vert \widehat{\partial _x^5 \chi }\Vert _{L^{\infty }}}{|\xi |^3}\mathrm {d}\xi \lesssim \Vert \partial _x \chi \Vert _{L^{1}}\mathcal {R}_2^2 + \frac{\Vert \partial _x^5 \chi \Vert _{L^1}}{\mathcal {R}_2^2} = 2 (\Vert \partial _x \chi \Vert _{L^1}\Vert \partial _x^5 \chi \Vert _{L^1})^{\frac{1}{2}}\). Finally, we add them together to get the second estimate of (4.17). \(\quad \square \)
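The radii \(\mathcal {R}_1\) and \(\mathcal {R}_2\) are chosen so that the low-frequency and high-frequency contributions are equal. A minimal numerical sketch of this balancing is given below; the values `a`, `b`, `c` are hypothetical stand-ins for \(\Vert \partial _x \chi \Vert _{L^1}\), \(\Vert \partial _x^3 \chi \Vert _{L^1}\) and \(\Vert \partial _x^5 \chi \Vert _{L^1}\), and the function names are illustrative only.

```python
import math

def balanced_first(a, b):
    """Return (R1, a*R1 + b/R1) with R1 = (b/a)**(1/2), as in the first estimate."""
    r1 = math.sqrt(b / a)
    return r1, a * r1 + b / r1

def balanced_second(a, c):
    """Return (R2, a*R2**2 + c/R2**2) with R2 = (c/a)**(1/4), as in the second estimate."""
    r2 = (c / a) ** 0.25
    return r2, a * r2 ** 2 + c / r2 ** 2

# Hypothetical sample values for the three L^1 norms.
a, b, c = 3.0, 0.75, 12.0
r1, v1 = balanced_first(a, b)   # v1 should equal 2*sqrt(a*b)
r2, v2 = balanced_second(a, c)  # v2 should equal 2*sqrt(a*c)
```

By the arithmetic-geometric mean inequality, any other choice of radius gives a value at least as large, which is why these radii are used.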

Now we prove that the property \(x \mapsto xu(x) \in L^2({\mathbb {R}})\) is invariant under the BO flow.

Proof of lemma 4.11

We choose a cut-off function \(\chi \in C^{\infty }_c({\mathbb {R}})\) such that \(\chi \) decreases in \([0, +\infty )\), \(\chi \) is even, \(0 \le \chi \le 1\), \(\chi \equiv 1\) on \([-1, 1]\) and \(\mathrm {supp}(\chi ) \subset [-2, 2]\). If \(u_0 \in H^{2}({\mathbb {R}}, {\mathbb {R}}) \bigcap L^2({\mathbb {R}}, x^2 \mathrm {d}x)\) and \(u : t \in {\mathbb {R}} \mapsto u(t) \in H^2({\mathbb {R}}, {\mathbb {R}})\) solves the BO equation (1.1) with initial datum \(u(0)=u_0\), we claim that there exists a constant \(\mathcal {C}=\mathcal {C}(\Vert u_0\Vert _{H^2})\) such that

$$\begin{aligned}&I(R,t):= \int _{{\mathbb {R}}}\chi ^2(\tfrac{x}{R})|x|^2 |u(t,x)|^2 \mathrm {d}x\nonumber \\&\le e^{\mathcal {C}|t|}\Big ( \int _{{\mathbb {R}}} |x|^2 |u(0,x)|^2 \mathrm {d}x +1\Big ), \qquad \forall t \in {\mathbb {R}}, \quad \forall R>1. \end{aligned}$$
(4.18)

In fact, we define \(\rho (x):=x \chi (x)\). For every \(R>0\), we set \(\rho _R(x):=R \rho (\tfrac{x}{R})=x\chi (\tfrac{x}{R})\). Thus

$$\begin{aligned} \partial _t I(R,t)= 2 \mathrm {Re}\langle \rho ^2_R |\mathrm {D}|\partial _x u(t) - 2\rho ^2_R u(t) \partial _x u(t), u(t) \rangle _{L^2} = \mathcal {J}_1(u(t))+\mathcal {J}_2(u(t)), \end{aligned}$$

where for every \(u \in H^2({\mathbb {R}})\), we define

$$\begin{aligned}&\mathcal {J}_1(u):= -4 \mathrm {Re}\langle \rho ^2_R u \partial _x u, u \rangle _{L^2} \Longrightarrow \nonumber \\&|\mathcal {J}_1(u)| \le 4 \Vert \partial _x u\Vert _{L^{\infty }} \Vert \rho _R u\Vert _{L^2}^2 \lesssim \Vert u\Vert _{H^2}\Vert \rho _R u\Vert _{L^2}^2 \end{aligned}$$
(4.19)

and \(\mathcal {J}_2(u):=2 \mathrm {Re}\langle \rho ^2_R |\mathrm {D}|\partial _x u , u \rangle _{L^2} = \langle [\rho ^2_R, |\mathrm {D}|\partial _x] u , u \rangle _{L^2}\). Since \([\rho _R^2, |\mathrm {D}|\partial _x] =\rho _R [\rho _R , |\mathrm {D}|\partial _x] + [\rho _R , |\mathrm {D}|\partial _x] \rho _R\) and \([\rho _R , |\mathrm {D}|\partial _x] =[\rho _R , |\mathrm {D}|\partial _x]^* = [\rho _R , |\mathrm {D}| ]\partial _x + |\mathrm {D}| [\rho _R , \partial _x ]\), we have

$$\begin{aligned} \begin{aligned} \mathcal {J}_2(u) = 2 \mathrm {Re}\langle [\rho _R , |\mathrm {D}| ]\partial _x u ,\rho _R u \rangle _{L^2} + 2 \mathrm {Re}\langle |\mathrm {D}| [\rho _R , \partial _x ] u ,\rho _R u \rangle _{L^2}. \end{aligned} \end{aligned}$$

Since \(\Vert \partial _x \rho _R\Vert _{L^1}=R \Vert \partial _x \rho \Vert _{L^1}\), \(\Vert \partial _x^3 \rho _R\Vert _{L^1}=R^{-1} \Vert \partial _x^3 \rho \Vert _{L^1}\) and \(\Vert \partial _x^5 \rho _R\Vert _{L^1}=R^{-3} \Vert \partial _x^5 \rho \Vert _{L^1}\), the commutator estimates (4.17) yield that if \(u \in H^2({\mathbb {R}})\), then

$$\begin{aligned} \begin{aligned} |\mathcal {J}_2(u) |&\le 2 \Vert \rho _R u\Vert _{L^2}^2 + \Vert [\rho _R , |\mathrm {D}| ]\partial _x u \Vert _{L^2}^2+ \Vert |\mathrm {D}| [\rho _R , \partial _x ] u\Vert _{L^2}^2 \\&\lesssim \Vert \rho _R u\Vert _{L^2}^2 + \Vert \partial _x \rho _R\Vert _{L^1}\Vert \partial _x^3 \rho _R\Vert _{L^1} \Vert \partial _x u \Vert _{L^2}^2+ \Vert \partial _x \rho _R\Vert _{L^1}\Vert \partial _x^5 \rho _R\Vert _{L^1} \Vert u\Vert _{L^2}^2\\&\lesssim \Vert \rho _R u\Vert _{L^2}^2 + \Vert \partial _x \rho \Vert _{L^1}\Vert \partial _x^3 \rho \Vert _{L^1} \Vert \partial _x u \Vert _{L^2}^2+ R^{-2} \Vert \partial _x \rho \Vert _{L^1}\Vert \partial _x^5 \rho \Vert _{L^1} \Vert u\Vert _{L^2}^2 \\&\lesssim \Vert \rho _R u\Vert _{L^2}^2 +\Vert u\Vert _{H^1}^2 \end{aligned}\nonumber \\ \end{aligned}$$
(4.20)

for every \(R \ge 1\). Propositions 2.9 and 2.12 yield that there exists a conservation law of (1.1) controlling the \(H^2\)-norm of the solution. Let \(u : t \in {\mathbb {R}} \mapsto u(t) \in H^2({\mathbb {R}})\) denote the solution of the BO equation (1.1). Then \(\sup _{t \in {\mathbb {R}}} \Vert u(t)\Vert _{H^2} \lesssim _{\Vert u_0\Vert _{H^2}} 1 \). Since \(I(R,t)=\Vert \rho _R u(t)\Vert _{L^2}^2\), estimates (4.19) and (4.20) imply that \(|\partial _t I(R,t)| \le \mathcal {C} (I(R,t) +1)\), \(\forall t \in {\mathbb {R}}\), for some constant \(\mathcal {C}=\mathcal {C}(\Vert u_0\Vert _{H^2}) \). Thus (4.18) is obtained by Gronwall’s inequality. Letting \(R \rightarrow +\infty \), we conclude by Lebesgue’s monotone convergence theorem. \(\quad \square \)

Since the generating function \(\lambda \in {\mathbb {C}}\backslash \sigma (-L_u) \mapsto \mathcal {H}_{\lambda }(u) \in {\mathbb {C}}\) is the Borel–Cauchy transform of the spectral measure of \(L_u\), the invariance of \(\mathcal {U}_N\) under the BO flow is obtained by the inverse spectral transform.

End of the proof of proposition 1.4

Let \(u_0 \in \mathcal {U}_N \subset H^{\infty }({\mathbb {R}}, {\mathbb {R}}) \bigcap L^2({\mathbb {R}}, x^2 \mathrm {d}x)\) and let \(u=u(t,x)\) denote the solution of the BO equation (1.1) with initial datum \(u(0)=u_0\). Then \(u(t)\in H^{\infty }({\mathbb {R}}, {\mathbb {R}}) \bigcap L^2({\mathbb {R}}, x^2 \mathrm {d}x)\) by Proposition 2.8 and Lemma 4.11. Given \(\lambda \in {\mathbb {C}}\backslash {\mathbb {R}}\), the generating function \(\mathcal {H}_{\lambda } : u \in L^2({\mathbb {R}}, {\mathbb {R}}) \mapsto \mathcal {H}_{\lambda }(u) \in {\mathbb {C}}\) reads as \(\mathcal {H}_{\lambda }(u)=\langle (\lambda +L_u)^{-1} \Pi u, \Pi u\rangle _{L^2} = \int _{{\mathbb {R}}} \frac{\mathrm {d}\mathbf {m}_u(\xi )}{\xi +\lambda }\) with \(\mathbf {m}_u:=\mu _{\Pi u}^{L_u}\), where \(\mu _{\psi }^{L_u}\) denotes the spectral measure of \(L_u\) associated to the function \(\psi \in L^2_+\). So the holomorphic function \(\lambda \in {\mathbb {C}}\backslash {\mathbb {R}} \mapsto \mathcal {H}_{\lambda }(u) \) is the Borel–Cauchy transform of the positive Borel measure \(\mathbf {m}_u\). The total mass \(\mathbf {m}_u({\mathbb {R}})=\Vert \Pi u\Vert _{L^2}^2\) is a conservation law of the BO equation (1.1) by Proposition 2.12 and formula (2.14). Thanks to the Stieltjes inversion formula, every finite Borel measure is uniquely determined by its Borel–Cauchy transform. For every \(t \in {\mathbb {R}}\), we have \(\mathcal {H}_{\lambda }[u(t)]=\mathcal {H}_{\lambda }[u(0)]\) by Proposition 2.15. Since \(u(0) \in \mathcal {U}_N\), we have \(\Pi [u(0)] \in \mathscr {H}_{\mathrm {pp}}(L_{u(0)})\) by Proposition 4.3. Consequently, there exist \(c_1, c_2, \ldots , c_N \in {\mathbb {R}}\backslash \{0\}\) such that \(\mu _{\Pi [u(t)]}^{L_{u(t)}}=\mathbf {m}_{u(t)}=\mathbf {m}_{u(0)}= \mu _{\Pi [u(0)]}^{L_{u(0)}} = \sum _{j=1}^N c_j \delta _{\lambda _j^{u(0)}}\). Then \(\Pi [u(t)] \in \mathscr {H}_{\mathrm {pp}}(L_{u(t)})\), \(\forall t \in {\mathbb {R}}\).
The Lax pair structure yields the unitary equivalence between \(L_{u(t)}\) and \(L_{u(0)}\). So \(\dim _{{\mathbb {C}}}\mathscr {H}_{\mathrm {pp}}(L_{u(t)}) = \dim _{{\mathbb {C}}}\mathscr {H}_{\mathrm {pp}}(L_{u(0)})=N\) by Proposition 2.1. We conclude by Theorem 4.8. \(\quad \square \)

5 The Generalized Action–Angle Coordinates

In this section, we construct the global (generalized) action–angle coordinates \(\Phi _N\) in Theorem 1 of the Hamiltonian system (1.3) with solutions in the real analytic symplectic manifold \((\mathcal {U}_N, \omega )\) of real dimension 2N given in Proposition 1.2. The goal of this section is to establish the diffeomorphism property and the symplectomorphism property of \(\Phi _N\).

Proposition 1.3 yields that the Poisson bracket of two smooth functions \(f, g :\mathcal {U}_N \rightarrow {\mathbb {R}}\) is given by

$$\begin{aligned} \{f,g\} : u \in \mathcal {U}_N \mapsto \omega _u(X_f(u), X_g(u))=\langle \partial _x \nabla _u f(u), \nabla _u g(u) \rangle _{L^2}\in {\mathbb {R}}. \end{aligned}$$
(5.1)

Given \(u\in \mathcal {U}_N\), Proposition 4.3 yields that there exist \(\lambda _1^u<\lambda _2^u< \cdots<\lambda _N^u<0\) and \(\varphi _j^u \in \mathrm {Ker}(\lambda _j^u - L_u) \subset \mathbf {D}(G)\) such that \(\Vert \varphi _j^u\Vert _{L^2}=1\) and \(\langle u, \varphi _j^u \rangle _{L^2} = \sqrt{2\pi |\lambda _j^u|}\), thanks to the spectral analysis in Sect. 4.2.

Definition 5.1

For every \(j=1,2, \ldots , N\), the map \(I_j :u \in \mathcal {U}_N \mapsto 2 \pi \lambda _j^u \in {\mathbb {R}}\) is called the \(j\)th action. The map \(\gamma _j : u \in \mathcal {U}_N \mapsto \mathrm {Re}\langle G \varphi _j^u, \varphi _j^u\rangle _{L^2} \in {\mathbb {R}}\) is called the \(j\)th (generalized) angle.

The set \(\Omega _N\) is defined by (1.13) and we adopt the superscript instead of the subscript in this section: \(\Omega _N=\{(r^1,r^2, \ldots , r^N) \in {\mathbb {R}}^N : r^1< r^2< \cdots< r^N<0\}\). Then the real analytic manifold \((\Omega _N \times {\mathbb {R}}^N, \nu )\) is a symplectic manifold of real dimension 2N, where \(\nu =\sum _{j=1}^N \mathrm {d}r^j \wedge \mathrm {d}\alpha ^j\). The action–angle map is given by \(\Phi _N : u \in \mathcal {U}_N \mapsto (I_1(u), I_2(u), \ldots , I_N(u); \gamma _1(u), \gamma _2(u), \ldots , \gamma _N(u)) \in \Omega _N \times {\mathbb {R}}^N\). Theorem 1 is restated here.

Theorem 5.2

The map \(\Phi _N\) has the following properties:

\((\mathrm {a})\). The map \(\Phi _N: \mathcal {U}_N \rightarrow \Omega _N \times {\mathbb {R}}^N\) is a real analytic diffeomorphism.

\((\mathrm {b})\). The pullback of \(\nu \) by \(\Phi _N\) is \(\omega \), i.e. \(\Phi _N^* \nu = \omega \).

\((\mathrm {c})\). We have \(E\circ \Phi _N^{-1} : (r^1, r^2, \ldots , r^N; \alpha ^1, \alpha ^2, \ldots , \alpha ^N) \in \Omega _N \times {\mathbb {R}}^N \mapsto - \frac{1}{2\pi }\sum _{j=1}^N| r^j|^2 \in (-\infty , 0)\).

Remark 5.3

The real analyticity of \(\Phi _N: \mathcal {U}_N \rightarrow \Omega _N \times {\mathbb {R}}^N\) is given by Proposition 4.6 and Corollary 4.7. The symplectomorphism property \((\mathrm {b})\) is equivalent to the Poisson bracket characterization (1.15). The family \((X_{I_1}, X_{I_2} , \ldots , X_{I_N} ; X_{\gamma _1} , X_{\gamma _2} , \ldots , X_{\gamma _N} )\) is linearly independent in \(\mathfrak {X}(\mathcal {U}_N)\) and we have

$$\begin{aligned} \mathrm {d}\Phi _N(u) : X_{I_k}(u) \mapsto \tfrac{\partial }{\partial \alpha ^k} \big |_{\Phi _N(u)}, \qquad \mathrm {d}\Phi _N(u) : X_{\gamma _k}(u) \mapsto -\tfrac{\partial }{\partial r^k}\big |_{\Phi _N(u)}. \end{aligned}$$

The assertion \((\mathrm {c})\) is obtained by a direct calculation: since \(\Pi u = \sum _{j=1}^N \langle \Pi u, \varphi _j^u \rangle _{L^2}\varphi _j^u \), formula (4.9) yields that \(E(u) = \langle L_u (\Pi u), \Pi u\rangle _{L^2} = \sum _{j=1}^N |\langle \Pi u, \varphi _j^u \rangle _{L^2}|^2 \lambda _j^u =-\sum _{j=1}^N \frac{I_j(u)^2}{2\pi }\).

This section is organized as follows. The matrix associated to \(G|_{\mathscr {H}_{\mathrm {pp}}(L_u)}\) is expressed in terms of actions and angles in Sect. 5.1. In Sect. 5.2, the Poisson brackets of actions and angles are used to prove the local diffeomorphism property of \(\Phi _N\). The bijectivity of \(\Phi _N\) is obtained by Hadamard’s global inverse function theorem in Sect. 5.3. Finally, we use Sects. 5.4 and 5.5 to prove that \(\Phi _N : (\mathcal {U}_N, \omega ) \rightarrow (\Omega _N \times {\mathbb {R}}^N, \nu )\) preserves the symplectic structure.

5.1 The inverse spectral matrix

We continue to study the infinitesimal generator G defined in (3.2) when restricted to the invariant subspace \(\mathscr {H}_{\mathrm {pp}}(L_u)\) with complex dimension N. Then we state a general linear algebra lemma that describes the location of eigenvalues of the operator \(G|_{\mathscr {H}_{\mathrm {pp}}(L_u)}\).

Proposition 5.4

For every \(u \in \mathcal {U}_N\), let \(M(u)=(M_{kj}(u))_{1\le k, j \le N} \in {\mathbb {C}}^{N\times N}\) denote the inverse spectral matrix defined by (1.18) and Definition 5.1. Then M(u) is the matrix associated to the operator \(G|_{\mathscr {H}_{\mathrm {pp}}(L_u)}\) with respect to the basis \(\{\varphi _1^u, \varphi _2^u, \ldots , \varphi _N^u\}\), i.e. \(M_{k j}(u)= \langle G \varphi _j^u, \varphi _k^u \rangle _{L^2}\), \(1\le k, j \le N\).

Proof

Since \(L_u=L_u^*\) and \(\mathscr {H}_{\mathrm {pp}}(L_u) \subset \mathbf {D}(G)\), we have \((\lambda _j^u - \lambda _k^u )\langle G \varphi _j^u, \varphi _k^u \rangle _{L^2} = \langle [G, L_u ]\varphi _j^u, \varphi _k^u \rangle _{L^2}\). Since formulas (2.5) and (4.9) imply that \(-\lambda _j^u \widehat{\varphi _j^u}(0) = \widehat{u \varphi _j^u}(0) =\sqrt{2\pi |\lambda _j^u|}\), we use formula (4.16) to obtain that if k and j are different, then \((\lambda _j^u - \lambda _k^u )\langle G \varphi _j^u, \varphi _k^u \rangle _{L^2} = \langle i \varphi _j^u - \tfrac{i}{2 \pi } \widehat{\varphi _j^u}(0^+)\Pi u, \varphi _k^u \rangle _{L^2} = - \tfrac{i}{2 \pi } \widehat{\varphi _j^u}(0^+) \overline{\widehat{u \varphi _k^u}}(0) = -i \sqrt{\tfrac{\lambda _k^u}{\lambda _j^u}}\). In the case \(k=j\), we have \(\langle G^* f, g \rangle _{L^2} =-\tfrac{i}{2\pi }\int _0^{+\infty }{\hat{f}}(\xi ) \partial _{\xi }\overline{{\hat{g}}}(\xi ) \mathrm {d} \xi = \tfrac{i}{2\pi }\left[ {\hat{f}}(0^+)\overline{{\hat{g}}}(0^+) + \int _0^{+\infty }\partial _{\xi } {\hat{f}}(\xi ) \overline{{\hat{g}}}(\xi ) \mathrm {d} \xi \right] \) and \(\langle G^* f, g \rangle _{L^2} = \langle G f, g \rangle _{L^2} + \frac{i}{2\pi }{\hat{f}}(0^+)\overline{{\hat{g}}}(0^+)\), for any \(f, g \in \mathscr {H}_{\mathrm {pp}}(L_u)\) by using formula (3.2). Consequently, we have \(\mathrm {Im} \langle G \varphi _j^u, \varphi _j^u\rangle _{L^2}= -\frac{1}{4\pi } |\widehat{\varphi _j^u }(0)|^2 = -\frac{1}{2 |\lambda _j^u|} = \mathrm {Im} M_{jj}(u)\). \(\quad \square \)

Corollary 5.5

For every \(u \in \mathcal {U}_N\), we define two vectors \(X(u), Y(u) \in {\mathbb {R}}^N\) as

$$\begin{aligned} X(u)^T = (\sqrt{|\lambda _1^u|}, \sqrt{|\lambda _2^u|}, \ldots , \sqrt{|\lambda _N^u|}), \qquad Y(u)^T = (\sqrt{|\lambda _1^u|^{-1}}, \sqrt{|\lambda _2^u|^{-1}}, \ldots , \sqrt{|\lambda _N^u|^{-1}}). \end{aligned}$$
(5.2)

Then we have the following inverse spectral formula

$$\begin{aligned} \Pi u(x) = -i \langle (M(u)-x)^{-1} X(u), Y(u)\rangle _{{\mathbb {C}}^N}, \qquad \forall x \in {\mathbb {R}}. \end{aligned}$$
(5.3)

Hence, the map \(\Phi _N : \mathcal {U}_N \rightarrow \Omega _N \times {\mathbb {R}}^N\) is injective.

Proof

For any \(k, j =1,2, \ldots , N\), let \(K^u_{kj}(x)\) denote the \({(N-1)\times (N-1)}\) submatrix obtained by deleting the \(k\)th row and \(j\)th column of the matrix \(M(u)-x \), for all \( x \in {\mathbb {R}}\). Cramer’s rule yields that \(\langle (M(u)-x)^{-1} X(u), Y(u)\rangle _{{\mathbb {C}}^N}= \sum _{1\le k,j \le N} \tfrac{(-1)^{k+j} \det (K^u_{kj}(x))}{\det (M(u)-x)} \sqrt{ \tfrac{\lambda _k^u}{\lambda _j^u}} = \tfrac{\sum _{j=1}^N\det (K^u_{j j}(x))+ R}{\det (M(u)-x)}\), where \(R := \sum _{1\le k \ne j \le N} (-1)^{k+j} \det (K^u_{kj}(x))\sqrt{\frac{\lambda _k^u}{\lambda _j^u}}= i (\sum _{j=1}^N \lambda _j^u - \sum _{k=1}^N \lambda _k^u )\det (M(u)-x) = 0\) by (1.18) and Definition 5.1. If \(Q_u(x)= \det (x - M(u))\), then \(Q_u'(x) = (-1)^{N-1} \sum _{j=1}^N\det (K^u_{j j}(x))\). Then formula (5.3) is obtained by formula (4.15). \(\quad \square \)
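For \(N=2\), formula (5.3) can be checked by hand or numerically. The sketch below is only illustrative: it builds the matrix from the entries computed in the proof of Proposition 5.4 (off-diagonal entries \(\frac{i}{\lambda _k - \lambda _j}\sqrt{\frac{\lambda _k}{\lambda _j}}\), diagonal entries \(\gamma _j + \frac{i}{2\lambda _j}\)), and compares \(-i \langle (M-x)^{-1}X, Y\rangle _{{\mathbb {C}}^2}\) with \(\frac{iQ'(x)}{Q(x)}\), \(Q(x)=\det (x-M)\). The function names and the sample values of \(\lambda _j\), \(\gamma _j\) are hypothetical.

```python
import math

def inverse_spectral_matrix(lams, gams):
    """M_jj = gamma_j + i/(2 lambda_j); M_kj = i/(lambda_k - lambda_j) * sqrt(lambda_k/lambda_j)."""
    n = len(lams)
    m = [[0j] * n for _ in range(n)]
    for k in range(n):
        for j in range(n):
            if k == j:
                m[k][j] = gams[j] + 1j / (2 * lams[j])
            else:
                m[k][j] = 1j / (lams[k] - lams[j]) * math.sqrt(lams[k] / lams[j])
    return m

def pi_u_via_matrix(x, lams, gams):
    """-i <(M - x)^{-1} X, Y> for N = 2, using the explicit 2x2 inverse."""
    m = inverse_spectral_matrix(lams, gams)
    a, b, c, d = m[0][0] - x, m[0][1], m[1][0], m[1][1] - x
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    big_x = [math.sqrt(abs(l)) for l in lams]
    big_y = [1 / math.sqrt(abs(l)) for l in lams]
    ax = [inv[k][0] * big_x[0] + inv[k][1] * big_x[1] for k in range(2)]
    # Y is real, so the complex conjugation in the inner product is harmless.
    return -1j * (ax[0] * big_y[0] + ax[1] * big_y[1])

def pi_u_via_polynomial(x, lams, gams):
    """i Q'(x)/Q(x) with Q(x) = det(x - M) for N = 2."""
    m = inverse_spectral_matrix(lams, gams)
    q = (x - m[0][0]) * (x - m[1][1]) - m[0][1] * m[1][0]
    dq = 2 * x - m[0][0] - m[1][1]
    return 1j * dq / q

# Hypothetical sample data: lambda_1 < lambda_2 < 0 and arbitrary real angles.
sample = pi_u_via_matrix(0.4, [-2.0, -0.5], [0.3, -1.2])
```

The two off-diagonal contributions cancel exactly, which is the \(N=2\) instance of the identity \(R=0\) in the proof.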

The next lemma describes the location of the spectrum of all matrices of the form (1.18).

Lemma 5.6

For every \(N\in {\mathbb {N}}_+\), we choose N negative numbers \(\lambda _1<\lambda _2< \cdots<\lambda _N < 0\) and \(\gamma _1, \gamma _2, \ldots , \gamma _N \in {\mathbb {R}}\). The matrix \(\mathcal {M}=(\mathcal {M}_{kj})_{1\le k, j \le N} \in {\mathbb {C}}^{N\times N}\) is defined as \(\mathcal {M}_{jj}=\gamma _j + \frac{i}{2 \lambda _j }\) and \(\mathcal {M}_{kj}=\frac{i}{\lambda _k- \lambda _j} \sqrt{\frac{\lambda _k}{\lambda _j}}\), if \(k \ne j\). Then \(\mathfrak {I}(\mathcal {M}) := \frac{\mathcal {M}-\mathcal {M}^*}{2i}\) is negative semi-definite and \(\sigma _{\mathrm {pp}} (\mathcal {M})\subset {\mathbb {C}}_-\).

Proof

The vector \(V_{\lambda }\in {\mathbb {R}}^N\) is defined as \(V_{\lambda }^T := ((2| \lambda _1|)^{-\frac{1}{2}}, (2 |\lambda _2|)^{-\frac{1}{2}}, \ldots , (2 |\lambda _N|)^{-\frac{1}{2}})\). So we have \(\mathfrak {I}(\mathcal {M}) = \left( -\tfrac{1}{2 \sqrt{|\lambda _j||\lambda _k|}} \right) _{1\le k, j \le N} = - V_{\lambda } \cdot V_{\lambda }^T\). Thus \(\langle (\mathfrak {I}(\mathcal {M}))X, X \rangle _{{\mathbb {C}}^N } = -| \langle X, V_{\lambda }\rangle _{{\mathbb {C}}^N }|^2 \le 0\). So \(\mathfrak {I}(\mathcal {M})\) is negative semi-definite. If \(\mu \in \sigma _{\mathrm {pp}}(\mathcal {M})\) and \(V \in \mathrm {Ker}(\mu - \mathcal {M})\backslash \{0\}\), it suffices to show \(\mathrm {Im}\mu <0\). Since

$$\begin{aligned}&-|\langle V, V_{\lambda } \rangle _{{\mathbb {C}}^N}|^2 = \langle \mathfrak {I}(\mathcal {M})V, V \rangle _{{\mathbb {C}}^N} = \mathrm {Im} \mu \Vert V\Vert _{{\mathbb {C}}^N}^2,\nonumber \\&\qquad \mathrm {where} \quad \Vert V\Vert _{{\mathbb {C}}^N}^2 = \langle V, V \rangle _{{\mathbb {C}}^N} >0, \end{aligned}$$
(5.4)

we have \(\mathrm {Im} \mu \le 0\). Assume that \(\mu \in {\mathbb {R}}\), then formula (5.4) yields that \(V \perp V_{\lambda }\). Moreover, we have \((\mathcal {M}-\mathcal {M}^*) V = -2 i \langle V, V_{\lambda } \rangle _{{\mathbb {C}}^N} V_{\lambda }=0\). We set \(D^{\lambda } \in {\mathbb {C}}^{N\times N}\) to be the diagonal matrix whose diagonal elements are \(\lambda _1, \lambda _2, \ldots ,\lambda _N\), i.e. \(D^{\lambda }=(\lambda _k \delta _{jk})_{1\le k,j\le N}\). Then we have the following formula

$$\begin{aligned}{}[\mathcal {M}, D^{\lambda }] = i(\mathrm {I}_N + 2 D^{\lambda } V_{\lambda } V_{\lambda }^T). \end{aligned}$$
(5.5)

So \([\mathcal {M}, D^{\lambda }]V = iV\) by (5.5). Finally, \(i \Vert V\Vert _{{\mathbb {C}}^N}^2 = \langle (\mathcal {M} - \mu ) D^{\lambda } V, V \rangle _{{\mathbb {C}}^N} = \langle D^{\lambda } V, (\mathcal {M}^* - \mu ) V \rangle _{{\mathbb {C}}^N}=0 \) contradicts the fact that \(V \ne 0\). Consequently, we have \(\mu \in {\mathbb {C}}_-\). \(\quad \square \)
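The two algebraic identities used in this proof, \(\mathfrak {I}(\mathcal {M}) = - V_{\lambda } V_{\lambda }^T\) and formula (5.5), as well as the conclusion \(\sigma _{\mathrm {pp}}(\mathcal {M}) \subset {\mathbb {C}}_-\), can be checked numerically on small examples. The following is a minimal sketch with hypothetical function names and sample values; the eigenvalue check is restricted to \(N=2\) so that only the quadratic formula is needed.

```python
import cmath
import math

def build_matrix(lams, gams):
    """Matrix of Lemma 5.6: gamma_j + i/(2 lam_j) on the diagonal, i/(lam_k - lam_j)*sqrt(lam_k/lam_j) off it."""
    n = len(lams)
    return [[gams[j] + 1j / (2 * lams[j]) if k == j
             else 1j / (lams[k] - lams[j]) * math.sqrt(lams[k] / lams[j])
             for j in range(n)] for k in range(n)]

def imaginary_part(m):
    """(M - M^*) / (2i), entry by entry."""
    n = len(m)
    return [[(m[k][j] - m[j][k].conjugate()) / 2j for j in range(n)]
            for k in range(n)]

def commutator_with_diagonal(m, lams):
    """[M, D] with D = diag(lams): entry (k, j) equals M_kj * (lam_j - lam_k)."""
    n = len(m)
    return [[m[k][j] * (lams[j] - lams[k]) for j in range(n)]
            for k in range(n)]

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial of a 2x2 matrix."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

# Hypothetical sample: both eigenvalues should have negative imaginary part.
mu_plus, mu_minus = eigenvalues_2x2(build_matrix([-2.0, -0.5], [0.3, -1.2]))
```

Such a check does not replace the proof; it only confirms the sign conventions on concrete data.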

5.2 Poisson brackets

In this subsection, the Poisson bracket defined in (5.1) is generalized in order to obtain the first two formulas of (1.15). It can be defined between a smooth function from \(\mathcal {U}_N\) to an arbitrary Banach space and another smooth function from \(\mathcal {U}_N\) to \({\mathbb {R}}\). For every smooth function \(f : \mathcal {U}_N \rightarrow {\mathbb {R}}\), its Hamiltonian vector field \(X_f \in \mathfrak {X}(\mathcal {U}_N)\) is given by (1.12). For any Banach space \(\mathcal {E}\) and any smooth map \(F: u \in \mathcal {U}_N \mapsto F(u) \in \mathcal {E}\), we define the Poisson bracket of f and F as follows

$$\begin{aligned} \{f, F \} : u \in \mathcal {U}_N \mapsto \{f, F \} ( u) : = \mathrm {d} F(u)(X_{f}(u)) \in \mathcal {T}_{F(u)}(\mathcal {E})=\mathcal {E}. \end{aligned}$$
(5.6)

If \(\mathcal {E}={\mathbb {R}}\), then the definition in formula (5.6) coincides with (5.1). For every \(u \in \mathcal {U}_N\) and \(\lambda \in {\mathbb {C}}\backslash \sigma (-L_u)\), since \(\Pi u = \sum _{j=1}^N \langle \Pi u , \varphi _j^{u}\rangle _{L^2} \varphi _j^{u}\), the generating functional

$$\begin{aligned} \mathcal {H}_{\lambda }(u)= \langle (L_u + \lambda )^{-1} \Pi u, \Pi u\rangle _{L^2} = -\sum _{j=1}^N \frac{2\pi \lambda _j^u}{\lambda + \lambda _j^u} \end{aligned}$$
(5.7)

is well defined. Analytic continuation allows us to extend the map \(\lambda \mapsto \mathcal {H}_{\lambda }(u)\) to the domain \( {\mathbb {C}}\backslash \sigma _{\mathrm {pp}}(-L_u)\), and it has simple poles at every \(\lambda = -\lambda _j^u\). Proposition 2.3 yields that \(-\frac{ \Vert u\Vert _{L^2}^2}{4C^4} \le \lambda _1^u< \cdots< \lambda _N^u < 0\), where \(C=\inf _{f \in H^1_+ \backslash \{0\}}\frac{\Vert |\mathrm {D}|^{\frac{1}{4}}f\Vert _{L^2}}{\Vert f\Vert _{L^4}}\) denotes the Sobolev constant. So we introduce

$$\begin{aligned} \mathcal {Y}=\{(\lambda , u)\in {\mathbb {R}}\times \mathcal {U}_N : 4 C^4 \lambda > \Vert u\Vert _{L^2}^2 \} = \mathcal {X}\bigcap \left( {\mathbb {R}}\times \mathcal {U}_N \right) , \end{aligned}$$
(5.8)

where \(\mathcal {X}\) is given by Definition 2.14. Then \(\mathcal {Y}\) is open in \({\mathbb {R}}\times \mathcal {U}_N\) and \(\mathcal {H}: (\lambda , u)\in \mathcal {Y} \mapsto -\sum _{j=1}^N \frac{2\pi \lambda _j^u}{\lambda + \lambda _j^u} \in {\mathbb {R}}\) is real analytic by Proposition 4.6. The Fréchet derivative of \(\mathcal {H}_{\lambda }\) is given by (2.16), so

$$\begin{aligned} X_{\mathcal {H}_{\lambda }}(u)= \partial _x \nabla _u \mathcal {H}_{\lambda }(u)=\partial _x (|w_{\lambda }(u) |^2 +w_{\lambda } (u) + \overline{w}_{\lambda }(u)), \qquad \forall (\lambda , u)\in \mathcal {Y}, \end{aligned}$$
(5.9)

by formula (1.12), where \(w_{\lambda }(u)= (L_u+ \lambda )^{-1} (\Pi u)\). The following proposition restates the Lax pair structure of the Hamiltonian equation associated to \(\mathcal {H}_{\lambda }\). Even though the stability of \(\mathcal {U}_N\) under the Hamiltonian flow of \(\mathcal {H}_{\lambda }\) remains an open problem, the Poisson bracket defined in (5.6) provides an algebraic method to obtain the first two formulas of (1.15).

Proposition 5.7

Given \((\lambda , u)\in \mathcal {Y}\) defined by (5.8), we have \(\{\mathcal {H}_{\lambda },L\} (u) = [B^{\lambda }_u, L_u]\) and

$$\begin{aligned} \{\mathcal {H}_{\lambda }, \lambda _j\} (u) =0, \qquad \{\mathcal {H}_{\lambda }, \gamma _j\} (u) = \mathrm {Re}\langle [G, B^{\lambda }_{u }] \varphi _j^{u }, \varphi _j^{u } \rangle _{L^2}= - \tfrac{\lambda }{(\lambda + \lambda _j^{u} )^2} , \end{aligned}$$
(5.10)

for every \(j =1,2,\ldots , N\), where \(B^{\lambda }_u=i(T_{w_{\lambda }(u)}T_{\overline{w}_{\lambda }(u)} + T_{w_{\lambda }(u)}+ T_{\overline{w}_{\lambda }(u)})\).

Proof

Since \(L: u \in L^2({\mathbb {R}}, {\mathbb {R}}) \mapsto L_u = \mathrm {D}- T_u \in \mathfrak {B}(H^1_+ , L^2_+)\) is \({\mathbb {R}}\)-affine, we have \(\mathrm {d}L(u)(h) = -T_h\), \(\forall u, h \in L^2({\mathbb {R}}, {\mathbb {R}})\). If \((\lambda , u)\in \mathcal {Y}\), then the \({\mathbb {C}}\)-linear transformation \(L_u+\lambda \in \mathfrak {B}(H^1_+, L^2_+)\) is bijective. So formula (5.9) yields that \(\{\mathcal {H}_{\lambda }, L\}(u)= {\mathrm {d}}L(u) (X_{\mathcal {H}_{\lambda }}(u)) = - T_{{\partial _{x}} (|w_{\lambda }(u)|^2 + w_{\lambda }(u) + \overline{w}_{\lambda }(u))}\). Then identity (2.22) yields the Lax equation for the Hamiltonian flow of the generating function \(\mathcal {H}_{\lambda }\), i.e.

$$\begin{aligned} \{\mathcal {H}_{\lambda },L\} (u) = [B^{\lambda }_u, L_u] \in \mathfrak {B}(H^1_+, L^2_+). \end{aligned}$$
(5.11)

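Consider the map \(L \varphi _j : u \in \mathcal {U}_N \mapsto L_u \varphi _j^u = \lambda _j^u \varphi _j^u \in H^1_+\). Applying the Poisson bracket (5.6) with \(\mathcal {H}_{\lambda }\) to both expressions of this map and using the Leibniz rule, we have, for every \((\lambda , u)\in \mathcal {Y}\),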

$$\begin{aligned} \{\mathcal {H}_{\lambda },L\} (u) \varphi _j^u+L_u \left( \{\mathcal {H}_{\lambda },\varphi _j\} (u)\right) = \lambda _j^u\{\mathcal {H}_{\lambda },\varphi _j\} (u) + \{\mathcal {H}_{\lambda },\lambda _j\} (u) \varphi _j^u \in H^1_+. \end{aligned}$$

Then (5.11) yields \((\lambda _j^u-L_u) \left( B_u^{\lambda } \varphi _j^u - \{\mathcal {H}_{\lambda },\varphi _j\} (u) \right) = \{\mathcal {H}_{\lambda },\lambda _j\} (u) \varphi _j^u\). Since \(\varphi _j^u \in \mathrm {Ker}( \lambda _j^u-L_u)\) and \(\Vert \varphi _j^u\Vert _{L^2}=1\) by (4.9), we have \(\{\mathcal {H}_{\lambda },\lambda _j\} (u) = \langle (\lambda _j^u-L_u) \left( B_u^{\lambda } \varphi _j^u - \{\mathcal {H}_{\lambda },\varphi _j\} (u) \right) , \varphi _j^u \rangle _{L^2}=0\). We define \(\mathcal {N}_2 : \varphi \in L^2 \mapsto \Vert \varphi \Vert _{L^2}^2\), then \(\mathcal {N}_2 \circ \varphi _j \equiv 1\) on \(\mathcal {U}_N\). Then we have

$$\begin{aligned} 0=\mathrm {d}(\mathcal {N}_2 \circ \varphi _j)(u)\left( X_{\mathcal {H}_{\lambda }}(u)\right) =2 \mathrm {Re}\langle \varphi _j^u,\{\mathcal {H}_{\lambda },\varphi _j\} (u) \rangle _{L^2}. \end{aligned}$$
(5.12)

Hence \(B_u^{\lambda } \varphi _j^u - \{\mathcal {H}_{\lambda },\varphi _j\} (u) \in \mathrm {Ker}( \lambda _j^u-L_u) = {\mathbb {C}}\varphi _j^u\) by Corollary 2.7. Since \(B_u^{\lambda }\) is a skew-adjoint operator on \(L^2_+\), we have \(\mathrm {Re}\langle B_u^{\lambda } \varphi _j^u, \varphi _j^u \rangle _{L^2}=0\), which together with formula (5.12) shows that there exists \(r \in {\mathbb {R}}\) such that \(B_u^{\lambda } \varphi _j^u - \{\mathcal {H}_{\lambda },\varphi _j\} (u) = ir \varphi _j^u\). Since \(\gamma _j = \mathrm {Re}\langle G\varphi _j^u, \varphi _j^u \rangle _{L^2}\), the two terms involving \(r\) cancel and we have \(\{\mathcal {H}_{\lambda }, \gamma _j\} (u) = \mathrm {Re}\left( \langle G \{\mathcal {H}_{\lambda },\varphi _j\} (u), \varphi _j^{u } \rangle _{L^2}+ \langle G \varphi _j^{u } , \{\mathcal {H}_{\lambda },\varphi _j\} (u)\rangle _{L^2}\right) = \mathrm {Re}\langle [G, B^{\lambda }_{u }] \varphi _j^{u }, \varphi _j^{u } \rangle _{L^2}\). Furthermore, for every \((\lambda ,u) \in \mathcal {Y}\), formula (3.3) implies that \([G, T_{\overline{w}_{\lambda }(u)}] =0\) and

$$\begin{aligned}{}[G, B^{\lambda }_{u}]f = i[G, T_{w_{\lambda }(u)}](T_{\overline{w}_{\lambda }(u)}(f) + f) = -\tfrac{1}{2 \pi } [(\overline{w}_{\lambda }(u) f)^{\wedge } (0^+) + {\hat{f}}(0^+) ]w_{\lambda }(u), \quad \forall f \in \mathbf {D}(G).\nonumber \\ \end{aligned}$$
(5.13)

Since \((\overline{w}_{\lambda }(u) \varphi _j^u)^{\wedge } (0^+) = \langle \varphi _j^u, w_{\lambda }(u) \rangle _{L^2} = (\lambda + \lambda _j^u)^{-1} \overline{\langle u, \varphi _j^u \rangle }_{L^2}\) and \(\overline{\langle u, \varphi _j^u \rangle }_{L^2}=-\lambda _j^u \widehat{\varphi _j^u} (0^+)\), we replace f by \(\varphi _j^u\) in formula (5.13) to obtain \(\langle [G, B^{\lambda }_{u}]\varphi _j^{u} , \varphi _j^{u } \rangle _{L^2} = - \frac{\lambda }{(\lambda +\lambda _j^u)^2}\), \(\forall (\lambda ,u) \in \mathcal {Y}\). \(\quad \square \)
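The substitution in the last step can be traced term by term. The following is a sketch for real \(\lambda \), writing \(c := \widehat{\varphi _j^u}(0^+)\) and assuming the normalization \(|c|^2 = -\frac{2\pi }{\lambda _j^u}\) (equivalently \(|\langle u, \varphi _j^u \rangle _{L^2}|^2 = 2\pi |\lambda _j^u|\)). This normalization is not restated in this part of the text; it is the value that the stated result requires.

$$\begin{aligned} (\overline{w}_{\lambda }(u) \varphi _j^u)^{\wedge } (0^+) + c&= \Big ( 1 - \frac{\lambda _j^u}{\lambda + \lambda _j^u} \Big ) c = \frac{\lambda }{\lambda + \lambda _j^u}\, c, \\ \langle w_{\lambda }(u), \varphi _j^u \rangle _{L^2}&= \overline{\langle \varphi _j^u, w_{\lambda }(u) \rangle }_{L^2} = -\frac{\lambda _j^u}{\lambda + \lambda _j^u}\, \overline{c}, \\ \langle [G, B^{\lambda }_{u}]\varphi _j^{u} , \varphi _j^{u } \rangle _{L^2}&= -\frac{1}{2\pi } \cdot \frac{\lambda }{\lambda + \lambda _j^u}\, c \cdot \Big ( -\frac{\lambda _j^u}{\lambda + \lambda _j^u}\, \overline{c} \Big ) = \frac{\lambda \lambda _j^u |c|^2}{2\pi (\lambda + \lambda _j^u)^2} = -\frac{\lambda }{(\lambda + \lambda _j^u)^2}. \end{aligned}$$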

Remark 5.8

Recall that \(\tilde{\mathcal {H}}_{\epsilon }=\frac{1}{\epsilon }\mathcal {H}_{\frac{1}{\epsilon }} \) and \({\tilde{B}}_{\epsilon ,u} :=\frac{1}{\epsilon }B_u^{\frac{1}{\epsilon }}\) in Remark 2.18, \(\forall (\epsilon ^{-1}, u)\in \mathcal {Y}\). In general, the identity \((-1)^n\) \(\{E_n, \gamma _j\}(u)= \mathrm {Re}\langle [G, \tfrac{\mathrm {d}^n}{\mathrm {d}\epsilon ^n}\big |_{\epsilon =0}{\tilde{B}}_{\epsilon , u}] \varphi _j^u,\varphi _j^u\rangle _{L^2}\) holds for every conservation law \(E_n = (-1)^n\frac{\mathrm {d}^n}{\mathrm {d}\epsilon ^n}\big |_{\epsilon =0}\tilde{\mathcal {H}}_{\epsilon }\) in the BO hierarchy, \(\forall 1\le j\le N\).

Corollary 5.9

For any \(j, k = 1,2, \ldots , N\), we have \(2\pi \{\lambda _j, \gamma _k\}(u) =\delta _{kj}\), \(\{\lambda _k, \lambda _j\}(u)=0\), \(\forall u \in \mathcal {U}_N\).

Proof

Given \(u \in \mathcal {U}_N\), \(\forall \lambda > \frac{ \Vert u\Vert _{L^2}^2}{4 C^4}\), we have \((\lambda ,u) \in \mathcal {Y}\), then formula (5.7) and formula (5.10) imply that \(-\frac{\lambda }{(\lambda +\lambda _j^u)^2}= \{\mathcal {H}_{\lambda }, \gamma _j\}(u)= 2\pi \sum _{k=1}^N \{\frac{\lambda }{\lambda +\lambda _k}, \gamma _j\}(u) = -2\pi \lambda \sum _{k=1}^N \frac{\{\lambda _k, \gamma _j\}(u)}{(\lambda +\lambda _k^u)^2}\) and \(0 = \{\mathcal {H}_{\lambda }, \lambda _j\}(u) = 2\pi \lambda \sum _{k=1}^N \frac{\{\lambda _k, \lambda _j\}(u)}{(\lambda + \lambda _k^u)^2}\), \(\forall j=1,2,\ldots ,N\). The uniqueness of analytic continuation yields that \(-\frac{z}{(z+\lambda _j^u)^2} = -2\pi z \sum _{k=1}^N \frac{\{\lambda _k, \gamma _j\}(u)}{(z +\lambda _k^u)^2}\) and \(\sum _{k=1}^N \frac{\{\lambda _k, \lambda _j\}(u)}{(z + \lambda _k^u)^2}=0\), \(\forall z \in {\mathbb {C}}\backslash {\mathbb {R}}\). \(\quad \square \)
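The passage from the two rational identities to the brackets can be made explicit. The following is a sketch, assuming that the eigenvalues \(\lambda _1^u, \lambda _2^u, \ldots , \lambda _N^u\) are pairwise distinct, so that the functions \(z \mapsto (z+\lambda _k^u)^{-2}\) have double poles at distinct points. For any constants \(c_1, c_2, \ldots , c_N \in {\mathbb {C}}\) (a notation introduced only for this remark) and every \(m=1,2,\ldots ,N\),

$$\begin{aligned} \sum _{k=1}^N \frac{c_k}{(z+\lambda _k^u)^2} = 0, \quad \forall z \in {\mathbb {C}}\backslash {\mathbb {R}} \quad \Longrightarrow \quad c_m = \lim _{z \rightarrow -\lambda _m^u} (z+\lambda _m^u)^2 \sum _{k=1}^N \frac{c_k}{(z+\lambda _k^u)^2} = 0. \end{aligned}$$

Dividing the first identity by \(-z\) and choosing \(c_k = 2\pi \{\lambda _k, \gamma _j\}(u) - \delta _{kj}\) gives \(2\pi \{\lambda _k, \gamma _j\}(u) = \delta _{kj}\); choosing \(c_k = \{\lambda _k, \lambda _j\}(u)\) in the second identity gives \(\{\lambda _k, \lambda _j\}(u)=0\).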

Recall that the actions \(I_j : u\in \mathcal {U}_N \mapsto 2\pi \lambda _j^u\) and the generalized angles \(\gamma _j: u\in \mathcal {U}_N \mapsto \mathrm {Re}\langle G\varphi _j^u, \varphi _j^u\rangle _{L^2}\) are both real analytic functions by Proposition 4.6 and Corollary 4.7.

Proposition 5.10

Given \(u \in \mathcal {U}_N\), the family \(\{\mathrm {d}I_1(u),\mathrm {d}I_2(u), \ldots , \mathrm {d}I_N(u);\mathrm {d}\gamma _1(u),\mathrm {d}\gamma _2(u), \ldots , \mathrm {d}\gamma _N(u) \}\) is linearly independent in the cotangent space \(\mathcal {T}^*_u(\mathcal {U}_N)\). As a consequence, \(\Phi _N : \mathcal {U}_N \rightarrow \Omega _N \times {\mathbb {R}}^N\) is a local diffeomorphism.

Proof

Let \(a_1, a_2, \ldots , a_N, b_1, b_2, \ldots , b_N \in {\mathbb {R}}\) be such that \(\sum _{j=1}^N (a_j\mathrm {d}I_j(u)+ b_j\mathrm {d}\gamma _j(u))(h)=0\), \(\forall h \in \mathcal {T}_u (\mathcal {U}_N)\). Corollary 5.9 yields that \(\forall j,k=1,2,\ldots , N\), we have \(\mathrm {d}I_j(u)(X_{I_k}(u))=\{I_k, I_j\}(u)=0\) and \(\mathrm {d}\gamma _j(u)(X_{I_k}(u))=\{I_k, \gamma _j\}(u)=\delta _{jk}\). Replacing h by \(X_{I_k}(u)\), we obtain \(b_k=0\). Then, setting \(h=X_{\gamma _k}(u)\), we obtain \(a_k=0\). Hence the 2N differentials are linearly independent. Since \(\dim _{{\mathbb {R}}}\mathcal {U}_N = 2N\), the differential \(\mathrm {d}\Phi _N(u)\) is invertible and the inverse function theorem shows that \(\Phi _N\) is a local diffeomorphism. \(\quad \square \)
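The two substitutions in this proof can be written out as follows. The sketch uses the convention \(\mathrm {d}f(u)(X_g(u)) = \{g, f\}(u)\) that appears in the proof, together with the relations \(\{I_k, I_j\}=0\) and \(\{I_k, \gamma _j\}=\delta _{jk}\) from Corollary 5.9.

$$\begin{aligned} h=X_{I_k}(u):&\quad 0 = \sum _{j=1}^N \big ( a_j \{I_k, I_j\}(u) + b_j \{I_k, \gamma _j\}(u) \big ) = b_k, \\ h=X_{\gamma _k}(u):&\quad 0 = \sum _{j=1}^N \big ( a_j \{\gamma _k, I_j\}(u) + b_j \{\gamma _k, \gamma _j\}(u) \big ) = -a_k. \end{aligned}$$

The second line uses \(b_1 = b_2 = \cdots = b_N = 0\) from the first, so no information on the brackets \(\{\gamma _k, \gamma _j\}\) is needed at this stage.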

Since all the actions \((I_j)_{1\le j\le N}\) are in involution by Corollary 5.9 and the differentials \((\mathrm {d}I_j(u))_{1\le j \le N}\) are linearly independent for any \(u \in \mathcal {U}_N\), the level set \(\mathcal {L}_{\mathbf {r}} := \bigcap _{j=1}^N I_j^{-1}(r^j)\) is a real analytic Lagrangian submanifold of \(\mathcal {U}_N\), \(\forall \mathbf {r} =(r^1, r^2, \ldots ,r^N) \in \Omega _N\). Moreover, \(\mathcal {L}_{\mathbf {r}}\) is invariant under the Hamiltonian flow of \(I_j\), \(\forall j=1,2, \ldots , N\), by the Arnold–Liouville theorem.

5.3 The diffeomorphism property

This subsection is dedicated to proving the real bi-analyticity of \(\Phi _N : \mathcal {U}_N \rightarrow \Omega _N \times {\mathbb {R}}^N\). It remains to show its surjectivity. The proof is based on Hadamard’s global inverse function theorem.

Theorem 5.11

(Hadamard) Suppose X and Y are connected smooth manifolds, then every proper local diffeomorphism \(F : X \rightarrow Y\) is surjective. If Y is simply connected in addition, then every proper local diffeomorphism \(F : X \rightarrow Y\) is a diffeomorphism.

Lemma 5.12

The map \(\Phi _N : \mathcal {U}_N \rightarrow \Omega _N \times {\mathbb {R}}^N\) is proper.

Proof

Let K be compact in \(\Omega _N \times {\mathbb {R}}^N\) and let \((u_n)_{n\in {\mathbb {N}}}\) be a sequence in \(\Phi _N^{-1}(K)\), so

$$\begin{aligned} \Phi _N(u_n)=(2\pi \lambda _1^{u_n}, 2\pi \lambda _2^{u_n}, \ldots , 2\pi \lambda _N^{u_n}; \gamma _1(u_n), \gamma _2(u_n), \ldots , \gamma _N(u_n)) \in K, \qquad \forall n \in {\mathbb {N}}. \end{aligned}$$

Since K is compact, there exists \((2\pi \lambda _1, 2\pi \lambda _2, \ldots ,2\pi \lambda _N; \gamma _1, \gamma _2, \ldots , \gamma _N) \in K\) such that, up to a subsequence, \(\lambda _j^{u_n} \rightarrow \lambda _j\) and \(\gamma _j(u_n)\rightarrow \gamma _j\). So \((M(u_n))_{n\in {\mathbb {N}}}\) converges to some matrix \(M = (M_{kj})_{1\le k , j \le N} \in {\mathbb {C}}^{N\times N}\) whose coefficients are defined by \(M_{kj}=\frac{i}{\lambda _k- \lambda _j} \sqrt{\frac{|\lambda _k|}{|\lambda _j|}}\), if \(k\ne j\); \(M_{jj}=\gamma _j - \frac{i}{2|\lambda _j|}\), \(\forall 1\le j, k \le N\). Lemma 5.6 yields that \(\sigma _{\mathrm {pp}}(M) \subset {\mathbb {C}}_-\). We set \(Q(x):=\det (x-M)\) and \(u=i\frac{Q'}{Q} - i \frac{\overline{Q}'}{\overline{Q}} \in \mathcal {U}_N\). The Viète map \(\mathbf {V}\) is defined in (4.1) and \(\mathbf {V}({\mathbb {C}}_-^N)\) is open in \({\mathbb {C}}^N\). Then there exist \(\mathbf {a}^{(n)}=(a^{(n)}_0, a^{(n)}_1, \ldots , a^{(n)}_{N-1})\), \(\mathbf {a}=(a_0, a_1, \ldots , a_{N-1}) \in \mathbf {V}({\mathbb {C}}_-^N)\) such that \(Q_n(x)=\det (x-M(u_n)) = \sum _{j=0}^{N-1} a^{(n)}_j x^j+ x^N\) and \(Q(x)=\sum _{j=0}^{N-1} a_j x^j+ x^N\). We have \(\lim _{n\rightarrow +\infty } Q_n(x) = Q(x)\), \(\forall x \in {\mathbb {R}}\). So \(\lim _{n \rightarrow +\infty }\mathbf {a}^{(n)} = \mathbf {a}\). The continuity of \(\Gamma _N : \mathbf {a}= (a_0, a_1, \ldots , a_{N-1}) \in \mathbf {V}({\mathbb {C}}_-^N) \mapsto \Pi u = i\frac{Q'}{Q} \in L^2_+\) yields that \(\Pi u_n \rightarrow \Pi u\) in \(L^2_+\), as \(n \rightarrow +\infty \). Since \(\mathcal {U}_N\) inherits the subspace topology of \(L^2({\mathbb {R}},{\mathbb {R}})\), the sequence \((u_n)_{n\in {\mathbb {N}}}\) converges to u in \(\mathcal {U}_N\). The continuity of the map \(\Phi _N\) shows that \(\Phi _N(u) =(2\pi \lambda _1, 2\pi \lambda _2, \ldots ,2\pi \lambda _N; \gamma _1, \gamma _2, \ldots , \gamma _N) \in K\). Hence \(u \in \Phi _N^{-1}(K)\), so \(\Phi _N^{-1}(K)\) is sequentially compact, hence compact. \(\quad \square \)
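The limit matrix M of this proof is easy to probe numerically. The following Python sketch builds M from sample data, checks that its eigenvalues lie in \({\mathbb {C}}_-\) as Lemma 5.6 asserts, and evaluates \(u=i\frac{Q'}{Q} - i \frac{\overline{Q}'}{\overline{Q}}\) through the roots of Q. The values of `lam` and `gamma` and the function names are hypothetical choices made for illustration only; this is a sanity check on one sample, not part of the argument.

```python
import numpy as np

def limit_matrix(lam, gamma):
    """M_kj = i/(lam_k - lam_j) * sqrt(|lam_k|/|lam_j|) for k != j,
    M_jj = gamma_j - i/(2|lam_j|), as in the proof of Lemma 5.12."""
    lam = np.asarray(lam, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    n = lam.size
    M = np.zeros((n, n), dtype=complex)
    for k in range(n):
        for j in range(n):
            if k != j:
                M[k, j] = 1j / (lam[k] - lam[j]) * np.sqrt(abs(lam[k]) / abs(lam[j]))
        M[k, k] = gamma[k] - 1j / (2 * abs(lam[k]))
    return M

def soliton_profile(M, x):
    """u = i Q'/Q - i conj(Q'/Q) with Q(x) = det(x - M).
    Q'/Q is the sum of 1/(x - z) over the roots z of Q (eigenvalues of M)."""
    roots = np.linalg.eigvals(M)
    x = np.asarray(x, dtype=float)
    log_derivative = np.sum(1.0 / (x[:, None] - roots[None, :]), axis=1)
    return (1j * log_derivative - 1j * np.conj(log_derivative)).real

lam = [-3.0, -2.0, -0.5]   # hypothetical negative, pairwise distinct eigenvalues
gamma = [0.7, -1.2, 0.1]   # hypothetical generalized angles
M = limit_matrix(lam, gamma)
eigs = np.linalg.eigvals(M)          # roots of Q
x = np.linspace(-50.0, 50.0, 2001)
u = soliton_profile(M, x)            # real-valued profile on the grid
```

Each root \(z = a - i\eta \) with \(\eta >0\) contributes \(\frac{2\eta }{(x-a)^2+\eta ^2}\) to u, so the sampled profile should be positive wherever the eigenvalues lie in the open lower half-plane.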

Proposition 5.13

The map \(\Phi _N : \mathcal {U}_N \rightarrow \Omega _N \times {\mathbb {R}}^N\) is a real analytic diffeomorphism.

5.4 A Lagrangian submanifold

In general, the symplectomorphism property of \(\Phi _N\) is equivalent to its Poisson bracket characterization (1.15). The first two formulas of (1.15), which are given in Corollary 5.9, lead us to focus on the study of a special Lagrangian submanifold of \(\mathcal {U}_N\), denoted by

$$\begin{aligned} \Lambda _N := \{u \in \mathcal {U}_N : \gamma _j(u)=0, \quad \forall j =1,2, \ldots , N\}. \end{aligned}$$
(5.14)

Lemma 5.14

For every \(u \in \mathcal {U}_N\), each of the following four properties implies the others:

  1. (a)

    The N-soliton \( u \in \Lambda _N\).

  2. (b)

    For every \(x \in {\mathbb {R}}\), we have \(\overline{\Pi u}(x) = \Pi u(-x)\).

  3. (c)

    The N-soliton u is an even function \({\mathbb {R}} \rightarrow {\mathbb {R}}\).

  4. (d)

    The Fourier transform \({\hat{u}}\) is real-valued.

Proof

\(\mathrm {(a)} \Rightarrow \mathrm {(b)}\) is obtained by (5.3) and (5.2). \(\mathrm {(b)} \Rightarrow \mathrm {(c)}\) is given by the formula \(u = \Pi u + \overline{\Pi u}\). \(\mathrm {(c)} \Rightarrow \mathrm {(d)}\) is given by \(\overline{u}(x)=u(x)=u(-x)\). Finally, \(\mathrm {(d)} \Rightarrow \mathrm {(a)}\): fix \(\lambda \in \sigma _{\mathrm {pp}}(L_u) = \{\lambda _1^u,\lambda _2^u,\ldots , \lambda _N^u \}\) and \(\varphi \in \mathrm {Ker}(\lambda -L_u)\). Since both u and its Fourier transform \({\hat{u}}\) are real-valued, we have \([(\overline{\varphi })^{\vee }]^{\wedge }(\xi )= \overline{{\hat{\varphi }}(\xi )}\), where \((\overline{\varphi })^{\vee }(x) := \overline{\varphi (-x)}\), \(\forall x,\xi \in {\mathbb {R}}\). Since \(T_u((\overline{\varphi })^{\vee }) =( \overline{T_u \varphi })^{\vee }\), we have \((\overline{\varphi })^{\vee }\in \mathrm {Ker}(\lambda -L_u)\). We choose the orthonormal basis \(\{\varphi _1^u, \varphi _2^u, \ldots , \varphi _N^u\}\) in \(\mathscr {H}_{\mathrm {pp}}(L_u)\) as in formula (4.9). Corollary 2.7 yields that \(\dim _{{\mathbb {C}}}\mathrm {Ker}(\lambda -L_u)=1\). There exists \({\tilde{\theta }}_j \in {\mathbb {R}}\) such that \((\overline{\varphi _j^u})^{\vee } = e^{i{\tilde{\theta }}_j}\varphi _j^u \Leftrightarrow \overline{(\varphi _j^u)^{\wedge }}(\xi ) = e^{i{\tilde{\theta }}_j}(\varphi _j^u)^{\wedge }(\xi )\), \(\forall \xi \in {\mathbb {R}}\), \(\forall j =1,2, \ldots , N\). So we set \(\phi _j^u:=\exp ({\tfrac{i{\tilde{\theta }}_j}{2}})\varphi _j^u\), then its Fourier transform \((\phi _j^u)^{\wedge }\) is a real-valued function. Hence \(\gamma _j(u) = \mathrm {Re}\langle G\phi _j^u, \phi _j^u\rangle _{L^2({\mathbb {R}})} = -\frac{1}{2\pi } \mathrm {Im}\langle \partial _{\xi }[(\phi _j^u)^{\wedge }] , (\phi _j^u)^{\wedge }\rangle _{L^2(0, +\infty )}=0\). \(\quad \square \)
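The phase normalization at the end of this proof can be checked in one line. Writing \(g := (\varphi _j^u)^{\wedge }\) and \(\theta := {\tilde{\theta }}_j\) (shorthand used only here), the relation \(\overline{g(\xi )} = e^{i\theta } g(\xi )\) gives

$$\begin{aligned} \overline{e^{\frac{i\theta }{2}} g(\xi )} = e^{-\frac{i\theta }{2}}\, \overline{g(\xi )} = e^{-\frac{i\theta }{2}} e^{i\theta } g(\xi ) = e^{\frac{i\theta }{2}} g(\xi ), \qquad \forall \xi \in {\mathbb {R}}, \end{aligned}$$

so \((\phi _j^u)^{\wedge } = e^{\frac{i\theta }{2}} g\) is real-valued. Moreover, \(\mathrm {Re}\langle G (e^{i\alpha }\varphi ), e^{i\alpha }\varphi \rangle _{L^2} = \mathrm {Re}\langle G \varphi , \varphi \rangle _{L^2}\) for every \(\alpha \in {\mathbb {R}}\), which is why \(\gamma _j(u)\) may be computed with \(\phi _j^u\) in place of \(\varphi _j^u\). Finally, for a real-valued \((\phi _j^u)^{\wedge }\) the pairing \(\langle \partial _{\xi }[(\phi _j^u)^{\wedge }] , (\phi _j^u)^{\wedge }\rangle _{L^2(0, +\infty )}\) is a real number, so its imaginary part vanishes.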

Lemma 5.15

The level set \(\Lambda _N\) is a real analytic Lagrangian submanifold of \((\mathcal {U}_N, \omega )\).

Proof

The map \(\mathbf {\gamma } : u \in \mathcal {U}_N \mapsto (\gamma _1(u), \gamma _2(u), \ldots , \gamma _N(u)) \in {\mathbb {R}}^N\) is a real analytic submersion by Proposition 5.10. So the level set \(\Lambda _N\) is a properly embedded real analytic submanifold of \(\mathcal {U}_N\) and \(\dim _{{\mathbb {R}}} \Lambda _N =N\). The classification of the tangent space \(\mathcal {T}_u(\mathcal {U}_N)\) is given by Proposition 1.2. If \(u(x)= \sum _{j=1}^N \frac{2\eta _j}{ x^2 + \eta _j^2 }\), for some \(\eta _j >0\), then we have \(\mathcal {T}_u (\Lambda _N) = \bigoplus _{j=1}^N {\mathbb {R}}f_j^u\), where \(f_{j}^u(x)= \tfrac{2[x^2 - \eta _j^2]}{[x^2 + \eta _j^2]^2}\). We have \((f^u_j)^{\wedge }( \xi )=-2\pi |\xi | e^{-\eta _j |\xi |}\). Then by definition of \(\omega \), we have \(\omega _u (h_1, h_2) = \frac{i}{2\pi } \int _{{\mathbb {R}}} \frac{{\hat{h}}_1(\xi ) \overline{{\hat{h}}_2(\xi )}}{\xi } \mathrm {d}\xi = \frac{i}{2\pi } \int _{{\mathbb {R}}} \frac{{\hat{h}}_1(\xi ) {\hat{h}}_2(\xi )}{\xi } \mathrm {d}\xi \in i{\mathbb {R}}\), \(\forall h_1, h_2 \in \mathcal {T}_u (\Lambda _N)\). Since the symplectic form \(\omega \) is real-valued, we have \(\omega _u (h_1, h_2)= 0\), for every \(h_1, h_2 \in \mathcal {T}_u (\Lambda _N)\). Since \(\dim _{{\mathbb {R}}}(\Lambda _N)=N = \frac{1}{2} \dim _{{\mathbb {R}}}\mathcal {U}_N\), \(\Lambda _N\) is a Lagrangian submanifold of \(\mathcal {U}_N\). \(\quad \square \)
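The Fourier transform used in this proof can be recovered by differentiating in the parameter \(\eta _j\). The following is a sketch, assuming the convention \({\hat{f}}(\xi ) = \int _{{\mathbb {R}}} f(x) e^{-ix\xi }\mathrm {d}x\), which is consistent with the factor \(\frac{1}{2\pi }\) in the formula for \(\omega \).

$$\begin{aligned} \Big ( \frac{2\eta _j}{x^2+\eta _j^2} \Big )^{\wedge }(\xi )&= 2\pi e^{-\eta _j|\xi |}, \\ f_j^u(x) = \partial _{\eta _j} \Big ( \frac{2\eta _j}{x^2+\eta _j^2} \Big ) = \frac{2[x^2-\eta _j^2]}{[x^2+\eta _j^2]^2} \quad&\Longrightarrow \quad (f_j^u)^{\wedge }(\xi ) = \partial _{\eta _j} \big ( 2\pi e^{-\eta _j|\xi |} \big ) = -2\pi |\xi | e^{-\eta _j |\xi |}, \\ \omega _u(f_j^u, f_k^u)&= \frac{i}{2\pi } \int _{{\mathbb {R}}} \frac{4\pi ^2 |\xi |^2 e^{-(\eta _j+\eta _k)|\xi |}}{\xi } \mathrm {d}\xi = 2\pi i \int _{{\mathbb {R}}} \xi e^{-(\eta _j+\eta _k)|\xi |} \mathrm {d}\xi = 0. \end{aligned}$$

The last integral vanishes because the integrand is odd, which gives the isotropy of \(\mathcal {T}_u(\Lambda _N)\) at such u directly on the basis \((f_j^u)_{1\le j\le N}\).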

5.5 The symplectomorphism property

Finally, we prove the assertion \((\mathrm {b})\) in Theorem 5.2, i.e. the map \(\Phi _N : (\mathcal {U}_N, \omega ) \rightarrow (\Omega _N \times {\mathbb {R}}^N, \nu )\) is symplectic. We set \(\Psi _N = \Phi _N^{-1} : \Omega _N \times {\mathbb {R}}^N \rightarrow \mathcal {U}_N\) and let \(\Psi _N^* \omega \) denote the pullback by \(\Psi _N\) of the symplectic form \(\omega \) defined by (1.22). The goal of this subsection is to prove that

$$\begin{aligned} {\tilde{\nu }} := \Psi _N^* \omega - \nu =0. \end{aligned}$$
(5.15)

Lemma 5.16

For every \(u \in \mathcal {U}_N\), set \(p=\Phi _N(u) \in \Omega _N \times {\mathbb {R}}^N\). Then we have

$$\begin{aligned} \mathrm {d}\Phi _N (u) (X_{I_k}(u)) = \tfrac{\partial }{\partial \alpha ^k}\big |_p, \qquad \forall k=1,2, \ldots , N. \end{aligned}$$
(5.16)

Proof

Fix \(u \in \mathcal {U}_N\) and \(p=\Phi _N(u)\), \(\forall h \in \mathcal {T}_u (\mathcal {U}_N)\), we have \(\mathrm {d}\Phi _N (u) (h) \in \mathcal {T}_{p} (\Omega _N \times {\mathbb {R}}^N)\). For every smooth function \(f : \mathbf {p}=(r^1, r^2, \ldots , r^N; \alpha ^1, \alpha ^2, \ldots , \alpha ^N) \in \Omega _N \times {\mathbb {R}}^N \mapsto f( \mathbf {p}) \in {\mathbb {R}}\), we have \(\left( \mathrm {d}\Phi _N (u) (h) \right) f =\mathrm {d}(f \circ \Phi _N)(u) (h) = \sum _{j=1}^N ( \mathrm {d} I_j(u)(h) \tfrac{\partial f}{\partial r^j}\big |_p + \mathrm {d} \gamma _j(u)(h)\tfrac{\partial f}{\partial \alpha ^j}\big |_p )\). For every \(k=1,2, \ldots , N\), we replace h by \(X_{I_k}(u)\in \mathcal {T}_u (\mathcal {U}_N)\), thus Corollary 5.9 yields that \(\frac{\partial f}{ \partial \alpha ^k} \big |_p = \left( \mathrm {d}\Phi _N (u) (X_{I_k}(u)) \right) f\). \(\quad \square \)

Lemma 5.17

For every \(1 \le j<k \le N\), there exists a smooth function \(c_{jk} \in C^{\infty }(\Omega _N \times {\mathbb {R}}^N)\) such that

$$\begin{aligned} {\tilde{\nu }} = \sum _{1 \le j <k \le N} c_{jk} \mathrm {d}r^j \wedge \mathrm {d}r^k, \qquad \frac{\partial c_{jk}}{\partial \alpha ^l}\Big |_p = 0, \quad \forall j,k, l=1,2,\ldots , N, \end{aligned}$$
(5.17)

for every \(p =(r^1, r^2, \ldots , r^N; \alpha ^1, \alpha ^2, \ldots , \alpha ^N)\in \Omega _N \times {\mathbb {R}}^N\).

Proof

The proof is divided into three steps. The first step is to prove that for every \(p \in \Omega _N \times {\mathbb {R}}^N \) and every \(V \in \mathcal {T}_p(\Omega _N \times {\mathbb {R}}^N)\),

$$\begin{aligned} {\tilde{\nu }}_p (\tfrac{\partial }{\partial \alpha ^l}\big |_p, V)= 0, \qquad \forall l=1,2, \ldots , N. \end{aligned}$$
(5.18)

In fact, let \(u=\Psi _N (p) \in \mathcal {U}_N\) and \(p=(r^1, r^2, \ldots , r^N; \alpha ^1, \alpha ^2, \ldots , \alpha ^N)\), so \(r^l=r^l(p) = I_l \circ \Psi _N(p)\). Then we have \((\Psi _N^* \omega )_p(\tfrac{\partial }{\partial \alpha ^l}\big |_p, V) = \omega _u (\mathrm {d}\Psi _N(p) \left( \tfrac{\partial }{\partial \alpha ^l}\big |_p \right) ,\mathrm {d}\Psi _N(p) (V) ) = \omega _u (X_{I_l}(u), \mathrm {d}\Psi _N(p) (V) )\) by (5.16). Thus \((\Psi _N^* \omega )_p(\tfrac{\partial }{\partial \alpha ^l}\big |_p, V) =-\mathrm {d}I_l(u)(\mathrm {d}\Psi _N(p) (V)) = -\mathrm {d}(I_l\circ \Psi _N)(p)(V)\). On the other hand, \(\nu _p(\tfrac{\partial }{\partial \alpha ^l}\big |_p, V) = \sum _{j=1}^N ( \mathrm {d}r^j \wedge \mathrm {d}\alpha ^j) (\tfrac{\partial }{\partial \alpha ^l}\big |_p, V ) = - \mathrm {d}r^l(p)(V)\). Thus (5.18) is obtained by \({\tilde{\nu }} = \Psi _N^* \omega - \nu \).

Since we have \({\tilde{\nu }} = \sum _{1 \le j <k \le N} (a_{jk}\mathrm {d}\alpha ^j \wedge \mathrm {d}\alpha ^k + b_{jk}\mathrm {d}r^j \wedge \mathrm {d}\alpha ^k + c_{jk}\mathrm {d}r^j \wedge \mathrm {d}r^k)\), for some smooth functions \(a_{jk}, b_{jk}, c_{jk}\in C^{\infty }(\Omega _N \times {\mathbb {R}}^N)\), the second step is to prove that \(a_{jk}=b_{jk} = 0\) on \(\Omega _N \times {\mathbb {R}}^N\), for every \(1\le j <k \le N\). In fact, we have \(\mathrm {d}r^j \wedge \mathrm {d}r^k (\tfrac{\partial }{\partial \alpha ^l}\big |_p, V)= 0\), \(\mathrm {d}r^j \wedge \mathrm {d}\alpha ^k (\tfrac{\partial }{\partial \alpha ^l}\big |_p, V) = -\delta _{kl} \mathrm {d}r^j(p)(V)\) and \(\mathrm {d}\alpha ^j \wedge \mathrm {d}\alpha ^k (\tfrac{\partial }{\partial \alpha ^l}\big |_p, V) =\delta _{jl}\mathrm {d}\alpha ^k(p)(V) - \delta _{kl}\mathrm {d}\alpha ^j(p)(V)\). Let \(l\in \{2, \ldots , N\}\) be fixed, \(\forall 1\le j <k \le N\),

$$\begin{aligned}&\sum _{1 \le l<k \le N}a_{lk}\mathrm {d}\alpha ^k(p)(V) - \sum _{1 \le j <l \le N}(a_{jl}\mathrm {d}\alpha ^j(p)(V)+b_{jl}\mathrm {d}r^j(p)(V))\nonumber \\&={\tilde{\nu }}_p(\tfrac{\partial }{\partial \alpha ^l}\big |_p, V)=0. \end{aligned}$$
(5.19)

Replacing V by \(\tfrac{\partial }{\partial r^j}\big |_p\) and \(\tfrac{\partial }{\partial \alpha ^j}\big |_p\) respectively in (5.19), we obtain \(a_{jl}=b_{jl}=0\), \(\forall 1\le j \le l-1\).

It remains to show that \(c_{jk}\) depends only on \(r^1, r^2, \ldots , r^N\), for every \(1\le j <k \le N\). The symplectic form \(\omega \) is closed by Proposition 1.3 and \(\nu = \mathrm {d} \kappa \) is exact, where \(\kappa = \sum _{j=1}^N r^j \mathrm {d}\alpha ^j\). So \(\mathrm {d} {\tilde{\nu }} = \Psi _N^* (\mathrm {d} \omega ) =0\). Precisely, we have \(\sum _{1 \le j <k \le N} \sum _{l =1}^N \left( \frac{\partial c_{jk} }{\partial \alpha ^l} \mathrm {d}\alpha ^l \wedge \mathrm {d}r^j \wedge \mathrm {d}r^k + \frac{\partial c_{jk} }{\partial r^l} \mathrm {d}r^l \wedge \mathrm {d}r^j \wedge \mathrm {d}r^k \right) =0\). Since the family \(\{\mathrm {d}r^j \wedge \mathrm {d}r^k \wedge \mathrm {d}\alpha ^l\}_{1\le j<k \le N, 1\le l \le N} \bigcup \{ \mathrm {d}r^j \wedge \mathrm {d}r^k \wedge \mathrm {d}r^l \}_{1\le j<k<l \le N}\) is linearly independent in \(\varvec{\Omega }^3(\Omega _N \times {\mathbb {R}}^N)\), we have \( \frac{\partial c_{jk} }{\partial \alpha ^l} = 0\), for any \(1\le j<k \le N\) and \(l=1,2, \ldots , N\). \(\quad \square \)

Since the 2-form \({\tilde{\nu }}\) is independent of \(\alpha ^1, \alpha ^2, \ldots , \alpha ^N\), it suffices to consider points \(p=(\mathbf {r}, \varvec{\alpha }) \in \Omega _N \times {\mathbb {R}}^N\) with \(\varvec{\alpha }=0\). We shall prove \({\tilde{\nu }}=0\) by introducing the Lagrangian submanifold \(\Omega _N\times \{0_{{\mathbb {R}}^N}\}\).

End of the proof of formula (5.15)

We have \(\Omega _N\times \{0_{{\mathbb {R}}^N}\}= \Phi _N(\Lambda _N)\), where \(\Lambda _N\) is the Lagrangian submanifold of \((\mathcal {U}_N, \omega )\) defined by (5.14). If \(q \in \Omega _N\times \{0_{{\mathbb {R}}^N}\}\), set \(v = \Psi _N(q) \in \Lambda _N\), we have

$$\begin{aligned} \mathcal {T}_q(\Omega _N\times \{0_{{\mathbb {R}}^N}\}) = \bigoplus _{j=1}^N {\mathbb {R}}\frac{\partial }{\partial r^j}\Big |_q = \mathrm {d} \Phi _N(v) (\mathcal {T}_v(\Lambda _N )). \end{aligned}$$
(5.20)

For any point \(p =(r^1,r^2, \ldots , r^N;\alpha ^1, \alpha ^2, \ldots , \alpha ^N) \in \Omega _N \times {\mathbb {R}}^N\) and \(\forall V_1, V_2 \in \mathcal {T}_p(\Omega _N \times {\mathbb {R}}^N)\), where \(V_m = \sum _{j=1}^N \left( a_j^{(m)}\tfrac{\partial }{\partial r^j}\big |_p + b_j^{(m)}\tfrac{\partial }{\partial \alpha ^j}\big |_p \right) \), \(a_j^{(m)}, b_j^{(m)} \in {\mathbb {R}}\), \(m=1,2\), we choose \(q=(r^1,r^2, \ldots , r^N;0, 0, \ldots , 0) \in \Omega _N \times \{0_{{\mathbb {R}}^N}\}\) and \(W_1, W_2 \in \mathcal {T}_q(\Omega _N \times \{0_{{\mathbb {R}}^N}\})\), where \(W_m = \sum _{j=1}^N a_j^{(m)}\tfrac{\partial }{\partial r^j}\big |_q\), \(m=1,2\). We set \(v = \Psi _N(q) \in \Lambda _N\). Since \(c_{jk}(p)=c_{jk}(q)\), (5.17) yields that \({\tilde{\nu }}_p(V_1,V_2)= \sum _{1\le j<k\le N}(a_j^{(1)}a_k^{(2)}-a_k^{(1)}a_j^{(2)})c_{jk} (p) = {\tilde{\nu }}_q(W_1,W_2) = \omega _v(\mathrm {d}\Psi _N(q)( W_1), \mathrm {d}\Psi _N(q)( W_2))\), because we have \(\nu _q(W_1, W_2)=0\). The identification (5.20) yields that \(h_m:= \mathrm {d}\Psi _N (q)(W_m) \in \mathcal {T}_v(\Lambda _N)\), for \(m=1,2\). Consequently, we have \({\tilde{\nu }}_p(V_1,V_2) = \omega _v(h_1, h_2)=0\). \(\quad \square \)

6 Asymptotic Approximation

This section is dedicated to describing the asymptotic behavior of the multi-soliton solutions of (1.1).

Proof of Corollary 1.11

Given \(u \in \mathcal {U}_N\), we define \(\mathfrak {M}(u)= (M_{jj}(u)\delta _{kj})_{1\le k, j\le N}\), where \(M_{jj}\) is given in (1.18). Given \((t,x) \in {\mathbb {R}}^2\), we set \(\mathfrak {A}=\mathfrak {A}(u,t, x):=\mathfrak {M}(u)-x-\frac{t}{\pi } \mathfrak {V}(u)\), where \(\mathfrak {V}\) is given in Corollary 1.10. Then \(\mathfrak {A}(u,t,x)^{-1}=(a_j (u,t,x) \delta _{kj})_{1\le k,j \le N}\), where \(a_j(u,t,x)^{-1} := \gamma _j(u) -x-\frac{t}{\pi }I_j(u) + \frac{\pi i}{I_j(u)} \). We set \(\mathfrak {K}(u):=M(u)-\mathfrak {M}(u)\), then \(\forall u_0 \in \mathcal {U}_N\), we have \(u_{\infty }(t,x; u_0) = 2\mathrm {Im} \langle \mathfrak {A}(u_0,t,x)^{-1} X(u_0), Y(u_0)\rangle _{{\mathbb {C}}^N}\). If \(u : t \in {\mathbb {R}} \mapsto u(t) \in \mathcal {U}_N \) solves the BO equation (1.1) such that \(u(0)=u_0\) and |t| is large, then

$$\begin{aligned} \begin{aligned} u(t,x)=&u(t,x; u_0)= 2\mathrm {Im} \langle \left( \mathfrak {A}(u_0,t,x)+\mathfrak {K}(u_0)\right) ^{-1} X(u_0), Y(u_0)\rangle _{{\mathbb {C}}^N} \\ =&u_{\infty }(t,x; u_0) + 2 \mathrm {Im} \sum _{n\ge 2} \langle \left( -\mathfrak {A}(u_0,t,x)^{-1}\mathfrak {K}(u_0) \right) ^n \mathfrak {A}(u_0,t,x)^{-1} X(u_0), Y(u_0)\rangle _{{\mathbb {C}}^N}. \end{aligned} \end{aligned}$$
(6.1)

Given \(R>0\), we have \(\Vert \mathfrak {A}(u_0,t)^{-1}\Vert _{L_x^{\infty }(-R,R)} \le \sum _{j=1}^N \Bigg ( (\frac{|I_j(u_0)|}{\pi }|t| -R-|\gamma _j(u_0)|)^2 + \frac{\pi ^2}{I_j(u_0)^2}\Bigg )^{-\frac{1}{2}} \rightarrow 0\), when \(|t|\rightarrow +\infty \). So there exists \(\mathfrak {T}(u_0, R, N)>0\) such that \(2N^2\Vert \mathfrak {A}(u_0,t)^{-1}\Vert _{L_x^{\infty }(-R,R)}\Vert \mathfrak {K}(u_0) \Vert _{{\mathbb {C}}^{N\times N}}< 1\), if \(|t| \ge \mathfrak {T}(u_0, R, N)\). Moreover, \(\Vert \mathfrak {A}(u_0,t)^{-1}\Vert _{L_x^2({\mathbb {R}})}^2 \le \pi \sum _{j=1}^N \mathbf {k}_j(u_0)\). Then (6.1) yields that

$$\begin{aligned} \begin{aligned} \Vert u(t)-u_{\infty }(t)\Vert _{L_x^2(-R,R)} \lesssim _{u_0, N}&\sum _{n\ge 2} \Vert \left( -\mathfrak {A}(u_0,t)^{-1}\mathfrak {K}(u_0) \right) ^n \mathfrak {A}(u_0,t)^{-1}\Vert _{L_x^2(-R,R)} \\ \lesssim _{u_0, N}&\Vert \mathfrak {A}(u_0,t)^{-1}\Vert _{L_x^{\infty }(-R,R)}^2 \Vert \mathfrak {K}(u_0) \Vert _{{\mathbb {C}}^{N\times N}}^2 \Vert \mathfrak {A}(u_0,t)^{-1}\Vert _{L_x^2({\mathbb {R}})} \rightarrow 0 \end{aligned} \end{aligned}$$

as \(|t|\rightarrow +\infty \). Given \(x \in {\mathbb {R}}\), similarly, there exists \(\mathfrak {T}'(u_0, x, N)>0\) such that the series of functions \(t \in [\mathfrak {T}'(u_0, x, N), +\infty ) \mapsto 2t^2 \mathrm {Im} \sum _{n\ge 2} \langle \left( -\mathfrak {A}(u_0,t,x)^{-1}\mathfrak {K}(u_0) \right) ^n \mathfrak {A}(u_0,t,x)^{-1} X(u_0), Y(u_0)\rangle _{{\mathbb {C}}^N} \in {\mathbb {C}}\) converges uniformly. Since \(\lim _{t \rightarrow \pm \infty } t^2u_{\infty }(t,x)=\sum _{j=1}^N \frac{2}{\mathbf {k}_j(u_0)^3}\) and \(\lim _{t \rightarrow \pm \infty } t \mathfrak {A}(u_0,t,x)^{-1}=-\pi \mathfrak {V}(u_0)^{-1}\), we have \(\frac{u(t,x)}{u_{\infty }(t,x)} =1+ 2t^2 \mathrm {Im} \sum _{n\ge 2} \langle \left( -\mathfrak {A}(u_0,t,x)^{-1}\mathfrak {K}(u_0) \right) ^n \mathfrak {A}(u_0,t,x)^{-1} X(u_0), Y(u_0)\rangle _{{\mathbb {C}}^N}(t^2 u_{\infty }(t,x) )^{-1} \rightarrow 1\), as \(|t|\rightarrow +\infty \) by formula (6.1). \(\quad \square \)
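The decay of \(\mathfrak {A}(u_0,t)^{-1}\) on \((-R,R)\) used in this proof can be made explicit entry by entry. The following is a sketch, assuming \(I_j(u_0) \ne 0\) and taking \(|t| \ge \frac{\pi (R+|\gamma _j(u_0)|)}{|I_j(u_0)|}\) so that the lower bound below is nonnegative. For every \(|x| < R\),

$$\begin{aligned} |a_j(u_0,t,x)|&= \Big ( \big ( \gamma _j(u_0) - x - \tfrac{t}{\pi } I_j(u_0) \big )^2 + \frac{\pi ^2}{I_j(u_0)^2} \Big )^{-\frac{1}{2}}, \\ \big | \gamma _j(u_0) - x - \tfrac{t}{\pi } I_j(u_0) \big |&\ge \frac{|I_j(u_0)|}{\pi } |t| - R - |\gamma _j(u_0)| \ge 0, \\ |a_j(u_0,t,x)|&\le \Big ( \big ( \tfrac{|I_j(u_0)|}{\pi } |t| - R - |\gamma _j(u_0)| \big )^2 + \frac{\pi ^2}{I_j(u_0)^2} \Big )^{-\frac{1}{2}} \rightarrow 0, \quad |t| \rightarrow +\infty . \end{aligned}$$

Since \(\mathfrak {A}(u_0,t,x)^{-1}\) is diagonal with entries \(a_j(u_0,t,x)\), summing these bounds over \(j=1,2,\ldots ,N\) gives the \(L_x^{\infty }(-R,R)\) estimate stated in the proof.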