1 Introduction and Main Results

The FPUT chain with N particles is the system with Hamiltonian

$$\begin{aligned} H_{F}(\mathbf{p},\mathbf{q})= \sum _{j=0}^{N-1}\frac{p_j^2}{2} + \sum _{j=0}^{N-1} V_F(q_{j+1}-q_{j}) \ , \quad V_F(x) = \frac{x^2}{2} - \frac{x^3}{6}+ {\mathtt b}\frac{x^4}{24} \ , \end{aligned}$$
(1.1)

which we consider with periodic boundary conditions \(q_N=q_0\,\), \(p_N=p_0\) and \({\mathtt b}> 0\). We observe that any generic nearest-neighbor quartic potential can be put in the form of \(V_F(x)\) through a canonical change of coordinates. Over the last 60 years the FPUT system has been the object of intense numerical and analytical research. Nowadays it is well understood that the system displays, on a relatively short time scale, an integrable-like behavior, first uncovered by Fermi, Pasta, Ulam and Tsingou [13, 14] and later interpreted in terms of closeness to a nonlinear integrable system by some authors, e.g. the Korteweg-de Vries (KdV) equation by Zabusky and Kruskal [41], the Boussinesq equation by Zakharov [42], and the Toda chain by Manakov first [31], and then by Ferguson, Flaschka and McLaughlin [12]. On larger time scales the system displays instead an ergodic behavior and approaches its micro-canonical equilibrium state (i.e. measure), unless the energy is so low as to enter a KAM-like regime [25, 26, 36].

In the present work we show that a family of first integrals of the Toda system are adiabatic invariants (namely almost constant quantities) for the FPUT system. We bound their variation for times of order \(\beta \), where \(\beta \) is the inverse of the temperature of the chain. Such estimates hold for a large set of initial data with respect to the Gibbs measure of the chain and they are uniform in the number of particles, thus they persist in the thermodynamic limit.

In the last few years, there has been a lot of activity on the problem of constructing adiabatic invariants of nonlinear chain systems in the thermodynamic limit, see [8, 9, 18, 19, 29, 30]. In particular, adiabatic invariants in measure for the FPUT chain have been recently introduced by Maiocchi, Bambusi, Carati [29] by considering the FPUT chain as a perturbation of the linear harmonic chain. Our approach is based on the remark [12, 31] that the FPUT chain (1.1) can be regarded as a perturbation of the (nonlinear) Toda chain [39]

$$\begin{aligned} H_{T}(\mathbf{p},\mathbf{q}):=\frac{1}{2} \sum _{j=0}^{N-1}{p_j^2} + \sum _{j=0}^{N-1}V_T(q_{j+1}-q_{j} )\ , \quad V_T(x) = e^{- x} + x - 1\, , \end{aligned}$$
(1.2)

which we consider again with periodic boundary conditions \(q_N=q_0\,\), \(p_N=p_0\). The equations of motion of (1.1) and (1.2) take the form

$$\begin{aligned} \dot{q}_j=\dfrac{\partial H}{\partial p_j}=p_j,\quad \dot{p}_j=-\dfrac{\partial H}{\partial q_j}=V'(q_{j+1}-q_{j}) -V'(q_{j}-q_{j-1}), \;\;j=0,\dots ,N-1,\nonumber \\ \end{aligned}$$
(1.3)

where H stands for \(H_F\) or \(H_T\) and V for \(V_F\) and \(V_T\) respectively.
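As an illustration of (1.3), the following minimal Python sketch integrates both chains numerically. The leapfrog (Störmer-Verlet) scheme, the step size and the initial data are assumptions made for this example and are not taken from the paper; only the potentials (1.1), (1.2) and the periodic boundary conditions come from the text.

```python
import math

def V_fput(x, b=1.0):
    """FPUT potential V_F from (1.1)."""
    return x**2 / 2 - x**3 / 6 + b * x**4 / 24

def dV_fput(x, b=1.0):
    """Derivative V_F'(x)."""
    return x - x**2 / 2 + b * x**3 / 6

def V_toda(x):
    """Toda potential V_T from (1.2)."""
    return math.exp(-x) + x - 1.0

def dV_toda(x):
    """Derivative V_T'(x)."""
    return 1.0 - math.exp(-x)

def p_dot(q, dV):
    """Momentum equation in (1.3) with periodic indices."""
    n = len(q)
    # q[j - 1] for j = 0 is q[n - 1], i.e. the periodic neighbour.
    return [dV(q[(j + 1) % n] - q[j]) - dV(q[j] - q[j - 1]) for j in range(n)]

def leapfrog(p, q, dV, dt, steps):
    """Kick-drift-kick integrator (an illustrative choice)."""
    p, q = list(p), list(q)
    for _ in range(steps):
        f = p_dot(q, dV)
        p = [pj + 0.5 * dt * fj for pj, fj in zip(p, f)]  # half kick
        q = [qj + dt * pj for qj, pj in zip(q, p)]        # drift: q_dot = p
        f = p_dot(q, dV)
        p = [pj + 0.5 * dt * fj for pj, fj in zip(p, f)]  # half kick
    return p, q

def energy(p, q, V):
    """Hamiltonian: kinetic energy plus nearest-neighbour potential."""
    n = len(q)
    return sum(pj**2 for pj in p) / 2 + sum(V(q[(j + 1) % n] - q[j]) for j in range(n))

# Hypothetical small-amplitude initial data with zero total momentum.
p0 = [0.2, -0.1, 0.0, 0.3, -0.4]
q0 = [0.0, 0.1, -0.1, 0.05, -0.05]

p1, q1 = leapfrog(p0, q0, dV_fput, dt=0.01, steps=1000)
p2, q2 = leapfrog(p0, q0, dV_toda, dt=0.01, steps=1000)
drift_fput = abs(energy(p1, q1, V_fput) - energy(p0, q0, V_fput))
drift_toda = abs(energy(p2, q2, V_toda) - energy(p0, q0, V_toda))
```

The forces telescope over the periodic chain, so the total momentum is preserved up to rounding, while the energy is preserved only up to the discretization error of the integrator.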

According to the values of \({\mathtt b}\) in (1.1), the Toda chain is either an approximation of the FPUT chain of third order (for \({\mathtt b}\ne 1\)), or fourth order (for \({\mathtt b}= 1\)). We remark that the Toda chain is the only nonlinear integrable FPUT-like chain [11, 37].
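The order of approximation can be read off from the Taylor expansion \(V_T(x) = \frac{x^2}{2} - \frac{x^3}{6} + \frac{x^4}{24} - \frac{x^5}{120} + \dots \), which gives \(V_F(x) - V_T(x) = ({\mathtt b}-1)\frac{x^4}{24} + \frac{x^5}{120} + O(x^6)\). The short Python sketch below checks this numerically; the sample point x = 0.1 and the helper name `leading_difference` are choices made only for this illustration.

```python
import math

def V_fput(x, b):
    """FPUT potential V_F from (1.1)."""
    return x**2 / 2 - x**3 / 6 + b * x**4 / 24

def V_toda(x):
    """Toda potential V_T from (1.2)."""
    return math.exp(-x) + x - 1.0

def leading_difference(x, b):
    """Leading Taylor term of V_F - V_T: quartic if b != 1, quintic if b == 1."""
    if b != 1.0:
        return (b - 1.0) * x**4 / 24
    return x**5 / 120

x = 0.1  # small displacement, chosen for illustration
ratio_generic = (V_fput(x, 2.0) - V_toda(x)) / leading_difference(x, 2.0)
ratio_b_one = (V_fput(x, 1.0) - V_toda(x)) / leading_difference(x, 1.0)
# Both ratios are close to 1, with corrections of relative size O(x).
```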

The Toda chain admits several families of N integrals of motion in involution (e.g. [16, 24, 40]). Among the various families of integrals of motion, the ones constructed by Henon [22] and Flaschka [15] are explicit and easy to compute, being the trace of the powers of the Lax matrix associated to the Toda chain. In the following we refer to them simply as Toda integrals and denote them by \(J^{(k)}\), \(1 \le k \le N\) (see (2.12)).

As the \(J^{(k)}\)’s are conserved along the Toda flow, and the FPUT chain is a perturbation of the Toda one, the Toda integrals are good candidates to be adiabatic invariants when computed along the FPUT flow. This intuition is supported by several numerical simulations, the first by Ferguson–Flaschka–McLaughlin [12] and more recently by other authors [4, 6, 10, 20, 35]. Such simulations show that the variation of the Toda integrals along the FPUT flow is very small on long times for initial data of small specific energy. In particular, the numerical results in [4, 6, 20] suggest that such a phenomenon should persist in the thermodynamic limit and for “generic” initial conditions.

Our first result is a quantitative, analytical proof of this phenomenon. More precisely, we fix an arbitrary \(m \in {\mathbb N}\) and, provided N and \(\beta \) are sufficiently large, we bound the variations of the first m Toda integrals computed along the flow of FPUT, for times of order

$$\begin{aligned} \frac{\beta }{\left( ({\mathtt b}-1)^2 + C_1 \beta ^{-1} \right) ^{\frac{1}{2}}} , \end{aligned}$$
(1.4)

where \(C_1\) is a positive constant, independent of \(\beta , N\). Such a bound holds for initial data in a large set with respect to the Gibbs measure. Note that the bound (1.4) improves to \(\beta ^{\frac{3}{2}}\) when \({\mathtt b}= 1\), namely when the Toda chain becomes a fourth order approximation of the FPUT chain. Such analytical time-scales are compatible with (namely smaller than) the numerical ones determined in [4,5,6].

An interesting question is whether the Toda integrals \(J^{(k)}\) control the normal modes of FPUT, namely the actions of the linearized chain. It turns out that this is indeed the case: we prove that the quadratic parts \(J^{(2k)}_2\) (namely the Taylor polynomials of order 2) of the integrals of motion \(J^{(2k)}\) are linear combinations of the normal modes. Namely one has

$$\begin{aligned} J^{(2k)} = \sum _{j=0}^{N-1} {\widehat{c}}_j^{(k)}\, E_j + O(({\widehat{\mathbf{p}}}, {\widehat{\mathbf{q}}})^3) , \end{aligned}$$
(1.5)

where \(E_j\) is the jth normal mode (see (2.21) for its formula), \(({\widehat{\mathbf{p}}}, {\widehat{\mathbf{q}}})\) are the discrete Hartley transform of \((\mathbf{p},\mathbf{q})\) (see definition below in (2.18)) and \({\widehat{\mathbf{c}}}^{(k)}\) are real coefficients.

So we consider linear combinations of the normal modes of the form

$$\begin{aligned} \sum _{j=0}^{N-1} {\widehat{g}}_j E_j \end{aligned}$$
(1.6)

where \(({\widehat{g}}_j)_j\) is the discrete Hartley transform of a vector \(\mathbf{g}\in {\mathbb R}^N\) which has only \(2\lfloor \frac{m}{2}\rfloor +2\) nonzero entries, with m independent of N; here \(\lfloor \frac{m}{2}\rfloor \) is the integer part of \(\frac{m}{2}\). Our second result shows that linear combinations of the form (1.6), when computed along the FPUT flow, are adiabatic invariants for the same time scale as in (1.4).

Furthermore, we show that linear combinations of the harmonic modes as in (1.6) are approximate invariants for the Toda dynamics (with large probability).

Examples of linear combinations (1.6) that we control are

$$\begin{aligned} \sum _{j=1}^N \sin ^{2\ell }\left( \frac{j\pi }{N}\right) \, E_j , \qquad \sum _{j=1}^N \cos ^{2\ell }\left( \frac{j\pi }{N}\right) \, E_j , \qquad \forall \ell =0, \ldots , \Big \lfloor \frac{m}{2}\Big \rfloor . \end{aligned}$$
(1.7)

These linear combinations weight in different ways low and high energy modes.

Finally we note that, in the study of the FPUT problem, one usually measures the time the system takes to approach equilibrium when initial conditions very far from equilibrium are considered. Our result indicates instead that, even though the initial states are sampled from a thermal distribution, complete thermalization is expected to be attained, in principle, over a time scale that increases with decreasing temperature.

Our results are mainly based on two ingredients. The first one is a detailed study of the algebraic properties of the Toda integrals. The second ingredient comes from adapting to our case, methods of statistical mechanics developed by Carati [8] and Carati–Maiocchi [9], and also in [18, 19, 29, 30].

2 Statement of Results

2.1 Toda integrals as adiabatic invariants for FPUT

We come to a precise statement of the main results of the present paper. We consider the FPUT chain (1.1) and the Toda chain (1.2) in the subspace

$$\begin{aligned} {\mathcal M}:= \left\{ (\mathbf{p},\mathbf{q}) \in {\mathbb R}^{N}\times {\mathbb R}^N :\ \ \ \sum _{j=0}^{N-1} q_j ={\mathcal L} \ \, ,\sum _{j=0}^{N-1} p_j = 0 \right\} \, , \end{aligned}$$
(2.1)

which is invariant for the dynamics. Here \({\mathcal L}\) is a positive constant.

Since both \(H_F\) and \(H_T\) depend just on the relative distance between \(q_{j+1}\) and \(q_j\), it is natural to introduce on \({\mathcal M}\) the variables \(r_j\)’s as

$$\begin{aligned} r_j := q_{j+1} - q_j , \qquad 0 \le j \le N -1 \, , \end{aligned}$$
(2.2)

which are naturally constrained to

$$\begin{aligned} \sum _{j=0}^{N-1} r_j = 0 \, , \end{aligned}$$
(2.3)

due to the periodic boundary condition \(q_N = q_0\). We observe that the change of coordinates (2.2) together with the condition (2.3) is well defined on the phase space \({\mathcal M}\), but not on the whole phase space \({\mathbb R}^N\times {\mathbb R}^N\). In these variables the phase space \({\mathcal M}\) reads

$$\begin{aligned} {\mathcal M}:= \left\{ (\mathbf{p},\mathbf{r}) \in {\mathbb R}^{N}\times {\mathbb R}^N :\ \ \ \sum _{j=0}^{N-1} r_j = \sum _{j=0}^{N-1} p_j = 0 \right\} \, . \end{aligned}$$
(2.4)

We endow \({\mathcal M}\) with the Gibbs measure of \(H_F\) at temperature \(\beta ^{-1}\), namely we put

$$\begin{aligned} \mathrm{d}\mu _F := \frac{1}{Z_F(\beta )} \ e^{-\beta {H_F(\mathbf{p},\mathbf{r})}} \, \ \delta \left( \sum _{j=0}^{N-1} r_j =0\right) \ \delta \left( \sum _{j=0}^{N-1} p_j =0\right) \ \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{r}, \end{aligned}$$
(2.5)

where as usual \(Z_F(\beta )\) is the partition function which normalizes the measure, namely

$$\begin{aligned} Z_F(\beta ) := \int _{{\mathbb R}^N \times {\mathbb R}^N} e^{-\beta {H_F(\mathbf{p},\mathbf{r})}} \ \delta \left( \sum _{j=0}^{N-1} r_j =0\right) \ \delta \left( \sum _{j=0}^{N-1} p_j =0\right) \ \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{r}. \end{aligned}$$
(2.6)

We remark that we can consider the measure \(\mathrm{d}\mu _F\) as the weak limit, as \(\epsilon \rightarrow 0\), of the measure

$$\begin{aligned} \mathrm{d}\mu _\epsilon = \frac{ e^{-\beta H_F(\mathbf{p}, \mathbf{r})} \ e^{-\left( \sum _{j=0}^{N-1}r_j/\epsilon \right) ^2 - \left( \sum _{j=0}^{N-1}p_j/\epsilon \right) ^2 }}{\left( \int _{{\mathbb R}^{2N}}e^{-\beta H_F(\mathbf{p}, \mathbf{r})} \ e^{-\left( \sum _{j=0}^{N-1}r_j/\epsilon \right) ^2 - \left( \sum _{j=0}^{N-1}p_j/\epsilon \right) ^2 } \ \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{r}\right) } \ \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{r}\, . \end{aligned}$$

Given a function \(f:{\mathcal M}\rightarrow {\mathbb C}\), we will use the probability (2.5) to compute its average \(\left\langle f \right\rangle \), its \(L^2\) norm \(\Vert f \Vert \), its variance \(\sigma _f^2\) defined as

$$\begin{aligned}&\left\langle f \right\rangle := \mathbf {E}\left[ {f}\right] \equiv \int _{{\mathbb R}^{2N}} f(\mathbf{p},\mathbf{r}) \, \, \mathrm{d}\mu _F , \end{aligned}$$
(2.7)
$$\begin{aligned}&\Vert f \Vert ^2 := \mathbf {E}\left[ {|f|^2}\right] \equiv \int _{{\mathbb R}^{2N}} |f(\mathbf{p},\mathbf{r})|^2 \, \mathrm{d}\mu _F , \end{aligned}$$
(2.8)
$$\begin{aligned}&\sigma _f^2 := \Vert f - \left\langle f \right\rangle \Vert ^2 . \end{aligned}$$
(2.9)

In order to state our first theorem we must introduce the Toda integrals of motion. It is well known that the Toda chain is an integrable system [22, 39]. The standard way to prove its integrability is to put it in a Lax-pair form. The Lax form was introduced by Flaschka in [15] and Manakov [31] and it is obtained through the change of coordinates

$$\begin{aligned} b_j := -p_j \, , \qquad a_j:= e^{\frac{1}{2}(q_j-q_{j+1})} \equiv e^{- \frac{1}{2} r_j} , \qquad 0 \le j \le N-1 \, . \end{aligned}$$
(2.10)

By the geometric constraint (2.3) and the momentum conservation \(\sum _{j=0}^{N-1} p_j = 0\) (see (2.1)), such variables are constrained by the conditions

$$\begin{aligned} \sum _{j=0}^{N-1}{b_j}=0, \, \qquad \prod _{j=0}^{N-1}{a_j}=1 \ . \end{aligned}$$

The Lax operator for the Toda chain is the periodic Jacobi matrix [40]

$$\begin{aligned} L(b,a) := \left( \begin{array}{ccccc} b_{0} &{} a_{0} &{} 0 &{} \ldots &{} a_{N-1} \\ a_{0} &{} b_{1} &{} a_{1} &{} \ddots &{} \vdots \\ 0 &{} a_{1} &{} b_{2} &{} \ddots &{} 0 \\ \vdots &{} \ddots &{} \ddots &{} \ddots &{} a_{N-2} \\ a_{N-1} &{} \ldots &{} 0 &{} a_{N-2} &{} b_{N-1} \\ \end{array} \right) . \end{aligned}$$
(2.11)

We introduce the matrix \(A=L_+-L_-\) where for a square matrix X we call \(X_+\) the upper triangular part of X

$$\begin{aligned} \left( X_+\right) _{ij} =\left\{ \begin{array}{cc} X_{ij}, &{} i\le j \\ 0, &{} \text{ otherwise }\end{array}\right. \end{aligned}$$

and in a similar way by \(X_-\) the lower triangular part of X

$$\begin{aligned} \left( X_-\right) _{ij} =\left\{ \begin{array}{cc} X_{ij} , &{} i\ge j \\ 0, &{} \text{ otherwise. }\end{array}\right. \end{aligned}$$

A straightforward calculation shows that the Toda equations of motions (1.3) are equivalent to

$$\begin{aligned} \dfrac{d L}{dt}= [A,L]. \end{aligned}$$

It then follows that the eigenvalues of L are integrals of motion in involution.

In particular, the trace of powers of L,

$$\begin{aligned} J^{(m)} := \frac{1}{m} \text {Tr}\left( L^m\right) , \qquad \forall 1 \le m \le N \end{aligned}$$
(2.12)

are N independent integrals of motion in involution. Such integrals were first introduced by Henon [22] (with a different method), and we refer to them as Toda integrals. We give the first few of them explicitly, written in the variables \((\mathbf{p}, \mathbf{r})\):

$$\begin{aligned} \begin{aligned}&J^{(1)}(\mathbf{p}) := -\sum _{i=0}^{N-1} p_i, \qquad \qquad J^{(2)}(\mathbf{p},\mathbf{r}):= \sum _{i=0}^{N-1}\left[ \frac{p_i^2}{2} + e^{-r_i}\right] , \\&J^{(3)}(\mathbf{p},\mathbf{r}):= -\sum _{i=0}^{N-1} \left[ \frac{1}{3} p_i^3 + (p_i + p_{i+1}) e^{-r_i}\right] , \\&J^{(4)}(\mathbf{p},\mathbf{r}):= \sum _{i=0}^{N-1} \left[ \frac{1}{4} p_i^4 + (p_i^2 + p_i p_{i+1} + p_{i+1}^2) e^{-r_i} + \frac{1}{2} e^{-2r_i} + e^{-r_i - r_{i+1}}\right] . \end{aligned} \end{aligned}$$
(2.13)

Note that, on \({\mathcal M}\), \(J^{(2)}\) coincides with the Toda Hamiltonian \(H_T\) up to the additive constant N, since \(\sum _j r_j = 0\).
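The explicit expressions (2.13) can be compared with the definition (2.12) numerically. The Python sketch below builds the Lax matrix (2.11) from the variables (2.10), computes \(\frac{1}{m}\mathrm{Tr}(L^m)\) by plain matrix multiplication and evaluates the closed formulas; the function names and the sample data with N = 6 are assumptions of this illustration, and indices are 0-based as in (2.11).

```python
import math

def lax_matrix(p, r):
    """Periodic Jacobi matrix (2.11) with b_j = -p_j, a_j = exp(-r_j / 2)."""
    n = len(p)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        a_j = math.exp(-0.5 * r[j])
        L[j][j] = -p[j]
        L[j][(j + 1) % n] = a_j  # a_j couples sites j and j+1 (periodic)
        L[(j + 1) % n][j] = a_j
    return L

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def toda_integral(p, r, m):
    """J^(m) = Tr(L^m) / m, as in (2.12)."""
    L = lax_matrix(p, r)
    P = L
    for _ in range(m - 1):
        P = matmul(P, L)
    return sum(P[j][j] for j in range(len(p))) / m

def explicit_J(p, r, m):
    """Closed formulas (2.13) for m = 1, ..., 4."""
    n = len(p)
    e = [math.exp(-x) for x in r]
    nxt = lambda i: (i + 1) % n
    if m == 1:
        return -sum(p)
    if m == 2:
        return sum(p[i]**2 / 2 + e[i] for i in range(n))
    if m == 3:
        return -sum(p[i]**3 / 3 + (p[i] + p[nxt(i)]) * e[i] for i in range(n))
    if m == 4:
        return sum(p[i]**4 / 4
                   + (p[i]**2 + p[i] * p[nxt(i)] + p[nxt(i)]**2) * e[i]
                   + 0.5 * e[i]**2 + e[i] * e[nxt(i)] for i in range(n))
    raise ValueError("only m = 1, ..., 4 are listed in (2.13)")

def toda_hamiltonian(p, r):
    """H_T from (1.2), written in the variables (p, r)."""
    return sum(x**2 for x in p) / 2 + sum(math.exp(-x) + x - 1.0 for x in r)

# Hypothetical point of the phase space (2.4): both sums vanish.
p = [0.3, -0.1, 0.2, -0.5, 0.4, -0.3]
r = [0.1, -0.2, 0.05, 0.15, -0.3, 0.2]
errors = [abs(toda_integral(p, r, m) - explicit_J(p, r, m)) for m in range(1, 5)]
```

For these formulas to match the traces one needs N larger than m, so that no closed path of length m winds around the periodic chain.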

Our first result shows that the Toda integral \(J^{(m)}\), computed along the Hamiltonian flow \(\phi ^t_{H_F}\) of the FPUT chain, is an adiabatic invariant for long times and for initial data in a set of large Gibbs measure. Here is the precise statement:

Theorem 2.1

Fix \(m \in {\mathbb N}\). There exist constants \(N_0, \beta _0, C_0, C_1>0\) (depending on m), such that for any \(N > N_0\), \(\beta > \beta _0\), and any \(\delta _1,\delta _2>0\) one has

$$\begin{aligned} \mathbf{P}\left( \left| J^{(m)}\circ \phi ^t_{H_F} - J^{(m)} \right| > \delta _1\sigma _{J^{(m)}}\right) \le \delta _2 C_0 \, , \end{aligned}$$
(2.14)

for every time t fulfilling

$$\begin{aligned} |t| \le \frac{ \delta _1\sqrt{\delta _2}}{\Big (({\mathtt b}- 1)^2 + C_1 \beta ^{-1} \Big )^{1/2}} \beta \, . \end{aligned}$$
(2.15)

In (2.14) \(\mathbf{P}\) stands for the probability with respect to the Gibbs measure (2.5).

We observe that the time scale (2.15) increases to \(\beta ^{\frac{3}{2}}\) for \({\mathtt b}= 1\), namely if the Toda chain is a fourth order approximation of the FPUT chain.

Remark 2.2

By choosing \(0< \varepsilon < \frac{1}{4}\), \(\delta _1=\beta ^{-\varepsilon }\) and \(\delta _2=\beta ^{-2\varepsilon }\) the statement of the above theorem becomes:

$$\begin{aligned} \mathbf{P}\left( \left| J^{(m)}\circ \phi ^t_{H_F} - J^{(m)} \right| > \frac{\sigma _{J^{(m)}}}{\beta ^\varepsilon } \right) \le \frac{C_0}{\beta ^{2\varepsilon }} \, , \end{aligned}$$
(2.16)

for every time t fulfilling

$$\begin{aligned} |t| \le \frac{ \beta ^{1-2\varepsilon }}{\Big (({\mathtt b}- 1)^2 + C_1 \beta ^{-1} \Big )^{1/2}} \, . \end{aligned}$$
(2.17)
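To make the arithmetic of the time scales concrete, here is a small Python sketch evaluating the right-hand side of (2.15). The value `C1 = 1.0` is a placeholder (the theorem only asserts that such a constant exists), and the function name is hypothetical.

```python
import math

def time_bound(beta, b, delta1, delta2, C1=1.0):
    """Right-hand side of (2.15); C1 = 1.0 is a placeholder value."""
    return delta1 * math.sqrt(delta2) * beta / math.sqrt((b - 1.0)**2 + C1 / beta)

beta, eps = 50.0, 0.1

# Choice of Remark 2.2: delta1 = beta^(-eps), delta2 = beta^(-2 eps).
t_remark = time_bound(beta, 1.5, beta**(-eps), beta**(-2 * eps))
t_217 = beta**(1 - 2 * eps) / math.sqrt((1.5 - 1.0)**2 + 1.0 / beta)  # (2.17)

# For b = 1 and delta1 = delta2 = 1 the bound is beta^(3/2) / sqrt(C1).
t_b_one = time_bound(100.0, 1.0, 1.0, 1.0)
```

With this choice of \(\delta _1, \delta _2\) the two expressions `t_remark` and `t_217` agree, which is the substitution leading from (2.15) to (2.17).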

Remark 2.3

We observe that our estimates in (2.14) and (2.15) are independent of the number of particles N. Therefore we can claim that the result of Theorem 2.1 holds true in the thermodynamic limit, i.e. when \(\lim _{N\rightarrow \infty } \frac{\langle H_F\rangle }{N} = e > 0\) where \(\langle H_F\rangle \) is the average over the Gibbs measure (2.5) of the FPUT Hamiltonian \(H_F\). The same observation applies to Theorems 2.5 and 2.6 below.

Our Theorem 2.1 gives a quantitative, analytical proof of the adiabatic invariance of the Toda integrals, at least for a set of initial data of large measure. It is an interesting question whether other integrals of motion of the Toda chain are adiabatic invariants for the FPUT chain. Natural candidates are the actions and spectral gaps.

Action-angle coordinates and the related Birkhoff coordinates (a cartesian version of action-angle variables) were constructed analytically by Henrici and Kappeler [23, 24] for any finite N, and by Bambusi and one of the authors [1] uniformly in N, but in a regime of specific energy going to 0 when N goes to infinity (thus not the thermodynamic limit).

The difficulty in dealing with these other sets of integrals is that they are not explicit in the physical variables \((\mathbf{p}, \mathbf{r})\). As a consequence, it appears very difficult to compute their averages with respect to the Gibbs measure of the system.

Despite these analytical challenges, recent numerical simulations by Goldfriend and Kurchan [20] suggest that the spectral gaps of the Toda chain are adiabatic invariants for the FPUT chain for long times also in the thermodynamic limit.

2.2 Packets of normal modes

Our second result concerns adiabatic invariance of some special linear combination of normal modes. To state the result, we first introduce the normal modes through the discrete Hartley transform. Such transformation, which we denote by \({\mathcal H}\), is defined as

$$\begin{aligned} {\widehat{\mathbf{p}}}:= {\mathcal H}\mathbf{p}, \, \quad {\mathcal H}_{j,k} := \frac{1}{\sqrt{N}}\left( \cos \left( 2\pi \frac{jk}{N}\right) + \sin \left( 2\pi \frac{jk}{N}\right) \right) , \quad j,k = 0,\ldots , N-1\nonumber \\ \end{aligned}$$
(2.18)

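and one easily verifies that it fulfills

$$\begin{aligned} {\mathcal H}^2 = \mathrm{Id} , \qquad {\mathcal H}^\intercal = {\mathcal H} . \end{aligned}$$
(2.19)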

The Hartley transform is closely related to the classical Fourier transform \({\mathcal F}\), whose matrix elements are \({\mathcal F}_{j,k}:= \frac{1}{\sqrt{N}} e^{- \mathrm{i}2\pi j k/N }\), as one has \({\mathcal H}= \mathfrak {R}{\mathcal F}- \mathfrak {I}{\mathcal F}\). The advantage of the Hartley transform is that it maps real variables into real variables, a fact which will be useful when calculating averages of quadratic Hamiltonians (see Sect. 5.2).
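The basic algebraic properties of the matrix \({\mathcal H}\) defined in (2.18), namely that it is symmetric and squares to the identity (hence it is orthogonal), can be checked numerically. The following Python sketch does so for one size; the helper names and the size N = 8 are assumptions of this illustration.

```python
import math

def hartley_matrix(n):
    """Matrix H_{j,k} = (cos(2 pi j k / n) + sin(2 pi j k / n)) / sqrt(n), see (2.18)."""
    s = 1.0 / math.sqrt(n)
    return [[s * (math.cos(2 * math.pi * j * k / n) + math.sin(2 * math.pi * j * k / n))
             for k in range(n)] for j in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def max_abs_diff(A, B):
    """Largest entrywise difference between two square matrices."""
    n = len(A)
    return max(abs(A[i][j] - B[i][j]) for i in range(n) for j in range(n))

n = 8
H = hartley_matrix(n)
identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
transpose = [[H[j][i] for j in range(n)] for i in range(n)]

symmetry_error = max_abs_diff(H, transpose)              # H^T = H
involution_error = max_abs_diff(matmul(H, H), identity)  # H^2 = Id
```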

A consequence of (2.19) is that the change of coordinates

$$\begin{aligned} {\mathbb R}^N \times {\mathbb R}^N \rightarrow {\mathbb R}^N \times {\mathbb R}^N, \quad (\mathbf{p}, \mathbf{q}) \mapsto ({\widehat{\mathbf{p}}}, {\widehat{\mathbf{q}}}) := ({\mathcal H}\mathbf{p}, {\mathcal H}\mathbf{q}) \end{aligned}$$

is a canonical one. Due to \(\sum _j p_j =0, \, \sum _j q_j = {\mathcal L}\), one has also \({\widehat{p}}_0 =0, \, {\widehat{q}}_0 = {\mathcal L}/\sqrt{N}\). In these variables the quadratic part \(H_2\) of the Toda Hamiltonian (1.2), i.e. its Taylor expansion of order two near the origin, takes the form

$$\begin{aligned} H_2({\widehat{\mathbf{p}}}, {\widehat{\mathbf{q}}}) := \sum _{j=1}^{N-1} \frac{{\widehat{p}}_j^2 + \omega _j^2 {\widehat{q}}_j^2}{2} , \qquad \omega _j := 2\sin \left( \pi \frac{j}{N}\right) . \end{aligned}$$
(2.20)

We observe that (2.20) is exactly the Hamiltonian of the Harmonic Oscillator chain. We define

$$\begin{aligned} E_j := \frac{{\widehat{p}}_j^2 + \omega _j^2 {\widehat{q}}_j^2}{2} , \qquad j = 1, \ldots , N-1 \, , \end{aligned}$$
(2.21)

the jth normal mode.
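As a sanity check of (2.20) and (2.21), the sum of the normal mode energies must equal the harmonic chain energy \(\sum _j \frac{p_j^2}{2} + \sum _j \frac{(q_{j+1}-q_j)^2}{2}\) written in the original variables, provided the total momentum vanishes. The Python sketch below verifies this on sample data; the data and the function names are assumptions of this illustration.

```python
import math

def hartley(x):
    """Discrete Hartley transform (2.18) of a real vector."""
    n = len(x)
    return [sum((math.cos(2 * math.pi * j * k / n) + math.sin(2 * math.pi * j * k / n)) * x[k]
                for k in range(n)) / math.sqrt(n) for j in range(n)]

def normal_mode_energies(p, q):
    """Energies E_j of (2.21) for j = 1, ..., N-1 (returned as a list of length N-1)."""
    n = len(p)
    p_hat, q_hat = hartley(p), hartley(q)
    energies = []
    for j in range(1, n):
        omega_j = 2.0 * math.sin(math.pi * j / n)
        energies.append((p_hat[j]**2 + omega_j**2 * q_hat[j]**2) / 2)
    return energies

def harmonic_energy(p, q):
    """Quadratic Hamiltonian in the original variables, periodic chain."""
    n = len(q)
    kinetic = sum(x**2 for x in p) / 2
    potential = sum((q[(j + 1) % n] - q[j])**2 for j in range(n)) / 2
    return kinetic + potential

# Hypothetical data with zero total momentum, so that p_hat_0 = 0.
p = [0.3, -0.1, 0.2, -0.5, 0.4, -0.3]
q = [0.7, 0.2, -0.4, 0.1, 0.9, -0.6]
mismatch = abs(sum(normal_mode_energies(p, q)) - harmonic_energy(p, q))
```

The mode j = 0 does not appear because \(\omega _0 = 0\) and \({\widehat{p}}_0 = 0\) on \({\mathcal M}\).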

To state our second result we need the following definition:

Definition 2.4

(m-admissible vector). Fix \(m \in {\mathbb N}\) and \( {\widetilde{m}} : = \left\lfloor \frac{m}{2}\right\rfloor \). For any \(N > m\), a vector \(\mathbf{x}\in {\mathbb R}^N\) is said to be m-admissible if there exists a nonzero vector \(\mathbf{y}=(y_0, y_1, \ldots , y_{{\widetilde{m}}}) \in {\mathbb R}^{{\widetilde{m}} +1}\) with \(K^{-1} \le \sum _j |y_j| \le K\), K independent of N, such that

$$\begin{aligned} x_k= x_{N-k} = y_k, \text{ for } 0 \le k \le {\widetilde{m}} \hbox { and } x_{k} = 0 \hbox { otherwise.} \end{aligned}$$
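A minimal Python sketch of this definition follows. It builds an admissible vector from a given \(\mathbf{y}\) (with the convention \(x_N \equiv x_0\)) and checks that its Hartley transform (2.18) is the cosine polynomial \({\widehat{x}}_j = \frac{1}{\sqrt{N}}\big (y_0 + 2\sum _{k=1}^{{\widetilde{m}}} y_k \cos (2\pi jk/N)\big )\), which follows from a direct computation. The sample values of \(\mathbf{y}\) and N and the function names are assumptions of this illustration.

```python
import math

def admissible_vector(y, n):
    """x_k = x_{n-k} = y_k for 0 <= k <= len(y) - 1, and x_k = 0 otherwise."""
    m_tilde = len(y) - 1
    if n <= 2 * m_tilde:
        raise ValueError("n must exceed 2 * m_tilde so that the entries do not overlap")
    x = [0.0] * n
    for k, y_k in enumerate(y):
        x[k] = y_k
        x[(n - k) % n] = y_k  # for k = 0 this is x_0 again
    return x

def hartley(x):
    """Discrete Hartley transform (2.18) of a real vector."""
    n = len(x)
    return [sum((math.cos(2 * math.pi * j * k / n) + math.sin(2 * math.pi * j * k / n)) * x[k]
                for k in range(n)) / math.sqrt(n) for j in range(n)]

def cosine_polynomial(y, n, j):
    """Closed form of the j-th Hartley coefficient of admissible_vector(y, n)."""
    total = y[0] + 2.0 * sum(y[k] * math.cos(2 * math.pi * j * k / n)
                             for k in range(1, len(y)))
    return total / math.sqrt(n)

y = [0.5, -1.0, 0.25]  # hypothetical y, i.e. m_tilde = 2
n = 12
g = admissible_vector(y, n)
g_hat = hartley(g)
max_error = max(abs(g_hat[j] - cosine_polynomial(y, n, j)) for j in range(n))
```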

We are ready to state our second result, which shows that special linear combinations of normal modes are adiabatic invariants for the FPUT dynamics for long times. Here is the precise statement:

Theorem 2.5

Fix \(m\in \mathbb {N}\) and let \(\mathbf{g}=(g_0,\dots ,g_{N-1})\in {\mathbb R}^N\) be an m-admissible vector (according to Definition 2.4). Define

$$\begin{aligned} \Phi := \sum _{j=0}^{N-1} {\widehat{g}}_j E_j , \end{aligned}$$
(2.22)

where \( {\widehat{\mathbf{g}}}\) is the discrete Hartley transform (2.18) of \(\mathbf{g}\), and \(E_j\) is the harmonic energy (2.21). Then there exist \(N_0, \beta _0, C_0, C_1>0\) (depending on m), such that for any \(N > N_0\), \(\beta > \beta _0\), \(0< \varepsilon < \frac{1}{4}\), one has

$$\begin{aligned} \mathbf{P}\left( \left| \Phi \circ \phi ^t_{H_F} - \Phi \right| > \frac{\sigma _{\Phi }}{\beta ^\varepsilon } \right) \le \frac{C_0}{\beta ^{2\varepsilon }} \, , \end{aligned}$$
(2.23)

for every time t fulfilling

$$\begin{aligned} |t| \le \frac{\, \beta ^{1-2\varepsilon }}{\Big (({\mathtt b}-1)^2 + C_1 \beta ^{-1} \Big )^{1/2}}. \end{aligned}$$
(2.24)

Again when \({\mathtt b}= 1\) the time scale improves by a factor \(\beta ^{\frac{1}{2}}\).

Finally we consider the Toda dynamics generated by the Hamiltonian \(H_T\) in (1.2). In this case we endow \({\mathcal M}\) in (2.4) with the Gibbs measure of \(H_T\) at temperature \(\beta ^{-1}\), namely we put

$$\begin{aligned} \mathrm{d}\mu _T := \frac{1}{Z_T(\beta )} \ e^{-\beta {H_T(\mathbf{p},\mathbf{r})}} \, \ \delta \left( \sum _j r_j =0\right) \ \delta \left( \sum _j p_j =0\right) \ \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{r}, \end{aligned}$$
(2.25)

where as usual \(Z_T(\beta )\) is the partition function which normalizes the measure, namely

$$\begin{aligned} Z_T(\beta ) := \int _{{\mathbb R}^N \times {\mathbb R}^N} e^{-\beta {H_T(\mathbf{p},\mathbf{r})}} \ \delta \left( \sum _j r_j =0\right) \ \delta \left( \sum _j p_j =0\right) \ \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{r}. \end{aligned}$$
(2.26)

We prove that the quantity (2.22), computed along the Hamiltonian flow \(\phi ^t_{H_T}\) of the Toda chain, is an adiabatic invariant for all times and for a large set of initial data:

Theorem 2.6

Fix \(m\in \mathbb {N}\); let \(\mathbf{g}\in {\mathbb R}^N\) be an m-admissible vector and define \(\Phi \) as in (2.22). Then there exist \(N_0, \beta _0, C>0\) such that for any \(N > N_0\), \(\beta > \beta _0\), any \(\delta _1 >0\) one has

$$\begin{aligned} \mathbf{P}\left( \left| \Phi \circ \phi ^t_{H_T} - \Phi \right| > \delta _1 {\sigma _{\Phi }} \right) \le \frac{C}{\delta _1^2 \, \beta } \, , \end{aligned}$$
(2.27)

for all times.

Remark 2.7

It is easy to verify that the functions \(\Phi \) in (2.22) are linear combinations of

$$\begin{aligned} \sum _{j=0}^{N-1} \cos \left( \frac{2\ell j \pi }{N}\right) \, E_j , \qquad \ell =0, \ldots , \Big \lfloor \frac{m}{2}\Big \rfloor \end{aligned}$$
(2.28)

(choose \(g_\ell = g_{N-\ell }=1\), \(g_j = 0\) otherwise). Then, using the multi-angle trigonometric formulas

$$\begin{aligned} \cos (2nx) = (-1)^{n}T_{2n}(\sin x) , \qquad \cos (2nx) = T_{2n}(\cos x) , \end{aligned}$$

where the \(T_n\)’s are the Chebyshev polynomials of the first kind, it follows that we can control (1.7). Actually these functions fall under the class considered in [29].
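The two multi-angle identities can be verified numerically with the standard three-term recurrence \(T_{n+1}(x) = 2xT_n(x) - T_{n-1}(x)\). The Python sketch below is only an illustration; the sample angles are arbitrary.

```python
import math

def chebyshev_T(n, x):
    """Chebyshev polynomial of the first kind T_n(x), via the three-term recurrence."""
    if n == 0:
        return 1.0
    t_prev, t = 1.0, x
    for _ in range(n - 1):
        t_prev, t = t, 2.0 * x * t - t_prev
    return t

angles = [0.0, 0.3, 1.1, 2.5, -0.7]  # arbitrary sample angles
errors = []
for n in range(0, 4):
    for x in angles:
        lhs = math.cos(2 * n * x)
        errors.append(abs(lhs - chebyshev_T(2 * n, math.cos(x))))
        errors.append(abs(lhs - (-1)**n * chebyshev_T(2 * n, math.sin(x))))
max_error = max(errors)
```

Since \(T_{2n}\) is an even polynomial of degree 2n, each \(\cos (2\ell j\pi /N)\) is a polynomial of degree \(\ell \) in \(\sin ^2(j\pi /N)\), or equivalently in \(\cos ^2(j\pi /N)\), which is how the combinations (1.7) arise.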

Let us comment on the significance of Theorems 2.5 and 2.6. The study of the dynamics of the normal modes of FPUT goes back to the pioneering numerical simulations of Fermi, Pasta, Ulam and Tsingou [13]. They observed that, corresponding to initial data with only the first normal mode excited, namely initial data with \(E_1\ne 0\) and \(E_j =0\) \(\, \forall j \ne 1\), the dynamics of the normal modes develops a recurrent behavior, whereas their time averages \(\frac{1}{t}\int _0^t E_j\circ \phi ^\tau _{H_F} \mathrm{d}\tau \) quickly relax to a sequence exponentially localized in j. This is what is known under the name of FPUT packet of modes.

Subsequent numerical simulations have investigated the persistence of the phenomenon for large N and in different regimes of specific energies [4, 6, 7, 17, 27, 32] (see also [2] for a survey of results about the FPUT dynamics).

Analytical results controlling packets of normal modes along the FPUT system are proven in [1, 3]. All these results deal with specific energies going to zero as the number of particles goes to infinity, thus they do not hold in the thermodynamic limit. Our result controls linear combinations of normal modes and holds in the thermodynamic limit.

2.3 Ideas of the proof

The starting point of our analysis is to estimate the probability that the time evolution of an observable \(\Phi (t)\), computed along the Hamiltonian flow of H, slightly deviates from its initial value. In our application \(\Phi \) is either a Toda integral of motion or a special linear combination of the harmonic energies and H is either the FPUT or the Toda Hamiltonian. Quantitatively, Chebyshev's inequality gives

$$\begin{aligned} \mathbf{P}\Big (\left| \Phi (t) - \Phi (0) \right|> \lambda \sigma _{\Phi (0)} \Big ) \le \frac{1}{\lambda ^2} \frac{\sigma ^2_{\Phi (t) - \Phi (0)}}{\sigma ^2_{\Phi (0)} } \ , \qquad \forall \lambda > 0 . \end{aligned}$$
(2.29)

So our first task is to give an upper bound on the variance \(\sigma _{\Phi (t) - \Phi (0)}\) and a lower bound on the variance \(\sigma _{\Phi (0)}\). Regarding the former bound we exploit the Carati-Maiocchi inequality [9]

$$\begin{aligned} \sigma _{\Phi (t) - \Phi (0)}^2 \le \left\langle \{ \Phi , H \}^2\right\rangle t^2 , \qquad \forall t \in {\mathbb R}, \end{aligned}$$
(2.30)

where \(\{\Phi , H\}\) denotes the canonical Poisson bracket

$$\begin{aligned} \{ \Phi , H\} := (\partial _\mathbf{q}\Phi )^\intercal \partial _\mathbf{p}H - (\partial _\mathbf{p}\Phi )^\intercal \partial _\mathbf{q}H \equiv \sum _{i=0}^{N-1} \partial _{q_i} \Phi \, \partial _{p_i} H - \partial _{p_i} \Phi \, \partial _{q_i} H.\qquad \quad \end{aligned}$$
(2.31)
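A toy example, not taken from the paper, may help to see how (2.29) and (2.30) work together. Take a single harmonic oscillator \(H = \frac{p^2+q^2}{2}\) with Gibbs measure \(\propto e^{-\beta H}\), so that p and q are independent Gaussians of variance \(\beta ^{-1}\), and the observable \(\Phi = q\). Then \(\Phi (t) = q\cos t + p \sin t\), \(\{\Phi , H\} = p\), and all the quantities in (2.29), (2.30) are explicit: \(\sigma ^2_{\Phi (t)-\Phi (0)} = (2-2\cos t)/\beta \), \(\langle \{\Phi ,H\}^2\rangle = \sigma ^2_{\Phi (0)} = 1/\beta \). The Python sketch below evaluates them; all names are hypothetical.

```python
import math

def increment_variance(t, beta):
    """Variance of Phi(t) - Phi(0) = q (cos t - 1) + p sin t under the Gibbs measure."""
    return (2.0 - 2.0 * math.cos(t)) / beta

def carati_maiocchi_rhs(t, beta):
    """Right-hand side of (2.30): <{Phi, H}^2> t^2 = t^2 / beta for this toy model."""
    return t**2 / beta

def chebyshev_probability_bound(t, lam, beta):
    """Right-hand side of (2.29), capped at 1, with sigma^2_{Phi(0)} = 1 / beta."""
    sigma2_phi0 = 1.0 / beta
    return min(1.0, increment_variance(t, beta) / (lam**2 * sigma2_phi0))

beta = 5.0
times = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
inequality_holds = all(increment_variance(t, beta) <= carati_maiocchi_rhs(t, beta)
                       for t in times)
short_time_bound = chebyshev_probability_bound(0.1, 1.0, beta)
```

In this toy model the bound (2.30) is sharp as \(t \rightarrow 0\) and far from optimal for large t, because the exact variance stays bounded while \(t^2\) grows.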

Next we fix \(m \in {\mathbb N}\), consider the m-th Toda integral \(J^{(m)}\), and prove that the quotient

$$\begin{aligned} \frac{\left\langle \{ J^{(m)}, H_F\}^2\right\rangle }{\sigma ^2_{J^{(m)}} } \end{aligned}$$
(2.32)

scales appropriately in \(\beta \) (as \(\beta \rightarrow \infty \)) and is bounded uniformly in N (provided N is large enough). It is quite delicate to prove that the quotient in (2.32) is bounded uniformly in N, and for this purpose we exploit the rich structure of the Toda integrals of motion.

This manuscript is organized as follows. In Sect. 3 we study the structure of the Toda integrals. In particular we prove that for any \(m \in {\mathbb N}\) fixed, and N sufficiently large, the m-th Toda integral \(J^{(m)}\) can be written as a sum \(\frac{1}{m}\sum _{j=1}^N h_j^{(m)}\) where each term depends on at most m consecutive variables; moreover \(h_j^{(m)}\) and \(h_k^{(m)}\) have disjoint supports if the distance between j and k is larger than m. Then we make the crucial observation that the quadratic parts of the Toda integrals \(J^{(m)}\) are quadratic forms in \(\mathbf{p}\) and \(\mathbf{q}\) generated by symmetric circulant matrices. In Sect. 3 we approximate the Gibbs measure with a measure in which all the variables are independent random variables, and we calculate the error of our approximation. In Sect. 4 we obtain a bound on the variance of \(J^{(m)}(t)-J^{(m)}(0)\) with respect to the FPUT flow and a bound for linear combinations of harmonic energies with respect to the FPUT flow and the Toda flow. Finally in Sect. 5 we prove our main results, namely Theorems 2.1, 2.5 and 2.6. We describe in the “Appendices” the more technical results.

3 Structure of the Toda Integrals of Motion

In this section we study the algebraic and the analytic properties of the Toda integrals defined in (2.12). First we write them explicitly:

Theorem 3.1

For any \(1 \le m \le N-1\), one has

$$\begin{aligned} J^{(m)}= \frac{1}{m} \sum _{j=1}^N h_{j}^{(m)} \, , \end{aligned}$$
(3.1)

where \( h_{j}^{(m)}:= [L^m]_{jj}\) is given explicitly by

$$\begin{aligned} h_{j}^{(m)} (\mathbf{p},\mathbf{r})= \sum _{(\mathbf{n},\mathbf{k})\in {\mathcal A}^{(m)}} (-1)^{|\mathbf{k}|}\, \rho ^{(m)}(\mathbf{n},\mathbf{k}) \prod _{i = -{\widetilde{m}} }^{{\widetilde{m}}-1} e^{-{n_i} r_{j+i}} \prod _{i = -{\widetilde{m}}+1 }^{{\widetilde{m}} -1} p_{j+i}^{k_i} \, , \end{aligned}$$
(3.2)

where it is understood \(r_j \equiv r_{j {\,\mathrm{mod}\,}N}, \, p_j \equiv p_{j {\,\mathrm{mod}\,}N}\) and \({\mathcal A}^{(m)}\) is the set

$$\begin{aligned} \begin{aligned} {\mathcal A}^{(m)} := \Big \{(\mathbf{n},\mathbf{k}) \in {\mathbb N}^{\mathbb {Z}}_0 \times {\mathbb N}^{\mathbb {Z}}_0 \ :\ \ \&\sum _{i= -{\widetilde{m}} }^{{\widetilde{m}}-1} \left( 2n_i + k_i\right) = m , \\&\forall i \ge 0, \ \ \ n_i = 0 \Rightarrow n_{i+1} = k_{i+1} = 0,\,\\&\forall i < 0, \ \ \ n_{i+1} = 0 \Rightarrow n_{i}= k_i = 0 \Big \}. \end{aligned} \end{aligned}$$
(3.3)

Here \({\widetilde{m}} := \lfloor m/2\rfloor \), \({\mathbb N}_0={\mathbb N}\cup \{0\}\), and \(\rho ^{(m)}(\mathbf{n}, \mathbf{k}) \in {\mathbb N}\) is given by

$$\begin{aligned} \rho ^{(m)}(\mathbf{n},\mathbf{k}) :=&\left( {\begin{array}{c}n_{-1} + n_0 + k_0\\ k_0\end{array}}\right) \left( {\begin{array}{c}n_{-1} + n_0\\ n_0\end{array}}\right) \prod _{i=-{\widetilde{m}} \atop i \ne -1}^{ {\widetilde{m}} -1}\left( {\begin{array}{c}n_i + n_{i+1} +k_{i+1} -1\\ k_{i+1}\end{array}}\right) \left( {\begin{array}{c}n_i + n_{i+1} -1\\ n_{i+1}\end{array}}\right) \, . \end{aligned}$$
(3.4)

We give the proof of this theorem in “Appendix D”.

Remark 3.2

The structure of \(J^{(N)}\) is slightly different, but we will not use it here.

We now describe some properties of the Toda integrals which we will use several times. The Hamiltonian density \( h_{j}^{(m)} (\mathbf{p},\mathbf{r})\) depends on the set \({\mathcal A}^{(m)}\) and the coefficients \(\rho ^{(m)}(\mathbf{n}, \mathbf{k})\), which are independent of the index j. This implies that \(h_j^{(m)}\) is obtained from \(h_1^{(m)}\) just by shifting \(1\rightarrow j\); in [18, 19] this property was formalized with the notion of cyclic functions, which we recall below for completeness.

A second immediate property, as one sees inspecting the formulas (3.3) and (3.4), is that there exists \(C^{(m)} >0\) (depending only on m) such that

$$\begin{aligned} |{\mathcal A}^{(m)}|\le C^{(m)} , \quad \rho ^{(m)}(\mathbf{n}, \mathbf{k}) \le C^{(m)} , \end{aligned}$$
(3.5)

namely the cardinality of the set \({\mathcal A}^{(m)}\) and the values of the coefficients \(\rho ^{(m)}(\mathbf{n}, \mathbf{k})\) are independent of N.

The last elementary property, which follows from the condition \(2|\mathbf{n}| + |\mathbf{k}| = m\) in (3.3), is that

$$\begin{aligned} \begin{aligned}&m \text{ even } \quad \Longrightarrow \quad h^{(m)}_j \text{ contains } \text{ only } \text{ even } \text{ polynomials } \text{ in } \mathbf {p}, \\ {}&m \text{ odd } \quad \Longrightarrow \quad h^{(m)}_j \text{ contains } \text{ only } \text{ odd } \text{ polynomials } \text{ in } \mathbf {p}. \end{aligned} \end{aligned}$$
(3.6)
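The parity property (3.6) and the local character of the densities \(h_j^{(m)} = [L^m]_{jj}\) can be observed numerically without using the combinatorial formula (3.2). The Python sketch below works directly with powers of the Lax matrix (2.11); the sample data with N = 9, the 0-based indexing and the function names are assumptions of this illustration.

```python
import math

def lax_matrix(p, r):
    """Periodic Jacobi matrix (2.11) with b_j = -p_j, a_j = exp(-r_j / 2)."""
    n = len(p)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        a_j = math.exp(-0.5 * r[j])
        L[j][j] = -p[j]
        L[j][(j + 1) % n] = a_j
        L[(j + 1) % n][j] = a_j
    return L

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lax_power(p, r, m):
    L = lax_matrix(p, r)
    P = L
    for _ in range(m - 1):
        P = matmul(P, L)
    return P

def density(p, r, m, j):
    """h_j^(m) = [L^m]_{jj}, with 0-based site index j."""
    return lax_power(p, r, m)[j][j]

def toda_integral(p, r, m):
    """J^(m) = (1/m) sum_j h_j^(m), as in (3.1)."""
    P = lax_power(p, r, m)
    return sum(P[j][j] for j in range(len(p))) / m

p = [0.3, -0.2, 0.5, -0.1, 0.4, -0.6, 0.2, -0.3, -0.2]
r = [0.1, -0.3, 0.2, 0.05, -0.15, 0.25, -0.2, 0.1, -0.05]

# Parity (3.6): J^(m)(-p, r) = (-1)^m J^(m)(p, r) for m < N.
minus_p = [-x for x in p]
parity_errors = [abs(toda_integral(minus_p, r, m) - (-1)**m * toda_integral(p, r, m))
                 for m in range(1, 6)]

# Locality: h_j^(4) does not feel variables three sites away from j.
j, m = 4, 4
p_far, r_far = list(p), list(r)
p_far[j + 3] += 1.0
r_far[j + 3] += 1.0
locality_error = abs(density(p_far, r_far, m, j) - density(p, r, m, j))
```

Both checks rely on m being smaller than N, so that no closed path of length m wraps around the periodic chain.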

Now we describe three other important properties of the Toda integrals, which are less trivial and require some preparation. Such properties are

  (i) cyclicity;

  (ii) uniformly bounded support;

  (iii) the quadratic parts of the Toda integrals are represented by circulant matrices.

We first define each of these properties rigorously, and then we show that the Toda integrals enjoy them.

Cyclicity. Cyclic functions are characterized by being invariant under left and right cyclic shifts. For any \(\ell \in {\mathbb Z}\), and \(\mathbf{x}=(x_1, x_2, \ldots , x_N)\in \mathbb {R}^N\) we define the cyclic shift of order \(\ell \) as the map

$$\begin{aligned} S_\ell :{\mathbb R}^N \rightarrow {\mathbb R}^N, \qquad (S_\ell x)_j := x_{(j+\ell ){\,\mathrm{mod}\,}N} . \end{aligned}$$
(3.7)

For example \(S_1\) and \(S_{-1}\) are the left and right shifts, respectively:

$$\begin{aligned} S_1(x_1, x_2, \ldots , x_N) := (x_2, \ldots , x_N, x_{1}), \qquad S_{-1}(x_1, x_2, \ldots , x_N) := (x_N, x_1, \ldots , x_{N-1}). \end{aligned}$$

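It is immediate to check that for any \(\ell , \ell ' \in {\mathbb Z}\), cyclic shifts fulfill:

$$\begin{aligned} S_\ell \circ S_{\ell '} = S_{\ell + \ell '} , \qquad S_\ell ^{-1} = S_{-\ell } , \qquad S_0 = \mathbb {1} , \qquad S_{\ell + N} = S_\ell . \end{aligned}$$
(3.8)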

Consider now a function \(H:{\mathbb R}^N \times {\mathbb R}^N \rightarrow {\mathbb C}\); we shall denote by \(S_\ell H:{\mathbb R}^N \times {\mathbb R}^N \rightarrow {\mathbb C}\) the function

$$\begin{aligned} (S_\ell H)(\mathbf{p}, \mathbf{r}) := H(S_\ell \mathbf{p}, S_\ell \mathbf{r}) , \qquad \forall (\mathbf{p}, \mathbf{r}) \in {\mathbb R}^N \times {\mathbb R}^N. \end{aligned}$$
(3.9)

Clearly \(S_\ell \) is a linear operator. We can now define cyclic functions:

Definition 3.3

(Cyclic functions). A function \(H:{\mathbb R}^N \times {\mathbb R}^N \rightarrow {\mathbb C}\) is called cyclic if \(S_1 H = H\).

It is clear from the definition that a cyclic function fulfills \(S_\ell H = H\) \(\, \forall \ell \in {\mathbb Z}\).

It is easy to construct cyclic functions as follows: given a function \(h :{\mathbb R}^N \times {\mathbb R}^N \rightarrow {\mathbb C}\) we define the new function H by

$$\begin{aligned} H(\mathbf{p}, \mathbf{r}) := \sum _{\ell = 0}^{N-1} (S_{\ell } h)(\mathbf{p}, \mathbf{r}) . \end{aligned}$$
(3.10)

H is clearly cyclic and we say that H is generated by h.
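As a hedged numerical illustration of (3.7), (3.9) and (3.10), the following NumPy sketch builds the cyclic function generated by a local function and checks the condition of Definition 3.3 at a random point. Indices run from 0 to N-1 here, and the function `h` is a hypothetical example chosen only for illustration; it is not one of the Toda densities.

```python
import numpy as np

def shift(x, ell):
    """Cyclic shift (3.7) with 0-based indices: (S_ell x)_j = x_{(j + ell) mod N}."""
    return np.roll(x, -ell)

def h(p, r):
    """Hypothetical local function depending only on sites 0 and 1."""
    return p[0] * p[1] + np.exp(-r[0])

def generated(func, p, r):
    """Cyclic function (3.10): H = sum over ell = 0..N-1 of S_ell func."""
    N = len(p)
    return sum(func(shift(p, ell), shift(r, ell)) for ell in range(N))

rng = np.random.default_rng(0)
p, r = rng.normal(size=6), rng.normal(size=6)

H_value = generated(h, p, r)
# (S_1 H)(p, r) = H(S_1 p, S_1 r), see (3.9)
H_shifted = generated(h, shift(p, 1), shift(r, 1))
print(np.isclose(H_value, H_shifted))
```

The check works because shifting the arguments only permutes the terms of the sum in (3.10).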

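Support. Given a differentiable function \(F :{\mathbb R}^N \times {\mathbb R}^N \rightarrow {\mathbb C}\), we define its support as the set

$$\begin{aligned} \mathrm{supp }\, F := \left\{ \ell \in \{0, \ldots , N-1\} :\ \frac{\partial F}{\partial p_\ell } \not \equiv 0 \ \text { or } \ \frac{\partial F}{\partial r_\ell } \not \equiv 0 \right\} , \end{aligned}$$
(3.11)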

and its diameter as

$$\begin{aligned} \mathrm{diam} \left( \mathrm{supp }\, F\right) := \sup _{i, j \in \mathrm{supp}\, F} {\mathtt d}(i,j) + 1 , \end{aligned}$$
(3.12)

where \({\mathtt d}\) is the periodic distance

$$\begin{aligned} {\mathtt d}(i,j) := \min \left( |i-j|, \ N- |i-j| \right) . \end{aligned}$$
(3.13)

Note that \(0\le {\mathtt d}(i,j) \le \lfloor N/2\rfloor \).

We often use the following property: if f is a function with diameter \(K \in {\mathbb N}\), and \(K \ll N\), then

$$\begin{aligned} {\mathtt d}(i,j) > K \quad \Longrightarrow \quad \mathrm{supp }\, S_j f \cap \mathrm{supp }\, S_i f = \emptyset , \end{aligned}$$
(3.14)

where \(S_j\) is the shift operator (3.7). With the above notation and definitions we arrive at the following elementary result.

Lemma 3.4

Consider the Toda integral \(J^{(m)}= \frac{1}{m} \sum _{j=1}^N h_{j}^{(m)} \,\), \(1 \le m \le N\) in (3.1). Then \(J^{(m)}\) is a cyclic function generated by \(\frac{1}{m} h^{(m)}_1\), namely

$$\begin{aligned} J^{(m)}(\mathbf{p},\mathbf{r})=\frac{1}{m}\sum _{j=1}^N S_{j-1} h_1^{(m)}(\mathbf{p},\mathbf{r}). \end{aligned}$$
(3.15)

Further, each term \(h_j^{(m)}\) has diameter at most m. In particular \(h_j^{(m)}\) and \(h_k^{(m)}\) have disjoint supports provided \({\mathtt d}(j,k) > m\).

Circulant symmetric matrices. We begin by recalling the definition of circulant matrices (see e.g. [21, Chap. 3]).

Definition 3.5

(Circulant matrix). An \(N \times N\) matrix A is said to be circulant if there exists a vector \(\mathbf{a}=(a_j)_{j=0}^{N-1} \in {\mathbb R}^N\) such that

$$\begin{aligned} A_{j,k} = a_{(j-k) {\,\mathrm{mod}\,}N} . \end{aligned}$$

We will say that A is represented by the vector \( \mathbf{a}\).

In particular, all circulant matrices have the form

$$\begin{aligned} A = {\begin{bmatrix} a_{0}&{} a_{{N-1}}&{}\dots &{} a_{{2}}&{} a_{{1}} \\ a_{{1}}&{} a_{0}&{} a_{{N-1}}&{}&{} a_{{2}} \\ \vdots &{} a_{{1}}&{} a_{0}&{}\ddots &{}\vdots \\ a_{{N-2}}&{}&{}\ddots &{}\ddots &{} a_{{N-1}} \\ a_{{N-1}}&{} a_{{N-2}}&{}\dots &{} a_{{1}}&{} a_{0} \\ \end{bmatrix}} \end{aligned}$$

where each row is the right shift of the row above.

Moreover, A is circulant symmetric if and only if its representing vector \(\mathbf{a}\) is even, i.e. one has

$$\begin{aligned} a_{k} = a_{N-k}\ , \quad \forall k . \end{aligned}$$
(3.16)

One of the most remarkable properties of circulant matrices is that they are all diagonalized by the discrete Fourier transform (see e.g. [21, Chap. 3]). We show now that circulant symmetric matrices are diagonalized by the Hartley transform:

Lemma 3.6

Let A be a circulant symmetric matrix represented by the vector \(\mathbf{a}\in {\mathbb R}^N\). Then

$$\begin{aligned} {\mathcal H}A {\mathcal H}^{-1} = \sqrt{N} \, \mathrm{diag } \{{\widehat{a}}_j :\ 0 \le j \le N-1 \}, \end{aligned}$$
(3.17)

where \(\widehat{\mathbf{a}}= {\mathcal H}\mathbf{a}\).

Proof

First remark that a circulant matrix acts on a vector \(\mathbf{x}\in {\mathbb R}^N\) as a periodic discrete convolution,

$$\begin{aligned} A \mathbf{x}= \mathbf{a}\star \mathbf{x}, \qquad (\mathbf{a}\star \mathbf{x})_j := \sum _{k = 0}^{N-1} a_{j-k}\, x_{k} , \qquad 0 \le j \le N-1 , \end{aligned}$$
(3.18)

where it is understood \(a_\ell \equiv a_{\ell {\,\mathrm{mod}\,}N}\). As the Hartley transform of a discrete convolution is given by

$$\begin{aligned}{}[{\mathcal H}( \mathbf{a}\star \mathbf{x})]_k = \frac{\sqrt{N}}{2} \Big ( ({\widehat{a}}_k + {\widehat{a}}_{N-k}){\widehat{x}}_k + ({\widehat{a}}_k - {\widehat{a}}_{N-k}){\widehat{x}}_{N-k})\Big ), \end{aligned}$$

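we obtain (3.17), using that the Hartley transform maps even vectors (see (3.16)) into even vectors, so that \({\widehat{a}}_k = {\widehat{a}}_{N-k}\) and the term proportional to \({\widehat{x}}_{N-k}\) vanishes. \(\square \)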
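Lemma 3.6 can be checked numerically. The sketch below assumes the normalization \({\mathcal H}_{jk} = N^{-1/2}\left( \cos (2\pi jk/N) + \sin (2\pi jk/N)\right) \) for the Hartley matrix, which makes it symmetric and orthogonal; this normalization is an assumption of the example, chosen to be consistent with the factor \(\sqrt{N}\) in (3.17). The even vector `a` is an arbitrary test case.

```python
import numpy as np

def hartley_matrix(N):
    """Assumed normalization: H[j, k] = (cos + sin)(2*pi*j*k/N) / sqrt(N)."""
    jk = np.outer(np.arange(N), np.arange(N))
    angle = 2.0 * np.pi * jk / N
    return (np.cos(angle) + np.sin(angle)) / np.sqrt(N)

def circulant(a):
    """Circulant matrix of Definition 3.5: A[j, k] = a[(j - k) mod N]."""
    N = len(a)
    j, k = np.indices((N, N))
    return a[(j - k) % N]

N = 8
# Even representing vector, a[k] == a[N - k], so that A is symmetric (3.16).
a = np.array([2.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.5, -1.0])
H = hartley_matrix(N)
A = circulant(a)

lhs = H @ A @ np.linalg.inv(H)        # left-hand side of (3.17)
rhs = np.sqrt(N) * np.diag(H @ a)     # sqrt(N) * diag(a_hat)
print(np.allclose(lhs, rhs))
```

With this normalization the Hartley matrix is its own inverse, so `np.linalg.inv(H)` could be replaced by `H` itself.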

Our interest in circulant matrices comes from the following fact: quadratic cyclic functions are represented by circulant matrices. More precisely consider a quadratic function of the form

$$\begin{aligned} Q(\mathbf{p}, \mathbf{r}) = \frac{1}{2}\mathbf{p}^\intercal A \mathbf{p}+ \frac{1}{2} \mathbf{r}^\intercal B \mathbf{r}+ \mathbf{p}^\intercal C \mathbf{r}, \end{aligned}$$
(3.19)

where A, B, C are \(N \times N\) matrices. Then one has

$$\begin{aligned} Q \text { is cyclic } \quad \Longleftrightarrow \quad A, B, C \text { are circulant } . \end{aligned}$$
(3.20)

This result, which is well known (see e.g. [21]), follows from the fact that Q cyclic is equivalent to A, B, C commuting with the left cyclic shift \(S_1\), and that the set of matrices which commute with \(S_1\) coincides with the set of circulant matrices.
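The mechanism behind (3.20) can be illustrated numerically: a circulant matrix commutes with the permutation matrix of the cyclic shift, so the associated bilinear form is invariant under \(S_1\). In the sketch below the representing vector, the comparison matrix `M` and the test point are arbitrary random choices, and the helper names are ours.

```python
import numpy as np

def circulant(a):
    """Circulant matrix of Definition 3.5: A[j, k] = a[(j - k) mod N]."""
    N = len(a)
    j, k = np.indices((N, N))
    return a[(j - k) % N]

N = 7
rng = np.random.default_rng(1)
A = circulant(rng.normal(size=N))     # generic circulant matrix
M = rng.normal(size=(N, N))           # generic non-circulant matrix, for comparison
P = np.roll(np.eye(N), -1, axis=0)    # matrix of S_1: (P x)_j = x_{(j + 1) mod N}

def Q(mat, p, r):
    """Bilinear form p^T mat r, as in the mixed term of (3.19)."""
    return p @ mat @ r

p, r = rng.normal(size=N), rng.normal(size=N)
print(np.allclose(P @ A, A @ P))                    # circulant: commutes with S_1
print(np.isclose(Q(A, P @ p, P @ r), Q(A, p, r)))   # hence Q is cyclic
print(np.allclose(P @ M, M @ P))                    # generic matrix: does not commute
```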

We conclude this section collecting some properties of Toda integrals. Denote by \(J_2^{(m)}\) the Taylor polynomial of order 2 of \(J^{(m)}\) at zero; being a quadratic, symmetric, cyclic function, it is represented by circulant symmetric matrices. We have the following lemma.

Lemma 3.7

Let us consider the Toda integral

$$\begin{aligned} J^{(m)}(\mathbf{p},\mathbf{r})=\frac{1}{m}\sum _{j=1}^N S_{j-1} h_1^{(m)}(\mathbf{p},\mathbf{r}). \end{aligned}$$

Then \(h_1^{(m)}(\mathbf{p},\mathbf{r})\) has the following Taylor expansion at \(\mathbf{p}= \mathbf{r}= 0\):

$$\begin{aligned} h_1^{(m)}(\mathbf{p},\mathbf{r}) = {\varphi }_0^{(m)}+ {\varphi }_1^{(m)}(\mathbf{p},\mathbf{r}) + {\varphi }_2^{(m)}(\mathbf{p},\mathbf{r}) + {\varphi }_{\ge 3}^{(m)}(\mathbf{p},\mathbf{r}) \end{aligned}$$
(3.21)

where each \({\varphi }_k^{(m)}(\mathbf{p},\mathbf{r})\) is a homogeneous polynomial of degree \(k=0,1,2\) in \(\mathbf{p}\) and \(\mathbf{r}\) of diameter m and coefficients independent of N. The remainder \( {\varphi }_{\ge 3}^{(m)}(\mathbf{p},\mathbf{r})\) takes the form

$$\begin{aligned} {\varphi }_{\ge 3}^{(m)}(\mathbf{p},\mathbf{r}) := \sum _{(\mathbf{k}, \mathbf{n}) \in {\mathcal A}^{(m)} \atop |\mathbf{k}| \ge 3 } \, (-1)^{|\mathbf{k}| } \rho ^{(m)}(\mathbf{n}, \mathbf{k}) \, \mathbf{p}^{\mathbf{k}} \left( 1 - \mathbf{n}^\intercal \mathbf{r}+ \frac{1}{2} (\mathbf{n}^\intercal \mathbf{r})^2 + \frac{(\mathbf{n}^\intercal \mathbf{r})^3}{2} \int _0^1 e^{-s\mathbf{n}^\intercal \mathbf{r}} \, (1-s)^2 \, \mathrm{d}s \right) \, , \end{aligned}$$
(3.22)

with \( {\mathcal A}^{(m)}\) and \( \rho ^{(m)}\) defined in (3.3) and (3.4) respectively. Moreover the Taylor expansion of \(J^{(m)}(\mathbf{p},\mathbf{r})\) at \(\mathbf{p}= \mathbf{r}= 0\) takes the form

$$\begin{aligned} J^{(m)}(\mathbf{p}, \mathbf{r}) = J^{(m)}_0 + J^{(m)}_2(\mathbf{p},\mathbf{r})+ J^{(m)}_{\ge 3}(\mathbf{p},\mathbf{r}), \end{aligned}$$
(3.23)

where

  • \( J^{(m)}_0={\left\{ \begin{array}{ll} c \in {\mathbb R}, &{} m \text{ even } \\ 0\,, &{} m \text{ odd } \text{. } \end{array}\right. }\)

  • \(J^{(m)}_2(\mathbf{p},\mathbf{r})\) is a cyclic function of the form

    $$\begin{aligned} J^{(m)}_2(\mathbf{p},\mathbf{r}) = {\left\{ \begin{array}{ll} \mathbf{p}^\intercal A^{(m)} \mathbf{p}+ \mathbf{r}^\intercal A^{(m)} \mathbf{r}, &{} m \text{ even } \\ \mathbf{p}^\intercal B^{(m)} \mathbf{r}, &{} m \text{ odd } \end{array}\right. } \end{aligned}$$
    (3.24)

    with \(A^{(m)}, B^{(m)}\) circulant, symmetric \(N \times N\) matrices; their representing vectors \( \mathbf{a}^{(m)}\), \(\mathbf{b}^{(m)}\) are m-admissible (according to Definition 2.4) and

    $$\begin{aligned} a_k^{(m)} = a_{N-k}^{(m)}> 0 , \qquad b_k^{(m)} = b_{N-k}^{(m)} > 0 , \qquad \forall 0 \le k \le {\widetilde{m}} : =\Big \lfloor \frac{m}{2}\Big \rfloor . \end{aligned}$$
    (3.25)
  • The remainder \(J^{(m)}_{\ge 3}\) is a cyclic function generated by \(\frac{\varphi _{\ge 3}^{(m)}}{m}\).

The proof is postponed to “Appendix A”. We conclude this section by giving the definition of m-admissible functions and proving a lemma that characterizes them in terms of \(\{J_2^{(l)}\}_{l=1}^{N}\).

Definition 3.8

\(G_1,G_2:{\mathbb R}^N\times {\mathbb R}^N\rightarrow {\mathbb R}\) are called m-admissible functions of the first and second kind, respectively, if there exists an m-admissible vector \(\varvec{g}\in {\mathbb R}^N\) such that

$$\begin{aligned} G_1 := \sum _{j,l=0}^{N-1} g_l p_jr_{j+l} \, , \qquad G_2 := \sum _{j,l=0}^{N-1} g_l \left( p_j p_{j+l} + r_jr_{j+l}\right) \, . \end{aligned}$$
(3.26)

Remark 3.9

From Definition 3.8 and (3.20) one can deduce that both \(G_1\) and \(G_2\) can be represented with circulant and symmetric matrices. Indeed we have that \(G_1=\mathbf{p}^\intercal {\mathcal G}_1\mathbf{r}\) where \({(\mathcal G}_1)_{jk}=g_{(j-k) {\,\mathrm{mod}\,}N} \) and similarly for \(G_2\).

An immediate, but very useful, corollary of Lemma 3.7, is the fact that the quadratic parts of Toda integrals are a basis of the vector space of m-admissible functions.

Lemma 3.10

Fix \(m\in {\mathbb N}\) and let \(G_1\) and \( G_2\) be m-admissible functions of the first and second kind defined by an m-admissible vector \(\varvec{g}\in {\mathbb R}^N\). Then there are two unique sequences \(\{c_j\}_{j=0}^{{\widetilde{m}}}, \, \{d_j\}_{j=0}^{{\widetilde{m}}}\), with \(\max _j |c_j|, \, \max _j |d_j| \) independent of N, such that:

$$\begin{aligned} G_1 = \sum _{l=0}^{{\widetilde{m}}} c_l J_{2}^{(2l+1)}, \quad G_2 = \sum _{l=0}^{{\widetilde{m}}} d_l J_{2}^{(2l + 2)}, \end{aligned}$$
(3.27)

where \(J_2^{(m)}\) is the quadratic part (3.24) of the Toda integrals \(J^{(m)}\) in (3.1).

Proof

We will prove the statement just for functions of the first kind. The proof for functions of the second kind can be obtained in a similar way. Let \(J^{(2l+1)}_2 = \mathbf{p}^\intercal B^{(2l+1)} \mathbf{r}\) where the circulant matrix \(B^{(2l+1)} \) is represented by the vector \(\mathbf{b}^{(2l+1)}\) and let \(G_1=\mathbf{p}^\intercal {\mathcal G}_1\mathbf{r}\) where \({(\mathcal G}_1)_{jk}=g_{(j-k) {\,\mathrm{mod}\,}N} \). Then

$$\begin{aligned} {\mathcal G}_1=\sum _{l=0}^{{\widetilde{m}}} c_l B^{(2l+1)}\Longrightarrow g_{k}=\sum _{l=0}^{{\widetilde{m}}} b^{(2l+1)}_kc_l \,. \end{aligned}$$

From Lemma 3.7 the matrix \(\mathfrak {B}=[b^{(2l+1)}_k]_{k,l=0}^{{\widetilde{m}}}\) is upper triangular and the diagonal elements are always different from 0 (see in particular formula (3.25)). This implies that the above linear system is uniquely solvable for \((c_0,\dots , c_{{\widetilde{m}}})\). \(\square \)

4 Averaging and Covariance

In this section we collect some properties of the Gibbs measure \(\mathrm{d}\mu _F\) in (2.5). The first property is the invariance with respect to the shift operator. Namely, for any function \(f:{\mathbb R}^N \times {\mathbb R}^N \rightarrow {\mathbb R}\) we have that

$$\begin{aligned} \left\langle S_j f \right\rangle = \left\langle f \right\rangle , \qquad \forall j = 0, \ldots , N-1\,, \end{aligned}$$
(4.1)

which follows from the fact that \((S_j)_*\mathrm{d}\mu _F = \mathrm{d}\mu _F\).

It is in general not possible to compute exactly the average of a function with respect to the Gibbs measure \(\mathrm{d}\mu _F\) in (2.5). This is mostly due to the fact that the variables \(p_0, \ldots , p_{N-1}\) and \(r_0, \ldots , r_{N-1}\) are not independent with respect to the measure \(\mathrm{d}\mu _F\), being constrained by the conditions \(\sum _i r_i = \sum _i p_i = 0\).

We will therefore proceed as in [29], by considering a new measure \(\mathrm{d}\mu _{F,\theta }\) on the extended phase space according to which all variables are independent. We will be able to compute averages and correlations with respect to this measure, and to estimate the error introduced by this approximation.

For any \(\theta \in {\mathbb R}\), we define the measure \(\mathrm{d}\mu _{F,\theta }\) on the extended space \({\mathbb R}^N\times {\mathbb R}^N\) by

$$\begin{aligned} \mathrm{d}\mu _{F,\theta } := \frac{1}{Z_{F,\theta }(\beta )} \ e^{-\beta {H_F(\mathbf{p},\mathbf{r})}} \, e^{- \theta \sum _{j=0}^{N-1} r_j} \ \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{r}, \end{aligned}$$
(4.2)

where we define \(Z_{F,\theta }(\beta )\) as the normalizing constant of \(\mathrm{d}\mu _{F,\theta }\). We denote the expectation of a function f with respect to \(\mathrm{d}\mu _{F,\theta }\) by \( \left\langle f \right\rangle _\theta \). We also denote by

$$\begin{aligned} \Vert f \Vert ^2_\theta := \int _{{\mathbb R}^{2N}} |f(\mathbf{p},\mathbf{r})|^2 \, \mathrm{d}\mu _{F,\theta } . \end{aligned}$$

If \(\Vert f \Vert _\theta < \infty \) we say that \(f \in L^2(\mathrm{d}\mu _{F,\theta })\).

The measure \(\mathrm{d}\mu _{F,\theta }\) depends on the parameter \(\theta \in {\mathbb R}\) and we fix it in such a way that

$$\begin{aligned} \int _{\mathbb R}r \, e^{- \theta r - \beta V_F(r)} \, \mathrm{d}r = 0 . \end{aligned}$$
(4.3)

Following [29], it is not difficult to prove that there exist \(\beta _0 >0\) and a compact set \({\mathcal I}\subset {\mathbb R}\) such that for any \(\beta > \beta _0\), there exists \(\theta = \theta (\beta ) \in {\mathcal I}\) for which (4.3) holds true. We remark that (4.3) is equivalent to requiring that \(\left\langle r_j \right\rangle _\theta = 0\) for \(\, j=0,\dots , N-1\) and, as a consequence, \( \left\langle \sum _{j=0}^{N-1} r_j \right\rangle _\theta = 0 .\) We observe that \(\left\langle \sum _{j=0}^{N-1} r_j \right\rangle = 0\) also holds with respect to the measure \(\mathrm{d}\mu _F\).
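To make condition (4.3) concrete, here is a minimal numerical sketch that locates \(\theta (\beta )\) by bisection, using that the integral in (4.3) is strictly decreasing in \(\theta \) (its derivative is minus the integral of \(r^2\) against a positive weight). The integration grid, the bracket \([0,1]\) and the parameter values \(\beta = 100\), \({\mathtt b}= 2\) are ad hoc assumptions of the example, not values taken from the text.

```python
import numpy as np

def V_F(x, b):
    """FPUT potential from (1.1)."""
    return x**2 / 2 - x**3 / 6 + b * x**4 / 24

def first_moment(theta, beta, b, grid):
    """Riemann-sum approximation of the integral in (4.3) on a uniform grid."""
    weight = np.exp(-theta * grid - beta * V_F(grid, b))
    return np.sum(grid * weight) * (grid[1] - grid[0])

def solve_theta(beta, b, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for theta(beta); bracket and grid are ad hoc choices."""
    grid = np.linspace(-5.0, 5.0, 20001)
    if first_moment(lo, beta, b, grid) <= 0 or first_moment(hi, beta, b, grid) >= 0:
        raise ValueError("bracket does not contain a sign change")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The first moment is strictly decreasing in theta.
        if first_moment(mid, beta, b, grid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta, b = 100.0, 2.0
theta = solve_theta(beta, b)
residual = first_moment(theta, beta, b, np.linspace(-5.0, 5.0, 20001))
print(theta, residual)
```

For other values of \(\beta \) or \({\mathtt b}\) the bracket and the grid may need to be adjusted; the function raises an error if the bracket contains no sign change.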

The main reason for introducing the measure \(\mathrm{d}\mu _{F, \theta }\) is that it approximates averages with respect to \(\mathrm{d}\mu _F\) as the following result shows.

Lemma 4.1

Fix \({\widetilde{\beta }}>0\) and let \(f :{\mathbb R}^N\times {\mathbb R}^N \rightarrow {\mathbb R}\) have support of size K (according to (3.11)) and finite second order moment with respect to \(\mathrm{d}\mu _{F,\theta }\), uniformly for all \(\beta > {\widetilde{\beta }}\). Then there exist positive constants \(C, N_0\) and \(\beta _0 \) such that for all \(N > N_0\), \(\beta > \max \{\beta _0,{\widetilde{\beta }}\}\) one has

$$\begin{aligned} \left| \left\langle f \right\rangle - \left\langle f \right\rangle _\theta \right| \le C \frac{K}{N} \sqrt{\left\langle f^2\right\rangle _{\theta } - \left\langle f \right\rangle _\theta ^2}\, . \end{aligned}$$
(4.4)

The above lemma is an extension to the periodic case of a result from [29], and we shall prove it in “Appendix C”. As an application of Lemma 4.1, we give a bound on correlation functions.

Lemma 4.2

Fix \(K \in {\mathbb N}\). Let \(f, g :{\mathbb R}^{N}\times {\mathbb R}^N \rightarrow {\mathbb C}\) be such that:

  1. 1.

    \(f, g\) and \( fg \in L^2(\mathrm{d}\mu _{F,\theta })\),

  2. 2.

    the supports of f and g have size at most \(K \in {\mathbb N}\).

Then there exist \(C, N_0, \beta _0 >0\) such that for all \(N > N_0\), \(\beta > \beta _0\)

$$\begin{aligned} {\left| \left\langle f g \right\rangle - \left\langle f \right\rangle \left\langle g \right\rangle \right| \le 2\Vert f \Vert _\theta \, \Vert g \Vert _\theta + \frac{C K}{N} \Big (\Vert f \Vert _\theta \, \Vert g \Vert _\theta + \Vert fg \Vert _\theta \Big ) . } \end{aligned}$$
(4.5)

Moreover, if f and g have disjoint supports, then

$$\begin{aligned} \left| \left\langle f g \right\rangle - \left\langle f \right\rangle \left\langle g \right\rangle \right| \le \frac{C K}{N} \Big (\Vert f \Vert _\theta \, \Vert g \Vert _\theta + \Vert fg \Vert _\theta \Big ) . \end{aligned}$$
(4.6)

Proof

We substitute the measure \(\mathrm{d}\mu _F\) with \(\mathrm{d}\mu _{F,\theta }\) and then we control the error by using Lemma 4.1. With this idea, we write

$$\begin{aligned} \left\langle f g \right\rangle - \left\langle f \right\rangle \left\langle g \right\rangle =&\left\langle f g \right\rangle - \left\langle f g \right\rangle _{\theta } \end{aligned}$$
(4.7)
$$\begin{aligned}&+ \left\langle f g \right\rangle _\theta - \left\langle f \right\rangle _\theta \left\langle g \right\rangle _\theta \end{aligned}$$
(4.8)
$$\begin{aligned}&+ \left\langle f \right\rangle _\theta \left\langle g \right\rangle _\theta - \left\langle f \right\rangle \left\langle g \right\rangle \, , \end{aligned}$$
(4.9)

and estimate the different terms. We will often use the inequality

$$\begin{aligned} \left| \left\langle f \right\rangle _\theta \right| \le \Vert f \Vert _\theta \, , \end{aligned}$$
(4.10)

valid for any function \(f \in L^2(\mathrm{d}\mu _{F,\theta })\).

Estimate of (4.7): By Lemma 4.1, and the assumption that fg depends on at most 2K variables,

$$\begin{aligned} \left| \left\langle fg \right\rangle - \left\langle fg \right\rangle _{\theta } \right|&\le C \frac{2K}{N} \sqrt{\left\langle (fg)^2\right\rangle _\theta - \left\langle fg\right\rangle _\theta ^2} \le \frac{C' K }{N} \Vert fg \Vert _{\theta } \, . \end{aligned}$$

Estimate of (4.8): By Cauchy-Schwarz and (4.10) we have

$$\begin{aligned} \left| \left\langle fg \right\rangle _\theta - \left\langle f \right\rangle _\theta \left\langle g \right\rangle _\theta \right| \le 2 \Vert f \Vert _\theta \Vert g \Vert _\theta \, . \end{aligned}$$
(4.11)

Estimate of (4.9): We decompose further

$$\begin{aligned} \left\langle f \right\rangle _\theta \left\langle g \right\rangle _\theta - \left\langle f \right\rangle \left\langle g \right\rangle =&\left\langle g \right\rangle _\theta \left( \left\langle f\right\rangle _\theta - \left\langle f \right\rangle \right) + \left( \left\langle g \right\rangle _\theta - \left\langle g \right\rangle \right) \left\langle f \right\rangle _\theta \\&+ \left( \left\langle g \right\rangle _\theta - \left\langle g \right\rangle \right) \left( \left\langle f \right\rangle - \left\langle f\right\rangle _\theta \right) \, , \end{aligned}$$

and, again by Lemma 4.1 and (4.10), we obtain

$$\begin{aligned} \left| \left\langle f \right\rangle _\theta \left\langle g \right\rangle _\theta - \left\langle f \right\rangle \left\langle g \right\rangle \right|&\le C\frac{K}{N} \Vert g \Vert _\theta \Vert f \Vert _\theta \, . \end{aligned}$$

Combining the three bounds above and redefining \(C=\max \{C,C'\}\) one obtains (4.5). To prove (4.6) it is sufficient to observe that if f and g have disjoint supports, then \(\left\langle fg\right\rangle _\theta = \left\langle f\right\rangle _\theta \left\langle g \right\rangle _\theta \) and consequently (4.8) is equal to zero. \(\square \)

In order to make Lemma 4.2 effective we need to show how to compute averages according to the measure (4.2).

Lemma 4.3

There exists \(\beta _0 >0\) such that for any \(\beta > \beta _0\), the following holds true. For any fixed multi-indices \(\mathbf{k},\mathbf{l},\mathbf{n},\mathbf{s}\in {\mathbb N}_0^{N}\) and \(d,d' \in \{ 0,1,2\}\), there are two constants \(C_{\mathbf{k},\mathbf{l}}^{(1)} \in {\mathbb R}\) and \(C_{\mathbf{k},\mathbf{l}}^{(2)}>0\) such that

$$\begin{aligned} \frac{C^{(1)}_{\mathbf{k},\mathbf{l}}}{\beta ^{\frac{|\mathbf{k}| + |\mathbf{l}| }{2}}} \le \left\langle \mathbf{p}^\mathbf{k}\, \mathbf{r}^\mathbf{l}\, \left( \int _{0}^1 e^{- \xi \mathbf{n}^\intercal \mathbf{r}}(1-\xi )^2\mathrm{d}\xi \right) ^d \left( \int _{0}^1 e^{- \xi \mathbf{s}^\intercal \mathbf{r}}(1-\xi )^3\mathrm{d}\xi \right) ^{d'} \right\rangle _\theta \le \frac{C^{(2)}_{\mathbf{k},\mathbf{l}}}{\beta ^{\frac{|\mathbf{k}| + |\mathbf{l}| }{2}}} \end{aligned}$$

where \(\mathbf{p}^\mathbf{k}=\prod _{j=1}^Np_j^{k_j}\) and \( \mathbf{r}^\mathbf{l}=\prod _{j=1}^N r_j^{l_j} \). Moreover:

  1. (i)

    if \(k_i\) is odd for some i then \(C^{(1)}_{\mathbf{k},\mathbf{l}} = C^{(2)}_{\mathbf{k},\mathbf{l}} = 0\);

  2. (ii)

    if \(k_i, l_i\) are even for all i then \(C^{(1)}_{\mathbf{k},\mathbf{l}} >0\).

The lemma is proved in “Appendix B”.

Remark 4.4

Actually all the results of this section hold true (with different constants) also when we endow \({\mathcal M}\) with the Gibbs measure of the Toda chain in (2.25) and we use as approximating measure

$$\begin{aligned} \mathrm{d}\mu _{T,\theta } := \frac{1}{Z_{T,\theta }(\beta )} \ e^{-\beta {H_T(\mathbf{p},\mathbf{r})}} \, e^{- \theta \sum _{j=0}^{N-1} r_j} \ \mathrm{d}\mathbf{p}\, \mathrm{d}\mathbf{r}; \end{aligned}$$
(4.12)

here \(\theta \) is selected in such a way that

$$\begin{aligned} \int _{\mathbb R}r \, e^{- \theta r - \beta V_T(r)} \, \mathrm{d}r = 0 . \end{aligned}$$
(4.13)

We show in “Appendix B” that it is always possible to choose \(\theta \) to fulfill (4.13) (see Lemma B.1) and we also prove Lemma 4.3 for Toda. In “Appendix C” we prove Lemma 4.1 for the Toda chain.

5 Bounds on the Variance

In this section we prove upper and lower bounds on the variance of the quantities relevant to prove our main theorems.

5.1 Upper bounds on the variance of \(J^{(m)}\) along the flow of FPUT

In this subsection we only consider the case \({\mathcal M}\) endowed with the FPUT Gibbs measure. We denote by \(J^{(m)}(t) := J^{(m)}\circ \phi ^t_{H_F}\) the Toda integral computed along the Hamiltonian flow \(\phi ^t_{H_F}\) of the FPUT Hamiltonian. The aim is to prove the following result:

Proposition 5.1

Fix \(m \in {\mathbb N}\). There exist \(N_0, \beta _0, C_0, C_1>0\) such that for any \(N > N_0\), \(\beta > \beta _0\), one has

$$\begin{aligned}&\sigma _{J^{(m)}(t) - J^{(m)}(0)}^2 \le C_0 N \left( \frac{({\mathtt b}- 1)^2}{\beta ^4} + \frac{C_1}{\beta ^{5}}\right) t^2 , \qquad \forall t \in {\mathbb R}. \end{aligned}$$
(5.1)

Proof

As explained in the introduction, applying formula (2.30) we get

$$\begin{aligned} \sigma _{J^{(m)}(t) - J^{(m)}(0)}^2 \le \left\langle \{ J^{(m)}, H_F\}^2\right\rangle t^2 , \qquad \forall t \in {\mathbb R}. \end{aligned}$$
(5.2)

Therefore we need to bound \(\left\langle \{J^{(m)}, H_F\}^2 \right\rangle \). For this purpose we rewrite this term in a more convenient form. Since \(\left\langle \cdot \right\rangle \) is an invariant measure with respect to the Hamiltonian flow of \(H_F\), one has

$$\begin{aligned} \left\langle \{ J^{(m)}, H_F \} \right\rangle = 0 . \end{aligned}$$
(5.3)

Furthermore, since \(J^{(m)}\) is an integral of motion of the Toda Hamiltonian \(H_T\), we have

$$\begin{aligned} \left\{ J^{(m)}, H_T\right\} = 0 . \end{aligned}$$
(5.4)

We apply identities (5.3) and (5.4) to write

$$\begin{aligned} \left\langle \left\{ J^{(m)}, H_F\right\} ^2\right\rangle = \left\langle \left\{ J^{(m)}, H_F - H_T\right\} ^2 \right\rangle - \left\langle \{ J^{(m)}, H_F- H_T \} \right\rangle ^2 \, . \end{aligned}$$
(5.5)

The above expression enables us to exploit the fact that the FPUT system is a fourth order perturbation of the Toda chain. To proceed with the proof we need the following technical result.

Lemma 5.2

One has

$$\begin{aligned} \{J^{(m)}, H_F-H_T \}= \sum _{j=1}^N H_j^{(m)} \, , \end{aligned}$$
(5.6)

where the functions \(H_j^{(m)}\) fulfill

  1. (i)

    \(H_j^{(m)} = S_{j-1} H_1^{(m)}\) \(\ \forall j \), moreover the diameter of the support of \(H^{(m)}_j\) is at most m;

  2. (ii)

    there exist \(N_0, \beta _0, C, C'>0\) such that for any \(N > N_0\), \(\beta > \beta _0\), any \(i, j = 1, \ldots , N\), the following estimates hold true:

    $$\begin{aligned} \Vert H_j^{(m)} \Vert _\theta \le C\left( \frac{({\mathtt b}-1)^2}{\beta ^4} + \frac{C'}{\beta ^5} \right) ^{1/2}, \qquad \Vert H_i^{(m)} H_j^{(m)} \Vert _\theta \le C\left( \frac{\left( {\mathtt b}-1\right) ^4}{\beta ^8} + \frac{C'}{\beta ^{10}}\right) ^{1/2} . \end{aligned}$$
    (5.7)

The proof of the lemma is postponed to the end of the subsection.

We are now ready to finish the proof of Proposition 5.1. Substituting (5.6) in (5.5) we obtain

$$\begin{aligned} \left\langle \left\{ J^{(m)}, H_F\right\} ^2\right\rangle&= \sum _{j,i=1}^{N} \left[ \left\langle H_i^{(m)} H_j^{(m)} \right\rangle - \left\langle H_i^{(m)} \right\rangle \left\langle H_j^{(m)} \right\rangle \right] . \end{aligned}$$
(5.8)

Therefore estimating \(\left\langle \left\{ J^{(m)}, H_F\right\} ^2\right\rangle \) amounts to estimating the correlations between \(H_i^{(m)}\) and \(H_j^{(m)}\). Exploiting Lemma 4.2 and observing that if \({\mathtt d}(i,j) > m\) then \(H_i^{(m)}\) and \(H_j^{(m)}\) have disjoint supports (see Lemma 5.2 (i) and (3.14)), we get that there are positive constants, which for convenience we still call C and \(C'\), such that for all N and \(\beta \) large enough

$$\begin{aligned}&\left| \left\langle H_i^{(m)} H_j^{(m)}\right\rangle - \left\langle H_i^{(m)} \right\rangle \left\langle H_j^{(m)} \right\rangle \right| \le C \left( \frac{({\mathtt b}- 1)^2}{\beta ^4} + \frac{C'}{\beta ^5} \right) , \qquad \forall i, j , \end{aligned}$$
(5.9)
$$\begin{aligned}&\left| \left\langle H_i^{(m)} H_j^{(m)}\right\rangle - \left\langle H_i^{(m)} \right\rangle \left\langle H_j^{(m)} \right\rangle \right| \le \frac{C}{N} \left( \frac{({\mathtt b}- 1)^2}{\beta ^4} + \frac{C'}{\beta ^5} \right) , \qquad \forall i, j :{\mathtt d}(i, j) > m\,. \end{aligned}$$
(5.10)

From (5.8) we split the sum in two terms:

$$\begin{aligned} \left\langle \left\{ J^{(m)}, H_F\right\} ^2\right\rangle&= \sum _{{\mathtt d}(i,j) \le m}\left[ \left\langle H_i^{(m)} H_j^{(m)} \right\rangle - \left\langle H_i^{(m)} \right\rangle \left\langle H_j^{(m)} \right\rangle \right] \\&\quad + \sum _{{\mathtt d}(i,j)>m}\left[ \left\langle H_i^{(m)} H_j^{(m)} \right\rangle - \left\langle H_i^{(m)} \right\rangle \left\langle H_j^{(m)} \right\rangle \right] \, . \end{aligned}$$

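The first sum contains at most \((2m+1)N\) terms and the second at most \(N^2\) terms. Applying estimate (5.9) to the first sum and (5.10) to the second, and absorbing the factor \(2m+1\) into the constant, we get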

$$\begin{aligned} \left\langle \left\{ J^{(m)}, H_F\right\} ^2\right\rangle&\le N C\left( \frac{({\mathtt b}- 1)^2}{\beta ^4} + \frac{C'}{\beta ^5} \right) + N^2 \frac{{\widetilde{C}}}{N} \left( \frac{({\mathtt b}-1)^2}{\beta ^4} + \frac{C'}{\beta ^5} \right) \nonumber \\&\le N C_1 \left( \frac{({\mathtt b}- 1)^2}{\beta ^4} + \frac{C_2}{\beta ^5} \right) \, \end{aligned}$$
(5.11)

for some positive constants \(C_1\) and \(C_2\). \(\square \)

5.1.1 Proof of Lemma 5.2

We start by writing the Poisson bracket \(\{J^{(m)}, H_F- H_T\}\) in an explicit form. First we observe that for any \(1 \le m < N\) one has from (2.12)

$$\begin{aligned} \frac{\partial J^{(m)}}{\partial p_{j-1}} = \frac{1}{m} \frac{\partial \text {Tr}\left( L^m\right) }{\partial p_{j-1}} = \text {Tr}\left( L^{m-1}\frac{\partial L}{\partial p_{j-1}}\right) = - [L^{m-1}]_{j,j} = -h_{j}^{(m-1)} , \end{aligned}$$
(5.12)

for all \(j=1, \ldots , N\). In the above relation \( h_{j}^{(m-1)}\) is the generating function of the \((m-1)\)-th Toda integral defined in (3.2).

Next we observe that

$$\begin{aligned}&H_F(\mathbf{p},\mathbf{q}) - H_T(\mathbf{p},\mathbf{q}) = \sum _{j=0}^{N-1} R(q_{j+1}-q_j),\nonumber \\&\qquad R(x) := \frac{x^2}{2}- \frac{x^3}{6} + {\mathtt b}\frac{x^4}{24} - (e^{-x} -1 + x) . \end{aligned}$$
(5.13)

This implies also that

$$\begin{aligned} \begin{aligned} \left\{ J^{(m)}, H_F - H_T\right\}&= \sum _{j=1}^{N} h_{j}^{(m-1)}(\mathbf {p},\mathbf {q}) \, \left( R'(r_{j-2}) - R'(r_{j-1}) \right) \\ {}&= \sum _{j=1}^{N}( h_{j}^{(m-1)}(\mathbf {p},\mathbf {q})-h_{j}^{(m-1)}(\mathbf {0},\mathbf {0})) \, \left( R'(r_{j-2}) - R'(r_{j-1}) \right) \end{aligned} \end{aligned}$$
(5.14)

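where, to obtain the second identity, we use that \(h_{j}^{(m-1)}(\mathbf {0},\mathbf {0})\) is, by (3.15) and (3.21), a constant independent of j, so that the subtracted term \(h_{j}^{(m-1)}(\mathbf {0},\mathbf {0}) \sum _{j=1}^{N} \left( R'(r_{j-2}) - R'(r_{j-1}) \right) \) is a telescoping sum which vanishes by periodicity. Define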

$$\begin{aligned} H_j^{(m)}:= \left( h_{j}^{(m-1)}(\mathbf{p},\mathbf{r}) - h_{j}^{(m-1)}(\mathbf {0},\mathbf {0})\right) \left( R'(r_{j-2}) - R'(r_{j-1}) \right) , \quad j=1,\ldots ,N ;\nonumber \\ \end{aligned}$$
(5.15)

then item (i) of Lemma 5.2 follows because clearly \(H_j^{(m)}=S_{j-1}H_1^{(m)}\). Furthermore, since \(h_j^{(m-1)}\) has diameter bounded by \(m-1\), the same property applies to \(H_j^{(m)}\).

To prove item (ii) we start by expanding \(R'(r_{j-2}) - R'(r_{j-1})\) in Taylor series with integral remainder. Since

$$\begin{aligned} R'(x) = \frac{({\mathtt b}- 1)}{6} x^3 + \frac{x^4}{6}\int _{0}^1 e^{-\xi x}(1-\xi )^3\mathrm{d}\xi \, , \end{aligned}$$

we get that

$$\begin{aligned}&R'(r_{j-2}) - R'(r_{j-1}) = \frac{({\mathtt b}- 1)}{6}S_{j-1}\psi _3(\mathbf{r}) + \frac{1}{6}S_{j-1}\psi _4(\mathbf{r})\, , \end{aligned}$$
(5.16)

where explicitly

$$\begin{aligned}&\psi _3(\mathbf{r}) := r_{N-1}^3 - r_{0}^3 \, , \end{aligned}$$
(5.17)
$$\begin{aligned}&\psi _4(\mathbf{r}) := r_{N-1}^4\int _{0}^1 e^{-\xi r_{N-1}}(1-\xi )^3\mathrm{d}\xi - r_{0}^4\int _{0}^1 e^{-\xi r_{0}}(1-\xi )^3\mathrm{d}\xi \,. \end{aligned}$$
(5.18)
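The expansion of \(R'\) used above can be verified numerically. The sketch below compares the derivative of R from (5.13), computed directly, with the form consisting of the cubic term plus the integral remainder; the quadrature rule (Gauss-Legendre with 30 nodes), the value \({\mathtt b}= 2\) and the sample points are arbitrary choices made for the example.

```python
import numpy as np

def R_prime_direct(x, b):
    """Derivative of R(x) = x^2/2 - x^3/6 + b x^4/24 - (exp(-x) - 1 + x), see (5.13)."""
    return x - x**2 / 2 + b * x**3 / 6 + np.exp(-x) - 1.0

def R_prime_remainder(x, b, nodes=30):
    """Cubic term plus integral remainder, as in the expansion of R'."""
    # Gauss-Legendre nodes and weights on [-1, 1], mapped to [0, 1].
    t, w = np.polynomial.legendre.leggauss(nodes)
    xi, w = 0.5 * (t + 1.0), 0.5 * w
    integral = np.sum(w * np.exp(-xi * x) * (1.0 - xi) ** 3)
    return (b - 1.0) / 6.0 * x**3 + x**4 / 6.0 * integral

b = 2.0
xs = [-1.5, 0.3, 2.0]
direct = np.array([R_prime_direct(x, b) for x in xs])
remainder_form = np.array([R_prime_remainder(x, b) for x in xs])
print(np.allclose(direct, remainder_form, rtol=1e-10, atol=1e-12))
```

In particular the linear and quadratic terms of \(R'\) cancel exactly, which is why the leading contribution is cubic with coefficient \(({\mathtt b}-1)/6\).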

Combining (3.21) with (5.16) we rewrite \(H_j^{(m)}\) in (5.15) in the form

$$\begin{aligned} H^{(m)}_j = \frac{S_{j-1}}{6}\left( ({\varphi }^{(m)}_{1}+{\varphi }^{(m)}_{2}+{\varphi }^{(m)}_{\ge 3})\Big ( ({\mathtt b}- 1) \psi _3 + \psi _4 \Big ) \right) \,, \end{aligned}$$

where \({\varphi }^{(m)}_{j}\), \( j=0,1,2\), are defined in (3.21). Thus the squared \(L^2\) norm of \(H_j\) is given by (we suppress the superscript to simplify the notation)

$$\begin{aligned} \Vert H_j \Vert _\theta ^2&= \frac{1}{36}({\mathtt b}-1 )^2\Big (\sum _{\ell , \ell ' = 1}^{2} \, \left\langle \psi _3^2 \, {\varphi }_{\ell } \, {\varphi }_{\ell '}\right\rangle _{\theta } +\, \left\langle \psi _3^2 {\varphi }_{\ge 3}\left( {\varphi }_{\ge 3} + 2{\varphi }_{1} + 2{\varphi }_{2}\right) \right\rangle _\theta \Big ) \end{aligned}$$
(5.19)
$$\begin{aligned}&+\frac{{\mathtt b}- 1 }{18} \Big (\sum _{\ell , \ell ' = 1}^{2}\, \left\langle \psi _3 \psi _4 \, {\varphi }_{\ell } \, {\varphi }_{\ell '} \right\rangle _\theta + \left\langle \psi _3 \psi _4 {\varphi }_{\ge 3}\left( {\varphi }_{\ge 3} + 2{\varphi }_{1} + 2{\varphi }_{2}\right) \right\rangle _\theta \Big ) \end{aligned}$$
(5.20)
$$\begin{aligned}&+ \frac{1}{36}\sum _{\ell , \ell ' = 1}^{2}\left\langle \psi _4^2 \, {\varphi }_{\ell } \, {\varphi }_{\ell '} \right\rangle _\theta +\frac{1}{36}\left\langle \psi _4^2 \, {\varphi }_{\ge 3}\left( {\varphi }_{\ge 3} + 2{\varphi }_{1} + 2{\varphi }_{2}\right) \right\rangle _\theta \, . \end{aligned}$$
(5.21)

Consider now the terms in (5.19); by (3.24), (3.22) and (5.17), we know that each element is a linear combination of functions of the form

$$\begin{aligned} \mathbf{p}^\mathbf{k}\, \mathbf{r}^\mathbf{l}\, \left( \int _{0}^1 e^{- \xi \mathbf{n}^\intercal \mathbf{r}}(1-\xi )^2\mathrm{d}\xi \right) ^d \left( \int _{0}^1 e^{- \xi \mathbf{s}^\intercal \mathbf{r}}(1-\xi )^3\mathrm{d}\xi \right) ^{d'}\, , \end{aligned}$$
(5.22)

with \(|\mathbf{k}| + |\mathbf{l}| \ge 6 + \ell + \ell ' \ge 8\), \(d,d'\in \{0,1,2\}\). The number of these functions and their coefficients are independent of N (see Lemma 3.7). By Lemma 4.3 it follows that there exists a constant \(C>0\), depending only on m, such that

$$\begin{aligned} \left| \text{ r.h.s. } \text{ of } \text{(5.19) } \right| \le C \, ({\mathtt b}- 1)^2 \, \beta ^{-4} . \end{aligned}$$
(5.23)
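The exponents in such estimates follow from simple power counting: a monomial of total degree k in variables of size \(\beta ^{-1/2}\) has an expectation of order \(\beta ^{-k/2}\). The sketch below illustrates this only in a toy setting, assuming a single centered Gaussian variable with variance \(\beta ^{-1}\); the function name `gaussian_moment` is hypothetical, and the actual Gibbs measure (which is not Gaussian in the \(\mathbf{r}\) variables) is what Lemma 4.3 handles.

```python
from fractions import Fraction

def gaussian_moment(k: int, beta: Fraction) -> Fraction:
    """E[x^k] for a centered Gaussian x with variance 1/beta (toy assumption)."""
    if k % 2 == 1:
        return Fraction(0)           # odd moments vanish by symmetry
    double_factorial = 1
    for odd in range(1, k, 2):       # (k-1)!! = 1 * 3 * ... * (k-1)
        double_factorial *= odd
    return Fraction(double_factorial) / beta ** (k // 2)

beta = Fraction(10)
m8 = gaussian_moment(8, beta)        # degree 8: scales like beta**-4
m10 = gaussian_moment(10, beta)      # degree 10: scales like beta**-5
# Doubling beta divides a degree-8 moment by 2**4.
ratio = m8 / gaussian_moment(8, 2 * beta)
```

In this toy model the degree-8 moment decays like \(\beta ^{-4}\) and the degree-10 moment like \(\beta ^{-5}\), which are the same powers that appear for terms of degree at least 8 and at least 10.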

Analogously, line (5.20) is a linear combination of functions of the form (5.22) with \(|\mathbf{k}| + |\mathbf{l}| \ge 9\), \(d,d'\in \{0,1,2\}\). Applying Lemma 4.3 we get the estimate

$$\begin{aligned} \left| (5.20) \right| \le C' \, |{\mathtt b}-1| \, \beta ^{-9/2} \, \end{aligned}$$
(5.24)

for some constant \(C'>0\). In a similar way the expression (5.21) is a linear combination of functions of the form (5.22) with \(|\mathbf{k}| + |\mathbf{l}| \ge 10\), \(d,d'\in \{0,1,2\}\). Applying Lemma 4.3 we get the estimate

$$\begin{aligned} \left| (5.21) \right| \le C'' \, \beta ^{-5} \, , \end{aligned}$$
(5.25)

for some constant \(C''>0\). Combining (5.23), (5.24) and (5.25) we obtain estimate (5.7) for \(\Vert H_j \Vert _\theta \). The estimate for \(\Vert H_i^{(m)} H_j^{(m)} \Vert _\theta \) can be proved in an analogous way. \(\square \)

5.2 Lower bounds on the variance of m-admissible functions

From now on we consider \({\mathcal M}\) endowed with either the FPUT or the Toda Gibbs measure; the following result holds in both cases.

Proposition 5.3

Fix \(m \in {\mathbb N}\), let G be an m-admissible function of the first or second kind (see Definition 3.8). There exist \(N_0, \beta _0, C >0\) such that for any \(N > N_0\), \(\beta > \beta _0\), one has

$$\begin{aligned}&\sigma ^2_{G} =\left\langle G^2\right\rangle -\left\langle G\right\rangle ^2 \ge C\frac{N}{\beta ^2} . \end{aligned}$$
(5.26)

Proof

We first prove (5.26) when \(G = G_1= \mathbf{p}^\intercal {\mathcal G}_1 \mathbf{r}\), where \({\mathcal G}_1\) is a circulant, symmetric matrix represented by the m-admissible vector \(\mathbf{g}\in {\mathbb R}^N\). We make the change of coordinates \((\mathbf{p}, \mathbf{r}) = ({\mathcal H}{\widehat{\mathbf{p}}}, {\mathcal H}{\widehat{\mathbf{r}}})\), which diagonalizes the matrix \({\mathcal G}_1\) (see (3.17)), getting

$$\begin{aligned} G_1({\widehat{\mathbf{p}}},{\widehat{\mathbf{r}}}) = \sqrt{N}\sum _{j=0}^{N-1}{\widehat{g}}_j {\widehat{p}}_j{\widehat{r}}_j. \end{aligned}$$

So we just have to compute

$$\begin{aligned} \sigma _{G_1}^2&= N \left\langle \sum _{i,j = 0}^{N-1} {\widehat{g}}_j{\widehat{g}}_i {\widehat{p}}_j {\widehat{p}}_i {\widehat{r}}_i {\widehat{r}}_j \right\rangle - N\left( \left\langle \sum _{j=0}^{N-1}{\widehat{g}}_j {\widehat{p}}_j{\widehat{r}}_j\right\rangle \right) ^2 \\&= N \sum _{i,j = 0}^{N-1} {\widehat{g}}_j{\widehat{g}}_i \left\langle {\widehat{p}}_j {\widehat{p}}_i\right\rangle \left\langle {\widehat{r}}_i {\widehat{r}}_j \right\rangle - N \left( \sum _{j=0}^{N-1}{\widehat{g}}_j \left\langle {\widehat{p}}_j \right\rangle \left\langle {\widehat{r}}_j\right\rangle \right) ^2 \, ,\nonumber \end{aligned}$$
(5.27)

where we used that \({\widehat{p}}_k, {\widehat{r}}_j\) are random variables independent of each other.

We notice that \({\widehat{p}}_1, {\widehat{p}}_2,\ldots , {\widehat{p}}_{N-1}\) are i.i.d. Gaussian random variables with variance \(\beta ^{-1}\), while \({\widehat{p}}_0 = 0\) (see (2.1)), so that \(\left\langle {\widehat{p}}_j \right\rangle = 0\) and \(\left\langle {\widehat{p}}_j{\widehat{p}}_i\right\rangle = \frac{\delta _{i,j}}{\beta }\) for \(i,j = 1,\ldots , N-1\) (this holds for both the FPUT and the Toda potentials, since the \(\mathbf{p} \)-variables have the same distribution in the two cases).

As a consequence, (5.27) becomes:

$$\begin{aligned} \sigma ^2_{G_1} = \frac{N}{\beta } \sum _{j = 1}^{N-1} {\widehat{g}}_j^2 \left\langle {\widehat{r}}_j^2\right\rangle = \frac{1}{\beta } \left\langle {\widehat{\mathbf{r}}}^\intercal {\mathcal H}{\mathcal G}_1^2 {\mathcal H}{\widehat{\mathbf{r}}}\right\rangle =\frac{1}{\beta } \left\langle \mathbf{r}^\intercal {\mathcal G}_1^2 \mathbf{r}\right\rangle . \end{aligned}$$
(5.28)

Since \({\mathcal G}_1\) is a circulant symmetric matrix, so is \({\mathcal G}_1^2\), and its representing vector is \(\mathbf{d}:= \mathbf{g}\star \mathbf{g}\).
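The statement that the square of a circulant symmetric matrix is again circulant, with representing vector given by a cyclic convolution, can be checked numerically. The following is a minimal sketch under assumed conventions that may differ from the paper's by normalisation or indexing: the matrix entries are taken as \(G_{ij} = g_{(j-i) \bmod N}\) and the convolution as \((a\star b)_l = \sum _k a_k b_{(l-k) \bmod N}\). The helper names and the sample vector are hypothetical.

```python
def circulant(g):
    """Circulant matrix with G[i][j] = g[(j - i) mod N] (assumed convention)."""
    n = len(g)
    return [[g[(j - i) % n] for j in range(n)] for i in range(n)]

def cyclic_convolution(a, b):
    """(a * b)_l = sum_k a_k b_{(l - k) mod N} (assumed, unnormalised)."""
    n = len(a)
    return [sum(a[k] * b[(l - k) % n] for k in range(n)) for l in range(n)]

def matmul(a, b):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Symmetric sample vector (g[k] == g[N - k]) supported near index 0.
g = [2, -1, 0, 0, 0, 0, -1]
G = circulant(g)
d = cyclic_convolution(g, g)
G_squared = matmul(G, G)
```

With these conventions `G_squared` coincides with `circulant(d)`, the entry `d[0]` equals the sum of the squares of the entries of `g`, and the support of `d` is twice as wide as that of `g`.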

Next we remark that the identity \(\left\langle \left( \sum _{j=0}^{N-1} r_j\right) ^2\right\rangle = 0\) implies

$$\begin{aligned} \left\langle r_jr_i\right\rangle = - \frac{1}{N-1}\left\langle r_0^2 \right\rangle , \quad \forall i\ne j \, . \end{aligned}$$

Applying this property to (5.28) we get

$$\begin{aligned} \nonumber \sigma ^2_{G_1}&= \frac{1}{\beta } \sum _{j,l=0}^{N-1} d_l \, \left\langle r_j r_{j+l }\right\rangle = \frac{N}{\beta } \left\langle r_0^2\right\rangle d_0 + \frac{1}{\beta }\sum _{j, l \atop l \ne 0}^{N-1} d_l \left\langle r_j r_{j+l} \right\rangle \\&= \frac{1}{\beta } \left\langle r_0^2\right\rangle \left( Nd_0 - \frac{N}{N-1}\sum _{l \ne 0}^{N-1}d_l \right) \,. \end{aligned}$$
(5.29)
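The two steps above, namely the off-diagonal correlation forced by the zero-sum constraint and the resulting closed form for the quadratic expectation, can be verified with exact arithmetic. This is a sketch under the assumption that the \(r_j\) are exchangeable, so that their covariance matrix has a constant diagonal v and a constant off-diagonal entry; the helper names, the value of v and the vector `d` are hypothetical.

```python
from fractions import Fraction

def zero_sum_covariance(n: int, v: Fraction):
    """Covariance with <r_j^2> = v and <r_i r_j> = -v/(n-1) for i != j."""
    off = -v / (n - 1)
    return [[v if i == j else off for j in range(n)] for i in range(n)]

def quadratic_form_expectation(d, cov):
    """sum_{j,l} d_l <r_j r_{j+l}>, with indices taken mod N."""
    n = len(d)
    return sum(d[l] * cov[j][(j + l) % n] for j in range(n) for l in range(n))

n, v = 7, Fraction(3, 5)
d = [6, -4, 1, 0, 0, 1, -4]      # hypothetical representing vector of G_1^2
cov = zero_sum_covariance(n, v)

# <(sum_j r_j)^2> is the sum of all covariance entries; it must vanish.
total_variance = sum(sum(row) for row in cov)
lhs = quadratic_form_expectation(d, cov)
rhs = v * (n * d[0] - Fraction(n, n - 1) * sum(d[1:]))
```

Here `total_variance` vanishes, which is the constraint \(\left\langle \left( \sum _j r_j\right) ^2\right\rangle = 0\), and `lhs` agrees with `rhs`, which is the bracket in the last line of the display above multiplied by \(\left\langle r_0^2\right\rangle \).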

By Lemmas 4.1 and 4.3 we have that, for N sufficiently large, \(\left\langle r_0^2 \right\rangle \ge c \beta ^{-1}\). Finally, since the vectors \({\mathbf{g}},{\mathbf{d}}\) are m-admissible and 2m-admissible respectively we have that

$$\begin{aligned} d_0 = (\mathbf{g}\star \mathbf{g})_0 = \sum _{j=0}^{{\widetilde{m}}} g_j^2 \ge c_m , \qquad \sum _{l \ne 0}^{N-1}d_l = \sum _{l \ne 0}^{2 {\widetilde{m}}}d_l \le C_m, \end{aligned}$$
(5.30)

for some constants \(c_m>0\) and \(C_m>0\). Plugging (5.30) into (5.29) we obtain (5.26) for the case of m-admissible functions of the first kind.

For the case of admissible functions of the second kind, one has \(G_2 = {\mathbf{p}}^\intercal {\mathcal G}_2 {\mathbf{p}} + {\mathbf{r}}^\intercal {\mathcal G}_2 {\mathbf{r}}\) with \({\mathcal G}_2\) circulant, symmetric and represented by an m-admissible vector. Since \(\mathbf{p}\) and \( \mathbf{r}\) are independent random variables one gets

$$\begin{aligned} \sigma ^2_{G_2} = \sigma ^2_{\mathbf{p}^\intercal {\mathcal G}_2 \mathbf{p}+ \mathbf{r}^\intercal {\mathcal G}_2 \mathbf{r}} = \sigma ^2_{\mathbf{p}^\intercal {\mathcal G}_2 \mathbf{p}} +\sigma ^2_{ \mathbf{r}^\intercal {\mathcal G}_2 \mathbf{r}}\ge \sigma ^2_{\mathbf{p}^\intercal {\mathcal G}_2 \mathbf{p}} . \end{aligned}$$

Then arguing as in the previous case one gets (5.26).\(\square \)

By applying Proposition 5.3 to the quantity \(J^{(m)}_2\) that is an m-admissible function of the first or second kind depending on the parity of m, we obtain the following result.

Corollary 5.4

The quadratic part \(J_2^{(m)}\) of the Taylor expansion of the Toda integral \(J^{(m)}\) near \((\mathbf{p}, \mathbf{r}) = (0, 0) \) satisfies

$$\begin{aligned} \sigma ^2_{J^{(m)}_2}\ge C \frac{N}{\beta ^2}\,, \end{aligned}$$
(5.31)

for some constant \(C>0\).

In a similar way we obtain an upper bound on the variance of the remainder \(J^{(m)}_{\ge 3}\) of the Taylor expansion of the Toda integral \(J^{(m)}\) near \(\mathbf{p}=0\) and \(\mathbf{r}=0\).

Lemma 5.5

Fix \(m \in {\mathbb N}\). There exist \(N_0, \beta _0, C >0\) such that for any \(N > N_0\), \(\beta > \beta _0\), one has

$$\begin{aligned} \sigma ^2_{J^{(m)}_{\ge 3}} \le C \frac{ N}{\beta ^3} . \end{aligned}$$
(5.32)

Proof

Recall from Lemma 3.7 that \(J^{(m)}_{\ge 3}\) is a cyclic function generated by \({\widetilde{h}}_1^{(m)}:=\frac{1}{m} {\varphi }^{(m)}_{\ge 3}\). Thus, denoting \({\widetilde{h}}^{(m)}_j := S_{j-1} {\widetilde{h}}^{(m)}_1\), we have \( J^{(m)}_{\ge 3} = \sum _{j = 1}^N {\widetilde{h}}^{(m)}_j \), and its variance is given by

$$\begin{aligned} \sigma ^2_{J^{(m)}_{\ge 3}} = \sum _{i,j=1}^N \left( \left\langle {\widetilde{h}}^{(m)}_i{\widetilde{h}}^{(m)}_j\right\rangle - \left\langle {\widetilde{h}}^{(m)}_i \right\rangle \left\langle {\widetilde{h}}^{(m)}_j \right\rangle \right) . \end{aligned}$$
(5.33)

We can bound the correlations in (5.33) by exploiting Lemma 4.2, provided we first estimate the \(L^2(\mathrm{d}\mu _{F,\theta })\) and \(L^2(\mathrm{d}\mu _{T,\theta })\) norms of \({\widetilde{h}}^{(m)}_i \) and \({\widetilde{h}}^{(m)}_i {\widetilde{h}}^{(m)}_j\). Proceeding with the same arguments as in Lemma 5.2, one proves that there exists \(\tilde{C}>0\) such that for any \(N > N_0\), \(\beta > \beta _0\),

$$\begin{aligned} \Vert {\widetilde{h}}^{(m)}_i \Vert _\theta \le \tilde{C} \beta ^{-3/2}, \qquad \Vert {\widetilde{h}}^{(m)}_i \, {\widetilde{h}}^{(m)}_j \Vert _\theta \le \tilde{C} \beta ^{-3}. \end{aligned}$$
(5.34)

By Lemma 3.7, the function \({\widetilde{h}}_1^{(m)}\) has diameter at most m, so in particular if \({\mathtt d}(i,j) > m\), the functions \({\widetilde{h}}_i^{(m)}\) and \({\widetilde{h}}_j^{(m)}\) have disjoint supports (recall (3.14)).

We are now in position to apply Lemma 4.2 and obtain

$$\begin{aligned}&\left| \left\langle {\widetilde{h}}^{(m)}_i {\widetilde{h}}^{(m)}_j\right\rangle - \left\langle {\widetilde{h}}^{(m)}_i \right\rangle \left\langle {\widetilde{h}}^{(m)}_j \right\rangle \right| \le \frac{C'}{\beta ^3} , \qquad \forall i,j \end{aligned}$$
(5.35)
$$\begin{aligned}&\left| \left\langle {\widetilde{h}}^{(m)}_i{\widetilde{h}}^{(m)}_j\right\rangle - \left\langle {\widetilde{h}}^{(m)}_i \right\rangle \left\langle {\widetilde{h}}^{(m)}_j \right\rangle \right| \le \frac{C'}{N\beta ^3} , \qquad \forall i, j :{\mathtt d}(i,j) > m, \end{aligned}$$
(5.36)

for some constant \(C'>0\). Thus we split the variance in (5.33) in two parts

$$\begin{aligned} \sigma ^2_{J^{(m)}_{\ge 3}} = \sum _{{\mathtt d}(i,j)\le m} \left( \left\langle {\widetilde{h}}^{(m)}_i{\widetilde{h}}^{(m)}_j\right\rangle - \left\langle {\widetilde{h}}^{(m)}_i \right\rangle \left\langle {\widetilde{h}}^{(m)}_j \right\rangle \right) + \sum _{{\mathtt d}(i,j) > m} \left( \left\langle {\widetilde{h}}^{(m)}_i{\widetilde{h}}^{(m)}_j\right\rangle - \left\langle {\widetilde{h}}^{(m)}_i \right\rangle \left\langle {\widetilde{h}}^{(m)}_j \right\rangle \right) \end{aligned}$$

and apply estimates (5.35), (5.36) to get (5.32).\(\square \)
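The counting behind this last step can be made explicit. As a sketch, assume that \({\mathtt d}\) is the periodic distance between indices, so that for \(N > 2m\) each index i has exactly \(2m+1\) indices j with \({\mathtt d}(i,j) \le m\). Then the first sum contains \(N(2m+1)\) terms, each controlled by (5.35), while the second contains fewer than \(N^2\) terms, each controlled by (5.36), so that

$$\begin{aligned} \sigma ^2_{J^{(m)}_{\ge 3}} \le N(2m+1)\, \frac{C'}{\beta ^3} + N^2 \, \frac{C'}{N\beta ^3} = (2m+2)\, C' \, \frac{N}{\beta ^3} \, , \end{aligned}$$

which is (5.32) with \(C = (2m+2)\, C'\). The gain of a factor \(N^{-1}\) in (5.36) is exactly what compensates the \(N^2\) far-apart pairs.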

Combining Corollary 5.4 and Lemma 5.5 we arrive at the following crucial proposition.

Proposition 5.6

Fix \(m\in {\mathbb N}\). There exist \(N_0, \beta _0, C >0\) such that for any \(N > N_0\), \(\beta > \beta _0\), one has

$$\begin{aligned} \sigma ^2_{J^{(m)}}\ge C \frac{N}{\beta ^2} . \end{aligned}$$
(5.37)

Proof

By Lemma 3.7, we write \(J^{(m)} = J^{(m)}_0 + J^{(m)}_2+ J^{(m)}_{\ge 3}\) with \(J^{(m)}_0\) constant. By Corollary 5.4 and Lemma 5.5 we deduce that for N and \(\beta \) large enough,

$$\begin{aligned} \sigma _{J^{(m)}} = \sigma _{J^{(m)}_2 + J^{(m)}_{\ge 3}} \ge \sigma _{J^{(m)}_2 } - \sigma _{ J^{(m)}_{\ge 3}} \ge \frac{\sqrt{N}}{\beta }\left( \sqrt{C'} - \sqrt{\frac{C''}{\beta }}\right) \, , \end{aligned}$$

which leads immediately to the claimed estimate (5.37).\(\square \)
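In more detail, and assuming that \(C'\) and \(C''\) denote the constants appearing in (5.31) and (5.32) respectively: the first inequality is the reverse triangle inequality for the standard deviation, which is the \(L^2\) norm of the centered random variable. If moreover \(\beta \ge 4C''/C'\), then \(\sqrt{C''/\beta } \le \sqrt{C'}/2\), and therefore

$$\begin{aligned} \sigma ^2_{J^{(m)}} \ge \frac{N}{\beta ^2}\left( \sqrt{C'} - \frac{\sqrt{C'}}{2}\right) ^2 = \frac{C'}{4}\, \frac{N}{\beta ^2} \, , \end{aligned}$$

so that (5.37) holds with \(C = C'/4\), after possibly enlarging \(\beta _0\).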

6 Proof of the Main Results

In this section we give the proofs of the main theorems of our paper.

6.1 Proof of Theorem 2.1

The proof is a straightforward application of Propositions 5.1 and 5.6. Having fixed \(m \in {\mathbb N}\), we apply (2.29) with \(\Phi = J^{(m)}\) and \(\lambda = \delta _1\) to get

$$\begin{aligned} \mathbf{P}\Big (\left| J^{(m)}(t) - J^{(m)}(0) \right| \ge \delta _1 { \sigma _{J^{(m)}(0)}} \Big )&\le C_0 \left( \frac{|{\mathtt b}- 1|^2}{\beta ^2} + \frac{C_1}{\beta ^{3}}\right) \frac{ t^2 }{\delta _1^2} \end{aligned}$$
(6.1)

from which one deduces the statement of Theorem 2.1.

6.2 Proof of Theorem 2.5 and Theorem 2.6

The proofs of Theorems 2.5 and 2.6 are quite similar and we develop them at the same time. As in the proof of Theorem 2.1, the first step is to use Chebyshev's inequality to bound

$$\begin{aligned} \begin{aligned} \mathbf{P}\left( \left| \Phi (t) - \Phi \right| > \lambda \sigma _{\Phi } \right) \le \frac{1}{\lambda ^2} \frac{\sigma ^2_{\Phi (t) - \Phi }}{ \sigma ^2_{\Phi }} \,, \end{aligned} \end{aligned}$$
(6.2)

where the time evolution is intended with respect to the FPUT flow \(\phi ^t_{F}\) or the Toda flow \(\phi ^t_{T}\). Accordingly, the probability is calculated with respect to the FPUT Gibbs measure (2.5) or the Toda Gibbs measure (2.25).
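Inequality (6.2) is Chebyshev's inequality applied to the random variable \(\Phi (t) - \Phi \), with threshold \(\lambda \sigma _{\Phi }\). Chebyshev's inequality requires the variable to be centered, which we expect to follow here from the invariance of the Gibbs measure under the corresponding flow. The sketch below checks the inequality exactly on a toy finite distribution; the function name and the sample values are hypothetical and unrelated to the chain.

```python
from fractions import Fraction

def tail_and_bound(values, probs, threshold):
    """Return (P(|X - E X| > threshold), Var(X) / threshold**2)."""
    mean = sum(p * x for x, p in zip(values, probs))
    var = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))
    tail = sum(p for x, p in zip(values, probs) if abs(x - mean) > threshold)
    return tail, var / threshold ** 2

# Toy centered variable: X = -2, 0, 2 with probabilities 1/8, 3/4, 1/8.
values = [Fraction(-2), Fraction(0), Fraction(2)]
probs = [Fraction(1, 8), Fraction(3, 4), Fraction(1, 8)]
tail, bound = tail_and_bound(values, probs, Fraction(3, 2))
```

In this example the variance is 1, the tail probability is 1/4 and the Chebyshev bound is 4/9, so the inequality holds with room to spare, as is typical.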

Next we observe that the quantity \(\Phi := \sum _{j=1}^{N-1} {\widehat{g}}_j E_j \) defined in (2.22) can be written in the form

$$\begin{aligned} \Phi (\mathbf{p},\mathbf{r}) = \sum _{j=1}^{N-1} {\widehat{g}}_j E_j = \frac{1}{2 \sqrt{N}}\sum _{j,l=0}^{N-1} g_l \left( p_j p_{j+l} + r_j r_{j+l} \right) =\frac{1}{2 \sqrt{N}} G_2(\mathbf{p},\mathbf{r}), \end{aligned}$$
(6.3)

where \(\mathbf{g}\in {\mathbb R}^N\) is an m-admissible vector and \(G_2(\mathbf{p},\mathbf{r})\) is an m-admissible function of the second kind, as in Definition 3.8. As the inequality (2.29) is scaling invariant, proving (6.2) is equivalent to obtaining

$$\begin{aligned} \begin{aligned} \mathbf{P}\left( \left| G_2(t) - G_2 \right| > \lambda \sigma _{G_2} \right) \le \frac{1}{\lambda ^2} \frac{\sigma ^2_{G_2(t) - G_2}}{ \sigma ^2_{G_2}} \end{aligned} . \end{aligned}$$
(6.4)

Applying Proposition 5.3 we can estimate \(\sigma ^2_{G_2}\). We are then left to give an upper bound on \(\sigma ^2_{G_2(t) - G_2}\). By Lemma 3.10, there exists a unique sequence \(\{c_j\}_{j=0}^{{\widetilde{m}} -1}\), with \(\max _j |c_j| \) independent of N, such that \(G_2(\mathbf{p},\mathbf{r}) = \sum _{l=0}^{{\widetilde{m}}-1} c_lJ_{2}^{(2l+2)}\), where \( J_{2}^{(2l+2)}\) are defined in (3.24). Hence we bound

$$\begin{aligned} \sigma _{G_2(t) - G_2(0)} \le \sum _{l=0}^{{\widetilde{m}}-1} |c_l| \, \sigma _{J_{2}^{(2l+2)}(t) - J_{2}^{(2l+2)}(0)} . \end{aligned}$$

Next we interpolate \(J_{2}^{(2l)}\) with the integrals \(J^{(2l)}\) and exploit the fact that they are adiabatic invariants for the FPUT flow and integrals of motion for the Toda flow. More precisely

$$\begin{aligned} \sigma _{J_{2}^{(2l)}(t) - J_{2}^{(2l)}(0)}&\le \sigma _{J_{2}^{(2l)}(t) - J^{(2l)}(t)} +\sigma _{J^{(2l)}(0) - J_{2}^{(2l)}(0)} \end{aligned}$$
(6.5)
$$\begin{aligned}&\quad + \sigma _{J^{(2l)}(t) - J^{(2l)}(0)}. \end{aligned}$$
(6.6)

By the invariance of the two measures with respect to their corresponding flow and Lemma 5.5, we get both for FPUT and Toda the estimate

$$\begin{aligned} \sigma _{J_{2}^{(2l)}(t) - J^{(2l)}(t)} = \sigma _{J_{2}^{(2l)}(0) - J^{(2l)}(0)} = \sigma _{J_{\ge 3}^{(2l)}} \le \sqrt{\frac{\tilde{C}_1 N}{\beta ^{3}}}, \end{aligned}$$
(6.7)

for some constant \(\tilde{C}_1>0\) and for \(\beta >\beta _0\) and \(N>N_0\). As (6.6) is zero for the Toda flow (being \(J^{(2l)}(t)\) constant along the flow), we get

$$\begin{aligned} \sigma _{G_2\circ \phi _T^t - G_2}^2 \le \frac{C_1 N}{\beta ^3}, \end{aligned}$$
(6.8)

for some constant \(C_1>0\) and for \(\beta >\beta _0\) and \(N>N_0\). Combining Proposition 5.3 with (6.8) we conclude that

$$\begin{aligned} \begin{aligned} \mathbf{P}\left( \left| G_2\circ \phi ^t_T - G_2 \right|> \delta _1 \sigma _{G_2} \right) \le \frac{C_1}{\delta _1^2\beta },\quad \forall \delta _1>0, \end{aligned} \end{aligned}$$
(6.9)

namely we have concluded the proof of Theorem 2.6.

We are left to estimate (6.6) for FPUT, but this is exactly the quantity bounded in Proposition 5.1. We conclude that

$$\begin{aligned} \sigma ^2_{G_2\circ \phi _F^t - G_2} \le \frac{C_1 N}{\beta ^3} + C_3 N \left( \frac{|{\mathtt b}- 1|^2}{\beta ^4} + \frac{C_2}{\beta ^{5}}\right) t^2, \end{aligned}$$
(6.10)

for some constants \(C_j>0\), \(j=1,2,3\), and for \(\beta >\beta _0\) and \(N>N_0\).

Combining Proposition 5.3 with (6.10) we obtain

$$\begin{aligned} \mathbf{P}\left( \left| G_2\circ \phi ^t_F - G_2 \right| > \lambda \sigma _{G_2} \right) \le \frac{C_1 }{\lambda ^2\beta } + \frac{C_3}{\lambda ^2} \left( \frac{|{\mathtt b}- 1|^2}{\beta ^2} + \frac{C_2}{\beta ^{3}}\right) t^2. \end{aligned}$$
(6.11)

Choosing \(\lambda =\beta ^{-\varepsilon }\) with \(0<\varepsilon <\frac{1}{4}\), (6.11) yields

$$\begin{aligned} \mathbf{P}\left( \left| G_2\circ \phi ^t_F - G_2 \right| > \frac{\sigma _{G_2}}{\beta ^\varepsilon } \right) \le \frac{C_1}{\beta ^{2\varepsilon }}, \end{aligned}$$
(6.12)

for some redefined constant \(C_1>0\) and for every time t fulfilling (2.24).
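The exponent bookkeeping behind this step can be sketched as follows, assuming \(\beta > 1\); the condition on t written below is our reading of what the time restriction must guarantee, not a quotation of (2.24). With \(\lambda = \beta ^{-\varepsilon }\) the first term on the right-hand side of (6.11) becomes

$$\begin{aligned} \frac{C_1}{\lambda ^2 \beta } = \frac{C_1}{\beta ^{1-2\varepsilon }} \le \frac{C_1}{\beta ^{2\varepsilon }} \, , \end{aligned}$$

since \(1-2\varepsilon > 2\varepsilon \) for \(\varepsilon < \frac{1}{4}\). The second term equals \(C_3\, \beta ^{2\varepsilon } \left( \frac{|{\mathtt b}- 1|^2}{\beta ^2} + \frac{C_2}{\beta ^{3}}\right) t^2\), and it is bounded by \(\beta ^{-2\varepsilon }\) as soon as

$$\begin{aligned} t^2 \left( \frac{|{\mathtt b}- 1|^2}{\beta ^2} + \frac{C_2}{\beta ^{3}}\right) \le \frac{1}{C_3\, \beta ^{4\varepsilon }} \, . \end{aligned}$$

Adding the two contributions gives a bound of the form (6.12) with the constant \(C_1 + 1\).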

We have thus concluded the proof of Theorem 2.5.