1 Introduction

Stochastic partial differential equations arise in the modelling of applications in mathematical physics (e.g. Navier–Stokes equations [9, 18, 22, 37] or stochastic non-linear Schrödinger equations [4, 13]), biology (e.g. catalytic branching processes [12, 30]), and finance (e.g. forward prices [16, 24, 38]). While the construction of solutions to the underlying stochastic equations is an important mathematical issue, with applications in mind it is indispensable to also study their specific properties. Among them, the investigation of the long-time behavior of solutions, that is, the existence and uniqueness of invariant measures and the convergence of transition probabilities, is an important and at the same time challenging mathematical topic. In this work we investigate the long-time behavior of mild solutions to stochastic partial differential equations of the form

$$\begin{aligned} dX_t=(AX_t+F(X_t)) dt + \sigma (X_t)dW_t + \int _E\gamma (X_{t},\nu )\widetilde{N}(dt,d\nu ), \qquad t \ge 0 \end{aligned}$$
(1.1)

on a separable Hilbert space H, where \((A, D(A))\) is the generator of a strongly continuous semigroup \((S(t))_{t \ge 0}\) on H, \((W_t)_{t \ge 0}\) is a Q-Wiener process and \(\widetilde{N}(dt,d\nu )\) denotes a compensated Poisson random measure with compensator \(dt \mu (d\nu )\) on \(\mathbb {R}_+ \times E\) with E a Polish space. The precise conditions to be imposed on these objects will be formulated in the subsequent sections. We focus in particular on SPDEs with multiple limiting distributions.

In the literature, the study of the existence and uniqueness of invariant measures often relies on different variants of a dissipativity condition. The simplest form of such a dissipativity condition is the following: there exists \(\alpha > 0\) such that

$$\begin{aligned} \langle Ax - Ay,x-y\rangle _H + \langle F(x) - F(y), x-y \rangle _H \le - \alpha \Vert x-y \Vert _H^2, \qquad x,y \in D(A). \end{aligned}$$
(1.2)

Indeed, if (1.2) is satisfied, \(\sigma \) and \(\gamma \) are globally Lipschitz-continuous, and \(\alpha \) is large enough, then there exists a unique invariant measure for the Markov process obtained from (1.1), see, e.g., [32, Section 16], [10, Chapter 11, Section 5], and [36], where such a condition was formulated for the Yosida approximations of the operator \((A, D(A))\). Note that (1.2) is satisfied if F is globally Lipschitz continuous and \((A, D(A))\) satisfies, for some \(\beta > 0\) large enough, the inequality \(\langle Ax,x\rangle _H \le - \beta \Vert x\Vert _H^2\), \(x \in D(A)\), i.e. \((A, D(A))\) is the generator of a strongly continuous semigroup satisfying \(\Vert S(t) \Vert _{L(H)} \le \mathrm {e}^{- \beta t}\). Here and below we denote by L(H) the space of bounded linear operators from H to H and by \(\Vert \cdot \Vert _{L(H)}\) its operator norm. For weaker variants of the dissipativity condition (e.g. cases where (1.2) only holds for \(\Vert x\Vert _H, \Vert y\Vert _H \ge R\) for some \(R > 0\)), one can in general guarantee neither the existence nor the uniqueness of an invariant measure. Hence, to treat such cases, additional arguments, e.g. coupling methods, are required. Such arguments have been applied to different stochastic partial differential equations on Hilbert spaces in [33,34,35], where the existence and, in particular, the uniqueness of invariant measures were studied. We also mention [7, 23] for an extension of Harris-type theorems for Wasserstein distances, and [21, 25] for extensions of coupling methods.

In contrast to the aforementioned methods and applications, several stochastic models exhibit phase transition phenomena where uniqueness of invariant measures fails to hold. For instance, the generator \((A, D(A))\) and drift F appearing in the Heath–Jarrow–Morton–Musiela equation do not satisfy (1.2); instead, F is globally Lipschitz continuous and the semigroup generated by \((A, D(A))\) satisfies

$$\begin{aligned} \Vert S(t)x - Px\Vert _H \le \mathrm {e}^{- \alpha t}\Vert x - Px\Vert _H \end{aligned}$$

for some projection operator P. Based on this property it was shown in [36, 38] that the Heath–Jarrow–Morton–Musiela equation has infinitely many invariant measures parametrized by the initial state of the process, see also Sect. 5. Another example is related to stochastic Volterra equations as studied, e.g., in [6]. There, using a representation of stochastic Volterra equations via SPDEs combined with arguments originating from the study of the Heath–Jarrow–Morton–Musiela equation, the authors studied the existence of limiting distributions, allowing, in particular, these distributions to depend on the initial state of the process.

In this work we provide a general and unified approach to the study of multiple invariant measures and, moreover, we show that, depending on the initial distribution, the law of the mild solution of (1.1) is governed in the limit \(t \rightarrow \infty \) by one of these invariant measures. In particular, we show that the methods developed in [6, 36, 38] can be embedded as a special case of a general framework where one replaces (1.2) by a weaker dissipativity condition, which we hereinafter call the generalized dissipativity condition:

  1. (GDC)

    There exists a projection operator \(P_1\) on the Hilbert space H and there exist constants \(\alpha > 0, \beta \ge 0\) such that, for \(x,y \in D(A)\), one has:

    $$\begin{aligned}&\langle Ax - Ay, x - y \rangle _H + \langle F(x) - F(y), x-y \rangle _H \\&\quad \le - \alpha \Vert x - y \Vert _H^2 + \left( \alpha + \beta \right) \Vert P_1 x - P_1 y \Vert _H^2. \end{aligned}$$

Note that for the special case \(P_1 = 0\) condition (GDC) contains the classical dissipativity condition (1.2). However, when \(P_1 \ne 0\), the additional term \(\Vert P_1x - P_1y\Vert _H^2\) describes the influence of the non-dissipative part of the drift. Sufficient conditions and additional remarks on this condition are collected in Sect. 3, while particular examples are discussed in Sects. 5 and 6.

We will show that under condition (GDC) and additional restrictions on the projected coefficients \(P_1F\), \(P_1\sigma \), and \(P_1 \gamma \), the Markov process obtained from (1.1) has for each initial datum \(X_0 = x\) a limiting distribution \(\pi _x\) depending only on \(P_1x\). This will often imply that there are multiple limiting distributions for (1.1). Moreover, the transition probabilities converge exponentially fast in the Wasserstein 2-distance to this limiting distribution. In order to prove this result, we first decompose the Hilbert space H according to

$$\begin{aligned} H = H_0 \oplus H_1, \qquad x = P_0x + P_1x, \ \ P_0 := I - P_1, \end{aligned}$$

where I denotes the identity operator on H, and then investigate the components \(P_0X_t\) and \(P_1X_t\) separately. Based on a technique from [39], we construct, for each \(\tau \ge 0\), a coupling of \(X_t\) and \(X_{t+\tau }\). This coupling will then be used to efficiently estimate the Wasserstein 2-distance for the solution started at two different points.

This work is organized as follows. In Sect. 2 we first discuss the special case where \(F, \sigma , \gamma \) are independent of X. In such a case X is an Ornstein–Uhlenbeck type process and the collection of invariant measures can be easily characterized by its characteristic function. This section can be seen as a motivation for our more general results discussed in the subsequent sections. More precisely, we could also have studied the Ornstein–Uhlenbeck process with our general results from Sect. 4; however, in that case we would need to impose unnecessarily strong conditions on the Lévy measure and would not obtain the characterization of invariant measures in terms of their Fourier transforms. Afterwards, we investigate in Sects. 3–5 the general case, for which the methods from Sect. 2, that is, convergence of the Fourier transform, cannot be applied. More precisely, after having introduced and discussed in Sect. 3 the generalized dissipativity condition (GDC), we state the precise conditions imposed on the coefficients of the SPDE (1.1), discuss some properties of the solution, and then provide sufficient conditions for the generalized dissipativity condition (GDC). Based on condition (GDC) we also derive an estimate on the trajectories of the process when started at two different initial points, i.e. we estimate the \(L^2\)-norm of \(X_t^x - X_t^y\) when \(x \ne y\). Based on this estimate, we then state and prove our main results in Sect. 4. Examples are discussed in the subsequent Sects. 5 and 6. Namely, the Heath–Jarrow–Morton–Musiela equation is considered in Sect. 5, for which we first show that the main results of Sect. 4 contain those of [36, 38], and then extend these results by characterizing its multiple limiting distributions more explicitly. Finally, we apply our results in Sect. 6 to an SPDE with delay.

2 Ornstein–Uhlenbeck process in a Hilbert space

Let H be a separable Hilbert space and let \((Z_t)_{t \ge 0}\) be an H-valued Lévy process with Lévy triplet \((b,Q,\mu )\) defined on a stochastic basis \((\Omega , \mathcal {F}, (\mathcal {F}_t)_{t \ge 0}, \mathbb {P})\) satisfying the usual conditions. This process has characteristic exponent \(\Psi \) of Lévy–Khinchine form, i.e.

$$\begin{aligned} \mathbb {E}\left[ \mathrm {e}^{\mathrm {i} \langle u, Z_t \rangle _H} \right] = \mathrm {e}^{t \Psi (u)}, \qquad u \in H, \ \ t > 0, \end{aligned}$$

with \(\Psi \) given by

$$\begin{aligned} \Psi (u) = \mathrm {i} \langle b, u\rangle _H - \frac{1}{2}\langle Qu,u\rangle _H + \int _H \left( \mathrm {e}^{\mathrm {i}\langle u, z\rangle _H} - 1 - \mathrm {i}\langle u,z \rangle _H \mathbb {1}_{ \{ \Vert z \Vert _H \le 1\}} \right) \mu (dz), \end{aligned}$$

where \(b \in H\) denotes the drift, Q denotes the covariance operator, which is a positive, symmetric, trace-class operator on H, and \(\mu \) is a Lévy measure on H (see e.g. [3, 27, 28, 32]). Let \((S(t))_{t \ge 0}\) be a strongly continuous semigroup on H. The Ornstein–Uhlenbeck process driven by \((Z_t)_{t \ge 0}\) is the unique mild solution to

$$\begin{aligned} dX^x_t = AX^x_tdt + dZ_t, \qquad X^x_0 = x \in H, \ \ t \ge 0, \end{aligned}$$
(2.1)

where \((A, D(A))\) denotes the generator of \((S(t))_{t \ge 0}\), i.e. \((X^x_t)_{t \ge 0}\) satisfies

$$\begin{aligned} X^x_t = S(t)x + \int _0^t S(t-s)dZ_s, \qquad t \ge 0. \end{aligned}$$

The characteristic function of \((X^x_t)_{t \ge 0}\) is given by

$$\begin{aligned} \mathbb {E}\left[ \mathrm {e}^{\mathrm {i} \langle u, X^x_t \rangle _H} \right] = \exp \left( \mathrm {i}\langle S(t)x, u \rangle _H + \int _0^t \Psi (S(r)^*u)dr \right) , \qquad u \in H, \ \ t \ge 0. \end{aligned}$$

See, e.g., the review article [3], where sufficient conditions for the existence and uniqueness as well as properties of invariant measures are also discussed. It is well-known that the Ornstein–Uhlenbeck process has a unique invariant measure provided that \((S(t))_{t \ge 0}\) is uniformly exponentially stable, that is,

$$\begin{aligned} \exists \alpha > 0, \ M \ge 1: \qquad \Vert S(t) \Vert _{L(H)} \le M\mathrm {e}^{-\alpha t}, \qquad t \ge 0, \end{aligned}$$

and the Lévy measure \(\mu \) satisfies a \(\log \)-integrability condition for its big jumps

$$\begin{aligned} \int _{ \{ \Vert z \Vert _H > 1 \} } \log (1 + \Vert z \Vert _H) \mu (dz) < \infty . \end{aligned}$$
(2.2)

Below we show that for a uniformly convergent semigroup \((S(t))_{t \ge 0}\) the corresponding Ornstein–Uhlenbeck process may admit multiple invariant measures parameterized by the range of the limiting projection operator of the semigroup.

Theorem 2.1

Suppose that \((S(t))_{t \ge 0}\) is uniformly exponentially convergent, i.e. there exists a projection operator P on H and constants \(M \ge 1\), \(\alpha > 0\) such that

$$\begin{aligned} \Vert S(t)x - Px \Vert _H \le M \Vert x \Vert _H \mathrm {e}^{- \alpha t}, \qquad t \ge 0, x \in H. \end{aligned}$$
(2.3)

Suppose that the Lévy process satisfies the following conditions:

  1. (i)

    The drift b satisfies \(Pb = 0\).

  2. (ii)

    The covariance operator Q satisfies \(PQu = 0\) for all \(u \in H\).

  3. (iii)

    The Lévy measure \(\mu \) is supported on \(\mathrm {ker}(P)\) and satisfies (2.2).

Then for each \(x \in H\) it holds

$$\begin{aligned} X_t^x \longrightarrow Px + X_{\infty }^0, \qquad t \rightarrow \infty \end{aligned}$$

in law, where \(X_{\infty }^0\) is an H-valued random variable determined by

$$\begin{aligned} \mathbb {E}\left[ \mathrm {e}^{\mathrm {i} \langle u, X_{\infty }^0 \rangle _H}\right] = \exp \left( \int _0^{\infty } \Psi (S(r)^*u)dr \right) . \end{aligned}$$

In particular, the set of all limiting distributions for the Ornstein–Uhlenbeck process \((X^x_t)_{t \ge 0}\) is given by \(\left\{ \delta _{y} *\mu _{\infty } \ | \ y \in \mathrm {ran}(P) \right\} \), where \(\mu _{\infty }\) denotes the law of \(X_{\infty }^0\).

Proof

We first prove the existence of a constant \(C > 0\) such that

$$\begin{aligned} \int _0^{\infty }|\Psi (S(r)^*u)|dr \le C ( \Vert u \Vert _H + \Vert u \Vert _H^2), \qquad u \in H, \end{aligned}$$
(2.4)

where \(S(r)^*\) denotes the adjoint operator of S(r) in L(H). To do so we estimate

$$\begin{aligned} | \Psi (S(r)^*u) |&\le | \langle b, S(r)^*u \rangle _H | + |\langle QS(r)^*u, S(r)^*u \rangle _H| \\&\quad + \int _{ \{ \Vert z \Vert _H \le 1 \} } \left| \mathrm {e}^{\mathrm {i}\langle S(r)^*u, z\rangle _H} - 1 - \mathrm {i}\langle S(r)^*u,z \rangle _H \right| \mu (dz) \\&\quad + \int _{ \{\Vert z \Vert _H > 1 \}} \left| \mathrm {e}^{\mathrm {i}\langle S(r)^*u, z\rangle _H} - 1\right| \mu (dz) \\&= I_1 + I_2 + I_3 + I_4. \end{aligned}$$

We find by (2.3) that \(\Vert S(r)x \Vert _H \le M\mathrm {e}^{- \alpha r}\Vert x \Vert _H\) for all \(x \in \mathrm {ker}(P)\) and hence

$$\begin{aligned} I_1 = | \langle S(r)b, u \rangle _H | \le \Vert u \Vert _H \Vert S(r)b \Vert _H \le \Vert u \Vert _H M \mathrm {e}^{- \alpha r} \Vert b \Vert _H. \end{aligned}$$

For the second term \(I_2\) we use \(\mathrm {ran}(Q) \subset \mathrm {ker}(P)\) so that

$$\begin{aligned} \Vert S(r)Qu \Vert _H \le M\mathrm {e}^{-\alpha r} \Vert Qu \Vert _H \le M \mathrm {e}^{- \alpha r} \Vert Q \Vert _{L(H)} \Vert u \Vert _H. \end{aligned}$$

This yields \(\Vert QS(r)^* \Vert _{L(H)} = \Vert S(r)Q\Vert _{L(H)} \le M \mathrm {e}^{- \alpha r} \Vert Q \Vert _{L(H)}\) and hence

$$\begin{aligned} I_2&= |\langle QS(r)^*u, S(r)^*u \rangle _H| \\&\le \Vert QS(r)^*u \Vert _H \Vert S(r)^* u\Vert _H \\&\le M\Vert u \Vert _H \Vert QS(r)^*u \Vert _{H} \\&\le M\Vert u \Vert _H^2 \Vert Q \Vert _{L(H)}M \mathrm {e}^{-\alpha r}. \end{aligned}$$

For the third term \(I_3\) we obtain

$$\begin{aligned} I_3&\le C \int _{ \{ \Vert z \Vert _H \le 1\} } | \langle S(r)^*u, z \rangle _H|^2 \mu (dz) \\&= C \int _{ \{ \Vert z \Vert _H \le 1 \} \cap \mathrm {ker}(P) } | \langle u, S(r)z \rangle _H |^2 \mu (dz) \\&\le C\Vert u\Vert _H^2 \mathrm {e}^{- \alpha r} \int _{ \{ \Vert z \Vert _H \le 1 \} } \Vert z \Vert _H^2 \mu (dz), \end{aligned}$$

where \(C > 0\) is a generic constant. Proceeding similarly for the last term, we obtain

$$\begin{aligned} I_4&\le C \int _{ \{ \Vert z \Vert _H> 1 \} } \min \left\{ 1, | \langle S(r)^*u, z \rangle _H| \right\} \mu (dz) \\&\le C \int _{ \{ \Vert z \Vert _H> 1 \} \cap \mathrm {ker}(P) } \min \left\{ 1, \Vert u\Vert _H \mathrm {e}^{- \alpha r} \Vert z \Vert _H \right\} \mu (dz) \\&\le C \Vert u \Vert _H \mathrm {e}^{- \alpha r} \left( \mu ( \{ \Vert z \Vert _H> 1\} ) + \int _{ \{ \Vert z\Vert _H > 1\}} \log ( 1 + \Vert z \Vert _H) \mu (dz) \right) , \end{aligned}$$

where we have used, for \(a = \Vert u\Vert _H \mathrm {e}^{-\alpha r}\), \(b = \Vert z\Vert _H\), the elementary inequalities

$$\begin{aligned} \min \{1,ab\}&\le C \log (1 + ab) \\&\le C \min \{ \log (1 +a), \log (1+b)\} + C \log (1+a)\log (1+b) \\&\le C a \left( 1 + \log (1+b) \right) , \end{aligned}$$

see [19, appendix]. Combining the estimates for \(I_1,I_2,I_3, I_4\) we conclude that (2.4) is satisfied. Hence, using

$$\begin{aligned} \lim _{t \rightarrow \infty } \langle S(t)x, u \rangle _H = \langle Px,u \rangle _H \end{aligned}$$

we find that

$$\begin{aligned} \lim \limits _{t \rightarrow \infty } \mathbb {E}\left[ \mathrm {e}^{\mathrm {i} \langle u, X_t^x \rangle _H} \right] = \exp \left( \mathrm {i}\langle Px, u \rangle _H + \int _0^{\infty } \Psi (S(r)^*u)dr \right) . \end{aligned}$$
(2.5)

Since, in view of (2.4), \(u \longmapsto \int _0^{\infty }\Psi (S(r)^*u)dr\) is continuous at \(u = 0\), the assertion follows from Lévy’s continuity theorem combined with the particular form of (2.5). \(\square \)
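The statement of Theorem 2.1 can be visualized by a minimal finite-dimensional sketch (all parameters below are illustrative assumptions, not part of the theorem): take \(H = \mathbb {R}^2\) and \(A = \mathrm {diag}(0,-1)\), so that \(S(t) = \mathrm {diag}(1,\mathrm {e}^{-t})\) satisfies (2.3) with P the projection onto the first coordinate, and let the driving noise be a standard Brownian motion acting only on \(\mathrm {ker}(P)\) (so \(\mu = 0\) and conditions (i)–(iii) hold trivially):

```python
import numpy as np

rng = np.random.default_rng(0)

# S(t) = diag(1, e^{-t}) converges to P = diag(1, 0); the noise acts
# only on ker(P), so X_t^x -> Px + X_inf^0 in law, with X_inf^0
# concentrated on ker(P) and stationary variance 1/2 there.
def simulate(x0, T=10.0, n_steps=10_000, n_paths=5_000):
    dt = T / n_steps
    X = np.tile(np.asarray(x0, dtype=float), (n_paths, 1))
    for _ in range(n_steps):
        # Euler-Maruyama step: dX1 = 0, dX2 = -X2 dt + dW2
        X[:, 1] += -X[:, 1] * dt + np.sqrt(dt) * rng.standard_normal(n_paths)
    return X

for x0 in [(2.0, 3.0), (2.0, -5.0), (-1.0, 0.0)]:
    X = simulate(x0)
    # empirical mean approaches Px0 and Var(X2) -> 1/2
    print(x0, X.mean(axis=0).round(2), X[:, 1].var().round(2))
```

Starting points with the same first coordinate produce the same empirical limit, in line with the parametrization of the limiting distributions by \(\mathrm {ran}(P)\).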

The next remark shows that the Lévy driven OU-process is a particular case of (1.1) with \(F,\sigma , \gamma \) independent of x.

Remark 2.2

Let \(F, \sigma , \gamma \) be independent of the state variable \(x \in H\). Then (1.1) takes the form

$$\begin{aligned} dX_t = (AX_t + F)dt + \sigma dW_t + \int _{E}\gamma (\nu )\widetilde{N}(dt,d\nu ). \end{aligned}$$
(2.6)

Setting

$$\begin{aligned} Z_t = Ft + \sigma W_t + \int _0^t \int _{E} \gamma (\nu ) \widetilde{N}(ds,d\nu ), \end{aligned}$$

we observe that \((Z_t)_{t \ge 0}\) is a Lévy process with characteristic triplet \((F, \sigma , \mu \circ \gamma ^{-1})\), up to a possible change of drift related to the compensation of jumps. This shows that (2.6) is equivalent to

$$\begin{aligned} dX_t = AX_t dt + dZ_t \end{aligned}$$

and hence the Lévy driven OU-process covers the case where \(F,\sigma ,\gamma \) in (1.1) are independent of the state variables.

Below we briefly discuss an application of this result to a stochastic perturbation of the Kolmogorov equation associated with a symmetric Markov semigroup. Let E be a Polish space and \(\eta \) a Borel probability measure on E. Let \((A, D(A))\) be the generator of a symmetric Markov semigroup \((S(t))_{t \ge 0}\) on \(H := L^2(E,\eta )\). Then there exists, for each \(f \in D(A)\), a unique solution to the Kolmogorov equation (see, e.g., [31])

$$\begin{aligned} \frac{dv(t)}{dt} = Av(t), \qquad v(0) = f. \end{aligned}$$

Below we consider an additive stochastic perturbation of this equation in the sense of Itô, i.e. the stochastic partial differential equation

$$\begin{aligned} dv(t) = Av(t)dt + dZ_t, \qquad v(0) = f, \end{aligned}$$
(2.7)

where \((Z_t)_{t \ge 0}\) is an \(L^2(E,\eta )\)-valued Lévy process with characteristic function \(\Psi \). Let \((v(t;f))_{t \ge 0}\) be the unique mild solution to this equation.

Corollary 2.3

Suppose that the semigroup generated by \((A, D(A))\) on \(L^2:=L^2(E,\eta )\) satisfies (2.3) with the projection operator

$$\begin{aligned} Pv = \int _E v(x) \eta (dx), \end{aligned}$$

and \(H=L^2(E,\eta )\). Assume that the Lévy process \((Z_t)_{t \ge 0}\) satisfies the conditions (i) – (iii) of Theorem 2.1. Then

$$\begin{aligned} v(t; f) \longrightarrow \int _E f(x) \eta (dx) + v(\infty ), \qquad t \rightarrow \infty \end{aligned}$$

in law, where \(v(\infty )\) is a random variable whose characteristic function is given by

$$\begin{aligned} \mathbb {E}\left[ \mathrm {e}^{\mathrm {i} \langle u, v(\infty ) \rangle _{L^2}}\right] = \exp \left( \int _0^{\infty } \Psi (S(r)^*u)dr \right) . \end{aligned}$$

We close this section with an example of a semigroup \((S(t))_{t \ge 0}\) for which this corollary can be applied.

Example 2.4

Let \((X_t)_{t \ge 0}\) be a Feller process on a separable Hilbert space E and let \((p_t)_{t \ge 0}\) be its transition semigroup acting on \(C_b(E)\). Suppose that \((X_t)_{t \ge 0}\) has a unique invariant measure \(\eta \). Then, by Jensen’s inequality, \((p_t)_{t \ge 0}\) can be uniquely extended to a strongly continuous semigroup on \(L^2(E, \eta )\) which is for simplicity again denoted by \((p_t)_{t \ge 0}\). Suppose that this semigroup is \(L^2\)-exponentially convergent in the sense that there exists \(\lambda _0 > 0\) such that

$$\begin{aligned} \int _E \left( p_tf - \int _E f(x)\eta (dx) \right) ^2 d\eta \le \mathrm {e}^{-2\lambda _0 t} \int _E \left( f - \int _E f(x)\eta (dx) \right) ^2 d\eta , \qquad \forall f \in L^2(E,\eta ),\ t \ge 0. \end{aligned}$$

Then \((p_t)_{t \ge 0}\) satisfies (2.3) with \(M = 1\), \(\alpha = \lambda _0\), and projection operator \(Pv = \int _E v(x) \eta (dx)\).
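Example 2.4 can be made concrete by the following sketch for a toy reversible two-state chain (a finite state space, so not literally the Hilbert-space setting above, but the \(L^2(\eta )\) computation is identical; all parameters are illustrative):

```python
import numpy as np
from scipy.linalg import expm

# Reversible two-state chain with Q-matrix G; eta is its invariant
# measure (eta @ G = 0) and the spectral gap of G equals 3, which is
# the exponential L^2(eta)-convergence rate of p_t f towards eta(f).
G = np.array([[-2.0, 2.0], [1.0, -1.0]])
eta = np.array([1.0, 2.0]) / 3.0
f = np.array([5.0, -1.0])
gap = 3.0                                   # = -(second eigenvalue of G)

for t in [0.5, 1.0, 2.0]:
    ptf = expm(t * G) @ f                   # p_t f evaluated on both states
    mean = eta @ f                          # P f = integral of f w.r.t. eta
    lhs = np.sqrt(eta @ (ptf - mean) ** 2)  # ||p_t f - P f||_{L^2(eta)}
    rhs = np.exp(-gap * t) * np.sqrt(eta @ (f - mean) ** 2)
    print(round(lhs, 6), round(rhs, 6))
```

For a reversible chain the decay rate is exactly the spectral gap, so the two printed columns coincide.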

3 Preliminaries

3.1 Framework and notation

Here and throughout this work, \((\Omega ,\mathcal {F},(\mathcal {F}_t)_{t\in \mathbb {R}_+},\mathbb {P})\) is a filtered probability space satisfying the usual conditions. Let U be a separable Hilbert space and \(W=(W_t)_{t \ge 0}\) be a Q-Wiener process with respect to \((\mathcal {F}_t)_{t\in \mathbb {R}_+}\) on \((\Omega ,\mathcal {F},(\mathcal {F}_t)_{t\in \mathbb {R}_+},\mathbb {P})\), where \(Q:U\rightarrow U\) is a non-negative, symmetric, trace class operator. Let E be a Polish space, \(\mathcal {E}\) the Borel-\(\sigma \)-field on E, and \(\mu \) a \(\sigma \)-finite measure on \((E,\mathcal {E})\). Let \(N(dt,d\nu )\) be an \((\mathcal {F}_t)_{t \ge 0}\)-Poisson random measure with compensator \(dt \mu (d\nu )\) and denote by \(\widetilde{N}(dt,d\nu ) = N(dt,d\nu ) - dt \mu (d\nu )\) the corresponding compensated Poisson random measure. Suppose that \((W_t)_{t \ge 0}\) and \(N(dt,d\nu )\) are independent.

In this work we investigate the long-time behavior of mild solutions to the stochastic partial differential equation (1.1) with initial condition \(X_0 \in L^2(\Omega , \mathcal {F}_0, \mathbb {P};H)\), that is

$$\begin{aligned} dX_t=(AX_t+F(X_t)) dt + \sigma (X_t)dW_t + \int _E\gamma (X_{t},\nu )\widetilde{N}(dt,d\nu ), \qquad t \ge 0, \end{aligned}$$
(3.1)

where \((A, D(A))\) is the generator of a strongly continuous semigroup \((S(t))_{t \ge 0}\) on H, \(H \ni x \mapsto F(x) \in H\) and \(H \ni x\mapsto \sigma (x) \in L_2^0\) are Borel measurable mappings, and \((x,\nu )\mapsto \gamma (x,\nu )\) is measurable from \(( H\times E,\mathcal {B}(H)\otimes \mathcal {E})\) to \((H,\mathcal {B}(H))\). Here \(\mathcal {B}(H)\) denotes the Borel-\(\sigma \)-algebra on H, and \(L_2^0:=L_2^0(H)\) is the Hilbert space of all Hilbert–Schmidt operators from \(U_0\) to H, where \(U_0 := Q^{1/2}U\) is a separable Hilbert space endowed with the scalar product

$$\begin{aligned} \langle x,y \rangle _{U_0}:=\langle Q^{-1/2} x, Q^{-1/2} y\rangle _U=\sum _{k\in \mathbb {N}}\tfrac{1}{\lambda _k}\langle x,e_k\rangle _U\langle e_k,y\rangle _U,\qquad \forall x,y \in U_0, \end{aligned}$$

and \(Q^{-1/2}\) denotes the pseudoinverse of \(Q^{1/2}\). Here \((e_j)_{j\in \mathbb {N}}\) denotes an orthonormal basis of eigenvectors of Q in U with corresponding eigenvalues \((\lambda _j)_{j\in \mathbb {N}}\). For comprehensive introductions to integration concepts in infinite dimensional settings we refer e.g. to [10] for the case of Q-Wiener processes and to [3, 28, 32] for compensated Poisson random measures as integrators. Throughout this work we suppose that the coefficients \(F,\sigma , \gamma \) are Lipschitz continuous. More precisely:

  1. (A1)

    There exist constants \(L_F, L_{\sigma }, L_{\gamma } \ge 0\) such that for all \(x,y \in H\)

    $$\begin{aligned} \Vert F(x) - F(y) \Vert _H^2&\le L_{F}\Vert x-y\Vert _{H}^2, \nonumber \\ \Vert \sigma (x) - \sigma (y)\Vert _{L_2^0(H)}^2&\le L_{\sigma }\Vert x-y\Vert _H^2, \nonumber \\ \int _{E} \Vert \gamma (x,\nu ) - \gamma (y,\nu )\Vert _H^2 \mu (d\nu )&\le L_{\gamma } \Vert x-y\Vert _H^2. \end{aligned}$$
    (3.2)

    Moreover we suppose that

    $$\begin{aligned} \int _E \Vert \gamma (0,\nu )\Vert _H^2 \mu (d\nu )<\infty . \end{aligned}$$
    (3.3)

Note that condition (3.3), combined with (3.2), implies that the jump coefficient satisfies the usual linear growth condition, i.e.

$$\begin{aligned} \int _E \Vert \gamma (x, \nu )\Vert _H^2 \mu (d\nu )&\le 2\int _E \Vert \gamma (x,\nu ) - \gamma (0,\nu )\Vert _H^2 \mu (d\nu ) + 2 \int _{E} \Vert \gamma (0,\nu )\Vert _H^2 \mu (d\nu ) \\&\le 2 \max \left\{ L_{\gamma }, \int _{E} \Vert \gamma (0,\nu )\Vert _H^2 \mu (d\nu ) \right\} (1 + \Vert x \Vert _H^2). \end{aligned}$$

Moreover, from (GDC) and (3.2) it follows that

$$\begin{aligned} \langle Ax, x \rangle _H \le \left( \beta + \sqrt{L_F}\right) \Vert x \Vert _H^2, \qquad x \in D(A). \end{aligned}$$

Hence \(A - (\beta + \sqrt{L_F})I\) is dissipative and thus, by the Lumer–Phillips theorem, the semigroup \((S(t))_{t \ge 0}\) generated by \((A, D(A))\) is quasi-contractive, i.e.

$$\begin{aligned} \Vert S(t)x \Vert _H \le \mathrm {e}^{\left( \beta + \sqrt{L_F}\right) t}\Vert x\Vert _H, \quad x \in H. \end{aligned}$$
(3.4)

Then, under conditions (GDC) and (A1), for each initial condition \(X_0\in L^2(\Omega ,\mathcal {F}_0,\mathbb {P};H)\) there exists a unique càdlàg, \((\mathcal {F}_t)_{t \ge 0}\)-adapted, mean square continuous mild solution \((X_t)_{t\ge 0}\) to (3.1) such that, for each \(T > 0\), there exists a constant \(C(T) > 0\) satisfying

$$\begin{aligned} \mathbb {E} \left[ \sup _{t\in [0,T]} \Vert X_t \Vert _H^2 \right] \le C(T)\left( 1 + \mathbb {E}\left[ \Vert X_0\Vert _H^2 \right] \right) \end{aligned}$$
(3.5)

This means that \((X_t)_{t \ge 0}\) satisfies \(\mathbb {P}\)-a.s.

$$\begin{aligned} X_t&= S(t)X_0 + \int _0^t S(t-s)F(X_s)ds + \int _0^t S(t-s)\sigma (X_s)dW_s \nonumber \\&\quad + \int _0^t \int _E S(t-s) \gamma (X_s, \nu ) \widetilde{N}(ds,d\nu ), \qquad t \ge 0, \end{aligned}$$
(3.6)

where all (stochastic) integrals are well-defined, see, e.g., [1, 28], and [17]. Moreover, for each \(X_0,Y_0 \in L^2(\Omega , \mathcal {F}_0, \mathbb {P}; H)\), the corresponding unique solutions \((X_t)_{t \ge 0}\) and \((Y_t)_{t \ge 0}\) satisfy

$$\begin{aligned} \mathbb {E} \left[ \Vert X_t - Y_t \Vert _H^2 \right] \le C(T)\mathbb {E}\left[ \Vert X_0-Y_0 \Vert _H^2 \right] , \qquad t \in [0,T]. \end{aligned}$$
(3.7)

If \(X_0 \equiv x \in H\), then we denote by \((X_t^x)_{t \ge 0}\) the corresponding solution to (3.1). This solution constitutes a Markov process whose transition probabilities \(p_t(x,dy) = \mathbb {P}[ X_t^x \in dy]\) are measurable with respect to x. By a slight abuse of notation we denote by \((p_t)_{t \ge 0}\) its transition semigroup, i.e., for each bounded measurable function \(f: H \longrightarrow \mathbb {R}\), \(p_tf\) is given by

$$\begin{aligned} p_tf(x) = \mathbb {E}\left[ f(X_t^x) \right] = \int _{H}f(y)p_t(x,dy), \qquad t \ge 0, \ \ x \in H. \end{aligned}$$

Using the continuous dependence on the initial condition, see (3.7), it can be shown that \(p_tf \in C_b(H)\) for each \(f \in C_b(H)\), i.e. the transition semigroup is \(C_b\)-Feller.
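As a simple illustration of the transition semigroup, \(p_tf(x)\) can be approximated by plain Monte Carlo; the sketch below does this for the scalar toy equation \(dX_t = -X_t\,dt + dW_t\) (an assumption made purely for illustration), whose transition law is known in closed form:

```python
import numpy as np

rng = np.random.default_rng(1)

# For dX = -X dt + dW one has p_t(x, .) = N(e^{-t} x, (1 - e^{-2t}) / 2),
# so p_t f can be computed exactly for f = cos and compared with the
# Monte Carlo estimate of E[f(X_t^x)].
t, x = 1.0, 2.0
m, v = np.exp(-t) * x, (1.0 - np.exp(-2.0 * t)) / 2.0

samples = m + np.sqrt(v) * rng.standard_normal(100_000)
mc = np.cos(samples).mean()               # Monte Carlo estimate of p_t f(x)
exact = np.exp(-v / 2.0) * np.cos(m)      # E[cos(N(m, v))] = e^{-v/2} cos(m)
print(round(mc, 4), round(exact, 4))
```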

In this work we investigate the existence of invariant measures and the convergence of the transition probabilities towards these measures for the Markov process \((X_t^x)_{t \ge 0}\), with particular focus on the cases where uniqueness of invariant measures fails to hold. We denote by \(p_t^*\) the adjoint operator of \(p_t\), defined by

$$\begin{aligned} p_t^* \rho (dx) = \int _H p_t(y,dx) \rho (dy), \qquad t \ge 0. \end{aligned}$$

Recall that a probability measure \(\pi \) on \((H, \mathcal {B}(H))\) is called invariant measure for the semigroup \((p_t)_{t \ge 0}\) if and only if \(p_t^* \pi = \pi \) holds for each \(t \ge 0\). Let \(\mathcal {P}_2(H)\) be the space of Borel probability measures \(\rho \) on \((H, \mathcal {B}(H))\) with finite second moments. Recall that \(\mathcal {P}_2(H)\) is separable and complete when equipped with the Wasserstein-2-distance

$$\begin{aligned} \mathrm {W}_2(\rho , \widetilde{\rho }) = \inf _{ G \in \mathcal {H}(\rho , \widetilde{\rho })} \left( \int _{H \times H} \Vert x - y \Vert _H^2 G(dx,dy) \right) ^{\frac{1}{2}}, \qquad \rho , \widetilde{\rho } \in \mathcal {P}_2(H). \end{aligned}$$
(3.8)

Here \(\mathcal {H}(\rho ,\widetilde{\rho })\) denotes the set of all couplings of \((\rho , \widetilde{\rho })\), i.e. Borel probability measures on \(H \times H\) whose marginals are given by \(\rho \) and \(\widetilde{\rho }\), respectively, see [40, Section 6] for a general introduction to couplings and Wasserstein distances.
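For later numerical experiments it is convenient to recall that for Gaussian measures the infimum in (3.8) is attained by an explicit coupling and \(\mathrm {W}_2\) admits a closed form (the well-known Bures-type formula; the sketch below is a generic helper, not specific to the present paper):

```python
import numpy as np
from scipy.linalg import sqrtm

def w2_gaussian(m1, C1, m2, C2):
    """W_2 between N(m1, C1) and N(m2, C2):
    W_2^2 = ||m1 - m2||^2 + tr(C1 + C2 - 2 (C2^{1/2} C1 C2^{1/2})^{1/2})."""
    rC2 = sqrtm(C2)
    cross = sqrtm(rC2 @ C1 @ rC2)
    d2 = np.sum((np.asarray(m1, float) - np.asarray(m2, float)) ** 2) \
        + np.trace(C1 + C2 - 2.0 * np.real(cross))
    return float(np.sqrt(max(d2, 0.0)))

print(w2_gaussian([0, 0], np.eye(2), [1, 0], 0.25 * np.eye(2)))
# equal covariances: the distance reduces to the mean difference
print(w2_gaussian([0, 0], np.eye(2), [3, 4], np.eye(2)))  # -> 5.0
```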

3.2 Discussion of generalized dissipativity condition

In this section we briefly discuss the condition

$$\begin{aligned} \langle Ax, x \rangle _H \le - \lambda _0 \Vert x\Vert _H^2 + (\lambda _0 +\lambda _1) \Vert P_1 x \Vert _H^2, \qquad x \in D(A), \end{aligned}$$
(3.9)

where \(\lambda _0 > 0\) and \(\lambda _1 \ge 0\). Note that, if (3.9) and condition (A1) are satisfied, then

$$\begin{aligned}&\langle Ax - Ay, x-y\rangle _H + \langle F(x) - F(y), x-y\rangle _H \nonumber \\&\le \langle Ax - Ay, x-y\rangle _H + \sqrt{L_F}\Vert x-y\Vert _H^2 \nonumber \\&\le - \left( \lambda _0 - \sqrt{L_F}\right) \Vert x - y \Vert _H^2 + \left( \lambda _0 + \lambda _1 \right) \Vert P_1 x - P_1y \Vert _H^2, \end{aligned}$$
(3.10)

i.e. the generalized dissipativity condition (GDC) is satisfied for \(\alpha = \lambda _0 - \sqrt{L_F}\) and \(\beta = \lambda _1 + \sqrt{L_F}\), provided that \(\lambda _0 > \sqrt{L_F}\).

Proposition 3.1

Suppose that there exists an orthogonal decomposition \(H = H_0 \oplus H_1\) of H into closed linear subspaces \(H_0, H_1 \subset H\) such that \((S(t))_{t \ge 0}\) leaves \(H_0\) and \(H_1\) invariant and there exist constants \(\lambda _0 > 0\) and \(\lambda _1 \ge 0\) satisfying

$$\begin{aligned} \Vert S(t) x_0\Vert _H\le \mathrm {e}^{-\lambda _0 t}\Vert x_0\Vert _H, \qquad \Vert S(t)x_1 \Vert _H \le \mathrm {e}^{\lambda _1 t} \Vert x_1\Vert _H, \qquad \forall t\ge 0 \end{aligned}$$

for all \(x_0 \in H_0\) and \(x_1 \in H_1\). Then (3.9) holds for \(P_1\) being the orthogonal projection operator onto \(H_1\).

Proof

Let \(P_0\) be the orthogonal projection operator onto \(H_0\). Since \((S(t))_{t \ge 0}\) leaves the closed subspace \(H_0\) invariant, its restriction \((S(t)|_{H_0})_{t\ge 0}\) onto \(H_0\) is a strongly continuous semigroup of contractions on \(H_0\) with generator \((A_0,D(A_0))\) being the part of A in \(H_0\), that is

$$\begin{aligned} A_0x = Ax, \qquad x \in D(A_0) = \{ y \in D(A) \cap H_0 \ | \ Ay \in H_0 \}. \end{aligned}$$

Since \(H_0\) is closed and S(t) leaves \(H_0\) invariant, it follows that \(Ay = \lim _{t \rightarrow 0} \frac{S(t)y - y}{t} \in H_0\) for \(y \in D(A) \cap H_0\), i.e. \(D(A_0) = D(A) \cap H_0\) and \(P_0: D(A) \rightarrow D(A_0)\).

Arguing exactly in the same way shows that the restriction \((S(t)|_{H_1})_{t\ge 0}\) is a strongly continuous semigroup on \(H_1\) with generator \((A_1,D(A_1))\) given by \(A_1x = Ax\) and \(x \in D(A_1) = D(A) \cap H_1\), so that \(P_1: D(A) \rightarrow D(A_1)\). Since S(t) leaves \(H_0\) and \(H_1\) invariant, we obtain \(P_0S(t) = S(t)P_0\), \(P_1 S(t) = S(t)P_1\), from which we conclude that \(AP_1 x = P_1 Ax\) and \(AP_0x = P_0Ax\) for \(x \in D(A)\).

Since \((\mathrm {e}^{\lambda _0 t}S(t)|_{H_0})_{t\ge 0}\) is a strongly continuous semigroup of contractions on \(H_0\) with generator \(A_0 + \lambda _0 I\), and \((\mathrm {e}^{-\lambda _1 t}S(t)|_{H_1})_{t\ge 0}\) is a strongly continuous semigroup of contractions on \(H_1\) with generator \(A_1 - \lambda _1 I\), we have by the Lumer-Phillips theorem (see [31, Theorem 4.3])

$$\begin{aligned} \langle A_0x_0, x_0\rangle _H \le - \lambda _0 \Vert x_0 \Vert _H^2 \ \text { and } \ \langle A_1x_1, x_1 \rangle _H \le \lambda _1 \Vert x_1 \Vert _H^2, \quad x_0 \in H_0,\ x_1 \in H_1. \end{aligned}$$

Hence we find that

$$\begin{aligned} \langle A x, x \rangle _H&= \langle Ax, P_0 x \rangle _H + \langle Ax, P_1x \rangle _H \\&= \langle P_0 Ax, P_0x \rangle _H + \langle P_1Ax, P_1x \rangle _H \\&=\langle A_0P_0 x , P_0 x \rangle _H + \langle A_1P_1 x,P_1 x\rangle _H \\&\le -\lambda _0 \Vert P_0 x\Vert _H^2 + \lambda _1 \Vert P_1 x\Vert _H^2 \\&= - \lambda _0 \Vert x \Vert _H^2 + (\lambda _0 + \lambda _1)\Vert P_1 x\Vert _H^2, \end{aligned}$$

where the last equality follows from \(H_0 \perp H_1\). This proves the assertion. \(\square \)

At this point it is worthwhile to mention that Onno van Gaans investigated in [39] ergodicity for a class of Lévy driven stochastic partial differential equations where the semigroup \((S(t))_{t \ge 0}\) was supposed to be hyperbolic. Proposition 3.1 can also be applied to hyperbolic semigroups, provided that the hyperbolic decomposition is orthogonal. The conditions of the previous proposition are satisfied whenever \((S(t))_{t \ge 0}\) is a symmetric, uniformly convergent semigroup.

Remark 3.2

Suppose that \((S(t))_{t \ge 0}\) is a strongly continuous semigroup on H and there exists an orthogonal projection operator P on H and \(\lambda _0 > 0\) such that

$$\begin{aligned} \Vert S(t)x - Px \Vert _{H} \le \mathrm {e}^{- \lambda _0 t}\Vert x - Px\Vert _H, \qquad t \ge 0, \ \ x \in H. \end{aligned}$$
(3.11)

Then the conditions of Proposition 3.1 are satisfied for \(H_0 = \mathrm {ker}(P)\) and \(H_1 = \mathrm {ran}(P)\) with \(\lambda _0 > 0\) and \(\lambda _1 = 0\). In particular, \((S(t))_{t \ge 0}\) is a semigroup of contractions.

The following example shows that (3.9) can also be satisfied for non-symmetric and non-convergent semigroups.

Example 3.3

Let \(H = \mathbb {R}^2\), \(H_0 = \mathbb {R}\times \{0\}\), \(H_1 = \{0\} \times \mathbb {R}\), and denote by \(P_0, P_1\) the projection operators onto \(H_0\) and \(H_1\), respectively. Let A be given by \(A = \begin{pmatrix} -1 & 1 \\ 0 & 1 \end{pmatrix}\). Then

$$\begin{aligned} \left\langle \begin{pmatrix} x \\ y \end{pmatrix}, A \begin{pmatrix}x \\ y \end{pmatrix} \right\rangle _H&= -x^2 + xy + y^2 \\&\le - \frac{1}{2} (x^2 + y^2) + 2 y^2 \\&= - \frac{1}{2} \Vert (x,y) \Vert _H^2 + 2 \Vert P_1(x,y) \Vert _H^2, \end{aligned}$$

i.e. (3.9) holds for \(\lambda _0 = \frac{1}{2}\) and \(\lambda _1 = \frac{3}{2}\). Since \( \mathrm {e}^{tA} = \begin{pmatrix} \mathrm {e}^{-t} & \frac{\mathrm {e}^t - \mathrm {e}^{-t}}{2} \\ 0 & \mathrm {e}^{t} \end{pmatrix}\), it is clear that neither the conditions of Proposition 3.1 nor those of Remark 3.2 are satisfied.
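Both computations in Example 3.3 are easily confirmed numerically; the following minimal sketch checks the quadratic-form bound on random samples and compares \(\mathrm {e}^{tA}\) with the stated closed form:

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[-1.0, 1.0], [0.0, 1.0]])
rng = np.random.default_rng(2)

# check <z, Az> <= -1/2 ||z||^2 + 2 z_2^2 on random samples
z = rng.standard_normal((100_000, 2))
lhs = np.sum(z * (z @ A.T), axis=1)                     # <z, Az>
rhs = -0.5 * np.sum(z ** 2, axis=1) + 2.0 * z[:, 1] ** 2
print(bool(np.all(lhs <= rhs + 1e-12)))                 # True

# compare expm(tA) with the closed form of e^{tA}
t = 0.7
closed = np.array([[np.exp(-t), (np.exp(t) - np.exp(-t)) / 2.0],
                   [0.0, np.exp(t)]])
print(bool(np.allclose(expm(t * A), closed)))           # True
```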

3.3 Key stability estimate

Define, for \(x,y \in D(A)\), the function

$$\begin{aligned} \mathcal {L}(\Vert \cdot \Vert _H^2)(x,y)&:= 2 \langle A (x-y) + F(x)-F(y), x-y\rangle _H+ \Vert \sigma (x)-\sigma (y)\Vert _{L_2^0(H)}^2 \\&\quad + \int _E \Vert \gamma (x,\nu )-\gamma (y,\nu )\Vert _H^2 \mu (d\nu ). \end{aligned}$$

Remark that if (1.1) has a strong solution, then the function

$$\begin{aligned} \mathcal {L}(\Vert \cdot \Vert _H^2)(z):= 2 \langle A (z) + F(z), z\rangle _H+ \Vert \sigma (z)\Vert _{L_2^0(H)}^2+ \int _E \Vert \gamma (z,\nu )\Vert _H^2 \mu (d\nu ). \end{aligned}$$

is simply the generator \(\mathcal {L}\) applied to the unbounded function \(z \mapsto \Vert z\Vert _H^2\), see, e.g., [2, equation (3.4)]. Since we work with mild solutions instead, all computations given below additionally require the use of Yosida approximations for the mild solution of (1.1).

Below we first prove a Lyapunov-type estimate for \(\mathcal {L}(\Vert \cdot \Vert _H^2)\) and then deduce from it, by an application of the generalized Itô-formula from Theorem A.2 to (3.1), an estimate for the \(L^2\)-norm of \(X_t^x - X_t^y\).

Lemma 3.4

Assume that conditions (GDC) and (A1) are satisfied. Then

$$\begin{aligned} \mathcal {L}(\Vert \cdot \Vert _H^2)(x,y)\le - \left( 2 \alpha - L_{\sigma } - L_{\gamma } \right) \Vert x-y\Vert _H^2+2(\alpha +\beta ) \Vert P_1x- P_1y\Vert _H^2 \end{aligned}$$
(3.12)

holds for \(x,y \in D(A)\).

Proof

Using first (A1) and then (GDC) we find that

$$\begin{aligned} \mathcal {L}(\Vert \cdot \Vert _H^2)(x,y)&\le (L_{\sigma } + L_{\gamma }) \Vert x -y \Vert _H^2 \\&\quad + 2\langle Ax-Ay,x-y \rangle _H + 2\langle F(x) - F(y), x-y \rangle _H \\&\le - \left( 2\alpha - L_{\sigma } - L_{\gamma } \right) \Vert x-y\Vert _H^2 + 2\left( \alpha + \beta \right) \Vert P_1x - P_1y \Vert _H^2. \end{aligned}$$

This proves the asserted inequality. \(\square \)

The following is our key stability estimate.

Proposition 3.5

Suppose that (GDC) and (A1) are satisfied, that

$$\begin{aligned} \varepsilon := 2\alpha - L_{\sigma } - L_{\gamma } > 0, \end{aligned}$$
(3.13)

and suppose that

$$\begin{aligned} \sup _{x\in H}\int _E \Vert \gamma (x,\nu )\Vert _H^4 \mu (d\nu )<\infty . \end{aligned}$$
(3.14)

Then, for each \(X_0, Y_0 \in L^2(\Omega ,\mathcal {F}_0,\mathbb {P};H)\) and all \(t \ge 0\),

$$\begin{aligned}&\mathbb {E}\left[ \Vert X_t- Y_t\Vert _H^2 \right] \nonumber \\&\quad \le \mathrm {e}^{- \varepsilon t}\mathbb {E} \left[ \Vert X_0 - Y_0\Vert _{H}^2 \right] + 2(\alpha +\beta ) \int _0^t \mathrm {e}^{ - \varepsilon (t-s)}\mathbb {E}\left[ \Vert P_1 X_s-P_1Y_s\Vert _H^2\right] ds, \end{aligned}$$
(3.15)

where \((X_t)_{t \ge 0}\) and \((Y_t)_{t \ge 0}\) denote the unique mild solutions to (3.1) with initial conditions \(X_0\) and \(Y_0\), respectively.

Proof

Let \((X_t^n)_{t\ge 0}\) and \((Y_t^n)_{t\ge 0}\) be the strong solutions to the corresponding Yosida-approximation systems

$$\begin{aligned} {\left\{ \begin{array}{ll} dX^n_t=(AX^n_t+R_n F(X^n_t)) dt +R_n \sigma (X^n_t)dW_t + \int _ER_n\gamma (X^n_{t},\nu )\widetilde{N}(dt,d\nu ),\\ X^n_0=R_nX_0,\quad t \ge 0 \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} {\left\{ \begin{array}{ll} dY^n_t=(AY^n_t+R_n F(Y^n_t)) dt +R_n \sigma (Y^n_t)dW_t + \int _ER_n\gamma (Y^n_{t},\nu )\widetilde{N}(dt,d\nu ),\\ Y^n_0=R_nY_0,\quad t \ge 0 \end{array}\right. } \end{aligned}$$

where \(R_n=n(n-A)^{-1}\) for \(n\in \mathbb {N}\) with \(n > \alpha + \beta + \sqrt{L_F} =: \lambda \). By (3.4) we find for each \(n \ge 1 + \lambda \) the inequality

$$\begin{aligned} \Vert R_nz \Vert _H \le \frac{n}{n - \lambda }\Vert z\Vert _H \le (1 + \lambda )\Vert z \Vert _H. \end{aligned}$$

By classical properties of the resolvent (see [31, Lemma 3.2]), one clearly has \(R_nz \rightarrow z\) in H as \(n \rightarrow \infty \). Moreover, by properties of the Yosida approximation of mild solutions of SPDEs (compare e.g. with Appendix A2 in [28] or Section 2 in [2]) we have

$$\begin{aligned} \lim _{n \rightarrow \infty } \mathbb {E}\left[ \sup _{t \in [0,T]} \Vert X_t^n - X_t \Vert _H^2 + \sup _{t \in [0,T]} \Vert Y_t^n - Y_t \Vert _H^2 \right] = 0, \qquad \forall T > 0 \end{aligned}$$

and hence there exists a subsequence (which is again denoted by n) such that \(X_t^n \longrightarrow X_t\) and \(Y_t^n \longrightarrow Y_t\) hold a.s. for each \(t \ge 0\). Following a method proposed in [2] we verify that sufficient conditions are satisfied to apply the generalized Itô-formula from Theorem A.2 to the function \(G(t,z):= \mathrm {e}^{\varepsilon t}\Vert z \Vert _H^2\), where \(\varepsilon = 2 \alpha - L_{\sigma } - L_{\gamma }\) is given by (3.13):

$$\begin{aligned}&X_t^n-Y_t^n = R_n (X_0 - Y_0)+\int _0^{t}\left\{ A(X_s^n-Y_s^n)+ R_n (F(X_s^n)-F(Y_s^n))\right\} ds \\&\quad + \int _0^{t}R_n( \sigma (X_s^n)-\sigma (Y_s^n))dW_s + \int _0^t\int _E R_n( \gamma (X_{s}^n,\nu )-\gamma (Y_{s}^n,\nu ))\widetilde{N}(ds,d\nu ). \end{aligned}$$

Observe that, by condition (A1) and (3.14), one has

$$\begin{aligned}&\int _0^t\int _E \Vert R_n( \gamma (X_s^n,\nu )-\gamma (Y_s^n,\nu ))\Vert _H^2 \mu (d\nu )ds \\&\qquad + \int _0^t\int _E \Vert R_n( \gamma (X_s^n,\nu )-\gamma (Y_s^n,\nu ))\Vert _H^4 \mu (d\nu )ds \\&\quad \le (1+ \lambda )^2 \int _0^t \int _E \Vert \gamma (X_s^n,\nu ) - \gamma (Y_s^n,\nu ) \Vert _H^2 \mu (d\nu )ds \\&\qquad + 8 (1 + \lambda )^4 \int _0^t \int _E \left( \Vert \gamma (X_s^n,\nu )\Vert _H^4 + \Vert \gamma (Y_s^n, \nu ) \Vert _H^4 \right) \mu (d\nu ) ds \\&\quad \le L_{\gamma } (1 + \lambda )^2 \int _0^t \Vert X_s^n - Y_s^n \Vert _H^2 ds \\&\qquad + 16 (1+\lambda )^4 t \sup _{z \in H} \int _E \Vert \gamma (z,\nu )\Vert _H^4 \mu (d\nu ) < \infty . \end{aligned}$$

Thus we can apply the generalized Itô-formula from Theorem A.2 and obtain (similar to (3.5) in [2])

$$\begin{aligned}&\mathrm {e}^{\varepsilon t} \Vert X_t^n-Y_t^n\Vert _H^2-\Vert R_n(X_0-Y_0)\Vert _H^2\nonumber \\&\quad = \int _0^t \langle 2 \mathrm {e}^{\varepsilon s} (X_s^n-Y_s^n), R_n(\sigma (X_s^n)-\sigma (Y_s^n)) dW_s \rangle _H\nonumber \\&\qquad + \int _0^t \mathrm {e}^{\varepsilon s} \left[ \varepsilon \Vert X_s^n-Y_s^n\Vert _H^2 +\mathcal {L}_n(\Vert \cdot \Vert _H^2)(X_s^n,Y_s^n) \right] ds \nonumber \\&\qquad + \int _0^{t}\int _E \mathrm {e}^{\varepsilon s} \left[ \Vert X_{s}^n-Y_{s}^n+R_n(\gamma (X_{s}^n,\nu )-\gamma (Y_{s}^n,\nu ))\Vert _H^2\right. \nonumber \\&\qquad \left. - \Vert X_{s}^n-Y_{s}^n\Vert _H^2 \right] \widetilde{N}(ds, d\nu ), \end{aligned}$$
(3.16)

where we used, for \(z,w \in D(A)\), the notation

$$\begin{aligned} \mathcal {L}_n(\Vert \cdot \Vert _H^2)(z,w)&:= 2 \langle z-w , A (z-w) + R_n( F(z)-F(w))\rangle _H\\&\quad + \Vert R_n (\sigma (z)-\sigma (w))\Vert _{L_2^0(H)}^2\\&\quad + \int _E \Vert R_n(\gamma (z,\nu )-\gamma (w,\nu ))\Vert _H^2 \mu (d\nu ). \end{aligned}$$

Taking expectations in (3.16) yields

$$\begin{aligned}&\mathrm {e}^{\varepsilon t} \mathbb {E}\left[ \Vert X_t^n-Y_t^n\Vert _{H}^2 \right] - \mathbb {E} \left[ \Vert R_n(X_0-Y_0)\Vert _{H}^2 \right] \nonumber \\&\quad = \mathbb {E}\left[ \int _0^t \mathrm {e}^{\varepsilon s} \left( \varepsilon \Vert X_s^n-Y_s^n\Vert _{H}^2 +\mathcal {L}_n(\Vert \cdot \Vert _{H}^2)(X_s^n,Y_s^n) \right) ds \right] . \end{aligned}$$
(3.17)

Lemma 3.4 yields

$$\begin{aligned}&\mathrm {e}^{\varepsilon t} \mathbb {E} \left[ \Vert X_t^n-Y_t^n\Vert _{H}^2 \right] -\mathbb {E} \left[ \Vert R_n(X_0-Y_0)\Vert _{H}^2\right] \\&\quad - 2(\alpha +\beta ) \int _0^t \mathrm {e}^{\varepsilon s} \mathbb {E}\left[ \Vert P_1X_s^n-P_1Y_s^n\Vert _H^2\right] ds \\&\quad \le \mathbb {E}\left[ \int _0^t \mathrm {e}^{\varepsilon s} (-\mathcal {L}(\Vert \cdot \Vert _{H}^2)(X_s^n,Y_s^n)+\mathcal {L}_n(\Vert \cdot \Vert _{H}^2)(X_s^n,Y_s^n))ds \right] . \end{aligned}$$

Below we prove that the right-hand side tends to zero as \(n \rightarrow \infty \), which implies the assertion of this proposition. To prove the desired convergence to zero we apply the generalized Lebesgue theorem (see [28, Theorem 7.1.8]). To this end we have to prove that

$$\begin{aligned} \mathcal {L}(\Vert \cdot \Vert _{H}^2)(X_s^n,Y_s^n)- \mathcal {L}_n(\Vert \cdot \Vert _{H}^2)(X_s^n,Y_s^n)\rightarrow 0 \end{aligned}$$
(3.18)

holds a.s. for each \(s>0\) as \(n\rightarrow \infty \) and, moreover, there exists a constant \(C > 0\) such that

$$\begin{aligned} |\mathcal {L}(\Vert \cdot \Vert _{H}^2)(X_s^n,Y_s^n)- \mathcal {L}_n(\Vert \cdot \Vert _{H}^2)(X_s^n,Y_s^n)| \le C \Vert X_s^n - Y_s^n \Vert _H^2. \end{aligned}$$
(3.19)

We start with the proof of (3.18). Denote \(F_s^n:=F(X_s^n)-F(Y_s^n)\), \(\sigma _s^n:= \sigma (X_s^n)- \sigma (Y_s^n)\) and \(\gamma _s^n(\nu ):=\gamma (X_s^n,\nu )-\gamma (Y_s^n,\nu )\) and analogously \(F_s:=F(X_s)-F(Y_s)\), \(\sigma _s:= \sigma (X_s)- \sigma (Y_s)\) and \(\gamma _s(\nu ):=\gamma (X_s,\nu )-\gamma (Y_s,\nu )\) for each \(n\in \mathbb {N}\), \(s\ge 0\) and \(\nu \in E\). Then

$$\begin{aligned}&\vert (\mathcal {L}(\Vert \cdot \Vert _H^2)(X_s^n,Y_s^n)- \mathcal {L}_n(\Vert \cdot \Vert _H^2)(X_s^n,Y_s^n))\vert \\&\quad \le 2 \vert \langle X_s^n-Y_s^n, F_s^n-R_nF_s^n\rangle _H \vert + \vert \Vert \sigma _s^n\Vert _{L_2^0}^2-\Vert R_n\sigma _s^n\Vert _{L_2^0}^2\vert \\&\qquad + \left| \int _E \Vert \gamma _s^n(\nu )\Vert _H^2-\Vert R_n\gamma _s^n(\nu )\Vert _H^2 \mu (d\nu ) \right| \\&\quad =: I_1 + I_2 + I_3. \end{aligned}$$

For the first term \(I_1\) we estimate

$$\begin{aligned} I_1&\le 2\Vert X_s^n-Y_s^n\Vert _H\Vert F_s^n-R_n F_s^n\Vert _H \\&\le 2 \Vert X_s^n-Y_s^n\Vert _H \left( \Vert F_s^n-F_s\Vert _H + \Vert F_s-R_nF_s\Vert _H +\Vert R_n F_s-R_n F_s^n\Vert _H \right) \\&\le 2\Vert X_s^n-Y_s^n\Vert _{H} \left( \Vert F_s^n-F_s\Vert _H + \Vert F_s-R_nF_s\Vert _H + (1+\lambda )\Vert F_s- F_s^n\Vert _H \right) . \end{aligned}$$

Using that \(X_s^n \rightarrow X_s\) and \(Y_s^n \rightarrow Y_s\) a.s. along a subsequence (again denoted by n), we easily find that the right-hand side tends to zero. The convergence of the second term follows from

$$\begin{aligned} I_2&= \left| \Vert \sigma _s^n\Vert _{L_2^0}-\Vert R_n\sigma _s^n\Vert _{L_2^0}\right| \left( \Vert \sigma _s^n\Vert _{L_2^0}+\Vert R_n\sigma _s^n\Vert _{L_2^0}\right) \\&\le (2 + \lambda )\sqrt{L_{\sigma }} \Vert \sigma _s^n- R_n\sigma _s^n\Vert _{L_2^0} \Vert X_s^n-Y_s^n\Vert _H \\&\le (2 + \lambda )^2\sqrt{L_{\sigma }}\Vert X_s^n-Y_s^n\Vert _H \left( \Vert \sigma _s^n-\sigma _s\Vert _{L_2^0}+\Vert \sigma _s-R_n\sigma _s\Vert _{L_2^0} + \Vert \sigma _s- \sigma _s^n\Vert _{L_2^0} \right) . \end{aligned}$$

It remains to show the convergence of the third term. First, observe

$$\begin{aligned} I_3&\le (2 + \lambda )\int _E \Vert \gamma _s^n(\nu )- R_n\gamma _s^n(\nu )\Vert _H \Vert \gamma _s^n(\nu )\Vert _H \mu (d\nu ) \\&\le (2 + \lambda )\int _E \bigg (\Vert \gamma _s^n(\nu )-\gamma _s(\nu )\Vert _H +\Vert \gamma _s(\nu )-R_n\gamma _s(\nu )\Vert _H \\&\quad +\Vert R_n\gamma _s(\nu )- R_n\gamma _s^n(\nu )\Vert _H \bigg ) \Vert \gamma _s^n(\nu )\Vert _{H} \mu (d\nu ) \\&\le (2 + \lambda ) \left( \int _E \Vert \gamma _s^n(\nu )\Vert _H^2 \mu (d\nu )\right) ^{\frac{1}{2}} \bigg [ \left( \int _E \Vert \gamma _s^n(\nu )-\gamma _s(\nu )\Vert _H^2 \mu (d\nu ) \right) ^{\frac{1}{2}} \\&\quad + \left( \int _E \Vert \gamma _s(\nu )-R_n\gamma _s(\nu )\Vert _H^2 \mu (d\nu ) \right) ^{\frac{1}{2}} + \left( \int _E \Vert R_n\gamma _s(\nu )- R_n\gamma _s^n(\nu )\Vert _H^2 \mu (d\nu )\right) ^{\frac{1}{2}} \bigg ]\\&\le \sqrt{2}(2 + \lambda )^2 L_{\gamma } \Vert X_s^n - Y_s^n\Vert _H \left( \Vert X_s^n-X_s\Vert _{H}+\Vert Y_s^n-Y_s\Vert _H \right) \\&\quad + (2 + \lambda ) \sqrt{L_{\gamma }} \Vert X_s^n - Y_s^n \Vert _H \left( \int _E \Vert \gamma _s(\nu )-R_n\gamma _s(\nu )\Vert _{H}^2 \mu (d\nu ) \right) ^{\frac{1}{2}} \\&=: I_3^1 + I_3^2, \end{aligned}$$

where the last inequality follows from condition (A1) combined with the inequality

$$\begin{aligned}&\Vert R_n \gamma _s(\nu ) - R_n\gamma ^n_s(\nu )\Vert _H^2 \\&\quad \le (1 + \lambda )^2 \Vert \gamma _s(\nu ) - \gamma _s^n(\nu ) \Vert _H^2 \\&\quad \le 2(1 + \lambda )^2 \left( \Vert \gamma (X_s,\nu ) - \gamma (Y_s,\nu )\Vert _H^2 + \Vert \gamma (X_s^n,\nu ) - \gamma (Y_s^n, \nu )\Vert _H^2 \right) . \end{aligned}$$

The first expression \(I_3^1\) clearly tends to zero as \(n \rightarrow \infty \). For the second expression \(I_3^2\) we use the inequality \(\Vert \gamma _s(\nu )-R_n\gamma _s(\nu )\Vert _{H}^2 \le 2(2 + \lambda )^2 \Vert \gamma _s(\nu )\Vert _H^2\), so that the dominated convergence theorem is applicable, which shows that \(I_3^2 \rightarrow 0\) a.s. as \(n \rightarrow \infty \). This proves (3.18). Concerning (3.19), we find that

$$\begin{aligned}&\vert (\mathcal {L}(\Vert \cdot \Vert _H^2)(X_s^n,Y_s^n)- \mathcal {L}_n(\Vert \cdot \Vert _H^2)(X_s^n,Y_s^n))\vert \\&\quad \le 2 \vert \langle X_s^n-Y_s^n, F_s^n-R_nF_s^n\rangle _H \vert + \vert \Vert \sigma _s^n\Vert _{L_2^0(H)}^2-\Vert R_n\sigma _s^n\Vert _{L_2^0(H)}^2\vert \\&\qquad + \left| \int _E \Vert \gamma _s^n(\nu )\Vert _H^2-\Vert R_n\gamma _s^n(\nu )\Vert _H^2 \mu (d\nu ) \right| \\&\quad \le 2 ( 2 + \lambda ) \Vert X_s^n-Y_s^n\Vert _H \Vert F_s^n\Vert _H + \left( 1 + (1 + \lambda )^2\right) \\&\qquad \left[ \Vert \sigma _s^n\Vert _{L_2^0(H)}^2 + \int _E \Vert \gamma _s^n(\nu )\Vert _H^2 \mu (d\nu ) \right] \\&\quad \le 2 ( 2 + \lambda ) \sqrt{L_F} \Vert X_s^n-Y_s^n\Vert _H^2 + \left( 1 + (1 + \lambda )^2 \right) (L_{\sigma } + L_{\gamma }) \Vert X_s^n-Y_s^n\Vert _H^2. \end{aligned}$$

Hence the generalized Lebesgue theorem is applicable, and thus the assertion of this proposition is proved. \(\square \)

Note that condition (3.14) is used to guarantee that the Itô-formula from Theorem A.2 for Hilbert space valued jump diffusions can be applied to \((t,x)\mapsto \mathrm {e}^{\varepsilon t}\Vert x \Vert _H^2\). The assertion of Proposition 3.5 also remains true when \(\varepsilon \le 0\), but it will only be applied in the case \(\varepsilon > 0\).
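To make the estimate (3.15) concrete, the following minimal numerical sketch tests it in an additive-noise variant of Example 3.3 (an illustrative setup, not part of the proofs): there \(F = 0\), \(\gamma = 0\), the noise is additive, so the difference of two solutions is deterministic, \(X_t - Y_t = \mathrm {e}^{tA}(x-y)\), and by Example 3.3 one may take \(\alpha = 1/2\), \(\beta = 3/2\), hence \(\varepsilon = 2\alpha = 1\) and \(2(\alpha + \beta ) = 4\):

```python
import numpy as np
from scipy.linalg import expm

# Additive-noise variant of Example 3.3: X_t - Y_t = e^{tA} d with
# d = x - y, and P_1(X_s - Y_s) = e^s d_2, so (3.15) reads
#   ||e^{tA} d||^2 <= e^{-t} ||d||^2 + 4 d_2^2 e^{-t} (e^{3t} - 1) / 3,
# where the s-integral has been evaluated in closed form.
A = np.array([[-1.0, 1.0], [0.0, 1.0]])
rng = np.random.default_rng(3)

for _ in range(5):
    d = rng.standard_normal(2)
    t = rng.uniform(0.1, 3.0)
    lhs = np.sum((expm(t * A) @ d) ** 2)
    rhs = np.exp(-t) * np.sum(d ** 2) \
        + 4.0 * d[1] ** 2 * np.exp(-t) * (np.exp(3.0 * t) - 1.0) / 3.0
    print(bool(lhs <= rhs + 1e-10))  # True in every sampled case
```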

4 Convergence to limiting distribution

4.1 The strongly dissipative case

As a consequence of our key stability estimate we can provide a simple proof for the existence of a unique limiting distribution in the spirit of classical results such as [32, Section 16], [10, Chapter 11, Section 5], and [36].

Theorem 4.1

Assume that condition (GDC) is satisfied for \(P_1 = 0\) (and hence \(\beta = 0\)) and that (A1) and (3.14) hold. If (3.13) is satisfied, then

$$\begin{aligned} \mathrm {W}_2(p_t^* \rho , p_t^* \widetilde{\rho }) \le \mathrm {W}_2(\rho , \widetilde{\rho })\mathrm {e}^{- \varepsilon t/2}, \qquad t \ge 0, \end{aligned}$$
(4.1)

holds for any \(\rho , \widetilde{\rho } \in \mathcal {P}_2(H)\). In particular, the Markov process determined by (3.1) has a unique invariant measure \(\pi \). This measure has finite second moments and it holds that

$$\begin{aligned} \mathrm {W}_2(p_t^*\rho , \pi ) \le \mathrm {W}_2(\rho , \pi ) \mathrm {e}^{- \varepsilon t/2 }, \qquad t \ge 0, \end{aligned}$$
(4.2)

for each \(\rho \in \mathcal {P}_2(H)\).

Proof

Using (GDC) with \(P_1 = 0\) combined with Proposition 3.5 we find that

$$\begin{aligned} \mathbb {E}[ \Vert X_t^x - X_t^y\Vert _H^2 ] \le \mathrm {e}^{-\varepsilon t} \Vert x-y\Vert _H^2, \qquad x,y \in H. \end{aligned}$$

Using the definition of the Wasserstein distance, we conclude that

$$\begin{aligned} \mathrm {W}_2(p_t^* \delta _x, p_t^* \delta _y) \le \left( \mathbb {E}[ \Vert X_t^x - X_t^y\Vert _H^2 ] \right) ^{1/2} \le \Vert x-y\Vert _H \mathrm {e}^{- \varepsilon t/2}. \end{aligned}$$

The latter readily yields (4.1). Finally, the existence and uniqueness of an invariant measure as well as (4.2) can be derived from (4.1) combined with a standard Cauchy argument. \(\square \)

This result can be seen as an analogue of the conditions introduced in [32, Section 16], [10, Chapter 11, Section 5], and [36], where a similar statement was given. In contrast to this case, in this work we focus on the study of multiple invariant measures. For this purpose we will assume that \(\varepsilon > 0\) and that (GDC) holds for some \(P_1 \ne 0\).
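The contraction (4.1) can also be observed empirically in a scalar toy example \(dX_t = -aX_t\,dt + dW_t\) (illustrative parameters only), for which (GDC) holds with \(P_1 = 0\), \(\alpha = a\) and \(\varepsilon = 2a\); in one dimension the Wasserstein 2-distance between empirical laws is obtained from the quantile (sorted-sample) coupling:

```python
import numpy as np

rng = np.random.default_rng(4)

# dX = -a X dt + dW satisfies (GDC) with P_1 = 0, alpha = a, eps = 2a.
# Estimate W_2(p_t* delta_x, p_t* delta_y) via sorted samples and
# compare with the bound e^{-eps t / 2} |x - y| from (4.1).
a, x, y, T, n_steps, n_paths = 0.8, 3.0, -1.0, 2.0, 2000, 20_000
dt = T / n_steps
eps = 2.0 * a

X = np.full(n_paths, x)
Y = np.full(n_paths, y)
for _ in range(n_steps):
    dW = np.sqrt(dt) * rng.standard_normal(n_paths)  # synchronous coupling
    X += -a * X * dt + dW
    Y += -a * Y * dt + dW

w2_est = np.sqrt(np.mean((np.sort(X) - np.sort(Y)) ** 2))
print(round(w2_est, 4), round(np.exp(-eps * T / 2.0) * abs(x - y), 4))
```

For this linear equation the synchronous coupling is optimal and the bound is attained.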

4.2 The case of vanishing coefficients

While Proposition 3.5 provides an estimate on the \(L^2\)-norm of the difference \(X_t^x - X_t^y\), such an estimate alone implies neither the existence nor the uniqueness of an invariant distribution. However, if the coefficients \(F, \sigma , \gamma \) vanish on \(H_1\), then we may characterize the limiting distributions in \(L^2\).

Theorem 4.2

Suppose that (GDC) holds with a projection operator \(P_1\), that (A1), (3.14), and (3.13) are satisfied, that \((S(t))_{t \ge 0}\) leaves \(H_0 := \mathrm {ran}(I - P_1)\) invariant, and that \(\mathrm {ran}(P_1) \subset \mathrm {ker}(A)\). Moreover, assume that

$$\begin{aligned} P_1F\equiv 0,\ P_1\sigma \equiv 0,\ P_1\gamma \equiv 0. \end{aligned}$$
(4.3)

Then, for any \(X_0 \in L^2(\Omega , \mathcal {F}_0, \mathbb {P};H)\) which satisfies

$$\begin{aligned} F(P_1 X_0) = 0, \ \sigma (P_1X_0) = 0, \ \gamma (P_1 X_0,\cdot ) = 0, \qquad a.s., \end{aligned}$$
(4.4)

the inequality

$$\begin{aligned} \mathbb {E}\left[ \Vert X_t - P_1X_0 \Vert _H^2 \right] \le \mathrm {e}^{-\varepsilon t}\mathbb {E}\left[ \Vert (I-P_1) X_0\Vert ^2_H \right] \end{aligned}$$

holds. In particular, let \(\rho \) be the law of \(X_0 \in L^2(\Omega , \mathcal {F}_0, \mathbb {P}; H)\) and \(\rho _1\) be the law of \(P_1X_0\), respectively. Then \(\rho _1\) is an invariant measure.

Proof

Fix \(X_0 \in L^2(\Omega , \mathcal {F}_0, \mathbb {P};H)\) with property (4.4) and set \(P_0 = I - P_1\). Since \(\mathrm {ran}(P_1) \subset \mathrm {ker}(A)\) we find that \(S(t)P_1 = P_1\) for \(t \ge 0\) and hence \(P_0S(t)P_1 = 0\). Moreover, since \((S(t))_{t \ge 0}\) leaves \(H_0\) invariant, we obtain \(P_0S(t) = P_0S(t)P_0 + P_0S(t)P_1 = P_0S(t)P_0 = S(t)P_0\). Hence, using (4.3) we find that

$$\begin{aligned} P_1X_t = P_1 S(t)X_0 = P_1S(t)P_0X_0 + P_1S(t)P_1X_0 = P_1X_0. \end{aligned}$$

From this we conclude that \((P_0 X_t)_{t\ge 0}\) satisfies

$$\begin{aligned} P_0X_t&= P_0S(t)X_0 + \int _0^t P_0S(t-s)F(X_s)ds + \int _0^t P_0S(t-s) \sigma (X_s)dW_s \\&\quad +\int _0^t \int _E P_0S(t-s)\gamma (X_s,\nu ) \widetilde{N}(ds,d\nu ) \\&= S(t)P_0X_0 + \int _0^t S(t-s) P_0F(P_1X_0 + P_0X_s)ds \\&\quad + \int _0^t S(t-s) P_0\sigma (P_1X_0 + P_0X_s)dW_s \\&\quad +\int _0^t \int _E S(t-s)P_0\gamma (P_1X_0 + P_0X_s,\nu )\widetilde{N}(ds,d\nu ) \\&= S(t)P_0X_0 + \int _0^t S(t-s) \widetilde{F}(P_0X_s)ds + \int _0^t S(t-s)\widetilde{\sigma }(P_0X_s) dW_s \\&\quad + \int _0^t \int _E S(t-s)\widetilde{\gamma }(P_0X_s,\nu )\widetilde{N}(ds,d\nu ), \end{aligned}$$

where we have set \(\widetilde{F}(y):= P_0 F(P_1X_0 + y),\ \widetilde{\sigma }(y):= P_0 \sigma (P_1X_0 + y)\) and \(\widetilde{\gamma }(y,\nu ):=P_0\gamma (P_1X_0 + y,\nu )\) for all \(y \in H_0\) and \(\nu \in E\). Since these coefficients share the same Lipschitz estimates as \(F,\sigma \) and \(\gamma \), are \(\mathcal {F}_0\)-measurable and the noise terms are independent of \(\mathcal {F}_0\), we can apply Proposition 3.5 (conditionally on \(\mathcal {F}_0\)) to the process \((P_0X_t)_{t \ge 0}\) obtained from the above auxiliary SPDE and obtain

$$\begin{aligned} \mathbb {E}[ \Vert X_t - P_1X_0 \Vert _H^2 ] = \mathbb {E}[ \Vert P_0 X_t \Vert _H^2 ] = \mathbb {E}[ \Vert P_0 X_t - Y_t\Vert _H^2 ] \le \mathrm {e}^{-\varepsilon t}\mathbb {E}[\Vert P_0 X_0 \Vert ^2_H], \end{aligned}$$

where \((Y_t)_{t \ge 0}\) denotes the unique solution of the auxiliary SPDE with \(Y_0 = 0\); due to (4.4) the coefficients \(\widetilde{F}, \widetilde{\sigma }, \widetilde{\gamma }\) vanish at 0, so that \(Y_t \equiv 0\). \(\square \)

This theorem can be applied, for instance, to the Heath–Jarrow–Morton–Musiela equation, see Sect. 5.
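The conclusion of Theorem 4.2 can be sampled in a simple toy case (all coefficients below are illustrative assumptions): on \(H = \mathbb {R}^2\) let \(P_1\) be the projection onto the first coordinate, \(A = \mathrm {diag}(0,-1)\), \(F = 0\), \(\gamma = 0\), and \(\sigma (x)dW_t = (0, c\,x_2\,dW_t)\), so that (4.3) and (4.4) hold with \(\alpha = 1\), \(\beta = 0\), \(L_{\sigma } = c^2\), and hence \(\varepsilon = 2 - c^2\):

```python
import numpy as np

rng = np.random.default_rng(5)

# X1 stays at its initial value, while dX2 = -X2 dt + c X2 dW, so
#   E||X_t - P_1 X_0||^2 = E[(X2_t)^2] <= e^{-eps t} (x2_0)^2
# with eps = 2 - c^2; equality holds for these coefficients.
c, x0 = 0.5, np.array([2.0, 3.0])
eps = 2.0 - c ** 2
T, n_steps, n_paths = 2.0, 4000, 50_000
dt = T / n_steps

X2 = np.full(n_paths, x0[1])
for _ in range(n_steps):
    dW = np.sqrt(dt) * rng.standard_normal(n_paths)
    X2 += -X2 * dt + c * X2 * dW   # the first coordinate never moves

print(round(float(np.mean(X2 ** 2)), 4),        # E||X_T - P_1 X_0||^2
      round(np.exp(-eps * T) * x0[1] ** 2, 4))  # bound from Theorem 4.2
```

Up to Monte Carlo and discretization error the two printed values coincide, since the bound is attained in this toy case.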

4.3 Main result: the general case

In Theorem 4.2 we have assumed (4.3), (4.4), and that \((S(t))_{t \ge 0}\) leaves \(H_0\) invariant. Below we continue with the more general case. Namely, for the projection operator \(P_1\) given by condition (GDC) we set \(P_0 = I - P_1\) and suppose that:

  1. (A2)

The semigroup \((S(t))_{t \ge 0}\) leaves \(H_1 := \mathrm {ran}(P_1)\) invariant, and one has

    $$\begin{aligned} P_1\sigma = P_1\gamma = 0 \ \text { and } \ P_1F(x) = P_1F(P_1x), \qquad x \in H. \end{aligned}$$

Let us briefly comment on this condition. Let \((X_t)_{t \ge 0}\) be the unique solution to (3.6) and decompose the process \(X_t\) according to \(X_t = P_0X_t + P_1 X_t\). Then condition (A2) implies that \(P_1X_t\) is \(\mathcal {F}_0\)-measurable and satisfies \(\omega \)-wise the deterministic equation

$$\begin{aligned} f(t;x) = P_1S(t)x + \int _0^t P_1S(t-s)P_1F(P_1f(s;x))ds, \qquad x \in H, \end{aligned}$$
(4.5)

i.e. \(P_1X_t = f(t;X_0)\) holds a.s. Our next condition imposes a control on this component:

  1. (A3)

    For each \(x \in H_1 = \mathrm {ran}(P_1)\) there exists \(\widetilde{f}(x) \in H_1\) and constants \(C(x) > 0\), \(\delta (x) > 0\) such that

    $$\begin{aligned} \Vert f(t;x) - \widetilde{f}(x)\Vert _H^2 \le C(x)\mathrm {e}^{-\delta (x) t}, \qquad t \ge 0. \end{aligned}$$

Without loss of generality we will always suppose that \(\delta (x) \in (0,|\varepsilon |)\). Such an assumption will simplify our arguments later on. Note that, if \(P_1 F(P_1 \cdot ) = 0\), then condition (A3) reduces to a condition on the limiting behavior of the semigroup \((S(t))_{t \ge 0}\) when restricted to \(H_1 = \mathrm {ran}(P_1)\). In such a case condition (A3) is, for instance, satisfied if \(\mathrm {ran}(P_1) \subset \mathrm {ker}(A)\). Recall that condition (GDC) was formulated in the introduction and that (A1), (3.14) and (3.13) were formulated in Sect. 3. The following is our main result in this section.

Theorem 4.3

Suppose that condition (GDC) holds for some projection operator \(P_1\), that conditions (A1) – (A3), (3.14) and (3.13) are satisfied. Then the following assertions hold:

  1. (a)

For each \(x \in H\) there exists an invariant measure \(\pi _{\delta _x} \in \mathcal {P}_2(H)\) for the Markov semigroup \((p_t)_{t \ge 0}\) and a constant \(K(\alpha ,\beta ,\varepsilon ,x) > 0\) such that

    $$\begin{aligned} \mathrm {W}_2(p_t(x,\cdot ), \pi _{\delta _x}) \le K(\alpha , \beta , \varepsilon , x)\mathrm {e}^{- \frac{\delta (x)}{2} t}, \qquad t \ge 0. \end{aligned}$$
  2. (b)

Suppose, in addition to the conditions of (A3), that there are constants \(\delta > 0\) and \(C > 0\) such that

    $$\begin{aligned} \delta (x) \ge \delta > 0 \ \ \text { and } \ \ C(x) \le C(1 + \Vert x \Vert _H)^4, \qquad x \in H. \end{aligned}$$
    (4.6)

    Then, for each \(\rho \in \mathcal {P}_2(H)\), there exists an invariant measure \(\pi _{\rho } \in \mathcal {P}_2(H)\) for the Markov semigroup \((p_t)_{t \ge 0}\) and a constant \(K(\alpha ,\beta ,\varepsilon ) > 0\) such that

    $$\begin{aligned} \mathrm {W}_2(p_t^*\rho , \pi _{\rho }) \le K(\alpha , \beta , \varepsilon )\int _{H}(1 + \Vert x\Vert _H)^2 \rho (dx)\mathrm {e}^{- \frac{\delta }{2} t}, \qquad t \ge 0. \end{aligned}$$

The proof of this theorem relies on the key stability estimate formulated in Proposition 3.5 and is given at the end of this section. So far we have stated the existence of invariant measures parametrized by the initial state of the process. However, under the given conditions it can also be shown that \(\pi _{\delta _x}\) as well as \(\pi _{\rho }\) depend only on the \(H_1\) part of x or \(\rho \), respectively.

Corollary 4.4

Suppose that condition (GDC) holds for some projection operator \(P_1\) and that conditions (A1)–(A3), (3.14) and (3.13) are satisfied. Then the following assertions hold:

  1. (a)

    Let \(x,y \in H\) be such that \(P_1x = P_1y\). Then \(\pi _{\delta _x} = \pi _{\delta _y}\).

  2. (b)

    Suppose, in addition, that (4.6) holds. Let \(\rho , \widetilde{\rho } \in \mathcal {P}_2(H)\) be such that \(\rho \circ P_1^{-1} = \widetilde{\rho } \circ P_1^{-1}\). Then \(\pi _{\rho } = \pi _{\widetilde{\rho }}\).

Let us briefly compare the conditions imposed in Theorem 4.2 with those imposed in Theorem 4.3. In Theorem 4.3 we have weakened (4.3) with respect to F by replacing \(P_1F = 0\) by \(P_1F(x) = P_1F(P_1x)\). Moreover, we have replaced \(\mathrm {ran}(P_1) \subset \mathrm {ker}(A)\) by condition (A3). Finally, note that condition (4.4) is not assumed in Theorem 4.3. Below we provide a counterexample showing that, in general, condition (A3) cannot be omitted.

Example 4.5

Let \(H = \mathbb {R}^2\) and \((W_t)_{t \ge 0}\) be a 2-dimensional standard Brownian motion. Let \(Y_t = (Y_t^1, Y_t^2) \in H = \mathbb {R}^2\) be the solution of

$$\begin{aligned} dY_t = \begin{pmatrix} -1 &{}\quad 1 \\ 0 &{}\quad 1 \end{pmatrix}Y_tdt + \begin{pmatrix}1 &{}\quad 0 \\ 0 &{}\quad 0 \end{pmatrix}dW_t. \end{aligned}$$

Then condition (A1) holds for \(F = 0\), \(\gamma = 0\) and clearly \(\sigma (x) = \begin{pmatrix}1 &{} 0 \\ 0 &{} 0 \end{pmatrix}\). Example 3.3 shows that (GDC) holds with \(P_1\) being the projection onto the second coordinate. Moreover, (4.3) and hence (A2) holds. However, since

$$\begin{aligned} Y_t^2 = \mathrm {e}^t Y_0^2 \end{aligned}$$

it is clear that condition (A3) is not satisfied. Moreover, for \(Y_0^2 \ne 0\) the process \(Y_t^2\) does not have a limiting distribution and hence also \(Y_t\) cannot have a limiting distribution.
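A short Euler–Maruyama simulation makes the failure visible (step size, horizon and sample count are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

# Euler-Maruyama sketch of Example 4.5 (step size, horizon and sample count
# are arbitrary illustrative choices).  Since sigma kills the noise on the
# second coordinate, dY^2 = Y^2 dt and Y_t^2 = e^t Y_0^2 explodes for
# Y_0^2 != 0, dragging the first coordinate along with it.
A = np.array([[-1.0, 1.0], [0.0, 1.0]])
B = np.array([[1.0, 0.0], [0.0, 0.0]])
T, dt = 5.0, 1e-3

def simulate(y0):
    y = np.array(y0, dtype=float)
    for _ in range(int(T / dt)):
        y = y + A @ y * dt + B @ rng.normal(0.0, np.sqrt(dt), size=2)
    return y

samples = np.array([simulate([0.0, 0.1]) for _ in range(200)])
print("mean of Y_T^2:", samples[:, 1].mean())  # ≈ 0.1 * e^5, no stationarity
print("mean of Y_T^1:", samples[:, 0].mean())  # ≈ 0.05 * e^5, dragged along
```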

The next remark shows that, under a stronger condition on the Lévy measure, the results obtained in Theorem 2.1 could partially also be deduced from the general statements of this section.

Remark 4.6

The results obtained in Sect. 2 for the Lévy driven Ornstein–Uhlenbeck process could also be partially obtained from the above results. Indeed, (2.1) can be cast into the form

$$\begin{aligned} dX_t^x&= AX_t^xdt + dZ_t \\&= (AX_t^x + b)dt + Q dW_t + \int _{\{ \Vert z \Vert _H \le 1\} \backslash \{0\}}z \widetilde{N}(dt,dz) + \int _{ \{ \Vert z \Vert _H > 1\} } z N(dt,dz) \\&= (AX_t^x + b + c)dt + Q dW_t + \int _{H \backslash \{0\}} z \widetilde{N}(dt,dz), \end{aligned}$$

where \(c = \int _{ \{ \Vert z\Vert _H > 1\}} z \mu (dz)\) and we have used the Lévy–Itô decomposition for the Lévy process \((Z_t)_{t \ge 0}\), i.e.,

$$\begin{aligned} Z_t = bt + Q W_t + \int _{\{ \Vert z \Vert _H \le 1\} \backslash \{0\}}z \widetilde{N}(dt,dz) + \int _{ \{ \Vert z \Vert _H > 1\} } z N(dt,dz), \end{aligned}$$

where \(N(dt,dz)\) is a Poisson random measure with compensator \(dt \mu (dz)\). Suppose that the conditions of Theorem 2.1 are satisfied. If the semigroup generated by \((A, D(A))\) is also symmetric, then using Proposition 3.1 one can show that (GDC) holds as well. Condition (A1) is clearly satisfied for \(L_{\gamma } = L_{\sigma } = 0\) and \(\sigma (x) = Q\), \(\gamma (x,z) = z\). Thus (3.13) is satisfied. To prove condition (A2) we let \(H_0 = \mathrm {ker}(P)\), \(H_1 = \mathrm {ran}(P)\) and observe that \(PX_t\), compare with (4.5), simplifies to

$$\begin{aligned} PX_t&= PS(t)x + \int _0^t PS(t-s)(b+c) ds + \int _0^t PS(t-s)QdW_s \\&\quad + \int _0^t \int _{H \backslash \{0\}} PS(t-s)z \widetilde{N}(ds,dz) \\&= Px + \int _0^t \int _{H \backslash \{0\}} Pz \widetilde{N}(ds,dz), \end{aligned}$$

where we have used \(PS(t-s) = PS(t-s)P = P\) and the conditions imposed in Theorem 2.1 on the Lévy triplet, i.e., \(P(b+c) = 0\) and \(PQ = 0\). Noting that \(\mathrm {supp}(\mu ) \subset \mathrm {ker}(P)\) we find that \(\widetilde{N}\) is supported on \(\mathbb {R}_+ \times \mathrm {ker}(P)\) and hence

$$\begin{aligned} \int _0^t \int _{H \backslash \{0\} } Pz \widetilde{N}(ds,dz) = 0. \end{aligned}$$

This shows that condition (A3) is satisfied for any choice of \(C(x), \delta (x)\) and \(\widetilde{f}(x) = Px\). Finally, (3.14) requires that \(\mu \) satisfies the stronger moment condition

$$\begin{aligned} \int _{ \{ \Vert z \Vert _H > 1 \}} \Vert z \Vert _H^4 \mu (dz) < \infty . \end{aligned}$$

Thus, under the above assumptions, the existence of multiple invariant measures for the Ornstein–Uhlenbeck process also follows from Theorem 4.3 and Corollary 4.4. However, in contrast to Theorem 2.1, the general results from this section do not provide an explicit characterization of the limiting distributions in terms of the Fourier transform and also require stronger assumptions.

Next we turn to the proofs of Theorem 4.3 and Corollary 4.4.

4.4 Construction of a coupling

Let \(x \in H\) and let \((X_t^x)_{t \ge 0}\) be the unique mild solution to (3.6). Below we construct for given \(\tau \ge 0\) a coupling for the law of \((X_t^x, X_{t+\tau }^x)\). Let \((Y_{t}^{x,\tau })_{t \ge 0}\) be the unique mild solution to the SPDE

$$\begin{aligned} Y_t^{x,\tau }&= S(t)x + \int _0^t S(t-s)F(Y_s^{x,\tau })ds + \int _0^t S(t-s)\sigma (Y_s^{x,\tau })dW^{\tau }_s \nonumber \\&\quad + \int _0^t \int _E S(t-s) \gamma (Y_{s}^{x,\tau }, \nu ) \widetilde{N}^{\tau }(ds,d\nu ), \qquad t \ge 0, \end{aligned}$$
(4.7)

where \(W^{\tau }_s = W_{\tau + s} - W_{\tau }\) is a Q-Wiener process, and \(\widetilde{N}^{\tau }(ds,d\nu )\) defined by

$$\begin{aligned} \widetilde{N}^{\tau }((0,t] \times A) := \widetilde{N}((\tau ,\tau + t] \times A) \end{aligned}$$

for \(t > 0\) and \(A \in \mathcal {E}\) is a Poisson random measure with respect to the filtration \((\mathcal {F}_s^{\tau })_{s \ge 0}\) defined by \(\mathcal {F}_s^{\tau } = \mathcal {F}_{s + \tau }\).
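For orientation, the following Python sketch implements this coupling for the scalar Ornstein–Uhlenbeck equation \(dX_t = -X_t dt + dW_t\), a stand-in for (3.6) with all numerical values hypothetical: both processes are driven by the same shifted increments \(W^{\tau }\) on \([\tau , \tau + t]\), so their difference contracts pathwise.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch of the coupling (4.7) for dX = -X dt + dW (a scalar stand-in for
# (3.6); all numerical values are illustrative).  Y^{x,tau} restarts from x
# at time tau but reuses the increments W^tau_s = W_{tau+s} - W_tau, so
# (X_{t+tau}, Y_t^{x,tau}) couples p_{t+tau}(x,.) and p_t(x,.).
x, tau, t, dt = 1.0, 2.0, 6.0, 1e-3

def coupled_pair():
    X = x
    for _ in range(int(tau / dt)):          # evolve X on [0, tau]
        X += -X * dt + rng.normal(0.0, np.sqrt(dt))
    Y = x                                   # Y starts from x at time tau
    for _ in range(int(t / dt)):            # same increments for X and Y
        dW = rng.normal(0.0, np.sqrt(dt))
        X += -X * dt + dW
        Y += -Y * dt + dW
    return X, Y

pairs = np.array([coupled_pair() for _ in range(500)])
mse = np.mean((pairs[:, 0] - pairs[:, 1]) ** 2)
print(f"E|X_(t+tau) - Y_t|^2 ≈ {mse:.2e}")  # of order e^(-2t) E|X_tau - x|^2
```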

Lemma 4.7

Suppose that (GDC), (A1), (3.14) and (3.13) are satisfied. Then for each \(x \in H\) and \(t, \tau \ge 0\) the following assertions hold:

  1. (a)

    \(Y_{t}^{x,\tau }\) has the same law as \(X_t^{x}\).

  2. (b)

    It holds that

    $$\begin{aligned} \mathbb {E}\left[ \Vert Y_t^{x,\tau } - X_{t+\tau }^x \Vert _H^2 \right]&\le \mathrm {e}^{-\varepsilon t} \mathbb {E}\left[ \Vert x - X_{\tau }^x \Vert _H^2 \right] \\&\quad + 2 ( \alpha + \beta ) \int _0^t \mathrm {e}^{- \varepsilon (t-s)} \mathbb {E}\left[ \Vert P_1Y_s^{x,\tau } - P_1X_{s + \tau }^x \Vert _H^2 \right] ds. \end{aligned}$$

Proof

(a) Since (3.6) has a unique solution, it follows from the Yamada–Watanabe theorem (see [26]) that uniqueness in law also holds for this equation. Since the driving noises \(N^{\tau }\) and \(W^{\tau }\) in (4.7) have the same law as N and W from (3.6), it follows that the unique solution to (4.7) has the same law as the solution to (3.6). This proves the assertion.

(b) Set \(X_{t}^{x,\tau } := X_{t+ \tau }^x\); then by direct computation we find that

$$\begin{aligned} X_{t}^{x,\tau }&= S(t)S(\tau )x + \int _0^{t + \tau } S(t+\tau - s)F(X_s^x) ds + \int _0^{t + \tau } S(t + \tau -s)\sigma (X_s^{x})dW_s \\&\quad + \int _0^{t + \tau }\int _E S(t+\tau - s)\gamma (X_s^{x},\nu )\widetilde{N}(ds,d\nu ) \\&= S(t)S(\tau )x + S(t)\int _{0}^{\tau }S(\tau - s)F(X_s^x)ds + S(t)\int _0^{\tau }S(\tau -s)\sigma (X_s^x)dW_s \\&\quad + S(t)\int _0^{\tau }\int _E S(\tau -s)\gamma (X_s^x,\nu )\widetilde{N}(ds,d\nu ) \\&\quad + \int _{\tau }^{t+\tau }S(t + \tau - s)F(X_s^x)ds + \int _{\tau }^{t+\tau }S(t+\tau -s)\sigma (X_s^x)dW_s \\&\quad + \int _{\tau }^{t+\tau }\int _E S(t+\tau -s)\gamma (X_s^x,\nu )\widetilde{N}(ds,d\nu ) \\&= S(t)X_{0}^{x,\tau } + \int _0^{t}S(t-s)F(X_{s}^{x,\tau })ds + \int _0^t S(t-s)\sigma (X_s^{x,\tau })dW^{\tau }_s \\&\quad + \int _0^{t}\int _E S(t-s)\gamma (X_s^{x,\tau },\nu )\widetilde{N}^{\tau }(ds,d\nu ), \end{aligned}$$

where in the last equality we have used, for appropriate integrands \(\Phi (s,\nu )\) and \(\Psi (s)\), that

$$\begin{aligned} \int _{\tau }^{\tau + t}\Psi (s)dW_s&= \int _0^t \Psi (s+\tau )dW_s^{\tau }, \\ \int _{\tau }^{\tau + t}\int _E \Phi (s,\nu )\widetilde{N}(ds,d\nu )&= \int _0^t \int _E \Phi (s+\tau ,\nu )\widetilde{N}^{\tau }(ds,d\nu ). \end{aligned}$$

Hence \((X_t^{x,\tau })_{t \ge 0}\) also solves (4.7) with \(\mathcal {F}_0^{\tau } = \mathcal {F}_{\tau }\) and initial condition \(X_0^{x,\tau } = X_{\tau }^x\). Consequently, the assertion follows from Proposition 3.5 applied to \(X_t^{x,\tau }\) and \(Y_t^{x,\tau }\). \(\square \)

4.5 Proof of Theorem 4.3

Proof of Theorem 4.3

Fix \(x \in H\) and recall that \(p_t(x,\cdot )\) denotes the transition probabilities of the Markov process obtained from (3.6). Below we prove that the family \((p_t(x,\cdot ))_{t \ge 0} \subset \mathcal {P}_2(H)\) is Cauchy with respect to the Wasserstein distance \(\mathrm {W}_2\). Fix \(t, \tau \ge 0\). We treat the cases \(\tau \in (0,1]\) and \(\tau > 1\) separately.

Case \(0 < \tau \le 1\): Then using Lemma 4.7(b) yields

$$\begin{aligned}&\mathrm {W}_2(p_{t+ \tau }(x,\cdot ), p_t(x,\cdot )) \le \left( \mathbb {E}\left[ \Vert Y_{t}^{x,\tau } - X_{t+\tau }^x \Vert _H^2 \right] \right) ^{1/2} \\&\quad \le \mathrm {e}^{- \frac{\varepsilon }{2}t} \left( \mathbb {E}\left[ \Vert X_{\tau }^x - x \Vert _H^2 \right] \right) ^{1/2} \\&\quad \quad + \sqrt{2(\alpha + \beta )} \left( \int _0^t \mathrm {e}^{- \varepsilon (t - s)} \mathbb {E}\left[ \Vert P_1Y_s^{x,\tau } - P_1X_{s + \tau }^x \Vert _H^2 \right] ds \right) ^{1/2} \\&\quad =: I_1 + I_2. \end{aligned}$$

The first term \(I_1\) can be estimated by

$$\begin{aligned} I_1 \le \mathrm {e}^{- \frac{\varepsilon }{2}t} \sup _{s \in [0,1]} \left( \mathbb {E}\left[ \Vert X_s^x - x \Vert _H^2 \right] \right) ^{1/2}. \end{aligned}$$

To estimate the second term \(I_2\) we first observe that, by condition (A2), \(P_1Y_s^{x,\tau } = P_1X_s^x = f(s;x)\) is deterministic; hence, by condition (A3), one has for each \(s \ge 0\) that

$$\begin{aligned} \mathbb {E}\left[ \Vert P_1Y_s^{x,\tau } - P_1X_{s + \tau }^{x} \Vert _H^2 \right]&\le 2 \Vert P_1 Y_{s}^{x,\tau } - \widetilde{f}(x)\Vert _H^2 + 2 \Vert P_1 X_{s+\tau }^x - \widetilde{f}(x) \Vert _H^2 \nonumber \\&\le 4C(x) \mathrm {e}^{-\delta (x) s}. \end{aligned}$$
(4.8)

This readily yields

$$\begin{aligned}&\int _0^{t} \mathrm {e}^{- \varepsilon ( t - s)} \mathbb {E}\left[ \Vert P_1Y_s^{x,\tau } - P_1X_{s + \tau }^{x} \Vert _H^2 \right] ds \nonumber \\&\quad \le 4C(x)\int _0^{t}\mathrm {e}^{- \varepsilon (t - s)} \mathrm {e}^{- \delta (x)s} ds\nonumber \\&\quad = 4C(x)\mathrm {e}^{- \varepsilon t} \frac{ \mathrm {e}^{(\varepsilon - \delta (x))t} - 1}{\varepsilon - \delta (x)}\nonumber \\&\quad \le 4C(x)\frac{\mathrm {e}^{- \delta (x)t}}{\varepsilon - \delta (x)}. \end{aligned}$$
(4.9)

Inserting this into the definition of \(I_2\) gives

$$\begin{aligned} I_2 \le 2 \sqrt{ \frac{(\alpha + \beta ) C(x)}{ \varepsilon - \delta (x)}}\mathrm {e}^{- \frac{\delta (x)}{2}t}. \end{aligned}$$

Case \(\tau > 1\): Fix some \(N \in \mathbb {N}\) with \(\tau< N < 2\tau \) and define a sequence of numbers \((a_n)_{n = 0,\dots , N}\) by

$$\begin{aligned} a_n := \frac{\tau }{N} n, \qquad n = 0, \dots , N. \end{aligned}$$

Then \(a_0 = 0\), \(a_N = \tau \) and \(a_n - a_{n-1} = \frac{\tau }{N} =: \varkappa \in (\frac{1}{2},1)\) for \(n = 1, \dots , N\). Hence we obtain from Lemma 4.7(b)

$$\begin{aligned}&\mathrm {W}_2(p_{t+\tau }(x,\cdot ), p_t(x,\cdot )) \\&\quad \le \sum _{n=1}^{N}\mathrm {W}_2( p_{t + a_{n}}(x,\cdot ), p_{t+a_{n-1}}(x,\cdot )) \\&\quad \le \sum _{n=1}^{N}\left( \mathbb {E}\left[ \Vert Y_{t+a_{n-1}}^{x,\varkappa } - X_{t + a_{n-1} + \varkappa }^{x} \Vert _H^2 \right] \right) ^{1/2} \\&\quad \le \sum _{n=1}^{N} \mathrm {e}^{-\frac{\varepsilon }{2}(t+a_{n-1})}\left( \mathbb {E}\left[ \Vert X_{\varkappa }^x - x\Vert _H^2 \right] \right) ^{1/2} \\&\qquad + \sqrt{2(\alpha + \beta )}\sum _{n=1}^{N} \left( \int _0^{t+a_{n-1}} \mathrm {e}^{- \varepsilon (t+a_{n-1}-s)}\mathbb {E}\left[ \Vert P_1Y_s^{x,\varkappa } - P_1X_{s + \varkappa }^{x} \Vert _H^2 \right] ds \right) ^{1/2} \\&\quad =: J_1 + J_2. \end{aligned}$$

For the first term \(J_1\) we use \(\varkappa > \frac{1}{2}\) so that

$$\begin{aligned} \sum _{n=1}^{N}\mathrm {e}^{- \frac{\varepsilon }{2}\varkappa (n-1)} \le \sum _{n=0}^{\infty } \mathrm {e}^{- \frac{\varepsilon }{4}n} = \left( 1 - \mathrm {e}^{- \frac{\varepsilon }{4}}\right) ^{-1}, \end{aligned}$$

from which we obtain

$$\begin{aligned} J_1&\le \mathrm {e}^{- \frac{\varepsilon }{2} t} \sup _{s \in [0,1]}\left( \mathbb {E}[ \Vert X_{s}^x - x\Vert _H^2] \right) ^{\frac{1}{2}} \sum _{n=1}^{N}\mathrm {e}^{- \frac{\varepsilon }{2}\varkappa (n-1)} \\&\le \sup _{s \in [0,1]}\left( \mathbb {E}[ \Vert X_{s}^x - x\Vert _H^2] \right) ^{\frac{1}{2}} \left( 1 - \mathrm {e}^{- \frac{\varepsilon }{4}}\right) ^{-1} \mathrm {e}^{- \frac{\varepsilon }{2}t}. \end{aligned}$$

To estimate the second term \(J_2\) we first observe that, by condition (A2), \(P_1Y_s^{x,\varkappa } = P_1 X_s^x = f(s;x)\) is deterministic; hence, by condition (A3), one has for \(s \ge 0\)

$$\begin{aligned} \mathbb {E}\left[ \Vert P_1Y_s^{x,\varkappa } - P_1X_{s + \varkappa }^{x} \Vert _H^2 \right]&\le 2 \Vert P_1 Y_{s}^{x,\varkappa } - \widetilde{f}(x)\Vert _H^2 + 2 \Vert P_1 X_{s+\varkappa }^x - \widetilde{f}(x) \Vert _H^2 \\&\le 4C(x) \mathrm {e}^{-\delta (x) s}. \end{aligned}$$

Hence we find that

$$\begin{aligned}&\int _0^{t + a_{n-1}} \mathrm {e}^{- \varepsilon ( t + a_{n-1} - s)} \mathbb {E}\left[ \Vert P_1Y_s^{x,\varkappa } - P_1 X_{s + \varkappa }^{x} \Vert _H^2 \right] ds \\&\quad \le 4C(x)\int _0^{t + a_{n-1}} \mathrm {e}^{- \varepsilon (t + a_{n-1} - s)} \mathrm {e}^{- \delta (x)s} ds \\&\quad = 4C(x)\mathrm {e}^{- \varepsilon (t + a_{n-1})} \frac{ \mathrm {e}^{(\varepsilon - \delta (x))(t + a_{n-1})} - 1}{\varepsilon - \delta (x)} \\&\quad \le 4C(x)\frac{\mathrm {e}^{- \delta (x)t}}{\varepsilon - \delta (x)} \mathrm {e}^{- \delta (x) a_{n-1}} \\&\quad \le 4C(x) \frac{\mathrm {e}^{- \delta (x)t}}{\varepsilon - \delta (x)} \mathrm {e}^{- \frac{ \delta (x)}{2} (n-1)} \end{aligned}$$

where the last inequality follows from \(a_{n-1} = \varkappa (n-1) \ge \frac{1}{2}(n-1)\). From this we readily derive the estimate

$$\begin{aligned} J_2 \le 2 \sqrt{ \frac{(\alpha + \beta ) C(x)}{ \varepsilon - \delta (x)}} \left( 1 - \mathrm {e}^{- \frac{\delta (x)}{4}} \right) ^{-1}\mathrm {e}^{- \frac{\delta (x)}{2}t}. \end{aligned}$$

Hence, using also (3.5) we obtain

$$\begin{aligned} \mathrm {W}_2(p_{t+\tau }(x,\cdot ), p_t(x,\cdot )) \le K(\alpha , \beta , \varepsilon , x)\mathrm {e}^{- \frac{\delta (x)}{2}t}, \qquad t,\tau \ge 0, \end{aligned}$$
(4.10)

where the constant \(K(\alpha , \beta , \varepsilon , x) > 0\) is given by

$$\begin{aligned} K(\alpha , \beta , \varepsilon , x) = K(\varepsilon )(1 + \Vert x\Vert _H) + 2 \sqrt{ \frac{(\alpha + \beta ) C(x)}{ \varepsilon - \delta (x)}} \left( 1 - \mathrm {e}^{- \frac{\delta (x)}{4}} \right) ^{-1} \end{aligned}$$

with another constant \(K(\varepsilon ) > 0\). This implies that, for each \(x \in H\), \((p_t(x,\cdot ))_{t \ge 0}\) has a limit in \(\mathcal {P}_2(H)\). Denote this limit by \(\pi _{\delta _x}\). Assertion (a) now follows by taking the limit \(\tau \rightarrow \infty \) in (4.10) and using the fact that \(K(\alpha , \beta , \varepsilon , x)\) is independent of \(\tau \).

It remains to prove assertion (b). First observe that, using \(\delta (x) \ge \delta > 0\) and \(C(x) \le C(1 + \Vert x \Vert _H)^4\), we have

$$\begin{aligned} K(\alpha ,\beta , \varepsilon ,x) \le (1 + \Vert x \Vert _H)^2 \widetilde{K}(\alpha ,\beta ,\varepsilon ) \end{aligned}$$

for some constant \(\widetilde{K}(\alpha ,\beta ,\varepsilon )\). Note that

$$\begin{aligned} p_t^*\rho (dy) = \int _H p_t(z,dy)\rho (dz) \ \ \text { and } \ \ p_{t+\tau }^*\rho (dy) = \int _H p_{t+\tau }(z,dy)\rho (dz). \end{aligned}$$

Hence using first the convexity of the Wasserstein distance and then (4.10) we find that

$$\begin{aligned} \mathrm {W}_2(p_{t+\tau }^*\rho , p_t^* \rho )&\le \int _{H} \mathrm {W}_2(p_{t+\tau }(x,\cdot ), p_t(x,\cdot )) \rho (dx) \\&\le \widetilde{K}(\alpha ,\beta , \varepsilon ) \int _H (1 + \Vert x \Vert _H)^2 \rho (dx) \cdot \mathrm {e}^{- \frac{\delta }{2}t}. \end{aligned}$$

Since \(\rho \in \mathcal {P}_2(H)\), the assertion is proved. \(\square \)

4.6 Proof of Corollary 4.4

Proof of Corollary 4.4

Recall that, by condition (A2), the process \(P_1 X_t^x\) solves

$$\begin{aligned} P_1X_t^x = P_1 S(t)P_1x + \int _0^t P_1S(t-s)F(P_1X_s^x)ds. \end{aligned}$$

Since F is globally Lipschitz continuous by condition (A1), this equation has, for each \(x \in H\), a unique solution, which is deterministic. From this we readily conclude that \(P_1 X_t^{x}=P_1 X_t^{y}\) holds for all \(t\ge 0\), provided that \(P_1x = P_1y\). Hence Proposition 3.5 yields for such x, y

$$\begin{aligned} \mathbb {E}\left[ \Vert X_t^{x}-X_t^{y}\Vert _H^2\right] \le \mathrm {e}^{-\varepsilon t}\Vert x-y\Vert _{H}^2, \qquad \forall t\ge 0. \end{aligned}$$
(4.11)

Then for each \(x,y \in H\) with \(P_1x = P_1y\) and each \(t \ge 0\) we obtain

$$\begin{aligned} \mathrm {W}_2(\pi _{\delta _x}, \pi _{\delta _y})&\le \mathrm {W}_2(\pi _{\delta _x}, p_t(x,\cdot )) + \mathrm {W}_2(p_t(x,\cdot ), p_t(y,\cdot )) + \mathrm {W}_2(p_t(y,\cdot ), \pi _{\delta _y}) \\&\le \mathrm {W}_2(\pi _{\delta _x}, p_t(x,\cdot )) + \mathrm {e}^{- \frac{\varepsilon }{2}t}\Vert x - y \Vert _H + \mathrm {W}_2(p_t(y,\cdot ), \pi _{\delta _y}). \end{aligned}$$

Letting \(t \rightarrow \infty \) yields \(\pi _{\delta _x} = \pi _{\delta _y}\) and hence assertion (a) is proved.

To prove assertion (b), let \(\rho , \widetilde{\rho } \in \mathcal {P}_2(H)\) be such that \(\rho \circ P_1^{-1} = \widetilde{\rho } \circ P_1^{-1}\). Then

$$\begin{aligned} \mathrm {W}_2(\pi _{\rho }, \pi _{\widetilde{\rho }})&\le \mathrm {W}_2( \pi _{\rho }, p_t^* \rho ) + \mathrm {W}_2( p_t^*\rho , p_t^* \widetilde{\rho }) + \mathrm {W}_2( p_t^* \widetilde{\rho }, \pi _{\widetilde{\rho }}). \end{aligned}$$

Again, by letting \(t \rightarrow \infty \), it suffices to prove that

$$\begin{aligned} \limsup _{t \rightarrow \infty }\mathrm {W}_2( p_t^*\rho , p_t^* \widetilde{\rho }) = 0. \end{aligned}$$
(4.12)

Let G be a coupling of \((\rho , \widetilde{\rho })\). Using the convexity of the Wasserstein distance and Proposition 3.5 gives

$$\begin{aligned}&\mathrm {W}_2(p_t^*\rho , p_t^* \widetilde{\rho }) \\&\quad \le \int _{H \times H} \mathrm {W}_2(p_t(x,\cdot ), p_t(y,\cdot )) G(dx,dy) \\&\quad \le \int _{H \times H} \left( \mathbb {E}\left[ \Vert X_t^x - X_t^y \Vert _H^2 \right] \right) ^{1/2} G(dx,dy) \\&\quad \le \int _{H \times H} \mathrm {e}^{-\frac{\varepsilon }{2}t} \Vert x -y\Vert _H G(dx,dy) \\&\qquad + \sqrt{2(\alpha + \beta )}\int _{H \times H} \left( \int _0^{t} \mathrm {e}^{- \varepsilon (t-s)}\mathbb {E}\left[ \Vert P_1X_s^x - P_1X_s^y \Vert _H^2 \right] ds \right) ^{1/2} G(dx,dy) \\&\quad =: I_1 + I_2. \end{aligned}$$

The first term \(I_1\) satisfies

$$\begin{aligned} I_1 \le \left( 2 + \int _{H} \Vert x \Vert _H^2 \rho (dx) + \int _{H} \Vert y \Vert _H^2 \widetilde{\rho }(dy) \right) \mathrm {e}^{- \frac{\varepsilon }{2}t}. \end{aligned}$$

For the second term we first use (A2) so that \(P_1X_s^x = P_1X_s^{P_1x}\), \(P_1X_s^{y} = P_1X_s^{P_1y}\) and hence we find for each \(T > 0\) a constant \(C(T) > 0\) such that for \(t \in [0,T]\)

$$\begin{aligned} I_2&= \sqrt{2(\alpha + \beta )}\int _{H_1 \times H_1} \left( \int _0^{t} \mathrm {e}^{- \varepsilon (t-s)} \Vert P_1X_s^x - P_1X_s^y \Vert _H^2 ds \right) ^{1/2} G(dx, dy) \\&\le C(T) \left( \int _{H \times H} \Vert P_1x - P_1y\Vert _H^2 G(dx,dy) \right) ^{1/2}. \end{aligned}$$

Let us choose a particular coupling G as follows: By disintegration we write \(\rho (dx) = \rho (x_1,dx_0)(\rho \circ P_1^{-1})(dx_1)\), \(\widetilde{\rho }(dx) = \widetilde{\rho }(x_1,dx_0)(\widetilde{\rho } \circ P_1^{-1})(dx_1) = \widetilde{\rho }(x_1,dx_0)(\rho \circ P_1^{-1})(dx_1)\) where \(\rho (x_1,dx_0)\), \(\widetilde{\rho }(x_1,dx_0)\) are conditional probabilities defined on \(\mathcal {B}(H_0)\) and we have used that \((\rho \circ P_1^{-1})(dx_1) = (\widetilde{\rho } \circ P_1^{-1})(dx_1)\). Then G is, for \(A,B \in \mathcal {B}(H)\), given by

$$\begin{aligned} G(A \times B) := \int _{H \times H} \mathbb {1}_{A}(x_0, x_1) \mathbb {1}_{B}(y_0, y_1)\rho (x_1,dx_0)\widetilde{\rho }(y_1,dy_0)\widetilde{G}(dx_1,dy_1), \end{aligned}$$

where \(\widetilde{G}\) is a probability measure on \(H_1^2\) given, for \(A_1,B_1 \in \mathcal {B}(H_1)\), by

$$\begin{aligned} \widetilde{G}(A_1 \times B_1) = (\rho \circ P_1^{-1})(A_1 \cap B_1) = \rho \left( \{ x \in H \ | \ P_1x \in A_1 \cap B_1 \} \right) . \end{aligned}$$

For this particular choice of G we find that

$$\begin{aligned}&\int _{H \times H} \Vert P_1 x - P_1y \Vert _H^2 G(dx,dy) \\&\quad = \int _{H_1 \times H_1} \int _{H_0^2}\Vert x_1 - y_1 \Vert _H^2 \rho (x_1,dx_0) \widetilde{\rho }(y_1,dy_0)\widetilde{G}(dx_1,dy_1) \\&\quad = \int _{H_1 \times H_1} \Vert x_1 - y_1 \Vert _H^2 \widetilde{G}(dx_1,dy_1) = 0 \end{aligned}$$

and hence \(I_2 = 0\), since \(\widetilde{G}\) is supported on the diagonal of \(H_1 \times H_1\). This proves (4.12) and completes the proof. \(\square \)

5 The Heath–Jarrow–Morton–Musiela equation

The Heath–Jarrow–Morton–Musiela equation (HJMM-equation) describes the term structure of interest rates in terms of its forward rate dynamics modelled, for \(\beta > 0\) fixed, on the separable Hilbert space of forward curves

$$\begin{aligned} H_{\beta }&= \left\{ h:\mathbb {R}_+\rightarrow \mathbb {R}: h\text { is absolutely continuous and } \Vert h \Vert _{\beta } < \infty \right\} , \nonumber \\ \langle h,g \rangle _{\beta }&= h(\infty )g(\infty ) + \int _0^{\infty } h'(x)g'(x)\mathrm {e}^{\beta x}dx \end{aligned}$$
(5.1)

with norm \(\Vert h \Vert _{\beta }^2 = \langle h, h \rangle _{\beta }\). Such a space was first motivated and introduced by Filipović [15]. Note that \(h(\infty ):=\lim _{x\rightarrow \infty }h(x)\) exists whenever \(\int _0^{\infty }(h'(x))^2 \mathrm {e}^{\beta x}dx<\infty \); it is called the long rate of the forward curve h. The HJMM-equation on \(H_{\beta }\) is given by

$$\begin{aligned} {\left\{ \begin{array}{ll} dX_t = \left( AX_t+F_{HJMM}(\sigma ,\gamma )(X_t)\right) dt + \sigma (X_t)dW_t + \int _E\gamma (X_{t},\nu )\widetilde{N}(dt,d\nu ),\\ X_0 \in L^2(\Omega , \mathcal {F}_0, \mathbb {P}; H_{\beta }) \end{array}\right. } \end{aligned}$$
(5.2)

where \((W_t)_{t \ge 0}\) is a Q-Wiener process, \(\widetilde{N}(dt,d\nu )\) is a compensated Poisson random measure on E with compensator \(dt \mu (d\nu )\) as defined in Sect. 3.1 for \(H := H_{\beta }\), and

  1. (i)

A is the infinitesimal generator of the shift semigroup \((S(t))_{t\in \mathbb {R}_+}\) on \(H_{\beta }\), that is, \((S(t)h)(x):=h(x+t)\) for all \(t,x\ge 0\).

  2. (ii)

\(h\mapsto \sigma (h)\) is a \(\mathcal {B}(H_{\beta })/\mathcal {B}(L_2^0)\)-measurable mapping from \(H_{\beta }\) into \(L_2^0(H_{\beta })\) and \((h,\nu )\mapsto \gamma (h,\nu )\) is a \(\mathcal {B}(H_{\beta })\otimes \mathcal {E}/\mathcal {B}(H_{\beta })\)-measurable mapping from \(H_{\beta }\times E\) into \(H_{\beta }\).

  3. (iii)

    The drift is of the form

    $$\begin{aligned} F_{HJMM}(\sigma ,\gamma )(h) = \sum _{j\in \mathbb {N}}\sigma ^j(h) \Sigma ^j(h)-\int _E\gamma (h,\nu )\left( \mathrm {e}^{\Gamma (h,\nu )}-1\right) \mu (d\nu ), \end{aligned}$$

    with \(\sigma ^j(h) = \sqrt{\lambda _j}\sigma (h)e_j\),

    $$\begin{aligned} \Sigma ^j(h)(t) = \int _0^t\sigma ^j(h)(s)ds \ \text { and } \ \Gamma (h,\nu )(t) = -\int _0^t \gamma (h,\nu )(s)ds. \end{aligned}$$

The special form of the drift stems from mathematical finance and is sufficient for the absence of arbitrage opportunities. We denote the space of all forward rates with long rate equal to zero by

$$\begin{aligned} H_{\beta }^0=\lbrace h\in H_{\beta }: h(\infty )=0\rbrace . \end{aligned}$$

For the construction of a unique mild solution to (5.2) the following conditions have been introduced in [11]:

  1. (B1)

    \(\sigma :H_{\beta }\rightarrow L_2^0(H_{\beta }^0)\), \(\gamma :H_{\beta }\times E \rightarrow H_{\beta '}^0\) are Borel measurable for some \(\beta ' > \beta \).

  2. (B2)

There exists a function \(\Phi :E\rightarrow \mathbb {R}_+\) such that \(\Phi (\nu )\ge \vert \Gamma (h,\nu )(t)\vert \) for all \(h\in H_{\beta }\), \(\nu \in E\) and \(t\ge 0\).

  3. (B3)

There is an \(M\ge 0\) such that, for all \(h\in H_{\beta }\) and some \(\beta ' > \beta \),

    $$\begin{aligned} \Vert \sigma (h)\Vert _{L_2^0(H_{\beta })} \le M,\quad \int _E \mathrm {e}^{\Phi (\nu )} \max \lbrace \Vert \gamma (h,\nu )\Vert _{\beta '}^2,\Vert \gamma (h,\nu )\Vert _{\beta '}^4\rbrace \mu (d\nu ) \le M. \end{aligned}$$
  4. (B4)

    The function \(F_2:H_{\beta }\rightarrow H_{\beta }^0\) defined by

    $$\begin{aligned} F_2(h)=-\int _E \gamma (h,\nu ) \left( \mathrm {e}^{\Gamma (h,\nu )}-1 \right) \mu (d\nu ) \end{aligned}$$

    has the weak derivative given by

    $$\begin{aligned} \frac{d}{dx}F_2(h)=\int _E \gamma (h,\nu )^2 \mathrm {e}^{\Gamma (h,\nu )} \mu (d\nu )-\int _E \left( \frac{d}{dx}\gamma (h,\nu )\right) \left( \mathrm {e}^{\Gamma (h,\nu )}-1\right) \mu (d\nu ). \end{aligned}$$
  5. (B5)

    There are constants \(L_{\sigma }, L_{\gamma }>0\) such that, for all \(h_1,h_2 \in H_{\beta }\), we have

    $$\begin{aligned}&\Vert \sigma (h_1)-\sigma (h_2)\Vert _{L_2^0(H_{\beta })}^2 \le L_{\sigma } \Vert h_1 - h_2 \Vert _{\beta }^2, \\&\int _E \mathrm {e}^{\Phi (\nu )} \Vert \gamma (h_1,\nu )-\gamma (h_2,\nu )\Vert _{\beta '}^2 \mu (d\nu )\le L_{\gamma } \Vert h_1 - h_2 \Vert _{\beta }^2. \end{aligned}$$

The following is the basic existence and uniqueness result for the Heath–Jarrow–Morton–Musiela equation (5.2).

Theorem 5.1

[11] Suppose that conditions (B1)–(B5) are satisfied. Then \(F_{HJMM}\) maps \(H_{\beta }\) into \(H_{\beta }^0\) and there exists a constant \(L_{F}>0\) such that, for each \(h_1,h_2\in H_{\beta }\),

$$\begin{aligned} \Vert F_{HJMM}(h_1) - F_{HJMM}(h_2)\Vert _{\beta }^2\le L_F \Vert h_1 -h_2\Vert _{\beta }^2. \end{aligned}$$
(5.3)

This constant can be chosen as

$$\begin{aligned} L_F = \frac{\max (L_{\sigma },L_{\gamma })\sqrt{M}}{\beta }\left( \sqrt{6M\sqrt{2}} +\sqrt{\frac{8}{\beta ^3} + \frac{16}{\beta } }+\sqrt{ \frac{16 (1+\frac{1}{\sqrt{\beta }})^2+48}{(\beta '-\beta )}} \right) . \end{aligned}$$
(5.4)

Moreover, for each initial condition \(h \in L^2(\Omega , \mathcal {F}_0, \mathbb {P}; H_{\beta })\) there is a unique adapted, càdlàg mild solution \((X_t)_{t\ge 0}\) to (5.2).

Proof

This result is essentially contained in [11]; the bound on \(L_F\) follows immediately from its derivation there. \(\square \)

Using the space of all functions with zero long rate we obtain the decomposition

$$\begin{aligned} H_{\beta } = H_{\beta }^0 \oplus \mathbb {R}, \qquad h = (h - h(\infty )) + h(\infty ), \end{aligned}$$

where \(h(\infty ) \in \mathbb {R}\) is identified with a constant function. Denote by

$$\begin{aligned} P_0h = h - h(\infty ) \ \text { and } \ P_1 h = h(\infty ) \end{aligned}$$

the corresponding projections onto \(H_{\beta }^0\) and \(\mathbb {R}\), respectively. Such a decomposition of \(H_{\beta }\) was first used in [38] to study invariant measures for the HJMM-equation driven by a Q-Wiener process. An extension to the Lévy driven HJMM-equation was then obtained in [36].
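The following Python sketch evaluates the norm (5.1) and this decomposition for a hypothetical sample curve \(h(x) = a + b\mathrm {e}^{-cx}\), for which \(h(\infty ) = a\) and \(\Vert h \Vert _{\beta }^2 = a^2 + b^2c^2/(2c-\beta )\) whenever \(\beta < 2c\).

```python
import numpy as np
from scipy.integrate import quad

# Sketch of the space H_beta from (5.1) and the decomposition h = P_0 h + P_1 h
# for the hypothetical sample curve h(x) = a + b*exp(-c*x), whose long rate is
# h(inf) = a.  Integrability of |h'|^2 e^(beta x) requires beta < 2c.
beta, a, b, c = 1.0, 0.03, 0.02, 1.0

h       = lambda x: a + b * np.exp(-c * x)
h_prime = lambda x: -b * c * np.exp(-c * x)

# ||h||_beta^2 = h(inf)^2 + int_0^inf h'(x)^2 e^(beta x) dx, cf. (5.1)
integral, _ = quad(lambda x: h_prime(x) ** 2 * np.exp(beta * x), 0.0, np.inf)
print(a ** 2 + integral, a ** 2 + b ** 2 * c ** 2 / (2 * c - beta))  # agree

P1_h = a                       # long rate: a constant function in R ⊂ H_beta
P0_h = lambda x: h(x) - a      # remaining curve, an element of H_beta^0
```

The proof of the next theorem shows that the results of Sect. 4 imply the stability properties of the HJMM-equation as a particular case.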

Theorem 5.2

Suppose that conditions (B1)–(B5) are satisfied. If

$$\begin{aligned} \beta > 2 \sqrt{L_F} + L_{\sigma } + L_{\gamma }, \end{aligned}$$
(5.5)

then for each initial distribution \(\rho \) on \(H_{\beta }\) with finite second moments there exists an invariant measure \(\pi _{\rho }\) and it holds that

$$\begin{aligned} \mathrm {W}_2(p_t^*\rho , \pi _{\rho }) \le K\left( 1 + \int _{H_{\beta }} \Vert h \Vert _{\beta }^2 \rho (dh) \right) \mathrm {e}^{- \frac{\beta - 2\sqrt{L_F} - L_{\sigma } - L_{\gamma }}{2} t} \end{aligned}$$
(5.6)

for some constant \(K = K(\beta , \sigma , \gamma ) > 0\). Moreover, if \(\rho , \widetilde{\rho }\) are such that \(\rho \circ P_1^{-1} = \widetilde{\rho } \circ P_1^{-1}\), then \(\pi _{\rho } = \pi _{\widetilde{\rho }}\).

Proof

Observe that the assertion is an immediate consequence of Theorem 4.3 and Corollary 4.4. Below we briefly verify the assumptions given in these statements. Condition (A1) follows from (B1), (B5), and (5.3). The growth condition (3.14) is satisfied by (B3) and the fact that \(\Vert \cdot \Vert _{\beta } \le \Vert \cdot \Vert _{\beta '}\) for \(\beta < \beta '\). It is not difficult to see that

$$\begin{aligned} \Vert S(t)h - P_1h \Vert _{\beta } \le \mathrm {e}^{-\frac{\beta }{2}t} \Vert h - P_1h \Vert _{\beta }, \qquad t \ge 0 \end{aligned}$$

and that \((S(t))_{t \ge 0}\) leaves \(H_{\beta }^0\) as well as \(\mathbb {R}\subset H_{\beta }\) invariant. Hence Remark 3.2 yields that

$$\begin{aligned} \langle Ah,h \rangle _{\beta } \le -\frac{\beta }{2}\Vert h\Vert _{\beta }^2 + \frac{\beta }{2} \Vert P_1h\Vert _{\beta }^2, \qquad h \in D(A). \end{aligned}$$

It follows from the considerations in Sect. 2 (see (3.10)) that (GDC) is satisfied for \(\alpha = \frac{\beta }{2} - \sqrt{L_F}\). Consequently, \(\varepsilon = \beta - 2\sqrt{L_F} - L_{\sigma } - L_{\gamma }\) and (3.13) holds due to (5.5). Since the coefficients map into \(H_{\beta }^0\) and \(S(t) P_1 h =h(\infty )=P_1 h\), conditions (A2), (A3) and (4.6) are trivially satisfied. The particular form of the estimate (5.6) follows from the proof of Theorem 4.3. \(\square \)

Comparing our result with [36, 38], we allow for a more general jump noise and prove convergence in the stronger Wasserstein distance with an exponential rate. Moreover, assuming that the volatilities map constant functions to zero, i.e.

$$\begin{aligned} \sigma (c)\equiv 0,\quad \gamma (c,\nu )\equiv 0, \qquad \forall c \in \mathbb {R}\subset H_{\beta }, \ \nu \in E \end{aligned}$$
(5.7)

shows that \(F_{HJMM}(c) \equiv 0\), so that also (4.4) is satisfied. Hence we may apply Theorem 4.2 to characterize these invariant measures more explicitly. In fact, since \(P_1 h=h(\infty )\) holds for all \(h \in H_\beta \), we get the following corollary.

Corollary 5.3

Suppose that conditions (B1) – (B5) are satisfied, that (5.5) and (5.7) hold. Then

$$\begin{aligned} \mathbb {E}\left[ \Vert X_t - X_0(\infty )\Vert _{\beta }^2 \right] \le \mathbb {E}\left[ \Vert X_0 - X_0(\infty )\Vert _{\beta }^2 \right] \mathrm {e}^{- \left( \beta - 2 \sqrt{L_F} - L_{\sigma } - L_{\gamma }\right) t} \end{aligned}$$

for each \(X_0 \in L^2(\Omega ,\mathcal {F}_0,\mathbb {P};H_{\beta })\).

This corollary describes a case where the set of invariant measures is explicitly given by the laws of square-integrable random variables over the continuum of long rates, including the invariant measures \(\delta _{h_0(\infty )}\) for all \(X_0 = h_0 \in H_{\beta }\).

We close this section by applying our results to the particular example also discussed in [36].

Example 5.4

Take

$$\begin{aligned} \sigma ^1(h)(x) := \int _x^{\infty }\min \left( \mathrm {e}^{- \beta y},\ |h'(y)| \right) dy \end{aligned}$$

and \(\sigma ^j\equiv 0\) for \(j\ge 2\). Then

$$\begin{aligned} \Vert \sigma (h) \Vert _{L_2^0(H_{\beta })}^2 = \Vert \sigma ^1(h)\Vert _{\beta }^2\le \int _0^\infty (\mathrm {e}^{-2\beta x}) \mathrm {e}^{\beta x} dx=\frac{1}{\beta } =:M \end{aligned}$$

and since \(\min (a,b_1)-\min (a,b_2)\le |b_1-b_2|\) for \(a,b_1,b_2\in \mathbb {R}_+\), we also have

$$\begin{aligned} \Vert \sigma (h_1)-\sigma (h_2) \Vert _{L_2^0(H_{\beta })}^2&= \Vert \sigma ^1(h_1)-\sigma ^1(h_2)\Vert _{\beta }^2\\&=\int _0^\infty (\min (\mathrm {e}^{-\beta x},|h_1'(x)|)-\min (\mathrm {e}^{-\beta x},|h_2'(x)|))^2 \mathrm {e}^{\beta x} dx\\&\le \int _0^\infty (h_1'(x)-h_2'(x))^2 \mathrm {e}^{\beta x} dx\\&\le \Vert h_1-h_2\Vert _{\beta }^2. \end{aligned}$$

Consequently, by taking \(\gamma \equiv 0\), the conditions (B1)–(B5) are satisfied with Lipschitz and growth constants \(L_{\sigma }=1\), \(L_{\gamma } = 0\) and \(M=\frac{1}{\beta }\). By (5.4) we get

$$\begin{aligned} L_F = \frac{1}{\sqrt{\beta ^3}}\left( \sqrt{\frac{6\sqrt{2}}{\beta }} +\sqrt{\frac{8}{\beta ^3} + \frac{16}{\beta } } + \sqrt{ \frac{16 (1+\frac{1}{\sqrt{\beta }})^2+48}{(\beta '-\beta )}} \right) , \end{aligned}$$

for \(\beta ' > \beta \). Choosing \(\beta \ge 3\) and \(\beta ' > \beta \) large enough such that \(L_F < 1\), we find that

$$\begin{aligned} 2 \sqrt{L_F} + L_{\sigma } + L_{\gamma }&< 3 \le \beta , \end{aligned}$$

i.e. (5.5) is satisfied. It is clear that \(\sigma (c) \equiv 0\) for each constant function c. Hence Corollary 5.3 is applicable.
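The last step is easily checked numerically; the following sketch evaluates the above expression for \(L_F\) and condition (5.5), where \(\beta = 3\), \(\beta ' = 100\) is just one admissible choice.

```python
import numpy as np

# Numerical check of Example 5.4: L_F from the display above
# (i.e. (5.4) with L_sigma = 1, L_gamma = 0, M = 1/beta) and condition (5.5).
def L_F(beta, beta_prime):
    return (
        np.sqrt(6.0 * np.sqrt(2.0) / beta)
        + np.sqrt(8.0 / beta ** 3 + 16.0 / beta)
        + np.sqrt((16.0 * (1.0 + 1.0 / np.sqrt(beta)) ** 2 + 48.0)
                  / (beta_prime - beta))
    ) / beta ** 1.5

beta, beta_prime = 3.0, 100.0
lf = L_F(beta, beta_prime)
lhs = 2.0 * np.sqrt(lf) + 1.0 + 0.0      # 2 sqrt(L_F) + L_sigma + L_gamma
print(f"L_F ≈ {lf:.3f}, 2*sqrt(L_F) + 1 ≈ {lhs:.3f} < beta = {beta}")
# beta = 3, beta_prime = 100 gives L_F ≈ 0.96 and 2.96 < 3, so (5.5) holds.
```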

6 Stochastic partial differential equations with delay

6.1 Description of the model

Let H be a separable Hilbert space and \((W_t)_{t \ge 0}\) a Q-Wiener process on a stochastic basis \((\Omega , \mathcal {F}, (\mathcal {F}_t)_{t \ge 0}, \mathbb {P})\) with the usual conditions. Below we investigate invariant measures for the stochastic delay equation

$$\begin{aligned} {\left\{ \begin{array}{ll} dX_t = \left( AX_t+G(X_{t+\cdot }) \right) dt + \sigma (X_t,X_{t+\cdot })dW_t, \qquad t > 0 \\ X_0 = \phi _0, X_{0+\cdot }=\phi , \end{array}\right. } \end{aligned}$$
(6.1)

where \(\phi _0 \in L^2(\Omega , \mathcal {F}_0, \mathbb {P};H)\), \(\phi \in L^2(\Omega , \mathcal {F}_0, \mathbb {P}; L^2([-1,0];H))\) and for \(t\ge 1\) the term \(X_{t + \cdot }\) denotes the past segment of the trajectory, i.e.

$$\begin{aligned} X_{t+\cdot }:[-1,0]\longrightarrow H, \qquad s\longmapsto X_{t+s}, \end{aligned}$$

and for \(t\in [0,1)\)

$$\begin{aligned} X_{t+\cdot }:[-1,0]\longrightarrow H, \qquad s\longmapsto \phi (t+s)\mathbf {1}_{[-1,-t)}(s) + X_{t+s}\mathbf {1}_{[-t,0]}(s), \end{aligned}$$

and

  1. (i)

\((A, D(A))\) is the infinitesimal generator of a strongly continuous semigroup \((S(t))_{t \ge 0}\) on H.

  2. (ii)

    \((\psi _0,\psi )\mapsto \sigma (\psi _0,\psi )\) is measurable from \(H \times L^2([-1,0]; H)\) to \(L_2^0(H)\).

  3. (iii)

\(G:W^{1,2}([-1,0];H)\rightarrow H\) is a continuous linear operator given by the Riemann–Stieltjes integral

    $$\begin{aligned} G \phi := \int _{-1}^0 \eta (ds) \phi (s) \end{aligned}$$

where \(\eta :[-1,0] \rightarrow L(H)\) is of bounded variation (a concrete scalar example is sketched below).
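For a concrete scalar illustration, the following sketch evaluates \(G\phi \) for \(H = \mathbb {R}\) and the hypothetical choice \(\eta (ds) = 2\delta _{-1}(ds) + \mathrm {e}^s ds\), a point mass at \(-1\) plus an absolutely continuous part, which is of bounded variation.

```python
import numpy as np

# Sketch of the delay operator G phi = int_{-1}^0 eta(ds) phi(s) for H = R.
# The measure eta(ds) = 2*delta_{-1}(ds) + e^s ds is a hypothetical choice:
# a point mass at -1 plus an absolutely continuous part, of bounded variation.
def G(phi, m=10_000):
    s = np.linspace(-1.0, 0.0, m)
    return 2.0 * phi(-1.0) + np.trapz(np.exp(s) * phi(s), s)

print(G(lambda s: s + 1.0))   # phi(s) = s + 1: 2*0 + int e^s (s+1) ds = 1/e
```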

Such an equation is usually studied in an extended Hilbert space which also takes the evolution of the past segment \((X_{t+\cdot })_{t \ge 0}\) into account, see [8]. Below we follow this approach. Namely, introduce the new Hilbert space

$$\begin{aligned} \mathcal {H} = H \times L^2([-1,0]; H), \qquad \Vert (\phi _0, \phi ) \Vert _{\mathcal {H}} = \left( \Vert \phi _0 \Vert _H^2 + \Vert \phi \Vert _{L^2([-1,0];H)}^2 \right) ^{1/2}. \end{aligned}$$
(6.2)

Define the operator

$$\begin{aligned} \mathcal {A}_0:=\left( \begin{matrix} A &{} 0 \\ 0 &{} \frac{d}{ds} \end{matrix}\right) \quad D(\mathcal {A}_0)=\lbrace (\phi _0,\phi )^T\in D(A)\times W^{1,2}([-1,0];H): \phi (0)=\phi _0\rbrace , \end{aligned}$$

which generates a strongly continuous semigroup \((\mathcal {S}_0(t))_{t\ge 0}\) on \(\mathcal {H}\), given by

$$\begin{aligned} \mathcal {S}_0(t):=\left( \begin{matrix} S(t) &{}\quad 0 \\ S_t &{}\quad T_0(t) \end{matrix}\right) \end{aligned}$$
(6.3)

due to [5, Theorem 3.25]. Here \((T_0(t))_{t\ge 0}\) is the nilpotent left shift semigroup on \(L^2([-1,0];H)\) and

$$\begin{aligned} S_t\phi _0 (\tau ):= {\left\{ \begin{array}{ll} S(t+\tau )\phi _0, &{}\quad -t<\tau \le 0, \\ 0, &{}\quad -1\le \tau \le -t. \end{array}\right. } \end{aligned}$$

It then follows from [5, Theorem 3.29] that the operator \(\mathcal {A}\) with domain \(D(\mathcal {A})=D(\mathcal {A}_0)\) given by

$$\begin{aligned} \mathcal {A}:=\left( \begin{matrix} A &{}\quad G \\ 0 &{}\quad \frac{d}{ds} \end{matrix}\right) =\mathcal {A}_0+\left( \begin{matrix} 0 &{}\quad G \\ 0 &{}\quad 0 \end{matrix}\right) \end{aligned}$$
(6.4)

is the generator of a strongly continuous semigroup \((\mathcal {S}(t))_{t\ge 0}\) on \(\mathcal {H}\). Thus, we can formally identify (6.1) with the \(\mathcal {H}\)-valued SPDE

$$\begin{aligned} {\left\{ \begin{array}{ll} d\mathcal {X}_t= \mathcal {A} \mathcal {X}_t dt+ \Sigma (\mathcal {X}_t)dW_t, \qquad t \ge 0, \\ \mathcal {X}_0=(\phi _0, \phi )^T, \end{array}\right. } \qquad \Sigma (\phi _0,\phi ):= \begin{pmatrix} \sigma (\phi _0,\phi ) &{}\quad 0 \\ 0 &{} \quad 0\end{pmatrix}. \end{aligned}$$
(6.5)
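A minimal Euler sketch for \(H = \mathbb {R}\) illustrates how the pair \((X_t, X_{t+\cdot })\) underlying (6.5) can be propagated in practice, keeping the past segment in a ring buffer; all coefficients below are hypothetical choices, with \(\eta = \frac{1}{2}\delta _{-1}\), i.e. \(G\phi = \frac{1}{2}\phi (-1)\).

```python
import numpy as np

rng = np.random.default_rng(2)

# Euler sketch of (6.1) for H = R with the hypothetical coefficients
# A = -2, G phi = 0.5 * phi(-1) (i.e. eta = 0.5 * delta_{-1}) and additive
# noise sigma = 0.1.  The segment X_{t+.} on [-1,0] lives in a ring buffer,
# mirroring the state (X_t, X_{t+.}) of the reformulated system (6.5).
a, g, sigma = 2.0, 0.5, 0.1
dt = 1e-3
lag = int(1.0 / dt)              # grid points covering the delay window [-1, 0]

segment = np.zeros(lag)          # initial history phi ≡ 0
x = 1.0                          # initial value phi_0
for k in range(int(10.0 / dt)):
    delayed = segment[k % lag]   # approximates X_{t-1}
    segment[k % lag] = x         # buffer now holds the most recent unit of time
    x += (-a * x + g * delayed) * dt + sigma * rng.normal(0.0, np.sqrt(dt))

print(f"X_T ≈ {x:.3f}")          # fluctuates near 0: |g| < a makes the delayed
                                 # drift stable, a (C3)-type situation
```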

6.2 Main results for (6.5)

Next we proceed to apply the results of this work to the SPDE (6.5). For this purpose we make the following assumption:

  1. (C1)

    There exists an \(L_{\sigma }>0\) such that

$$\begin{aligned} \Vert \sigma (\phi _0,\phi )-\sigma (\psi _0,\psi )\Vert _{L_2^0(H)}^2 \le L_{\sigma } \left( \Vert \phi _0-\psi _0\Vert _H^2 + \Vert \phi -\psi \Vert _{L^2([-1,0];H)}^2 \right) \end{aligned}$$

holds for all \((\phi _0,\phi ),(\psi _0,\psi )\in \mathcal {H}\).

  2. (C2)

The operator \((A, D(A))\) satisfies (GDC) with projection operators \(P_0,P_1\) and constants \(\alpha > 0\), \(\beta \ge 0\).

We will see that condition (C1) implies (A1), while condition (C2) will be used to prove that \(\mathcal {A}\) also satisfies (GDC) with respect to an equivalent scalar product on \(\mathcal {H}\).

Proposition 6.1

Suppose that conditions (C1), (C2) are satisfied, that \(\eta \) has a jump at \(-1\), and that one of the following conditions holds:

  1. (i)

G extends to a bounded linear operator from \(L^2([-1,0];H)\) to H, or

  2. (ii)

    \((S(t))_{t \ge 0}\) leaves \(H_0 = \mathrm {ran}(P_0)\) and \(H_1 = \mathrm {ran}(P_1)\) invariant, \(H_0, H_1\) are orthogonal, \(\mathrm {ran}(G) \subset H_1\) and \(GP_0\) extends to a bounded linear operator on \(L^2([-1,0];H)\).

Then for each initial condition \((\phi _0,\phi )\in L^2(\Omega , \mathcal {F}_0, \mathbb {P};\mathcal {H})\) there exists a unique mild solution \((\mathcal {X}_t)_{t\ge 0} \subset L^2(\Omega , \mathcal {F}, \mathbb {P}; \mathcal {H})\) to (6.5).

Proof

Under condition (i) we work on the Hilbert space \(\mathcal {H}\) while under condition (ii) we work on the Hilbert space \(\mathcal {H}^{\tau }\), which is algebraically \(\mathcal {H}\) but equipped with the equivalent norm given by

$$\begin{aligned} \Vert (\phi _0,\phi )\Vert ^2_{\mathcal {H}^\tau }:=\Vert \phi _0\Vert _H^2+\int _{-1}^0\Vert P_0\phi (s)\Vert _H^2 ds+\int _{-1}^0\Vert P_1\phi (s)\Vert _H^2\tau (s) ds, \end{aligned}$$
(6.6)

where

$$\begin{aligned} \tau (r)=\int _{-1}^r \Vert \eta (ds)\Vert _{L(H)}, \qquad r \in [-1,0] \end{aligned}$$
(6.7)

denotes the variation of \(\eta \). Note that, due to a result of Webb (see [41] and Remark 6.4 below), this norm is indeed equivalent to the original norm on \(\mathcal {H}\). For condition (A1) we first observe that \(L_F = L_{\gamma } = 0\) (as \(F=0\), \(\gamma =0\)) and, if assumption (i) holds, then

$$\begin{aligned} \Vert \Sigma (\phi _0, \phi ) - \Sigma (\psi _0, \psi ) \Vert _{L_2^0(\mathcal {H})}^2&\le \Vert \sigma (\phi _0, \phi ) - \sigma (\psi _0, \psi ) \Vert _{L_2^0(H)}^2 \\&\le L_{\sigma } \left( \Vert \phi _0 - \psi _0 \Vert _H^2 + \Vert \phi - \psi \Vert _{L^2([-1,0];H)}^2 \right) \\&= L_{\sigma }\Vert (\phi _0, \phi )^T- (\psi _0, \psi )^T\Vert _{\mathcal {H}}^2. \end{aligned}$$

If condition (ii) holds, then analogously we obtain

$$\begin{aligned} \Vert \Sigma (\phi _0, \phi ) - \Sigma (\psi _0, \psi ) \Vert _{L_2^0(\mathcal {H}^{\tau })}^2&\le L_{\sigma }\Vert (\phi _0, \phi )^T- (\psi _0, \psi )^T\Vert _{\mathcal {H}}^2 \\&\le \max \{1,\tau (0)\}L_{\sigma }\Vert (\phi _0, \phi )^T- (\psi _0, \psi )^T\Vert _{\mathcal {H}^{\tau }}^2. \end{aligned}$$

This shows that condition (A1) is satisfied. Finally, it follows from Proposition 6.5 below that the operator \((\mathcal {A}, D(\mathcal {A}))\) satisfies condition (GDC). \(\square \)

We proceed to formulate our main results on invariant measures for (6.5). For this purpose we introduce the following additional condition:

  1. (C3)

    For each \((\phi _0, \phi ) \in \mathcal {H}\) there exist \(M(\phi _0,\phi ) \ge 1\), \(\delta (\phi _0,\phi ) > 0\) and an element \(\widetilde{f}(\phi _0,\phi ) \in \mathcal {H}\) such that

    $$\begin{aligned} \Vert \mathcal {S}(t)(P_1 \phi _0,\phi )- \widetilde{f}(\phi _0,\phi )\Vert _{\mathcal {H}} \le M(\phi _0,\phi ) \mathrm {e}^{-t\delta (\phi _0,\phi )}, \qquad t \ge 0. \end{aligned}$$

Observe that (C3) corresponds to condition (A3) and is trivially satisfied if \((\mathcal {S}(t))_{t\ge 0}\) is exponentially stable, which is, for example, the case in the setting of [5, Corollary 5.9].

Introduce the subspaces

$$\begin{aligned} \mathcal {H}_0 := H_0\times \lbrace 0\rbrace \ \ \text { and } \ \ \mathcal {H}_1 := H_1 \times L^2([-1,0];H), \end{aligned}$$

which yield an orthogonal decomposition of \(\mathcal {H}\) with projection operators

$$\begin{aligned}&\mathcal {P}_0: \mathcal {H} \longrightarrow \mathcal {H}_0, \ \ (\phi _0,\phi ) \longmapsto (P_0\phi _0, 0), \\&\mathcal {P}_1: \mathcal {H} \longrightarrow \mathcal {H}_1, \ \ (\phi _0, \phi ) \longmapsto (P_1\phi _0, \phi ). \end{aligned}$$

The following is our main result for this section; it uses the fact that \(\mathcal {A}\) satisfies (GDC) whenever A does and some additional conditions hold (this technical result is proved below in Proposition 6.5).

Theorem 6.2

Suppose that conditions (C1) – (C3) hold, that \(P_1 \sigma (\phi _0, \phi ) = 0\) for all \((\phi _0, \phi ) \in \mathcal {H}\), and that one of the following conditions is satisfied:

  1. (i)

G is bounded from \(L^2([-1,0];H)\) to H, \(GP_1 = P_1G\), \((S(t))_{t \ge 0}\) commutes with \(P_1\), and

    $$\begin{aligned} \alpha > 1/2 + L_{\sigma }/2; \end{aligned}$$
  2. (ii)

    \((S(t))_{t \ge 0}\) leaves \(H_0 = \mathrm {ran}(P_0)\) and \(H_1 = \mathrm {ran}(P_1)\) invariant, \(H_0, H_1\) are orthogonal, \(\mathrm {ran}(G) \subset H_1\), \(GP_0\) extends to a bounded linear operator on \(L^2([-1,0];H)\), and

    $$\begin{aligned} \alpha > 1/2 + \max \{1,\tau (0)\}L_{\sigma }/2. \end{aligned}$$

Then Theorem 4.3 and Corollary 4.4 apply. In particular, for each \(h:=(\phi _0, \phi ) \in L^2(\Omega , \mathcal {F}_0, \mathbb {P};\mathcal {H})\) there exists an invariant measure \(\pi _{Law(h)}\) for the Markov process \((\mathcal {X}_t)_{t \ge 0}\), and this measure satisfies \(\pi _{Law(h)}=\pi _{Law(\mathcal {P}_1h)}\).

Proof

Let us first show that condition (A2) is satisfied, i.e. that \(\mathcal {S}(t)\) leaves \(\mathcal {H}_1\) invariant and \(\mathcal {P}_1\Sigma = 0\). It follows from Lemma 6.6 below that \(\mathcal {P}_1\) commutes with the semigroup \((\mathcal {S}(t))_{t \ge 0}\). Moreover, one has

$$\begin{aligned} \mathcal {P}_1 \Sigma (\phi _0,\phi ) = \begin{pmatrix} P_1 \sigma (\phi _0,\phi ) &{}\quad 0 \\ 0 &{}\quad 0 \end{pmatrix} = 0 \end{aligned}$$

due to \(P_1 \sigma = 0\). This shows that condition (A2) is satisfied. Condition (A3) is immediate by assumption (C3) while, by virtue of Proposition 6.5, (3.13) reduces under condition (i) to

$$\begin{aligned} \varepsilon = 2\left( \alpha - \frac{1}{2} \right) - L_{\sigma } > 0, \end{aligned}$$

and under condition (ii) to

$$\begin{aligned} \varepsilon = 2\left( \alpha - \frac{1}{2} \right) - \max \{1,\tau (0)\}L_{\sigma } > 0. \end{aligned}$$

Altogether we conclude that Theorem 4.3 and Corollary 4.4 apply, which proves the assertion. \(\square \)

Remark 6.3

Condition (ii) is slightly more restrictive on the semigroup and the projection operators than condition (i). In contrast to the latter, condition (ii) covers delay operators such as point evaluations in \(H_1\), that is, \(G =\delta _{-1} P_1\) with \(\delta _{-1} \phi = \phi (-1)\) for \(\phi \in W^{1,2}([-1,0];H_1)\).

6.3 Some technical results

Let us first provide a sufficient and easy-to-check condition for the operator \(\mathcal {A}\) to satisfy the generalized dissipativity condition (GDC). As a first step we recall a result from [41].

Remark 6.4

(An equivalent scalar product). Let \(\tau \) be defined as in (6.7) and suppose that \(\eta \) has a jump at \(-1\). Suppose that there exists \(c \in \mathbb {R}\) such that \(A-cI\) is dissipative. Then the Hilbert space norm defined by

$$\begin{aligned} \Vert (\phi _0,\phi )\Vert _{\mathcal {H}^{\tau }}^2:=\Vert \phi _0\Vert _H^2+\int _{-1}^0 \Vert \phi (s)\Vert _H^2\tau (s) ds \end{aligned}$$

is equivalent to the original one on \(\mathcal {H}\). Moreover, \(\mathcal {A}-\gamma I\) is dissipative for every \(\gamma \ge \max \{0,c+\tau (0)\}\) with respect to this norm, i.e.

$$\begin{aligned} \langle \mathcal {A}(\phi _0,\phi )^T, (\phi _0,\phi )^T \rangle _{\mathcal {H}^{\tau }} \le \gamma \Vert (\phi _0,\phi )^T\Vert _{\mathcal {H}^{\tau }}^2, \qquad \text {for all } (\phi _0,\phi )^T \in D(\mathcal {A}). \end{aligned}$$

Based on this observation we can now provide sufficient conditions for \((\mathcal {A},D(\mathcal {A}))\) to satisfy (GDC).

Proposition 6.5

Suppose that A satisfies (GDC) with constants \(\alpha ,\beta \ge 0\).

  1. (i)

If G extends to a continuous linear operator from \(L^2([-1,0];H)\) to H and \(\alpha > 1/2\), then \(\mathcal {A}\) satisfies (GDC), i.e.,

    $$\begin{aligned} \langle \mathcal {A}(\phi _0, \phi )^T, (\phi _0, \phi )^T\rangle _{\mathcal {H}} \le - \widetilde{\alpha }\Vert (\phi _0,\phi )\Vert _{\mathcal {H}}^2 + \left( \widetilde{\alpha } + \widetilde{\beta } \right) \Vert \mathcal {P}_1(\phi _0,\phi )^T\Vert _{\mathcal {H}}^2, \end{aligned}$$

    where

    $$\begin{aligned} \widetilde{\alpha } := \alpha - \frac{1 + \varepsilon ^2}{2} \ \ \text { and } \ \ \widetilde{\beta } := \beta + \alpha + \frac{1}{2\varepsilon ^2}\Vert G \Vert _{L(L^2([-1,0];H))}^2 + \frac{\varepsilon ^2}{2}, \end{aligned}$$

    and \(\varepsilon > 0\) is such that \(\varepsilon < \sqrt{2\alpha - 1}\).

  2. (ii)

Assume that \(H_0, H_1\) provide an orthogonal decomposition of H such that the semigroup \((S(t))_{t \ge 0}\) generated by \((A, D(A))\) leaves \(H_0\) and \(H_1\) invariant. Moreover, suppose that \(\mathrm {ran}(G)\subseteq H_1\), and that \(W^{1,2}([-1,0];H) \ni \phi \mapsto GP_0\phi \in H_1\) extends to a continuous linear operator \(GP_0:L^2([-1,0];H)\rightarrow H_1\) with operator norm denoted by \(\Vert GP_0 \Vert \). Define an equivalent Hilbert space norm by (6.6). If \(\alpha > 1/2\), then \(\mathcal {A}\) satisfies (GDC) with respect to this norm, i.e., it holds that

$$\begin{aligned} \langle \mathcal {A}(\phi _0, \phi )^T, (\phi _0, \phi )^T\rangle _{\mathcal {H}^{\tau }}&\le - \left( \alpha - \frac{1}{2} \right) \Vert (\phi _0, \phi ) \Vert _{\mathcal {H}^{\tau }}^2 \\&\quad + \left( \left( \alpha - \frac{1}{2} \right) + \beta + \tau (0) + \frac{\Vert GP_0\Vert }{2} \right) \Vert \mathcal {P}_1(\phi _0, \phi ) \Vert _{\mathcal {H}^{\tau }}^2. \end{aligned}$$

Proof

(i) For \((\phi _0,\phi )^T\in D(\mathcal {A}_0)\) we have

$$\begin{aligned} \langle \mathcal {A}_0 (\phi _0,\phi )^T,(\phi _0,\phi )^T\rangle _{\mathcal {H}}&= \langle A \phi _0,\phi _0\rangle _H+\int _{-1}^0 \left\langle \frac{d}{ds} \phi (s),\phi (s) \right\rangle _H ds \\&= \langle A \phi _0,\phi _0\rangle _H + \int _{-1}^0 \frac{1}{2}\frac{d}{ds}\Vert \phi (s)\Vert _H^2ds \\&= \langle A \phi _0,\phi _0\rangle _H+\frac{1}{2} (\Vert \phi (0)\Vert _H^2-\Vert \phi (-1)\Vert _H^2) \\&\le \langle A \phi _0,\phi _0\rangle _H+\frac{1}{2} \Vert \phi _0\Vert _H^2, \end{aligned}$$

where we used the fact that \(\phi _0=\phi (0)\). Making further use of the fact that A satisfies (GDC) we find

$$\begin{aligned}&\langle \mathcal {A}_0 (\phi _0,\phi )^T,(\phi _0,\phi )^T\rangle _{\mathcal {H}} \le - \left( \alpha -\frac{1}{2} \right) \Vert \phi _0\Vert _{H}^2 + (\beta +\alpha )\Vert P_1\phi _0\Vert _{H}^2 \\&\quad \le -\left( \alpha -\frac{1}{2}\right) \Vert (\phi _0,\phi )\Vert _{\mathcal {H}}^2 + \left( \beta +2\alpha -\frac{1}{2} \right) \Vert (P_1\phi _0,\phi )^T\Vert _{\mathcal {H}}^2. \end{aligned}$$

To estimate the operator \(\mathcal {A}\) we will use that

$$\begin{aligned} \langle G\phi , \phi _0 \rangle _H&\le \Vert G\phi \Vert _H \Vert \phi _0 \Vert _H \\&\le \frac{1}{2\varepsilon ^2}\Vert G\phi \Vert _{H}^2 + \frac{\varepsilon ^2}{2}\Vert \phi _0 \Vert _H^2 \\&= \frac{1}{2\varepsilon ^2}\Vert G \Vert _{L(L^2([-1,0];H))}^2 \Vert \phi \Vert _{L^2([-1,0];H)}^2 + \frac{\varepsilon ^2}{2}\Vert \phi _0\Vert _H^2 \\&\le \frac{1}{2\varepsilon ^2}\Vert G \Vert _{L(L^2([-1,0];H))}^2 \Vert \mathcal {P}_1(\phi _0, \phi )^T\Vert _{\mathcal {H}}^2 + \frac{\varepsilon ^2}{2} \Vert (\phi _0, \phi )^T\Vert _{\mathcal {H}}^2 \end{aligned}$$

where \(\varepsilon > 0\). Thus we obtain

$$\begin{aligned}&\langle \mathcal {A} (\phi _0,\phi )^T,(\phi _0,\phi )^T\rangle _{\mathcal {H}} \\&\quad = \langle \mathcal {A}_0(\phi _0, \phi )^T, (\phi _0,\phi )^T\rangle _{\mathcal {H}} + \langle G\phi , \phi _0 \rangle _H \\&\quad \le -\left( \alpha -\frac{1}{2}\right) \Vert (\phi _0,\phi )\Vert _{\mathcal {H}}^2 + \left( \beta +2\alpha -\frac{1}{2} \right) \Vert \mathcal {P}_1(\phi _0,\phi )^T\Vert _{\mathcal {H}}^2 \\&\qquad + \frac{1}{2\varepsilon ^2}\Vert G \Vert _{L(L^2([-1,0];H))}^2 \Vert \mathcal {P}_1(\phi _0, \phi )^T\Vert _{\mathcal {H}}^2 + \frac{\varepsilon ^2}{2} \Vert (\phi _0, \phi )^T\Vert _{\mathcal {H}}^2 \\&\quad = - \left( \alpha - \frac{1 + \varepsilon ^2}{2} \right) \Vert (\phi _0,\phi )\Vert _{\mathcal {H}}^2 + \left( \beta +2\alpha -\frac{1}{2} + \frac{1}{2\varepsilon ^2}\Vert G \Vert _{L(L^2([-1,0];H))}^2 \right) \\&\qquad \Vert \mathcal {P}_1(\phi _0,\phi )^T\Vert _{\mathcal {H}}^2. \end{aligned}$$

Assuming \(\varepsilon \) is so small that \(\varepsilon < \sqrt{2\alpha -1}\), we obtain \(\alpha - \frac{1+\varepsilon ^2}{2} > 0\) and

$$\begin{aligned}&\beta +2\alpha -\frac{1}{2} + \frac{1}{2\varepsilon ^2}\Vert G \Vert _{L(L^2([-1,0];H))}^2 \\&\quad = \left( \alpha - \frac{1+ \varepsilon ^2}{2} \right) + \beta + \alpha + \frac{1}{2\varepsilon ^2}\Vert G \Vert _{L(L^2([-1,0];H))}^2 + \frac{\varepsilon ^2}{2} > 0 \end{aligned}$$

which proves the assertion.

(ii) As \(P_0,P_1\) are complementary self-adjoint projections, they induce an orthogonal decomposition \(H = H_0 \oplus H_1\). Thus, for \((\phi _0, \phi ) \in \mathcal {H}\) we have \((\phi _0, \phi ) = (P_0\phi _0, P_0\phi ) + (P_1\phi _0, P_1\phi )\) which gives also an orthogonal decomposition

$$\begin{aligned} \mathcal {H} = \left( H_0 \times L^2([-1,0];H_0) \right) \oplus \left( H_1 \times L^2([-1,0];H_1) \right) . \end{aligned}$$

Applying Remark 6.4 to the Hilbert space \(H_1 \times L^2([-1,0];H_1)\) we find that

$$\begin{aligned} \Vert P_1\phi _0 \Vert _{H}^2 + \int _{-1}^0 \Vert P_1\phi (s) \Vert _H^2 \tau (s)ds \end{aligned}$$

gives rise to a norm on \(H_1 \times L^2([-1,0];H_1)\) which is equivalent to the one given by (6.2) when applied to \((P_1\phi _0, P_1\phi )\). Thus, the norm defined in (6.6) is, indeed, equivalent to the original norm on \(\mathcal {H}\). Let \((\phi _0,\phi )\in D(\mathcal {A})\), so that \(\phi (0)=\phi _0\) and we can write:

$$\begin{aligned}&\langle \mathcal {A} (\phi _0,\phi )^T,(\phi _0,\phi )^T\rangle _{\mathcal {H}^\tau } \\&\quad = \left\langle \left( A\phi _0 + G\phi , \frac{d}{ds}\phi \right) ^T, (\phi _0, \phi )^T\right\rangle _{\mathcal {H}^{\tau }} \\&\quad = \langle A\phi _0, \phi _0 \rangle _H + \langle G\phi , \phi _0 \rangle _H \\&\qquad + \int _{-1}^{0} \left\langle \frac{d}{ds}P_0\phi (s), P_0\phi (s) \right\rangle _H ds + \int _{-1}^{0} \left\langle \frac{d}{ds}P_1\phi (s), P_1\phi (s) \right\rangle _H \tau (s) ds \\&\quad = I_1 + I_2 + I_3 + I_4. \end{aligned}$$

For the first term \(I_1\) we use \(\phi _0 = P_0\phi _0 + P_1\phi _0\), then the fact that \(P_0,P_1\) are self-adjoint projection operators and finally \(P_0A = AP_0\), \(P_1A = AP_1\) on D(A) (similarly to the proof of Proposition 3.1) to find that

$$\begin{aligned} I_1&= \langle P_0A\phi _0, P_0\phi _0 \rangle _H + \langle P_1A\phi _0,P_1\phi _0 \rangle _H \\&= \langle AP_0\phi _0, P_0\phi _0 \rangle _H + \langle AP_1 \phi _0, P_1\phi _0 \rangle _H \le - \alpha \Vert P_0 \phi _0 \Vert _H^2 + \langle AP_1\phi _0, P_1\phi _0 \rangle _H, \end{aligned}$$

where the last inequality follows from (GDC) combined with \(P_1P_0\phi _0 = 0\). Likewise, for the second term we use that \(\mathrm {ran}(G) \subset H_1\) so that \(P_0G = 0\) to obtain

$$\begin{aligned} I_2&= \langle G \phi , P_1 \phi _0 \rangle _H \\&= \langle GP_0 \phi , P_1\phi _0 \rangle _H + \langle GP_1\phi , P_1\phi _0 \rangle _H \\&\le \Vert GP_0 \Vert \Vert P_0\phi \Vert _{L^2([-1,0];H)} \Vert P_1 \phi _0\Vert _{H} + \langle GP_1\phi , P_1\phi _0 \rangle _H \\&\le \frac{\Vert GP_0 \Vert }{2}\Vert P_0\phi \Vert _{L^2([-1,0];H)}^2 + \frac{\Vert GP_0 \Vert }{2}\Vert P_1\phi _0 \Vert _H^2 + \langle GP_1\phi , P_1\phi _0 \rangle _H. \end{aligned}$$

For the third term \(I_3\) we obtain

$$\begin{aligned} I_3 = \frac{1}{2}\int _{-1}^{0} \frac{d}{ds}\Vert P_0 \phi (s) \Vert _H^2 ds \le \frac{1}{2} \Vert P_0 \phi _0 \Vert _H^2. \end{aligned}$$

To summarize, we obtain

$$\begin{aligned}&\langle \mathcal {A} (\phi _0,\phi )^T,(\phi _0,\phi )^T\rangle _{\mathcal {H}^\tau } \\&\quad \le - \left( \alpha - \frac{1}{2} \right) \Vert P_0 \phi _0 \Vert _H^2 + \frac{\Vert GP_0 \Vert }{2} \left( \Vert P_1 \phi _0\Vert _H^2 + \Vert P_0\phi \Vert _{L^2([-1,0];H)}^2 \right) \\&\qquad + \langle AP_1 \phi _0, P_1\phi _0 \rangle _H + \langle GP_1\phi , P_1\phi _0 \rangle _H + \int _{-1}^{0} \left\langle \frac{d}{ds}P_1\phi (s), P_1\phi (s) \right\rangle _H \tau (s) ds \\&\quad = - \left( \alpha - \frac{1}{2} \right) \Vert (\phi _0, \phi )^T\Vert _{\mathcal {H}^{\tau }}^2 + \left( \alpha - \frac{1}{2} \right) \Vert P_1 \phi _0 \Vert _H^2 + \left( \alpha - \frac{1}{2} \right) \int _{-1}^0 \Vert P_0 \phi (s) \Vert _H^2 ds \\&\qquad + \left( \alpha - \frac{1}{2} \right) \int _{-1}^0 \Vert P_1 \phi (s) \Vert _H^2 \tau (s) ds + \frac{\Vert GP_0 \Vert }{2}\Vert P_1 \phi _0\Vert _H^2 + \frac{\Vert GP_0 \Vert }{2} \int _{-1}^0 \Vert P_0 \phi (s)\Vert _{H}^2 ds \\&\qquad + \left\langle \mathcal {A}(P_1\phi _0, P_1\phi )^T, (P_1\phi _0, P_1\phi )^T\right\rangle _{\mathcal {H}^{\tau }} \\&\quad \le - \left( \alpha - \frac{1}{2} \right) \Vert (\phi _0, \phi )^T\Vert _{\mathcal {H}^{\tau }}^2 + \left( \alpha - \frac{1}{2} + \gamma + \frac{\Vert GP_0 \Vert }{2} \right) \Vert P_1 \phi _0 \Vert _H^2 \\&\qquad + \left( \alpha - \frac{1}{2} + \frac{\Vert GP_0 \Vert }{2} \right) \int _{-1}^0 \Vert P_0 \phi (s)\Vert _H^2 ds + \left( \alpha - \frac{1}{2} + \gamma \right) \int _{-1}^0 \Vert P_1\phi (s) \Vert _H^2 \tau (s)ds \\&\quad \le - \left( \alpha - \frac{1}{2} \right) \Vert (\phi _0, \phi )^T\Vert _{\mathcal {H}^{\tau }}^2 + \left( \alpha - \frac{1}{2} + \gamma + \frac{\Vert GP_0 \Vert }{2} \right) \Vert \mathcal {P}_1(\phi _0, \phi )^T\Vert _{\mathcal {H}^{\tau }}^2, \end{aligned}$$

where we have used the fact that \(A - \beta I\) is dissipative so that by Remark 6.4 with \(\gamma = \beta + \tau (0)\)

$$\begin{aligned} \left\langle \mathcal {A}(P_1\phi _0, P_1\phi )^T, (P_1\phi _0, P_1\phi )^T\right\rangle _{\mathcal {H}^{\tau }}&\le \gamma \Vert (P_1\phi _0, P_1 \phi )^T\Vert _{\mathcal {H}^{\tau }}^2 \\&= \gamma \Vert P_1 \phi _0 \Vert _H^2 + \gamma \int _{-1}^0 \Vert P_1 \phi (s) \Vert _H^2 \tau (s)ds. \end{aligned}$$

This proves the assertion. \(\square \)

The next result has been used in the previous proof.

Lemma 6.6

Consider the setting of the stochastic delay equation, i.e., let \((\mathcal {S}_0(t))_{t \ge 0}\), \((\mathcal {S}(t))_{t \ge 0}\), \((\mathcal {A}_0, D(\mathcal {A}_0))\), \((\mathcal {A}, D(\mathcal {A}))\), G, \(P_1\) be as in Sects. 6.1, 6.2 and Theorem 6.2. In particular, suppose that \(P_1S(t) = S(t)P_1\), where S(t) is the semigroup appearing in (6.3). Then

$$\begin{aligned} \mathcal {P}_1\mathcal {S}(t) = \mathcal {S}(t) \mathcal {P}_1, \qquad t \ge 0. \end{aligned}$$

Proof

Let us first consider the case where G satisfies assumption (i) from Proposition 6.5, i.e. G is bounded from \(L^2([-1,0];H)\) to H. Since G is bounded, we obtain from the bounded perturbation theorem (via the Dyson–Phillips series) the representation

$$\begin{aligned} \mathcal {S}(t) = \sum _{n=0}^{\infty }\mathcal {S}_0^{(n)}(t), \end{aligned}$$

where the series converges in \(L(\mathcal {H})\), and \(\mathcal {S}_0^{(n)}(t)\) is inductively defined by

$$\begin{aligned} \mathcal {S}_0^{(0)}(t) = \mathcal {S}_0(t), \qquad \mathcal {S}_0^{(n+1)}(t) = \int _0^t \mathcal {S}_0^{(n)}(s)\begin{pmatrix}0 &{}\quad G \\ 0 &{}\quad 0 \end{pmatrix}\mathcal {S}_0(t-s)ds. \end{aligned}$$

Thus it suffices to prove that

$$\begin{aligned} \mathcal {P}_1 \mathcal {S}_0^{(n)}(t) = \mathcal {S}_0^{(n)}(t)\mathcal {P}_1, \qquad n \ge 0, \ \ t \ge 0. \end{aligned}$$
(6.8)

For \(n = 0\) we use the particular form of \(\mathcal {P}_1\) and \(\mathcal {S}_0(t)\) to find that

$$\begin{aligned} \mathcal {P}_1 \mathcal {S}_0(t)(\phi _0,\phi )^T&= \mathcal {P}_1 \begin{pmatrix}S(t)\phi _0 \\ S_t\phi _0 + T_0(t)\phi \end{pmatrix} \\&= \begin{pmatrix}P_1S(t)\phi _0 \\ S_t\phi _0 + T_0(t)\phi \end{pmatrix} = \begin{pmatrix}S(t)P_1\phi _0 \\ S_t\phi _0 + T_0(t)\phi \end{pmatrix} = \mathcal {S}_0(t)\mathcal {P}_1(\phi _0,\phi )^T, \end{aligned}$$

where we have used the assumption that S(t) commutes with \(P_1\). Now suppose that (6.8) holds for some \(n \ge 0\). Then

$$\begin{aligned} \mathcal {P}_1\mathcal {S}_0^{(n+1)}(t)&= \int _0^t \mathcal {P}_1 \mathcal {S}_0^{(n)}(s)\begin{pmatrix}0 &{}\quad G \\ 0 &{}\quad 0 \end{pmatrix}\mathcal {S}_0(t-s)ds \\&= \int _0^t\mathcal {S}_0^{(n)}(s)\begin{pmatrix}0 &{}\quad G \\ 0 &{}\quad 0 \end{pmatrix}\mathcal {S}_0(t-s) \mathcal {P}_1ds = \mathcal {S}_0^{(n+1)}(t)\mathcal {P}_1, \end{aligned}$$

where we have used that

$$\begin{aligned} \mathcal {P}_1 \begin{pmatrix}0 &{}\quad G \\ 0 &{}\quad 0 \end{pmatrix}(\phi _0,\phi )^T&= \mathcal {P}_1 (G\phi , 0)^T\\&= (P_1G\phi , 0)^T = (GP_1\phi ,0)^T = \begin{pmatrix}0 &{}\quad G \\ 0 &{}\quad 0 \end{pmatrix} \mathcal {P}_1(\phi _0, \phi )^T. \end{aligned}$$

This completes the proof for the case where \(G: L^2([-1,0];H) \longrightarrow H\) is bounded.
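The recursion used above can be checked numerically in finite dimensions, where all operators are matrices and the series can be truncated; in the following sketch the generator and the bounded perturbation are arbitrary choices.

```python
import numpy as np
from scipy.linalg import expm

# Finite-dimensional sketch of the Dyson-Phillips series: for matrices A0 and
# a bounded perturbation B one has e^{t(A0+B)} = sum_n S^(n)(t), where
# S^(0)(t) = e^{t A0} and S^(n+1)(t) = int_0^t S^(n)(s) B e^{(t-s) A0} ds.
A0 = np.array([[-1.0, 0.0], [0.0, -0.5]])   # arbitrary illustrative generator
B  = np.array([[0.0, 0.3], [0.1, 0.0]])     # arbitrary bounded perturbation
t, m = 1.0, 400
grid = np.linspace(0.0, t, m)
S0 = np.array([expm(s * A0) for s in grid]) # S^(0) on a uniform grid

current, total = S0.copy(), S0[-1].copy()
for _ in range(5):                          # add five Dyson-Phillips terms
    nxt = np.zeros_like(current)
    for i in range(1, m):                   # trapezoidal rule in s; note that
        integrand = np.einsum(              # S0[i-j] = e^{(t_i - t_j) A0}
            'jab,bc,jcd->jad', current[:i + 1], B, S0[i::-1])
        nxt[i] = np.trapz(integrand, grid[:i + 1], axis=0)
    current = nxt
    total += current[-1]

print(np.abs(total - expm(t * (A0 + B))).max())  # ≈ 0 up to quadrature error
```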

Let us now consider the case where condition (ii) from Proposition 6.5 holds. Following [5, Theorem 3.29], we know that the semigroup \((\mathcal {S}(t))_{t \ge 0}\) is constructed as a Miyadera–Voigt perturbation and hence, due to [14, Chapter III, Corollary 3.15], has a series representation of the form

$$\begin{aligned} \mathcal {S}(t) = \sum _{n=0}^{\infty }\overline{V}^n\mathcal {S}_0(t), \end{aligned}$$

where \(\overline{V}\) denotes the closure of the operator

$$\begin{aligned} F \longmapsto VF(t) := \int _0^t F(s) \begin{pmatrix}0 &{}\quad G \\ 0 &{}\quad 0 \end{pmatrix} \mathcal {S}_0(t-s)ds, \end{aligned}$$

where \(F \in C([0,t_0]; L_s(\mathcal {H}))\) (for some small but fixed \(t_0 > 0\)) and \(L_s(\mathcal {H})\) denotes the space of bounded linear operators over \(\mathcal {H}\) equipped with the strong operator topology. Following the same computations as in the first case, we can prove that \(\mathcal {P}_1 V^n\mathcal {S}_0(t) = V^n \mathcal {S}_0(t)\mathcal {P}_1\) and hence \(\mathcal {P}_1 \overline{V}^n\mathcal {S}_0(t) = \overline{V}^n \mathcal {S}_0(t) \mathcal {P}_1\). This proves the assertion also in this case. \(\square \)