1 Introduction

Pearson diffusions [29] constitute a family of stochastic processes X(t) that are solutions of Stochastic Differential Equations (SDEs) of the form,

$$\begin{aligned} dX(t)=\mu (X(t))dt+\sigma (X(t))dW(t), \end{aligned}$$

where W(t) is a standard Brownian motion, \(\mu (x)\) is a polynomial of degree at most 1, \(\sigma ^2(x)\) is a polynomial of degree at most 2 and the equation has to be considered in the context of Itô calculus. In particular, if such equation admits a weak ergodic solution with diffusion space \(E \subseteq {\mathbb {R}}\), then its stationary distribution admits a density function m(x) that satisfies the so called Pearson equation [58]:

$$\begin{aligned} \frac{m'(x)}{m(x)}=\frac{b_0+b_1x}{d_0+d_1x+d_2x^2}, \ x \in E \end{aligned}$$

where the coefficient are determined by \(\mu (x)\) and \(\sigma ^2(x)\). For this reason, such kind of diffusions are statistically tractable with many different tools. For instance, they naturally arise while studying non-linear time series linked to dynamical models with random perturbations (see, for instance, [55]). An important tool to deal with Pearson diffusions is the spectral decomposition, proposed in [47]. Indeed Pearson diffusions can be subdivided in three spectral categories depending on the spectrum of the generator. In the first spectral category, we have Pearson diffusions whose generator admit purely discrete spectrum and then a spectral decomposition of the transition density of the process follows easily. In the second spectral category, we have Pearson diffusions whose generator admits a discrete spectrum and an absolutely continuous spectrum (both simple) separated by a cut-off value. Such Pearson diffusions are more difficult to study due to the fact that the eigenfunctions in the continuous spectrum are expressed in terms of hypergeometric functions, as shown in [45] and [11]. Finally, in the third spectral category we only have Student diffusions, whose generator admits a simple discrete spectrum and an absolutely continuous spectrum of multiplicity two separated by a cut-off value. This case has been studied in [46] and [5]. Second and third spectral category go under the name of heavy tailed Pearson diffusions and their spectral properties have been analysed in [10].

In the modern theory of diffusion processes, anomalous diffusions have shown to be quite useful (see [32, 53]) and stochastic models for such diffusions have been widely studied (see [31, 50, 64]). Anomalous diffusions models naturally arise when studying the motion of particles in heterogeneous media. For such kind of models, different analytic techniques are known (see, for instance, [21]). Such models fall into the class of fractional motions, which are described via a Langevin-type equation (see [28]). A class of anomalous diffusions can be seen as limit of continuous time random walks whose inter-jump times have infinite mean. Such limits lead to the appearance of the fractional Riemann-Liouville or the fractional Caputo derivative in the Fokker–Planck equation (see for instance [51, 69], but also [53, Equation (101)], that can be rewritten in terms of Caputo derivatives). Despite Laplace transforms methods let us obtain the solution up to inverse Laplace transform, fractional Fokker–Planck equations are not usually solved explicitly. In this context, the spectral decomposition of Pearson diffusions has been revealed to be a quite powerful tool to explicitly express strong solutions of fractional diffusion equations. This lead to the introduction of fractional Pearson diffusions: in [43] the first spectral category was covered, in [44] the authors focused on the second spectral category, while, up to our knowledge, there were no known methods to extend the spectral decomposition method to the fractional case in the third spectral category. In the second case, to express strong solutions of fractional Kolmogorov equations, the semigroup approach presented in [13] has been used. In both cases, the stochastic representation of such solutions is given by means of time-changed Pearson diffusions (with inverse stable subordinators). Similar strategies have been shown to work for lattice approximation of fractional Pearson diffusions: in [7] the spectral decomposition of a fractional immigration-death process (that is the lattice approximation of the Ornstein-Uhlenbeck process) is presented. On the other hand, subordinated Pearson diffusions (in particular the subordinated Jacobi process) have been shown to be useful tools to obtain large deviation principles (see [26]). Let us also remark that spectral properties of time-changed Markov processes can be used to determine the behaviour of their correlation structure, as shown in [42, 56].

However, the classical fractional model is not the unique way one can achieve an anomalous diffusion. Indeed, anomalous diffusions can be introduced also by considering more general relaxation patterns than the exponential or the Mittag-Leffler ones (see [54]). Different inter-jump times in continuous time random walks and different relaxation patterns lead to a wider class of anomalous diffusions obtained via subordination of a random process (see [70]). In this case, one has to consider a more general non-local derivative in place of the Caputo one in Fokker–Planck equations. In [37] and [72] other non-local derivatives have been constructed, with the aid of Bernstein functions [65], that are the Laplace exponents of subordinators. A non-local derivative that can be defined in this way is, for instance, the tempered fractional derivative, which is an alternative to the classical fractional derivative that preserve the finiteness of the mean of the associated subordinator. The tempered model has been widely used to describe anomalous diffusions in heterogeneous media (in particular in geophysics, as in [48, 76]) and its properties are well-known (see, for instance, [2]). A wide discussion on relaxation patterns can be found in [52]. Generalized fractional calculus is strictly linked with the definition of time-changed Markov processes. Indeed, for instance, in [23, 72] a link between abstract generalized fractional differential equations and time-changed semigroups is established, while in [25] properties of the Green measures of time-changed Markov properties are exploited. Moreover, such non-local derivatives have been used for instance to define a class of time-non-local birth-death processes (see [8]) whose stationary distributions fall into the Katz family, which are discrete analogous of Pearson diffusions of the first spectral category. On the other hand, a first generalization to the general time-non-local setting of a Pearson diffusion (in particular the Ornstein-Uhlenbeck process) has been achieved in [30]. Let us also recall that in [26] the authors study a subordinated Jacobi process (with an inverse Gaussian subordinator), thus leading to a time-changed version of the aforementioned process.

The aim of the paper is to give a unified framework for working with time-non-local generalizations of Pearson diffusions and time-non-local equations that are linked to them, exhibiting exact strong solutions for a particular class of time-non-local advection-diffusion equations. In particular, we focus on time-non-local Cauchy problems of the form

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t^\varPhi u(t,y)={\mathcal {G}}u(t,y) &{} t>0, \ y \in E \\ u(0,y)=g(y) &{} y \in E \end{array}\right. } \end{aligned}$$
(1)

and

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t^\varPhi v(t,x)={\mathcal {F}}v(t,x) &{} t>0, \ x \in E \\ v(0,x)=f(x) &{} x \in E \end{array}\right. } \end{aligned}$$
(2)

where \({\mathcal {G}}\) and \({\mathcal {F}}\) are respectively the generator and the Fokker–Planck operator of a Pearson diffusion and \(\partial _t^\varPhi \) is a Caputo-type non-local derivative linked to a Bernstein function \(\varPhi \). We use the notion of time-changed process to introduce the family of time-non-local Pearson diffusions and then provide the spectral decomposition of the transition densities of such diffusions. Moreover, we use both the spectral decomposition and a semigroup approach to exploit strong solutions of the aforementioned time-non-local Cauchy problems. Let us remark that in this paper we cover also the case of the third spectral category, for which spectral results for time-non-local generalizations were (up to our knowledge) unknown even in the standard fractional case.

Section 2 is dedicated to a complete recap on spectral properties of classical Pearson diffusions, while Sect. 3 presents some basic properties of Bernstein functions, inverse subordinators and non-local derivatives. We also specify the semigroup theory results given in [13, 23, 72] to the case of complete Bernstein functions, in which better regularity can be proven. The proof of such result is quite technical, thus it is left in Appendix A. The definition and some preliminary properties of time-non-local Pearson diffusions are presented in Sect. 4, with particular attention to the existence of a transition probability density. In Sect. 5 we focus on time-non-local Pearson diffusions of the first spectral category, extending the results given in [43] to general Bernstein functions. In Sect. 6 we consider time-non-local Pearson diffusions of the second (adapting the approach given in [44]) and the third spectral categories. Section 7 is devoted to stochastic representation results of strong solutions of Eqs. (1) and (2). Finally, in Sect. 8 we explore some further properties of the time-non-local Pearson diffusions, as limit distributions, first-order stationarity and long/short-range dependence.

2 Pearson Diffusions and Their Spectral Classification

From now on let us fix a filtered probability space \((\varOmega , \varSigma , {\mathcal {F}}_t, {\mathbb {P}})\). Let us give the definition of Pearson diffusion as presented in [29].

Definition 1

A Pearson diffusion X(t) is a diffusion process satisfying the following stochastic differential equation

$$\begin{aligned} dX(t)=\mu (X(t))dt+\sigma (X(t))dW(t), \end{aligned}$$
(3)

where W(t) is a standard Brownian motion and \(\mu (x)\) and \(\sigma ^2(x)\) are polynomials respectively of at most first and second degree. The stochastic differential equation is interpreted in the sense of Itô calculus.

In particular let us set

$$\begin{aligned} \mu (x)=a_0+a_1x&D(x)=\frac{\sigma ^2(x)}{2}=d_0+d_1x+d_2x^2. \end{aligned}$$

The diffusion space of X(t) is given by an interval \(E=(l,L)\) in which the polynomial D(x) is positive (hence the function \(\sigma (x)\) is real and non-zero).

Moreover, let us recall that the family of operators \((T(t))_{t \ge 0}\) acting on \(C_0(E)\), i.e. the space of continuous functions on the closure \({\bar{E}}\) of E that are 0 at infinity (if E is bounded then \(C_0(E)=C({\bar{E}})\)), given by

$$\begin{aligned} T(t)f(x)={\mathbb {E}}_x[f(X(t))], \end{aligned}$$

where \({\mathbb {P}}_x(\cdot )={\mathbb {P}}(\cdot |X(0)=x)\), is a uniformly bounded strongly continuous \(C_0\)-semigroup on \(L^2(E)\) (this is proved, for instance, in [43, 44]). In particular its generator \({\mathcal {G}}\) is defined as

$$\begin{aligned} {\mathcal {G}}g(y)=\left[ \mu (y)\frac{d }{d y}+D(y)\frac{d^{2} }{d y^{2}}\right] g(y), \end{aligned}$$

with operator core \(C^2_b(E)\), that is the space of bounded continuous functions on E with bounded continuous first and second derivatives.

Concerning the process X(t), the transition probability density p(txy) is well defined for \(x,y \in E\). In particular, it is solution of the following Cauchy problem (the backward problem)

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial p}{\partial t}(t,x;y)={\mathcal {G}}p(t,x;y) &{} t>0, \ y \in E\\ p(0,x;y)=\delta _x(y) &{} y \in E \end{array}\right. } \end{aligned}$$
(4)

where \(\delta _x\) is a Dirac delta centred in \(x \in E\).

On the other hand, we can also define the Fokker–Planck operator, or just forward operator, as

$$\begin{aligned} {\mathcal {F}}f(x)=-\frac{d }{d x}(\mu (x)f(x))+\frac{d^{2} }{d x^{2}}(D(x)f(x)), \end{aligned}$$

with operator core \(C^2_b(E)\). In particular, p(txy) is also solution of the Cauchy problem (the forward problem):

$$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial p}{\partial t}(t,x;y)={\mathcal {F}}p(t,x;y) &{} t>0, \ x \in E\\ p(0,x;y)=\delta _y(x) &{} x \in E. \end{array}\right. } \end{aligned}$$
(5)

In particular, if the process X(t) admits a stationary measure \({\mathbf {m}}\), then its density m(x) must satisfy a stationary version of the forward equation, that becomes:

$$\begin{aligned} \frac{m'(x)}{m(x)}=\frac{(a_0-d_1)+(a_1-2d_2)x}{d_0+d_1x+d_2x^2}. \end{aligned}$$
(6)

Such equation is called Pearson equation, by the fact that it was introduced in [58] to classify some important classes of distributions. From now on we will only work with Pearson diffusions that admit a stationary measure that is also the limit measure of the process. In particular this means that, up to a re-parametrization, \(\mu (x)=-b_0(x-b_1)\) for some \(b_0>0\) and \(b_1 \in {\mathbb {R}}\).

After such observation, we can recognize six different Pearson diffusions depending on the coefficients of D:

  • If \(D\equiv d_0\), then X(t) is a Ornstein–Uhlenbeck (OU) process and the stationary distribution is a Gaussian distribution;

  • If \(D(x)=d_0+d_1x\), then X(t) is a Cox–Ingersoll–Ross (CIR) process and the stationary distribution is a Gamma distribution;

  • If \(D(x)=d_0+d_1x+d_2x^2\) with \(d_2<0\), then X(t) is a Jacobi process and the stationary distribution is a Beta distribution;

  • If \(D(x)=d_0+d_1x+d_2x^2\) with \(d_2>0\) and the discriminant \(\varDelta _D>0\), then X(t) is a Fisher–Snedecor (FS) process and the stationary distribution is a Fisher–Snedecor distribution;

  • If \(D(x)=d_0+d_1x+d_2x^2\) with \(d_2>0\) and the discriminant \(\varDelta _D=0\), then X(t) is a reciprocal Gamma (RG) process and the stationary distribution is a reciprocal Gamma distribution;

  • If \(D(x)=d_0+d_1x+d_2x^2\) with \(d_2>0\) and the discriminant \(\varDelta _D<0\), then X(t) is a Student process and the stationary distribution is a Student distribution.

Depending on the property of the spectrum of the generator \({\mathcal {G}}\), in [44] these distributions were subdivided in three categories depending on the spectral category of Linetski classification (see [47, Theorem 3.2]):

  • The first spectral category contains the OU, the CIR and the Jacobi processes: their generator \({\mathcal {G}}\) admits purely discrete spectrum with infinitely many single non-negative eigenvalues \((\lambda _n)_{n \in {\mathbb {N}}}\);

  • The second spectral category contains the FS and the RG processes: their generator \({\mathcal {G}}\) admits a discrete part and an absolutely continuous part that are disjoint and both of multiplicity one;

  • The third spectral category contains only the Student processes: its generator \({\mathcal {G}}\) admits a discrete part of multiplicity one and a disjoint absolutely continuous part of multiplicity two.

Such classification is based on the oscillatory/non-oscillatory behaviour of the endpoints of the Sturm–Liouville equations \({\mathcal {G}}f=-\lambda f\) (see [3, 27, 74]). Moreover, let us stress out that for the eigenvalues \(\lambda _n\) in the discrete spectrum of \({\mathcal {G}}\), the eigenfunctions are given by classical orthogonal polynomials \(P_n\), each one of degree n.

Let us give some details on each Pearson diffusion. In particular we will re-parametrize again the polynomials D(x) and \(\mu (x)\) in a form that will make the writing of the parameters of the stationary distributions easier. Moreover, we define the normalized polynomials, where the normalization constant is chosen with respect to the stationary density m(x), which is also the orthogonality density of the polynomials, i.e.

$$\begin{aligned} \int _E P_n(x)P_l(x)m(x)dx=\delta _{n,l}, \qquad n, l \in \{0, 1, \dots \} \end{aligned}$$

where \(\delta _{n,l}\) is the Kronecker delta symbol. In the following we will denote by \(Q_n\) the normalized polynomials and with \(K_n\) the normalization constants.

2.1 Pearson Diffusions of Spectral Category I

2.1.1 The OU Process

The OU process is solution of the SDE

$$\begin{aligned} dX(t)=-\theta (X(t)-\mu )dt+\sqrt{2\theta \sigma ^2}dW(t), \ t \ge 0 \end{aligned}$$

as \(\theta >0\) and \(\mu ,\sigma \in {\mathbb {R}}\). The diffusion space is given by \(E={\mathbb {R}}\) and its stationary density is a Gaussian one, given by

$$\begin{aligned} m(x)=\frac{1}{\sqrt{2\pi \sigma ^2}}e^{-\frac{(x-\mu )^2}{2\sigma ^2}}, \ x \in {\mathbb {R}}\end{aligned}$$

Concerning the eigenvalue equation \({\mathcal {G}}f=-\lambda f\), it admits solutions for \(\lambda _n=\theta n\) as \(n \ge 0\) and its solutions are the Hermite polynomials (see [66]), defined by means of the Rodrigues formula (see [34])

$$\begin{aligned} H_n(x)=(-1)^n(m(x))^{-1}\frac{d^{n} }{d x^{n}}m(x), \ x \in {\mathbb {R}}, \ n \in {\mathbb {N}}_0 \end{aligned}$$

with normalization constants \(K_n=\frac{\sigma ^n}{\sqrt{n!}}\).

2.1.2 The CIR Process

The CIR process is solution of the SDE

$$\begin{aligned} dX(t)=-\theta \left( X(t)-\frac{b}{a}\right) dt+\sqrt{\frac{2\theta }{a}X(t)}dW(t), \ t \ge 0 \end{aligned}$$

where \(\theta ,a,b>0\). The diffusion space is given by \(E=(0,+\infty )\) and its stationary density is the Gamma one, given by

$$\begin{aligned} m(x)=\frac{a^b}{\varGamma (b)}x^{b-1}e^{-ax}, \ x>0. \end{aligned}$$

Even in this case, the eigenvalue equation \({\mathcal {G}}f=-\lambda f\) admits solutions for \(\lambda _n=\theta n\). The eigenfunctions are given by some linear modifications of Laguerre polynomials (see [66]) \(L_n^{(b-1)}(ax)\) for \(x>0\) and \(n \in {\mathbb {N}}\), where the Laguerre polynomials \(L_n^{(\gamma )}(x)\) are defined by the Rodrigues formula (see [34])

$$\begin{aligned} L_n^{(\gamma )}(x)=\frac{1}{n!}x^{-\gamma }e^x\frac{d^{n} }{d x^{n}}x^{n+\gamma }e^{-x}, \ x \in {\mathbb {R}}, \ \gamma >-1, \ n \in {\mathbb {N}}_0, \end{aligned}$$

with normalization constants \(K_n=\sqrt{\frac{\varGamma (b)n!}{\varGamma (b+n)}}\).

2.1.3 The Jacobi Process

The Jacobi process is solution of the SDE

$$\begin{aligned} dX(t)=-\theta \left( X(t)-\frac{b-a}{a+b+2}\right) dt+\sqrt{\frac{2\theta }{a+b+2}(1-X^2(t))}dW(t), \ t \ge 0 \end{aligned}$$

as \(a,b>-1\). The diffusion space is \(E=(-1,1)\) and its stationary density is a Beta one

$$\begin{aligned} m(x)=(1-x)^a(1+x)^b\frac{\varGamma (a+b+2)}{\varGamma (a+1)\varGamma (b+1)2^{a+b+1}}. \end{aligned}$$

The eigenvalue equation \({\mathcal {G}}f=-\lambda f\) admits solutions for \(\lambda _n=\frac{n \theta (n+a+b+1)}{a+b+2}\) and the eigenfunctions are Jacobi polynomials defined by the Rodrigues formula (see [34])

$$\begin{aligned} P^{(a,b)}_n(x)=\frac{(-1)^n}{2^nn!}(1-x)^{-a}(1+x)^{-b}\frac{d^{n} }{d x^{n}}[(1-x)^{a+n}(1+x)^{b+n}] \end{aligned}$$

with normalization constants \(K_n=\sqrt{\frac{2^{a+b+1}\varGamma (n+a+1)\varGamma (n+b+1)}{(2n+a+b+1)\varGamma (n+a+b+1)n!}}\). Let us recall that some properties of statistical interest for the Jacobi process have been highlighted in [26].

2.2 Pearson Diffusions of Spectral Category II

2.2.1 The FS Process

The FS process is solution of the SDE

$$\begin{aligned} dX(t)=-\theta \left( X(t)-\frac{\beta }{\beta -2}\right) dt+\sqrt{\frac{4\theta }{\alpha (\beta -2)}X(t)(\alpha X(t)+\beta )}dW(t), \ t \ge 0 \end{aligned}$$

as \(\alpha ,\theta >0\) and \(\beta >2\). The diffusion space is \(E=(0,+\infty )\) and its stationary density is a Fisher–Snedecor one:

$$\begin{aligned} m(x)=\frac{\left( \frac{\alpha x}{\alpha x +\beta }\right) ^{\frac{\alpha }{2}}\left( \frac{\beta }{\alpha x +\beta }\right) ^{\frac{\beta }{2}}}{xB\left( \frac{\alpha }{2},\frac{\beta }{2}\right) }, \ x>0. \end{aligned}$$

A spectral analysis of the FS process has been carried in [11]. In particular it is not difficult to see that the Fisher–Snedecor density admits finite even moments up to \(2N_1\) as \(N_1=\lfloor \frac{\beta }{4}\rfloor \). Thus we have a finite number of simple eigenvalues in the discrete spectrum of \(-{\mathcal {G}}\). The eigenvalues are given by

$$\begin{aligned} \lambda _n=\frac{\theta }{\beta -2}n(\beta -2n), \ n=0,\ldots , \left\lfloor \frac{\beta }{4}\right\rfloor \end{aligned}$$

and the respective eigenfunctions are the Fisher–Snedecor polynomials \(F_n^{(\alpha ,\beta )}(x)\), defined by the Rodrigues formula

$$\begin{aligned} F_n^{(\alpha ,\beta )}(x)=x^{1-\frac{\alpha }{2}}(\alpha x+ \beta )^{\frac{\alpha }{2}+\frac{\beta }{2}}\frac{d^{n} }{d x^{n}}\left[ 2^nx^{\frac{\alpha }{2}+n-1}(\alpha x+\beta )^{n-\frac{\alpha }{2}-\frac{\beta }{2}}\right] \end{aligned}$$

with normalizing constant

$$\begin{aligned} K_n=(-1)^n\sqrt{\frac{B\left( \frac{\alpha }{2},\frac{\beta }{2}\right) }{n! (2\beta )^{2n}B\left( \frac{\alpha }{2}+n,\frac{\beta }{2}-2n\right) }\left[ \prod _{k=1}^{n}\left( \frac{\beta }{2}+k-2n\right) ^{-1}\right] }. \end{aligned}$$

The absolutely continuous spectrum is \(\sigma _{ac}(-{\mathcal {G}})=(\varLambda _1,+\infty )\) where the cut-off \(\varLambda _1\) is given by

$$\begin{aligned} \varLambda _1=\frac{\theta \beta ^2}{8(\beta -2)}. \end{aligned}$$

For \(\lambda \in (\varLambda _1,+\infty )\), the fundamental solutions of the Sturm–Liouville equation \({\mathcal {G}}f=-\lambda f\) are given in terms of hypergeometric functions. A detailed study is made in [11]. In any case, the solution that appears in the absolutely continuous part of the spectral decomposition is

$$\begin{aligned} f_1(x,-\lambda )={}_2F_1\left( -\frac{\beta }{4}+\varDelta _1(\lambda ),-\frac{\beta }{4}-\varDelta _1(\lambda );\frac{\alpha }{2};-\frac{\alpha }{\beta }x\right) \end{aligned}$$

where

$$\begin{aligned} \varDelta _1(\lambda )=\sqrt{\frac{\beta ^2}{16}-\frac{\lambda (\beta -s)}{2\theta }}. \end{aligned}$$

Let us recall that the solution \(f_1\) is found by making use of the theory of Kummer’s solutions for Hypergeometric Equations (see [59, 68]).

2.2.2 The RG Process

The RG process is solution of the SDE

$$\begin{aligned} dX(t)=-\theta \left( X(t)-\frac{\alpha }{\beta -1}\right) dt+\sqrt{\frac{2\theta }{\beta -1}X^2(t)}dW(t), \ t \ge 0 \end{aligned}$$

as \(\alpha ,\theta >0\) and \(\beta >1\). The diffusion space is \(E=(0,+\infty )\) and its stationary density is a reciprocal Gamma one:

$$\begin{aligned} m(x)=\frac{\alpha ^\beta }{\varGamma (\beta )}x^{-\beta -1}e^{-\frac{\alpha }{x}}, \ x>0. \end{aligned}$$

As for the Fisher–Snedecor distribution, let us recall that such density admits finite even moments up to \(2N_2\) as \(N_2=\left\lfloor \frac{\beta }{2}\right\rfloor \). A complete spectral analysis of the RG process has been made in [45]. We have a finite number of simple eigenvalues in the discrete spectrum of \(-{\mathcal {G}}\), given by

$$\begin{aligned} \lambda _n=n\theta \frac{\beta -n}{\beta -1}, \ n=0,\ldots , \left\lfloor \frac{\beta }{2}\right\rfloor . \end{aligned}$$

The eigenfunctions are Bessel polynomials \(B_n^{(\alpha ,\beta )}\) defined by the Rodrigues formula

$$\begin{aligned} B_n^{(\alpha ,\beta )}(x)= x^{\beta +1}e^{\frac{\alpha }{x}}\frac{d^{n} }{d x^{n}}\left[ x^{2n-(\beta +1)}e^{-\frac{\alpha }{2}}\right] , \end{aligned}$$

with normalizing constant

$$\begin{aligned} K_n=\frac{(-1)^n}{\alpha ^n}\sqrt{\frac{(\beta -2n)\varGamma (\beta )}{\varGamma (n+1)\varGamma (\beta -n+1)}}. \end{aligned}$$

The absolutely continuous spectrum is \(\sigma _{ac}(-{\mathcal {G}})=(\varLambda _2,+\infty )\) where the cut-off \(\varLambda _2\) is given by

$$\begin{aligned} \varLambda _2=\frac{\theta \beta ^2}{4(\beta -1)}. \end{aligned}$$

For \(\lambda \in (\varLambda _2,+\infty )\), the eigenfunction we will use is given by

$$\begin{aligned} f_2(x,-\lambda )=\alpha ^{\frac{\beta +1}{2}}{}_2F_0\left( -\frac{\beta }{2}+\varDelta _2(\lambda ),-\frac{\beta }{2}-\varDelta _2(\lambda );\, ;-\frac{x}{\alpha }\right) \end{aligned}$$

where

$$\begin{aligned} \varDelta _2(\lambda )=\frac{1}{2}\sqrt{\beta ^2-\frac{4\lambda (\beta -1)}{\theta }}. \end{aligned}$$

2.2.3 Spectral Decomposition Theorem for Spectral Category II

Let us recall the spectral decomposition theorem for Pearson diffusions of spectral category II as given in [11, 45].

Theorem 1

Let X(t) be a Pearson diffusion of spectral category II and, if X(t) is a FS process, let \(\alpha >2\) with \(\alpha \not = 2(m+1)\) for any \(m \in {\mathbb {N}}\). The density \(p(t,x;x_0)\) admits the following spectral decomposition:

$$\begin{aligned} p(t,x;x_0)=p_d(t,x;x_0)+p_c(t,x;x_0) \end{aligned}$$

where

$$\begin{aligned} p_d(t,x;x_0)=m(x)\sum _{n=0}^{N_j}e^{-\lambda _n t}Q_n(x_0)Q_n(x) \end{aligned}$$

and

$$\begin{aligned} p_c(t,x;x_0)=\frac{m(x)}{\pi }\int _{\varLambda _j}^{+\infty }e^{-\lambda t}a_j(\lambda )f_j(x_0,-\lambda )f_j(x,-\lambda )d\lambda , \end{aligned}$$

where \(j=1,2\) and

$$\begin{aligned} a_1(\lambda )&=(-i\varDelta _1(\lambda ))\left| \frac{\sqrt{B\left( \frac{\alpha }{2},\frac{\beta }{2}\right) }\varGamma \left( -\frac{\beta }{4}+\varDelta _1(\lambda )\right) \varGamma \left( \frac{\alpha }{2}+\frac{\beta }{4}+\varDelta _1(\lambda )\right) }{\varGamma \left( \frac{\alpha }{2}\right) \varGamma (1+2\varDelta _1(\lambda ))}\right| ^2;\\ a_2(\lambda )&=(-i\varDelta _2(\lambda ))\left| \frac{\sqrt{\varGamma (\beta )}\varGamma \left( -\frac{\beta }{2}+\varDelta _2(\lambda )\right) }{\alpha ^{\frac{\beta +1}{2}}\varGamma (1+2\varDelta _2(\lambda ))}\right| ^2. \end{aligned}$$

2.3 Pearson Diffusions of Spectral Category III: Student Diffusions

The Student process is solution of the SDE

$$\begin{aligned} dX(t)=-\theta (X(t)-\mu )dt+\sqrt{\frac{2\theta \delta ^2}{\nu -1}\left( 1+\left( \frac{X(t)-\mu '}{\delta }\right) ^2\right) }dW(t), \ t \ge 0 \end{aligned}$$

where \(\theta ,\delta >0\), \(\mu ,\mu ' \in {\mathbb {R}}\) and \(\nu >1\). If \(\mu =\mu '\) we refer to it as the symmetric Student process, while for \(\mu \not = \mu '\) it is called skew Student process.

In [10] it is shown that such process admits diffusion space \(E={\mathbb {R}}\) and stationary density

$$\begin{aligned} m(x)= & {} \frac{\varGamma \left( \frac{\nu +1}{2}\right) }{\delta \sqrt{\pi } \varGamma \left( \frac{\nu }{2} \right) } \, \prod \limits _{k=0}^{\infty }\left( 1 + \left( \frac{(\mu - \mu ^{\prime })(\nu -1)}{\delta (\nu +1+2k)}\right) ^{2} \right) ^{-1} \\&\times \frac{\exp \left\{ \frac{(\mu -\mu ^{\prime })(\nu -1)}{\delta }\arctan {\left( \frac{x - \mu ^{\prime }}{\delta } \right) } \right\} }{\left[ 1 + \left( \frac{x - \mu ^{\prime }}{\delta }\right) ^{2} \right] ^{\frac{\nu +1}{2}}}, \ x \in {\mathbb {R}}. \end{aligned}$$

The Student density admits finite even moments up to \(2N_3\) as \(N_3=\left\lfloor \frac{\nu }{2}\right\rfloor \). Thus the discrete spectrum of \((-{\mathcal {G}})\) is made of a finite number of simple eigenvalues given by

$$\begin{aligned} \lambda _n=\frac{n\theta (\nu -n)}{\nu -1}, \ n=0,\ldots , \left\lfloor \frac{\nu }{2}\right\rfloor \end{aligned}$$

while the eigenfunctions are the generalized Routh-Romanovski polynomials \({\widetilde{R}}_n(x)\) defined by the Rodrigues formula

$$\begin{aligned} {\widetilde{R}}_n(x)&=\left( \frac{\delta ^2}{\nu -1}\right) ^n\left( 1+\left( \frac{x-\mu '}{\delta }\right) ^2\right) ^{\frac{\nu +1}{2}} \\&\quad \times \exp \left\{ \frac{(\mu '-\mu )(\nu -1)}{\delta }\arctan \left( \frac{x-\mu '}{\delta }\right) \right\} \\&\quad \times \frac{d^{n} }{d x^{n}}\left( \left( 1+\left( \frac{x-\mu '}{\delta }\right) ^2\right) ^{n-\frac{\nu +1}{2}}\right. \\&\quad \times \left. \exp \left\{ \frac{(\mu -\mu ')(\nu -1)}{\delta }\arctan \left( \frac{x-\mu '}{\delta }\right) \right\} \right) \end{aligned}$$

with normalization constant

$$\begin{aligned} K_n&=\left( \frac{1-\nu }{2\delta }\right) ^n\\&\quad \times \sqrt{\frac{(2n-\nu -2)\sin [\pi (2n-\nu +1)]\varGamma (\nu +n-1)\varGamma ^2\left( \frac{\nu +1}{2}-n\right) }{(\nu -1)(-1)^nn!\delta 2^{1-\nu }\pi ^2 c(\nu ,\delta ,\mu ,\mu ')}}\\&\quad \times \sqrt{\prod _{k=0}^{+\infty }\left[ 1+\left( \frac{(\nu -1)(\mu -\mu ')}{\delta (\nu +1-2n+2k)}\right) ^{2}\right] ^{-1}}, \end{aligned}$$

where \(c(\nu ,\delta ,\mu ,\mu ')\) is a suitable constant. If \(\mu =\mu '\) we obtain the classical Routh-Romanovski polynomials \(R_n(x)\) and their normalizing constant (up to a multiplicative constant). Concerning the absolutely continuous spectrum \(\sigma _{ac}(-{\mathcal {G}})=(\varLambda _3,+\infty )\), where the cut-off \(\varLambda _3\) is given by

$$\begin{aligned} \varLambda _3=\frac{\theta \nu ^2}{4(\nu -1)}, \end{aligned}$$

we have to recall that its elements are of multiplicity 2.

Concerning the eigenfunctions, in place of the monotonic solutions considered in [10], we can consider two independent linear combination of them. Indeed, by using the same strategy adopted in [46], referring to Kummer solutions (see [59, 68]), we can consider one of the solutions of \({\mathcal {G}}f=-\lambda f\) and its complex conjugate. In [46] the eigenfunctions are explicitly expressed in the case \(\mu =\mu '\), while for \(\mu \not = \mu '\), the eigenfunctions are studied in [5], by reconsidering the seminal paper [75]. For the sake of shortness, we omit the explicit formula for such eigenfunctions, but we refer to it as \(f_3(x;-\lambda )\) and \({\overline{f}}_3(x;-\lambda )\). Using Linetski approach (see [47]) as done in [10], we obtain the following spectral decomposition theorem (see also [46] for the case \(\mu =\mu '\)).

Theorem 2

The Student process is ergodic. Moreover, if \(\nu \not = 2k-1\) for any \(k \in {\mathbb {N}}\), the density \(p(t,x;x_0)\) admits the following spectral decomposition:

$$\begin{aligned} p(t,x;x_0)=p_d(t,x;x_0)+p_c(t,x;x_0) \end{aligned}$$

where

$$\begin{aligned} p_d(t,x;x_0)=m(x)\sum _{n=0}^{N_3}e^{-\lambda _n t}Q_n(x_0)Q_n(x) \end{aligned}$$

and

$$\begin{aligned} p_c(t,x;x_0)= & {} m(x)\int _{\varLambda _3}^{+\infty }e^{-\lambda t}\left( \frac{f_3(x_0,-\lambda )f_3(x,-\lambda )}{\left\| f_3(\cdot ,-\lambda ) \right\| _{L^2(m(dx))}^2}\right. \\&\left. +\frac{{\bar{f}}_3(x_0,-\lambda ){\bar{f}}_3(x,-\lambda )}{\left\| {\bar{f}}_3(\cdot ,-\lambda ) \right\| _{L^2(m(dx))}^2}+\frac{f_3(x_0,-\lambda ){\bar{f}}_3(x,-\lambda )+{\bar{f}}_3(x_0,-\lambda )f_3(x,-\lambda )}{\left\| {\bar{f}}_3(\cdot ,-\lambda ) \right\| _{L^2(m(dx))}\left\| f_3(\cdot ,-\lambda ) \right\| _{L^2(m(dx))}}\right) d\lambda . \end{aligned}$$

To work with Student diffusions we will make use of the speed density. Let us recall that for any solution of an SDE of the form (3) (where \(\mu (x)\) and \(\sigma ^2(x)\) are not necessarily polynomials) with diffusion space E, the speed density is defined as

$$\begin{aligned} sp(x)=\frac{2}{\sigma ^2(x)}e^{\int _c^x\frac{\mu (y)}{\sigma ^2(y)}dy}, \ x \in E, \end{aligned}$$

where \(c \in E\) is fixed. The speed density is uniquely defined up to a multiplicative constant that we can suppose is set to 1. For further details on the speed density, we refer to [20]. In the specific case of the Student diffusion we have

$$\begin{aligned} sp(x)=\frac{ \exp \left\{ \frac{(\mu -\mu ')(\nu -1)}{\delta } \; \arctan \left( \frac{x-\mu '}{\delta }\right) \right\} }{\left( 1+\left( \frac{x-\mu '}{\delta }\right) ^{2}\right) ^{\frac{\nu +1}{2}}}, x \in {\mathbb {R}}, \end{aligned}$$

where \(\int _{-\infty }^{+\infty }sp(x)dx=M<+\infty \).

3 Inverse Subordinators and Non-local Convolution Derivatives

Now let us introduce our main object of study. Let us denote by \(\mathcal {BF}\) the convex cone of Bernstein functions, that is to say \(\varPhi \in \mathcal {BF}\) if and only if \(\varPhi \in C^\infty ({\mathbb {R}}^+)\), \(\varPhi (\lambda ) \ge 0\) and for any \(n \in {\mathbb {N}}\)

$$\begin{aligned} (-1)^n\frac{d^{n} \varPhi }{d \lambda ^{n}}(\lambda )\le 0. \end{aligned}$$

In particular it is known that for \(\varPhi \in \mathcal {BF}\) the following Lévy–Khintchine representation ([65]) is given

$$\begin{aligned} \varPhi (\lambda )=a_\varPhi +b_\varPhi \lambda +\int _0^{+\infty }(1-e^{-\lambda t})\nu _\varPhi (dt) \end{aligned}$$
(7)

where \(a_\varPhi ,b_\varPhi \ge 0\) and \(\nu _\varPhi \) is a Lévy measure on \({\mathbb {R}}^+\) such that

$$\begin{aligned} \int _0^{+\infty }(1 \wedge t)\nu _\varPhi (dt)<+\infty . \end{aligned}$$
(8)

The triple \((a_\varPhi ,b_\varPhi ,\nu _\varPhi )\) is called the Lévy triple of \(\varPhi \). Also the vice versa can be shown, i.e. for any Lévy triple \((a_\varPhi ,b_\varPhi ,\nu _\varPhi )\) such that \(\nu _\varPhi \) is a Lévy measure satisfying the integral condition (8) there exists a unique Bernstein function \(\varPhi \) such that Eq. (7) holds. In the following we will consider \(a_\varPhi =b_\varPhi =0\) and \(\nu _\varPhi (0,+\infty )=+\infty \). Moreover, let us denote \({\bar{\nu }}_\varPhi (t)=\nu _\varPhi (t,+\infty )\).

It is also known (see [65]) that for each Bernstein function \(\varPhi \in \mathcal {BF}\) there exists a unique subordinator \(\sigma _\varPhi =\{\sigma _\varPhi (y), y \ge 0\}\) (i. e. an increasing Lévy process) such that

$$\begin{aligned} {\mathbb {E}}[e^{-\lambda \sigma _\varPhi (y)}]=e^{-y\varPhi (\lambda )}. \end{aligned}$$

For general notions on subordinators we refer to [15, Chapter 3] and [16]. Our hypothesis on \(a_\varPhi ,b_\varPhi \) and \(\nu _\varPhi \) imply that \(\sigma _\varPhi \) is a pure jump strictly increasing process.

For any Bernstein function \(\varPhi \) we can define the inverse subordinator \(L_\varPhi \) as, for any \(t \ge 0\),

$$\begin{aligned} L_\varPhi (t):=\inf \{y> 0: \ \sigma _\varPhi (y)>t\}. \end{aligned}$$

\(L_\varPhi (t)\) is absolutely continuous for any \(t>0\) and we denote by \(f_\varPhi (s;t)\) its density. Let us recall (see [49]) that, denoting by \({\overline{f}}_\varPhi (s;\lambda )\) the Laplace transform of \(f_\varPhi (s;t)\) with respect to t,

$$\begin{aligned} {\overline{f}}_\varPhi (s;\lambda )=\frac{\varPhi (\lambda )}{\lambda }e^{-s\varPhi (\lambda )}, \ \lambda >0. \end{aligned}$$

Moreover, let us observe that, under our assumptions, the sample paths of \(L_\varPhi (t)\) are almost surely increasing and continuous. Now let us recall the definition of non-local convolution derivative, defined in [37] and [72].

Definition 2

Let \(u:{\mathbb {R}}^+ \rightarrow {\mathbb {R}}\) be an absolutely continuous function. Then we define the non-local convolution derivative induced by \(\varPhi \) of u as

$$\begin{aligned} \partial _t^\varPhi u(t)=\int _0^t u'(\tau ){\overline{\nu }}_\varPhi (t-\tau )d\tau . \end{aligned}$$
(9)

Let us observe that one can define also the regularized version of the non-local convolution derivative as

$$\begin{aligned} \partial _t^\varPhi u(t)=\frac{d }{d t}\int _0^t (u(\tau )-u(0+)){\overline{\nu }}_\varPhi (t-\tau )d\tau \end{aligned}$$
(10)

observing that it coincides with the previous definition on absolutely continuous functions. We will always use Eq. (10).

It can be shown, by Laplace transform arguments (see, for instance [6, 38]) or by Green functions arguments (see [39]), that the Cauchy problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t^\varPhi {\mathfrak {e}}_\varPhi (t;\lambda )=\lambda {\mathfrak {e}}_\varPhi (t;\lambda ) &{} t>0\\ {\mathfrak {e}}_\varPhi (0;\lambda )=1 \end{array}\right. } \end{aligned}$$

admits a unique solution for any \(\lambda \in {\mathbb {R}}\) and it is given by \({\mathfrak {e}}_\varPhi (t;\lambda ):={\mathbb {E}}[e^{\lambda L_\varPhi (t)}]\). In [8] the following proposition is proved.

Proposition 1

Fix \(t>0\). Then there exists a constant K(t) such that

$$\begin{aligned} \lambda {\mathfrak {e}}_\varPhi (t;-\lambda )\le K(t), \ \forall \lambda \in [0,+\infty ). \end{aligned}$$
(11)

A Bernstein function \(\varPhi \) is said to be complete if its Lévy measure \(\nu _\varPhi (dt)\) admits a density \(\nu _\varPhi (t)\) that is completely monotone. Following the approach given in [13] we are able to prove the following theorem (the proof is given in Appendix A).

Theorem 3

Let \((X,\left\| \cdot \right\| _{})\) be a Banach space and \((T(t))_{t \ge 0}\) be a uniformly bounded and strongly continuous \(C_0\)-semigroup on X. Define the family of linear operators on X \((T_\varPhi (t))_{t \ge 0}\) as

$$\begin{aligned} T_\varPhi (t)u=\int _0^{+\infty }T(s)uf_\varPhi (s;t)ds, \ u \in X, \end{aligned}$$

where \(\varPhi \) is a driftless complete Bernstein function, \(f_\varPhi (s;t)\) is the density of the inverse subordinator \(L_\varPhi (t)\) associated to \(\varPhi \) and the integral has to be intended in Bochner sense. Then \((T_{\varPhi }(t))_{t \ge 0}\) is a uniformly bounded and strongly continuous family of linear operators. Moreover, it is also strongly analytic in a suitable sector \({\mathbb {C}}(\alpha )=\{z \in {\mathbb {C}}\setminus \{0\}: \ |{\mathrm{Arg}}(z)|<\alpha \}\) where \({\mathrm{Arg}}\) is the principal argument. Finally, if \((A,{\mathcal {D}}(A))\) is the generator of the semigroup \((T(t))_{t \ge 0}\) and \(u \in {\mathcal {D}}(A)\), then also \(T_\varPhi (t)u \in {\mathcal {D}}(A)\) and it solves the Cauchy problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t^\varPhi T_\varPhi (t)u=AT_\varPhi (t)u, &{} t>0,\\ T_\varPhi (0)u=u, \end{array}\right. } \end{aligned}$$
(12)

where the equality holds in X (and not necessarily pointwise).

Remark 1

Let us recall that \(T_\varPhi (t)\) in general is not a semigroup.

Concerning strong continuity we refer to the definition given in [57] for general families of operators of one parameter. Moreover, let us observe that a similar result has been shown in [23, Theorem 2.1]. However, following the lines of [13], we are also able to show better regularity of the family of operators \((T_\varPhi (t))_{t \ge 0}\).

Let us give some examples of complete Bernstein functions and associated subordinators.

  • For fixed \(\alpha \in (0,1)\) we can consider \(\varPhi (\lambda )=\lambda ^\alpha \). In this case \(\sigma _\varPhi \) is an \(\alpha \)-stable subordinator. Extensive informations on inverse \(\alpha \)-stable subordinators are given in [51]. In this case \({\mathfrak {e}}_\varPhi (t;\lambda )=E_\alpha (\lambda t^\alpha )\), where \(E_\alpha \) is the one-parameter Mittag-Leffler function (see [17]) defined as

    $$\begin{aligned} E_\alpha (z)=\sum _{k=0}^{+\infty }\frac{z^k}{\varGamma (\alpha k+1)}, \ z \in {\mathbb {C}}. \end{aligned}$$

    In such case, the Caputo-type non-local derivative coincides with the classical Caputo fractional derivative (see [35]).

  • For fixed \(\alpha \in (0,1)\) and \(\theta >0\) we can consider \(\varPhi (\lambda )=(\lambda +\theta )^\alpha -\theta ^\alpha \). In this case \(\sigma _\varPhi \) is a tempered \(\alpha \)-stable subordinator. Inverse tempered stable subordinators are studied for instance in [41]. Even in this case the Caputo-type non-local derivative is linked to a well-known non-local operator, called the tempered fractional derivative (see [22]);

  • For fixed \(\alpha \in (0,1)\) we can consider \(\varPhi (\lambda )=\log (1+\lambda ^{\alpha })\). In this case \(\sigma _\varPhi \) is a geometric \(\alpha \)-stable subordinator. For informations on such process we refer to [67].

  • If in the previous example we consider \(\alpha =1\) we obtain \(\varPhi (\lambda )=\log (1+\lambda )\) that is still a complete Bernstein function. In this case \(\sigma _\varPhi \) is a Gamma subordinator. Again we refer to [67] and references therein.

4 Definition of the Time-non-local Pearson Diffusions

Let us now define the time-non-local Pearson diffusions.

Definition 3

Let X(t) be a Pearson diffusion and \(\varPhi \in \mathcal {BF}\) be a driftless Bernstein function. Let \(L_\varPhi (t)\) be an inverse subordinator associated to \(\varPhi \) and independent of X(t). Then we define the time-non-local Pearson diffusion induced by X and \(L_\varPhi \) as \(X_\varPhi (t):=X(L_\varPhi (t))\).

The first property one has to recall is that \(X_\varPhi (t)\) is not a Markov process, but it is still a semi-Markov one (see [24]).

The fact that \(L_\varPhi (t)\) is a.s. continuous implies that the Brownian motion W(t) in the Stochastic Differential Equation (3) is in synchronization (in the sense of [36]) with \(L_\varPhi (t)\) for any driftless Bernstein function \(\varPhi \). Thus the first change-of-variable formula for synchronized processes (see [36, Lemma 2.3]) and the duality theorem [36, Theorem 4.2] lead us to the following characterization of time-non-local Pearson diffusions (following the lines of [44]).

Proposition 2

\(X_\varPhi (t)\) is the unique strong solution of

$$\begin{aligned} dX_\varPhi (t)=\mu (X_\varPhi (t))dL_\varPhi (t)+\sigma (X_\varPhi (t))dW(L_\varPhi (t)) \end{aligned}$$
(13)

with initial datum \(X_\varPhi (0)=x_0\) if and only if \(X_\varPhi (t)\) is a time-non-local Pearson diffusion.

Thus we can characterize the time-non-local Peason diffusions as the unique strong solutions of Eq. (13), in analogy of what is done with the classical Pearson diffusion. In particular, from this result, we still have that time-non-local Pearson diffusions are semimartingales, but with respect to a time-changed filtration \({\mathcal {F}}_{L_\varPhi (t)}\).

Now let us show that time-non-local Pearson diffusions admit in some sense a transition probability density.

Lemma 1

Let \(X_\varPhi (t)\) be a time-non-local Pearson diffusion with diffusion space E. Then there exists a function \(p_\varPhi (t,x;x_0)\) such that for any Borel set \(B \in {\mathcal {B}}(E)\), \(t>0\) and \(x_0 \in E\) it holds

$$\begin{aligned} {\mathbb {P}}(X_\varPhi (t)\in B|X_\varPhi (0)=x_0)=\int _{B}p_\varPhi (t,x;x_0)dx. \end{aligned}$$

Moreover, the following integral representation holds:

$$\begin{aligned} p_\varPhi (t,x;x_0)=\int _0^{+\infty }p(s,x;x_0)f_\varPhi (s;t)ds, \ x,x_0 \in E, \ t>0 \end{aligned}$$
(14)

where p is the transition probability density of X(t).

Proof

Let us first observe that since \(L_\varPhi (0)=0\) almost surely, then \(X_\varPhi (0)=X(0)\) almost surely. Now fix \(B \in {\mathcal {B}}(E)\) and observe, by the independence of \(L_\varPhi \) and X and by definition of \(p(t,x;x_0)\),

$$\begin{aligned} {\mathbb {P}}(X_\varPhi (t) \in B|X_\varPhi (0)=x_0)&=\int _0^{+\infty }{\mathbb {P}}(X(s) \in B|X_\varPhi (0)=x_0)f_\varPhi (s;t)ds\\&=\int _0^{+\infty }\int _{B}p(s,x;x_0)dxf_\varPhi (s;t)ds\\&=\int _{B}\int _0^{+\infty }p(s,x;x_0)f_\varPhi (s;t)dsdx, \end{aligned}$$

where we used Fubini’s theorem since all the integrands are non-negative. Thus, by uniqueness of the Radon-Nikodym derivative of measures, we have Eq. (14). \(\square \)

We call \(p_\varPhi (t,x;x_0)\) the transition probability density of \(X_\varPhi (t)\).

Let us stress out that, for any Pearson diffusion, if we consider the semigroup \(T(t)g(x)={\mathbb {E}}^x[g(X(t))]\) acting on \(C_0(E)\), as a direct consequence of the previous lemma, the family of operators \((T_\varPhi (t))_{t \ge 0}\) acting on \(C_0(E)\) and defined as \(T_\varPhi (t)g(x)={\mathbb {E}}^x[g(X_\varPhi (t))]\) is obtained by time-changing T(t). Moreover, we can characterize the family of adjoint operator \((T^*_\varPhi (t))_{t \ge 0}\) of \(T_\varPhi (t)\).

Lemma 2

Let \(X_\varPhi (t)\) be a time-non-local Pearson diffusion and \(T_\varPhi (t)\) be the family of operators defined on \(C_0(E)\) as \(T_\varPhi (t)g(x)={\mathbb {E}}^x[g(X_\varPhi (t))]\). Then the adjoint operators \((T^*_\varPhi (t))_{t \ge 0}\) are defined as

$$\begin{aligned} T^*_\varPhi (t)f(x)=\int _{E}p_\varPhi (t,x;y)f(y)dy \end{aligned}$$

and represent the time-changed semigroup of the strongly continuous semigroup \((T^*(t))_{t \ge 0}\) defined as

$$\begin{aligned} T^*(t)f(x)=\int _{E}p(t,x;y)f(y)dy. \end{aligned}$$

We omit the proof since it easily follows from the definition of \(T^*_\varPhi \) and the integral representation given in Lemma 1.

In the following sections we will investigate the spectral decomposition of the transition probability density of time-non-local Pearson diffusions and the existence of strong solutions for the associated backward and forward Kolmogorov problems.

5 Spectral Decomposition of Time-non-local Pearson Diffusions of Spectral Category I

5.1 Spectral Decomposition of the Transition Probability Density

Let us consider a Pearson diffusion X(t) of spectral category I with diffusion space E, generator \({\mathcal {G}}\), stationary density m(x) and orthonormal polynomials \(Q_n(x)\). Then it is well known that the spectral decomposition of \(p(t,x;x_0)\) is given by

$$\begin{aligned} p(t,x;x_0)=m(x)\sum _{n=0}^{+\infty }e^{-\lambda _n t}Q_n(x)Q_n(x_0) \end{aligned}$$

where \(\lambda _n\) are the eigenvalues of \(-{\mathcal {G}}\) for each \(Q_n\). We want to show a similar spectral decomposition for \(p_\varPhi (t,x;x_0)\) where, in place of \(e^{-\lambda _n t}\), we have \({\mathfrak {e}}_\varPhi (t;-\lambda _n)\). To do this, we first need to show the following technical Lemma.

Lemma 3

Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function, X(t) be a Pearson diffusion of spectral category I with diffusion space E, associated family of classical orthonormal polynomials \(Q_n(x)\) and \(\lambda _n\) the respective eigenvalues. Then the series

$$\begin{aligned} \sum _{n=0}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x)Q_n(y) \end{aligned}$$

absolutely converges for fixed \(t>0\) and \(x,y \in E\) and normally converges for xy belonging to compact sets in E and \(t \ge t_0\).

Proof

Let us recall that, by Eq. (11), for fixed \(t>0\) there exists \(K>0\) such that \({\mathfrak {e}}_\varPhi (t;-\lambda _n)\le \frac{K}{\lambda _n}\).

Let us first work with the OU process. We can suppose without loss of generality that \(\mu =0\) and \(\sigma =1\). Then there exists a constant \(C>0\) such that

$$\begin{aligned} |Q_n(x)|\le C e^{\frac{x^2}{4}}n^{-\frac{1}{4}}\left( 1+\left( \frac{|x|}{\sqrt{2}}\right) ^{\frac{5}{2}}\right) , \end{aligned}$$
(15)

where the estimate follows from the definition of the normalizing constant and the estimate on Hermite polynomials given in [63, p. 369]. Moreover, \(\lambda _n=\theta n\), thus

$$\begin{aligned} {\mathfrak {e}}_\varPhi (t;-\lambda _n)|Q_n(x)||Q_n(y)|\le C n^{-1-\frac{1}{2}}, \end{aligned}$$
(16)

obtaining the absolute convergence. Normal convergence follows from the same estimates and the fact that \({\mathfrak {e}}_\varPhi (t;-\lambda _n)\le {\mathfrak {e}}_\varPhi (t_0;-\lambda _n)\).

Concerning the CIR process, we can suppose without loss of generality \(a=1\). We have, considering the normalizing constant and the estimate on Laguerre polynomials given in [63, p. 348], that there exists a constant C independent of x and n such that, for n big enough, it holds

$$\begin{aligned} |Q_n(x)|\le C \frac{e^{\frac{x}{2}}}{x^{\frac{2b-1}{4}}}n^{-\frac{1}{4}}, \end{aligned}$$
(17)

thus we have again (16) and both absolute and normal converge follow.

Now let us consider the Jacobi process case. By [71, Theorem 8.21.8] and the definition of the normalizing constant we have

$$\begin{aligned} Q_n(x)=C(x,a,b)\cos (N(a,b)\theta +\gamma (a))+O(n^{-1}) \end{aligned}$$
(18)

where the remainder term and the function C(xab) are uniform in x as \(x \in K_x \subset (-1,1)\) with \(K_x\) compact set and \(\cos (\theta )=x\). Thus, in this case, we have

$$\begin{aligned} {\mathfrak {e}}_\varPhi (t;-\lambda _n)|Q_n(x)||Q_n(y)|\le C n^{-2}, \end{aligned}$$

the bound being uniform as \(t \ge t_0>0\), \(x \in K_x\) and \(y \in K_y\) where \(K_x\) and \(K_y\) are compact sets in \(E=(-1,1)\). \(\square \)

Now we are ready to give the spectral decomposition of the transition probability density \(p_\varPhi (t,x;x_0)\).

Theorem 4

Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function, X(t) be a Pearson diffusion of spectral category I with diffusion space E, associated family of classical orthonormal polynomials \(Q_n(x)\) with eigenvalues \(\lambda _n\). Let \(X_\varPhi (t)\) be the respective time-non-local Pearson diffusion. Then

$$\begin{aligned} p_\varPhi (t,x;x_0)=m(x)\sum _{n=0}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x)Q_n(x_0) \end{aligned}$$

for any \(t>0\) and \(x,x_0 \in E\).

Proof

By using Eq. (14) we know that

$$\begin{aligned} p_\varPhi (t,x;x_0)=m(x)\int _0^{+\infty }\sum _{n=0}^{+\infty }e^{-\lambda _n s}Q_n(x)Q_n(x_0)f_\varPhi (s;t)ds. \end{aligned}$$

Now we have to show that we can exchange the order of the integral with the summation. To do this, let us observe that

$$\begin{aligned} \sum _{n=0}^{+\infty }\int _0^{+\infty }e^{-\lambda _n s}|Q_n(x)Q_n(x_0)|f_\varPhi (s;t)ds = \sum _{n=0}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)|Q_n(x)Q_n(x_0)|<+\infty \end{aligned}$$

by the previous Lemma and then we can use Fubini’s theorem to conclude the proof. \(\square \)

Now that we have the spectral decomposition of the transition density, we can focus on showing the existence of strong solutions to the time-non-local backward Kolmogorov equation under suitable assumptions on the initial datum.

5.2 The Time-non-local Backward Kolmogorov Equation

Here we want to focus on the following time-non-local Cauchy problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t^\varPhi u(t,y)={\mathcal {G}}u(t,y) &{} t>0, \ y \in E \\ u(0,y)=g(y) &{} y \in E \end{array}\right. } \end{aligned}$$
(19)

for suitable initial data g.

Definition 4

A function u(ty) is a strong solution (in \(L^2(m(x)dx)\)) of Problem (19) if and only if:

  • \(t\ge 0 \mapsto u(t,\cdot ) \in L^2(m(x)dx)\) is strongly continuous;

  • The function \(\partial _t^\varPhi u(t,y)\) is well-defined for any \(t>0\) and \(y \in E\);

  • The convolution integral involved in \(\partial _t^\varPhi u(t,y)\) is a Bochner integral and the derivative is a strong derivative in \(L^2(m(x)dx)\);

  • The function \(t>0 \mapsto \partial _t^\varPhi u(t,\cdot ) \in L^2(m(x)dx)\) is strongly continuous;

  • For fixed \(t>0\), \(u(t,\cdot )\in C^2(E)\);

  • The equations of (19) hold pointwise for any \(t>0\) and almost all \(y \in E\).

Let us first show the next result, following the lines of [43].

Theorem 5

Let X(t) be a Pearson diffusion of spectral category I with diffusion space E, associated family of orthonormal polynomials \(Q_n\) with respective eigenvalues \(\lambda _n\), generator \({\mathcal {G}}\) and stationary density m(x). Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function. Let \(g \in L^2(m(x)dx)\) be decomposed as \(g(y)=\sum _{n=0}^{+\infty }g_nQ_n(y)\), where the series converges in \(L^2(m(x)dx)\), absolutely for fixed \(y \in E\) and uniformly on compact intervals \([y_1,y_2]\subset E\). Then the unique strong solution in \(L^2(m(x)dx)\) of the problem (19) is given by

$$\begin{aligned} u(t,y)=\sum _{n=0}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(y)g_n. \end{aligned}$$
(20)

Moreover, \(p_\varPhi (t,x;y)\) is the fundamental solution of the problem (19), in the sense that, for any initial datum \(g \in L^2(m(x)dx)\) such that \(g=\sum _{n=0}^{+\infty }g_nQ_n\) converges in \(L^2(m(x)dx)\), absolutely for fixed \(y \in E\) and uniformly on compact interval \([y_1,y_2]\subset E\), it holds

$$\begin{aligned} u(t,y)=\int _{E}p_\varPhi (t,x;y)g(x)dx. \end{aligned}$$

Proof

First of all, let us observe that the function in Eq. (20) is well defined, by showing that the involved series converges in \(L^2(m(x)dx)\). To do this, let us first define for \(N \in {\mathbb {N}}\)

$$\begin{aligned} u_N(t,y)=\sum _{n=0}^{N}{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(y)g_n. \end{aligned}$$

By Parseval’s identity, setting \(N>M\), we have that

$$\begin{aligned} \left\| u_N(t,\cdot )-u_M(t,\cdot ) \right\| _{L^2(m(x)dx)}^2=\sum _{n=M+1}^{N}{\mathfrak {e}}_\varPhi ^2(t;-\lambda _n)g^2_n\le \sum _{n=M+1}^{N}g^2_n. \end{aligned}$$

However, still using Parseval’s identity, we have \(\sum _{n=0}^{+\infty }g_n^2=\left\| g \right\| _{L^2(m(x)dx)}^2<+\infty \), thus concluding the convergence in \(L^2(m(x)dx)\) of the sequence \(u_N(t,\cdot )\) by Cauchy’s criterion.

Now let us show that \(t \mapsto u(t,\cdot )\) is strongly continuous in \(L^2(m(x)dx)\). Concerning strong continuity at 0, let us fix \(\varepsilon >0\) and consider \(N>0\) such that \(\sum _{n=N+1}^{+\infty }g_n^2<\varepsilon \). Thus we have, by Parseval’s identity,

$$\begin{aligned} \left\| u(t,\cdot )-g(\cdot ) \right\| _{L^2(m(x)dx)}^2< \sum _{n=0}^{N}g_n^2|{\mathfrak {e}}_\varPhi ^2(t;-\lambda _n)-1|+\varepsilon . \end{aligned}$$

Taking the limit superior as \(t \rightarrow 0^+\) we have

$$\begin{aligned} \limsup _{t \rightarrow 0^+}\left\| u(t,\cdot )-g(\cdot ) \right\| _{L^2(m(x)dx)}^2 \le \varepsilon . \end{aligned}$$

Since \(\varepsilon >0\) is arbitrary, we have continuity at \(0^+\). Strong continuity at any \(t>0\) is proven in an analogous way.

Now let us observe that for the single summand it holds

$$\begin{aligned} \partial _t^\varPhi {\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(y)g_n=-\lambda _n{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(y)g_n={\mathcal {G}}{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(y)g_n \end{aligned}$$
(21)

for any \(n \in {\mathbb {N}}\). Thus, we actually have to show that we can exchange the operators \({\mathcal {G}}\) and \(\partial _t^\varPhi \) with the summation.

Let us first consider \(\partial _t^\varPhi \sum _{n=0}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(y)g_n\). First of all, let us denote by \({\mathcal {I}}_\varPhi (t)=\int _0^t {\bar{\nu }}_\varPhi (s)ds\) the integrated tail of the Lévy measure. \({\mathcal {I}}_\varPhi (t)\) is an increasing non-negative function with derivative \({\bar{\nu }}_\varPhi (t)\), hence we can rewrite

$$\begin{aligned} \int _0^t {\bar{\nu }}_\varPhi (t-\tau )(u(\tau ,y)-g(y))d\tau =\int _0^t (u(\tau ,y)-g(y))d{\mathcal {I}}_\varPhi (t-\tau ). \end{aligned}$$

Now let us observe that

$$\begin{aligned} u(\tau ,y)-g(y)=\sum _{n=1}^{+\infty }({\mathfrak {e}}_\varPhi (\tau ;-\lambda _n)-1)g_nQ_n(y) \end{aligned}$$
(22)

and in particular

$$\begin{aligned} \sum _{n=1}^{+\infty }|({\mathfrak {e}}_\varPhi (\tau ;-\lambda _n)-1)g_nQ_n(y)|\le \sum _{n=1}^{+\infty }|g_nQ_n(y)|, \end{aligned}$$

where the right-hand side converges for fixed \(y \in E\). Thus the series in the right-hand side of (22) normally converges and we have, by using [62, Theorem 7.16],

$$\begin{aligned} \int _0^t {\bar{\nu }}_\varPhi (t-\tau )(u(\tau ,y)-g(y))d\tau =\sum _{n=0}^{+\infty }\int _0^t({\mathfrak {e}}_\varPhi (\tau ;-\lambda _n)-1)g_nQ_n(y){\bar{\nu }}_{\varPhi }(t-\tau )d\tau . \end{aligned}$$
(23)

Now we only need to show that we can exchange the derivative \(\frac{d }{d t}\) with the summation sign on the right-hand side of Eq. (23). To do this, we have to show that the series of the derivatives converges uniformly in any compact set containing t (see [62, Theorem 7.17]). However, the series of the derivatives is given by

$$\begin{aligned} \sum _{n=0}^{+\infty }-\lambda _n{\mathfrak {e}}_\varPhi (t;-\lambda _n)g_nQ_n(y). \end{aligned}$$

Now fix \(t_0>0\) and \(t\ge t_0\). Then we have, by Eq. (11) and the fact that \(t \mapsto {\mathfrak {e}}_\varPhi (t;-\lambda )\) is decreasing for any \(\lambda >0\), \(\lambda _n{\mathfrak {e}}_\varPhi (t;-\lambda _n)\le M(t_0)\). In particular, this implies

$$\begin{aligned} \sum _{n=0}^{+\infty }\lambda _n{\mathfrak {e}}_\varPhi (t;-\lambda _n)|g_nQ_n(y)|\le M(t_0)\sum _{n=0}^{+\infty }|g_nQ_n(y)|, \end{aligned}$$

and then the series of the derivative is normally convergent in any compact set separated from 0. Finally, we have

$$\begin{aligned} \partial _t^\varPhi u(t,y)=\sum _{n=0}^{+\infty }-\lambda _n{\mathfrak {e}}_\varPhi (t;-\lambda _n)g_nQ_n(y). \end{aligned}$$
(24)

Arguing as we did for the series on the right-hand side of Eq. (20), one can easily show that the series on the right-hand side of Eq. (24) strongly converges in \(L^2(m(x)dx)\) and that \(\partial _t^\varPhi u(t,y)\) is strongly continuous in \(L^2(m(x)dx)\) as \(t>0\).

Now we have to show that we can exchange the operator \({\mathcal {G}}\) with the series on the right-hand side of Eq. (20). Since \({\mathcal {G}}\) is a second order differential operator with polynomial coefficients, we only need to exchange the first and the second derivative with respect to y with the series. To do this, we need to argue differently depending on the process. Let us first consider the OU process. The differential recurrence relation between Hermite polynomials (see, for instance, [1, Formula 22.8.8]) becomes, after normalization \(Q_n'(y)=\sqrt{n}Q_{n-1}(y)\). Thus, combining this relation with (15) we have that, in any compact set \([y_1,y_2] \subset {\mathbb {R}}\) containing y, it holds

$$\begin{aligned} |{\mathfrak {e}}_\varPhi (t;-\lambda _n)g_nQ'_n(y)|\le C(t) n^{-\frac{3}{4}}|g_n| \end{aligned}$$
(25)

where the right-hand side is the summand of a convergent series, since both \((n^{-3/4})_{n \in {\mathbb {N}}}\) and \((g_n)_{n \in {\mathbb {N}}}\) belong to \(\ell ^2\). Concerning the second derivative, the Sturm–Liouville equation \({\mathcal {G}}Q_n(y)=-\lambda _n Q_n(y)\) can be rewritten as \(Q_n''(y)=yQ_n'(y)-nQ_n(y)\), thus we have to study only the term in \(nQ_n(y)\). For such term, recalling that \(\lambda _n=\theta n\), we know that \(n{\mathfrak {e}}_\varPhi (t;-\lambda _n)\le M(t)\) for some M(t), hence the uniform convergence of \(\sum _{n=0}^{+\infty }Q_n(y)g_n\) on compact intervals implies the uniform convergence of \(\sum _{n=0}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)nQ_n(y)g_n\) on compact intervals.

Now let us consider the CIR process. After normalization, [71, Formula 5.1.14] becomes \(Q'_{n,b-1}(y)=-\frac{(n-1)^{b/2}}{n^{(b-1)/2}}Q_{n-1,b}(y)\), where we denote with the indexes the dependence on the parameter b. By using (17), we obtain again Eq. (25) uniformly on compact intervals, thus obtaining the desired convergence. Moreover, the Sturm–Liouville equation \({\mathcal {G}}Q_n(y)=-\lambda _n Q_n(y)\) becomes \(yQ_n''(y)=(y-b)Q_n'(y)-nQ_n(y)\) thus, since \(\lambda _n=\theta n\), we can argue as in the OU process case to obtain the uniform convergence of \(\sum _{n=0}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)nQ_n(y)g_n\) on compact intervals.

Last case is the Jacobi process case. In this case, normalizing [1, Formula 22.8.1], we have

$$\begin{aligned} Q'_n(y)= & {} \frac{n(a-b-(2n+a+b)y)}{(2n+a+b)(1-y^2)}Q_n(y)\\&+\frac{2(n+a)(n+b)}{(2n+a+b)(1-y^2)}\sqrt{\frac{n}{n-1}}Q_{n-1}(y). \end{aligned}$$

Both terms lead to a series of the form \(\sum _{n=0}^{+\infty }n{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(y)g_n\). However, this time we know that \(Q_n(y)\) are bounded and \(\lambda _n \sim Cn^2\), thus we have that (still by using (11))

$$\begin{aligned} |{\mathfrak {e}}_\varPhi (t;-\lambda _n)nQ_n(y)g_n|\le C \frac{g_n}{n} \end{aligned}$$

where the right-hand side is the summand of a convergent series since \((g_n)_{n \in {\mathbb {N}}}, (1/n)_{n \in {\mathbb {N}}} \in \ell ^2\). Concerning the second derivative, the Sturm–Liouville equation \({\mathcal {G}}Q_n(y)=-\lambda _n Q_n(y)\) becomes \((1-y^2)Q_n''(y)=-((b-a)-(a+b-2)y)Q_n'(y)-n(n+a+b+1)Q_n(y)\), thus, since \(n(n+a+b+1)\) and \(\lambda _n\) are both quadratic in n, the same argument as in the previous cases leads to uniform convergence on compact intervals of \(\sum _{n=0}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)n(n+a+b+1)Q_n(y)g_n\).

Concluding we have in general

$$\begin{aligned} {\mathcal {G}}u(t,y)=\sum _{n=0}^{+\infty }{\mathcal {G}}{\mathfrak {e}}_\varPhi (t;-\lambda _n) Q_n(y)g_n \end{aligned}$$

and then also \(u(t,\cdot ) \in C^2(E)\). Equation (21) and the fact we can exchange the order of the operators with the series imply that the equations of (19) hold pointwise.

Now let us show the third property in Definition 4. First of all, let us show that the convolution integral is actually a Bochner integral. By Bochner’s theorem ([4, Theorem 1.1.4]) we only have to show that

$$\begin{aligned} \int _0^t \left\| u(t-\tau ,\cdot )-u(0+,\cdot ) \right\| _{L^2(m(x)dx)}{\overline{\nu }}_\varPhi (\tau )d\tau <+\infty . \end{aligned}$$

To do this, let us use Jensen’s inequality and Parseval’s identity to achieve

$$\begin{aligned}&\left( \int _0^t \left\| u(t-\tau ,\cdot )-u(0+,\cdot ) \right\| _{L^2(m(x)dx)}{\overline{\nu }}_\varPhi (\tau )d\tau \right) ^2\\ {}&\qquad \quad \le {\mathcal {I}}_\varPhi (t)\int _0^t \left\| u(t-\tau ,\cdot )-u(0+,\cdot ) \right\| _{L^2(m(x)dx)}^2{\overline{\nu }}_\varPhi (\tau )d\tau \\&\quad \qquad ={\mathcal {I}}_\varPhi (t)\int _0^t \sum _{n=1}^{+\infty }(1-{\mathfrak {e}}_\varPhi (t;-\lambda _n))^2g^2_n{\overline{\nu }}_\varPhi (\tau )d\tau \le {\mathcal {I}}^2_\varPhi (t)\left\| g \right\| _{L^2(m(x)dx)}^2. \end{aligned}$$

Now let us show that the derivative is actually a strong derivative in \(L^2(m(x)dx)\), that is to say

$$\begin{aligned} \begin{aligned} \lim _{h \rightarrow 0}\left\| \frac{1}{h}\right.&\left. \left( \int _0^{t+h}(u(t+h-\tau ,\cdot )-u(0+,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \right. \right. \\&\left. \left. -\int _0^{t}(u(t-\tau ,\cdot )-u(0+,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \right) -\partial _t^\varPhi u(t,\cdot ) \right\| _{L^2(m(x)dx)}=0. \end{aligned} \end{aligned}$$
(26)

Let us argue for \(h>0\), as the case \(h<0\) is analogous. By triangular inequality and Eq. (24) we have

$$\begin{aligned} \begin{aligned}&\Bigg \Vert \frac{1}{h}\left( \int _0^{t+h}(u(t+h-\tau ,\cdot )-u(0+,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \right. \\&\qquad \left. -\int _0^{t}(u(t-\tau ,\cdot )-u(0+,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \right) -\partial _t^\varPhi u(t,\cdot )\Bigg \Vert _{{L^2(m(x)dx)}}\\&\quad \le \Bigg \Vert \frac{1}{h}\int _0^{t}(u(t+h-\tau ,\cdot )-u(t-\tau ,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \\&\qquad + \sum _{n=1}^{+\infty }\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)g_nQ_n\Bigg \Vert _{L^2(m(x)dx)}\\&\qquad +\Bigg \Vert \frac{1}{h}\int _{t}^{t+h}(u(t+h-\tau ,\cdot )-u(0+,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \Bigg \Vert {L^2(m(x)dx)}. \end{aligned} \end{aligned}$$
(27)

Let us argue with the second summand; by Bochner’s theorem we have

$$\begin{aligned}&\left\| \frac{1}{h}\int _{t}^{t+h}(u(t+h-\tau ,\cdot )-u(0+,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \right\| _{L^2(m(x)dx)}\\ {}&\quad \le \frac{1}{h}\int _{t}^{t+h}\left\| u(t+h-\tau ,\cdot )-u(0+,\cdot ) \right\| _{L^2(m(x)dx)}{\overline{\nu }}_\varPhi (\tau )d\tau \\&\quad \le \frac{{\overline{\nu }}_\varPhi (t)}{h}\int _{0}^{h}\left\| u(\tau ,\cdot )-u(0+,\cdot ) \right\| _{L^2(m(x)dx)}d\tau , \end{aligned}$$

thus, taking the limit and recalling that, by strong continuity of \(u(t,\cdot )\) in \(L^2(m(x)dx)\) the function \(t \mapsto \left\| u(\tau ,\cdot )-u(0+,\cdot ) \right\| _{L^2(m(x)dx)}\) is continuous, we have

$$\begin{aligned} \lim _{h \rightarrow 0^+}\left\| \frac{1}{h}\int _{t}^{t+h}(u(t+h-\tau ,\cdot )-u(0+,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \right\| _{L^2(m(x)dx)}=0. \end{aligned}$$
(28)

Now let us consider the first summand of the right-hand side of Eq. (27). To handle this summand, let us first observe that

$$\begin{aligned} \begin{aligned} \int _0^t&(u(t+h-\tau ,y)-u(t-\tau ,y)){\overline{\nu }}_\varPhi (\tau )d\tau \\ {}&=\int _0^t\sum _{n=1}^{+\infty }({\mathfrak {e}}_\varPhi (t+h-\tau ;-\lambda _n)-{\mathfrak {e}}_\varPhi (t-\tau ;-\lambda _n))g_nQ_n(y){\overline{\nu }}_\varPhi (\tau )d\tau . \end{aligned} \end{aligned}$$
(29)

Being \(|{\mathfrak {e}}_\varPhi (t+h-\tau ;-\lambda _n)-{\mathfrak {e}}_\varPhi (t-\tau ;-\lambda _n)|\le 1\) and \(\sum _{n=1}^{+\infty }|g_nQ_n(y)|\) convergent, we have that the series on the right-hand side of Eq. (29) normally converges, thus we achieve

$$\begin{aligned} \begin{aligned} \int _0^t&(u(t+h-\tau ,y)-u(t-\tau ,y)){\overline{\nu }}_\varPhi (\tau )d\tau \\&=\sum _{n=1}^{+\infty }\int _0^t({\mathfrak {e}}_\varPhi (t+h-\tau ;-\lambda _n)-{\mathfrak {e}}_\varPhi (t-\tau ;-\lambda _n))g_nQ_n(y){\overline{\nu }}_\varPhi (\tau )d\tau . \end{aligned} \end{aligned}$$
(30)

Moreover, the same argument as before shows that the series on the right-hand side of Eq. (30) is absolutely convergent. Being also the series in Eq. (24) absolutely convergent, we have

$$\begin{aligned} \begin{aligned}&\Bigg \Vert \frac{1}{h}\int _0^{t}(u(t+h-\tau ,\cdot )-u(t-\tau ,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \\ {}&\qquad +\sum _{n=1}^{+\infty }\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)g_nQ_n\Bigg \Vert {L^2(m(x)dx)}\\&\quad =\Bigg \Vert \sum _{n=1}^{+\infty }\bigg (\int _0^{t}\frac{{\mathfrak {e}}_\varPhi (t+h-\tau ,\cdot )-{\mathfrak {e}}_\varPhi (t-\tau ,\cdot )}{h}{\overline{\nu }}_\varPhi (\tau )d\tau \\ {}&\qquad +\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)\bigg )g_nQ_n\Bigg \Vert _{L^2(m(x)dx)}. \end{aligned} \end{aligned}$$
(31)

Now let us show that the series on the right-hand side of Eq. (31) converges in \(L^2(m(x)dx)\). To do this, let us define

$$\begin{aligned} S_N=\sum _{n=1}^{N}\left( \int _0^{t}\frac{{\mathfrak {e}}_\varPhi (t+h-\tau ,\cdot )-{\mathfrak {e}}_\varPhi (t-\tau ,\cdot )}{h}{\overline{\nu }}_\varPhi (\tau )d\tau +\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)\right) g_nQ_n \end{aligned}$$

and observe that, for any \(N>M\)

$$\begin{aligned} \left\| S_N\right. \left. -S_M \right\| _{L^2(m(x)dx)}^2=&\sum _{n=M+1}^{N}\left( \int _0^{t}\frac{{\mathfrak {e}}_\varPhi (t+h-\tau ,\cdot )-{\mathfrak {e}}_\varPhi (t-\tau ,\cdot )}{h}{\overline{\nu }}_\varPhi (\tau )d\tau \right. \\&+\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)\Bigg )^2g^2_n\\ \le&2\sum _{n=M+1}^{N}\Bigg (\left( \int _0^{t}\frac{{\mathfrak {e}}_\varPhi (t+h-\tau ,\cdot )-{\mathfrak {e}}_\varPhi (t-\tau ,\cdot )}{h}{\overline{\nu }}_\varPhi (\tau )d\tau \right) ^2\\&+\left( \lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)\right) ^2\Bigg )g^2_n\\ \le&2\left( \frac{1}{h^2}{\mathcal {I}}^2_\varPhi (t)+K^2(t)\right) \sum _{n=M+1}^{N}g_n^2, \end{aligned}$$

where we used Eq. (11). Being \((g_n)_{n \ge 0}\in \ell ^2\), we have that \(S_N\) converges in \(L^2(m(x)dx)\) by Cauchy’s criterion. This implies that we can use Parseval’s identity in the right-hand side of (31) to achieve

$$\begin{aligned} \begin{aligned}&\left\| \sum _{n=1}^{+\infty }\left( \int _0^{t}\frac{{\mathfrak {e}}_\varPhi (t+h-\tau ,\cdot )-{\mathfrak {e}}_\varPhi (t-\tau ,\cdot )}{h}{\overline{\nu }}_\varPhi (\tau )d\tau \right. \right. \\&\qquad \left. +\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)\Bigg )g_nQ_n\right\| _{L^2(m(x)dx)}^2\\&=\sum _{n=1}^{+\infty }\left( \int _0^{t}\frac{{\mathfrak {e}}_\varPhi (t+h-\tau ,\cdot )-{\mathfrak {e}}_\varPhi (t-\tau ,\cdot )}{h}{\overline{\nu }}_\varPhi (\tau )d\tau \right. \\ {}&\qquad +\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)\Bigg )^2g_n^2\\&=\sum _{n=1}^{+\infty }\left( \frac{J_\varPhi (t+h)-J_\varPhi (t)}{h}-\frac{1}{h}\int _{t}^{t+h}({\mathfrak {e}}_\varPhi (t+h-\tau ,\cdot )-{\mathfrak {e}}_\varPhi (0,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \right. \\ {}&\qquad +\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)\Bigg )^2g_n^2, \end{aligned} \end{aligned}$$
(32)

where

$$\begin{aligned} J_\varPhi (t)=\int _0^t {\mathfrak {e}}_\varPhi (t-\tau ;-\lambda _n){\overline{\nu }}_\varPhi (\tau )d\tau . \end{aligned}$$

Now let us observe that \(J_\varPhi \) belongs to \(C^1\) and

$$\begin{aligned} J_\varPhi '(t)=-\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n) \end{aligned}$$

thus, by Lagrange’s theorem, we know there exists \(\theta _n(h)\in [t,t+h]\) such that

$$\begin{aligned} \frac{J_\varPhi (t+h)-J_\varPhi (t)}{h}=-\lambda _n {\mathfrak {e}}_\varPhi (\theta _n(h);-\lambda _n). \end{aligned}$$

Using the latter in Eq. (32) we obtain

$$\begin{aligned} \begin{aligned}&\left\| \sum _{n=1}^{+\infty }\left( \int _0^{t}\frac{{\mathfrak {e}}_\varPhi (t+h-\tau ,\cdot )-{\mathfrak {e}}_\varPhi (t-\tau ,\cdot )}{h}{\overline{\nu }}_\varPhi (\tau )d\tau \right. \right. \\&\qquad \left. +\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)\Bigg )g_nQ_n\right\| _{L^2(m(x)dx)}^2\\&\le 2\sum _{n=1}^{+\infty }\left( \lambda _n^2({\mathfrak {e}}_\varPhi (t;-\lambda _n)-{\mathfrak {e}}_\varPhi (\theta _n(h);-\lambda _n))^2\right. \\ {}&\qquad \left. +\left( \frac{{\overline{\nu }}_\varPhi (t)}{h}\int _{0}^{h}({\mathfrak {e}}_\varPhi (\tau ,\cdot )-{\mathfrak {e}}_\varPhi (0,\cdot ))d\tau \right) ^2\right) g_n^2. \end{aligned} \end{aligned}$$
(33)

Now let us show that the series in the right-hand side of (33) normally converges. Indeed, considering \(h \in [0,\delta ]\) for some \(\delta >0\), we have

$$\begin{aligned} \sum _{n=1}^{+\infty }&\Bigg (\lambda _n^2({\mathfrak {e}}_\varPhi (t;-\lambda _n)-{\mathfrak {e}}_\varPhi (\theta _n(h);-\lambda _n))^2\\ {}&\qquad +\left( \frac{{\overline{\nu }}_\varPhi (t)}{h}\int _{0}^{h}({\mathfrak {e}}_\varPhi (\tau ,\cdot )-{\mathfrak {e}}_\varPhi (0,\cdot ))d\tau \right) ^2\Bigg )g_n^2\\&\le (2K^2(t+\delta )+{\overline{\nu }}_\varPhi ^2(t))\sum _{n=1}^{+\infty }g_n^2. \end{aligned}$$

Thus we can take the limit as \(h \rightarrow 0^+\) inside the summation sign, obtaining, since \({\mathfrak {e}}_\varPhi (t;-\lambda _n)\) is continuous,

$$\begin{aligned} \begin{aligned} \lim _{h \rightarrow 0^+}\left\| \right.&\left. \sum _{n=1}^{+\infty }\left( \int _0^{t}\frac{{\mathfrak {e}}_\varPhi (t+h-\tau ,\cdot )-{\mathfrak {e}}_\varPhi (t-\tau ,\cdot )}{h}{\overline{\nu }}_\varPhi (\tau )d\tau \right. \right. \\&\qquad \left. \left. +\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)\right) g_nQ_n \right\| _{L^2(m(x)dx)}=0, \end{aligned} \end{aligned}$$

that is to say

$$\begin{aligned} \begin{aligned}&\lim _{h \rightarrow 0^+}\Bigg \Vert \frac{1}{h}\int _0^{t}(u(t+h-\tau ,\cdot )-u(t-\tau ,\cdot )){\overline{\nu }}_\varPhi (\tau )d\tau \\&\qquad +\sum _{n=1}^{+\infty }\lambda _n {\mathfrak {e}}_\varPhi (t;-\lambda _n)g_nQ_n\Bigg \Vert _{L^2(m(x)dx)}=0. \end{aligned} \end{aligned}$$
(34)

Taking the limit as \(h \rightarrow 0^+\) in (27) and using Eqs. (28) and (34) we achieve Eq. (26).

Uniqueness of the solution follows from the fact that \((Q_n)_{n \in {\mathbb {N}}}\) is a complete orthonormal system in \(L^2(m(x)dx)\). Finally, the fact that \(p_\varPhi (t,x;y)\) is the fundamental solution of (19) follows from the Spectral Decomposition Theorem 4. \(\square \)

Remark 2

Let us observe that if \(g \in C^2_0(E)\) and \(\varPhi \) is a complete Bernstein function, we can use Theorem 3 to obtain that u(ty) defined in Eq. (41) is solution of (19) and can be extended to an analytic function in a sector \({\mathbb {C}}(\alpha )\) for some \(\alpha <\frac{\pi }{2}\).

5.3 The Time-non-local Forward Kolmogorov Equation

Here we want to focus on the following time-non-local Cauchy problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \partial _t^\varPhi v(t,x)={\mathcal {F}}v(t,x) &{} t>0, \ x \in E \\ v(0,x)=f(x) &{} x \in E \end{array}\right. } \end{aligned}$$
(35)

for suitable initial datum f. For the definition of strong solution, we still refer to Definition 4. To do this, we first need to show the following preliminary result, whose proof can be omitted since it follows by direct calculations and by using Pearson equation (6).

Lemma 4

Let X(t) be a Pearson diffusion of spectral category I with diffusion space E, associated family of orthonormal polynomials \(Q_n\) with respective eigenvalues \(\lambda _n\), stationary density m(x) and Fokker–Planck operator \({\mathcal {F}}\). Then

$$\begin{aligned} {\mathcal {F}}(m(x)Q_n(x))=-\lambda _nm(x)Q_n(x). \end{aligned}$$

Now that we have proven this Lemma, we can actually show existence, uniqueness and spectral decomposition of the strong solutions (in \(L^2(m(x)dx)\)) of (35).

Theorem 6

Let X(t) be a Pearson diffusion of spectral category I with diffusion space E, associated family of orthonormal polynomials \(Q_n\) with respective eigenvalues \(\lambda _n\), Fokker–Planck operator \({\mathcal {F}}\) and bounded stationary density m(x). Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function. Let \(f:E \rightarrow {\mathbb {R}}\) be such that \(f/m \in L^2(m(x)dx)\) is decomposed as

$$\begin{aligned} f/m=\sum _{n=0}^{+\infty }f_nQ_n \end{aligned}$$

where the series converges in \(L^2(m(x)dx)\), absolutely for fixed \(x \in E\) and uniformly on compact intervals \([x_1,x_2]\subset E\). Then the unique strong solution in \(L^2(m(x)dx)\) of the problem (19) is given by

$$\begin{aligned} v(t,x)=m(x)\sum _{n=0}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x)f_n. \end{aligned}$$
(36)

Moreover, \(p_\varPhi (t,x;y)\) is the fundamental solution of the problem (19), in the sense that, for any initial datum f such that \(f/m\in L^2(m(x)dx)\) and \(f/m=\sum _{n=0}^{+\infty }f_nQ_n\) converges in \(L^2(m(x)dx)\), absolutely for fixed \(x \in E\) and uniformly on compact interval \([x_1,x_2]\subset E\), it holds

$$\begin{aligned} v(t,x)=\int _{E}p_\varPhi (t,x;y)f(y)dy. \end{aligned}$$

Proof

Since we are supposing that m(x) is bounded, it is easy to see that if a function f is such that \(f/m \in L^2(m(x)dx)\), then \(f \in L^2(m(x)dx)\). Thus the convergence of the series in (36) in \(L^2(m(x)dx)\) is ensured by the convergence of the series defined by v(tx)/m(x). Moreover, the strong continuity of v(tx) follows from the one of v(tx)/m(x). For the single summand of (36), it holds, by using the previous Lemma,

$$\begin{aligned} \partial _t^\varPhi m(x){\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x)f_n={\mathcal {F}}m(x){\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x)f_n. \end{aligned}$$

Finally, the exact same proof of Theorem 5 still works in this case. \(\square \)

Remark 3

If m is not bounded (i.e. in the cases of the CIR process as \(b \in (0,1)\) and of the Jacobi process as \(a,b \in (-1,0)\)), one can prove in the same way that v(tx) is a strong solution in \(L^2(m^{-1}(x)dx)\).

If, additionally, \(f \in L^2(m(x)dx)\) and the same convergence assumptions on f/m hold also for \(f=\sum _{n=0}^{+\infty }Q_n{\widetilde{f}}_n\), then v(tx) is a strong solution in \(L^2(m(x)dx)\) even if m(x) is unbounded. Finally, let us recall that if \(f \in C^2_0(E)\) and \(\varPhi \) is a complete Bernstein function, then, by Lemma 2 and Theorem 3, we know that \(v(t,\cdot )\) can be analytically extended in a sectorial region \({\mathbb {C}}(\alpha )\) where \(\alpha <\frac{\pi }{2}\).

6 Spectral Decomposition of Time-non-local Pearson Diffusions of Spectral Category II and III

Now we consider the case of Pearson diffusions of spectral category II and III. We discuss them together since to exploit strong solutions of Eqs. (19) and (35) we have to directly use semigroup theory, being the spectral decomposition of \(p_\varPhi (t,x;y)\) more complicated.

Let us first focus on the spectral decomposition of the transition densities.

6.1 Spectral Decomposition of the Transition Density for Spectral Category II

Now let us show a spectral decomposition theorem for the transition density \(p_\varPhi (t,x;y)\) of a time-non-local Pearson diffusion of spectral category II.

Theorem 7

Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function, X(t) be a Pearson diffusion of spectral category II with diffusion space E, associated family of classical orthonormal polynomials \(Q_n(x)\) for \(n \le N_j\), with respective eigenvalues in the discrete spectrum of \(-{\mathcal {G}}\) given by \(\lambda _n\) and m(x) stationary density. Then, for any \(t>0\) and \(x,x_0 \in E\), it holds

$$\begin{aligned} p_\varPhi (t,x;x_0)=p_{d,\varPhi }(t,x;x_0)+p_{c,\varPhi }(t,x;x_0) \end{aligned}$$

where

$$\begin{aligned} p_{d,\varPhi }(t,x;x_0)=m(x)\sum _{n=0}^{N_j}{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x_0)Q_n(x) \end{aligned}$$
(37)

and

$$\begin{aligned} p_{c,\varPhi }(t,x;x_0)=\frac{m(x)}{\pi }\int _{\varLambda _j}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )a_j(\lambda )f_j(x;-\lambda )f_j(x_0;-\lambda )d\lambda , \end{aligned}$$

where \(j=1,2\) and the involved functions are defined in subsection 2.2.

Proof

By Theorem 1 we already know that

$$\begin{aligned} p(t,x;x_0)=p_{d}(t,x;x_0)+p_{c}(t,x;x_0) \end{aligned}$$

where

$$\begin{aligned} p_{d}(t,x;x_0)=m(x)\sum _{n=0}^{N_j}e^{-\lambda _n t}Q_n(x_0)Q_n(x) \end{aligned}$$

and

$$\begin{aligned} p_{c}(t,x;x_0)=\frac{m(x)}{\pi }\int _{\varLambda _j}^{+\infty }e^{-\lambda t}a_j(\lambda )f_j(x;-\lambda )f_j(x_0;-\lambda )d\lambda . \end{aligned}$$

Moreover, by Eq. (14) we know that

$$\begin{aligned} p_\varPhi (t,x;x_0)&=\int _0^{+\infty }p(s,x;x_0)f_\varPhi (s;t)ds\\&=\int _0^{+\infty }p_d(s,x;x_0)f_\varPhi (s;t)ds+\int _0^{+\infty }p_c(s,x;x_0)f_\varPhi (s;t)ds\\&:=p_{d,\varPhi }(t,x;x_0)+p_{c,\varPhi }(t,x;x_0). \end{aligned}$$

Concerning \(p_{d,\varPhi }\), we already know that it is given by Eq. (37), by linearity of the integral. Thus let us argue on \(p_{c,\varPhi }\). To use Fubini’s theorem we only have to show that

$$\begin{aligned} \int _{\varLambda _j}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )|a_j(\lambda )f_j(x;-\lambda )f_j(x_0;-\lambda )|d\lambda <+\infty . \end{aligned}$$

The estimates in [44, pp. 3522, 3526] lead to (for fixed \(x,x_0 \in E\))

$$\begin{aligned} |a_j(\lambda )f_j(x;-\lambda )f_j(x_0;-\lambda )|\le C_j |\varDelta _j(\lambda )|^{-1}(1+O(|\varDelta _j(\lambda )|^{-1})). \end{aligned}$$

Moreover, we have that

$$\begin{aligned} {\mathfrak {e}}_\varPhi (t;-\lambda )\le M(t)\lambda ^{-1} \end{aligned}$$

and \(|\varDelta _j(\lambda )| \ge C_j \lambda ^{\frac{1}{2}}\) as \(\lambda \rightarrow +\infty \). Thus we finally get

$$\begin{aligned} {\mathfrak {e}}_\varPhi (t;-\lambda )|a_j(\lambda )f_j(x;-\lambda )f_j(x_0;-\lambda )|=O(\lambda ^{-\frac{3}{2}}), \end{aligned}$$

concluding the proof. \(\square \)

6.2 Spectral Decomposition of the Transition Density for Spectral Category III

Concerning the spectral decomposition of \(p_\varPhi (t,x;y)\) for spectral category III, we will not use asymptotics of the eigenfunctions. Instead, we will use the fact that the continuous part of the spectrum is of multiplicity two on our advantage.

Theorem 8

Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function, X(t) be a Student process with diffusion space \(E={\mathbb {R}}\), associated family of classical orthonormal polynomials \(Q_n(x)\) for \(n \le N_3\) with respective eigenvalues in the discrete spectrum of \(-{\mathcal {G}}\) given by \(\lambda _n\) and m(x) Student density. Then, for any \(t>0\) and \(x,x_0 \in E\), it holds

$$\begin{aligned} p_\varPhi (t,x;x_0)=p_{d,\varPhi }(t,x;x_0)+p_{c,\varPhi }(t,x;x_0) \end{aligned}$$

where

$$\begin{aligned} p_{d,\varPhi }(t,x;x_0)=m(x)\sum _{n=0}^{N_3}{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x_0)Q_n(x) \end{aligned}$$
(38)

and

$$\begin{aligned}&p_{c,\varPhi }(t,x;x_0) = m(x)\int _{\varLambda _3}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )\left( \frac{f_3(x_0,-\lambda )f_3(x,-\lambda )}{\left\| f_j(\cdot ,-\lambda ) \right\| _{L^2(m(dx))}^2}\right. \nonumber \\&\quad \left. +\frac{{\bar{f}}_3(x_0,-\lambda ){\bar{f}}_3(x,-\lambda )}{\left\| {\bar{f}}_j(\cdot ,-\lambda ) \right\| _{L^2(m(dx))}^2}+\frac{f_3(x_0,-\lambda ){\bar{f}}_3(x,-\lambda )+{\bar{f}}_3(x_0,-\lambda )f_3(x,-\lambda )}{\left\| {\bar{f}}_3(\cdot ,-\lambda ) \right\| _{L^2(m(dx))} \left\| f_3(\cdot ,-\lambda ) \right\| _{L^2(m(dx))}}\right) d\lambda . \qquad \end{aligned}$$
(39)

Proof

As in Theorem 7, we only need to deal with \(p_{c,\varPhi }\). Let us denote

$$\begin{aligned} u_3(x,-\lambda )=\mathfrak {R}\left( \frac{f_3(x,-\lambda )}{\left\| f_3(x,-\lambda ) \right\| _{L^2(m(dx))}}\right) . \end{aligned}$$

Then it is easy to see, by simple complex analysis arguments, that

$$\begin{aligned} p_c(t,x;x_0)=4m(x)\int _{\varLambda _3}^{+\infty }e^{-\lambda t}u_3(x,-\lambda )u_3(x_0,-\lambda )d\lambda . \end{aligned}$$

Now set \(x=x_0\) and let us first show the spectral decomposition in this case. Observe that

$$\begin{aligned} \int _{0}^{+\infty }\int _{\varLambda _3}^{+\infty }e^{-\lambda s}u^2_3(x,-\lambda )d\lambda f_\varPhi (s;t)ds=\int _{\varLambda _3}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )u_3^2(x,-\lambda )d\lambda \end{aligned}$$

by Fubini–Tonelli theorem, since the integrands are all non-negative. Thus we obtain, by Eq. (14),

$$\begin{aligned} p_\varPhi (t,x;x)=p_{d,\varPhi }(t,x;x)+p_{c,\varPhi }(t,x;x). \end{aligned}$$
(40)

Now let us fix \(x \in {\mathbb {R}}\) and consider any scale function \({\mathfrak {s}}:{\mathbb {R}}\rightarrow {\mathbb {R}}\), i.e. a continuous strictly increasing function on \({\mathbb {R}}\) such that for any \(a,b \in {\mathbb {R}}\) with \(a<b\) and \(y \in (a,b)\) it holds

$$\begin{aligned} {\mathbb {P}}_y(T_b<T_a)=\frac{{\mathfrak {s}}(y)-{\mathfrak {s}}(a)}{{\mathfrak {s}}(b)-{\mathfrak {s}}(a)} \end{aligned}$$

where \(T_b=\inf \{t>0: \ X(t)>b\}\) and \(T_a=\inf \{t>0: \ X(t)<a\}\) (see [60, Proposition 3.2, Chapter VII]). Let us denote by \({\mathfrak {s}}^{-1}\) its inverse function. Thus, define

$$\begin{aligned} V_x(t)=t\int _{{\mathfrak {s}}^{-1}({\mathfrak {s}}(x)-t)}^{{\mathfrak {s}}^{-1}({\mathfrak {s}}(x)+t)}sp(y)dy, \end{aligned}$$

where sp(y) is the speed density of X(t). \(V_x(t)\) is strictly increasing and continuous, thus so it is its inverse function \(V^{-1}_x(t)\). Moreover, as \(V_x(0)=0\) (by dominated convergence theorem, since the speed measure is finite) we also have \(V_x^{-1}(0)=0\) and then \(V_x^{-1}(t)\) is bounded in a neighbourhood of 0. Then, by [73, Theorem 2.1], we know that there exist two constants \(t_0,C>0\) (depending on x) such that for any \(t<t_0\) it holds

$$\begin{aligned} \int _0^t p(t,x;x)dt\le C V^{-1}_x(t). \end{aligned}$$

In particular this means that there exists \(\varepsilon >0\) (for instance \(\varepsilon =t_0/2\)) such that \(\int _0^{\varepsilon }p(t,x;x)dt<+\infty \).

Now let us recall that \(f_\varPhi (s;t)\) is continuous for fixed t and \(f_\varPhi (0^+;t)={\bar{\nu }}_\varPhi (t)\). Thus, for \(s \in [0,\varepsilon ]\), there exists a constant \(M(t)>0\) such that \(f_\varPhi (s;t)\le M(t)\). Moreover, from the spectral decomposition, we have that p(txx) is decreasing in t, thus \(p(t,x;x)\le p(\varepsilon ,x;x)\) for any \(t \ge \varepsilon \). Thus we have

$$\begin{aligned} p_\varPhi (t,x;x)&=\int _0^{+\infty }p(s,x;x)f_\varPhi (s;t)ds\\&=\int _0^{\varepsilon }p(s,x;x)f_\varPhi (s;t)ds+\int _\varepsilon ^{+\infty }p(s,x;x)f_\varPhi (s;t)ds\\&\le M(t)\int _0^{\varepsilon }p(s,x;x)ds+p(\varepsilon ,x;x)\int _{\varepsilon }^{+\infty }f_\varPhi (s;t)ds<+\infty . \end{aligned}$$

Thus, by Eq. (40) and the fact that both the summands are non-negative, we have that

$$\begin{aligned} \int _{\varLambda _3}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )u_3^2(x;-\lambda )d\lambda <+\infty . \end{aligned}$$

Now we can consider the case \(x_0 \not = x\). Indeed, in such case, we have

$$\begin{aligned}&\left( \int _{\varLambda _3}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )|u_3(x_0;-\lambda )u_3(x;-\lambda )|d\lambda \right) ^2 \\&\quad \le \left( \int _{\varLambda _3}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )u_3^2(x;-\lambda )d\lambda \right) \left( \int _{\varLambda _3}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )u_3^2(x_0;-\lambda )d\lambda \right) <+\infty , \end{aligned}$$

thus we can use Fubini’s theorem also in the case \(x \not = x_0\) to conclude the proof. \(\square \)

6.3 The Time-non-local Backward and Forward Kolmogorov Equations

Now we can focus on the strong solutions of the time-non-local Backward and Forward Kolmogorov Eqs. (19) and (35). Both the results are direct consequence of [23, Theorem 2.1].

Theorem 9

Let X(t) be a Pearson diffusion of spectral category II or III with diffusion space E and generator \({\mathcal {G}}\). Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function and \(g \in C^2_0(E)\). Then the strong solution of (19) with initial datum g is given by

$$\begin{aligned} u(t,y)=\int _{E}p_\varPhi (t,x;y)g(x)dx. \end{aligned}$$

Theorem 10

Let X(t) be a Pearson diffusion of spectral category II with diffusion space E and Fokker–Planck operator \({\mathcal {F}}\). Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function. Let \(f \in C^2_c(E)\). Then the strong solution of (35) with initial datum f is given by

$$\begin{aligned} v(t,x)=\int _{E}p_\varPhi (t,x;y)f(y)dy. \end{aligned}$$

Remark 4

For both previous Theorems, if \(\varPhi \) is a complete Bernstein function, then u(tx) and v(tx) can be extended by analyticity to a sectorial region \({\mathbb {C}}(\alpha )\) with \(\alpha <\frac{\pi }{2}\).

7 Stochastic Representation of the Solutions of the Backward and Forward Kolmogorov Equations

Let us recall that in the previous two Sections we have characterized \(p_\varPhi (t,x;y)\) as the fundamental solution of both problems (19) and (35). Recalling that \(p_\varPhi (t,x;y)\) is the transition probability density of \(X_\varPhi (t)\), we obtain some stochastic representation results. Concerning the backward Kolmogorov equation (19) we have the following result.

Proposition 3

Let X(t) be a Pearson diffusion with diffusion space E, associated family of orthonormal polynomials \(Q_n\) with respective eigenvalues \(\lambda _n\), generator \({\mathcal {G}}\) and stationary density m(x). Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function and \(g \in C^2_0(E)\). Then the unique strong solution of (19) is given by

$$\begin{aligned} u(t,y)={\mathbb {E}}_y[g(X_\varPhi (t))]. \end{aligned}$$
(41)

Proof

By definition of conditional expectation and the fact that \(p_\varPhi (t,x;y)\) is the transition density of \(X_\varPhi (t)\), we have

$$\begin{aligned} u(t,y)=\int _{E}p_\varPhi (t,x;y)g(x)dx \end{aligned}$$

concluding the proof by observing that \(p_\varPhi (t,x;y)\) is the fundamental solution of (19). \(\square \)

Remark 5

If \(X_\varPhi (t)\) belongs to spectral category I, then one can weaken the hypotheses on the initial datum, by asking the same hypotheses as in Theorem 5 in place of \(g \in C^2_0(E)\).

However, since we are using adjoint operators, we have to ask for more strict hypotheses for the stochastic representation of strong solutions of (35).

Proposition 4

Let X(t) be a Pearson diffusion with diffusion space E, associated family of orthonormal polynomials \(Q_n\) with respective eigenvalues \(\lambda _n\), Fokker–Planck operator \({\mathcal {F}}\) and stationary density m(x). Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function and \(f \in L^1(dx) \cap C^2_c(E)\) with \(f \ge 0\) and \(\left\| f \right\| _{L^1(dx)}=1\). Then the unique strong solution of the problem (35) is given by

$$\begin{aligned} v(t,x)=\frac{d }{d x}{\mathbb {P}}_f(X_\varPhi (t)\le x) \end{aligned}$$
(42)

where \({\mathbb {P}}_f(\cdot )\) is the probability measure obtained by \({\mathbb {P}}\) conditioning with the request that \(X_\varPhi (0)\) admits f(x)dx as probability law.

Proof

Let us observe that (setting \(p_\varPhi (t,x;y)\) as \(p_\varPhi (t,x;y)\chi _{E}(x)\))

$$\begin{aligned} {\mathbb {P}}_f(X_\varPhi (t)\le x)&=\int _0^{+\infty }{\mathbb {P}}_y(X_\varPhi (t)\le x)f(y)dy\\&=\int _0^{+\infty }\int _{-\infty }^{x}p_\varPhi (t,z;y)dzf(y)dy\\&=\int _{-\infty }^{x}\int _0^{+\infty }p_\varPhi (t,z;y)f(y)dydz. \end{aligned}$$

Thus it holds

$$\begin{aligned} v(t,x)=\frac{d }{d x}{\mathbb {P}}_f(X_\varPhi (t)\le x)=\int _0^{+\infty }p_\varPhi (t,x;y)f(y)dy \end{aligned}$$

concluding the proof by the fact that \(p_\varPhi (t,x;y)\) is the fundamental solution of (35). \(\square \)

Remark 6

As for Proposition 3, if \(X_\varPhi (t)\) belongs to spectral category I, then one can weaken the hypotheses on the initial datum, by asking the same hypotheses as in Theorem 6 in place of \(f \in C^2_c(E)\).

8 Limit Distributions, First Order Stationarity and Correlation Structure of Time-non-local Pearson Diffusions

Now let us use the previous results to establish some properties of time-non-local Pearson diffusions \(X_\varPhi (t)\). Let us first show that for regular enough initial distributions the limit distribution of any time-non-local Pearson diffusion admits is given by the stationary distribution m(x)dx of the respective Pearson diffusion.

Theorem 11

Let X(t) be a Pearson diffusion with diffusion space E, associated family of orthonormal polynomials \(Q_n\) with respective eigenvalues \(\lambda _n\) and stationary distribution m(x)dx. Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function. Then

$$\begin{aligned} \lim _{t \rightarrow +\infty }p_\varPhi (t,x;y)\rightarrow m(x) \end{aligned}$$
(43)

for fixed \(x,y \in E\).

Moreover, for any \(f \in C^2_c(E) \cap L^1(E)\) such that \(f \ge 0\) and \(\left\| f \right\| _{L^1(E)}=1\), it holds

$$\begin{aligned} \lim _{t \rightarrow +\infty }\left( \frac{d }{d x}{\mathbb {P}}_f(X_\varPhi (t)\le x)\right) =m(x) \end{aligned}$$

for any fixed \(x \in E\).

Proof

Let us first argue for \(X_\varPhi (t)\) of spectral category I. First of all, we have (since for any Pearson diffusion \(\lambda _0=0\) and \(Q_0(x)\equiv 1\)):

$$\begin{aligned} p_\varPhi (t,x;y)=m(x)+\sum _{n=1}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x)Q_n(y). \end{aligned}$$

Since we are sending \(t \rightarrow +\infty \), we can suppose \(t\ge t_0>0\) and then \({\mathfrak {e}}_\varPhi (t;-\lambda _n)\le {\mathfrak {e}}_\varPhi (t_0;-\lambda _n)\). By Lemma 3, we can use dominated convergence theorem to obtain Eq. (43).

Concerning \(X_\varPhi (t)\) of spectral category II, arguing as before, we have (for \(j=1,2\))

$$\begin{aligned} p_\varPhi (t,x;y)= & {} m(x)+\sum _{n=1}^{N_j}{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x)Q_n(y)\\&+m(x)\int _{\varLambda _j}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )a_j(\lambda )f_j(x;-\lambda )f_j(y;-\lambda )d\lambda , \end{aligned}$$

where \(\sum _{n=1}^{0}\equiv 0\). Arguing as before and observing as in Theorem 7 that

$$\begin{aligned} |{\mathfrak {e}}_\varPhi (t_0;-\lambda )a_j(\lambda )f_j(x;-\lambda )f_j(y;-\lambda )|=O(\lambda ^{-\frac{3}{2}}), \end{aligned}$$

we can use again dominated convergence theorem to achieve Eq. (43).

Finally, if \(X_\varPhi (t)\) is a Student diffusion we have

$$\begin{aligned} p_\varPhi (t,x;y)= & {} m(x)+\sum _{n=1}^{N_3}{\mathfrak {e}}_\varPhi (t;-\lambda _n)Q_n(x)Q_n(y)\\&+4m(x)\int _{\varLambda _3}^{+\infty }{\mathfrak {e}}_\varPhi (t;-\lambda )u_3(x;-\lambda )u_3(y;-\lambda )d\lambda , \end{aligned}$$

where \(u_3\) is defined in the proof of Theorem 8. Since we have shown by Cauchy-Schwartz inequality that \(|{\mathfrak {e}}_\varPhi (t_0;-\lambda )u_j(x;-\lambda )u_j(y;-\lambda )|\) is integrable in \(\lambda \), we can still use dominated convergence theorem to achieve Eq. (43).

Finally, arguing as in Proposition 4, we have

$$\begin{aligned} \frac{d }{d x}{\mathbb {P}}_f(X_\varPhi (t)\le x)=\int _0^{+\infty }p_\varPhi (t,x;y)f(y)dy \end{aligned}$$

and then we conclude the proof by dominated convergence theorem. \(\square \)

Remark 7

If \(X_\varPhi (t)\) belongs to spectral category I, then we can substitute the hypothesis \(f \in C^2_c(E)\) with the ones of Theorem 6.

In particular, we can show that m(x) is actually the first-order stationary distribution of \(X_\varPhi (t)\).

Proposition 5

Let X(t) be a Pearson diffusion with diffusion space E, associated family of orthonormal polynomials \(Q_n\) with respective eigenvalues \(\lambda _n\) and stationary distribution m(x)dx. Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function. Then it holds

$$\begin{aligned} \frac{d }{d x}{\mathbb {P}}_m(X_\varPhi (t)\le x)=m(x) \end{aligned}$$

for any \(t \ge 0\) and \(x \in E\).

Proof

Let us observe that

$$\begin{aligned} \frac{d }{d x}{\mathbb {P}}_m(X_\varPhi (t)\le x)&=\int _{E}p_\varPhi (t,x;y)m(y)dy\\&=\int _{E}\int _{0}^{+\infty }p(s,x;y)f_\varPhi (s;t)dsm(y)dy\\&=\int _{0}^{+\infty }\int _{E}p(s,x;y)m(y)dyf_\varPhi (s;t)ds\\&=m(x)\int _{0}^{+\infty }f_\varPhi (s;t)ds=m(x) \end{aligned}$$

where we used Tonelli–Fubini’s theorem (since the integrand is non-negative) and the fact that m(x) is the stationary measure of X(t). \(\square \)

Remark 8

An interesting problem we leave open consists in giving estimates on the convergence rates of non-stationary time-non-local Pearson diffusions, as done in the classical case for the Fisher–Snedecor diffusion in [40].

However, the process \(X_\varPhi (t)\) is not second-order stationary, neither in wide sense, as we can see from the following result.

Proposition 6

Let X(t) be a Pearson diffusion with diffusion space E, associated family of orthonormal polynomials \(Q_n\) with respective eigenvalues \(\lambda _n\) and stationary distribution m(x). Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function. Then it holds, for any \(t \ge s \ge 0\),

$$\begin{aligned} Corr_m(X_\varPhi (t),X_\varPhi (s))= & {} \lambda _1\int _0^{s}{\mathfrak {e}}_\varPhi (t-\tau ;-\lambda _1)dU_\varPhi (\tau )\\&-2+2{\mathfrak {e}}_\varPhi (s;-\lambda _1)+{\mathfrak {e}}_\varPhi (t;-\lambda _1), \end{aligned}$$

where \(U_\varPhi (t)={\mathbb {E}}[L_\varPhi (t)]\).

Proof

This is a consequence of the fact that \(Corr_m(X(t),X(s))=e^{-\lambda _1|t-s|}\) and [9, Theorem 2]. \(\square \)

Since, if \(X_\varPhi (0)\) admits m(x) as probability density function, the process \(X_\varPhi (t)\) is first-order stationary, but not second-order stationary, we cannot exploit long-range or short-range dependence properties with the usual definitions. Thus, taking inspiration from [14, Lemmas 2.1 and 2.2], we give the following definitions.

Definition 5

Given the function \(\gamma (n)=Corr_m(X_\varPhi (n),X_\varPhi (0))\) for \(n \in {\mathbb {N}}\):

  • \(X_\varPhi (t)\) is said to exhibit long-range dependence if \(\gamma (n)\sim \ell (n)n^{-\alpha }\) where \(\ell (n)\) is a slowly varying function and \(\alpha \in \left( 0,1\right) \);

  • \(X_\varPhi (t)\) is said to exhibit short-range dependence if \(\sum _{n=1}^{+\infty }|\gamma (n)|<+\infty \).

Now let us observe that, by Proposition 6, we know that

$$\begin{aligned} Corr_m(X_\varPhi (t),X_\varPhi (0))={\mathfrak {e}}_\varPhi (t;-\lambda _1) \end{aligned}$$

thus we only have to exploit some asymptotic properties of \({\mathfrak {e}}_\varPhi (t;-\lambda _1)\).

Proposition 7

Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function. Then the following properties hold:

  1. 1.

    If \(\varPhi \) is regularly varying at \(0^+\) with order \(\alpha \in (0,1)\), then, for any fixed \(\lambda >0\), \({\mathfrak {e}}_\varPhi (t;-\lambda )\) is regularly varying at \(\infty \) with order \(-\alpha \).

  2. 2.

    Suppose \(\lim _{z \rightarrow 0^+}\frac{\varPhi (z)}{z}=l \in (0,+\infty )\). Then, for fixed \(\lambda >0\), \({\mathfrak {e}}_\varPhi (t;-\lambda )\) is integrable in \((0,+\infty )\).

Proof

Property (1) has been already proved in [8] thus we focus on property (2). Let us define the integral function \(J(t)=\int _0^t {\mathfrak {e}}_\varPhi (s;-\lambda )ds\) and set \({\overline{J}}\) the Laplace-Stieltjes transform of J. We have

$$\begin{aligned} {\overline{J}}(z)=\frac{\varPhi (z)}{z(\varPhi (z)+\lambda )}. \end{aligned}$$

Taking the limit as \(z \rightarrow 0^+\), since \(\varPhi (0)=0\), we have \(\lim _{z \rightarrow 0^+}{\overline{J}}(z)=l/\lambda \). Thus, by Karamata’s Tauberian theorem, we conclude the proof. \(\square \)

As a direct Corollary we have

Corollary 1

Let \(\varPhi \in \mathcal {BF}\) be a Bernstein function and \(X_\varPhi (t)\) be a time-non-local Pearson diffusion such that \(X_\varPhi (0)\) admits probability density function m(x). Then the following properties hold:

  1. (i)

    If \(\varPhi \) is regularly varying at \(0^+\) with order \(\alpha \in (0,1)\), then \(X_\varPhi (t)\) is long-range dependent with respect to the initial datum;

  2. (ii)

    If \(\lim _{z \rightarrow 0^+}\frac{\varPhi (z)}{z}=l \in (0,+\infty )\), then \(X_\varPhi (t)\) is short-range dependent with respect to the initial datum.

Proof

Property (i) directly follows from property (1) of the previous proposition. Property (ii) instead follows from property (2) of the previous proposition and the integral criterion for convergence of the series. \(\square \)

Now we can consider the examples we stated in Sect. 3:

  • If \(\varPhi (\lambda )=\lambda ^\alpha \) with \(\alpha \in (0,1)\) then we have that \(X_\varPhi (t)\) is actually a fractional Pearson diffusion as exploited in [43] for the first spectral category or [44] for the second spectral category. In any case, \(X_\alpha \) is long-range dependent with respect to the initial datum as \(\varPhi \) is regularly varying at \(0^+\) with index \(\alpha \in (0,1)\). In particular, by using the formula for the autocorrelation function given in [42], that is

    $$\begin{aligned} Corr(X_\varPhi (t),X_\varPhi (s))=E_\alpha (-\lambda _1 t^\alpha )+\frac{\lambda _1 \alpha t^\alpha }{\varGamma (1+\alpha )}\int _0^{\frac{s}{t}}\frac{E_\alpha (-\lambda _1 t^\alpha (1-z)^\alpha )}{z^{1-\alpha }}dz, \end{aligned}$$

    we have that the process is actually long-range dependent with respect to any datum \(X_\varPhi (s)\).

  • If \(\varPhi (\lambda )=(\lambda +\theta )^\alpha -\theta ^\alpha \) for \(\alpha \in (0,1)\), we have \(\lim _{\lambda \rightarrow 0^+}\frac{\varPhi (\lambda )}{\lambda }=\alpha \). Thus the tempered fractional Pearson diffusions \(X_\varPhi (t)\) are short-range dependent with respect to the initial datum;

  • If \(\varPhi (\lambda )=\log (1+\lambda ^\alpha )\) for \(\alpha \in (0,1)\), we have \(\lim _{\lambda \rightarrow 0^+}\frac{\varPhi (\lambda )}{\lambda ^\alpha }=1\), thus \(\varPhi (\lambda )\) is regularly varying at \(0^+\) with index \(\alpha \in (0,1)\). This implies that the geometric fractional Pearson diffusions \(X_{\varPhi }(t)\) are long-range dependent with respect to the initial datum;

  • If \(\varPhi (\lambda )=\log (1+\lambda )\), we have \(\lim _{\lambda \rightarrow 0^+}\frac{\varPhi (\lambda )}{\lambda }=1\). This implies that the Gamma time-changed Pearson diffusions \(X_{\varPhi }(t)\) are short-range dependent with respect to the initial datum.

Let us also observe that the lack of second order stationarity leads to a non-stationary behaviour of the autocorrelation function \(Corr_m(X_\varPhi (t),X_\varPhi (t_0))\), i.e. it is dependent not only on \(t-t_0\) but directly on the values t and \(t_0\). In particular, for \(t \rightarrow +\infty \) (hence \(t \gg t_0\)) the asymptotic behaviour is dictated by \({\mathfrak {e}}_\varPhi (t;-\lambda _1)\). On the other hand, if \(t \ll t_0\), the autocorrelation function tends to a constant (depending on \(t_0\)). This change of asymptotic behaviour, despite the first-order stationarity, is similar to the aging phenomena [18, 61]. Let us stress out that the aging phenomena is typical of more general Continuous Time Random Walks (CTRWs) than the simple Poisson process (see, for instance, [19]). On the other hand, time-changed processes like \(X_\varPhi (t)\) arise naturally when considering limits of CTRWs with inter-jump times that are non necessarily exponentially distributed and/or independent, thus agreeing with the previous observation.