1 Introduction

We study the Skorokhod embedding problem for Markov processes \(X=(X_t)_{t\ge 0}\) evolving in a locally compact space E. That is, given measures \(\mu \) and \(\nu \) on E, the task is to find a stopping time T such that

$$\begin{aligned} X_0 \sim \mu \quad \text { and } \quad X_T \sim \nu . \qquad \qquad \mathrm {SEP}(X, \mu , \nu ) \end{aligned}$$

Throughout this article we are interested in non-randomised stopping times, that is, T is a stopping time in the filtration generated by X. When X is a one-dimensional Brownian motion, this problem has received much attention, partly due to its importance in mathematical finance [42]. In this case, there exists a wealth of different stopping times that solve SEP\((X, \mu , \nu )\), see [51] for an overview. One of the most intuitive solutions is due to Root [57]: for a one-dimensional Brownian motion and \(\mu \), \(\nu \) in convex order, there exists a space-time subset – the so-called Root barrier – such that its hitting time by \((t,X_t)\) solves SEP\((X, \mu , \nu )\). More recently, connections with obstacle PDEs [23, 28, 33, 34, 38], optimal transport [3, 5, 6, 7, 20, 37, 39, 40] and optimal stopping [23, 25], as well as extensions to the multi-marginal case [4, 22, 56], have been developed.

However, already for multi-dimensional Brownian motion much less is known about solutions to SEP\((X, \mu , \nu )\), see for example the work of Falkner [30] that highlights some of the difficulties that arise in the multi-dimensional Brownian case. For general Markov processes the literature gets even sparser: Rost [58, 59] developed a potential theoretic approach extending previous work of Root, but in general this only shows the existence of a randomised stopping time for SEP\((X, \mu , \nu )\) when \(\mu \) and \(\nu \) are in balayage order. Subsequent works of Chacon, Falkner, and Fitzsimmons [18, 30, 32] expand on these results and provide sufficient conditions for the existence of a non-randomised stopping time; however, none of these works addresses the question of how to compute these stopping times \(T(\omega )\) for a given sample trajectory \(X(\omega )\). Another approach is the application of optimal transport to SEP\((X, \mu , \nu )\), as initiated by Beiglböck, Cox and Huesmann [3]. This covers Feller processes, but verifying the assumptions can be non-trivial. More importantly, the optimal transport approach currently only addresses the existence and optimality of a stopping time but not its computation. Besides these two approaches – (Rost’s) potential theoretic approach and the optimal transport approach – we are not aware of a general methodology that produces solutions to SEP\((X, \mu , \nu )\) for Markov processes.

Contribution. We focus on the large class of right-continuous transient standard Markov processes satisfying a duality assumption and absolute continuity of the semigroup. Our main result is Theorem 3.6, which extends Rost’s results and shows existence of a non-randomised Root stopping time and, more importantly, represents the Root barrier as a free boundary via the semigroup of the dual space-time process. This makes it possible to apply classical dynamic programming to calculate the Root barrier for a large class of Markov processes. Theorem 3.6 also implies that if a PDE theory is available that ensures the well-posedness of the free boundary problem formulated as a PDE problem, then numerical methods for PDEs can be used to compute the barrier. However, in general this requires much stronger assumptions on the Markov process; e.g. when the generator involves non-local terms, as is already the case for one-dimensional Lévy processes, the well-posedness of such PDEs is an active research area.

We present a series of examples of processes to which our result applies. The most important one is arguably multi-dimensional Brownian motion (or, more generally, hypo-elliptic diffusions), but we also discuss stable Lévy processes and Markov chains on a discrete state space. In all these cases our result allows us to compute the Root barrier, and we present several numerical experiments to illustrate this point.

Finally, we show that our approach is flexible enough to construct new classes of solutions to the Skorokhod embedding problem: instead of hitting times of the space-time process \((t,X_t)\), we discuss hitting times of \((A_t,X_t )\) where A is an additive functional of X of the form \(\int _0^\cdot a(X_s) \,\mathrm {d}s\). We expect that such an approach holds in much greater generality for other functionals and leave this for further research.

Outline. The structure of the article is as follows: Sect. 2 introduces notation and basic results from potential theory, Sect. 3 contains the statement of our main result and Sect. 4 contains its proof. Section 5 then applies this to concrete examples of Markov processes and computations of Root barriers. Section 6 discusses how these results can be used to construct new solutions of SEP\((X, \mu , \nu )\). In Appendix A and Appendix B we present fundamental results from classical potential theory used throughout this article, and in Appendix C we discuss details of applying our result to Brownian motion in a Lie group.

2 Notations and assumptions

We briefly recall notations from potential theory, mostly following the presentation in Blumenthal and Getoor [12]. A detailed description can be found in “Appendix A”. Throughout, E is a locally compact metric space with countable base and \({\mathcal {E}}\) is the Borel-\(\sigma \)-algebra on E. In addition, we write \(\mathcal {E}^*\) for the \(\sigma \)-algebra of universally measurable sets and \(\mathcal {E}^n\) for the \(\sigma \)-algebra of nearly Borel sets, see Definition A.1.

Let \(\left( \Omega , {\mathcal {F}}, ({\mathcal {F}}_t)_{t \ge 0}, (X_t)_{t\ge 0}, ({\mathbb {P}}^x)_{x \in E}\right) \) denote a filtered probability space that carries a stochastic process X. To allow for killing we add an absorbing cemetery state \(\Delta \) to the state space, that is, we define \(E_\Delta := E \cup \{\Delta \}\) and for all \(t\ge 0\) if \(X_t(\omega )=\Delta \), then \(X_s(\omega )=\Delta \) for all \(s>t\). Denote with \(\zeta := \inf \{t \ge 0:~ X_t =\Delta \}\) the lifetime of the process. Each \({\mathbb {P}}^x\) is then a probability measure on paths with \(X_0 = x\), \({\mathbb {P}}^x\)-a.s. for all \(x\in E_\Delta \). Furthermore, for \(t\ge 0\) let \(\theta _t\) be the natural shift operator of the process, i.e. \(\theta _t(X_s(\omega ))=X_{t+s}(\omega )\) for all \(s \ge 0\). Throughout, we assume that X is a standard process, see Definition A.2; in particular we assume that X has càdlàg paths and satisfies the strong Markov property. We write \(P= (P_t)_{t\ge 0}\) for the Markovian transition semigroup of X and \(U = \int _0^\infty P_t\,\mathrm {d}t\) for its potential, and write as usual \(P_tf\), \(\mu P_t\), Uf and \(\mu U\) for the actions on Borel functions \(f:E\rightarrow {\mathbb {R}}\) and Borel measures \(\mu \) on E. For an \(({\mathcal {F}}_t)_{t\ge 0}\)-stopping time T, we write \(P_T(x,\,\mathrm {d}y)={\mathbb {P}}^x\left( X_T\in \,\mathrm {d}y; T <\zeta \right) \) and for first hitting times \(T_A =\inf \{t>0:\,X_t\in A\}\) of \(A\in \mathcal {E}\), we write \(P_A= P_{T_A}\). We write \(A^r\) for the regular points of a nearly Borel set A, see Table 3 in “Appendix A”.

A central role will be played by lifting X to a space-time process \({\overline{X}}\), that is \({\overline{X}}_t : = (\tau _t,X_t)\) with \(\tau _t=\tau _0+t\), with the space-time semigroup \(Q=(Q_t)_{t\ge 0}\) acting on Borel functions \(g:{\mathbb {R}}\times E\rightarrow {\mathbb {R}}\) and Borel measures \(\mu \) on E as follows:

$$\begin{aligned} Q_s g(t,x)= P_sg(t+s, \cdot )(x) \qquad (\delta _s\times \mu ) Q_t (I\times A)= {\mathbb {P}}^\mu (X_{s+t} \in A)\mathbb {1}_{\{s+t\in I\}}, \end{aligned}$$

where \(t\in {\mathbb {R}}\), \(x\in E\), \(A\in \mathcal {E}\) and \(I\subseteq {\mathbb {R}}\).

Duality. Throughout this paper we make the following assumption,

Assumption 2.1

There exists a standard process \(\widehat{X}\) with semigroup \(\widehat{P}\) on the same probability space, and some \(\sigma \)-finite measure \(\xi \) on E such that for all \(t \ge 0\) and \(f,g \ge 0\) \(\mathcal {E}^*\) -measurable,

$$\begin{aligned} \int _{E} (P_t f) g \,\mathrm {d}\xi = \int _{E} f (g \widehat{P}_t) \,\mathrm {d}\xi . \end{aligned}$$
(2.1)

Furthermore, the semigroups of X and \(\widehat{X}\) are absolutely continuous with respect to \(\xi \),

$$\begin{aligned} P_t(x,\cdot ) \ll \xi , \quad \widehat{P}_t(\cdot , y) \ll \xi ,\qquad \forall x,y \in E. \end{aligned}$$
(2.2)

Remark 2.2

Relation (2.1) is referred to in the literature as weak duality. The processes X and \(\widehat{X}\) are said to be in strong duality with respect to \(\xi \) (as defined in [12, Ch. VI] or [19, Ch.13]), if, in addition to (2.1), the resolvent kernels are absolutely continuous with respect to \(\xi \). This is weaker than the absolute continuity of the semigroup, so that in particular, strong duality of X and \(\widehat{X}\) holds under Assumption 2.1.
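To see (2.1) in the simplest possible setting, here is a small numerical sketch (ours, not from the paper): a continuous-time Markov chain on a finite state space with \(\xi \) the counting measure, for which the dual chain simply has the transposed generator (this is the discrete setting of Sect. 5.1). We use an asymmetric walk on a cycle only so that the transposed rate matrix is again conservative; such a recurrent toy chain of course does not meet the transience assumptions used later, it merely illustrates the duality relation itself.

```python
import numpy as np
from scipy.linalg import expm

# Asymmetric random walk on the cycle Z/nZ in continuous time; xi = counting measure.
n, lam, p = 7, 1.0, 2.0 / 3.0
L = np.zeros((n, n))
for i in range(n):
    L[i, (i + 1) % n] = lam * p          # jump up with rate lam * p
    L[i, (i - 1) % n] = lam * (1 - p)    # jump down with rate lam * (1 - p)
    L[i, i] = -lam                       # total jump rate out of each state

t = 0.7
P_t = expm(t * L)                        # semigroup P_t of X
Phat_t = expm(t * L.T)                   # semigroup of the dual chain X_hat (transposed rates)

rng = np.random.default_rng(0)
f, g = rng.random(n), rng.random(n)
lhs = np.sum((P_t @ f) * g)              # int_E (P_t f) g  dxi
rhs = np.sum(f * (Phat_t @ g))           # int_E  f (g P_hat_t) dxi
print(lhs, rhs)                          # equal up to floating point error
```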

We write \(\widehat{P}\) and \(\widehat{U}\) for the semigroup and potential kernel of \(\widehat{X}\), and we denote the actions of these operators on Borel functions f and measures \(\mu \), acting from the other side than for X, by \(f\widehat{P}_t\), \(f\widehat{U}\) and \(\widehat{U} \mu \). Furthermore, we use the prefix “co” for the corresponding properties relating to \(\widehat{X}\), e.g. coexcessive, copolar, cothin, etc., and we write \(\widehat{T}_A = \inf \{t>0:\,\widehat{X}_t\in A\}\) and \({}^rA\) for the coregular points of a measurable set A.

By [36, 65], absolute continuity of the semigroups implies that the corresponding space-time processes \((\tau _t,X_t)\), and \((\widehat{\tau }_t, \widehat{X}_t)\), where \(\widehat{\tau }_t= \widehat{\tau }_0-t\), are in strong duality with respect to the measure \(\lambda \otimes \xi \), where \(\lambda \) is the Lebesgue measure on the real line. We denote by \(\widehat{Q}\) the semigroup corresponding to the space-time process \((\widehat{\tau }_t, \widehat{X}_t)\). For every \(s\ge 0\) and \((\mathcal {B}({\mathbb {R}})\times \mathcal {E})\)-\(\mathcal {B}({\mathbb {R}})\)-measurable function g,

$$\begin{aligned} (Q_sg)(t,x) = P_sg(t+s,\cdot )(x), \qquad (g\widehat{Q}_s )(t,x)= g(t-s, \cdot )\widehat{P}_s (x). \end{aligned}$$

In addition, there exists a Borel function \((t,x,y) \mapsto p_t(x,y)\) such that for all \(t>0\) and \(x,y\) in E, \(P_t(x,\,\mathrm {d}y) = p_t(x,y) \xi (\mathrm {d}y)\) and \(\widehat{P}_t(\mathrm {d}x,y) = p_t(x,y)\xi (\mathrm {d}x)\), and p satisfies the Kolmogorov–Chapman relation

$$\begin{aligned} \forall t,s >0, \;\; \forall x,y \in E:\; \;\; p_{t+s}(x,y) = \int \xi (\mathrm {d}z) p_t(x,z) p_s(z,y). \end{aligned}$$
(2.3)

The function \(u(x,y) := \int _0^\infty p_t(x,y)\,\mathrm {d}t\) is excessive in x (for each fixed y), coexcessive in y, and is a density for U and \(\widehat{U}\).
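As a quick numerical illustration (a sketch of ours, not from the paper): for three-dimensional Brownian motion, which is transient and covered by the examples of Sect. 5.2, integrating the Gaussian transition density over time indeed recovers the Newtonian kernel \(c_3/\Vert x-y\Vert \) with \(c_3=1/(2\pi )\), cf. (5.13) below.

```python
import numpy as np
from scipy.integrate import quad

# u(x, y) = int_0^infty p_t(x, y) dt for 3d Brownian motion, evaluated numerically
# and compared with the Newtonian kernel 1 / (2 pi |x - y|).
def gaussian_density(t, r):
    """Transition density of 3d Brownian motion between two points at distance r."""
    return (2.0 * np.pi * t) ** -1.5 * np.exp(-r**2 / (2.0 * t))

r = 1.3
u_numeric, _ = quad(lambda t: gaussian_density(t, r), 0.0, np.inf, limit=200)
print(u_numeric, 1.0 / (2.0 * np.pi * r))    # both approximately 0.1224
```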

Note that the duality assumption implies by [12, Ch. IV, Prop. (1.11)] that a measure \(\mu \) is excessive if and only if it has a density which is coexcessive and finite \(\xi \)-almost everywhere. Hence, the density of the potential \(\mu U\) with respect to \(\xi \) is given by the (coexcessive) potential function \(\mu \widehat{U}\).

Table 1 (Densities of) semigroup and potentials for X and its dual \(\widehat{X}\)

Remark 2.3

The functions which are (co-)excessive with respect to P, \(\widehat{P}\), Q and \(\widehat{Q}\) are actually Borel-measurable. Indeed, strong duality of the corresponding processes guarantees the existence of a so-called reference measure (for more details see [12, Ch. VI]). In this case Proposition (1.3) in [12, Ch. V] implies that excessive functions are Borel-measurable.

We repeatedly use the following classical result,

Proposition 2.4

(Hunt’s switching formula, [12, VI.1.16]) Let X, \(\widehat{X}\) be standard processes in strong duality. Then for all Borel-measurable B, one has \(P_B u = u \widehat{P}_B\), i.e. for all \(x, y \in E\),

$$\begin{aligned} {\mathbb {E}}^x \big [ u(X_{T_B},y) \big ] = \widehat{{\mathbb {E}}}^y \big [ u(x,\widehat{X}_{\widehat{T}_B}) \big ]. \end{aligned}$$

Remark 2.5

The dual process \(\widehat{X}\) can be thought of as X running backwards in time. In fact, strong duality implies that for non-negative bounded Borel functions f and g it holds

$$\begin{aligned} {\mathbb {E}}^\xi [ f(X_0)g(X_t)] = \widehat{{\mathbb {E}}}^\xi [ f(\widehat{X}_t) g(\widehat{X}_0)]. \end{aligned}$$

More generally and ignoring technicalities (see [19, Ch.13] for details), if we take \(\Omega \) to be the canonical probability space and let \(r_t: \Omega \rightarrow \Omega \) denote the right-continuous time reversal at time t, that is \(\omega ':=r_t(\omega )\) is given as \(\omega '(s):=\omega (t-s-)\), then for any F that is \({\mathcal {F}}_t\)-measurable

$$\begin{aligned} {\mathbb {E}}^\xi [ F] = \widehat{{\mathbb {E}}}^\xi [ F\circ r_t]. \end{aligned}$$

Informally, strong duality of X with another standard process requires that two conditions are met: (i) X admits an excessive reference measure, (ii) the right-continuous version of its time reversal is a standard process and in particular satisfies the strong Markov property. We refer to [19, Ch. 15] and [62] for a detailed discussion.

Remark 2.6

A practical approach to obtain Markov processes in duality is via Dirichlet forms. Given a Markov process with generator \(\mathcal {L}\), this consists in considering the bilinear form

$$\begin{aligned} \mathcal {D}(f,g) := - \int (\mathcal {L} f) g \,\mathrm {d}\xi , \end{aligned}$$

extended to a suitable class of functions f, g. The theory of Dirichlet forms, see e.g. [49], then provides sufficient analytic criteria on \(\mathcal {D}\) so that it is associated to a pair of (standard) Markov processes in weak duality with respect to \(\xi \). It is also possible to obtain existence (and further properties) of transition densities for a Markov semigroup by considering functional inequalities (such as the Nash inequality) satisfied by the associated Dirichlet form, see e.g. [17].

3 A free boundary characterisation

Definition 3.1

(Root barrier) A subset R of \({\mathbb {R}}_+ \times E\) is called a Root barrier for X if R is nearly Borel-measurable with respect to the space-time process \({\overline{X}}\) and

$$\begin{aligned} (t,x) \in R, \;s > t \;\; \Longrightarrow \;\;(s,x) \in R. \end{aligned}$$

We call the first hitting time \(T_R = \inf \{t > 0 : (t, X_t)\in R\}\) the Root stopping time associated with R.

Dealing with the regularity of R is a central theme of this article and it is useful to introduce the “left-” and “right-”continuous modifications \(R^-\) and \(R^+\) of R.

Definition 3.2

For a Root barrier R denote with

$$\begin{aligned} R_t=\{ x \in E: (t,x) \in R\} \end{aligned}$$

the section at time t. We define \(R^- \subset R \subset R^+\) as

$$\begin{aligned} R^-&=\bigcup _{t\ge 0 }~[t,\infty ) \times R^-_t \quad \text { with } R^-_t = \bigcup _{s<t}~ R_s, \\ R^+&= \bigcup _{t\ge 0 }~[t,\infty ) \times R^+_t \quad \text { with } R^+_t = \bigcap _{s>t} ~R_s. \end{aligned}$$

Remark 3.3

An equivalent formulation of the barrier property is that the mapping \(t \mapsto R_t\) is non-decreasing. This also implies that \(R^-\) and \(R^+\) are barriers as well. Since R is nearly Borel-measurable with respect to \(\overline{X}\), so are the shifted barriers \(R^{s} := \{ (t-s,x):\;\; (t,x) \in R\}\) for any \(s \in {\mathbb {R}}\), and hence so are \(R^-\) and \(R^+\), since

$$\begin{aligned} R^- = \bigcup _{s < 0,~ s \in {\mathbb {Q}}} R^{s}, \;\;\;\;\;\;\;\;\; R^+ = \bigcap _{s>0,~ s \in {\mathbb {Q}}} R^s. \end{aligned}$$

Definition 3.4

(Balayage order) Two probability measures \(\mu \) and \(\nu \) are in balayage order, if their potentials \(\mu U\) and \(\nu U\) satisfy

$$\begin{aligned} \mu U(A) \ge \nu U (A) \qquad \text {for all measurable sets } A. \end{aligned}$$
(3.1)

In this case we will write \(\mu \prec \nu \) and say that \(\mu \) is before \(\nu \).

Remark 3.5

Under Assumption 2.1, (3.1) is equivalent to

$$\begin{aligned} \mu \widehat{U} (x) \ge \nu \widehat{U} (x) \qquad \text {for all } x\in E. \end{aligned}$$
(3.2)

The inequality (3.2) holds everywhere if and only if it holds \(\xi \)-almost everywhere, since both sides are coexcessive functions.

We now state our main result,

Theorem 3.6

Let X be a Markov process for which Assumption 2.1 holds. Let \(\mu , \nu \) be two measures such that \(\mu U\) and \(\nu U\) are \(\sigma \)-finite measures and such that \(\nu \) charges no semipolar set. If \(\mu \prec \nu \) then there exists a Root barrier R for X such that

$$\begin{aligned} \mu P_{T_R} = \nu . \end{aligned}$$

Moreover, if we set

$$\begin{aligned} f^{\mu ,\nu }(t,x) := \inf \big \{ g(t,x) :\;\; g \text { is } \widehat{Q}\text {-excessive and } g(s,y) \ge \mu \widehat{U}(y) \mathbb {1}_{\{s \le 0\}} + \nu \widehat{U}(y) \mathbb {1}_{\{s >0\}} \text { for all } (s,y) \big \}, \end{aligned}$$
(3.3)

then

  1. (1)

    \(f^{\mu ,\nu }(t,x) = \mu P_{t \wedge T_R} \widehat{U}(x)\),

  2. (2)

    \( T_R = \mathop {\text {arg min}}\limits _{S:~ \mu P_S = \nu } \mu P_{t\wedge S} U(B)\) for any Borel set B and \(t\ge 0\),

  3. (3)

    in the above we may take

    $$\begin{aligned} R = \left\{ (t,x)\in {\mathbb {R}}_+ \times E\;\; |\;\; f^{\mu ,\nu }(t,x) = \nu \widehat{U} (x) \right\} . \end{aligned}$$

Besides existence and optimality of a Root stopping time, the main interest of Theorem 3.6 is that item (3) provides a way to compute the Root barrier for a large class of Markov processes ranging from Lévy processes to hypo-elliptic diffusions, see the examples in Sect. 5. Concretely, it allows one to use classical optimal stopping and the dynamic programming algorithm to compute \(f^{\mu ,\nu }\) and hence R. We state this as a corollary:

Corollary 3.7

Using the same notation and assumptions as in Theorem 3.6 it holds that

  1. (1)

    \(f^{\mu ,\nu }\) is the value function of the optimal stopping problem

    $$\begin{aligned} f^{\mu ,\nu }(t,x) = \sup _{\tau } {\mathbb {E}}^x \left[ \mu \widehat{U}\left( \widehat{X}_\tau \right) \mathbb {1}_{\{\tau =t\}} + \nu \widehat{U}\left( \widehat{X}_\tau \right) \mathbb {1}_{\{\tau <t\}} \right] \quad \forall t \ge 0, x \in E, \end{aligned}$$
    (3.4)

    where the supremum is taken over stopping times \(\tau \) taking values in [0, t].

  2. (2)

    If we define for \(n \ge 0\) the function \(f_n^{\mu ,\nu }\) on \(\{ k2^{-n},~ k \ge 0\} \times E\) by

    $$\begin{aligned} f_n^{\mu ,\nu }(0,\cdot ) = \mu \widehat{U}, \;\;\;\;f_n^{\mu ,\nu }(2^{-n}(k+1),\cdot ) = \max \left\{ \left( f^{\mu ,\nu }_n(2^{-n}k,\cdot ) \right) \widehat{P}_{2^{-n}} ,\nu \widehat{U}\right\} , \end{aligned}$$

    then for each \(t\ge 0\), \(x \in E\),

    $$\begin{aligned}f^{\mu ,\nu }(t,x) = \lim _{n \rightarrow \infty } f^{\mu ,\nu }_n\left( 2^{-n} \lfloor 2^n t \rfloor ,x\right) .\end{aligned}$$

Informally, \(f^{\mu ,\nu }\) is the solution of the obstacle problem

$$\begin{aligned} u(0,\cdot ) = \mu \widehat{U}, \quad \min \left[ (\partial _t - \widehat{\mathcal {L}})u, u - \nu \widehat{U} \right] = 0 \qquad \text{ on } (0,+\infty ) \times E, \end{aligned}$$
(3.5)

where \(\widehat{\mathcal {L}}\) is the generator of the dual process \(\widehat{X}\). However, making this rigorous is in general a subtle topic since the obstacle introduces singularities. Several notions of generalised PDE solutions, ranging from variational inequalities to viscosity solutions, address this, often together with numerical schemes [1, 44, 45, 52]. This PDE approach to Root’s barrier has been carried out in [23, 34] for one-dimensional diffusions. However, already in the one-dimensional case, when the operator involves non-local terms, as is the case for many Markov processes, the well-posedness of such obstacle PDEs is an active research area; see e.g. [2, 16]. In general, this PDE approach requires stronger assumptions than Assumption 2.1 for the well-posedness of (3.5); in stark contrast, Corollary 3.7 holds in the full generality of Theorem 3.6.
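To make item (2) of Corollary 3.7 concrete, here is a minimal sketch (ours; all names below are chosen by us) of the resulting value iteration. It assumes the state space has been discretised so that the potentials \(\mu \widehat{U}\) and \(\nu \widehat{U}\) are stored as arrays and a routine approximating \(f \mapsto f\widehat{P}_{2^{-n}}\) is available, e.g. the transition matrix of an approximating chain. The barrier of Theorem 3.6(3) is then read off as the contact set \(\{f^{\mu ,\nu } = \nu \widehat{U}\}\).

```python
import numpy as np

def root_barrier_dp(muU_hat, nuU_hat, apply_Phat, n, t_max):
    """Iterate f_{k+1} = max(f_k P_hat_{2^{-n}}, nu U_hat), cf. Corollary 3.7(2).

    muU_hat, nuU_hat : potentials mu U_hat and nu U_hat on a grid of states
    apply_Phat       : callable approximating f -> f P_hat_h with h = 2^{-n}
    Returns the value function at time t_max and, for every grid point, the first
    time at which the contact f = nu U_hat holds, i.e. an approximation of the
    Root barrier R = {(t, x) : f(t, x) = nu U_hat(x)} of Theorem 3.6(3).
    """
    h = 2.0 ** -n
    f = np.asarray(muU_hat, dtype=float).copy()
    obstacle = np.asarray(nuU_hat, dtype=float)
    barrier_time = np.where(f <= obstacle, 0.0, np.inf)
    for k in range(int(round(t_max / h))):
        f = np.maximum(apply_Phat(f), obstacle)
        newly_hit = np.isinf(barrier_time) & (f <= obstacle)
        barrier_time[newly_hit] = (k + 1) * h
    return f, barrier_time
```

In the examples of Sect. 5 the role of apply_Phat is played by a discretisation of the dual semigroup of the respective process.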

Remark 3.8

(Minimal residual expectation) The property in item (2) of Theorem 3.6 is what Rost [60] called minimal residual expectation with respect to \(\nu = \mu P_{T_R}\). It implies that

$$\begin{aligned} T_R = \mathop {\text {arg min}}\limits _{S:~ \mu P_S = \nu } {\mathbb {E}}^\mu [F(S)], \; \text{ for } \text{ any } \text{ non-decreasing } \text{ convex } \text{ function } F. \end{aligned}$$

This is actually an equivalent formulation of the minimal residual expectation property as soon as \((\mu U-\nu U)(E)\) is finite, since then this quantity equals \({\mathbb {E}}^\mu [S]\) for all solutions S of SEP\((X, \mu , \nu )\). Furthermore, Rost proved in [60] that any stopping time S which is of minimal residual expectation with respect to \(\mu P_{T_R}\) necessarily satisfies \(S=T_R\) \({\mathbb {P}}^\mu \)-a.s.

Remark 3.9

(Recurrent Markov processes) That \(\mu U\) and \(\nu U\) are \(\sigma \)-finite is a kind of transience assumption, and is usual in this context [32, 60]. In the case of one-dimensional Brownian motion or diffusions it is not necessary, see [23, 34]. We expect that our result could be extended to the recurrent case (at least in some special cases), but this would require a certain amount of work, see e.g. [30] for results for two-dimensional Brownian motion.

Remark 3.10

(Assumptions of Theorem 3.6) The counterexamples discussed in [30, 32] show that, in order to obtain solutions to SEP\((X, \mu , \nu )\) as non-randomised stopping times, one needs to make:

  1. (1)

    an assumption on the process in order to avoid “deterministic portions” in the trajectory. In our case, this is reflected in the assumption of absolute continuity (2.2). This assumption is rather strong but can often be checked in practice. In the case of diffusions, Hörmander’s celebrated criterion [43] gives a simple condition ensuring the existence of transition densities with respect to Lebesgue measure. For jump-diffusions, there are also many results providing sufficient criteria for absolute continuity, see for instance [9, 53].

  2. (2)

    an assumption on the “small” sets charged by the initial and target measures (to avoid issues as in the case of multidimensional Brownian motion and Dirac masses). This is why we assume that \(\nu \) charges no semipolar sets. Without this assumption, it is not true that there exists a solution to SEP\((X, \mu , \nu )\) as a hitting time of a barrier, or even as a non-randomised stopping time. In the case where all semipolar sets are polar, following [31], we can replace the assumption that \(\nu \) charges no (semi)polar set by the assumption that

    $$\begin{aligned}&\text {there exists a (universally measurable) set } C \text { such that } \nonumber \\&\quad \nu (Z) = \mu (Z \cap C) \,\, \text { for all polar } Z. \end{aligned}$$
    (3.6)

    Indeed, there exist then a polar set \(M \subset C\), a measure \(\gamma \) supported on M, and measures \(\mu ', \nu '\) supported on \(M^c\) with \(\mu = \mu ' + \gamma \), \(\nu =\nu ' + \gamma \), such that \(\nu '\) charges no polar sets (cf. [31, p.50]). Let \(R'\) be a barrier embedding \(\nu '\) into \(\mu '\) as given by Theorem 3.6 and set \(R:=R' \cup ({\mathbb {R}}_+ \times M)\); then \(T:= \inf \{t \ge 0, \; (t,X_t) \in R\}\) embeds \(\nu \) into \(\mu \). It is proven in [31] that (if semipolar sets are polar) (3.6) is a necessary condition for a non-randomised solution to SEP\((X, \mu , \nu )\) to exist (in the case where \(\mu U \ge \nu U\) but (3.6) does not hold, randomisation of the stopping time at time 0 is necessary).

4 Proof of Theorem 3.6

The proof of our main result, Theorem 3.6, is split into two parts:

  • Existence. We first show that a Root barrier R exists such that \(\mu P_{T_R}=\nu \) and that items (1) and (2) of Theorem 3.6 hold. Here we rely on classic work of Rost [60], which shows that SEP\((X, \mu , \nu )\) has a solution given by a stopping time T that lies between the hitting times of two barriers which differ only by a space-time graph. We show that these hitting times are necessarily equal; a similar approach was already followed in [3, 18, 34] under different assumptions.

  • Free boundary characterisation. We show item (3) of Theorem 3.6, that is that one can take the contact set of the obstacle problem (3.5) as the Root barrier. From a conceptual point of view, this is similar to the case of one-dimensional diffusions as studied with PDE methods in [23, 34]. However, there the analysis is greatly simplified due to the existence of local times. Since local times are not available in our setting, the situation becomes more delicate and requires the analysis of negligible sets via potential theory.

4.1 Existence

We prepare the proof of existence and optimality with two lemmas. The first lemma shows right-continuity of the semigroup when applied to bounded Borel-measurable functions.

Lemma 4.1

Under Assumption 2.1, it holds for all Borel-measurable and bounded functions f, for all \(x\in E\) and \(t>0\)

$$\begin{aligned} \lim _{h\downarrow 0} P_{t+h}f(x) = P_t f(x). \end{aligned}$$
(4.1)

Proof

First, note that if f is continuous then by a.s. right continuity of \(t\mapsto X_t\) it is clear that \(P_t f\) is right continuous as a function of t.

Let \(p_t^x = p_t(x,\cdot )\). Since \(\int _E p_t^x(y)\xi (\mathrm {d}y) = 1\), by de La Vallée Poussin’s theorem (see e.g. [26, Thm. II.22]) there exists a function G which is strictly convex and superlinear (i.e. \(\lim _{x \rightarrow +\infty } G(x)/x = +\infty \)) such that

$$\begin{aligned} \int G(p_t^x(y))\xi (\mathrm {d}y) <\infty . \end{aligned}$$
(4.2)

Then for all \(h \ge 0\) one has

$$\begin{aligned} \int G(p_{t+h}^x(y)) \xi (\mathrm {d}y)&= \int G\left( \int p_t^x(z) p_h^z(y) \xi (\mathrm {d}z)\right) \xi (\mathrm {d}y) \\&\le \int \int p_h^z(y) \xi (\mathrm {d}z) G\left( p_t^x(z)\right) \xi (\mathrm {d}y) = \int G(p_{t}^x(z)) \xi (\mathrm {d}z), \end{aligned}$$

where we first used Kolmogorov-Chapman’s equality (2.3), then Jensen’s inequality and that \(\int p_h^z(y) \xi (\mathrm {d}z) = 1\) by duality. Since \(\xi \) is \(\sigma \)-finite, there exists a countable increasing family of open sets \((E_n)_{n\in {\mathbb {N}}}\) such that \(\bigcup _{n\in {\mathbb {N}}} E_n = E\) and \(\xi (E_n)<\infty \) for all \(n\in {\mathbb {N}}\).

Now fix \(n\in {\mathbb {N}}\). On \(E_n\) the integrability condition as in (4.2) is satisfied for all functions in the family \((p_s^x)_{s \ge t}\). By de La Vallée Poussin’s theorem this is equivalent to \((p_s^x)_{s \ge t}\) being uniformly integrable in \(L^1(E_n, \xi )\).

Then, by the Dunford-Pettis theorem (see e.g. [26, Thm. II.23]), uniform integrability of \((p_s^x)_{s\ge t}\) implies that it is weakly (relatively) compact in the finite measure space \(L^1(E_n, \xi )\). By a diagonal argument there exists a subsequence \(s_k \downarrow t\) and a measurable function q such that for all n, for all bounded and measurable f one has \(\int _{E_n} p^x_{s_k} f \,\mathrm {d}\xi \rightarrow \int _{E_n} q f \,\mathrm {d}\xi \) for \(k\rightarrow \infty \). If we take f as a continuous function supported in \(E_n\), by right-continuity of the sample paths, we obtain that \(q = p_t^x\). In addition, since \(E_n^c\) is closed, by a.s. right-continuity of X, one has that

$$\begin{aligned} \limsup _{k\rightarrow \infty } P_{s_k}(x,E_n^c) \le P_t(x,E_n^c). \end{aligned}$$

Hence if f is measurable and bounded by 1,

$$\begin{aligned} \limsup _{k\rightarrow \infty } \left| \int p^x_{s_k}f \,\mathrm {d}\xi -\int p^x_t f \,\mathrm {d}\xi \right| \le \limsup _{k\rightarrow \infty } \left| \int _{E_n^c} p^x_{s_k}f \,\mathrm {d}\xi - \int _{E_n^c} p^x_t f \,\mathrm {d}\xi \right| \le 2 P_t(x,E_n^c). \end{aligned}$$

Letting \(n \rightarrow \infty \), the right-hand side goes to 0 by dominated convergence. Hence \(p^x_{s_k}\) converges weakly in \(L^1(E,\xi )\) to \(p^x_t\). We can use the same line of argument for every subsequence of any sequence \(s_k\downarrow t\) to argue the convergence of a subsubsequence. Therefore for all \(x\in E\) we have that \(p^x_{s}\) converges weakly in \(L^1(E,\xi )\) to \(p^ x_t\) for \(s\downarrow t\) which leads to the required statement. \(\square \)

The second Lemma revisits Chacon’s idea of “shaking the barrier”, see also [3, 18] for similar statements under slightly stronger assumptions.

Lemma 4.2

If the semigroup of a Markov process satisfies (4.1), then for all Root barriers R one has almost surely

$$\begin{aligned} T_R = T_{R^+}=T_{R^-}. \end{aligned}$$
(4.3)

Proof

Firstly, by replacing R with \(R^+\) if necessary, it is enough to show that \(T_R = T_{R^{-}}\) almost surely. Secondly, if we define

$$\begin{aligned} R(\delta ) := R \cap ([\delta ,+\infty ) \times E), \end{aligned}$$

we have \(T_R = \inf _{\delta >0} T_{R(\delta )}\). Put together, this implies that it is sufficient to show that for all \(\delta >0\), \(T_{R(\delta )}=T_{R^-(\delta )}\) \({\mathbb {P}}^\mu \)-a.s. and below we assume that \(R = R(\delta )\) for a given \(\delta >0\).

For \(\varepsilon \in {\mathbb {R}}\) define

$$\begin{aligned} R^\varepsilon := \bigcup _{t\ge \max (-\varepsilon , 0) } [t,\infty ) \times R_{t+\varepsilon }. \end{aligned}$$
(4.4)

That is, \(R^\varepsilon \) is the barrier that arises by shifting R in time to the left if \(\varepsilon >0\) [resp. to the right if \(\varepsilon <0\)]. Now since \(R=R(\delta )\),

$$\begin{aligned} T_R = T_{R^{\delta }} \circ \theta _{\delta } + \delta \end{aligned}$$

and for any \(0< \varepsilon < \delta \) we also have

$$\begin{aligned}T_{R^{-\varepsilon }} = T_{R^\delta } \circ \theta _{\delta +\varepsilon } + (\delta +\varepsilon ).\end{aligned}$$

Now set \(f(x) := {\mathbb {E}}^x \left[ \exp \left( -T_{R^{\delta }}\right) \right] \) and use the above identities to deduce that for every \(0< \varepsilon < \delta \) and every x,

$$\begin{aligned}{\mathbb {E}}^{x} \left[ \exp \left( -T_{R}\right) \right] = e^{-\delta } P_\delta f(x), \;\;\;{\mathbb {E}}^{x} \left[ \exp \left( -T_{R^{-\varepsilon }}\right) \right] = e^{-(\delta +\varepsilon )} P_{\delta +\varepsilon } f(x).\end{aligned}$$

From the right-continuity of the semigroup, Lemma 4.1, it follows that

$$\begin{aligned}\lim _{\varepsilon \downarrow 0} {\mathbb {E}}^{x} \left[ \exp \left( -T_{R^{-\varepsilon }}\right) \right] = {\mathbb {E}}^{x} \left[ \exp \left( -T_{R}\right) \right] .\end{aligned}$$

But since \(T_{R^{-\varepsilon }}\ge T_R\) \({\mathbb {P}}^x\)-a.s. for all x and for all \(\varepsilon >0\), this already implies that

$$\begin{aligned}\lim _{\varepsilon \downarrow 0} T_{R^{-\varepsilon }} = T_R\quad {\mathbb {P}}^x\text {-a.s.}\end{aligned}$$

and we conclude that \(T_R = T_{R^-}\) since \(R^- = \bigcup _{\varepsilon >0} R^{-\varepsilon }.\) \(\square \)

For the proof of existence and optimality, we rely on the following result obtained by Rost:

Theorem 4.3

(Rost, Theorems 1 and 3 in [60]) If \(\mu \prec \nu \), then there exists a (possibly randomised) stopping time T which is of minimal residual expectation with respect to \(\nu \), i.e. \(\mu P_T=\nu \) and

$$\begin{aligned} T = \mathop {\text {arg min}}\limits _{S:~ \mu P_S = \nu } \mu P_{t\wedge S} U(B) \qquad \text {for any Borel set } B \text { and }t\ge 0. \end{aligned}$$
(4.5)

In addition, the measure

$$\begin{aligned} \,\mathrm {d}t \otimes ( \mu P_{t \wedge T} U (\mathrm {d}x)) \end{aligned}$$

is given by the Q-réduite of the measure

$$\begin{aligned} \,\mathrm {d}t \otimes \left( \mu U(\mathrm {d}x)\mathbb {1}_{\{t \le 0\}} + \nu U(\mathrm {d}x)\mathbb {1}_{\{t > 0\}} \right) . \end{aligned}$$

Furthermore, there exists a finely closed Root barrier R such that \({\mathbb {P}}^\mu \)-a.s.:

  1. (1)

    \(T \le T_R:= \inf \left\{ t > 0, \; X_t \in R_{t}\right\} \),

  2. (2)

    \(X_{T} \in R_{T+}\).

The key ingredients in the proof of Rost’s theorem are the filling scheme from [59], which allows one to obtain the existence of T satisfying the optimality property (4.5), and a path-swapping argument (see [42] for a heuristic description), which shows that T is almost the hitting time of a Root barrier (i.e. (1) and (2) above). However, this does not imply that T is the hitting time of a Root barrier.

In order to conclude item (1) from Theorem 3.6, we first see that Lemma B.3 yields

$$\begin{aligned} f^{\mu ,\nu } (t,x)= \mu P_{t \wedge T} \widehat{U}(x), \;\;\;\;\,\mathrm {d}t \otimes \xi (\mathrm {d}x) \text{-a.e. } \end{aligned}$$
(4.6)

We will prove in Lemma 4.4 that \(f^{\mu ,\nu }\) is \(\widehat{Q}\)-excessive. Therefore, if we show that \(g(t,x):= \mu P_{t\wedge T}\widehat{U}(x)\) is \(\widehat{Q}\)-excessive then (4.6) holds everywhere. For this we need to show that g satisfies \(g \widehat{Q}_t \rightarrow g\) as \(t \rightarrow 0\). But this follows from the definition since

$$\begin{aligned} \liminf _{t \rightarrow 0} \, (g \widehat{Q}_t)(s,\cdot )&= \liminf _{t \rightarrow 0} g(s-t, \cdot )\widehat{P}_t \\&= \liminf _{t \rightarrow 0}\mu P_{(s-t) \wedge T} \widehat{U} \widehat{P}_t \ge \liminf _{t \rightarrow 0} \mu P_{s \wedge T} \widehat{U} \widehat{P}_t = \mu P_{s \wedge T} \widehat{U}. \end{aligned}$$

Secondly, from \(f^{\mu ,\nu } (t,x)= \mu P_{t \wedge T} \widehat{U}(x)\), it also follows from Theorem 1 in [60] that T is the unique stopping time minimising \(\mu P_{t\wedge T }\widehat{U}\) for all \(t\ge 0\) among all stopping times embedding \(\nu \) in \(\mu \).

Furthermore, \(X_{T} \in R_{T+}\) in (2) from Theorem 4.3 by Rost implies \(T \ge T_{R^+} :=\inf \left\{ t > 0, \; X_t \in R_{t+}\right\} \) on \(\{T>0\}\). If \(T=0\) we have \(X_0 \in R_{0+}\) and if \(X_0 \in R_{0+}^r\) then \(T_{R^+}=0\), so that combined we get

$$\begin{aligned} {\mathbb {P}}^\mu (T=0<T_{R^+}) \; \le {\mathbb {P}}^\mu (X_T \in R_{0+} \setminus R_{0+}^r) \; =\; \nu (R_{0+} \setminus R_{0+}^r) \;= \; 0, \end{aligned}$$

where we used that \(R_{0+} \setminus R_{0+}^r\) is semipolar and that by assumption \(\nu \) charges no semipolar sets. Hence, combined with item (1) from Theorem 4.3, one has \(T_{R^+} \le T \le T_R\), and we can conclude the existence of a solution satisfying items (1) and (2) in Theorem 3.6 with Lemma 4.2.

4.2 Free boundary characterisation

Let \(T=T_R\) be the unique Root stopping time solving SEP\((X, \mu , \nu )\)  from the previous section with the respective Root barrier R. We want to prove \(T={\widetilde{T}}\) with \({\widetilde{T}} := T_{{\widetilde{R}}}\), where \({\widetilde{R}}\) is defined as in Theorem 3.6

$$\begin{aligned}{\widetilde{R}} := \left\{ (t,x) \in {\mathbb {R}}\times E~|~f^{\mu ,\nu } (t,x) = \nu \widehat{U} (x)\right\} .\end{aligned}$$

The proof is split into two inequalities given in Proposition 4.5 and Proposition 4.7. First, we show some useful properties of the Root barrier \({{\widetilde{R}}}\):

Lemma 4.4

The function \(f^{\mu ,\nu }\) and the resulting Root barrier \({\widetilde{R}}\) satisfy the following properties:

  1. (1)

    \(f^{\mu ,\nu }\) is \(\widehat{Q}\)-excessive and non-increasing in t,

  2. (2)

    \({\widetilde{R}}\) is a Borel-measurable and \(\widehat{Q}\)-finely closed Root barrier.

Proof

For (1), note that any function \(f:{\mathbb {R}}\times E\rightarrow {\mathbb {R}}\) is \(\widehat{Q}\)-finely continuous if and only if the process \(t\mapsto f(s-t, \widehat{X}_t)\) is \({\mathbb {P}}^{\delta _{(s,x)}}\)-a.s. right continuous for all \((s, x)\in {\mathbb {R}}\times E\) (see e.g. [12, Theorem (4.8)]). Since the obstacle \(h(t,x) =\mu \widehat{U}(x) \mathbb {1}_{\{t \le 0\}} + \nu \widehat{U}(x) \mathbb {1}_{\{t >0\}}\) is built from the \(\widehat{P}\)-finely continuous functions \(\mu \widehat{U}\) and \(\nu \widehat{U}\), the maps \(t\mapsto \mu \widehat{U}(\widehat{X}_t)\) and \(t\mapsto \nu \widehat{U}(\widehat{X}_t)\) are \({\mathbb {P}}^{\delta _x}\)-a.s. right-continuous for all \(x\in E\), so that \(t\mapsto h(s-t,\widehat{X}_t)\) is a.s. right continuous away from \(t=s\). Furthermore, \(t\mapsto h(s-t,x)\) is right continuous at \(t=s\), which together makes h \(\widehat{Q}\)-finely continuous. By Proposition B.2 it then follows that \(f^{\mu ,\nu }\) is \(\widehat{Q}\)-excessive. Further, \(f^{\mu ,\nu }\) is non-increasing in t since h is non-increasing in t.

For (2), note that the \(\widehat{Q}\)-excessive function \(f^{\mu ,\nu }\) is Borel-measurable, see Remark 2.3, and the barrier \({\widetilde{R}}\) is a level set of the Borel-measurable function \((t,x) \mapsto f^{\mu ,\nu }(t,x) -\nu \widehat{U}(x)\), hence it is Borel-measurable. Therefore \({\widetilde{R}}\) is \(\widehat{Q}\)-finely closed since it is the set where the two finely-continuous functions \(f^{\mu ,\nu }\) and h coincide, and it is a barrier by time monotonicity of \(f^{\mu ,\nu }\).

\(\square \)

Proposition 4.5

\({\widetilde{T}} \le T\).

Proof

Since \({\widetilde{T}} = T_{{\widetilde{R}}} = T_{{\widetilde{R}}^+}\) by Lemma 4.2, we only need to prove \(T_{{\widetilde{R}}^+}\le T\). Since \(\mu U\) is \(\sigma \)-finite, \(\mathcal {N} = \{ \mu \widehat{U} = \infty \}\) is polar (cf. [12, (3.5)]). Let (sy) be such that \(y \in {}^r R_{s}\) and \(y \notin \mathcal {N}\). One has

$$\begin{aligned} 0\le \mu P_{s\wedge T}\widehat{U}(y) - \mu P_T\widehat{U}(y)&={\mathbb {E}}^\mu \left[ \mathbb {1}_{\{s\le T\}}\big (u(X_s, y) -u(X_T, y)\big )\right] \end{aligned}$$
(4.7)

and since \(T\le s + T_{R_s}\circ \theta _s\) on \(\{s\le T \}\) as \(s\mapsto R_s\) is non-decreasing, we can apply the Markov property to obtain

$$\begin{aligned} {}(4.7)&\le {\mathbb {E}}^\mu \left[ \mathbb {1}_{\{s\le T\}}\big (u(X_s, y) -P_{R_s}u(X_s, y)\big )\right] . \end{aligned}$$

By the switching identity (Proposition 2.4) and since \(y\in {}^rR_s\) we have

$$\begin{aligned} P_{R_s}u(x, y) = {\mathbb {E}}^x\big [u(X_{T_{R_s}}, y)\big ]= \widehat{{\mathbb {E}}}^y\big [u(x,{\widehat{X}}_{{\widehat{T}}_{R_s}})\big ] =u(x,y) \end{aligned}$$
(4.8)

for all \(x \in E\) and hence

$$\begin{aligned} \mu P_{s\wedge T}\widehat{U}(y) = \mu P_T\widehat{U}(y) =\nu \widehat{U}(y). \end{aligned}$$
(4.9)

Thus, we can conclude that \((s,y) \in {\widetilde{R}}\).

Now for any \(\varepsilon >0\), if \(t<q < t+\varepsilon \), \(R_t \setminus {}^r R_{t+\varepsilon } \subset R_q \setminus {}^r R_q\). Since \(\bigcup _{q \in {\mathbb {Q}}} R_q \setminus {}^r R_q\) is semipolar, and since \(\nu \) charges no semipolar sets, it follows that a.s. \(X_T \in {}^r R_{T+\varepsilon } \setminus \mathcal {N}\). By the previous paragraph, this means that \(X_T \in \bigcap _{\varepsilon >0} {\widetilde{R}}_{T+\varepsilon } = {\widetilde{R}}_{T+}\). Hence \(T_{{\widetilde{R}}^+} \le T\). \(\square \)

Before we prove the inverse inequality, we first need a preliminary lemma:

Lemma 4.6

Assume that for some measure \(\eta \) which charges no semipolar sets, some stopping time \(\tau \) and some nearly Borel-measurable set A one has \(\eta \widehat{U} = \eta P_\tau \widehat{U}\) on A. Then \(\tau \le T_A\), \({\mathbb {P}}^\eta \) almost surely.

Proof

We first write

$$\begin{aligned} \eta P_\tau \widehat{U} \ge \eta P_\tau \widehat{U} \widehat{P}_A = \eta \widehat{U} \widehat{P}_A = \eta P_A \widehat{U}, \end{aligned}$$

where we have used in the first inequality that \(\eta P_\tau \widehat{U} \) is coexcessive and in the following equality, that the coexcessive functions \(\eta \widehat{U}\) and \(\eta P_\tau \widehat{U}\) coincide on A and therefore also on its cofine closure on which \(\widehat{P}_A\) is supported. The last equality follows by the switching identity.

Therefore it holds that \( \eta P_A U \le \eta P_\tau U \), i.e. the measures \(\eta P_A\) and \(\eta P_\tau \) are in balayage order. We then follow the proof of [60, Lemma p.8]. By [59], since \(\eta P_A \succ \eta P_\tau \), there exists a stopping time \(\tau '\) (possibly on an enlarged probability space) which is later than \(\tau \) such that the process arrives in the measure \(\eta P_A\) at time \(\tau '\), i.e. \(\tau ' \ge \tau \) \({\mathbb {P}}^\eta \)-a.s. and \(\eta P_{\tau '} = \eta P_A\). We can assume without loss of generality that A is finely closed and then this implies that \({\mathbb {P}}^\eta (X_{\tau '}\in A)=1\). In particular, if \(D_A := \inf \{t\ge 0, X_t \in A\}\), then we have \(\tau ' \ge D_A\) \({\mathbb {P}}^\eta \)-a.s. However, since \(\eta (A \setminus A^r) = 0\) it holds that \(T_A=D_A\) \({\mathbb {P}}^\eta \)-a.s., so that \(\tau ' \ge T_A\). Since \(\tau ' > T_A\) would be a contradiction to \(\eta P_{\tau '}U = \eta P_A U\), we conclude that \(T_A=\tau '\), and therefore \(T_A \ge \tau \) \({\mathbb {P}}^\eta \)-almost surely.\(\square \)

Proposition 4.7

\({\widetilde{T}} \ge T\).

Proof

We first show that for all \(t \in {\mathbb {Q}}_+:={\mathbb {Q}}\cap (0,+\infty )\), one has \(T \le t + T_{{\widetilde{R}}_t} \circ \theta _t\). For this we first prove

$$\begin{aligned} \mu P_{T} = \mu P_{t \wedge T} P_{T^{t}} \end{aligned}$$
(4.10)

where for fixed \(t \in {\mathbb {Q}}_+\) the stopping time \(T^{t}:=\inf \{ s >0: X_s \in {R}_{t+s}\}\) is the hitting time of R shifted in time by t. This holds since for all Borel-measurable functions f it holds

$$\begin{aligned} \mu P_{t \wedge T} P_{T^{t}} f&= {\mathbb {E}}^{\mu } \left[ f(X_{t+T^{t}\circ \theta _t}) \mathbb {1}_{\{t < T\}} + P_{T^{t}} f(X_T) \mathbb {1}_{\{t \ge T\}} \right] . \end{aligned}$$

Since \(T =t+T^{t}\circ \theta _t\) on \(\{ t < T\}\), it holds that \({\mathbb {E}}^{\mu } [ f(X_{t+T^{t}\circ \theta _t}) \mathbb {1}_{\{t< T\}}] = {\mathbb {E}}^{\mu } [ f(X_{T}) \mathbb {1}_{\{t < T\}}]\). Furthermore, we know that by definition of \(T^t\) we have \(P_{T^{t}} f = f\) on \({R}_t^r\), since \(t\mapsto R_t\) is non-decreasing. As \({\mathbb {P}}^{\mu }(X_T \in {R}_t \setminus {R}_t^r) = \nu ({R}_t \setminus {R}_t^r) =0\) since \(\nu \) does not charge semipolar sets, it holds that \({\mathbb {E}}^ \mu [P_{T^{t}} f(X_T) \mathbb {1}_{\{t \ge T\}}] = {\mathbb {E}}^\mu [f(X_T) \mathbb {1}_{\{t \ge T\}}]\). Together this implies \(\mu P_{t \wedge T} P_{T^{t}} f = \mu P_Tf\).

Secondly, note that \(\mu P_{t \wedge T} \ll \xi + \nu \), and hence \(\mu P_{t \wedge T}\) does not charge semipolar sets. Since \(\mu P_T \widehat{U} = \mu P_{t \wedge T} \widehat{U}\) on \({\widetilde{R}}_t\), we can choose \(\eta =\mu P_{t\wedge T}\), \(\tau = T^{t}\) and \(A = {\widetilde{R}}_t\) in Lemma 4.6 to obtain that \(T^t \le T_{{\widetilde{R}}_t}\), \({\mathbb {P}}^{ \mu P_{t \wedge T}}\)-a.s. We write

$$\begin{aligned} {\mathbb {P}}^\mu ( T> t + T_{{\widetilde{R}}_t} \circ \theta _t) = {\mathbb {E}}^{\mu }\left[ \mathbb {1}_{\{t<T\}} {\mathbb {P}}^{X_t} \left( T^t > T_{{\widetilde{R}}_t}\right) \right] = 0. \end{aligned}$$

and this implies

$$\begin{aligned} {\mathbb {P}}^\mu \big ( \exists t \in {\mathbb {Q}}_+:~ X_s \in {\widetilde{R}}_t \text { for some } s\in [t, T)\big ) = 0. \end{aligned}$$

Since

$$\begin{aligned} {\widetilde{R}}_s^- \subset \bigcup _{t \le s, t \in {\mathbb {Q}}_+} {\widetilde{R}}_t \end{aligned}$$

this implies that \(T_{{\widetilde{R}}^-} \ge T\) \({\mathbb {P}}^\mu \)-almost surely, which concludes the proof by Lemma 4.2. \(\square \)

5 Examples

In this section we apply Theorem 3.6 to concrete Markov processes. The examples are

  • Continuous-time Markov chains. This is a toy example but we find it instructive since many abstract quantities from potential theory become very concrete and simple; e.g. the obstacle PDE reduces to a system of ordinary differential equations.

  • Hypo-elliptic diffusions. This is a large and important class of processes. In the one-dimensional case we recover the setting of [23, 34], but in the multi-dimensional case the results are, to our knowledge, new. As concrete examples we give a Skorokhod embedding for two-dimensional Brownian motion and for Brownian motion in a Lie group.

  • \(\alpha \)-stable Lévy processes. There is very little literature on the Skorokhod embedding problem for Lévy processes, see [27] for references. We apply our results to \(\alpha \)-stable Lévy processes which are of growing interest in financial modelling, see e.g. [63], as they are characterised uniquely as the class of Lévy processes possessing the self-similarity property. Due to the infinite jump-activity such processes are hard to analyse but potential theoretic tools are classic in this context and much is known about their potentials, see [8, 11, 14, 47].

Two remarks are in order: firstly, the question of how to characterise or even construct measures \(\mu ,\nu \) that are in balayage order \(\mu \prec \nu \) for a given Markov process seems to be a difficult topic. In the case of one-dimensional Brownian motion this reduces to the convex order, which is usually easy to verify, but already for multi-dimensional Brownian motion it can be (numerically) difficult to check if two given measures are in balayage order. Secondly, we reiterate the discussion after Corollary 3.7: the PDE formulation usually requires stronger assumptions, whereas the discrete dynamic programming algorithm of Corollary 3.7 applies in the full generality of Theorem 3.6. All our examples were computed using the dynamic programming equation stated as item (2) in Corollary 3.7.

5.1 Continuous-time Markov chains

Let \(Y=(Y_n)_{n\in {\mathbb {N}}}\) be a discrete-time Markov chain on a discrete state space \(E\subset {\mathbb {Z}}\) with transition matrix \(\Pi \) such that \(\Pi (x,y) = q(y-x)\) for all \(x,y\in E\), for some probability measure q. Imposing \(\exp (\lambda )\)-distributed waiting times at each state, we arrive at the continuous-time Markov chain \(X=(X_t)_{t\ge 0}\) with transition function

$$\begin{aligned} p_t(x,y) = \mathrm {e}^{-\lambda t}\sum _{k=0}^ \infty \frac{(\lambda t)^k}{k!}\Pi ^k(x,y). \end{aligned}$$
(5.1)

The process X is dual to the continuous-time Markov chain \(\widehat{X}\) with transition matrix \(\widehat{\Pi }= \Pi ^T\) and the same transition rate \(\lambda \) at each state, with respect to the counting measure. The potentials are given in terms of the function

$$\begin{aligned} u(x,y) = \sum _{k=0}^ \infty \Pi ^k(x,y) \end{aligned}$$
(5.2)

and the potential function of a measure \(\mu \) is given by

$$\begin{aligned} \mu \widehat{U} (y) = \sum _{k=0}^ \infty \sum _{x\in E} \Pi ^k(x,y)\mu ( x). \end{aligned}$$
(5.3)
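In matrix form, on a finite state space with a substochastic transition matrix \(\Pi \) (i.e. with killing, so that the spectral radius of \(\Pi \) is strictly less than one), the series (5.2) is simply \(u = (I-\Pi )^{-1}\) and (5.3) becomes a vector–matrix product. A small sketch of ours, truncating a walk to a finite window and killing it on exit (this only approximates the untruncated potential away from the window boundary):

```python
import numpy as np

# Potential of a discrete-time walk killed outside the window {0, ..., N}, cf. (5.2)-(5.3).
N, p = 30, 2.0 / 3.0
Pi = np.zeros((N + 1, N + 1))
for x in range(N + 1):
    if x + 1 <= N:
        Pi[x, x + 1] = p        # step up
    if x - 1 >= 0:
        Pi[x, x - 1] = 1 - p    # step down; steps leaving the window kill the chain

u = np.linalg.inv(np.eye(N + 1) - Pi)     # u(x, y) = sum_k Pi^k(x, y)
mu = np.zeros(N + 1); mu[0] = 1.0         # mu = delta_0
muU_hat = mu @ u                          # potential function of mu as in (5.3)
```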

Example 5.1

(Asymmetric random walk on \({\mathbb {Z}}\)) Let Y be the asymmetric random walk on \({\mathbb {Z}}\), that is \(\Pi (x, x+1)= p\in (\frac{1}{2}, 1]\) and \(\Pi (x, x-1)= 1-p=:q\). By translation invariance and a standard result (see e.g. [54]), the potential kernel of X is given by

$$\begin{aligned} u(x,y) = u(0, y-x) = {\left\{ \begin{array}{ll} \frac{1}{p-q}, &{} y\ge x,\\ \frac{1}{p-q}\cdot \left( \frac{p}{q}\right) ^{y-x}, &{} y\le x. \end{array}\right. } \end{aligned}$$
(5.4)

Now let \(\mu = \delta _0\) and \(\nu =\sum _{l=1}^N a_l\delta _{x_l}\) for some \(N\in {\mathbb {N}}\), \(a_l>0\), \(\sum _{l=1}^N a_l=1\) and \(0<x_1<\dots <x_N\). Then

$$\begin{aligned} \nu \widehat{U}(y)&= \sum _{l=1}^N a_lu(x_l, y) \nonumber \\&= {\left\{ \begin{array}{ll} \frac{1}{p-q}\cdot \sum _{l=1}^N a_l\cdot \left( \frac{p}{q}\right) ^{y-x_l}, &{} y\le x_1\\ \frac{1}{p-q}\cdot \left[ \sum _{l=1}^Ka_l +\sum _{l=K+1}^Na_l\cdot \left( \frac{p}{q}\right) ^{y-x_l}\right] , &{} x_K <y\le x_{K+1},~ 1\le K\le N-1,\\ \frac{1}{p-q}, &{} y\ge x_N. \end{array}\right. } \end{aligned}$$
(5.5)

Since \(p>q\), we have \(\nu \widehat{U}\le \mu \widehat{U}\) for all such \(\nu \). The generator of \(\widehat{X}\) is given by

$$\begin{aligned} \widehat{\mathcal {L}}f(y) = \lambda \cdot [pf(y-1) + qf(y+1) -f(y)] \end{aligned}$$
(5.6)

and the obstacle problem (3.5) reduces to the following set of ODEs:

$$\begin{aligned} u(0, x)&= \mu \widehat{U}(x),\\ \partial _t u(t, x)&= {\left\{ \begin{array}{ll} \lambda \cdot [pu(t,x-1) + qu(t,x+1) -u(t,x)] &{} \text{ if } u(t,x) > \nu \widehat{U}(x), \\ 0 &{} \text{ if } u(t,x) = \nu \widehat{U}(x). \end{array}\right. } \end{aligned}$$

Then either classical methods for solving this set of coupled ODEs can be applied, or we can directly apply the dynamic programming approach of Corollary 3.7 as follows: for \(\varepsilon >0\) small enough, we choose \(x_0\) such that \(\mu \widehat{U}(x_0)-\nu \widehat{U}(x_0)< \varepsilon \). We approximate the function \(f^{\mu , \nu }\) on the set \(\{x_0, x_0+1, \dots , x_N\}\) at discrete time points \(t_k=\frac{k}{2^n}\) for fixed n:

$$\begin{aligned} f_n^{\mu ,\nu }(0,y)&= \mu \widehat{U}(y),\\ f_n^{\mu ,\nu }(t_k,y)&= \mu \widehat{U}(y) \qquad \text{ for } y<x_0 \text { or } y\ge x_N,\\ f_n^{\mu ,\nu }(t_{k+1}, y)&= \max \big \{ (1- \genfrac{}{}{}1{\lambda }{2^{n}} )f_n^{\mu ,\nu }(t_k, y) +\genfrac{}{}{}1{\lambda }{2^{n}} \big (pf_n^{\mu ,\nu }(t_k, y-1)\\&\quad +qf_n^{\mu ,\nu }(t_k, y+1)\big ), \nu \widehat{U}(y)\big \}. \end{aligned}$$

For example, we take \(\lambda =1\), \(p=\frac{2}{3}\) and \(\nu = \frac{1}{4}\delta _2 + \frac{3}{4}\delta _4\). Figure 1 shows the potentials \(\mu \widehat{U}\), \(\nu \widehat{U}\) and the resulting Root barrier.
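The scheme above is straightforward to implement; the following sketch (ours) reproduces this setup, using the closed-form potentials (5.4)–(5.5) and recording, for each state, the first time at which the contact \(f^{\mu ,\nu }=\nu \widehat{U}\) holds, which approximates the Root barrier shown in Fig. 1.

```python
import numpy as np

# Example 5.1 with lambda = 1, p = 2/3, mu = delta_0, nu = 1/4 delta_2 + 3/4 delta_4.
lam, p = 1.0, 2.0 / 3.0
q = 1.0 - p
nu_atoms = {2: 0.25, 4: 0.75}

def u(x, y):
    """Potential kernel (5.4) of the asymmetric walk."""
    return 1.0 / (p - q) if y >= x else (p / q) ** (y - x) / (p - q)

ys = np.arange(-10, 5)                                   # grid {x_0, ..., x_N} with x_0 = -10, x_N = 4
muU = np.array([u(0, y) for y in ys])                    # mu U_hat, cf. (5.4)
nuU = np.array([sum(a * u(x, y) for x, a in nu_atoms.items()) for y in ys])   # nu U_hat, cf. (5.5)
assert np.all(nuU <= muU + 1e-12)                        # balayage order mu < nu

n, t_max = 10, 8.0                                       # time step h = 2^{-n}
h = 2.0 ** -n
f = muU.copy()
barrier_time = np.where(muU <= nuU + 1e-12, 0.0, np.inf) # first contact time per state
for k in range(int(t_max / h)):
    step = f.copy()
    # (f P_hat_h)(y) ~ (1 - lam h) f(y) + lam h (p f(y-1) + q f(y+1)); boundary values kept fixed
    step[1:-1] = (1 - lam * h) * f[1:-1] + lam * h * (p * f[:-2] + q * f[2:])
    f = np.maximum(step, nuU)
    barrier_time[np.isinf(barrier_time) & (f <= nuU + 1e-12)] = (k + 1) * h

# The Root barrier is approximately R = {(t, y) : t >= barrier_time[y]}.
for y, t in zip(ys, barrier_time):
    print(f"y = {y:3d}   enters barrier at t ~ {t:.3f}")
```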

Fig. 1  Root embedding for the continuous-time asymmetric random walk on \({\mathbb {Z}}\) with \(\lambda =1\), \(p=\frac{2}{3}\), \(\mu = \delta _0\) and \(\nu = \frac{1}{4}\delta _2 + \frac{3}{4}\delta _4\)

5.2 Hypo-elliptic diffusions

Let X be the diffusion in \({\mathbb {R}}^d\) obtained by solving an SDE formulated in the Stratonovich sense

$$\begin{aligned} \,\mathrm {d}X_t = \sum _{i=1}^N V_i(X_t) \circ \,\mathrm {d}B_t + V_0(X_t) \,\mathrm {d}t \end{aligned}$$
(5.7)

where the \(V_i\), \(i=1,\ldots , N\), are vector fields on \({\mathbb {R}}^d\) which we assume to be smooth with all derivatives bounded, and B is a standard Brownian motion in \({\mathbb {R}}^N\). We further assume that X is killed at rate \(c(X_t) \,\mathrm {d}t\), where c is a non-negative smooth function. X is then a standard Markov process on \({\mathbb {R}}^d\), with generator \(\mathcal {L}\) which acts on smooth functions via

$$\begin{aligned} \mathcal {L} f = -c f + \left( V_0 + \sum _i V_i^2 \right) f = -c f + \sum _{i} b_i \partial _{i}f + \sum _{ij} \partial _i \left( a_{ij} \partial _{j} f \right) , \end{aligned}$$

where the \(b_i\) and \(a_{ij}\) are smooth functions which can be written explicitly in terms of the \(V_i\). The formal adjoint of \(\mathcal {L}\) with respect to Lebesgue measure is then given by

$$\begin{aligned} \widehat{\mathcal {L}} f = (-{\mathrm{div}} (b) - c) f - \sum _{i} b_i \partial _{i}f + \sum _{ij} \partial _i \left( a_{ij} \partial _{j} f \right) \end{aligned}$$

and we can choose smooth vector fields \(\widehat{V}_i\) such that \(\widehat{\mathcal {L}} f = (-\mathrm {div} (b) - c) f + \big ( \widehat{V}_0 + \sum _i \widehat{V}_i^2\big ) f\). Assuming that

$$\begin{aligned} {\mathrm{div}} (b) + c \ge 0 \text{ on } {\mathbb {R}}^d, \end{aligned}$$
(5.8)

we can then identify \(\widehat{\mathcal {L}}\) with the generator of the Markov process given by the solution of the Stratonovich SDE

$$\begin{aligned} \,\mathrm {d}\widehat{X}_t = \sum _{i} \widehat{V}_i(\widehat{X}_t) \circ \,\mathrm {d}B_t + \widehat{V}_0(\widehat{X}_t) \,\mathrm {d}t \end{aligned}$$
(5.9)

killed at rate \((\mathrm {div} (b) + c)(\widehat{X}_t) \,\mathrm {d}t\).

In addition, assume that the vector fields satisfy the weak Hörmander conditions

$$\begin{aligned} \forall x \in {\mathbb {R}}^d,~~ \mathrm {Lie}\Big [ V_i, \; \left[ V_0, V_i\right] , \; i \ge 1 \Big ](x) = {\mathbb {R}}^d, \end{aligned}$$
(5.10)
$$\begin{aligned} \forall x \in {\mathbb {R}}^d,~~ \mathrm {Lie}\Big [\widehat{V}_i, \; \left[ \widehat{V}_0, \widehat{V}_i\right] , \; i \ge 1\Big ](x) = {\mathbb {R}}^d, \end{aligned}$$
(5.11)

then the classical Hörmander result [43] yields that the semigroups \(P_t\), \(\widehat{P}_t\) associated to X, \(\widehat{X}\) admit (smooth) densities with respect to Lebesgue measure. Therefore, \((P_t)\) and \((\widehat{P}_t)\) are in duality with respect to Lebesgue measure, as seen by

$$\begin{aligned} \frac{\,\mathrm {d}}{\,\mathrm {d}s} \left\langle P_{t-s} f, \widehat{P}_s g \right\rangle = \left\langle - \mathcal {L} P_{t-s} f, \widehat{P}_s g \right\rangle + \left\langle P_{t-s} f, \widehat{\mathcal {L}} \widehat{P}_s g \right\rangle = 0, \end{aligned}$$

which yields that \(\left\langle P_{t} f, g \right\rangle = \left\langle f, \widehat{P}_t g \right\rangle \), first for f, g smooth with compact support and then for all \(f,g \ge 0\) Borel measurable by an approximation argument. In conclusion, we have obtained the following.

Proposition 5.2

Assume that (5.8) and (5.10)–(5.11) hold. Then the process X given by solving the SDE (5.7) satisfies Assumption 2.1, with \(\widehat{X}\) given by the solution to (5.9) and \(\xi \) given by the Lebesgue measure on \({\mathbb {R}}^d\).
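As a sanity check of the adjoint computation above, here is a small symbolic sketch (ours) in one space dimension: with \(\mathcal {L} f = -cf + b f' + (af')'\) and \(\widehat{\mathcal {L}} g = (-b'-c)g - bg' + (ag')'\), the difference \((\mathcal {L} f)g - f(\widehat{\mathcal {L}} g)\) is an exact derivative and hence integrates to zero for smooth compactly supported f, g, which is the duality relation behind Proposition 5.2.

```python
import sympy as sp

x = sp.symbols('x')
a, b, c, f, g = (sp.Function(name)(x) for name in ('a', 'b', 'c', 'f', 'g'))
D = lambda h: sp.diff(h, x)

L_f    = -c * f + b * D(f) + D(a * D(f))              # generator L applied to f (1d case)
Lhat_g = (-D(b) - c) * g - b * D(g) + D(a * D(g))     # formal adjoint applied to g

# (L f) g - f (Lhat g) equals d/dx [ b f g + a f' g - a f g' ], an exact derivative:
boundary_term = b * f * g + a * D(f) * g - a * f * D(g)
print(sp.simplify(L_f * g - f * Lhat_g - D(boundary_term)))   # prints 0
```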

Example 5.3

(Brownian motion in \({\mathbb {R}}^d\)) For \(d\le 2\), as Brownian motion is recurrent, for any positive Borel function f we have either \(Uf \equiv \infty \) or \(Uf\equiv 0\). Therefore we consider the Brownian motion killed when exiting the unit ball \(B_1(0)\), i.e. \(\zeta = \inf \{t>0:~||X_t|| >1\}\). For any probability measure \(\mu \) with density f supported on \(B_1(0)\), the potential \(\mu \widehat{U} = f\widehat{U}\) is the unique continuous solution of \(\frac{1}{2}\Delta v = f\) on \(B_1(0)\) vanishing on \(\partial B_1(0)\), and is given explicitly as \( f\widehat{U}(x) = {\mathbb {E}}^x \left[ \int _0^\zeta f(\widehat{X}_t) \,\mathrm {d}t\right] = \int u(x, y) f( y)\,\mathrm {d}y \), where

$$\begin{aligned} u(x,y) = {\left\{ \begin{array}{ll} - |x-y|, &{} d=1,\\ \frac{1}{\pi }\log \frac{1}{||x-y||}, &{}d=2.\\ \end{array}\right. } \end{aligned}$$
(5.12)

In dimensions \(d\ge 3\), Brownian motion is transient, and the potential is the Newtonian potential on \({\mathbb {R}}^d\):

$$\begin{aligned} u(x,y) =c_d\cdot \frac{1}{||x-y||^{d-2}}, \end{aligned}$$
(5.13)

where \(c_d = \genfrac{}{}{}1{1}{2}\pi ^{-\nicefrac {d}{2}}\Gamma \big (\genfrac{}{}{}1{1}{2}(d-2)\big )\). For \(d=1\) the balayage order reduces to the convex order which is easy to verify. In higher dimensions it is in general non-trivial to find measures in balayage order.
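To give at least one explicit example (ours, not from the paper): for \(d=3\), the Dirac measure \(\delta _0\) and the uniform distribution \(\sigma _r\) on the sphere of radius r are in balayage order, \(\delta _0 \prec \sigma _r\), since by Newton's theorem \(\sigma _r\widehat{U}(x) = c_3/\max (\Vert x\Vert , r) \le c_3/\Vert x\Vert = \delta _0\widehat{U}(x)\); indeed \(\sigma _r\) is embedded by the exit time of the ball \(B_r(0)\). A quick Monte Carlo check of the potential inequality (3.2):

```python
import numpy as np

# Check numerically that sigma_r U_hat <= delta_0 U_hat for 3d Brownian motion, cf. (3.2), (5.13).
rng = np.random.default_rng(1)
c3, r = 1.0 / (2.0 * np.pi), 1.0

Y = rng.normal(size=(200_000, 3))
Y = r * Y / np.linalg.norm(Y, axis=1, keepdims=True)           # samples from sigma_r

for x in ([0.3, 0.0, 0.0], [0.5, 1.2, -0.7], [2.0, 0.0, 0.0]):
    x = np.array(x)
    pot_sigma = np.mean(c3 / np.linalg.norm(Y - x, axis=1))    # sigma_r U_hat (x), Monte Carlo
    pot_delta = c3 / np.linalg.norm(x)                         # delta_0 U_hat (x)
    print(f"{pot_sigma:.4f} <= {pot_delta:.4f}   (Newton: {c3 / max(np.linalg.norm(x), r):.4f})")
```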

Now we consider the two-dimensional Brownian motion starting at 0. As an example of a measure \(\nu \) which is not rotationally symmetric and can be embedded in the two-dimensional Brownian motion, we take \(\nu \) to be the measure with the following density, an approximation of the marginal \(\nu (A)={\mathbb {P}}^0(Y_{0.1}\in A)\) of the diffusion Y generated by the operator \(\mathrm {e}^{-x_1-x_2}\widehat{\mathcal {L}}\):

$$\begin{aligned} \frac{\nu (\mathrm {d}x) }{\,\mathrm {d}x}= C\exp \left[ -2.5\cdot \big (a (x_1-{{\widetilde{x}}}) -b (x_2-{{\widetilde{x}}})-{{\widetilde{x}}}\big )^2 -6\cdot \big (b(x_1-{{\widetilde{x}}}) + a(x_2-{{\widetilde{x}}}) -{{\widetilde{x}}} \big )^2 \right] , \end{aligned}$$

where C denotes a normalising constant, \(a=\cos (\genfrac{}{}{}1{\pi }{4})\), \(b= \sin (\genfrac{}{}{}1{\pi }{4})\) and \({{\widetilde{x}}} = 0.15\). The (empirical) density is represented in Fig. 2a. We take \(\nu \) of this form since Y can be obtained as a time change of X via an additive functional, which implies \(\delta _0 \prec \nu \) (we will show this explicitly in Sect. 6); the potential \(\nu \widehat{U}\) is shown in Fig. 2b.

Fig. 2  Root embedding for the 2d Brownian motion

Example 5.4

(Lie-group valued Brownian motion) Let \((B^1,B^2)\) be a two-dimensional Brownian motion. Then \((B^1,B^2,\int B^1 \,\mathrm {d}B^2 - \int B^2 \,\mathrm {d}B^1)\) can be identified (after taking the Lie algebra exponential) as a Brownian motion in the free nilpotent Lie group of order 2; see “Appendix C” for details and the extension to general free nilpotent groups. The generator of the process is the sub-Laplacian \(\Delta _G = \frac{1}{2} \left( X^2 + Y^2\right) \) on the Heisenberg group G, where in coordinates

$$\begin{aligned} X= \partial _{x} + \frac{1}{2} y \partial _a, \;\;\; Y = \partial _{y} - \frac{1}{2} x \partial _a. \end{aligned}$$

As shown by [35, 48], the transition density equals

$$\begin{aligned} p_1(b^1,b^2,a) = \frac{1}{2\pi ^2} \int _0^\infty \frac{x}{\sinh (x)}\exp \left( -\frac{(b^1)^2+(b^2)^2}{2 \tanh (x)}\right) \cos (ax)\,\mathrm {d}x \end{aligned}$$
(5.14)

and by Brownian scaling \(p_t(b^1,b^2,a):= t^{-2}\, p_1(\frac{b^1}{\sqrt{t}},\frac{b^2}{\sqrt{t}},\frac{a}{t})\). In this case, it is already non-trivial to find measures \(\mu ,\nu \) in balayage order, \(\mu \prec \nu \), even if \(\mu \) is a Dirac at the origin. However, Proposition C.1 in the appendix shows that any measure \({\widetilde{\nu }}\) on \((0,\infty )\) can be lifted to a measure \(\nu \) on G such that \( \delta _0 \prec \nu \). This provides a rich class of probability measures in balayage order, and Theorem 3.6 allows one to apply dynamic programming to compute the Root barrier solving SEP\((X,\delta _0,\nu )\). However, this is computationally expensive since (5.14) is not available in closed form. In this case, the well-posedness of the obstacle PDE

$$\begin{aligned} \min \left[ (\partial _t -\Delta _G) u,~ u- \nu \widehat{U}\right]&=0,\\ u(0,\cdot )&= \delta _0\widehat{U} \end{aligned}$$

can be shown by standard methods (such as viscosity solutions). Again this leads to non-trivial numerics (see Footnote 3), even after using the radial symmetry of (5.14) to reduce the space dimension to 2, namely radius and area. Nevertheless, both approaches (dynamic programming and PDE) are applicable to compute barriers for group-valued Brownian motion, although much work remains to turn this into a stable numerical tool; we leave this for future research.
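To make the cost concrete: each evaluation of (5.14) requires a one-dimensional quadrature. The following is a minimal sketch of such an evaluation, using the radially symmetric form of (5.14) and the scaling relation stated above; the quadrature settings are arbitrary choices on our part.

```python
# Sketch: pointwise evaluation of the Heisenberg heat kernel (5.14) by
# numerical quadrature, together with the Brownian scaling relation for p_t.
import numpy as np
from scipy.integrate import quad

def p1(b1, b2, area):
    """p_1(b^1, b^2, a) via the integral representation (5.14)."""
    r2 = b1**2 + b2**2
    integrand = lambda x: x / np.sinh(x) * np.exp(-r2 / (2.0 * np.tanh(x))) * np.cos(area * x)
    # lower limit slightly above 0 to avoid the removable singularity of x/sinh(x)
    val, _ = quad(integrand, 1e-12, np.inf, limit=200)
    return val / (2.0 * np.pi**2)

def pt(t, b1, b2, area):
    """p_t via scaling: horizontal directions scale like sqrt(t), the area like t."""
    return p1(b1 / np.sqrt(t), b2 / np.sqrt(t), area / t) / t**2

print(p1(0.5, 0.5, 0.2), pt(2.0, 0.5, 0.5, 0.2))
```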

5.3 Symmetric stable Lévy processes

A right-continuous stochastic process \((X_t)_{t\ge 0}\) is called an \(\alpha \)-stable Lévy process if it has independent, stationary increments which are distributed according to an \(\alpha \)-stable distribution. We consider the symmetric case without drift. In this case, the characteristic exponent is given by \(\psi (\theta ) = |\theta |^\alpha \), i.e. \({\mathbb {E}}[\mathrm {e}^{i\theta X_t}]= \mathrm {e}^{-t|\theta |^\alpha } =:g_t(\theta )\), and hence \(X_t\) satisfies the scaling property \(X_t \overset{d}{=} t^{\nicefrac {1}{\alpha }} X_1\). Classic results, e.g. [41], show that X has a transition density

$$\begin{aligned} p_t(x,y) = p_t(y-x) = ({\mathcal {F}}^{-1}g_t)(y-x), \end{aligned}$$

which is absolutely continuous with respect to Lebesgue measure. For further properties of symmetric stable processes, we refer to [11]. We take \(\alpha \in (0,1)\), since in this case X is transient, as shown in [14]. Furthermore, one-dimensional Lévy processes X are in duality with \(\widehat{X}:= -X\) with respect to Lebesgue measure (see [8]); since the jump distribution is symmetric, the symmetric stable process X is self-dual. By [12] the potential \(Uf(x) = \int u(x,y)f(y)\,\mathrm {d}y\) has kernel

$$\begin{aligned} u(x,y) = C_{1,\alpha }\cdot |x-y|^{\alpha -1}, \end{aligned}$$
(5.15)

where in the one-dimensional case \(C_{1,\alpha }= \Gamma (\frac{1-\alpha }{2})\cdot \big [ 2^{\alpha }\sqrt{\pi }\,\Gamma (\frac{\alpha }{2})\big ]^{-1}\). To construct the Root stopping time, we compute the function \(f^{\mu ,\nu }\) from Theorem 3.6 as the solution to the obstacle problem

$$\begin{aligned} v(0,\cdot ) = \mu \widehat{U}, \;\;\; \min \left[ (\partial _t + (-\Delta )^{\nicefrac {\alpha }{2}})v, v - \nu \widehat{U} \right] = 0, \end{aligned}$$

where the generator of the process is \(-(-\Delta )^{\nicefrac {\alpha }{2}}\), with the fractional Laplacian given by

$$\begin{aligned} (-\Delta )^{\nicefrac {\alpha }{2}} f(y) = C_{2,\alpha } \cdot {\text {P.V.}}\int _{-\infty }^\infty \frac{f(y)-f(z+y)}{|z|^{1+\alpha }}\,\mathrm {d}z, \end{aligned}$$
(5.16)

with \({\text {P.V.}}\) denoting a principal value integral.

Example 5.5

(Embedding for \(\alpha =0.5\), \(\mu \sim \mathrm {Uniform}([-1,1])\) and \(\nu \sim 0.75\cdot \mathrm {Beta}(2,2)\)) Let \(\mu \) be the Uniform distribution on \([-1,1]\), then

$$\begin{aligned} \mu \widehat{U} (y) = \frac{C_{1,\alpha }}{2\alpha }\cdot {\left\{ \begin{array}{ll} (1-y)^\alpha + (1+y)^\alpha , &{} \text { for } |y|<1,\\ (|y|+1)^\alpha - (|y|-1)^\alpha , &{} \text { for } |y|\ge 1. \end{array}\right. } \end{aligned}$$
(5.17)
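As a quick sanity check, the closed form (5.17) can be compared with a direct quadrature of \(\mu \widehat{U}(y) = C_{1,\alpha }\int _{-1}^1 |x-y|^{\alpha -1}\tfrac{1}{2}\,\mathrm {d}x\); a minimal sketch, using the constant \(C_{1,\alpha }\) given below (5.15) (which cancels in the comparison anyway):

```python
# Sketch: the closed form (5.17) versus direct quadrature of
# mu*U-hat(y) = C_{1,alpha} * int_{-1}^{1} |x - y|^(alpha-1) * 1/2 dx.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma

alpha = 0.5
C1 = gamma((1 - alpha) / 2) / (2**alpha * np.sqrt(np.pi) * gamma(alpha / 2))

def mu_pot_closed(y):                      # formula (5.17)
    if abs(y) < 1:
        return C1 / (2 * alpha) * ((1 - y)**alpha + (1 + y)**alpha)
    return C1 / (2 * alpha) * ((abs(y) + 1)**alpha - (abs(y) - 1)**alpha)

def mu_pot_quad(y):                        # direct quadrature of the kernel (5.15)
    brk = [y] if abs(y) < 1 else None      # flag the integrable singularity at x = y
    val, _ = quad(lambda x: abs(x - y)**(alpha - 1) / 2, -1, 1, points=brk)
    return C1 * val

for y in (0.0, 0.3, 0.9, 2.5):
    print(y, mu_pot_closed(y), mu_pot_quad(y))
```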

We want to construct a solution T of SEP\((X, \mu , \nu )\) where the density of \(\nu \) is given by \(\frac{\nu (\mathrm {d}x)}{\,\mathrm {d}x} = 0.75\cdot g_{a,b}\) with \(a=b=2\), and

$$\begin{aligned} g_{a,b}(x) = {\left\{ \begin{array}{ll} \frac{\Gamma (a+b)}{\Gamma (a)\cdot \Gamma (b)}\cdot 2^{-a-b+1}\cdot (x+1)^{a-1}\cdot (1-x)^{b-1}, &{}|x|\le 1,\\ 0, &{}|x|>1. \end{array}\right. } \end{aligned}$$
(5.18)

is the density of a Beta(a, b) distribution on the interval \([-1,1]\). For the resulting embedding, on the event \(\{T<\infty \}\), which has probability \({\mathbb {P}}(T<\infty ) =0.75\), the stopped values \(X_T\) are distributed according to this Beta(a, b) distribution. Studying general numerical methods for the fractional Laplacian is beyond the scope of this article, so we only outline a simple method adapted to our case. We can rewrite (5.16) as

$$\begin{aligned} (-\Delta )^{\nicefrac {\alpha }{2}} f(y) = C_{2,\alpha } \cdot \lim _{h\rightarrow \infty }\int _0^h\frac{2f(y)-f(z+y)-f(y-z)}{|z|^{1+\alpha }}\,\mathrm {d}z. \end{aligned}$$
(5.19)

Define the set \(\mathcal {O}_{{\overline{T}}}:= [0, {\overline{T}}]\times [-K, K]\) for large \({\overline{T}}, K\in {\mathbb {R}}\) and \(h:= (\Delta t, \Delta x) = \big (\frac{{\overline{T}}}{N_{{\overline{T}}}}, \frac{2K}{N_x}\big )\), where \(N_x\), \(N_{{\overline{T}}}\in {\mathbb {N}}\) are chosen large enough. The space-time mesh grid is defined as

$$\begin{aligned} \mathcal {G}_h := \{t_n: t_n= n\cdot \Delta t,~ n = 0, 1, \dots , N_{{\overline{T}}}\}\times \{x_j: x_j = -K+j\cdot \Delta x,~ j = 1, \dots , N_x\}. \end{aligned}$$

For the resulting minimal excessive majorant of \(\mu \widehat{U} \mathbb {1}_{\{t\le 0\}}+\nu \widehat{U} \mathbb {1}_{\{t> 0\}}\) we expect that \(f^{\mu ,\nu }\) never touches \(\nu \widehat{U}\) outside \([-1,1]\) as this is the support of \(\nu \). Indeed, a straightforward calculation shows that starting in \(\mu \widehat{U}\), for \(|y|\gg 1\) we have \((-\Delta )^{\nicefrac {\alpha }{2}}\mu \widehat{U}(y) = o(|y-1|)\), i.e. the repeated action of the fractional Laplacian on \(\mu \widehat{U}\) outside an interval \([-K, K]\) with large enough \(K\gg 1\) is negligible. For any \((t,x)\in \mathcal {G}_h\) we define the operator

$$\begin{aligned} S^h[v^h](t,x) = {\left\{ \begin{array}{ll} v^h(t,x) -\Delta t\cdot \big ((-\Delta )^{\nicefrac {\alpha }{2}}_h v^h(t, \cdot )\big )(x), &{} x\in [-K, K],\\ \mu \widehat{U} (x), &{} \text {else}, \end{array}\right. } \end{aligned}$$

where \((-\Delta )^{\nicefrac {\alpha }{2}}_h\) denotes the evaluation of the fractional Laplacian on \(\mathcal {G}_h\) using a Gauß–Kronrod quadrature as described in [24]. The minimal excessive majorant \(f^{\mu ,\nu }\) from Theorem 3.6 can then be computed on \(\mathcal {G}_h\) as follows:

$$\begin{aligned} v^h(0, \cdot ) = \mu \widehat{U}(\cdot ), \qquad v^h((n+1)\Delta t, \cdot ) = \max \big ( \nu \widehat{U}(\cdot ), ~S^h[v^h](n \Delta t, \cdot ) \big ). \end{aligned}$$
(5.20)
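The iteration (5.20) can be condensed into a few lines. The sketch below replaces the Gauß–Kronrod quadrature of [24] by a plain Riemann sum of (5.19) on the grid, computes \(\nu \widehat{U}\) by integrating the kernel (5.15) against the Beta density numerically, and takes \(C_{2,\alpha }=\Gamma (1+\alpha )\sin (\pi \alpha /2)/\pi \) (the standard normalisation for the symbol \(|\theta |^\alpha \); the text does not specify this constant). All mesh parameters are illustrative and differ from those used for Fig. 3.

```python
# Sketch of the dynamic-programming iteration (5.20) for alpha = 0.5,
# mu = Uniform([-1,1]) and nu = 0.75 * Beta(2,2); simplified quadratures,
# illustrative mesh parameters.
import numpy as np
from scipy.special import gamma

alpha = 0.5
C1 = gamma((1 - alpha) / 2) / (2**alpha * np.sqrt(np.pi) * gamma(alpha / 2))
C2 = gamma(1 + alpha) * np.sin(np.pi * alpha / 2) / np.pi   # assumed normalisation

def mu_pot(y):                                   # mu*U-hat from (5.17)
    y = np.asarray(y, dtype=float)
    ay, out = np.abs(y), np.empty_like(y)
    m = ay < 1
    out[m] = (1 - y[m])**alpha + (1 + y[m])**alpha
    out[~m] = (ay[~m] + 1)**alpha - (ay[~m] - 1)**alpha
    return C1 / (2 * alpha) * out

def nu_pot(y):                                   # nu*U-hat via the kernel (5.15)
    xq = np.linspace(-1, 1, 2001); dq = xq[1] - xq[0]
    dens = 0.75 * 0.75 * (1 - xq**2)             # 0.75 * Beta(2,2) density on [-1,1]
    r = np.maximum(np.abs(xq[None, :] - np.asarray(y, float)[:, None]), dq / 2)
    return C1 * np.sum(r**(alpha - 1) * dens, axis=1) * dq

K, Z, dx, dt, NT = 4.0, 8.0, 0.02, 0.002, 1000   # window, quadrature cutoff, mesh
x = np.linspace(-(K + Z), K + Z, int(round(2 * (K + Z) / dx)) + 1)
idx = np.where(np.abs(x) <= K + 1e-9)[0]         # interior nodes [-K, K]
muU, nuU = mu_pot(x), nu_pot(x)

def frac_lap(v):                                 # Riemann sum of (5.19), offsets z < Z
    out = np.zeros(idx.size)
    for k in range(1, int(round(Z / dx))):
        out += (2 * v[idx] - v[idx + k] - v[idx - k]) / (k * dx)**(1 + alpha) * dx
    return C2 * out

v = muU.copy()                                   # v stays equal to mu*U-hat outside [-K, K]
contact = []                                     # approximate slices of the barrier R
for n in range(NT):
    v[idx] = np.maximum(nuU[idx], v[idx] - dt * frac_lap(v))   # one step of (5.20)
    contact.append(v[idx] <= nuU[idx] + 1e-12)
# contact[n][j] == True approximates ((n+1)*dt, x_j) lying in the Root barrier, cf. Fig. 3.
```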

In Fig. 3 we show a realisation of the embedding for SEP\((X, \mu , \nu )\)  with \(\mu \) and \(\nu \) as given above. Since for small values of \(\alpha \) the trajectories of X may have large jumps, the simulations need to take into account that X may jump back into the barrier even after it has already left the support of \(\nu \). By the results of [13, 46], the probability that X never returns to \((-1,1)\) after reaching level \(x>1\) is

$$\begin{aligned} {\mathbb {P}}^x(X_t \not \in (-1,1) ~~\forall t>0)=\frac{\Gamma (1-\genfrac{}{}{}1{\alpha }{2})}{\Gamma (\genfrac{}{}{}1{\alpha }{2})\Gamma (1-\alpha )}\int _0^{\frac{x-1}{x+1}} u^{\nicefrac {\alpha }{2}-1}(1-u)^{-\alpha }\,\mathrm {d}u. \end{aligned}$$
(5.21)
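Up to the normalisation, the right-hand side of (5.21) is an incomplete Beta function: the prefactor equals \(1/B(\genfrac{}{}{}1{\alpha }{2}, 1-\alpha )\), so the probability is the regularised incomplete Beta function evaluated at \(z=\frac{x-1}{x+1}\). A minimal check (for \(x>1\)):

```python
# Sketch: (5.21) equals the regularised incomplete Beta function
# I_z(alpha/2, 1-alpha) with z = (x-1)/(x+1); scipy's betainc is regularised.
from scipy.special import betainc

def prob_no_return(x, alpha=0.5):
    """P^x(X never returns to (-1,1)) for x > 1, following (5.21)."""
    return betainc(alpha / 2, 1 - alpha, (x - 1) / (x + 1))

print([round(prob_no_return(x), 4) for x in (1.5, 2.0, 5.0, 50.0)])
```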
Fig. 3 Root embedding for the symmetric \(\genfrac{}{}{}1{1}{2}\)-stable process, \(\mu = \mathrm {Uniform}[-1,1]\), \(\nu = 0.75\cdot \mathrm {Beta}(2,2)\)

6 Towards generalised Root embeddings

The results of the previous sections rely on Root’s and Rost’s approach to lift X to a space-time process

$$\begin{aligned} {\overline{X}}=(t,X_t)_{t \ge 0} \end{aligned}$$

and to find solutions of SEP\((X, \mu , \nu )\)  that are given as hitting times of \({\overline{X}}\). A natural generalisation is to replace the time component by another real-valued, increasing process A with \(A_0=0\) such that (AX) is again Markov, and to carry out a similar construction. That is, to construct a set such that its first hitting time by the lifted process

$$\begin{aligned} (A_t,X_t) _{t \ge 0} \end{aligned}$$

solves SEP\((X, \mu , \nu )\). Again, one expects such a stopping time to be optimal in a minimal residual expectation sense, however, now formulated in terms of A.

Carrying out this program in full generality is beyond the scope of this article. Instead, we focus on the case when A is of the form \(A_t=\int _0^t a(X_s)\,\mathrm {d}s \) where a is strictly positive. Denote by \(\tau _s:=\inf \{t>0: A_t = s\}\) the first hitting time of \(s \ge 0\) by A and by \(Y_s:=X_{\tau _s}\) the time-changed process. Since for every (sufficiently nice) set \(R \subset [0,\infty ) \times E\)

$$\begin{aligned} \inf \{ s> 0: (s,Y_s) \in R\} = \inf \{ s> 0: (A_{\tau _s},X_{\tau _s}) \in R\}, \end{aligned}$$

this allows us to use the framework of the previous sections. Concretely, one needs to verify that the assumptions of Theorem 3.6 are met by Y. This already provides a new class of solutions of SEP\((X, \mu , \nu )\), which can be seen as an interpolation between the Root embedding (when \(a \equiv 1\)) and the classical Vallois embedding [64]: when applied to Brownian motion, the Vallois embedding can be identified as the limiting case in which a approaches a Dirac at 0.

6.1 Generalised Root embeddings

Below we restrict ourselves to additive functionals of the form

$$\begin{aligned} \,\mathrm {d}A_t = a(X_t) \,\mathrm {d}t \end{aligned}$$

with a Borel measurable a which is locally bounded and locally bounded away from 0, so that \(t\mapsto A_t\) is one-to-one and the measure \(m_A(\mathrm {d}x) = a(x) \xi (\mathrm {d}x)\) is \(\sigma \)-finite. This implies that A is an additive functional of X, i.e. A satisfies

(1) \(A_0=0\), \(t\mapsto A_t(\omega )\) is right continuous and non-decreasing, almost surely,

(2) \(A_t\) is \({\mathcal {F}}_t\)-measurable,

(3) \(A_{t+s} = A_t+ A_s\circ \theta _t\) almost surely for each \(t, s\ge 0\).

We can then define the time-changed process Y as follows

$$\begin{aligned} Y_t = X_{\tau _t}, \qquad \tau _t:= \inf \{ u > 0: \;\; A_u = t\}. \end{aligned}$$
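Numerically, the time change amounts to accumulating A along a discretised trajectory and inverting it. A minimal sketch with an Euler-discretised Brownian path and, purely for illustration, the functional \(a(x)=2+\arctan (4x)\) from Example 6.10 below (step size and horizon are arbitrary choices):

```python
# Sketch: time change Y_t = X_{tau_t} on a discretised path;
# A_t = int_0^t a(X_s) ds via a left-point Riemann sum, tau_t via searchsorted.
import numpy as np

rng = np.random.default_rng(0)
dt, n = 1e-4, 200_000
a = lambda z: 2.0 + np.arctan(4.0 * z)

X = np.concatenate([[0.0], np.cumsum(np.sqrt(dt) * rng.standard_normal(n))])
A = np.concatenate([[0.0], np.cumsum(a(X[:-1]) * dt)])   # A at the grid times k*dt

def Y(t):
    """X_{tau_t} for t <= A at the final grid time (discretised inverse of A)."""
    k = np.searchsorted(A, t)      # first grid index with A >= t
    return X[k]

print(Y(0.5), Y(2.0))
```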

By [29, Theorem 10.11], Y is a standard process. Its potential is given by

$$\begin{aligned} U^Af(x) = {\mathbb {E}}^x\left[ \int _0^ \infty f(X_t)\,\mathrm {d}A_t\right] = {\mathbb {E}}^x\left[ \int _0^ \infty f(X_t) a(X_t)\,\mathrm {d}t\right] = \int u(x,y) f(y) a(y)\xi (\mathrm {d}y), \end{aligned}$$

and we can define the potential operator \(\widehat{U}^Af(y) = \int f(x)u(x,y) a(x)\xi (\mathrm {d}x)\) for any non-negative Borel-measurable function f, which corresponds to the time-changed dual process \(\widehat{Y}_t = \widehat{X}_{\widehat{\tau }_t}\), where we analogously define \(\widehat{\tau }_t := \inf \{ u > 0: \;\;\widehat{A}_u = t\}\) with \(\widehat{A}_t = \int _0^t a(\widehat{X}_s) \,\mathrm {d}s\). In addition, strong duality holds:

Theorem 6.1

(Revuz, Thm. V.5 and Thm. 2 in VII.3 in [55]) The processes Y and \(\widehat{Y}\) are in strong duality with respect to the so-called Revuz measure \(m_A(\mathrm {d}x) = a(x) \xi (\mathrm {d}x)\).

Remark 6.2

From the duality with respect to the Revuz measure \(m_A\), it follows for any Borel measure \(\mu \) and \(y\in E\) that

$$\begin{aligned} \mu \widehat{U}^A (y) = \int u(x,y)\mu (\mathrm {d}x). \end{aligned}$$

Hence, \(\mu \widehat{U}^A = \mu \widehat{U}\), i.e. the potentials of the measures of the original and the time-changed process are equal. However, note that \(\mu U^A \not = \mu U\).

To apply our main result to the time-changed process we make the following assumption, which we will discuss later in this section.

Assumption 6.3

For all \(t>0\) and \(x\in E\), the transition functions of Y and \(\widehat{Y}\) are absolutely continuous with respect to \(m_A\), i.e. \(P^A_t(x,\cdot ) \ll m_A\) and \(\widehat{P}^A_t(\cdot ,y) \ll m_A\).

Combining the above duality results with our main Theorem 3.6 then gives us the following new solution of SEP\((X, \mu , \nu )\).

Theorem 6.4

Let X be a Markov process and A an additive functional for which Assumptions 2.1 and 6.3 hold. Let \(\mu \prec \nu \) be two measures with \(\sigma \)-finite potentials in balayage order, i.e. \(\mu \widehat{U}\ge \nu \widehat{U}\), and such that \(\nu \) charges no semipolar set. Then there exists a Root barrier \(R^A\) for the lifted process (AX) such that its first hitting time \(T^A:= \inf \{ t >0: \;\; (A_t, X_t) \in R^A \}\) embeds \(\mu \) into \(\nu \),

$$\begin{aligned} \mu P_{T^A} = \nu . \end{aligned}$$

Moreover, if we denote

$$\begin{aligned} f^{A,\mu ,\nu } = \inf \big \{ g \widehat{Q}^A-\text{ excessive: } \;\;\; g \ge \mu \widehat{U}(x) \mathbb {1}_{\{t \le 0\}} + \nu \widehat{U}(x) \mathbb {1}_{\{t >0\}} \big \}, \end{aligned}$$
(6.1)

where \(\widehat{Q}^A\) denotes the space-time semigroup associated with \(\widehat{Y}\), then

(1) \(f^{A,\mu ,\nu }(t,x) = \mu P_{\tau _{t\wedge A_{T^A}}}\widehat{U}(x)\),

(2) \(T^A = \mathop {\text {arg min}}\limits _{S:~ \mu P_S = \nu }~~ \mu P_{\tau _{t\wedge A_S}} U^A(B)\) for all Borel sets B and \(t\ge 0\),

(3) We may take \(R^A = \left\{ (s,x)\in {\mathbb {R}}_+ \times E\;\; |\;\; f^{A,\mu ,\nu }(s,x) = \nu \widehat{U} (x) \right\} \).

Proof

By Remark 6.2, \(\mu \widehat{U}\ge \nu \widehat{U}\) implies \(\mu \widehat{U}^A\ge \nu \widehat{U}^A\) for the time-changed process Y. We henceforth write \(N_B(\omega ) = \{t>0:~ X_t(\omega ) \in B\}\) for the visits of a nearly Borel set B during the lifetime of X. Then the set B is semipolar if and only if the set \(N_B\) is almost surely countable. Further we have

$$\begin{aligned}\{s>0:~ X_{\tau _s(\omega )}(\omega ) \in B\} \subseteq N_B(\omega ) \end{aligned}$$

since the mapping \(s\mapsto \tau _s\) is continuous and strictly increasing because \(t\mapsto A_t\) is. Therefore, any set B which is semipolar for X is also semipolar for Y and \(\nu \) does not charge sets which are semipolar for Y.

Due to Assumption 6.3, the processes Y and \(\widehat{Y}\) and the measures \(\mu \) and \(\nu \) satisfy the assumptions of Theorem 3.6. Hence \(f^{A,\mu ,\nu }\) and \(R^A\) defined as above are exactly the corresponding objects from Theorem 3.6 for Y, and the stopping time solving SEP\((Y,\mu , \nu )\) is given by

$$\begin{aligned} T = \inf \{t >0:~ (t, Y_t) \in R^A\}. \end{aligned}$$

Then, for \(f^{A, \mu , \nu }\) as in (6.1), \(f^{A, \mu , \nu } (t, x) = \mu P^A_{t\wedge T}\widehat{U}^A (x)= \mu P_{\tau _{t\wedge T}}\widehat{U}(x)\) is the density of the measure \(\mu P_{\tau _{t\wedge T}} U^A\) with respect to \(m_A\). If we define \(T^A = \tau _T\), then for any nearly Borel set \(B\in \mathcal {E}^n\) we obtain \({\mathbb {P}}^\mu (X_{T^A}\in B) = {\mathbb {P}}^\mu (Y_T\in B)= \nu (B)\); hence, for any solution T of SEP\((Y,\mu , \nu )\), the time \(T^A= \tau _T\) is a solution of SEP\((X, \mu , \nu )\). The optimality in Property (2) then follows.

Finally, since \(\inf \{s>0:~ (s, Y_s) \in R^A\} = \inf \{s >0:~ (A_{\tau _s}, X_{\tau _s}) \in R^A\}\), we obtain for \(T^A= \tau _T\) that

$$\begin{aligned} T^A = \inf \{ t>0: ~ (A_t, X_t)\in R^A\}, \end{aligned}$$

which completes the proof. \(\square \)

Remark 6.5

Assumption 6.3 does not always hold, even under Assumption 2.1. For instance, let \(X=(X^1,X^2)\) be the Markov process given by

$$\begin{aligned} \,\mathrm {d}X_t = (\mathrm {d}B_t, a(X^1_t) \,\mathrm {d}t) \end{aligned}$$

where a is non-negative, bounded and smooth with (for instance) \(a'\) strictly positive, and B is a linear Brownian motion. Then, by the weak Hörmander criterion, X admits transition densities with respect to Lebesgue measure and satisfies Assumption 2.1. However, taking the time change \(\tau _s\) corresponding to \(\,\mathrm {d}A_t = a(X_t^1) \,\mathrm {d}t\), the resulting process satisfies

$$\begin{aligned} \,\mathrm {d}Y_s = (\mathrm {d}B_{\tau _s}, \,\mathrm {d}s) \end{aligned}$$

which does not admit transition densities.

Remark 6.6

Let X be the diffusion with generator given in Hörmander form by

$$\begin{aligned} \mathcal {L}_X = V_0 + \sum _{i=1}^n V_i^2 \end{aligned}$$

(with, for instance, the \(V_i\)’s having bounded derivatives of all orders), then (assuming, say, that a is also smooth) the generator of Y is given by

$$\begin{aligned} \mathcal {L}_Y = \frac{1}{a} \mathcal {L}_X =V^A_0 + \sum _{i=1}^n \left( \frac{1}{a^{1/2}} V_i\right) ^2 \end{aligned}$$

for some vector field \(V^A_0\). In particular, if the following strong Hörmander condition holds for X:

$$\begin{aligned} \forall x \in {\mathbb {R}}^d,~~ \mathrm {Lie}[V_1,\ldots ,V_n](x) = {\mathbb {R}}^d \end{aligned}$$

then it also holds for the generator of Y, in which case Y admits transition densities with respect to Lebesgue measure. This condition is satisfied, for instance, when X is a multi-dimensional Brownian motion (or, more generally, Brownian motion on a Carnot group).

Remark 6.7

(Obstacle PDE) The generator of the time-changed process \(\widehat{Y}\) is given by \(\widehat{\mathcal {L}}^A f(x) = \frac{1}{a(x)} \widehat{\mathcal {L}}f(x)\), see [29]. Hence, we can again identify \(f^{A,\mu ,\nu }(t,x)\) as the solution to the obstacle problem

$$\begin{aligned}\min \left[ (\partial _t - a^{-1} \widehat{\mathcal {L}}) u, u - \nu \widehat{U} \right] = 0,\quad u(0,\cdot ) = \mu \widehat{U} \qquad \text { on } (0,+\infty )\times E\end{aligned}$$

provided additional regularity assumptions are made that guarantee well-posedness of the above PDE. However, analogously to Corollary 3.7, dynamic programming applies without any additional assumptions on \(\widehat{\mathcal {L}}\) and a.
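As a concrete illustration of the modified generator \(a^{-1}\widehat{\mathcal {L}}\), the following sketch runs the dynamic-programming iteration for one-dimensional Brownian motion (\(\widehat{\mathcal {L}} = \tfrac{1}{2}\partial _x^2\), discretised by central differences) time-changed by \(a(x)=\exp (2x)\) as in Example 6.9 below, with \(\mu =\delta _0\), \(\nu =\mathrm {Uniform}[-1,1]\) and the kernel \(u(x,y)=-|x-y|\) from (5.12); the window, mesh and time step are illustrative choices only.

```python
# Sketch: explicit scheme for the obstacle problem of Remark 6.7 with
# L-hat = 1/2 d^2/dx^2 and a(x) = exp(2x) (Example 6.9), mu = delta_0,
# nu = Uniform[-1,1]; potentials computed from u(x, y) = -|x - y|, cf. (5.12).
import numpy as np

K, dx, dt, NT = 2.0, 0.05, 2e-5, 50_000
x = np.linspace(-K, K, int(round(2 * K / dx)) + 1)
a = np.exp(2.0 * x)

muU = -np.abs(x)                                              # potential of delta_0
nuU = np.where(np.abs(x) <= 1, -(1 + x**2) / 2, -np.abs(x))   # potential of Uniform[-1,1]

v, contact = muU.copy(), []
for n in range(NT):
    lap = np.zeros_like(v)
    lap[1:-1] = (v[2:] - 2 * v[1:-1] + v[:-2]) / dx**2        # discrete d^2/dx^2
    v = np.maximum(nuU, v + dt * 0.5 * lap / a)               # one (1/a) L-hat step, then obstacle
    if n % 1000 == 0:
        contact.append(v <= nuU + 1e-12)
# Each entry of 'contact' approximates a time slice of the barrier R^A; cf. Fig. 4.
```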

Remark 6.8

(Vallois’ embedding as limit of Root type embedding) Item (2) in Theorem 6.4 implies that

$$\begin{aligned} T^A = \mathop {\text {arg min}}\limits _{S:~ \mu P_S = \nu } {\mathbb {E}}^\mu [F(A_S)] \quad \text {for any non-decreasing convex function } F. \end{aligned}$$

Taking X as the one-dimensional Brownian motion and \(a(x) =\delta _0(x)\) the Dirac at 0, the additive functional A becomes the local time of X at 0. Thus – at least informally, since a is then not bounded away from 0 – Theorem 6.4 recovers the classical Vallois embedding, see e.g. [21, 64].

6.2 Examples

We now apply Theorem 6.4 to concrete Markov processes.

Example 6.9

(Brownian motion B and \(A_t= \int _0^t \exp (2B_s)\,\mathrm {d}s\)) Taking \(X_t = B_t\) as the one-dimensional Brownian motion, the additive functional \(\int _0^t \exp (2B_s)\,\mathrm {d}s\) has received much attention (see e.g. [50]) due to its applications in mathematical finance in the context of Asian options. By the Lamperti relation, \(B_{\tau _t} \overset{\mathrm {d}}{=}\log (Z_t)\), where Z is the Bessel process of index 0 started at \(Z_0=1\), for which the transition density is well known (see [50]). Figure 4 shows the Root barriers for \(\mu = \delta _0\) and \(\nu =\mathrm {Uniform}[-1,1]\).
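A quick Monte Carlo sanity check of this identity, with all discretisation parameters chosen arbitrarily: simulate \(B_{\tau _t}\) by accumulating A along Euler paths, and compare with \(\log |W_t|\) for a planar Brownian motion W started at (1, 0), whose modulus is a Bessel process of index 0 started at 1.

```python
# Sketch: Monte Carlo check of B_{tau_t} =(d)= log(Z_t) for
# A_t = int_0^t exp(2 B_s) ds, where Z is a Bessel process of index 0
# (the modulus of a planar Brownian motion) started at Z_0 = exp(B_0) = 1.
import numpy as np

rng = np.random.default_rng(1)
t_target, dt, n_paths, n_steps = 0.5, 1e-3, 20_000, 8_000

B, A = np.zeros(n_paths), np.zeros(n_paths)
lhs, done = np.full(n_paths, np.nan), np.zeros(n_paths, dtype=bool)
for _ in range(n_steps):                         # Euler scheme, vectorised over paths
    A += np.exp(2 * B) * dt                      # accumulate A_t
    B += np.sqrt(dt) * rng.standard_normal(n_paths)
    hit = ~done & (A >= t_target)
    lhs[hit] = B[hit]                            # record B_{tau_{t_target}} at the crossing
    done |= hit
lhs = lhs[done]                                  # drop the few paths with A still < t_target

W = np.array([1.0, 0.0]) + np.sqrt(t_target) * rng.standard_normal((lhs.size, 2))
rhs = np.log(np.linalg.norm(W, axis=1))          # log Z_{t_target}

print(done.mean(), lhs.mean(), rhs.mean(), lhs.std(), rhs.std())
```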

Fig. 4 Root embedding for Brownian motion and time-changed Brownian motion, with the same time discretisation in both pictures, \(\mu = \delta _0\), \(\nu =\mathrm {Uniform}[-1,1]\)

Example 6.10

(Symmetric stable Lévy process X and \(A_t=2t+ \int _0^t\mathrm {arctan}(4X_s)\,\mathrm {d}s\)) For smooth a with \(c_1\le a\le c_2\) for some \(c_1, c_2>0\), the time-changed process \(Y_t=X_{\tau _t}\) has, by [10, Theorem (2.5)], absolutely continuous transition probabilities with respect to Lebesgue measure. Comparing the Root barriers for \(\mu = \mathrm {Uniform}[-1,1]\) and \(\nu = 0.75\cdot \mathrm {Beta}(2,2)\) for X and Y, we see that the barrier in Fig. 5b is not symmetric, unlike the barrier for X in Fig. 3b. Due to the time change, the process Y runs faster through negative values and more slowly through positive ones, which leads to the barrier being hit early on the negative side and much later on the positive side compared to X.

Fig. 5 Root embedding for the time-changed symmetric \(\genfrac{}{}{}1{1}{2}\)-stable process, with \(a(x) = 2+\mathrm {arctan}(4x)\), \(\mu = \mathrm {Uniform}[-1,1]\), \(\nu = 0.75\cdot \mathrm {Beta}(2,2)\)