1 Introduction

We study a singular stochastic control problem on a linearly controlled, one-dimensional Brownian motion \(X\) with (random) drift \(\mu \). The problem is motivated by the dividend problem, where \(X\) denotes the revenues of a firm and the firm’s manager needs to distribute dividends to the shareholders in an optimal way while being mindful of the risk of default. Similarly to the existing literature, we account for the risk of default by letting the process \(X\) be absorbed upon reaching zero.

As one may expect, the optimal distribution of dividends is very sensitive to the profitability of the firm, which is encoded in the drift \(\mu \) of the process \(X\). A positive drift reflects a company in good health, and as a rule of thumb, dividends are paid when revenues are sufficiently high (which is expected to occur rather often) so as to keep the risk of default low. On the other hand, a negative drift indicates a firm that operates at a loss and therefore should be wound up as soon as possible by paying out all dividends.

Estimating profitability is a challenging task in many real-world situations and has already received attention in the mathematical economic literature; see e.g. Décamps et al. [20] for investment timing, De Marzo and Sannikov [22] for contract theory, Daley and Green [14] for asset trading. In order to capture this feature in a nontrivial but tractable way, we assume partial information on the drift of the process \(X\). This is a novelty compared to existing models on dividend distribution.

We remark that statistical estimation of the drift of a drifting Brownian motion from observation of the process is a much less efficient procedure than estimation of its volatility. Indeed, over a given period of time \([0,T]\), the variance of the classical estimator of the volatility can be reduced by increasing the number of observations, whereas this is not the case for the variance of the estimator of the drift \(\mu \). The latter is of order \(1/T\) (see Ekström and Lu [24, Example 2.1] for a simple example); hence an accurate estimate of \(\mu \) requires a long period of observation under the exact same market conditions, which in reality is not feasible.
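This asymmetry is easy to reproduce numerically. The sketch below (all parameter values hypothetical) estimates the drift by \((X_{T}-X_{0})/T\) and the squared volatility by the realized quadratic variation over \(n\) equispaced observations on \([0,T]\): refining the grid shrinks the variance of the volatility estimator, while the drift estimator's variance stays near \(\sigma ^{2}/T\) no matter how many observations are taken.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, T = 0.5, 1.0, 1.0   # hypothetical parameters

def estimator_variances(n, n_paths=20000):
    """Sample variances of the drift and volatility estimators built from
    n equispaced observations of X_t = mu*t + sigma*B_t on [0, T]."""
    dt = T / n
    dX = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal((n_paths, n))
    mu_hat = dX.sum(axis=1) / T            # (X_T - X_0) / T
    sig2_hat = (dX ** 2).sum(axis=1) / T   # realized quadratic variation / T
    return mu_hat.var(), sig2_hat.var()

v_mu_10, v_sig_10 = estimator_variances(10)
v_mu_1000, v_sig_1000 = estimator_variances(1000)
# v_sig shrinks roughly like 1/n, while v_mu stays near sigma**2 / T
```

The variance of the quadratic-variation estimator decays roughly like \(1/n\), whereas the drift estimator's variance is pinned at \(\sigma ^{2}/T\) regardless of \(n\).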

Our study shows how the flow of information affects the firm manager’s optimal dividend strategy. As in the informal discussion above, dividends are paid only when revenues exceed a critical value \(d^{*}\); however, in contrast to the existing literature, this critical value changes dynamically according to the manager’s current belief about the profitability of the firm. As we explain in more detail below, that belief is described by a state variable \(\pi \in (0,1)\), where a value of \(\pi \) close to 1 indicates a strong belief in a positive drift and a value of \(\pi \) close to 0 indicates a strong belief in a negative drift. We observe that the critical value \(d^{*}\) of the revenues increases (but stays bounded) as \(\pi \) increases, which is in line with the intuition that a firm with high profitability expects good performance and chooses to pay dividends when large revenues are realised, so that the risk of default is kept low and the business can be sustained over longer times. On the contrary, if there is a weak belief in the profitability of the firm, then dividends will be paid also for lower levels of the revenues, as there is no expectation that these will increase in the future. The partially informed manager of our firm learns about the true value of profitability by observing the stream of revenues \(X\) and adjusts her strategy accordingly, so that dividends are paid dynamically at different levels of revenue depending on the learning process.

The observation of \(X\) will in the end reveal the true drift \(\mu \), so that the belief of the firm’s manager will eventually converge to either \(\pi =0\) or \(\pi =1\). Her dividend strategy will then converge to the corresponding strategy for the problem with full information (see Proposition 5.14). This shows that our model complements and extends the existing literature, which is reviewed in the next section, by displaying a richer structure of the optimal solution and by effectively adding a new dimension to the classical problem (i.e., the belief). For a broader discussion on the economic foundations and implications of a dividend problem with partial information, we also refer the reader to the introduction of the preprint Décamps and Villeneuve [21], where a special case of our problem is studied with different methods (a detailed comparison is given in the final three paragraphs of the next section).

1.1 Mathematical background and overview of main results

Our specific mathematical interest is in the explicit characterisation of the optimal control in terms of an optimal boundary arising from an associated free boundary problem. To the best of our knowledge, the study of free boundaries for singular stochastic control problems associated to diffusions with absorption and partial information has never been addressed in the literature. Recently Øksendal and Sulem [39] studied general maximum principles for singular control problems with partial information. Their approach relies mostly on backward stochastic differential equations (BSDEs), and they provide general abstract results linking the value of the singular control problem to the solution of suitable BSDEs. Here we focus instead on a specific problem with the aim of a more detailed study of the optimal control. It is worth noticing that [39] does not consider the case of absorbed diffusions, which is a source of interesting mathematical facts in our paper, as we discuss below.

For the sake of tractability, we choose a model in which \(\mu \) is a random variable that can only take two real values, i.e., \(\mu \in \{ \mu _{0},\mu _{1}\}\), with \(\mu _{0}<\mu _{1}\). The company’s revenue at time \(t\) net of dividend payments reads

$$\begin{aligned} X^{D}_{t}=X_{t}-D_{t}:=x+\mu t+\sigma B_{t}-D_{t}, \end{aligned}$$
(1.1)

where \(B\) is a Brownian motion, \(\sigma >0\) and \(D_{t}\) denotes the total amount of dividends paid up to time \(t\) (notice that \(D\) is an increasing process and we choose it to be right-continuous). As in the most canonical formulation of the dividend problem, the firm’s manager wants to maximise the discounted flow of dividends until the firm goes bankrupt. Moreover, the manager can infer the true value of \(\mu \) by observing the evolution of \(X\).

By using standard filtering techniques, the problem can be rewritten in a Markovian framework by considering simultaneously the dynamics of \(X^{D}\) and of the process \(\pi _{t}:=P[\mu =\mu _{1}| \mathcal{F}^{X}_{t}]\), where \(\mathcal{F}^{X}_{t}=\sigma (X_{s}, s \le t)\). This approach has a long and venerable history in optimal stopping theory, with early contributions dating back to work of Shiryaev in the 1960s in the context of quickest detection. See Shiryaev [45] for a survey, and Johnson and Peskir [34] for some recent developments and further references. However, it seems that this model has never been adopted in the context of singular control.

One difficulty that arises by the reduction to a Markovian framework is that the dynamics of the state process is two-dimensional and diffusive. This leads to a variational formulation of the stochastic control problem in terms of PDEs, and therefore explicit solutions cannot be provided in general.

The literature on the optimal dividend problem is very rich with seminal mathematical contributions by Jeanblanc and Shiryaev [32] and Radner and Shepp [42]. More recent contributions include, among many others, the survey by Avanzi [2], Akyildirim et al. [1] and Eisenberg [23] who consider random interest rates, Avram et al. [3] who allow jumps in the dynamics of \(X\), Jiang and Pistorius [33] who consider a regime-switching dynamics for the coefficients in (1.1), or Bayraktar et al. [6] who consider jumps in the dynamics of \(X\) and fixed transaction costs for dividend lump payments. However, research so far has largely focused on explicitly solvable examples. This means that in the majority of papers, the underlying stochastic dynamics are either one- or two-dimensional but with one of the state processes driven by a Markov chain. Moreover, the time horizon \(\gamma ^{D}\) of the optimisation is usually assumed to be the first time of \(X^{D}\) falling below some level \(a\ge 0\). Alternatively, capital injection is allowed and the optimisation continues indefinitely, i.e., \(\gamma ^{D}=\infty \). These choices of \(\gamma ^{D}\) make the problem time-homogeneous and easier to deal with. In the absence of capital injection, even just assuming a finite time-horizon for the dividend problem, i.e., taking \(\gamma ^{D}\wedge T\) for some deterministic \(T>0\), introduces major technical difficulties. The latter were addressed first by Grandits in [29] and [30] with PDE methods, and then by De Angelis and Ekström [16] with probabilistic methods. Interestingly, the finite time-horizon is more easily tractable in the presence of capital injection, as shown in Ferrari and Schuhmann [26] using ideas originally contained in El Karoui and Karatzas [25].

Here we take the approach suggested in [16], but as we explain below, we substantially expand the results therein. First we link our dividend problem to a suitable optimal stopping one. Then we solve the optimal stopping problem (OSP) by characterising its optimal stopping rule in terms of a free boundary \(\pi \mapsto d(\pi )\). Finally, we deduce from properties of the value function \(U\) of the OSP that the value function \(V\) of the dividend problem is a strong solution of an associated variational inequality on \([0,\infty )\times [0,1]\) with gradient constraint. Moreover, using the boundary \(d(\cdot )\), we express the optimal dividend strategy as an explicit process depending on \(t\mapsto d(\pi _{t})\). It is worth noticing that we can prove that \(V\in C^{1}((0,\infty ) \times (0,1))\), with the second derivatives \(V_{xx}\) and \(V_{x\pi }\) belonging to \(C((0,\infty ) \times (0,1))\) and \(V_{\pi \pi }\in L^{\infty }((0,\infty ) \times (0,1))\). This type of global regularity cannot be easily obtained with PDE methods due to the degeneracy of the underlying diffusion. Here we obtain these results with a careful probabilistic study of the value function \(U\). In particular, the argument used to prove \(V_{\pi \pi }\in L^{\infty }((0, \infty ) \times (0,1))\) in Proposition 6.2 seems completely new in the related literature.

As in [16], the presence of an absorbing point for the process \(X^{D}\) ‘destroys’ the standard link between optimal stopping and singular control. Such a link has been studied by many authors. Bather and Chernoff [5] and Beneš et al. [7] were the first to observe it, and Taksar [47] provided an early connection to Dynkin games. Extensions and refinements of the initial results were obtained in a long series of subsequent papers using different methodologies. Just to mention a few, we recall Boetius and Kohlmann [10], El Karoui and Karatzas [25] and Karatzas and Shreve [35] who address the problem with probabilistic methods, Benth and Reikvam [8] who use viscosity theory, and Guo and Tomecek [31] who link singular control problems to switching problems.

Departing from the literature mentioned above, we prove here that \(V_{x}=U\), where now \(U\) is the value function of an OSP whose underlying process is a two-dimensional, uncontrolled, degenerate diffusion \((\widehat{X},\widehat{\pi })\), which lives in \([0,\infty ) \times [0,1]\) and is reflected at \(\{0\}\times (0,1)\) towards the interior of the domain along the direction of a state-dependent vector \(\mathbf{v}(\widehat{\pi })\) (see Sect. 4.1). Moreover, upon each reflection, the gain process that is underlying the OSP increases exponentially at a rate that depends on the ‘intensity’ of the reflection and on the value \(\widehat{\pi }_{t}\) of the process. We call this behaviour of the gain process ‘state-dependent creation’ of the process \((\widehat{X},\widehat{\pi })\) at \(\{0\}\times (0,1)\) (cf. Peskir [40]). Indeed, it is interesting that the ‘creation’ feature of our reflected process links our paper to work by Stroock and Williams [46] and Peskir [40] concerning a type of non-Feller boundary behaviour of one-dimensional Brownian motion with drift. Notice, however, that in those papers, the creation rate is constant and the problem is set on the real line, so that the direction of reflection is fixed. Here we deal instead with an example of a nontrivial, two-dimensional extension of the problem studied in [46, 40].

A striking difference with the problem studied in [16] is the much more involved dynamics underlying the OSP and the behaviour of the gain process. In [16], the state dynamics in the control problem is of the form \((t,\check{X}^{D}_{t})\), with \(\check{X}^{D}\) as in (1.1) but with deterministic constant drift. This leads to an optimal stopping problem involving a one-dimensional Brownian motion with drift which is reflected at zero, and which is created (in the same sense as above) at a constant rate. The state variable ‘time’ is unaffected by the link between the dividend problem and the stopping one. Here instead, the correlation in the dynamics of \(X^{D}\) and \(\pi \) in the control problem induces two main effects: (i) it causes the reflection of the process \((\widehat{X},\widehat{\pi })\) to be along the stochastic vector process \(t\mapsto \mathbf{v}(\widehat{\pi }_{t})\) (see (4.5) and (4.6)), and (ii) it generates a nonconstant creation rate that depends on the process \(\widehat{\pi }\) (see (4.8)).

The reflection of \((\widehat{X},\widehat{\pi })\) at \(\{0\}\times (0,1)\) is realised by an increasing process \((A_{t})_{t\ge 0}\) which we can write down explicitly (see (4.14)) and which we informally refer to as ‘local time’ of \((\widehat{X},\widehat{\pi })\) at \(\{0\}\times (0,1)\). Despite its use in solving the dividend problem, the OSP that we derive is interesting in its own right and belongs to a class of problems that, to the best of our knowledge, has never been studied before. In particular, this is an optimal stopping problem on a multidimensional diffusion, reflected in a domain \(\mathcal{O}\), with a gain process that increases exponentially at a rate proportional to the local time spent by the process in some portions of \(\partial \mathcal{O}\) (moreover, this rate is nonconstant).

In conclusion, we believe that the main mathematical contributions of our work are the following: (i) for the first time, we characterise the free boundary associated to a singular stochastic control problem with partial information on the drift of the process and absorption; (ii) we obtain rather strong regularity results for the value \(V\) of the control problem, despite degeneracy of the associated HJB operator; (iii) we find a nontrivial connection between singular control for multidimensional diffusions with absorption, and optimal stopping of reflected diffusions with ‘state-dependent creation’; (iv) we solve an example of a new class of optimal stopping problems whose popularity, we hope, will increase with the increasing understanding of their role in the dividend problem.

After completing this work, we learned about the preprint by Décamps and Villeneuve [21] where the same problem is addressed in the special case of \(\mu _{1}=-\mu _{0}\). In that setting, the problem’s dimension can be reduced by a transformation that makes one of the two state processes purely controlled (a closer inspection reveals that this is in line with the case of a null drift in our equation (4.46)). The problem in [21] can be solved by ‘guess-and-verify’ via a parameter-dependent family of ODEs with suitable boundary conditions. The methods of [21] cannot be used for generic \(\mu _{0}\) and \(\mu _{1}\) because the dimension reduction is no longer possible and the family of ODEs turns into a two-dimensional free boundary problem involving partial derivatives.

Besides the methodological differences between the two papers, the optimal strategy obtained in [21] shares similarities with ours but also features a remarkable difference. Due to the fact that one of the state variables is purely controlled, in [21], the level of future revenues at which dividends will be paid can only increase after each dividend payment. As stated in [21], this can be understood as the firm’s manager ‘becoming more confident about the relevance of her project’. When revenues reach a new maximum, this suggests to the manager that the drift is positive; however, the symmetric structure \(\mu _{1}=-\mu _{0}\) is such that she does not subsequently change her view, even if revenues start fluctuating downwards. This stands in sharp contrast with our solution, which instead allows the manager to increase/decrease her revenues’ target level depending on the new information acquired.

The rest of the paper is organised as follows. In Sect. 2, we cast the problem and provide its Markovian formulation. Section 3 introduces the verification theorem which we aim at proving probabilistically in the subsequent sections. The main technical contribution of the paper is contained in Sects. 4–6. In the first part of Sect. 4, we introduce the stopping problem for a two-dimensional degenerate diffusion with state-dependent reflection. Then, in the rest of Sect. 4 and in Sect. 5, we study properties of the associated value function and obtain geometric properties of the optimal stopping set. In Sect. 6, we prove that the value function and the optimal control of the dividend problem can be constructed from the value function of the optimal stopping problem and its optimal stopping region. A short appendix contains a rather standard proof of the verification theorem stated in Sect. 3.

2 Setting

We consider a complete probability space \((\Omega ,\mathcal{F},P)\) equipped with a one-dimensional Brownian motion \((B_{t})_{t \ge 0}\) and its natural filtration \((\mathcal{F}^{B}_{t})_{t\ge 0}\) completed with \(P\)-nullsets. On the same probability space, we also have a random variable \(\mu \) which is independent of \(B\) and takes two possible real values \(\mu _{0}<\mu _{1}\), with probability \(P[ \mu =\mu _{1}]=\pi \in [0,1]\). Further, given \(x>0\) and \(\sigma >0\), we model the firm’s revenue in the absence of dividend payments by the process \((X_{t})_{t\ge 0}\) defined as

$$\begin{aligned} X_{t}=x+\mu t+\sigma B_{t}, \qquad t\ge 0. \end{aligned}$$
(2.1)

We denote by \((\mathcal{F}^{X}_{t})_{t\ge 0}\) the filtration generated by \(X\), augmented with \(P\)-nullsets, and we say that a dividend strategy is an \((\mathcal{F}^{X}_{t})_{t\ge 0}\)-adapted, increasing, right-continuous process \((D_{t})_{t\ge 0}\) with \(D_{0-}=0\). In particular, \(D_{t}\) represents the cumulative amount of dividends paid by the firm up to time \(t\), and we say that the firm’s profit under the dividend strategy \(D\) is

$$\begin{aligned} X^{D}_{t}=x+\mu t+\sigma B_{t}-D_{t}, \qquad t\ge 0. \end{aligned}$$
(2.2)

Notice that for \(D\equiv 0\), we have \(X^{0}=X\). As is customary in the dividend problem, we define a default time at which the firm stops paying dividends and denote it by

$$ \gamma ^{D}:=\inf \{t\ge 0 : X^{D}_{t}\le 0\}. $$

Equipped with this simple model for the firm’s profitability, the manager of the firm wants to maximise the expected flow of discounted dividends until the default time, where discounting occurs at a constant rate \(\rho >0\), i.e.,

$$\begin{aligned} \text{maximise the value of } E\bigg[\int ^{\gamma ^{D}}_{0-}e^{-\rho t}dD _{t}\bigg] \text{ over $D\in \mathcal{A}$}, \end{aligned}$$
(2.3)

where \(\mathcal{A}\) denotes the set of admissible dividend strategies. In particular,

$$\begin{aligned} &\text{$D\in \mathcal{A}$ iff $D$ is $(\mathcal{F}^{X}_{t})_{t\ge 0}$-adapted, increasing, right-continuous,} \\ &\text{with $D_{0-}=0$ and such that $D_{t}-D_{t-}\le X^{D}_{t-}$ for all $t\ge 0$, $P$-a.s.} \end{aligned}$$

It is important to notice that \(X=X^{D}+D\). Moreover, the control process \(D\) is chosen by the firm’s manager based on her observation of the process \(X\) and it is therefore natural that \(D_{t}\) should be \(\mathcal{F}^{X}_{t}\)-measurable.

It is well known that the dynamics (2.2) may be rewritten in a more tractable Markovian form, thanks to standard filtering methods (see for instance Shiryaev [44, Sect. 4.2]). In particular, denoting \(\pi _{t}:=P[\mu =\mu _{1}\big |\mathcal{F}^{X}_{t}]\), one can construct an \(((\mathcal{F}^{X}_{t})_{t\ge 0},P)\)-Brownian motion \((W_{t})_{t \ge 0}\) and write the dynamics of the couple \((X^{D}_{t},\pi _{t})_{t \ge 0}\) in the form

$$\begin{aligned} dX^{D}_{t} &=(\mu _{0}+\hat{\mu }\pi _{t})dt+\sigma dW_{t}-dD_{t}, \qquad X^{D}_{0}=x, \end{aligned}$$
(2.4)
$$\begin{aligned} d\pi _{t} &=\theta \pi _{t}(1-\pi _{t})dW_{t}, \qquad \pi _{0}=\pi , \end{aligned}$$
(2.5)

under the measure \(P\), with \(\hat{\mu }:=\mu _{1}-\mu _{0}\) and \(\theta :=\hat{\mu }/\sigma \). We notice that (2.4) can be obtained from (2.2) by formally replacing \(\mu \) with \(E[ \mu | \mathcal{F}^{X}_{t}]\). Moreover, \((\pi _{t})_{t\ge 0}\) in (2.5) is a bounded martingale; hence it is a martingale on \([0,\infty ]\) and in particular \(\pi _{\infty }\in \{0,1\}\), since \(\mu =\lim _{s\to \infty }X_{s}/s\) \(P\)-a.s. and therefore \(\mu \) is \(\mathcal{F}^{X}_{\infty }\)-measurable.

Intuitively, we can say that at any given time \(t\ge 0\), the amount of new information which becomes available to the firm’s manager is measured by the absolute value of the increment \(\Delta \pi _{t}\). Then, the learning rate depends on the so-called signal-to-noise ratio \(\theta \) and on the current belief \(\pi _{t}\), which appear in the diffusion coefficient in (2.5). Given an increment \(\Delta W _{t}\) of the Brownian motion, the value of \(|\Delta \pi _{t}|\) is increasing in the signal-to-noise ratio, as expected. Further, the maximum of the diffusion coefficient (hence the maximum learning rate) occurs when \(\pi _{t}=1/2\), which corresponds to the most uncertain situation.
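The belief process can also be computed directly from Bayes’ rule: since \(X\) is a Brownian motion with unknown two-valued drift, the terminal value \(X_{T}\) carries all the information about \(\mu \) contained in the path up to \(T\), and the posterior follows from the two Gaussian likelihoods. A minimal Monte Carlo sketch (all parameter values hypothetical) computing \(\pi _{T}\) this way and illustrating the martingale property \(E[\pi _{T}]=\pi _{0}\):

```python
import numpy as np

rng = np.random.default_rng(1)
mu0, mu1, sigma, pi0, T = -1.0, 1.0, 1.0, 0.4, 2.0   # hypothetical parameters
n_paths = 200000

# Draw the random drift and one observation X_T - X_0 of the revenue process.
mu = np.where(rng.random(n_paths) < pi0, mu1, mu0)
X_T = mu * T + sigma * np.sqrt(T) * rng.standard_normal(n_paths)

def lik(m):
    """Gaussian likelihood of the observed increment under drift m."""
    return np.exp(-(X_T - m * T) ** 2 / (2 * sigma**2 * T))

# Posterior probability that the drift equals mu1, given the observation.
pi_T = pi0 * lik(mu1) / (pi0 * lik(mu1) + (1 - pi0) * lik(mu0))
```

By the tower property, the sample mean of `pi_T` is close to `pi0`, and the posterior concentrates on the correct drift: paths generated with \(\mu =\mu _{1}\) produce beliefs well above those generated with \(\mu =\mu _{0}\).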

Since \((X^{D}_{t},D_{t},\pi _{t},W_{t})_{t\ge 0}\) is \((\mathcal{F}^{X} _{t})_{t\ge 0}\)-adapted and we do not need to consider any other filtration, we write from now on \(\mathcal{F}_{t}= \mathcal{F}^{X}_{t}\) to simplify the notation. In the new Markovian framework, our problem (2.3) reads

$$\begin{aligned} V(x,\pi )=\sup _{D\in \mathcal{A}}E_{x,\pi }\bigg[\int ^{\gamma ^{D}} _{0-}e^{-\rho t}dD_{t}\bigg] \qquad \text{for }(x,\pi ) \in [0,\infty ) \times (0,1), \end{aligned}$$
(2.6)

where \(E_{x,\pi }[\,\cdot \,]:=E[\,\cdot \,|X_{0}=x,\pi _{0}=\pi ]\).

The formulation in (2.6) of the optimal dividend problem with partial information corresponds to a singular stochastic control problem whose underlying process is a two-dimensional degenerate diffusion which is killed upon leaving the set \((0,\infty ) \times [0,1]\) (recall that if \(\pi _{0}\in (0,1)\), then \(\pi _{t}\in (0,1)\) for all \(t\in (0,\infty )\), whereas if \(\pi _{0}\in \{0,1\}\), then \(\pi _{t}=\pi _{0}\) for all \(t>0\)). In the economic literature, the value function \(V\) of (2.6) is traditionally considered as the value of the firm itself.

Remark 2.1

The case of full information corresponds to \(\pi \in \{0,1\}\). In this case, it is known that if \(\mu \le 0\), it is optimal to pay all dividends immediately and liquidate the firm. On the other hand, if \(\mu >0\), then dividends should be paid gradually according to a strategy characterised by a Skorokhod reflection of the process \(X^{D}\) against a positive (moving) boundary; see Jeanblanc and Shiryaev [32] for the stationary and De Angelis and Ekström [16] for the nonstationary case.

In our setting with partial information, it is clear that \(\mu _{0}<\mu _{1}\le 0\) would lead to an immediate liquidation of the firm. The cases \(\mu _{1}>\mu _{0}\ge 0\) and \(\mu _{0}<0<\mu _{1}\) instead need to be studied separately as they present subtle technical differences which would make a unified exposition rather lengthy. In this paper, we start with the case \(\mu _{0}<0<\mu _{1}\), which seems economically the most interesting as it represents the uncertainty of a firm which cannot predict exactly whether its line of business has an increasing or decreasing future trend.
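In the full-information cases of Remark 2.1, the value of a fixed constant-barrier strategy can be estimated by simulation. The sketch below (Euler discretisation; all parameters, the barrier level \(b\) and the truncation horizon hypothetical, so only a rough approximation) pays out \((X_{t}-b)^{+}\) at each step to keep \(X^{D}\le b\) and stops collecting dividends at the default time:

```python
import numpy as np

rng = np.random.default_rng(2)

def barrier_value(x, mu, sigma=1.0, rho=0.1, b=1.5,
                  dt=0.01, T=30.0, n_paths=2000):
    """Rough Monte Carlo value of the constant-barrier dividend strategy:
    pay dividends to keep X^D <= b; stop paying at the default time."""
    X = np.full(n_paths, min(x, b))
    value = np.full(n_paths, max(x - b, 0.0))   # initial lump payment if x > b
    alive = X > 0.0
    for k in range(int(T / dt)):
        dX = mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
        X = np.where(alive, X + dX, X)
        pay = np.clip(X - b, 0.0, None)         # overshoot above b is paid out
        value += alive * np.exp(-rho * k * dt) * pay
        X = np.minimum(X, b)                    # Skorokhod reflection at b
        alive &= X > 0.0                        # absorb defaulted paths
    return value.mean()

v_pos = barrier_value(1.0, mu=0.8)    # profitable firm
v_neg = barrier_value(1.0, mu=-0.8)   # loss-making firm
```

As expected from the discussion above, a positive drift yields a substantially higher value than a negative one under the same barrier.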

Motivated by the above remark, we make the following standing assumption throughout the paper:

Assumption 2.2

We have \(\mu _{1}>0>\mu _{0}\).

We close this section by introducing the infinitesimal generator \(\mathcal{L}_{X,\pi }\) associated to the uncontrolled process \((X_{t},\pi _{t})_{t\ge 0}\). Given any function \(f\in C^{2}([0, \infty ) \times [0,1])\), we set

$$\begin{aligned} (\mathcal{L}_{X,\pi } f)(x,\pi ) &:=\frac{1}{2} \big(\sigma ^{2} f_{xx} + 2\sigma \theta \pi (1-\pi )f_{x\pi }+ \theta ^{2}\pi ^{2}(1-\pi )^{2}f _{\pi \pi }\big)(x,\pi ) \\ & \phantom{=::}+ (\mu _{0}+\hat{\mu }\pi )f_{x}(x,\pi ) \end{aligned}$$

for \((x,\pi )\in [0,\infty ) \times [0,1]\) and where \(f_{xx}\), \(f_{x \pi }\), \(f_{\pi \pi }\) are second derivatives and \(f_{x}\) a first derivative. For simplicity, in the rest of the paper, we also define

$$\begin{aligned} \mathcal{O}:=(0,\infty )\times (0,1). \end{aligned}$$

Moreover, given a set \(A\), we denote by \(\overline{A}\) its closure.
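The expression for \(\mathcal{L}_{X,\pi }\) can be checked against the dynamics (2.4), (2.5): both components are driven by the same Brownian motion \(W\), so the diffusion matrix is \(ss^{\top }\) with \(s=(\sigma ,\theta \pi (1-\pi ))^{\top }\), and the drift vector is \((\mu _{0}+\hat{\mu }\pi ,0)\). A small symbolic verification (a sketch using sympy):

```python
import sympy as sp

x, p = sp.symbols('x pi')
sigma, theta, mu0, muh = sp.symbols('sigma theta mu0 muhat', positive=True)
f = sp.Function('f')(x, p)

s = sp.Matrix([sigma, theta * p * (1 - p)])  # common diffusion vector
b = sp.Matrix([mu0 + muh * p, 0])            # drift vector of (X, pi)
a = s * s.T                                  # diffusion matrix s s^T

hess = sp.Matrix(2, 2, lambda i, j: f.diff([x, p][i], [x, p][j]))
grad = sp.Matrix([f.diff(x), f.diff(p)])
gen = sp.Rational(1, 2) * (a * hess).trace() + (b.T * grad)[0]

# Generator as displayed in the text
target = (sp.Rational(1, 2) * (sigma**2 * f.diff(x, 2)
          + 2 * sigma * theta * p * (1 - p) * f.diff(x, p)
          + theta**2 * p**2 * (1 - p)**2 * f.diff(p, 2))
          + (mu0 + muh * p) * f.diff(x))

difference = sp.simplify(sp.expand(gen - target))   # should be identically 0
```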

Following the approach introduced in [16], we start our analysis in the next section by providing a verification theorem for \(V\). Then we use the latter to conjecture an optimal stopping problem that should be associated with \(V_{x}\). It will soon become clear that the construction of [16] is substantially easier than the one needed here. Our new construction also leads to a much more involved optimal stopping problem.

3 A verification theorem

A familiar heuristic use of the dynamic programming principle suggests that for any admissible control \(D\), the process

$$\begin{aligned} e^{-\rho (t\wedge \gamma ^{D})}V\big(X^{D}_{t\wedge \gamma ^{D}}, \pi _{t\wedge \gamma ^{D}}\big)+\int _{0-}^{t\wedge \gamma ^{D}}e^{- \rho s}dD_{s}, \qquad t \geq 0, \end{aligned}$$
(3.1)

should be a supermartingale, and if \(D=D^{*}\) is an optimal control, then (3.1) should be a martingale. Moreover, given a starting point \((x,\pi )\), one strategy could be to pay immediately a small amount \(\delta \) of dividends, hence shifting the dynamics to the point \((x-\delta ,\pi )\), and then continue optimally. Since this would in general be suboptimal, one has

$$\text{$V(x,\pi )\ge V(x-\delta ,\pi )+\delta $, which implies $V_{x}(x,\pi )\ge 1$.} $$

If the inequality is strict, then the suggested strategy is strictly suboptimal. Hence, the firm should pay dividends when \(V_{x}=1\) and do nothing when \(V_{x}> 1\). It is also clear from (2.6) that \(V(0,\pi )=0\) for all \(\pi \in [0,1]\).

Based on this heuristic, we can formulate the following verification theorem. Its proof is rather standard (see e.g. Fleming and Soner [27, Theorem VIII.4.1]) and we give it in the Appendix for completeness.

Theorem 3.1

Let \(v\in C^{1}(\mathcal{O})\cap C(\overline{\mathcal{O}})\) with \(v_{xx}, v_{x\pi }\in C(\mathcal{O})\) and \(v_{\pi \pi }\in L^{\infty }_{\mathrm{loc}}(\mathcal{O})\). Assume that \(0\le v(x,\pi )\le c x\) for all \((x,\pi )\in \mathcal{O}\) and some \(c>0\), and that \(v\) solves

$$\begin{aligned} \max \{(\mathcal{L}_{X,\pi }-\rho )v,1-v_{x}\}(x,\pi ) &=0 \qquad \textit{for a.e.}\ (x,\pi )\in \mathcal{O} \end{aligned}$$
(3.2)
$$\begin{aligned} v(0,\pi ) &=0 \qquad \textit{for}\ \pi \in [0,1]. \end{aligned}$$
(3.3)

Then \(v\ge V\) on \(\mathcal{O}\).

Let us denote

$$\begin{aligned} \mathcal{I}_{v}:=\{(x,\pi )\in \mathcal{O}: v_{x}(x,\pi )>1\}. \end{aligned}$$
(3.4)

In addition to the above, assume that \(v\in C^{2}(\overline{ \mathcal{I}_{v}}\cap \mathcal{O})\) and there exists \(D^{v}\in \mathcal{A}\) such that for \(P\)-a.e. \(\omega \in \Omega \) and for all \(0\le t\le \gamma ^{D^{v}}(\omega )\), we have

$$\begin{aligned} (X^{D^{v}}_{t},\pi _{t}) &\in \overline{\mathcal{I}_{v}}, \end{aligned}$$
(3.5)
(3.6)
(3.7)

Then \(V=v\) on \(\mathcal{O}\) and \(D^{*}:=D^{v}\) is an optimal dividend strategy.

If \(V\in C^{1}(\mathcal{O})\), we can define

$$\begin{aligned} \mathcal{I}:=\{(x,\pi )\in \mathcal{O}:V_{x}(x,\pi )>1\}, \end{aligned}$$
(3.8)

and we refer to ℐ as the inaction set for problem (2.6). For future reference, we also recall that if \(V\in C^{2}( \mathcal{O})\) solves (3.2) and (3.3), then we have in particular

$$\begin{aligned} (\mathcal{L}_{X,\pi } V-\rho V)(x,\pi )=0, \qquad (x,\pi )\in \mathcal{I}. \end{aligned}$$
(3.9)

4 Stopping a two-dimensional diffusion with reflection and creation

In this section, we construct an optimal stopping problem (OSP) which involves a two-dimensional degenerate diffusion. This diffusion is kept inside \(\mathcal{O}\) by reflection at \(\{0\}\times (0,1)\) and it also undergoes creation upon each new reflection, in a sense which will be mathematically clarified later. Here we also start a detailed study of the optimal stopping region and of the value function of this OSP, which will be instrumental to solve problem (2.6).

4.1 Construction of the stopping problem

Assume for a moment that \(V\in C^{2}(\overline{\mathcal{O}})\) so that the boundary condition \(V(0,\pi )=0\) would also imply \(V_{\pi }(0,\pi )=V_{\pi \pi }(0,\pi )=0\). Then for all \(\pi \in (0,1)\) for which \((0,\pi )\in \mathcal{I}\) (see (3.8)), we get from (3.9) that

$$\begin{aligned} \frac{1}{2}\sigma ^{2}V_{xx}(0,\pi )+\sigma \theta \pi (1-\pi )V_{x \pi }(0,\pi )+(\mu _{0}+\hat{\mu }\pi )V_{x}(0,\pi )=0. \end{aligned}$$
(4.1)

Setting \(u:=V_{x}\) we notice that \(\mathcal{I}=\{(x,\pi )\in \mathcal{O}: u(x,\pi )>1\}\) and that \(u\ge 1\) in \(\mathcal{O}\). Moreover, formally differentiating (3.9) and using (4.1), we obtain that \(u\) solves

$$\begin{aligned} (\mathcal{L}_{X,\pi } u-\rho u)(x,\pi ) &=0, \qquad (x,\pi )\in \mathcal{I}, \end{aligned}$$
(4.2)
$$\begin{aligned} u(x,\pi ) &=1, \qquad (x,\pi )\in \partial \mathcal{I}, \end{aligned}$$
(4.3)
$$\begin{aligned} \frac{1}{2}\sigma ^{2}u_{x}(0,\pi )+\sigma \theta \pi (1-\pi )u_{ \pi }(0,\pi ) & \\ \phantom{=:}+(\mu _{0}+\hat{\mu }\pi )u(0,\pi ) &=0, \qquad (0,\pi )\in \mathcal{I}. \end{aligned}$$
(4.4)
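The formal step from (3.9) to (4.2) uses that none of the coefficients of \(\mathcal{L}_{X,\pi }\) depends on \(x\), so differentiation in \(x\) commutes with the operator. A symbolic check of this commutation (a sketch using sympy):

```python
import sympy as sp

x, p = sp.symbols('x pi')
sigma, theta, mu0, muh, rho = sp.symbols('sigma theta mu0 muhat rho',
                                         positive=True)
V = sp.Function('V')(x, p)

def L(g):
    """The generator L_{X,pi} applied to g(x, pi)."""
    return (sp.Rational(1, 2) * (sigma**2 * g.diff(x, 2)
            + 2 * sigma * theta * p * (1 - p) * g.diff(x, p)
            + theta**2 * p**2 * (1 - p)**2 * g.diff(p, 2))
            + (mu0 + muh * p) * g.diff(x))

lhs = (L(V) - rho * V).diff(x)           # d/dx of (L - rho)V
rhs = L(V.diff(x)) - rho * V.diff(x)     # (L - rho) applied to u = V_x
commutator = sp.simplify(sp.expand(lhs - rhs))   # should be identically 0
```

The boundary condition (4.4) is not covered by this computation; it comes from (4.1), i.e., from evaluating (3.9) at \(x=0\) using \(V(0,\pi )=0\).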

We claim that the variational problem (4.2)–(4.4) should be connected to the optimal stopping problem (4.8) given below. First we state the problem, then we give a heuristic justification of our claim and finally we prove, in several steps, that our conjecture is indeed correct.

Let \((\widehat{X},\widehat{\pi })\) be a solution of the system, for \(t>0\),

$$\begin{aligned} d\widehat{X}_{t} &=(\mu _{0}+\hat{\mu }\widehat{\pi }_{t})dt+\sigma dW _{t}+dA_{t}, \qquad \widehat{X}_{0}=x, \end{aligned}$$
(4.5)
$$\begin{aligned} d\widehat{\pi }_{t} &=\theta \widehat{\pi }_{t}(1-\widehat{\pi }_{t}) \bigg(dW_{t}+\frac{2}{\sigma }dA_{t}\bigg), \qquad \widehat{\pi }_{0}=\pi , \end{aligned}$$
(4.6)

where \((A_{t})_{t\ge 0}\) is an increasing continuous process with \(A_{0}=0\) and such that \(P\)-a.s.,

(4.7)

Notably the process \((\widehat{X},\widehat{\pi })\) is a two-dimensional degenerate diffusion which is reflected at \(\{0\}\times (0,1)\) towards the interior of \(\mathcal{O}\), along the state-dependent unitary vector

$$\mathbf{v}(\pi ) := \bigg( \frac{1}{c(\pi )},\frac{\frac{2\theta }{ \sigma } \pi (1-\pi )}{c(\pi )}\bigg)\qquad \text{with $c(\pi ):=\sqrt{1+ \bigg(\frac{2\theta }{\sigma }\bigg)^{2}\pi ^{2}(1-\pi )^{2}}$}. $$
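By construction, \(c(\pi )\) normalises the reflection direction; a quick numerical spot check (parameter values hypothetical) that \(\mathbf{v}(\pi )\) indeed has unit norm for all \(\pi \in [0,1]\):

```python
import numpy as np

theta, sigma = 1.3, 0.7            # hypothetical parameters
pi = np.linspace(0.0, 1.0, 101)

k = 2 * theta / sigma
c = np.sqrt(1 + k**2 * pi**2 * (1 - pi)**2)
v = np.stack([1 / c, k * pi * (1 - pi) / c])   # v(pi) as in the display above

norms = np.linalg.norm(v, axis=0)   # should all equal 1
```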

Although existence of this reflected process may be deduced by standard theory (see e.g. Bass [4, Sects. I.11 and I.12] for a general exposition and references), we do not dwell here on this issue. In fact, in the next section, the reflected SDE (4.5), (4.6) is reduced to an equivalent but simpler one (see (4.12), (4.13) below) for which a solution can be computed explicitly – hence implying that (4.5), (4.6) admits a solution as well.

For \((x,\pi )\in \mathcal{O}\), let us now consider the problem

$$\begin{aligned} U(x,\pi )=\sup _{\tau }E_{x,\pi }\bigg[\exp \bigg(\int _{0}^{\tau }\frac{2}{ \sigma ^{2}}(\mu _{0}+\hat{\mu }\widehat{\pi }_{t})dA_{t}-\rho \tau \bigg)\bigg], \end{aligned}$$
(4.8)

where the supremum is taken over all \(P_{x,\pi }\)-a.s. finite stopping times.

Associated with the above problem are the so-called continuation and stopping sets, denoted by \(\mathcal{C}\) and \(\mathcal{S}\), respectively. They are defined as

$$\begin{aligned} \mathcal{C} &:=\{(x,\pi )\in [0,\infty )\times (0,1) : U(x,\pi )>1\}, \end{aligned}$$
(4.9)
$$\begin{aligned} \mathcal{S} &:=\{(x,\pi )\in [0,\infty )\times (0,1) : U(x,\pi )=1\}, \end{aligned}$$
(4.10)

and it is immediate to observe that if \(U=V_{x}\), then \(\mathcal{C}= \mathcal{I}\) (recall (3.8)).

The heuristic that associates (4.8) to (4.2)–(4.4) goes as follows. Suppose that \(u\in C^{2}(\overline{\mathcal{O}})\) is a solution of (4.2)–(4.4) and that

$$t\mapsto e^{\int _{0}^{t}\frac{2}{\sigma ^{2}}(\mu _{0}+\hat{\mu } \widehat{\pi }_{s})dA_{s}-\rho t}u(\widehat{X}_{t},\widehat{\pi }_{t}) \quad \text{is a $P$-supermartingale}. $$

Then \((\mathcal{L}_{X,\pi }-\rho )u\le 0\) on \(\mathcal{O}\), and an application of Dynkin’s formula, combined with the use of (4.4) and \(u\ge 1\), gives

$$\begin{aligned} u(x,\pi ) & \ge E_{x,\pi }\bigg[\exp \bigg(\int _{0}^{\tau }\frac{2}{ \sigma ^{2}}(\mu _{0}+\hat{\mu }\widehat{\pi }_{t})dA_{t}-\rho \tau \bigg)u(\widehat{X}_{\tau },\widehat{\pi }_{\tau })\bigg] \\ & \ge E_{x,\pi }\bigg[\exp \bigg(\int _{0}^{\tau } \frac{2}{\sigma ^{2}}(\mu _{0}+\hat{\mu }\widehat{\pi }_{t})dA_{t}- \rho \tau \bigg)\bigg] \end{aligned}$$

for any stopping time \(\tau \). Then \(u\ge U\). Moreover, by (4.2) and (4.3), the two inequalities above become equalities if we choose \(\tau \) as the first exit time from \(\mathcal{I}\), and this concludes the heuristic.

The rest of this section is devoted to the analysis of problem (4.8) in order to show that indeed \(U=V_{x}\) and that \(U\) solves (4.2)–(4.4).

4.2 A Girsanov transformation

It turns out that the problem may be more conveniently addressed under a different probability measure. As is customary in problems involving the process \((\pi _{t})\) (see e.g. Ekström and Lu [24], Klein [38] or Johnson and Peskir [34]), we introduce here the analogue for \(\widehat{\pi }_{t}\) of the so-called likelihood ratio process, namely

$$\begin{aligned} \widehat{\Phi }_{t}:=\frac{\widehat{\pi }_{t}}{1-\widehat{\pi }_{t}}, \qquad t\ge 0. \end{aligned}$$

By direct computation, it is not hard to derive the dynamics of \(\widehat{\Phi }\), for \(t>0\), in the form

$$\begin{aligned} \frac{d\widehat{\Phi }_{t}}{\widehat{\Phi }_{t}}=\theta \left (\frac{2}{ \sigma }dA_{t}+dW_{t}+\theta \widehat{\pi }_{t} dt\right ), \qquad \widehat{\Phi }_{0}=\varphi :=\frac{\pi }{1-\pi }. \end{aligned}$$
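For the reader's convenience, this is an application of Itô's formula to \(f(\pi ):=\pi /(1-\pi )\), with \(f'(\pi )=(1-\pi )^{-2}\) and \(f''(\pi )=2(1-\pi )^{-3}\): since \(A\) has finite variation, (4.6) gives \(d\langle \widehat{\pi }\rangle _{t}=\theta ^{2}\widehat{\pi }_{t}^{2}(1-\widehat{\pi }_{t})^{2}dt\) and therefore

$$\begin{aligned} d\widehat{\Phi }_{t} &=f'(\widehat{\pi }_{t})d\widehat{\pi }_{t}+\frac{1}{2}f''(\widehat{\pi }_{t})d\langle \widehat{\pi }\rangle _{t} \\ &=\frac{\theta \widehat{\pi }_{t}}{1-\widehat{\pi }_{t}}\bigg(dW_{t}+\frac{2}{\sigma }dA_{t}\bigg)+\frac{\theta ^{2}\widehat{\pi }_{t}^{2}}{1-\widehat{\pi }_{t}}dt =\theta \widehat{\Phi }_{t}\bigg(\frac{2}{\sigma }dA_{t}+dW_{t}+\theta \widehat{\pi }_{t}dt\bigg). \end{aligned}$$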

With the aim of turning \(W+\theta \int \widehat{\pi }_{s} ds\) into a Brownian motion, we follow the same steps as in [24] and introduce a new probability measure \(Q\) on \(\mathcal{F}_{T}\) by its Radon–Nikodým derivative

$$\begin{aligned} \eta _{T}:=\frac{dQ}{dP}\bigg|_{\mathcal{F}_{T}}=\exp \left (-\int _{0} ^{T}\theta \widehat{\pi }_{s}dW_{s}-\frac{1}{2}\int _{0}^{T}\theta ^{2} \widehat{\pi }^{2}_{s}ds\right ), \end{aligned}$$
(4.11)

for some finite \(T>0\). Under the new measure \(Q\), we have that

$$W^{Q}_{t}:=W_{t}+\theta \int _{0}^{t}\widehat{\pi }_{s} ds, \qquad t \in [0,T], $$

is a Brownian motion and the dynamics of \((\widehat{X}, \widehat{\Phi })\) for \(t\in [0,T]\) read

$$\begin{aligned} d\widehat{X}_{t} &=\mu _{0}dt+\sigma dW^{Q}_{t}+dA_{t}, \qquad \widehat{X}_{0}=x, \end{aligned}$$
(4.12)
$$\begin{aligned} d\widehat{\Phi }_{t} &=\theta \widehat{\Phi }_{t}\left (dW^{Q}_{t}+\frac{2}{ \sigma }dA_{t}\right ), \qquad \widehat{\Phi }_{0}=\varphi . \end{aligned}$$
(4.13)

One advantage of this formulation is that the process \(\widehat{X}\) is decoupled from the process \(\widehat{\Phi }\), and thanks to (4.7), we see that \(\widehat{X}\) is just a Brownian motion with drift \(\mu _{0}\) reflected at zero. In particular, this allows us to compute an explicit expression for \(A\). Indeed, \(Q_{x,\varphi }\)-a.s. on \([0,T]\), we have (see Karatzas and Shreve [36, Lemma 3.6.14])

$$\begin{aligned} A_{t}=x\vee \sup _{0\le s\le t}(-\mu _{0}s-\sigma W^{Q}_{s})-x. \end{aligned}$$
(4.14)

Moreover, we can express the dynamics for \(\widehat{\Phi }\) as

$$\begin{aligned} \widehat{\Phi }_{t}=\varphi \exp \left (\theta W^{Q}_{t}-\frac{\theta ^{2}}{2}t +\frac{2\theta }{\sigma } A_{t}\right ) \qquad \text{$Q_{x,\varphi }$-a.s.,} \end{aligned}$$
(4.15)

where the dependence on \(x\) is given explicitly by (4.14). Sometimes we also use the notation \((\widehat{X}^{x},A^{x},\widehat{\Phi }^{x,\varphi })\) to express the dependence of \((\widehat{X},A,\widehat{\Phi })\) on the initial point \((x,\varphi )\).
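As a side remark, the explicit formulas (4.14) and (4.15) lend themselves to a quick numerical sanity check. The following sketch (all parameter values are arbitrary choices for illustration and are not taken from the model) simulates a discretised driving Brownian path, builds \(A\) via (4.14), verifies the reflection properties of \(\widehat{X}\) at zero, and compares the closed form (4.15) for \(\widehat{\Phi }\) with an Euler scheme for (4.13).

```python
import numpy as np

# Illustrative numerical check (all parameter values are arbitrary choices):
#  (i)   X_t = x + mu0*t + sigma*W_t + A_t stays nonnegative,
#  (ii)  A increases only at times when X sits at zero,
#  (iii) the closed form (4.15) for Phi agrees with an Euler scheme for (4.13).
rng = np.random.default_rng(0)
mu0, sigma, theta = -0.5, 1.0, 0.8
x, phi0 = 0.3, 1.5
T, n = 5.0, 100_000
dt = T / n
t = np.linspace(0.0, T, n + 1)
dW = rng.normal(0.0, np.sqrt(dt), n)
W = np.concatenate(([0.0], np.cumsum(dW)))

# (4.14): A_t = x \vee sup_{s<=t}(-mu0*s - sigma*W_s) - x
A = np.maximum(x, np.maximum.accumulate(-mu0 * t - sigma * W)) - x
X = x + mu0 * t + sigma * W + A
assert X.min() > -1e-9                          # (i) reflection keeps X >= 0

dA = np.diff(A)
assert np.all(np.abs(X[1:][dA > 0]) < 1e-9)     # (ii) A moves only on {X = 0}

# (4.15): Phi_t = phi0 * exp(theta*W_t - theta^2*t/2 + (2*theta/sigma)*A_t)
Phi_closed = phi0 * np.exp(theta * W - 0.5 * theta**2 * t
                           + (2.0 * theta / sigma) * A)

# Euler scheme for (4.13): dPhi = theta*Phi*(dW + (2/sigma)*dA)
Phi = np.empty(n + 1)
Phi[0] = phi0
for k in range(n):
    Phi[k + 1] = Phi[k] * (1.0 + theta * (dW[k] + (2.0 / sigma) * dA[k]))
rel_err = np.max(np.abs(Phi - Phi_closed) / Phi_closed)
print("max relative error, Euler vs (4.15):", rel_err)
assert rel_err < 0.05
```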

In order to rewrite problem (4.8) in the new variables, we introduce the process

$$\begin{aligned} Z_{t}:=\frac{1+\widehat{\Phi }_{t}}{1+\varphi } \qquad \text{$P_{x,\varphi }$-a.s.} \end{aligned}$$

and notice that \(P_{x,\varphi }[Z_{0}=1]=1\) and that under measure \(P_{x,\varphi }\), we have

$$\begin{aligned} \frac{dZ_{t}}{Z_{t}}=\theta \widehat{\pi }_{t}\left (\frac{2}{\sigma }dA_{t}+dW_{t}+\theta \widehat{\pi }_{t}dt\right ), \qquad t>0. \end{aligned}$$

Recalling (4.11) and rewriting the above SDE in terms of an exponential gives

$$\begin{aligned} Z_{t}=\frac{1}{\eta _{t}}\exp \left (\int _{0}^{t}\frac{2\theta }{ \sigma }\widehat{\pi }_{s} dA_{s}\right ), \qquad t\in [0,T], \end{aligned}$$

with the same \(T>0\) as in (4.11) and \(\eta _{t}=E[\eta _{T}| \mathcal{F}_{t}]\).
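Explicitly, the SDE above identifies \(Z\) as a stochastic (Doléans–Dade) exponential, so that

$$\begin{aligned} Z_{t} &=\exp \bigg(\int _{0}^{t}\theta \widehat{\pi }_{s}dW_{s}-\frac{1}{2}\int _{0}^{t}\theta ^{2}\widehat{\pi }_{s}^{2}ds+\int _{0}^{t}\theta ^{2}\widehat{\pi }_{s}^{2}ds+\int _{0}^{t}\frac{2\theta }{\sigma }\widehat{\pi }_{s}dA_{s}\bigg) \\ &=\exp \bigg(\int _{0}^{t}\theta \widehat{\pi }_{s}dW_{s}+\frac{1}{2}\int _{0}^{t}\theta ^{2}\widehat{\pi }_{s}^{2}ds\bigg)\exp \bigg(\int _{0}^{t}\frac{2\theta }{\sigma }\widehat{\pi }_{s}dA_{s}\bigg), \end{aligned}$$

and the first factor is precisely \(\eta _{t}^{-1}\) by (4.11) and the martingale property of the stochastic exponential.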

Since the stopping times in (4.8) need not be bounded, changing the measure requires a bit of care. In particular, we proceed in two steps: we first change the measure for a fixed \(T\) and then pass to the limit as \(T\to \infty \). For any \(\tau \) and any \((x,\varphi )\), we get

$$\begin{aligned} &E_{x,\pi } \bigg[\exp \bigg(\int _{0}^{\tau \wedge T}\frac{2}{\sigma ^{2}}(\mu _{0}+\hat{\mu }\widehat{\pi }_{t})dA_{t}-\rho (\tau \wedge T) \bigg)\bigg] \\ &= E_{x,\pi }\bigg[\exp \bigg(\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau \wedge T}-\rho (\tau \wedge T) \bigg)\exp \bigg(\int _{0}^{\tau \wedge T}\frac{2\theta }{\sigma }\widehat{\pi }_{t} dA_{t}\bigg)\frac{ \eta _{\tau \wedge T}}{\eta _{\tau \wedge T}}\bigg] \\ &= E_{x,\pi }\bigg[\exp \bigg(\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau \wedge T}-\rho (\tau \wedge T) \bigg)Z_{\tau \wedge T}\eta _{\tau \wedge T}\bigg] \\ &=(1+\varphi )^{-1}E^{Q}_{x,\varphi }\bigg[\exp \left (\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau \wedge T}-\rho (\tau \wedge T) \right )(1+ \widehat{\Phi }_{\tau \wedge T})\bigg]. \end{aligned}$$
(4.16)

Defining for all \((x,\varphi )\in [0,\infty ) \times (0,\infty )\) the problems

$$\begin{aligned} U(x,\pi ; T) &:=\sup _{\tau }E_{x,\pi }\bigg[\exp \bigg(\int _{0}^{ \tau \wedge T}\frac{2}{\sigma ^{2}}(\mu _{0}+\hat{\mu }\widehat{\pi } _{t})dA_{t}-\rho (\tau \wedge T)\bigg)\bigg], \\ U^{Q}(x,\varphi ; T) &:=\sup _{\tau }E^{Q}_{x,\varphi }\left [\exp \left (\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau \wedge T}-\rho (\tau \wedge T) \right )(1+\widehat{\Phi }_{\tau \wedge T})\right ], \end{aligned}$$

we immediately see that (4.16) implies that

$$\begin{aligned} U^{Q}(x,\varphi ; T)=(1+\varphi )U\big(x,\varphi /(1+\varphi ); T \big). \end{aligned}$$
(4.17)

We would like to extend this equality to the case \(T=\infty \), and this requires a short digression since Girsanov's theorem does not directly apply.

Since we are interested in properties of the value functions, we can define a new probability space \((\overline{\Omega},\overline{\mathcal{F}},\overline{P})\) equipped with a Brownian motion \(\overline{W}\) and a filtration \((\overline{\mathcal{F}}_{t})_{t\ge 0}\), and let \((\overline{\widehat{X}},\overline{\widehat{\Phi }})\) be the unique strong solution of the SDE (4.12), (4.13) driven by \(\overline{W}\) (instead of \(W^{Q}\)) with a corresponding process \(\overline{A}\) as in (4.14). Notice that indeed the couple \((\overline{\widehat{X}},\overline{\widehat{\Phi }})\) has an explicit expression for all \(t\ge 0\). In this setting, we can define the stopping problems

$$\begin{aligned} \overline{U}(x,\varphi ;T) &:=\sup _{\tau }\overline{E}_{x,\varphi }\left [\exp \left (\frac{2\mu _{0}}{\sigma ^{2}}\overline{A}_{\tau \wedge T}-\rho (\tau \wedge T) \right )(1+\overline{\widehat{\Phi }}_{\tau \wedge T})\right ], \\ \overline{U}(x,\varphi ) &:=\sup _{\tau }\overline{E}_{x,\varphi }\left [\exp \left (\frac{2\mu _{0}}{\sigma ^{2}}\overline{A}_{\tau }-\rho \tau \right )(1+ \overline{\widehat{\Phi }}_{\tau })\right ], \end{aligned}$$

where \(\overline{E}\) is the expectation under \(\overline{P}\) and stopping times are with respect to \((\overline{\mathcal{F}}_{t})_{t\ge 0}\). Now, \(U^{Q}(x,\varphi ;T)=\overline{U}(x,\varphi ;T)\) due to the equivalence in law of the process \((\widehat{X},\widehat{\Phi },A,W^{Q})\) under \(Q\), and \((\overline{\widehat{X}},\overline{\widehat{\Phi }},\overline{A},\overline{W})\) under \(\overline{P}\), on \([0,T]\). Further, if we show that

$$\begin{aligned} \lim _{T\to \infty }\overline{U}(x,\varphi ;T)=\overline{U}(x,\varphi ) \quad \text{and}\quad \lim _{T\to \infty }U(x,\pi ;T)=U(x,\pi ), \end{aligned}$$
(4.18)

then combining these facts with (4.17), we obtain

$$\begin{aligned} \overline{U}(x,\varphi ) &=\lim _{T\to \infty }\overline{U}(x,\varphi ;T)= \lim _{T\to \infty }U^{Q}(x,\varphi ;T) \\ &=(1 + \varphi )\lim _{T\to \infty }U\big(x,\varphi /(1 + \varphi ); T \big) = (1 + \varphi )U\big(x,\varphi /(1 + \varphi )\big). \end{aligned}$$
(4.19)

For the proof of (4.18), notice that \(U(x,\pi ;T)\le U(x,\pi )\) since \(\tau \wedge T\) is admissible for \(U(x,\pi )\), so that

$$\limsup _{T\to \infty }U(x,\pi ;T)\le U(x,\pi ). $$

For the converse inequality, given any \(\tau \), Fatou’s lemma and continuity of paths give

$$\begin{aligned} &E_{x,\pi }\bigg[\exp \bigg(\int _{0}^{\tau }\frac{2}{\sigma ^{2}}(\mu _{0}+\hat{\mu }\widehat{\pi }_{t})dA_{t}-\rho \tau \bigg)\bigg] \\ &\le \liminf _{T\to \infty }E_{x,\pi }\bigg[\exp \bigg(\int _{0}^{ \tau \wedge T}\frac{2}{\sigma ^{2}}(\mu _{0}+\hat{\mu }\widehat{\pi } _{t})dA_{t}-\rho (\tau \wedge T)\bigg)\bigg] \\ &\le \liminf _{T\to \infty } U(x,\pi ;T). \end{aligned}$$

Hence (4.18) holds for \(U\), and by the same arguments also for \(\overline{U}\).

Finally, with a slight abuse of notation we relabel

$$(\overline{\widehat{X}},\overline{\widehat{\Phi }},\overline{A},\overline{W})=(\widehat{X}, \widehat{\Phi },A,\overline{W}), \qquad \! \big(\overline{\Omega},\overline{\mathcal{F}},( \overline{\mathcal{F}}_{t})_{t\ge 0},\overline{P}\big)=\big(\Omega ,\mathcal{F},( \mathcal{F}_{t})_{t\ge 0},\overline{P}\big), $$

so that

$$\begin{aligned} &\overline{U}(x,\varphi )=\sup _{\tau }\overline{E}_{x,\varphi }\left [\exp \left (\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau }-\rho \tau \right )(1+ \widehat{\Phi }_{\tau })\right ]. \end{aligned}$$
(4.20)

Problem (4.20) is somewhat easier to analyse than the original (4.8) because the dynamics (4.12), (4.13) for \((\widehat{X},\widehat{\Phi })\), driven by \(\overline{W}\) under \(\overline{P}_{x,\varphi }\), are more explicit than those of \((\widehat{X},\widehat{\pi })\), driven by \(W\) under \(P_{x,\pi }\); see (4.5), (4.6).

It is clear from (4.19) that \(\mathcal{C}\) and \(\mathcal{S}\) in (4.9), (4.10) now read

$$\begin{aligned} &\mathcal{C}=\{(x,\varphi )\in [0,\infty )\times (0,\infty ) : \overline{U}(x,\varphi )>1+\varphi \}, \\ &\mathcal{S}=\{(x,\varphi )\in [0,\infty )\times (0,\infty ) : \overline{U}(x,\varphi )=1+\varphi \}. \end{aligned}$$

Remark 4.1

The choice \(\varphi =0\) corresponds to full information on the drift of (2.1) (i.e., \(\mu =\mu _{0}\)), in which case there is no dynamics for \(\widehat{\Phi }\). Since problem (2.3) has a well-known explicit solution in that setting (see [32]), and given that for all \(t\ge 0\) and any \((x,\varphi )\in [0,\infty )\times (0,\infty )\), we have \(P_{x,\varphi }[\widehat{\Phi }_{t}>0]=1\), we do not include \([0,\infty )\times \{0\}\) in our state space.

4.3 Well-posedness and initial properties of the stopping problem

At this point, we start looking at elementary properties of problem (4.20) which guarantee its well-posedness. Recall the following known fact (see [36, Sect. 3.5.C]): for \(\beta >0\) and \(S^{\beta ,\sigma }_{t}:=\sup _{0\le s\le t}(-\beta s-\sigma \overline{W}_{s})\), we have

$$\begin{aligned} \overline{P}[S^{\beta ,\sigma }_{\infty }>x]=\exp \left (-\frac{2 \beta }{\sigma ^{2}}x\right ) \qquad \text{for $x>0$}. \end{aligned}$$
(4.21)
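As an illustration of (4.21) (with arbitrarily chosen parameters \(\beta =\sigma =x=1\), so that the right-hand side equals \(e^{-2}\approx 0.1353\)), one can approximate \(S^{\beta ,\sigma }_{\infty }\) by the running supremum up to a large horizon: since the drift is strictly negative, later excursions above a fixed level are exponentially unlikely.

```python
import numpy as np

# Monte Carlo illustration of (4.21); all parameter values are arbitrary
# choices for this sketch. With beta = sigma = x = 1 the identity predicts
# P[S_infinity > x] = exp(-2) ~ 0.1353. The discrete-time maximum slightly
# underestimates the continuous one, hence the generous tolerance below.
rng = np.random.default_rng(1)
beta, sigma, x = 1.0, 1.0, 1.0
T, n, paths = 20.0, 4000, 20_000
dt = T / n
pos = np.zeros(paths)   # current value of -beta*t - sigma*W_t
S = np.zeros(paths)     # running supremum (S_0 = 0)
for _ in range(n):
    pos += -beta * dt + sigma * np.sqrt(dt) * rng.normal(size=paths)
    np.maximum(S, pos, out=S)
est = (S > x).mean()
exact = np.exp(-2.0 * beta * x / sigma**2)
print(f"Monte Carlo: {est:.4f}   exact: {exact:.4f}")
assert abs(est - exact) < 0.03
```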

For \(\alpha >0\), setting \(\beta =\alpha +\frac{\sigma ^{2}\rho }{2 \alpha }\), the use of (4.21) and

$$\frac{2\alpha }{\sigma ^{2}}\sup _{0\le s\le t}(-\alpha s- \sigma \overline{W}_{s})-\rho t\le \frac{2\alpha }{\sigma ^{2}}\sup _{0\le s\le t}(- \beta s-\sigma \overline{W}_{s}) $$

gives the following bound: for any stopping time \(\tau \),

$$\begin{aligned} \overline{E}\Big[e^{\frac{2\alpha }{\sigma ^{2}}S^{\alpha ,\sigma }_{ \tau }-\rho \tau }\Big] &\le \overline{E}\Big[e^{\frac{2\alpha }{\sigma ^{2}}S^{\beta ,\sigma }_{\tau }}\Big]\le \overline{E}\Big[e^{\frac{2 \alpha }{\sigma ^{2}}S^{\beta ,\sigma }_{\infty }}\Big] \\ &= \frac{2\beta }{\sigma ^{2}}\int _{0}^{\infty }e^{\frac{2\alpha }{ \sigma ^{2}}x}e^{-\frac{2\beta }{\sigma ^{2}}x}dx=\frac{2\beta }{\sigma ^{2}}\int _{0}^{\infty }e^{-\frac{\rho }{\alpha }x}dx< \infty . \end{aligned}$$
(4.22)

Many standard results in optimal stopping theory rely on the assumption that

$$\begin{aligned} \overline{E}_{x,\varphi }\Big[\sup _{t\ge 0}\big(e^{\frac{2\mu _{0}}{ \sigma ^{2}}A_{t}-\rho t}(1+\widehat{\Phi }_{t})\big)\Big]< \infty . \end{aligned}$$
(4.23)

In particular, (4.23) would normally be used to show that

$$\begin{aligned} \tau _{*}:=\inf \{t\ge 0 : (\widehat{X}_{t},\widehat{\Phi }_{t})\notin \mathcal{C}\} \end{aligned}$$
(4.24)

is the minimal optimal stopping time in (4.20), provided that \(\overline{P}_{x,\varphi }[\tau _{*}<\infty ]=1\); otherwise, it is the minimal optimal Markov time (see [44, Chap. 3.3, Theorem 3]). Notice also that for problem (4.8), we rewrite (4.24) in terms of \((\widehat{X},\widehat{\pi })\). Moreover, (4.23) would also guarantee the (super)martingale property of the discounted value process: the process \((N_{t})_{t\ge 0}\) defined as

$$N_{t}:=e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}-\rho t} \overline{U}( \widehat{X}_{t},\widehat{\Phi }_{t}) $$

has the properties that

$$\begin{aligned} & \text{$(N_{t})_{t\ge 0}$ is a right-continuous $\overline{P}$-supermartingale}, \end{aligned}$$
(4.25)
$$\begin{aligned} &\text{$(N_{t\wedge \tau _{*}})_{t\ge 0}$ is a right-continuous $\overline{P}$-martingale}. \end{aligned}$$
(4.26)

Assumption (4.23) may be fulfilled in our setting by choosing \(\rho \) sufficiently large in comparison to the coefficients \((\mu _{0},\mu _{1},\sigma )\). In fact, we notice that the process

$$e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}-\rho t}\widehat{\Phi }_{t}=\exp \left (\frac{2\mu _{1}}{\sigma ^{2}}A_{t}-\rho t+\theta \overline{W}_{t}-\frac{ \theta ^{2}}{2}t\right ) $$

is not uniformly integrable in general. As it turns out, by following a slightly different approach, we can still achieve (4.24)–(4.26) with no restriction on \(\rho \) other than \(\rho >0\).
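Incidentally, the equality in the last display (written for \(\varphi =1\); a general \(\varphi >0\) only contributes a constant factor) follows directly from (4.15) together with \(\sigma \theta =\hat{\mu }=\mu _{1}-\mu _{0}\), since

$$e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}-\rho t}\widehat{\Phi }_{t}=\exp \bigg(\Big(\frac{2\mu _{0}}{\sigma ^{2}}+\frac{2\theta }{\sigma }\Big)A_{t}-\rho t+\theta \overline{W}_{t}-\frac{\theta ^{2}}{2}t\bigg) \qquad \text{and}\qquad \frac{2\mu _{0}}{\sigma ^{2}}+\frac{2\theta }{\sigma }=\frac{2\mu _{1}}{\sigma ^{2}}. $$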

For \(n\ge 1\), let us denote \(\zeta _{n}:=\inf \{t\ge 0 : \widehat{\Phi }_{t}\ge n\}\) and consider the sequence of problems with value function

$$\begin{aligned} \overline{U}^{n}(x,\varphi ):=\sup _{\zeta \le \zeta _{n}}\overline{E}_{x, \varphi }\left [\exp \left (\frac{2\mu _{0}}{\sigma ^{2}}A_{\zeta }- \rho \zeta \right )(1+\widehat{\Phi }_{\zeta })\right ]. \end{aligned}$$
(4.27)

It is clear that these truncated problems fulfil condition (4.23) since the process \((\widehat{X},\widehat{\Phi })\) is stopped at \(\zeta _{n}\). Hence

$$\begin{aligned} \zeta ^{n}_{*}:=\inf \{t\ge 0 : \overline{U}^{n}(\widehat{X}_{t}, \widehat{\Phi }_{t})=1+\widehat{\Phi }_{t}\}\wedge \zeta _{n} \end{aligned}$$
(4.28)

is an optimal stopping time for (4.27). Moreover, the process \((N^{n}_{t})_{t\ge 0}\) defined as

$$\begin{aligned} N^{n}_{t}:=e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}-\rho t} \overline{U}^{n}( \widehat{X}_{t},\widehat{\Phi }_{t}) \end{aligned}$$
(4.29)

satisfies the analogue of conditions (4.25), (4.26), and we obtain the next useful results.

Proposition 4.2

The sequence \((\overline{U}^{n})_{n\ge 1}\) is increasing in \(n\) with

$$\begin{aligned} \lim _{n\to \infty }\overline{U}^{n}(x,\varphi )=\overline{U}(x,\varphi ), \end{aligned}$$
(4.30)

for all \((x,\varphi )\in [0,\infty )\times (0,\infty )\). Moreover, there exists a universal constant \(c_{1}>0\) such that

$$\begin{aligned} 0\le \overline{U}^{n}(x,\varphi )\le \overline{U}(x,\varphi )\le 1 + c _{1}\varphi , \end{aligned}$$
(4.31)

for all \((x,\varphi )\in [0,\infty )\times (0,\infty )\).

Proof

Clearly, \(\overline{U}^{n}\le \overline{U}\) for all \(n\), and the sequence is increasing because the set of admissible stopping times is increasing. For any \(\overline{P}_{x,\varphi }\)-a.s. finite stopping time \(\tau \), Fatou’s lemma gives

$$\begin{aligned} &\overline{E}_{x,\varphi } \left [\exp \left (\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau }-\rho \tau \right )(1+\widehat{\Phi }_{\tau })\right ] \\ &\le \liminf _{n\to \infty }\overline{E}_{x,\varphi }\left [\exp \left (\frac{2 \mu _{0}}{\sigma ^{2}}A_{\tau \wedge \zeta _{n}}-\rho (\tau \wedge \zeta _{n}) \right )(1+\widehat{\Phi }_{\tau \wedge \zeta _{n}})\right ] \\ &\le \liminf _{n\to \infty }\overline{U}^{n}(x,\varphi ). \end{aligned}$$

The latter implies \(\overline{U}(x,\varphi )\le \liminf _{n\to \infty } \overline{U}^{n}(x,\varphi )\) and therefore (4.30).

Let us now analyse (4.31). For any stopping time \(\tau \), using (4.15) gives

$$\begin{aligned} &\overline{E}_{x,\varphi } \left [\exp \left (\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau \wedge \zeta _{n}} - \rho (\tau \wedge \zeta _{n}) \right ) (1 + \widehat{\Phi }_{\tau \wedge \zeta _{n}})\right ] \\ &=\overline{E}_{x,\varphi } \Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{ \tau \wedge \zeta _{n}}-\rho (\tau \wedge \zeta _{n})}\Big] + \varphi \overline{E}_{x,\varphi } \Big[e^{\frac{2\mu _{1}}{\sigma ^{2}}A_{\tau \wedge \zeta _{n}}-\rho (\tau \wedge \zeta _{n})}e^{\theta \overline{W} _{\tau \wedge \zeta _{n}}-\frac{\theta ^{2}}{2}(\tau \wedge \zeta _{n})} \Big], \end{aligned}$$
(4.32)

and we can study the two terms separately. For the first one, given that \(\mu _{0}< 0\), the expectation is trivially bounded above by one.

For the second term in (4.32), we first change measure using

$$\begin{aligned} \frac{dP^{\theta }}{d\overline{P}}\bigg|_{\mathcal{F}_{t\wedge \zeta _{n}}}=e^{\theta \overline{W}_{t\wedge \zeta _{n}}-\frac{\theta ^{2}}{2}(t \wedge \zeta _{n})} \qquad \text{for $t\in [0,\infty )$,} \end{aligned}$$
(4.33)

and then notice that \(W^{\theta }_{t\wedge \zeta _{n}}=\overline{W}_{t \wedge \zeta _{n}}-\theta (t\wedge \zeta _{n})\) for \(t\in [0,\infty )\) is a (stopped) Brownian motion under \(P^{\theta }\) since the Radon–Nikodým derivative is a bounded martingale. This gives

$$\overline{E}_{x,\varphi }\Big[e^{\frac{2\mu _{1}}{\sigma ^{2}}A_{\tau \wedge \zeta _{n}}-\rho (\tau \wedge \zeta _{n})}e^{\theta \overline{W} _{\tau \wedge \zeta _{n}}-\frac{\theta ^{2}}{2}(\tau \wedge \zeta _{n})} \Big]=E^{\theta }_{x,\varphi }\Big[e^{\frac{2\mu _{1}}{\sigma ^{2}}A _{\tau \wedge \zeta _{n}}-\rho (\tau \wedge \zeta _{n})}\Big]\le c_{1}, $$

where the final inequality uses (4.22) with \(\alpha =\mu _{1}\) and

$$\sup _{0\le s\le t}(-\mu _{0} s-\sigma \overline{W}_{s})=\sup _{0\le s \le t}(-\mu _{1} s-\sigma W^{\theta }_{s}). $$

Indeed, the last identity holds because \(W^{\theta }_{s}=\overline{W}_{s}-\theta s\) and \(\sigma \theta =\mu _{1}-\mu _{0}\). Notice that \(c_{1}>0\) depends only on \((\mu _{0},\mu _{1},\sigma ,\rho )\). In summary, \(\overline{U}^{n}\) fulfils (4.31) for all \(n\ge 1\), and then (4.30) implies that the bound holds for \(\overline{U}\) as well. □

It is also useful to state a continuity result for \(\overline{U}^{n}\).

Proposition 4.3

For any \(n\ge 1\), we have \(\overline{U}^{n}\in C([0,\infty )\times (0,\infty ))\). Moreover, there exists a universal constant \(c>0\) such that for any couple of points \((x_{1},\varphi _{1})\) and \((x_{2},\varphi _{2})\) in \([0,\infty )\times (0,\infty )\) with \(\varphi _{2}>\varphi _{1}\), we have

$$\begin{aligned} \big|\overline{U}^{n}(x_{1},\varphi _{1})-\overline{U}^{n}(x_{2},\varphi _{2})\big|\le c\big((1+\varphi _{2})|x_{1}-x_{2}|+(\varphi _{2}-\varphi _{1})\big). \end{aligned}$$
(4.34)

Finally, \(\varphi \mapsto \overline{U}^{n}(x,\varphi )\) is increasing for all \(x\in [0,\infty )\).

Proof

Take \(x_{1}< x_{2}\) and \(\varphi \in (0,\infty )\). Let \(\zeta _{1}=\zeta ^{n}_{*}(x_{1},\varphi )\) be optimal for \(\overline{U}^{n}(x_{1},\varphi )\); then by direct comparison,

$$\begin{aligned} &\overline{U}^{n} (x_{1},\varphi )-\overline{U}^{n}(x_{2},\varphi ) \\ &\le \overline{E}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x_{1}}_{\zeta _{1}}-\rho \zeta _{1}}-e^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x_{2}}_{\zeta _{1}}-\rho \zeta _{1}}\Big] +\varphi E^{\theta }\Big[e^{\frac{2\mu _{1}}{ \sigma ^{2}}A^{x_{1}}_{\zeta _{1}}-\rho \zeta _{1}}-e^{\frac{2\mu _{1}}{ \sigma ^{2}}A^{x_{2}}_{\zeta _{1}}-\rho \zeta _{1}}\Big], \end{aligned}$$

where as in (4.33), we have used \(dP^{\theta }=e^{\theta \overline{W}_{t\wedge \zeta _{n}}-\frac{\theta ^{2}}{2}(t\wedge \zeta _{n})} d\overline{P}\) to change measure. Next we use that \(0\le A^{x_{1}}- A ^{x_{2}}\le x_{2}-x_{1}\) and (4.22) to conclude that

$$\begin{aligned} E^{\theta }\Big[e^{\frac{2\mu _{1}}{\sigma ^{2}}A^{x_{1}}_{\zeta _{1}}- \rho \zeta _{1}}-e^{\frac{2\mu _{1}}{\sigma ^{2}}A^{x_{2}}_{\zeta _{1}}- \rho \zeta _{1}}\Big] &\le (x_{2}-x_{1})E^{\theta }\Big[e^{\frac{2\mu _{1}}{\sigma ^{2}}A^{x_{1}}_{\zeta _{1}}-\rho \zeta _{1}}\Big]\le c_{1}(x _{2}-x_{1}), \\ \overline{E}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x_{1}}_{\zeta _{1}}- \rho \zeta _{1}}-e^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x_{2}}_{\zeta _{1}}- \rho \zeta _{1}}\Big] &\le \bigg|\frac{2\mu _{0}}{\sigma ^{2}}\bigg|(x _{2}-x_{1}). \end{aligned}$$

Therefore we have \(\overline{U}^{n}(x_{1},\varphi )-\overline{U}^{n}(x_{2},\varphi )\le c(1+\varphi )(x_{2}-x_{1})\) with the constant \(c:=c_{1}\vee |2\mu _{0}/\sigma ^{2}|\). A symmetric argument proves the converse inequality.

Let us now fix \(x\in [0,\infty )\) and \(\varphi _{1}<\varphi _{2}\) in \((0,\infty )\). Denote

$$\zeta ^{\varphi _{i}}_{n}=\inf \{t\ge 0 : \widehat{\Phi }^{x,\varphi _{i}}\ge n\}\qquad \text{for $i=1,2$} $$

and let \(\zeta _{i}=\zeta _{*}^{n}(x,\varphi _{i})\) be optimal for \(\overline{U}^{n}(x,\varphi _{i})\). Since \(\zeta _{2}\le \zeta _{n}^{ \varphi _{2}}\le \zeta _{n}^{\varphi _{1}}\), we have that \(\zeta _{2}\) is admissible for \(\overline{U}^{n}(x,\varphi _{1})\). Then using the same arguments as above, we get

$$\begin{aligned} \overline{U}^{n}(x,\varphi _{2})-\overline{U}^{n}(x,\varphi _{1})\le ( \varphi _{2}-\varphi _{1})E^{\theta }\Big[e^{ \frac{2\mu _{1}}{\sigma ^{2}}A^{x}_{\zeta _{2}}-\rho \zeta _{2}}\Big] \le c (\varphi _{2}-\varphi _{1}). \end{aligned}$$

For the converse inequality, we notice that given any stopping time \(\zeta \), the stopping time \(\zeta \wedge \zeta ^{\varphi _{2}}_{n}\) is admissible for \(\overline{U}^{n}(x,\varphi _{2})\). Using that \(\widehat{\Phi }^{x,\varphi _{1}}_{\zeta }\le n\) for \(\zeta \le \zeta _{n}^{\varphi _{1}}\) and \(\widehat{\Phi }^{x,\varphi _{2}}_{ \zeta ^{\varphi _{2}}_{n}}=n\), we get

$$\begin{aligned} \overline{E}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x}_{\zeta }-\rho \zeta }(1+\widehat{\Phi }^{x,\varphi _{1}}_{\zeta })\Big] &\le \overline{E}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x}_{\zeta }-\rho \zeta }(1+\widehat{\Phi }^{x,\varphi _{2}}_{\zeta \wedge \zeta ^{\varphi _{2}}_{n}})\Big] \\ &\le \overline{E}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x}_{\zeta \wedge \zeta ^{\varphi _{2}}_{n}}-\rho (\zeta \wedge \zeta ^{\varphi _{2}}_{n})}(1+\widehat{\Phi }^{x,\varphi _{2}}_{\zeta \wedge \zeta ^{\varphi _{2}}_{n}})\Big], \end{aligned}$$
(4.35)

where the last inequality also uses that \(t\mapsto \frac{2\mu _{0}}{ \sigma ^{2}}A^{x}_{t}-\rho t\) is decreasing. The above estimates imply (4.34), while (4.35) implies monotonicity in \(\varphi \). □

We can now state some properties of \(\overline{U}\).

Proposition 4.4

The value function\(\overline{U}\)of (4.20) has the following properties:

  1. (i)

    \(\overline{U}\in C([0,\infty )\times (0,\infty ))\) and there exists a universal constant \(c>0\) such that

    $$\begin{aligned} \big|\overline{U}(x_{1},\varphi _{1})-\overline{U}(x_{2},\varphi _{2}) \big|\le c\big((1+\varphi _{2})|x_{2}-x_{1}|+(\varphi _{2}-\varphi _{1}) \big) \end{aligned}$$
    (4.36)

    for all \(x_{1},x_{2}\in [0,\infty )\) and \(0<\varphi _{1}<\varphi _{2}\).

  2. (ii)

    \(\varphi \mapsto \overline{U}(x,\varphi )\) is convex and increasing for any \(x\in [0,\infty )\).

  3. (iii)

    We have \(\lim _{\varphi \to 0}\overline{U}(x,\varphi )=1\).

  4. (iv)

    We have the transversality condition

    $$\begin{aligned} \lim _{t\to \infty }\overline{E}_{x,\varphi }\Big[e^{\frac{2\mu _{0}}{ \sigma ^{2}}A_{t}-\rho t}\overline{U}(\widehat{X}_{t},\widehat{\Phi } _{t})\Big]=0, \end{aligned}$$
    (4.37)

    for all \((x,\varphi )\in [0,\infty )\times (0,\infty )\).

Proof

In order to prove (i), it is enough to take \(n\to \infty \) in (4.34) and use (4.30). Let us now show (ii). Thanks to (4.15), we know that the map

$$\begin{aligned} \varphi \mapsto \exp \left (\frac{2\mu _{0}}{\sigma ^{2}}A^{x}_{\tau }- \rho \tau \right )(1+\widehat{\Phi }^{x,\varphi }_{\tau }) \end{aligned}$$
(4.38)

is \(\overline{P}\)-a.s. linear for any stopping time \(\tau \). As \(\sup (f+g)\le \sup (f)+\sup (g)\), we easily obtain

$$\begin{aligned} \overline{U}\big(x,\alpha \varphi _{1}+(1-\alpha )\varphi _{2}\big) \le \alpha \overline{U}(x,\varphi _{1})+(1-\alpha )\overline{U}(x,\varphi _{2}) \end{aligned}$$

for \(\alpha \in (0,1)\), \(\varphi _{1},\varphi _{2}\in (0,\infty )\) and each given \(x\in [0,\infty )\). Since the map (4.38) is increasing, it also follows that \(\varphi \mapsto \overline{U}(x, \varphi )\) is increasing as claimed. (The latter could also have been deduced by monotonicity of \(\varphi \mapsto \overline{U}^{n}(x,\varphi )\).)

Next, we observe that (iii) follows immediately by (4.31) upon noticing also that \(\overline{U}(x,\varphi )\ge 1+\varphi \). It only remains to prove (iv). From (4.31) and using (4.15) and

$$dP^{\theta }=e^{\theta \overline{W}_{t}-\frac{\theta ^{2}}{2}t}d\overline{P}, $$

we have

$$\begin{aligned} \overline{E}_{x,\varphi }\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}- \rho t}\overline{U}(\widehat{X}_{t},\widehat{\Phi }_{t})\Big] & \le \overline{E}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x}_{t}-\rho t}\Big]+c _{1}\varphi E^{\theta }\Big[e^{\frac{2\mu _{1}}{\sigma ^{2}}A^{x}_{t}- \rho t}\Big] \\ & \le e^{-\frac{\rho }{2}t}+c_{1}\varphi e^{-\frac{\rho }{2}t} E^{ \theta }\Big[e^{\frac{2\mu _{1}}{\sigma ^{2}}\sup _{0\le s\le t}(-\mu _{1}s-\sigma W^{\theta }_{s})-\frac{\rho }{2} t}\Big] , \end{aligned}$$

where we recall that \(W^{\theta }\) is a \(P^{\theta }\)-Brownian motion. Using now (4.22), we can find a universal constant \(c'_{1}>0\) such that

$$\begin{aligned} \overline{E}_{x,\varphi }\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}- \rho t}\overline{U}(\widehat{X}_{t},\widehat{\Phi }_{t})\Big]\le e^{-\frac{ \rho }{2}t}(1+c'_{1}\varphi ). \end{aligned}$$

Then (4.37) follows by taking \(t\to \infty \). □

There are several conclusions that one can draw from Proposition 4.4. First we notice that \((\overline{U}-\overline{U}^{n})_{n \ge 1}\) is a decreasing sequence of continuous functions that converges to zero; therefore Dini’s theorem implies that

$$\begin{aligned} \lim _{n\to \infty }\sup _{(x,\varphi )\in K}\big|\overline{U}^{n}(x, \varphi )-\overline{U}(x,\varphi )\big|=0 \end{aligned}$$
(4.39)

for any compact \(K\subseteq [0,\infty )\times (0,\infty )\). Now we can use this fact and an argument inspired by Chiarolla and De Angelis [12, Lemma 4.17] and [11, Lemma 6.2] to prove the next lemma.

Lemma 4.5

The sequence \((\zeta _{*}^{n})_{n\ge 1}\) defined in (4.28) is increasing in \(n\), and for all \((x,\varphi )\in [0,\infty )\times (0,\infty )\), we have, with \(\tau _{*}\) as in (4.24), that

$$\begin{aligned} \overline{P}_{x,\varphi }\left [\lim _{n\to \infty }\zeta _{*}^{n}=\tau _{*}\right ]=1. \end{aligned}$$
(4.40)

Proof

Since \((\overline{U}^{n})\) is increasing in \(n\), it is clear that the sequence \((\zeta _{*}^{n})_{n\ge 1}\) is also increasing and \(\zeta _{*} ^{n}\le \tau _{*}\) for all \(n\ge 1\), \(\overline{P}_{x,\varphi }\)-a.s. For \((x,\varphi )\in \mathcal{S}\), it is clear that (4.40) holds. For fixed \((x_{0},\varphi _{0})\in \mathcal{C}\), we argue by contradiction and assume that

$$\begin{aligned} \overline{P}_{x_{0},\varphi _{0}}\left [\lim _{n\to \infty }\zeta _{*} ^{n}< \tau _{*}\right ]>0. \end{aligned}$$

Letting \(\Omega _{0}:=\{\omega : \lim _{n\to \infty }\zeta _{*}^{n}<\tau _{*}\}\), we pick an arbitrary \(\omega \in \Omega _{0}\). Then there is \(\delta _{\omega }>0\) such that \(\tau _{*}(\omega )>\delta _{\omega }\). This implies that there also exists \(c_{\omega }>0\) such that

$$\begin{aligned} \inf _{t\in [0,\delta _{\omega }]}\left (\overline{U}(\widehat{X}_{t}, \widehat{\Phi }_{t})-(1+\widehat{\Phi }_{t})\right )(\omega )> c_{ \omega }, \end{aligned}$$
(4.41)

thanks to (i) in Proposition 4.4 and because the process \(t\mapsto (\widehat{X}_{t},\widehat{\Phi }_{t})\) is continuous up to a null subset of \(\Omega \). Then the image of \((\widehat{X}_{t},\widehat{\Phi }_{t})(\omega )\) for \(t\in [0,\delta _{\omega }]\) is a compact set, which we denote by \(K_{\omega ,\delta }\), and (4.41) is equivalent to

$$\begin{aligned} \inf _{(x,\varphi )\in K_{\omega ,\delta }}\big( \overline{U}(x,\varphi )-(1+\varphi )\big)> c_{\omega }. \end{aligned}$$
(4.42)

Thanks to (4.39), we can find \(N_{\omega ,\delta }\ge 1\) such that (4.42) holds with \(\overline{U}^{n}\) instead of \(\overline{U}\), for all \(n\ge N_{\omega ,\delta }\). This implies \(\lim _{n\to \infty }\zeta _{*}^{n}(\omega )\ge \delta _{\omega }\). As \(\delta _{\omega }<\tau _{*}( \omega )\) is arbitrary, we get

$$\lim _{n\to \infty }\zeta _{*}^{n}(\omega )\ge \tau _{*}(\omega ) $$

and hence a contradiction with the definition of \(\Omega _{0}\). □

The above lemma implies optimality of \(\tau _{*}\) as explained in the next proposition.

Proposition 4.6

The stopping time \(\tau _{*}\) in (4.24) is optimal for problem (4.20) in the sense that for all \((x,\varphi )\in [0,\infty )\times (0,\infty )\), we have

$$\begin{aligned} \overline{U}(x,\varphi )=\overline{E}_{x,\varphi }\left [\mathbf{1}_{\{\tau _{*}< \infty \}}\exp \left (\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau _{*}}-\rho \tau _{*}\right )(1+\widehat{\Phi }_{\tau _{*}})\right ]. \end{aligned}$$
(4.43)

Moreover, the (super)martingale properties (4.25), (4.26) hold.

Proof

We start by showing (4.25) and (4.26). Recall the process \((N^{n}_{t})_{t\ge 0}\) defined in (4.29) and notice that (4.25) and (4.26) hold for this process. Then for any \(s\ge t\), we have \(\overline{P}_{x,\varphi }\)-a.s. that

$$\begin{aligned} &e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t\wedge \zeta _{n}}-\rho (t\wedge \zeta _{n})} \overline{U}^{n}\big(\widehat{X}_{t\wedge \zeta _{n}}, \widehat{\Phi }_{t\wedge \zeta _{n}}\big) \\ &\ge \overline{E}_{x,\varphi }\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A _{s\wedge \zeta _{n}}-\rho (s\wedge \zeta _{n})}\! \overline{U}^{n}\big( \widehat{X}_{s\wedge \zeta _{n}},\widehat{\Phi }_{s\wedge \zeta _{n}} \big)\Big|\mathcal{F}_{t}\Big]. \end{aligned}$$

Letting \(n\to \infty \), dominated convergence and (4.30) imply that (4.25) holds. Similarly we have \(\overline{P}_{x,\varphi }\)-a.s. that

$$\begin{aligned} &e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t\wedge \zeta ^{n}_{*}}-\rho (t \wedge \zeta ^{n}_{*})} \overline{U}^{n}\big(\widehat{X}_{t\wedge \zeta ^{n}_{*}},\widehat{\Phi }_{t\wedge \zeta ^{n}_{*}}\big) \\ & =\overline{E}_{x,\varphi }\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{s \wedge \zeta ^{n}_{*}}-\rho (s\wedge \zeta ^{n}_{*})} \overline{U}^{n} \big(\widehat{X}_{s\wedge \zeta ^{n}_{*}},\widehat{\Phi }_{s\wedge \zeta ^{n}_{*}}\big)\Big|\mathcal{F}_{t}\Big]. \end{aligned}$$

Then taking \(n\to \infty \) and using dominated convergence, (4.39) and (4.40), we obtain that (4.26) holds, too.

In order to prove (4.43), we notice that (4.26) implies for any \(t\ge 0\) that

$$\begin{aligned} \overline{U}(x,\varphi ) &=\overline{E}_{x,\varphi }\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t\wedge \tau _{*}}-\rho (t\wedge \tau _{*})}\overline{U}\big(\widehat{X}_{t\wedge \tau _{*}},\widehat{\Phi }_{t\wedge \tau _{*}}\big)\Big] \\ &=\overline{E}_{x,\varphi }\Big[\mathbf{1}_{\{\tau _{*}\le t\}}e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau _{*}}-\rho \tau _{*}}(1+\widehat{\Phi }_{\tau _{*}})\Big] +\overline{E}_{x,\varphi }\Big[\mathbf{1}_{\{\tau _{*}>t\}}e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}-\rho t}\overline{U}\big(\widehat{X}_{t},\widehat{\Phi }_{t}\big)\Big], \end{aligned}$$

where we have used continuity of \(\overline{U}\) in the second equality. Letting \(t\to \infty \), the transversality condition (4.37) gives (4.43). □

Before closing this section, we illustrate consequences of Proposition 4.4 for the shape of the continuation and stopping sets \(\mathcal{C}\) and \(\mathcal{S}\). These are summarised in the next corollary.

Corollary 4.7

The continuation set \(\mathcal{C}\) is open and the stopping set \(\mathcal{S}\) is closed. The continuation set is connected in the \(\varphi \)-variable, i.e., for all \(\varphi '>\varphi \), we have

$$\begin{aligned} (x,\varphi )\in \mathcal{C}\implies (x,\varphi ')\in \mathcal{C}. \end{aligned}$$

Proof

The first statement is immediate from (i) in Proposition 4.4. The second statement follows since \(\varphi \mapsto \overline{U}(x,\varphi )-(1+\varphi )\) is convex (by (ii) in Proposition 4.4) and nonnegative, and since (iii) in Proposition 4.4 holds. □

For frequent future use, we define, for any \(x\in [0,\infty )\),

$$\begin{aligned} \psi (x):=\sup \{\varphi \in (0,\infty ) : (x,\varphi )\in \mathcal{S}\}, \end{aligned}$$

with the convention that \(\sup \varnothing =0\). Clearly, \(\mathcal{C}\) and \(\psi \) are related by

$$\begin{aligned} \mathcal{C}=\{(x,\varphi )\in [0,\infty )\times (0,\infty ) : \varphi > \psi (x)\} \end{aligned}$$
(4.44)

(see also Remark 4.1).

Next we infer monotonicity of \(\psi (\cdot )\) and therefore the existence of a generalised inverse \(c(\cdot )\), which is more convenient for a fuller geometric characterisation of \(\mathcal{C}\). This is done in the next subsections.

4.4 A parabolic formulation

Since the process \((\widehat{X},\widehat{\Phi })\) is driven by the same Brownian motion, we can equivalently consider a two-dimensional state dynamics in which only one component has a diffusive part. This is done by a method similar to the one used in several papers addressing partial information, including De Angelis et al. [18] and Johnson and Peskir [34].

Let us define a new process \((\widehat{Y}_{t})_{t\ge 0}\) by setting, \(\overline{P}_{x,\varphi }\)-a.s. for all \(t\ge 0\),

$$\begin{aligned} \widehat{Y}_{t}:=\frac{\sigma }{\theta }\ln \widehat{\Phi }_{t}- \widehat{X}_{t}. \end{aligned}$$

Then letting \(y:=\frac{\sigma }{\theta }\ln \varphi -x\), it is easy to verify that the couple \((\widehat{X},\widehat{Y})\) evolves under \(\overline{P}_{x,y}\) according to

$$\begin{aligned} d\widehat{X}_{t}=\mu _{0}dt+\sigma d\overline{W}_{t}+dA_{t}, \qquad \widehat{X}_{0}=x, \end{aligned}$$
(4.45)
$$\begin{aligned} d\widehat{Y}_{t}=-\frac{1}{2}(\mu _{1}+\mu _{0})dt+dA_{t}, \qquad \widehat{Y}_{0}=y. \end{aligned}$$
(4.46)
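The reflected dynamics (4.45)–(4.46) can be simulated with a simple Euler scheme. The sketch below builds \(A\) through the Skorokhod map, \(A_{t}=\sup _{s\le t}\big(-(x+\mu _{0}s+\sigma \overline{W}_{s})\big)^{+}\), which matches the expression for \(A\) used later in the proof of Proposition 4.9; the drift of \(\widehat{Y}\) is read off from the operator \(\mathcal{L}_{X,Y}\). All numerical values are illustrative only.

```python
import math
import random

# Euler-type sketch of the reflected system (4.45)-(4.46).  The reflection
# term A is built via the Skorokhod map, keeping X = Z + A nonnegative,
# where Z is the unreflected path x + mu0*t + sigma*W_t.
def simulate(x, y, mu0, mu1, sigma, T=1.0, n=10_000, seed=0):
    rng = random.Random(seed)
    dt = T / n
    sdt = math.sqrt(dt)
    Z = x        # unreflected path
    A = 0.0      # Skorokhod reflection term (nondecreasing, A_0 = 0)
    min_X = x
    for _ in range(n):
        Z += mu0 * dt + sigma * sdt * rng.gauss(0.0, 1.0)
        A = max(A, -Z)             # push just enough so that X = Z + A >= 0
        min_X = min(min_X, Z + A)
    X = Z + A
    Y = y - 0.5 * (mu1 + mu0) * T + A   # Y_T = y - (mu1+mu0)T/2 + A_T
    return X, Y, A, min_X

X, Y, A, min_X = simulate(x=1.0, y=0.0, mu0=-0.5, mu1=1.0, sigma=1.0)
assert min_X >= 0.0 and A >= 0.0        # reflection keeps X nonnegative
assert math.isclose(Y, -0.25 + A)       # discrete analogue of (4.46)
```

The check at the end verifies the two structural features of the system: \(\widehat{X}\) never leaves \([0,\infty )\), and \(\widehat{Y}\) moves deterministically except through the common reflection term \(A\).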

In order to rewrite our problem (4.20) in terms of the new dynamics, we set

$$\begin{aligned} \widehat{U}(x,y):=\overline{U}\bigg(x,\exp \Big(\frac{\theta }{\sigma }(x+y)\Big)\bigg), \qquad (x,y)\in [0,\infty )\times \mathbb{R}, \end{aligned}$$
(4.47)

and from (4.20), we obtain

$$\begin{aligned} \widehat{U}(x,y)=\sup _{\tau }\overline{E}_{x,y} \bigg[\exp \left (\frac{2 \mu _{0}}{\sigma ^{2}}A_{\tau }- \rho \tau \right ) \bigg(1 + \exp \Big(\frac{\theta }{\sigma }(\widehat{X}_{\tau }+ \widehat{Y}_{\tau }) \Big)\bigg)\bigg]. \end{aligned}$$
(4.48)

It is convenient in what follows to set

$$\begin{aligned} g(x,y):=1+\exp \bigg(\frac{\theta }{\sigma }(x+y)\bigg), \qquad (x,y)\in [0,\infty )\times \mathbb{R}, \end{aligned}$$
(4.49)

and notice that

$$\mathcal{C}=\{(x,y)\in [0,\infty )\times \mathbb{R}:\widehat{U}(x,y)>g(x,y) \}. $$

Another formulation of the problem, which will be useful below, may be obtained by an application of Dynkin’s formula (up to standard localisation arguments). Indeed, we can write

$$\begin{aligned} \widehat{u}(x,y) &:=\widehat{U}(x,y)-g(x,y) \\ & \phantom{:}=\sup _{\tau }\overline{E}_{x,y}\bigg[\int _{0}^{\tau }e^{\frac{2 \mu _{0}}{\sigma ^{2}}A_{t}-\rho t} 2\sigma ^{-2}\big(\mu _{0}+\mu _{1}e ^{\frac{\theta }{\sigma }\widehat{Y}_{t}}\big) dA_{t} \\ & \phantom{=::\sup _{\tau }\overline{E}_{x,y}\bigg[}-\rho \int _{0}^{ \tau }e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}-\rho t}g(\widehat{X}_{t}, \widehat{Y}_{t})dt\bigg], \end{aligned}$$
(4.50)

where we have also used (4.7). Recalling from Proposition 4.4 that \(\varphi \mapsto \overline{U}(x,\varphi )-(1+ \varphi )\) is convex and nonnegative with \(\overline{U}(x,0+)=1\), it follows that the mapping \(\varphi \mapsto \overline{U}(x,\varphi )-(1+\varphi )\) is also increasing. Then we have that

$$\begin{aligned} \text{$y\mapsto \widehat{u}(x,y)$ is increasing}. \end{aligned}$$
(4.51)

For frequent future use, we introduce the second-order operator \(\mathcal{L}_{X,Y}\) associated to \((\widehat{X},\widehat{Y})\). That is, for \(f\in C^{1,2}([0,\infty )\times \mathbb{R})\) and \((x,y)\in [0, \infty )\times \mathbb{R}\), we set

$$\begin{aligned} (\mathcal{L}_{X,Y}f)(x,y):=\bigg(-\frac{1}{2}(\mu _{1}+\mu _{0})f_{y}+ \frac{1}{2}\sigma ^{2}f_{xx}+\mu _{0} f_{x}\bigg)(x,y). \end{aligned}$$
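As a consistency check, the payoff \(g\) from (4.49) is annihilated by \(\mathcal{L}_{X,Y}\): writing \(k=\theta /\sigma \), one has \(g_{x}=g_{y}=k e^{k(x+y)}\) and \(g_{xx}=k^{2}e^{k(x+y)}\), and \(\frac{1}{2}\sigma ^{2}k^{2}+\mu _{0}k=\frac{1}{2}(\mu _{1}+\mu _{0})k\) because \(\sigma k=\mu _{1}-\mu _{0}\), so the three terms cancel. A quick numerical confirmation with illustrative parameter values:

```python
import math

# Check that (L_{X,Y} g)(x, y) = 0 for g(x, y) = 1 + exp(k(x+y)), k = theta/sigma,
# using exact partial derivatives.  Parameter values are illustrative only.
mu0, mu1, sigma = -0.5, 1.0, 1.0
theta = (mu1 - mu0) / sigma
k = theta / sigma

def L_g(x, y):
    e = math.exp(k * (x + y))
    g_y = k * e
    g_x = k * e
    g_xx = k * k * e
    # the operator from the display above, applied to g
    return -0.5 * (mu1 + mu0) * g_y + 0.5 * sigma**2 * g_xx + mu0 * g_x

for (x, y) in [(0.0, 0.0), (1.0, -2.0), (3.5, 0.7)]:
    assert abs(L_g(x, y)) < 1e-9
```

This is consistent with (4.50) and (4.52): since \(\mathcal{L}_{X,Y}g=0\), the identity \((\mathcal{L}_{X,Y}-\rho )\widehat{U}=0\) in \(\mathcal{C}\) is equivalent to \(\mathcal{L}_{X,Y}\widehat{u}-\rho \widehat{u}=\rho g\) for \(\widehat{u}=\widehat{U}-g\).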

Notice that (i) in Proposition 4.4 implies that \(\widehat{U}\) and \(\widehat{u}\) are both continuous on \([0,\infty ) \times \mathbb{R}\). Thanks to the parabolic formulation and recalling the martingale property (4.26), we can rely on standard optimal stopping theory and classical PDE results to state the next lemma (see e.g. Karatzas and Shreve [37, Theorem 2.7.7]).

Lemma 4.8

Given any open set \(\mathcal{R}\) whose closure is contained in \(\mathcal{C}\), the function \(\widehat{U}\) is the unique classical solution of the boundary value problem

$$\begin{aligned} (\mathcal{L}_{X,Y}-\rho ) f=0 \quad \textit{in}\ \mathcal{R}\qquad\textit{and}\qquad f|_{ \partial \mathcal{R}}=\widehat{U}|_{\partial \mathcal{R}}. \end{aligned}$$
(4.52)

Hence \(\widehat{U}\) is \(C^{1,2}\) in \(\mathcal{C}\cap ((0,\infty ) \times \mathbb{R})\).

Now we turn to the analysis of the geometry of \(\mathcal{C}\). First we show that \(\mathcal{C}\neq \varnothing \).

Proposition 4.9

We have \(\mathcal{C}\neq \varnothing \) and in particular \(\{0\}\times (y_{\ell },\infty )\subseteq \mathcal{C}\), where we set \(y _{\ell }:=\frac{\sigma }{\theta }\ln (-\frac{\mu _{0}}{\mu _{1}})\).

Proof

Fix \(\varepsilon >0\), take \(y>y_{\ell }+\varepsilon \) and let

$$\tau _{\ell }:=\inf \{t\ge 0 : (\widehat{X}_{t},\widehat{Y}_{t}, A_{t}) \notin [0,1)\times (y_{\ell }+ \varepsilon ,\infty )\times [0,1)\}. $$

Notice that there exist \(c_{1,\varepsilon }>0\), \(c_{2,\varepsilon }>0\) such that \(\overline{P}_{0,y}\)-a.s.,

$$\begin{aligned} g(\widehat{X}_{t\wedge \tau _{\ell }},\widehat{Y}_{t\wedge \tau _{ \ell }})\le c_{2,\varepsilon }\quad \text{and}\quad \mu _{0}+\mu _{1}e ^{\frac{\theta }{\sigma }\widehat{Y}_{t\wedge \tau _{\ell }}}\ge c_{1, \varepsilon } \end{aligned}$$

for all \(t\in [0,1]\), given that \(y_{\ell }+\varepsilon \le \widehat{Y}_{t\wedge \tau _{\ell }}\le y+\frac{1}{2}|\mu _{0}+\mu _{1}|+1\). Then recalling (4.50) and that

$$A_{t}=\sup _{0\le s\le t}(-\mu _{0} s-\sigma \overline{W}_{s})=S^{\mu _{0},\sigma }_{t} \qquad \text{$\overline{P}_{0,y}$-a.s. for all $t\ge 0$}, $$

we immediately obtain that

$$\begin{aligned} \widehat{u}(0,y) &\ge \overline{E}_{0,y}\bigg[\int _{0}^{u\wedge \tau _{\ell }} e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}-\rho t}\frac{2}{\sigma ^{2}}\big(\mu _{0}+\mu _{1}e^{\frac{\theta }{\sigma }\widehat{Y}_{t}} \big)dA_{t} \\ & \phantom{=:\overline{E}_{0,y}\bigg[}-\rho \int _{0}^{u\wedge \tau _{ \ell }} e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t}-\rho t}g(\widehat{X}_{t}, \widehat{Y}_{t})dt\bigg] \\ &\ge \overline{E}_{0,y}\big[c'_{1,\varepsilon }S^{\mu _{0},\sigma } _{u\wedge \tau _{\ell }}-c'_{2,\varepsilon }(u\wedge \tau _{\ell }) \big] \end{aligned}$$

for some \(c'_{1,\varepsilon }>0\), \(c'_{2,\varepsilon }>0\) and all \(u\in (0,1]\). Next, using the Cauchy–Schwarz inequality, we obtain (cf. also Peskir [41, Lemma 13]) that

$$\begin{aligned} \widehat{u}(0,y)\ge c'_{1,\varepsilon }\Big(\overline{E}_{0,y}\big[S^{\mu _{0},\sigma }_{u}\big] -\big(\overline{E}_{0,y}\big[(S^{\mu _{0},\sigma }_{u})^{2}\big]\big)^{\frac{1}{2}}\big(\overline{P}_{0,y}[\tau _{\ell }< u]\big)^{\frac{1}{2}}\Big)-c'_{2,\varepsilon }u. \end{aligned}$$
(4.53)

Notice now that for each \(u\ge 0\), one has

$$\text{Law}\bigg(\sup _{0\le s\le u}\overline{W}_{s}\bigg)=\text{Law}(|\overline{W}_{u}|) =\text{Law}(|\overline{W}_{1}|\sqrt{u}). $$

Then, for some suitable \(c>0\) that may vary from line to line but is independent of \(u>0\), we obtain

$$\begin{aligned} \overline{E}_{0,y}[(S^{\mu _{0},\sigma }_{u})^{2}] \le c\big(u^{2} + \overline{E}[(S^{0,-1}_{u})^{2}]\big) = c\big(u^{2} + u \overline{E} \big[|\overline{W}_{1}|^{2}\big]\big). \end{aligned}$$
(4.54)
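The second-moment bound (4.54) rests on the scaling identity \(\text{Law}(\sup _{s\le u}\overline{W}_{s})=\text{Law}(|\overline{W}_{1}|\sqrt{u})\); a quick Monte Carlo sanity check of the implied first moment \(\overline{E}[\sup _{s\le u}\overline{W}_{s}]=\sqrt{2u/\pi }\) (path discretisation biases the simulated supremum slightly downwards, hence the loose tolerance; the snippet is purely illustrative):

```python
import numpy as np

# Monte Carlo check of Law(sup_{s<=u} W_s) = Law(|W_1| sqrt(u)) through
# E[sup_{s<=u} W_s] = sqrt(2u/pi).  Discretisation underestimates the
# running supremum a little, so the tolerance is deliberately loose.
rng = np.random.default_rng(0)
u, n_steps, n_paths = 1.0, 400, 20_000
dW = rng.normal(0.0, np.sqrt(u / n_steps), size=(n_paths, n_steps))
running_sup = np.maximum(np.cumsum(dW, axis=1).max(axis=1), 0.0)
mc_mean = running_sup.mean()
exact = np.sqrt(2.0 * u / np.pi)
assert abs(mc_mean - exact) < 0.06
```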

Moreover, we observe from (4.46) that if \(\mu _{0}+\mu _{1}\le 0\), the process \(\widehat{Y}\) never goes below \(y_{\ell }+\varepsilon \), whereas if \(\mu _{0}+\mu _{1}>0 \), then \(\widehat{Y}_{t}\le y_{\ell }+\varepsilon \) implies that

$$t\ge 2(y-y_{\ell }-\varepsilon )/(\mu _{0}+\mu _{1})=:t_{\ell }. $$

Hence without loss of generality, we may take \(u< t_{\ell }\) and get

$$\begin{aligned} \overline{P}_{0,y}[\tau _{\ell }< u] &= \overline{P}_{0,y}\Big[ \sup _{0\le s\le u}\widehat{X}_{s}\ge 1 \text{ or } A_{u}\ge 1\Big] \\ & \le \overline{P}_{0,y}\Big[\sup _{0\le s\le u}\widehat{X}_{s}\ge 1 \Big]+\overline{P}_{0,y}[A_{u}\ge 1] . \end{aligned}$$

For the first term on the right-hand side above, we have

$$\begin{aligned} \overline{P}_{0,y}\Big[\sup _{0\le s\le u}\widehat{X}_{s}\ge 1\Big] &=\overline{P}\Big[\sup _{0\le s\le u}\Big(\sup _{0\le v\le s}\big(\mu _{0}(s-v)+ \sigma (\overline{W}_{s}-\overline{W}_{v})\big)\Big)\ge 1\Big] \\ &= \overline{P}\Big[\sup _{0\le s\le u}(\mu _{0}s+\sigma \overline{W} _{s})\ge 1\Big] \\ &\le \overline{E}\Big[\sup _{0\le s\le u}(\mu _{0}s+\sigma \overline{W} _{s})^{2}\Big]\le c \big(u^{2}+u \overline{E}\big[|\overline{W}_{1}|^{2} \big]\big), \end{aligned}$$

where we have used Markov’s inequality in the penultimate inequality. It is easy to check that we have the same bound also for \(\overline{P} _{0,y}[A_{u}\ge 1]\), and therefore

$$\begin{aligned} \overline{P}_{0,y}[\tau _{\ell }< u]\le 2c \big(u^{2}+u \overline{E} \big[|\overline{W}_{1}|^{2}\big]\big). \end{aligned}$$
(4.55)

Finally, we also notice that since \(\mu _{0}<0\), we have

$$\begin{aligned} \overline{E}_{0,y}[S^{\mu _{0},\sigma }_{u}]\ge \sigma \overline{E} \Big[\sup _{0\le s\le u}\overline{W}_{s}\Big]=\sigma \sqrt{u} \overline{E}\big[ |\overline{W}_{1}| \big]. \end{aligned}$$
(4.56)

Plugging (4.54)–(4.56) into (4.53), we obtain

$$\begin{aligned} \widehat{u}(0,y)\ge c''_{1,\varepsilon }\sqrt{u}-c''_{2,\varepsilon }(u+u^{3/2}+u^{2}) \end{aligned}$$

with suitable constants \(c''_{1,\varepsilon }>0\) and \(c''_{2,\varepsilon }>0\). Then taking \(u\) sufficiently small, we obtain \(\widehat{u}(0,y)>0\) as claimed. □
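The final bound wins for small \(u\) because \(\sqrt{u}\) dominates every higher power of \(u\) near zero. A direct evaluation with hypothetical constants \(c''_{1,\varepsilon }=c''_{2,\varepsilon }=1\) (chosen only to visualise the mechanism):

```python
def lower_bound(u, c1=1.0, c2=1.0):
    # shape of the final estimate: c1*sqrt(u) - c2*(u + u^{3/2} + u^2)
    return c1 * u**0.5 - c2 * (u + u**1.5 + u**2)

assert lower_bound(0.05) > 0.0   # positive for small u ...
assert lower_bound(1.0) < 0.0    # ... but not for large u
```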

By (4.46), for \(\widehat{X}\) away from 0, the process \(\widehat{Y}\) has either a positive or a negative drift, according to the sign of \(\mu _{1}+\mu _{0}\). Interestingly, this dichotomy also produces substantially different technical difficulties. Recalling (4.44), we start by observing that

$$\varphi >\psi (x)\iff e^{(\theta /\sigma )(x+y)}>\psi (x)\iff y> \chi (x), $$

where

$$\begin{aligned} \chi (x):=\frac{\sigma }{\theta }\ln \psi (x)-x, \qquad x\in [0,\infty ). \end{aligned}$$
(4.57)

Hence we have that (4.44) is equivalent to

$$\begin{aligned} \mathcal{C}=\{(x,y)\in [0,\infty )\times \mathbb{R}: y> \chi (x)\}. \end{aligned}$$
(4.58)
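The equivalence between (4.44) and (4.58) is a pure change of variables and can be checked mechanically: with \(y=\frac{\sigma }{\theta }\ln \varphi -x\), the condition \(\varphi >\psi (x)\) is the same as \(y>\frac{\sigma }{\theta }\ln \psi (x)-x\). A numerical check with a hypothetical boundary \(\psi \) (chosen only to exercise the equivalence; \(\sigma \), \(\theta \) are illustrative):

```python
import math
import random

# Check phi > psi(x)  <=>  y > chi(x), with y = (sigma/theta) ln(phi) - x
# and chi(x) = (sigma/theta) ln(psi(x)) - x.  The boundary psi below is a
# hypothetical stand-in, used only to exercise the equivalence.
sigma, theta = 1.0, 1.5

def psi(x):
    return 0.5 + math.exp(-x)        # hypothetical positive boundary

def chi(x):
    return (sigma / theta) * math.log(psi(x)) - x

rng = random.Random(1)
for _ in range(1000):
    x = rng.uniform(0.0, 5.0)
    phi = rng.uniform(0.01, 5.0)
    y = (sigma / theta) * math.log(phi) - x
    assert (phi > psi(x)) == (y > chi(x))
```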

Before going further, it is convenient to introduce

$$\begin{aligned} \mathcal{C}_{y}:=\{x\in [0,\infty ): (x,y)\in \mathcal{C}\}, \qquad \mathcal{S}_{y}:=\{x\in [0,\infty ): (x,y)\in \mathcal{S}\} \end{aligned}$$

for any \(y\in \mathbb{R}\). The geometry of \(\mathcal{C}\) in the coordinates \((x,y)\) is explained in Propositions 4.10 and 4.12 below.

Proposition 4.10

Assume \(\mu _{1}+\mu _{0}\ge 0\). Then there exists a unique increasing function \(b:\mathbb{R}\to [0,\infty ]\) such that \(\mathcal{S}_{y}=[b(y), \infty )\) for all \(y\in \mathbb{R}\) (with \(\mathcal{S}_{y}=\varnothing \) if \(b(y)=\infty \)).

Proof

First we show that \((x,y)\in \mathcal{S}\) implies that \((x',y)\in \mathcal{S}\) for all \(x'\ge x\). Fix \((x,y)\in \mathcal{S}\) and \(x'> x\); then we know from (4.58) that \(\{x\}\times (-\infty ,y]\subseteq \mathcal{S}\). Due to (4.46), we have that \(\widehat{Y}\) is decreasing during excursions of \(\widehat{X}\) away from zero. This implies that the process \((\widehat{X}^{x'},\widehat{Y}^{x',y})\) cannot reach \(x=0\) before hitting the half-line \(\{x\}\times (-\infty ,y]\). Thus, letting \(\tau _{0}:=\inf \{t\ge 0 : \widehat{X}_{t}=0\}\) gives \(\overline{P}_{x',y}[\tau _{*}<\tau _{0}]=1\). Hence (4.50) gives \(\widehat{u}(x',y)\le 0\), so that \((x',y)\in \mathcal{S}\) for all \(x'\ge x\), as claimed.

Now, for each \(y\in \mathbb{R}\), we can define \(b(y):=\inf \{x\in [0, \infty ) : (x,y)\in \mathcal{S}\}\) and therefore \(\mathcal{S}_{y}=[b(y), \infty )\). Combining the latter with (4.58) gives that \(y\mapsto b(y)\) is increasing. □

Next we want to show that a result similar to Proposition 4.10 also holds for \(\mu _{1}+\mu _{0}<0\), under a mild additional condition. However, in this case, we first need to compute an expression for the derivative \(\widehat{U}_{y}\).

Lemma 4.11

For all \((x,y)\in ((0,\infty )\times \mathbb{R})\setminus \partial \mathcal{C}\), we have

$$\begin{aligned} \widehat{U}_{y}(x,y)=\frac{\theta }{\sigma }\,\overline{E}_{x,y}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau _{*}}-\rho \tau _{*}}\exp \Big(\frac{\theta }{\sigma }(\widehat{X}_{\tau _{*}}+\widehat{Y}_{\tau _{*}})\Big)\Big]. \end{aligned}$$
(4.59)

Proof

The claim is trivial if \((x,y)\in \mathcal{S}\setminus \partial \mathcal{C}\) since \(\overline{P}_{x,y}[\tau _{*}=0]=1\) there. Take \((x,y)\in \mathcal{C}\) and let \(\tau :=\tau _{*}(x,y)\) be optimal for \(\widehat{U}(x,y)\). Then for \(\varepsilon >0\), using (4.25) and (4.26), we have

$$\begin{aligned} \widehat{U}(x,y+\varepsilon )-\widehat{U}(x,y)\ge \overline{E}_{x,y}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{t\wedge \tau }-\rho (t\wedge \tau )}\big(\widehat{U}(\widehat{X}_{t\wedge \tau },\widehat{Y}^{y+\varepsilon }_{t\wedge \tau })-\widehat{U}(\widehat{X}_{t\wedge \tau },\widehat{Y}^{y}_{t\wedge \tau })\big)\Big]. \end{aligned}$$

Recall (4.47), (4.37) and (4.22). Letting \(t\to \infty \) and using dominated convergence gives

$$\begin{aligned} \widehat{U}(x,y+\varepsilon )-\widehat{U}(x,y)\ge \overline{E}_{x,y}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau }-\rho \tau }\big(g(\widehat{X}_{\tau },\widehat{Y}^{y+\varepsilon }_{\tau })-g(\widehat{X}_{\tau },\widehat{Y}^{y}_{\tau })\big)\Big]. \end{aligned}$$

The same argument may be applied to obtain

$$\begin{aligned} \widehat{U}(x,y)-\widehat{U}(x,y-\varepsilon )\le \overline{E}_{x,y}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau }-\rho \tau }\big(g(\widehat{X}_{\tau },\widehat{Y}^{y}_{\tau })-g(\widehat{X}_{\tau },\widehat{Y}^{y-\varepsilon }_{\tau })\big)\Big]. \end{aligned}$$

We divide both expressions by \(\varepsilon \) and let \(\varepsilon \to 0\). Then, recalling that \(\widehat{U}\in C^{1,2}\) in \(\mathcal{C}\) (Lemma 4.8), noticing that \(\partial _{y} \widehat{Y}^{y}_{t}=1\) for all \(t\ge 0\) and because \(\tau \) was chosen independently of \(\varepsilon \), we obtain (4.59). □

Proposition 4.12

Assume \(\mu _{1}+\mu _{0}< 0\) and \(\rho \ge \frac{\theta }{2\sigma }|\mu _{1}+\mu _{0}|\). Then there exists a unique increasing function \(b:\mathbb{R}\to [0,\infty ]\) such that \(\mathcal{S}_{y}=[b(y),\infty )\) for all \(y\in \mathbb{R}\) (with \(\mathcal{S}_{y}=\varnothing \) if \(b(y)=\infty \)).

Proof

First notice that if \(\mathcal{S}_{y}=[b(y),\infty )\) for all \(y\in \mathbb{R}\), then \(b\) is increasing due to (4.58). Then it remains to prove existence of \(b\). Fix \(y\in \mathbb{R}\). Then we have two possibilities:

  1. (i)

    \(\widehat{u}_{x}(x,y)\le 0\) for all \(x\in (0,\infty )\) such that \((x,y)\in \mathcal{C}\).

  2. (ii)

    There exists \(x_{0}\in (0,\infty )\) with \((x_{0},y)\in \mathcal{C}\) and \(\widehat{u}_{x}(x_{0},y)> 0\).

In case (i), for each \(y\in \mathbb{R}\), there exists a unique point \(b(y)\in [0,\infty ]\) such that \(\mathcal{S}_{y}=[b(y),\infty )\). In case (ii), we argue in two steps. First we show that (ii) implies \([x_{0},\infty )\times \{y\}\subseteq \mathcal{C}\), and then we show that \([x_{0},\infty )\times \{y\}\subseteq \mathcal{C}\) leads to a contradiction. Hence only (i) is possible, for all \(y\in \mathbb{R}\).

Step 1. (ii) \(\Rightarrow [x_{0},\infty )\times \{y\}\subseteq \mathcal{C}\): From Lemma 4.8 and the definition of \(\widehat{u}\) in (4.50), we know that

$$\begin{aligned} \mathcal{L}_{X,Y}\widehat{u}-\rho \widehat{u}=\rho g \qquad \text{in $\mathcal{C}\cap \big((0,\infty )\times \mathbb{R}\big)$,} \end{aligned}$$
(4.60)

where we recall \(g\) from (4.49). In particular, at \((x_{0},y)\), we have \(\mu _{0}\widehat{u}_{x}(x_{0},y)<0\) and

$$\begin{aligned} \frac{1}{2}\sigma ^{2}\widehat{u}_{xx}(x_{0},y) &= \rho g(x_{0},y)+ \rho \widehat{u}(x_{0},y) \\ & \phantom{=:}-\mu _{0}\widehat{u}_{x}(x_{0},y)+\frac{1}{2}(\mu _{0}+\mu _{1})\widehat{u}_{y}(x_{0},y) \\ &> \rho g(x_{0},y)+\rho \widehat{u}(x_{0},y)+\frac{1}{2}(\mu _{0}+\mu _{1})\widehat{u}_{y}(x_{0},y). \end{aligned}$$
(4.61)

Next we use the probabilistic representation (4.59) of \(\widehat{U}_{y}\) to find a lower bound for the right-hand side of (4.61). In particular, by direct comparison of (4.48) and (4.59) (recall also (4.43)), we obtain

$$\begin{aligned} \widehat{U}_{y}(x_{0},y)=\frac{\theta }{\sigma }\Big(\widehat{U}(x_{0},y)-\overline{E}_{x_{0},y}\big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A_{\tau _{*}}-\rho \tau _{*}}\big]\Big) \end{aligned}$$

and consequently

$$\begin{aligned} \widehat{U}_{y}(x_{0},y)\le \frac{\theta }{\sigma }\widehat{U}(x_{0},y). \end{aligned}$$
(4.62)

Plugging (4.62) into the right-hand side of (4.61) and using \(\rho \ge \frac{\theta }{2\sigma }|\mu _{0}+\mu _{1}|\), we immediately find

$$\begin{aligned} \frac{1}{2}\sigma ^{2}\widehat{u}_{xx}(x_{0},y)>\Big(\rho -\frac{\theta }{2\sigma }|\mu _{0}+\mu _{1}|\Big)\widehat{U}(x_{0},y)+\frac{1}{2}|\mu _{0}+\mu _{1}|g_{y}(x_{0},y)>0. \end{aligned}$$

The latter implies that \(\widehat{u}_{x}(\cdot ,y)\) is increasing in a right neighbourhood of \(x_{0}\). Hence we can repeat the argument for any point in this neighbourhood and eventually conclude that \(\widehat{u} _{x}(\cdot ,y)>0\) on \([x_{0},\infty )\). Then we must have \([x_{0}, \infty )\times \{y\}\subseteq \mathcal{C}\).

Step 2. \([x_{0},\infty )\times \{y\}\subseteq \mathcal{C}\) is impossible: Fix a point \((x_{0},y_{0})\) such that we have \([x _{0},\infty )\times \{y_{0}\}\subseteq \mathcal{C}\). Recalling (4.58), we then obtain \([x_{0},\infty )\times [y_{0},\infty ) \subseteq \mathcal{C}\) and therefore \(\overline{P}_{x,y}[\tau _{*}= \infty ]=1\) for any \((x,y)\in (x_{0},\infty )\times (y_{0},\infty )\), because \(\widehat{Y}\) is increasing (cf. (4.46)). Then for any such \((x,y)\), (4.26) gives

$$\begin{aligned} \widehat{U}(x,y)=\overline{E}_{x,y}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A _{t}-\rho t}\widehat{U}(\widehat{X}_{t},\widehat{Y}_{t})\Big] \qquad \text{for all $t\ge 0$}. \end{aligned}$$

Letting \(t\to \infty \), condition (4.37) gives the contradiction \(\widehat{U}(x,y)=0\). □

Combining Propositions 4.10 and 4.12 with (4.58) gives the next corollary.

Corollary 4.13

Assume either \(\mu _{1}+\mu _{0}\ge 0\), or \(\mu _{1}+\mu _{0}< 0\) with \(\rho \ge \frac{\theta }{2\sigma }|\mu _{1}+\mu _{0}|\). Then the map \(x\mapsto \chi (x)\) is increasing.

We can say that \(\chi \) is the (generalised) inverse of \(b\) in a sense that is clarified later in Sect. 5.2.
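The generalised-inverse relation can be made concrete: since \(\mathcal{S}=\{y\le \chi (x)\}=\{x\ge b(y)\}\) for an increasing \(\chi \), one has \(b(y)=\inf \{x:\chi (x)\ge y\}\). A grid-based sketch with a hypothetical increasing boundary \(\chi \) (chosen only to illustrate the relation):

```python
import numpy as np

# Generalised inverse on a grid: S = {y <= chi(x)} = {x >= b(y)} with
# b(y) = inf{x : chi(x) >= y} for an increasing chi.  The boundary chi
# below is hypothetical, used only to exercise the relation.
x_grid = np.linspace(0.0, 5.0, 501)
chi_vals = np.log1p(x_grid) - 1.0      # increasing, hypothetical boundary

def b(y):
    i = np.searchsorted(chi_vals, y, side="left")    # first x with chi(x) >= y
    return x_grid[i] if i < len(x_grid) else np.inf  # S_y empty beyond the grid

rng = np.random.default_rng(2)
for y in rng.uniform(-1.5, 1.0, size=200):
    in_S = chi_vals >= y               # grid version of {y <= chi(x)}
    assert np.array_equal(in_S, x_grid >= b(y))
```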

5 Fine properties of the value function and of the boundary

In this section, we continue our study of the optimal stopping problem by proving that its value function is \(C^{1}\) and by exhibiting properties of the optimal boundary in the different coordinate systems (i.e., \((x,\pi )\), \((x,\varphi )\) and \((x,y)\)).

5.1 Regularity of value function and optimal boundary

Combining Propositions 4.10 and 4.12, we conclude that under each of the two conditions

  1. (i)

    \(\mu _{1}+\mu _{0}\ge 0\),

  2. (ii)

    \(\mu _{1}+\mu _{0}< 0\) and \(\rho \ge \frac{\theta }{2\sigma }|\mu _{1}+\mu _{0}|\),

there is an increasing optimal boundary \(b\) such that

$$\begin{aligned} \mathcal{S}=\{(x,y)\in [0,\infty )\times \mathbb{R}: x\ge b(y)\}. \end{aligned}$$
(5.1)

Since we only consider the cases (i) and (ii) in the rest of this paper, it is worth summarising them in a single assumption. Recall \(\theta =(\mu _{1}-\mu _{0})/\sigma \).

Assumption 5.1

We assume that \((\mu _{0},\mu _{1},\rho ,\sigma )\) fulfil one of (i), (ii) above.

Proposition 5.2

Under Assumption 5.1, we have \(0\le b(y)<\infty \) for all \(y\in \mathbb{R}\), and moreover \(b\in C(\mathbb{R})\).

Proof

1) Finiteness: Let us start by proving finiteness of the boundary by way of contradiction. Assume there is \(y_{0}\in \mathbb{R}\) such that \([0, \infty )\times \{y_{0}\}\subseteq \mathcal{C}\). Then we must have \([0, \infty )\times [y_{0},\infty )\subseteq \mathcal{C}\) by monotonicity of \(b(\cdot )\). Notice that we have already shown in Step 2 of the proof of Proposition 4.12 that this is impossible if \(\mu _{0}+\mu _{1}<0\) and \(\rho \ge \frac{\theta }{2\sigma }|\mu _{0}+\mu _{1}|\). Thus it remains to prove the contradiction for \(\mu _{0}+\mu _{1}\ge 0\).

For future use, let us introduce

$$\begin{aligned} X^{\circ }_{t}=x+\mu _{0}t+\sigma \overline{W}_{t}, \qquad Y^{\circ }_{t}=y-\frac{1}{2}(\mu _{1}+\mu _{0})t. \end{aligned}$$
(5.2)

Fix \(t_{0}>0\) and define \(y_{1}:=y_{0}+\frac{1}{2}(\mu _{1}+\mu _{0})t _{0}\). Then by assumption, we must have \(\overline{P}_{x,y_{1}}[\tau _{*}\ge t_{0}]=1\) for all \(x\ge 0\). For \(\tau _{0}:=\inf \{t\ge 0 : \widehat{X}_{t}=0\}\), using the strong Markov property and (4.50), we obtain

$$\begin{aligned} \widehat{u}(x,y_{1})\le \overline{E}_{x,y_{1}}\bigg[e^{-\rho \tau _{0}}\widehat{u}(0,Y^{\circ }_{\tau _{0}})-\rho \int _{0}^{\tau _{0}\wedge \tau _{*}}e^{-\rho t}g(X^{\circ }_{t},Y^{\circ }_{t})dt\bigg], \end{aligned}$$
(5.3)

where we use that on \(\{t\le \tau _{0}\}\), we have \((\widehat{X}_{t}, \widehat{Y}_{t})=(X^{\circ }_{t},Y^{\circ }_{t})\) \(\overline{P}_{x,y _{1}}\)-a.s. From (4.31), we deduce that for some \(c_{y_{1}}>0\) only depending on \(y_{1}\), we have

$$\begin{aligned} e^{-\rho \tau _{0}}\widehat{u}(0,Y^{\circ }_{\tau _{0}})\le e^{-\rho \tau _{0}}\big(1+ c_{1} e^{\frac{\theta }{\sigma }Y^{\circ }_{\tau _{0}}} \big)\le c_{y_{1}} e^{-\rho \tau _{0}}, \end{aligned}$$

where in the last inequality, we have also used \(\mu _{0}+\mu _{1} \ge 0\). Plugging the latter bound into (5.3) and using that \(\tau _{*}\ge t_{0}\), we get

$$\begin{aligned} \widehat{u}(x,y_{1})\le c_{y_{1}} \overline{E}_{x,y_{1}}[e^{-\rho \tau _{0}}]-\rho \overline{E}_{x,y_{1}}\bigg[\int _{0}^{\tau _{0}\wedge t_{0}}e ^{-\rho t}g(X^{\circ }_{t}, Y^{\circ }_{t})dt\bigg]. \end{aligned}$$
(5.4)

Taking \(x\to \infty \), the first term on the right-hand side of (5.4) goes to zero, whereas the second one diverges to \(\infty \) because \(\lim _{x\to \infty }\overline{P}_{x,y_{1}}[\tau _{0} \ge t_{0}]= 1\) and \(x\mapsto g(x,y)\) is increasing. Hence we have a contradiction.

2) Left-continuity: Using that \(b(\cdot )\) is increasing and \(\mathcal{S}\) is closed, we obtain that \(\lim _{n\to \infty }(b(y_{n}),y _{n})=(b(y_{0}-),y_{0})\in \mathcal{S}\) for any \(y_{0}\in \mathbb{R}\) and any increasing sequence \(y_{n}\uparrow y_{0}\) as \(n\to \infty \), where \(b(y_{0}-)\) is the left limit of \(b\) at \(y_{0}\). Then \(b(y_{0}-)\ge b(y_{0})\) by (5.1), and since \(b(y_{n})\le b(y _{0})\) for all \(n\ge 1\), \(b\) must be left-continuous, hence lower semi-continuous.

3) Right-continuity: The argument by contradiction that we use draws from [15]. Assume there is \(y_{0}\in \mathbb{R}\) with \(b(y_{0})< b(y_{0}+)\) and take \(b(y_{0})< x_{1}< x_{2}< b(y_{0}+)\) and a nonnegative function \(\phi \in C^{\infty }_{c}(x_{1},x_{2})\) such that \(\int _{x_{1}}^{x _{2}}\phi (x)dx=1\). Thanks to Lemma 4.8 (cf. also (4.60)), we have

$$\begin{aligned} (\mathcal{L}_{X,Y}-\rho )\widehat{u}(x,y)=g(x,y) \qquad \text{for $(x,y)\in (x_{1},x_{2})\times (y_{0},\infty )$.} \end{aligned}$$
(5.5)
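The localisation uses a normalised test function \(\phi \in C^{\infty }_{c}(x_{1},x_{2})\) with \(\int _{x_{1}}^{x_{2}}\phi (x)dx=1\). Such a \(\phi \) can be realised concretely as a rescaled smooth bump (a standard construction, sketched here with numerical normalisation):

```python
import math

# Normalised smooth bump phi in C_c^infinity(x1, x2): phi is proportional to
# exp(-1/(1 - t^2)) in the rescaled variable t = (2x - x1 - x2)/(x2 - x1),
# with the normalising constant computed by a midpoint-rule quadrature.
def bump_factory(x1, x2, n=20_000):
    def raw(x):
        t = (2.0 * x - x1 - x2) / (x2 - x1)
        return math.exp(-1.0 / (1.0 - t * t)) if abs(t) < 1.0 else 0.0
    h = (x2 - x1) / n
    Z = sum(raw(x1 + (i + 0.5) * h) for i in range(n)) * h   # midpoint rule
    return lambda x: raw(x) / Z

phi = bump_factory(1.0, 2.0)
h = 1.0 / 20_000
integral = sum(phi(1.0 + (i + 0.5) * h) for i in range(20_000)) * h
assert abs(integral - 1.0) < 1e-6
assert phi(0.99) == 0.0 and phi(2.01) == 0.0   # compact support in (x1, x2)
assert phi(1.5) > 0.0
```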

Let us first consider the case (i) of \(\mu _{0}+\mu _{1}\ge 0\). Recall that \(\widehat{u}_{y}\ge 0\) in \(\mathcal{C}\) by (4.51). Then multiplying (5.5) by \(\phi (\cdot )\) and integrating by parts, we obtain

$$\begin{aligned} 0 &\ge -\frac{1}{2}(\mu _{0}+\mu _{1})\int ^{x_{2}}_{x_{1}}\widehat{u} _{y}(x,y)\phi (x)dx \\ &= \int ^{x_{2}}_{x_{1}}\bigg(\rho g+\rho \widehat{u}-\mu _{0} \widehat{u}_{x}-\frac{1}{2}\sigma ^{2}\widehat{u}_{xx}\bigg)(x,y) \phi (x)dx \\ &= \int ^{x_{2}}_{x_{1}}\bigg((\rho g+\rho \widehat{u})(x,y)\phi (x)+ \mu _{0} \widehat{u}(x,y)\phi '(x)-\frac{1}{2}\sigma ^{2}\widehat{u}(x,y) \phi ''(x)\bigg)dx. \end{aligned}$$

Taking limits as \(y\downarrow y_{0}\) and using dominated convergence and \(\widehat{u}(x,y_{0})=0\), we obtain

$$\begin{aligned} 0 & \ge \rho \int ^{x_{2}}_{x_{1}}g(x,y_{0})\phi (x)dx>0 \end{aligned}$$

which is a contradiction. Hence \(b(y_{0})=b(y_{0}+)\).

Next consider the case (ii) where \(\mu _{0}+\mu _{1}<0\) and \(\rho \ge \frac{\theta }{2\sigma }|\mu _{0}+\mu _{1}|\). Thanks to classical results on internal regularity of PDEs (e.g. Friedman [28, Chap. 3, Theorem 10]), we can differentiate (5.5) with respect to \(x\) and obtain that \(\widehat{u}_{x}\) is in \(C^{1,2}\) in \(\mathcal{C}\) and solves

$$\begin{aligned} (\mathcal{L}_{X,Y}-\rho )\widehat{u}_{x}=\rho g_{x} \qquad \text{in $(x_{1},x_{2})\times (y_{0},\infty )$}. \end{aligned}$$
(5.6)

It is crucial to recall that \(\widehat{u}_{x}\le 0\), as shown in the proof of Proposition 4.12. For \(y>y_{0}\), we get from (5.6) that

$$\begin{aligned} &\int ^{x_{2}}_{x_{1}}(\mathcal{L}_{X,Y}\widehat{u}_{x}-\rho \widehat{u}_{x}-\rho g_{x})(x,y)\phi (x)dx=0. \end{aligned}$$
(5.7)

Defining \(F_{\phi }(y):=\int ^{x_{2}}_{x_{1}}\widehat{u}_{xy}(x,y) \phi (x)dx\) and using integration by parts, (5.7) may be rewritten as

$$\begin{aligned} \frac{1}{2}|\mu _{0}+\mu _{1}|F_{\phi }(y) &= \int ^{x_{2}}_{x_{1}} \bigg(\frac{1}{2}\sigma ^{2}\widehat{u}(x,y)\phi '''(x) - \mu _{0} \widehat{u}(x,y)\phi ''(x) \\ & \phantom{=:\int ^{x_{2}}_{x_{1}} \bigg(} - \rho \widehat{u}(x,y)\phi '(x) + \rho g_{x}(x,y)\phi (x)\bigg)dx. \end{aligned}$$

Taking limits as \(y\downarrow y_{0}\) and using \(\widehat{u}(x,y_{0})=0\) gives

$$\begin{aligned} F_{\phi }(y_{0}+)=\frac{2\rho }{|\mu _{0}+\mu _{1}|}\int ^{x_{2}}_{x_{1}} g_{x}(x,y_{0})\phi (x)dx\ge \rho _{0}>0 \end{aligned}$$

for some \(\rho _{0}\). Hence there is \(\varepsilon >0\) such that \(F_{\phi }(y)\ge \rho _{0}/2\) for \(y\in (y_{0},y_{0}+\varepsilon )\). Then from the definition of \(F_{\phi }\), integration by parts and Fubini’s theorem, we find that

$$\begin{aligned} \frac{1}{2}\rho _{0}\varepsilon & \le \int _{y_{0}}^{y_{0}+\varepsilon }F_{\phi }(y)dy=-\int _{x_{1}}^{x_{2}}\left (\int _{y_{0}}^{y_{0}+ \varepsilon }\widehat{u}_{y}(x,y)dy\right )\phi '(x)dx \\ &=-\int _{x_{1}}^{x_{2}}\widehat{u}(x,y_{0}+\varepsilon )\phi '(x)dx= \int _{x_{1}}^{x_{2}}\widehat{u}_{x}(x,y_{0}+\varepsilon )\phi (x)dx \le 0, \end{aligned}$$

where we also used \(\widehat{u}(x,y_{0})=0\). This contradiction implies \(b(y_{0})=b(y_{0}+)\). □

Monotonicity of \(b\) is the key to the regularity of the value function in this context. In fact, we use it to show that the first hitting time to \(\mathcal{S}\) coincides with the first hitting time to the interior of \(\mathcal{S}\). The latter, along with regularity (in the sense of diffusions) of \(\partial \mathcal{S}\), will be sufficient to prove that \(\widehat{U}\in C^{1}((0,\infty )\times \mathbb{R})\), or equivalently \(\overline{U}\in C^{1}((0,\infty )^{2})\).

Let us introduce the first hitting times to \(\mathcal{S}\) and to \(\mathcal{S}^{\circ }:=\text{int }\mathcal{S}\) as

$$\begin{aligned} \sigma _{*}:=\inf \{t>0 : (\widehat{X}_{t},\widehat{Y}_{t})\in \mathcal{S}\}, \qquad \sigma ^{\circ }_{*}:=\inf \{t>0 : (\widehat{X}_{t},\widehat{Y}_{t}) \in \mathcal{S}^{\circ }\}. \end{aligned}$$

Notice that continuity of paths for \((\widehat{X},\widehat{Y})\) implies that \(\tau _{*}=\sigma _{*}\) \(\overline{P}_{x,y}\)-a.s. for all \((x,y)\in ([0,\infty )\times \mathbb{R})\setminus \partial \mathcal{C}\). It will be crucial to prove that the equality also holds at points of the boundary \((x,y)\in \partial \mathcal{C}\). For future reference, we define

$$\begin{aligned} y^{*}_{0}:=\inf \{y\in \mathbb{R}: (0,y)\in \mathcal{C}\} \qquad \text{(with $\inf \varnothing =\infty $)}. \end{aligned}$$
(5.8)

Lemma 5.3

Under Assumption 5.1, for all \((x,y)\in ([0,\infty )\times \mathbb{R})\setminus \{(0,y^{*}_{0})\}\), we have

$$\begin{aligned} \overline{P}_{x,y}[\sigma _{*}=\sigma _{*}^{\circ }]=1. \end{aligned}$$
(5.9)

Proof

The statement is trivial for \((x,y)\in \mathcal{S}^{\circ }\) and for \((x,y)\in \{0\}\times (-\infty ,y^{*}_{0})\) thanks to continuity of paths. It remains to consider \((x,y)\in \overline{\mathcal{C}}\setminus \{(0,y^{*}_{0})\}\), where \(\overline{\mathcal{C}}\) is the closure of \(\mathcal{C}\). First we notice that thanks to monotonicity of \(b\), we have

$$\begin{aligned} \overline{P}_{x,y}[\widehat{X}_{\sigma _{*}}>0]=1 \qquad \text{for all $(x,y)\in \overline{\mathcal{C}}\setminus \{(0,y^{*}_{0})\}$}. \end{aligned}$$
(5.10)

Indeed, if \(\mu _{1}+\mu _{0}\le 0\), (5.10) is obvious because \(\widehat{Y}\) is increasing. If \(\mu _{1}+\mu _{0}>0\), (5.10) holds because

$$\overline{P}_{x,y}[\widehat{X}_{\sigma _{*}}=0]=\overline{P}_{x,y}[( \widehat{X}_{\sigma _{*}},\widehat{Y}_{\sigma _{*}})=(0,y^{*}_{0})]=0 \qquad \text{for all $(x,y)\in \overline{\mathcal{C}}\setminus \{(0,y ^{*}_{0})\}$}. $$

Let us now prove (5.9) in \(\overline{\mathcal{C}}\setminus \{(0,y ^{*}_{0})\}\).

In the case \(\mu _{1}+\mu _{0}> 0\), the process \(\widehat{Y}\) has a negative drift and moves to the left at a constant rate during excursions of \(\widehat{X}\) away from \(x=0\). Since \(b\) is increasing, \(t\mapsto b(\widehat{Y}_{t})\) is decreasing during excursions of \(\widehat{X}\) away from \(x=0\). It then becomes straightforward to verify (5.9), due to the law of the iterated logarithm for Brownian motion and (5.10).

If \(\mu _{1}+\mu _{0}=0\), the process \(\widehat{Y}\) only increases at times \(t\) with \(\widehat{X}_{t}=0\); otherwise it stays constant. Then (5.9) holds due to (5.10) and because \(\widehat{X}\) immediately enters intervals of the form \((x',\infty )\) after reaching \(x'\) (i.e., \(x'\) is regular for \((x',\infty )\)).

If \(\mu _{1}+\mu _{0}<0\), the process \(\widehat{Y}\) increases. Moreover, during excursions of \(\widehat{X}\) away from \(x=0\), the rate of increase is constant. Recalling (5.10), we can therefore use Cox and Peskir [13, Corollary 8] to conclude that (5.9) indeed holds (see also a self-contained proof in a setting similar to ours in [18, Appendix B]). □

We say that a boundary point \((x,y)\in \partial \mathcal{C}\) is regular for the stopping set in the sense of diffusions if

$$\begin{aligned} \overline{P}_{x,y}[\sigma _{*}>0]=0 \end{aligned}$$
(5.11)

(see Blumenthal and Getoor [9, Chap. 1, Sect. 11]; see also De Angelis and Peskir [19] for a recent account on this topic). Notice that from the 0–1 law, if (5.11) fails, then \(\overline{P}_{x,y}[\sigma _{*}>0]=1\).

In case \(\mu _{0}+\mu _{1} \ge 0\), during excursions of \(\widehat{X}\) away from zero, the process \(\widehat{Y}\) is decreasing. So the couple \((\widehat{X},\widehat{Y})\) moves towards the left of the \((x,y)\)-plane during such excursions (or \(\widehat{Y}\) is just constant if \(\mu _{0}+\mu _{1}=0\)). Recalling that \(b(\cdot )\) is increasing, the law of the iterated logarithm implies that \(\overline{P}_{x_{0},y_{0}}[ \sigma _{*}>0]=0\) if \((x_{0},y_{0})\in \partial \mathcal{C}\) with \(x_{0}>0\). So we can claim:

Proposition 5.4

Assume \(\mu _{0}+\mu _{1} \ge 0\). Then all points \((x,y)\in \partial \mathcal{C}\) with \(x>0\) are regular for the stopping set, i.e., (5.11) holds.

To treat the regularity of \(\partial \mathcal{C}\) in the remaining case \(\mu _{0}+\mu _{1}<0\), we need to take a longer route because \((\widehat{X},\widehat{Y})\) is now moving towards the right of the \((x,y)\)-plane and in principle, when started from \(\partial \mathcal{C}\), it may ‘escape’ from the stopping set. We shall prove below that this is not the case. For that, we first need to show that smooth fit holds at the boundary. Notice that this is the classical concept of smooth fit, i.e., continuity of \(z\mapsto \widehat{U}_{x}(z,y)\). Smooth fit in this sense does not imply that \((x,y)\mapsto \widehat{U}_{x}(x,y)\) is continuous across the boundary, which we prove instead in Proposition 5.10.

Lemma 5.5

Assume \(\mu _{0}+\mu _{1} < 0\) and \(\rho \ge \frac{\theta }{2\sigma }| \mu _{0}+\mu _{1}|\). For each \(y\in \mathbb{R}\), we have \(\widehat{U} _{x}(\cdot ,y)\in C(0,\infty )\) (or equivalently \(\widehat{u}_{x}( \cdot ,y)\in C(0,\infty )\)).

Proof

From

$$\begin{aligned} \frac{1}{2}\sigma ^{2}\widehat{u}_{xx}(x,y) &= \rho g(x,y)+\rho \widehat{u}(x,y)-\mu _{0}\widehat{u}_{x}(x,y) +\frac{1}{2}(\mu _{0}+\mu _{1})\widehat{u}_{y}(x,y) \end{aligned}$$

for \((x,y)\in \mathcal{C}\), \(x>0\) and using (4.36) (which clearly implies Lipschitz-continuity of \(\widehat{U}\) as well), we see that for any bounded set \(B\), we must have that

$$\begin{aligned} \text{$\widehat{u}_{xx}$ is bounded on the closure of $B\cap \mathcal{C}$.} \end{aligned}$$
(5.12)

This fact is used later to justify the use of the Itô–Tanaka formula in (5.13).

We establish the smooth fit with an argument by contradiction. The first step is to recall that \(\widehat{u}_{x}\le 0\) in \(\mathcal{C}\) as verified in the proof of Proposition 4.12. Second, notice that any \((x_{0},y_{0})\in \partial \mathcal{C}\) must be of the form \((b(y_{0}),y_{0})\) due to continuity of \(y\mapsto b(y)\) (Proposition 5.2). Next, assume that for some \(y_{0}\) and \(x_{0}=b(y_{0})>0\), we have \(\widehat{u}_{x}(x_{0}-,y_{0})<-\delta _{0}\) for some \(\delta _{0}>0\), where \(\widehat{u}_{x}(x_{0}-,y_{0})\) exists due to (5.12). Take a bounded rectangular neighbourhood \(B\) of \((x_{0},y_{0})\) such that \(B\cap (\{0\}\times \mathbb{R})= \varnothing \) and let \(\tau _{B}:=\inf \{t\ge 0 : (\widehat{X}_{t}, \widehat{Y}_{t})\notin B\}\). Then from the supermartingale property of \(\widehat{U}\) (4.25), using that \(A_{\tau _{B}\wedge t}=0\) for all \(t\ge 0\) and recalling (5.2), we have

$$\begin{aligned} \widehat{u}(x_{0},y_{0})\ge \overline{E}_{x_{0},y_{0}}\left [e^{- \rho (\tau _{B}\wedge t)}\widehat{u}\big(X^{\circ }_{\tau _{B}\wedge t},Y ^{\circ }_{\tau _{B}\wedge t}\big)-\rho \int _{0}^{\tau _{B}\wedge t}e ^{-\rho s}g(X^{\circ }_{s},Y^{\circ }_{s})ds\right ]. \end{aligned}$$

Now we notice that \(t\mapsto Y^{\circ }_{\tau _{B}\wedge t}\) is increasing. Moreover, recalling (4.51), we have \(\widehat{u}_{y}\ge 0\) in \(\mathcal{C}\). This implies \(\widehat{u}(X^{\circ }_{\tau _{B}\wedge t},Y^{\circ }_{\tau _{B}\wedge t})\ge \widehat{u}(X^{\circ }_{\tau _{B}\wedge t},y_{0})\) \(\overline{P}_{x_{0},y_{0}}\)-a.s. Finally, observing that \(g\) is bounded on \(B\), we obtain

$$\begin{aligned} \widehat{u}(x_{0},y_{0})\ge \overline{E}_{x_{0},y_{0}}\big[e^{-\rho ( \tau _{B}\wedge t)}\widehat{u}(X^{\circ }_{\tau _{B}\wedge t},y_{0})-c ( \tau _{B}\wedge t)\big] \end{aligned}$$
(5.13)

for some \(c=c(B)>0\) that depends on the set \(B\) and will vary from line to line below.

As anticipated, we can now use the Itô–Tanaka formula in (5.13) thanks to (5.12). We let \(\mathcal{L}_{X}= \frac{1}{2}\sigma ^{2}\partial _{xx}+\mu _{0}\partial _{x}\), denote the local time of \(X^{\circ }\) at \(x_{0}\) by \(L^{x_{0}}\) and notice also that \(\widehat{u}_{xx}( \cdot , y_{0})=0\) for \(x>x_{0}\). Then we get

(5.14)

where in the final inequality, we used that \((\mathcal{L}_{X}-\rho ) \widehat{u}\) is bounded on \(B\). Letting \(t\to 0\), the inequality in (5.14) leads to a contradiction because in the limit, we have \(\overline{E}_{x_{0},y_{0}}[L^{x_{0}}_{\tau _{B}\wedge t}]\approx \sqrt{t}\) and \(\overline{E}_{x_{0},y_{0}}[\tau _{B}\wedge t]\approx t\) (the argument is similar to the one used to prove Proposition 4.9; see also e.g. [41, Lemma 13]). Hence the claim is proved. □
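The two rates invoked in the final step can be checked numerically in the simplest possible setting: for a driftless Brownian motion started at the level itself, Tanaka's formula gives \(\overline{E}[L^{0}_{t}]=\overline{E}|\sigma \overline{W}_{t}|=\sigma \sqrt{2t/\pi }\), of order \(\sqrt{t}\), while \(\overline{E}[\tau _{B}\wedge t]\) is trivially of order \(t\). A minimal Monte Carlo sketch (illustrative only; drift and the set \(B\) merely change constants for small \(t\)):

```python
import numpy as np

# E[L^0_t] = E|sigma*W_t| = sigma*sqrt(2t/pi) by Tanaka's formula:
# the expected local time grows like sqrt(t), whereas E[t] is of order t.
rng = np.random.default_rng(0)
sigma, N = 1.0, 400_000

est = {}
for t in (0.01, 0.04):
    W_t = np.sqrt(t) * rng.standard_normal(N)      # W_t ~ N(0, t)
    est[t] = np.abs(sigma * W_t).mean()            # Monte Carlo E[L^0_t]
    print(f"t={t}: E[L_t] ~ {est[t]:.4f}, "
          f"exact {sigma*np.sqrt(2*t/np.pi):.4f}, E[t] = {t}")
```

Quadrupling \(t\) only doubles the local-time term while the linear term quadruples, which is exactly why the inequality in (5.14) fails for small \(t\).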

Next we prove regularity of \(\partial \mathcal{C}\) in the sense of diffusions when \(\mu _{0}+\mu _{1}<0\).

Proposition 5.6

Assume \(\mu _{0}+\mu _{1}<0\) and \(\rho \ge \frac{\theta }{2\sigma }|\mu _{0}+\mu _{1}|\). Then all points \((x,y)\in \partial \mathcal{C}\) with \(x>0\) are regular for the stopping set, i.e., (5.11) holds.

Proof

The idea is to show that if \(\overline{P}_{x_{0},y_{0}}[\sigma _{*}>0]=1\) for some \((x_{0},y_{0})\in \partial \mathcal{C}\), then \(\widehat{u}_{x}(x_{0}-,y_{0})<0\), which contradicts Lemma 5.5.

1) Upper bound on \(\widehat{u}_{x}\): Let us start by fixing \((x,y)\in \mathcal{C}\). It is convenient to rewrite \(\widehat{u}\) in the following form. Let \(\tau _{\varepsilon }:=\inf \{t\ge 0 : \widehat{X} _{t}=\varepsilon \}\) for \(\varepsilon \ge 0\); then by the strong Markov property, we have

(5.15)

where we used that \((\widehat{X}_{t},\widehat{Y}_{t})=(X^{\circ }_{t},Y^{\circ }_{t})\) on \(\{t\le \tau _{\varepsilon }\}\) \(\overline{P}_{x,y}\)-a.s., with the notation of (5.2). Notice that \(\tau _{\varepsilon }\) is independent of \(y\) and therefore \(\tau _{\varepsilon }=\tau _{\varepsilon }(x)\). Moreover, due to (4.45), it is clear that

$$\begin{aligned} \tau _{0}(x-\varepsilon )=\tau _{\varepsilon }(x) \qquad \text{$\overline{P}$-a.s.} \end{aligned}$$
(5.16)

Now fix \(\varepsilon >0\), denote \(X^{\circ ,\varepsilon }_{t}=x- \varepsilon +\mu _{0}t+\sigma \overline{W}_{t}\) and \(\tau ^{\varepsilon }_{0}=\tau _{0}(x-\varepsilon )\)\(\overline{P}\)-a.s. and take \(\tau '=\tau _{*}(x,y)\), which is suboptimal for \(\widehat{u}(x-\varepsilon ,y)\). Then we obtain

(5.17)

Thanks to (5.16), we can replace \(\tau ^{\varepsilon }_{0}\) in (5.17) with \(\tau _{\varepsilon }\) as in (5.15). Then subtracting (5.17) from (5.15), we obtain

where the last inequality uses \(\widehat{u}_{x}\le 0\) in \(\mathcal{C}\) (cf. proof of Proposition 4.12) and \((\varepsilon ,Y ^{\circ }_{\tau _{\varepsilon }})\in \mathcal{C}\) on \(\{\tau '> \tau _{\varepsilon }\}\). Now we can divide by \(\varepsilon \) and let \(\varepsilon \to 0\). Using that \(\tau _{\varepsilon }\downarrow \tau _{0}\) and recalling \(\tau '=\tau _{*}(x,y)\), we obtain

$$\begin{aligned} \widehat{u}_{x}(x,y)\le -\rho \overline{E}_{x,y}\left [\int _{0}^{\tau _{*}\wedge \tau _{0}}e^{-\rho t}g_{x}(X^{\circ }_{t},Y^{\circ }_{t})dt\right ]. \end{aligned}$$
(5.18)

2) Non-smooth fit: Assume that \((x_{0},y_{0})\in \partial \mathcal{C}\) and \(\overline{P}_{x_{0},y_{0}}[\sigma _{*}>0]=1\). Take an increasing sequence \(x_{n}\uparrow x_{0}\) and denote \(\tau _{*}^{n}=\tau _{*}(x _{n},y_{0})\). Notice that we have \(\tau _{*}^{n}=\sigma _{*} ^{n}=\sigma _{*}(x_{n},y_{0})\) for all \(n\ge 1\) due to continuity of paths. Moreover, \(\sigma _{*}^{n}\) is decreasing in \(n\) with \(\sigma _{*}^{n}\ge \sigma _{*}=\sigma _{*}(x_{0},y_{0})\), because \(x\mapsto X^{x}_{t}\) is increasing and \(\mathcal{S}\) is of the form (5.1). Setting \(\tau _{0}^{n}=\inf \{t\ge 0 : X^{\circ ,n}_{t}=0 \}\), it is also easy to check that \(\tau ^{n}_{0}\uparrow \tau _{0}\) as \(n\to \infty \). Then denoting \(\sigma ^{\infty }:=\lim _{n\to \infty } \sigma ^{n}_{*}\), we have

$$\sigma ^{\infty }\wedge \tau _{0}=\lim _{n\to \infty }(\sigma ^{n}_{*} \wedge \tau _{0}^{n})\ge \sigma _{*}\wedge \tau _{0} \qquad \text{$\overline{P}$-a.s.} $$

Given that \(g_{x}\ge 0\), we can use monotone convergence and (5.18) to get

$$\begin{aligned} \widehat{u}_{x}(x_{0}-,y_{0}) &=\lim _{n\to \infty }\widehat{u}_{x}(x_{n},y_{0}) \\ & \le -\rho \overline{E}_{x_{0},y_{0}}\bigg[\int _{0}^{\sigma ^{\infty }\wedge \tau _{0}}e^{-\rho t}g_{x}(X^{\circ }_{t},Y^{\circ }_{t})dt\bigg]< 0, \end{aligned}$$
(5.19)

where the final inequality holds because \(\overline{P}_{x_{0},y_{0}}[ \sigma ^{\infty }\ge \sigma _{*}>0]=1\) by assumption. Since (5.19) contradicts Lemma 5.5, we have \(\overline{P} _{x_{0},y_{0}}[\sigma _{*}>0]=0\). □

As a corollary to Lemma 5.3 and Propositions 5.4 and 5.6, we have

Corollary 5.7

Under Assumption 5.1, for all \((x,y)\in ([0,\infty )\times \mathbb{R})\setminus (0,y^{*}_{0})\), we have

$$\begin{aligned} \overline{P}_{x,y}[\tau _{*}=\sigma _{*}=\sigma _{*}^{\circ }]=1. \end{aligned}$$

This corollary is important for determining continuity of the stopping times with respect to the initial position of the process at all points of the state space.

Proposition 5.8

Under Assumption 5.1, we have

$$\begin{aligned} \lim _{n\to \infty }\tau _{*}(x_{n},y_{n})=\tau _{*}(x,y) \qquad \overline{P}\textit{-a.s.} \end{aligned}$$
(5.20)

for any \((x,y) \in ([0,\infty ) \times \mathbb{R})\setminus (0,y^{*}_{0})\) and any sequence \((x_{n},y_{n})\to (x,y)\). In particular, for \((x,y)\in \partial \mathcal{C}\setminus (0,y^{*}_{0})\), the limit is zero.

Proof

Let us fix \((x,y)\in [0,\infty )\times \mathbb{R}\). For simplicity, in the rest of this proof, all stopping times depending on \((x_{n},y_{n})\) are denoted by \(\tau _{n}\), \(\sigma _{n}\) or \(\sigma ^{\circ }_{n}\), whereas those depending on \((x,y)\) are denoted by \(\tau \), \(\sigma \) or \(\sigma ^{\circ }\), as appropriate.

1) Lower semi-continuity: Here we show that

$$\begin{aligned} \liminf _{n\to \infty }\tau _{n}\ge \tau \qquad \text{$\overline{P}$-a.s.} \end{aligned}$$
(5.21)

Fix \(\omega \in \Omega \) outside of a nullset. If \(\tau (\omega )=0\), the result is trivial; so we assume that \(\tau (\omega )> \delta >0\). Then, recalling that the boundary is continuous (Proposition 5.2), there exists \(c_{\delta ,\omega }>0\) such that

$$\begin{aligned} b\big(\widehat{Y}^{x,y}_{t}(\omega )\big)-\widehat{X}^{x}_{t}(\omega )> c_{\delta ,\omega } \qquad \text{for all $t\in [0,\delta ]$.} \end{aligned}$$

Notice that the map \((t,x',y')\mapsto b(\widehat{Y}^{x',y'}_{t}( \omega ))-\widehat{X}^{x'}_{t}(\omega )\) is uniformly continuous on any compact \([0,\delta ]\times K\). So we can find \(\overline{n}_{\omega } \ge 1\) sufficiently large that for all \(n\ge \overline{n}_{\omega }\),

$$\begin{aligned} b\big(\widehat{Y}^{x_{n},y_{n}}_{t}(\omega )\big)-\widehat{X}^{x_{n}} _{t}(\omega )> c_{\delta ,\omega } \qquad \text{for all $t\in [0,\delta ]$}. \end{aligned}$$

This implies \(\liminf _{n\to \infty }\tau _{n}(\omega )\ge \delta \). Since \(\omega \), \(\delta \) were arbitrary, we obtain (5.21).

2) Upper semi-continuity: Here we show that

$$\begin{aligned} \limsup _{n\to \infty }\sigma ^{\circ }_{n}\le \sigma ^{\circ } \qquad \text{$\overline{P}$-a.s.} \end{aligned}$$
(5.22)

Fix \(\omega \in \Omega \) outside of a nullset. If \(\sigma ^{\circ }( \omega )=\infty \), the result is trivial; so assume \(\sigma ^{\circ }( \omega )<\delta \) for some \(\delta >0\). Then, recalling that the boundary is continuous (Proposition 5.2), there exists \(t\le \delta \) such that \(b(\widehat{Y}^{x,y}_{t}(\omega ))< \widehat{X}^{x}_{t}(\omega )\). By continuity of \((x',y') \mapsto b(\widehat{Y}^{x',y'}_{t}(\omega ))-\widehat{X}^{x'}_{t}( \omega )\), we can find \(\overline{n}_{\omega }\ge 1\) sufficiently large that for all \(n\ge \overline{n}_{\omega }\), we have \(b(\widehat{Y} ^{x_{n},y_{n}}_{t}(\omega ))<\widehat{X}^{x_{n}}_{t}(\omega )\). Hence \(\limsup _{n\to \infty }\sigma ^{\circ }_{n}\le \delta \). Since \(\omega \), \(\delta \) were arbitrary, (5.22) follows.

Combining Steps 1) and 2) with Corollary 5.7, we obtain (5.20). □
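The pathwise semicontinuity argument can be visualised numerically: fix a single discretised Brownian path, move its starting point, and track the first crossing of a boundary. The sketch below is purely illustrative; a constant level stands in for the moving boundary \(b(\widehat{Y})\), and all parameter values are made up:

```python
import numpy as np

# One fixed Brownian path; shifting the start point x_n up to x makes the
# first crossing time tau(x_n) decrease to tau(x), mirroring (5.21)/(5.22).
rng = np.random.default_rng(1)
dt, T, mu, sigma, level = 1e-4, 10.0, 2.0, 1.0, 1.0
n = int(T / dt)
# path started at 0 with drift mu: cumulative sum of Euler increments
path = np.cumsum(mu * dt + sigma * np.sqrt(dt) * rng.standard_normal(n))

def crossing_time(x):
    """First (discretised) time the path started at x reaches `level`."""
    hit = np.nonzero(x + path >= level)[0]
    return (hit[0] + 1) * dt if hit.size else np.inf

tau = crossing_time(0.2)
approx = [crossing_time(0.2 - 1.0 / n_) for n_ in (10, 100, 1000)]
print(tau, approx)   # nonincreasing in n, each >= tau
```

Since the crossing time is nonincreasing in the starting point, the approximating times decrease towards \(\tau \) along the fixed path, as in Steps 1) and 2) of the proof.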

In order to finally prove that \(\widehat{U}\in C^{1}((0,\infty ) \times \mathbb{R})\), we should like to have a fully probabilistic representation of \(\nabla _{x,y}\widehat{U}\). While obtaining \(\widehat{U}_{y}\) in (4.59) was relatively easy, we now need more care for \(\widehat{U}_{x}\). First of all, recalling the explicit dynamics of \((\widehat{X},\widehat{Y},A)\) from (4.14), (4.45) and (4.46) and denoting by \(\partial _{x}^{+}\) and \(\partial _{x} ^{-}\) the right and left partial derivatives with respect to \(x\), we observe that for all \((x,y)\in (0,\infty )\times \mathbb{R}\) and \(t\ge 0\), we have

(5.23)
(5.24)

where we also recall the notation \(S^{\mu _{0},\sigma }_{t}= \sup _{0\le s\le t}(-\mu _{0} s-\sigma \overline{W}_{s})\). Recalling \(y^{*}_{0}\) from (5.8) and \(\tau _{0}=\inf \{t\ge 0 : \widehat{X}_{t}=0\}\), the same arguments as in (5.10) and Corollary 5.7 give

$$\begin{aligned} \overline{P}_{x,y}[S^{\mu _{0},\sigma }_{\tau _{*}}=x]=\overline{P}_{x,y}[ \tau _{*}=\tau _{0}]=\overline{P}_{x,y}[(\widehat{X}_{\tau _{*}}, \widehat{Y}_{\tau _{*}})=(0,y^{*}_{0})]=0 \end{aligned}$$
(5.25)

for any \((x,y)\in ([0,\infty )\times \mathbb{R})\setminus (0,y^{*} _{0})\). Then for all \((x,y)\in (0,\infty )\times \mathbb{R}\) and \(\overline{P}\)-a.s., we have

(5.26)
(5.27)

Let us now obtain the probabilistic representation of \(\widehat{U} _{x}\).

Lemma 5.9

For all \((x,y)\in ((0,\infty )\times \mathbb{R})\setminus \partial \mathcal{C}\), we have

(5.28)

Proof

The result is trivial for \((x,y)\in \mathcal{S}^{\circ }\) because \(\tau _{*}=0\). For \((x,y)\in \mathcal{C}\), we recall that \(\widehat{U} _{x}\) is well defined (Lemma 4.8), take \(\varepsilon >0\) and denote by \(\tau =\tau _{*}(x,y)\) the optimal stopping time for \(\widehat{U}(x,y)\). For any \(t>0\), using the (super)martingale property (4.25), (4.26) gives

Dividing the above expressions by \(\varepsilon \), letting \(\varepsilon \to 0\) and using (4.36) and the right derivatives in (5.23), (5.24), (5.26) and (5.27), we find a lower bound for \(\widehat{U}_{x}\), namely

(5.29)

where (notice that \(c>0\) below is the same as in (4.36))

Using (4.37), it is not hard to verify that we have \(\lim _{t\to \infty }r(t,x,y)=0\) (notice that \(\widehat{U}(x,y) \ge g(x,y)\ge e^{\frac{\theta }{\sigma }(x+y)}\ge 0\)). Hence, taking limits as \(t\to \infty \) in (5.29) and recalling also (4.22), dominated convergence gives

In order to obtain an upper bound for \(\widehat{U}_{x}\), we can employ symmetric arguments, using again \(\tau =\tau _{*}(x,y)\), to estimate \(\varepsilon ^{-1}(\widehat{U}(x,y)-\widehat{U}(x-\varepsilon ,y))\). It is not hard to check that the upper bound is the same as the lower bound; hence (5.28) holds. □

Thanks to the continuity of the optimal stopping times and the probabilistic representations of \(\widehat{U}_{x}\) and \(\widehat{U} _{y}\), we can state our next result (see also [19] for general results in this direction).

Proposition 5.10

Under Assumption 5.1, we have \(\widehat{U}\in C^{1}((0,\infty )\times \mathbb{R})\).

Proof

Trivially \(\widehat{U}\in C^{1}\) in \(\mathcal{S}^{\circ }\) and moreover \(\widehat{U}\in C^{1}\) in \(\mathcal{C}\setminus (\{0\}\times \mathbb{R})\), due to Lemma 4.8. It only remains to prove that \(\nabla _{x,y}\widehat{U}\) is continuous across the boundary \(\partial \mathcal{C}\). Let us consider the case of \(\widehat{U}_{x}\), as the proof for \(\widehat{U}_{y}\) follows the same arguments.

Take \((x_{0},y_{0})\in \partial \mathcal{C}\) with \(x_{0}>0\) and a sequence \((x_{n},y_{n})_{n\ge 1}\) in \(\mathcal{C}\) converging to \((x_{0},y_{0})\) as \(n\to \infty \). Thanks to Proposition 5.8, we have \(\tau _{*}(x_{n},y_{n})\to \tau _{*}(x _{0},y_{0})=0\)\(\overline{P}\)-a.s. as \(n\to \infty \). To simplify notation, we let \(\tau _{n}:=\tau _{*}(x_{n},y_{n})\).

Fix \(t>0\) and notice that on \(\{\tau _{n}>t\}\), one has \((\widehat{X} _{t},\widehat{Y}_{t})\in \mathcal{C}\)\(\overline{P}_{x_{n},y_{n}}\)-a.s. so that \(\widehat{U}_{x}(\widehat{X} ^{x_{n}}_{t},\widehat{Y}^{x_{n},y_{n}}_{t})\) may be represented by using (5.28). Hence the tower property of conditional expectations and the Markov property allow us to write (5.28) as

(5.30)

Now we want to take limits as \(n\to \infty \) and use that \(\tau _{n} \to 0\) in (5.30) to show that \(\widehat{U}_{x}(x_{n},y_{n}) \to g_{x}(x_{0},y_{0})\). For that, first notice that and are continuous on \((-\infty ,0)\) and in particular at \(-x_{0}\). Since we also have

$$\lim _{n\to \infty }(S^{\mu _{0},\sigma }_{\tau _{n}}-x_{n}) = -x_{0}< 0, $$

we obtain that \(\overline{P}\)-a.s.,

Moreover, thanks to (4.36) and (4.22), we can invoke dominated convergence to take limits inside the expectations in (5.30). This gives

$$\begin{aligned} \lim _{n\to \infty }\widehat{U}_{x}(x_{n},y_{n})=\frac{\theta }{\sigma }\exp {\left (\frac{\theta }{\sigma }(x_{0}+y_{0})\right )}=g_{x}(x _{0},y_{0}), \end{aligned}$$

where we also used that .

Because \((x_{0},y_{0})\) and the sequence \((x_{n},y_{n})\) were arbitrary, we conclude that \(\widehat{U}_{x}\) is continuous across \(\partial \mathcal{C}\setminus (0,y^{*}_{0})\). Similar arguments applied to (4.59) allow to show that \(\widehat{U}_{y}\) is continuous across \(\partial \mathcal{C}\setminus (0,y^{*}_{0})\) as well. □

We have a simple corollary. Recall that \(\overline{\mathcal{C}}\) is the closure of \(\mathcal{C}\).

Corollary 5.11

Let Assumption 5.1 hold. Then we have \(\overline{U}\in C^{1}((0,\infty )^{2})\) and \(U\in C^{1}((0,\infty )\times (0,1))\). Moreover, \(\widehat{U}_{xx}\) is continuous on \(\overline{\mathcal{C}}\setminus (\{0\} \times \mathbb{R})\) with

$$\begin{aligned} \widehat{U}_{xx}(x,y)=\frac{2\rho }{\sigma ^{2}}g(x,y)+g_{xx}(x,y) \qquad \textit{for all}\ (x,y)\in \partial \mathcal{C},\ x>0. \end{aligned}$$
(5.31)

Proof

The first claim follows from Proposition 5.10, (4.47) and (4.19). For the second claim, we need

$$\begin{aligned} \frac{1}{2}\sigma ^{2}\widehat{u}_{xx}+\mu _{0}\widehat{u}_{x}- \frac{1}{2}(\mu _{0}+\mu _{1})\widehat{u}_{y}-\rho \widehat{u}=\rho g \qquad \text{in $\mathcal{C}$}, \end{aligned}$$

where \(\widehat{u}=\widehat{U}-g\). Letting \(\mathcal{C}\ni (x,y) \to (x_{0},y_{0}) \in \partial \mathcal{C}\) with \(x_{0}>0\) and using \(\widehat{u}_{x}=\widehat{u}_{y}=\widehat{u}=0\) on \(\partial \mathcal{C}\setminus (\{0\} \times \mathbb{R})\), the equation reduces to \(\frac{1}{2}\sigma ^{2}\widehat{u}_{xx}(x_{0},y_{0})=\rho g(x_{0},y_{0})\); since \(\widehat{U}_{xx}=\widehat{u}_{xx}+g_{xx}\), this gives (5.31). □

Remark 5.12

Notice that due to internal regularity results for parabolic PDEs (cf. [28, Chap. 3, Theorem 10]) and thanks to Lemma 4.8, we know that \(\widehat{U}\in C^{\infty }\) in \(\mathcal{C}\setminus (\{0\}\times \mathbb{R})\). This implies that also \(\overline{U}\) and \(U\) belong to \(C^{\infty }\) in \(\mathcal{C}\setminus (\{0\}\times \mathbb{R})\).

5.2 Reflection, creation and inverse of the boundary

Recall that we conjectured that the boundary condition (4.4) holds for \(U\) in (4.8). We now verify that this is indeed true, provided that we understand it in the limit as \(x\downarrow 0\) for each given \(\pi \in (0,1)\). Let us start by recalling that (4.19) holds with \(\varphi =\pi /(1-\pi )\). Then thanks to Remark 5.12, \(U\) satisfies

$$ \frac{1}{2}\sigma ^{2}U_{x}(0+,\pi )+ \sigma \theta \pi (1-\pi )U_{\pi }(0+, \pi ) + (\mu _{0}+\hat{\mu }\pi )U(0+,\pi )=0 $$
(5.32)

for \(\pi \in (0,1)\) such that \((0,\pi )\in \mathcal{C}\) if and only if

$$ \frac{1}{2}\sigma ^{2}\overline{U}_{x}(0+,\varphi ) +\hat{\mu }\varphi \overline{U}_{\varphi }(0+,\varphi ) +\mu _{0}\overline{U}(0+,\varphi )=0 $$
(5.33)

for all \(\varphi > 0\) such that \((0,\varphi ) \in \mathcal{C}\). Recalling that \(\widehat{U}(x,y)=\overline{U}(x,\exp \frac{ \theta }{\sigma }(x+y))\), we see that (5.33) holds if and only if

$$ \frac{1}{2}\sigma ^{2}(\widehat{U}_{x} + \widehat{U}_{y})(0+,y) + \mu _{0}\widehat{U}(0+,y) = 0 $$
(5.34)

for all \(y \in \mathbb{R}\) such that \((0,y) \in \mathcal{C}\). We refer to the boundary condition (5.32) as the reflection and creation condition. Notice that \(\{y \in \mathbb{R}:(0,y) \in \mathcal{C}\}\neq \varnothing \), as proved in Proposition 4.9.
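In fact, the equivalence of (5.33) and (5.34) is a direct chain-rule computation: writing \(\varphi =\exp (\frac{\theta }{\sigma }(x+y))\) and recalling \(\hat{\mu }=\sigma \theta \), we have

$$\begin{aligned} \widehat{U}_{x}=\overline{U}_{x}+\frac{\theta }{\sigma }\varphi \overline{U}_{\varphi }, \qquad \widehat{U}_{y}=\frac{\theta }{\sigma }\varphi \overline{U}_{\varphi }, \end{aligned}$$

so that

$$\begin{aligned} \frac{1}{2}\sigma ^{2}(\widehat{U}_{x}+\widehat{U}_{y})+\mu _{0}\widehat{U}=\frac{1}{2}\sigma ^{2}\overline{U}_{x}+\hat{\mu }\varphi \overline{U}_{\varphi }+\mu _{0}\overline{U}, \end{aligned}$$

and the two left-hand sides vanish simultaneously.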

Proposition 5.13

The boundary condition (5.32) holds.

Proof

We prove (5.34). Fix \(y\in \mathbb{R}\) with \((0,y)\in \mathcal{C}\) and take a sequence \(x_{n}\downarrow 0\) as \(n\to \infty \). Notice that \(\widehat{X}^{x_{n}}\) is decreasing in \(n\), whereas \(\widehat{Y}^{x_{n},y}\) is increasing in \(n\) thanks to (5.26) and (5.27). Then by Proposition 5.8 and the geometry of \(\mathcal{S}\), we have \(\tau _{*}(x_{n},y)\uparrow \tau _{*}(0,y)\)\(\overline{P}\)-a.s. For simplicity, we write \(\tau _{n}=\tau _{*}(x_{n},y)\) and \(\tau _{\infty }=\tau _{*}(0,y)\).

The idea is simply to take limits in the expressions of \(\widehat{U} _{x}\) and \(\widehat{U}_{y}\) (see (5.28) and (4.59)). For (5.28), we notice that \(S^{\mu _{0},\sigma }_{\tau _{n}}-x_{n} \uparrow S^{\mu _{0},\sigma }_{\tau _{\infty }}\) as \(n\to \infty \) and recall that \(\overline{P}[S^{\mu _{0},\sigma }_{\tau _{\infty }}= 0]=0\) by (5.25), since \(y>y^{*}_{0}\). Then \(\overline{P}\)-a.s., we have

(5.35)
(5.36)

Once again we use (4.22) to invoke dominated convergence, upon noticing that \(S^{\beta ,\sigma }_{\tau _{n}}\le S^{\beta , \sigma }_{\tau _{\infty }}\) for any \(\beta \). From (5.35) and (5.36), we then obtain (restoring the notation \(\tau _{\infty }= \tau _{*}\) under \(\overline{P}_{0,y}\)) that

Similarly, we get for \(\widehat{U}_{y}\) that

Combining the two expressions, we find that

where the last equality uses (4.43). □

With the goal of eventually going back to our original problem (4.8) in the \((x,\pi )\)-coordinates, we now need to consider the inverse of \(b(\cdot )\). In particular, recalling the increasing map \(x\mapsto \chi (x)\) from (4.57) and noticing that

$$\begin{aligned} x< b(y)\iff y>\chi (x), \end{aligned}$$

we conclude that \(\chi \) is the right-continuous inverse of \(b\), i.e.,

$$\begin{aligned} \chi (x)=\inf \{y\in \mathbb{R}: b(y)>x\}. \end{aligned}$$

From (4.57), we also obtain that \(x\mapsto \psi (x)\) is increasing and right-continuous with

$$\begin{aligned} \psi (x)=\exp \bigg(\frac{\theta }{\sigma }\big(\chi (x)+x\big)\bigg). \end{aligned}$$

We can therefore take the increasing, left-continuous inverse of \(\psi \),

$$\begin{aligned} c(\varphi )=\inf \{x>0 : \psi (x)\ge \varphi \}, \end{aligned}$$

and notice that

$$\varphi >\psi (x)\iff x< c(\varphi ). $$

Next we recall that \(\varphi =\pi /(1-\pi )\), and since \(\pi \mapsto \pi /(1-\pi )\) is increasing, we can define the optimal boundary in the \((x,\pi )\)-coordinates by setting

$$\begin{aligned} d(\pi ):=c\left (\frac{\pi }{1-\pi }\right )\big(=c(\varphi )\big). \end{aligned}$$

Clearly \(\pi \mapsto d(\pi )\) is left-continuous and increasing, and finally, we can define its right-continuous, increasing inverse

$$\begin{aligned} \lambda (x):=\inf \{\pi \in (0,1) : d(\pi )>x\}. \end{aligned}$$

Summarising the above, the sets \(\mathcal{C}\) and \(\mathcal{S}\) can be equivalently described in terms of \(d(\cdot )\), \(\lambda (\cdot )\), \(c(\cdot )\), \(\psi (\cdot )\), \(b(\cdot )\) or \(\chi (\cdot )\), depending on the chosen coordinates, i.e.,

$$\begin{aligned} \mathcal{C} &=\{(x,y) : y>\chi (x)\}=\{(x,y) : x< b(y)\} \\ &=\{(x,\varphi ) : \varphi >\psi (x)\}=\{(x,\varphi ) : x< c(\varphi )\} \\ &=\{(x,\pi ) : \pi >\lambda (x)\}=\{(x,\pi ) : x< d(\pi )\}, \end{aligned}$$
(5.37)
$$\begin{aligned} \mathcal{S} &=\{(x,y) : y\le \chi (x)\}=\{(x,y) : x\ge b(y)\} \\ &=\{(x,\varphi ) : \varphi \le \psi (x)\}=\{(x,\varphi ) : x\ge c( \varphi )\} \\ &=\{(x,\pi ) : \pi \le \lambda (x)\}=\{(x,\pi ) : x\ge d(\pi )\}. \end{aligned}$$
(5.38)

Before closing this section, we determine the limiting behaviour of the boundary \(d(\pi )\) as \(\pi \to \{0,1\}\). Let us recall the measure \(P^{\theta }\) introduced in (4.33) and the associated Brownian motion \(W^{\theta }\). Moreover, let us also consider

$$\begin{aligned} U^{\mu _{1}}(x)=\sup _{\tau \ge 0}E^{\theta }_{x}\Big[e^{\frac{2\mu _{1}}{ \sigma ^{2}}A_{\tau }-\rho \tau }\Big] \end{aligned}$$
(5.39)

which corresponds to problem (4.8) with \(\pi =1\) (notice that indeed \(\widehat{X}\) has drift \(\mu _{1}\) under \(P^{\theta }\)). It was shown in [16, Sect. 8.3] that (5.39) is the optimal stopping problem associated to the dividend problem with full information and drift of \(X^{D}\) equal to \(\mu _{1}\). It then follows from [16] that there is an optimal stopping boundary \(a^{*}>0\) that fully characterises the solution of (5.39) and the stopping set is \([a^{*},\infty )\) (an expression for \(a^{*}\) can be found in Schmidli [43, Theorem 2.53] with the notation \(m=\mu _{1}\) and \(\delta = \rho \)).
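For intuition, one can approximate (5.39) over threshold rules by simulation. The sketch below is illustrative only: it assumes the reflected dynamics \(\widehat{X}_{t}=x+\mu _{1}t+\sigma W_{t}+A_{t}\) under \(P^{\theta }\), with \(A\) the cumulative Skorokhod push at \(0\) (consistent with (4.14)), truncates the horizon, and scans hitting levels \(a\); all parameter values are made up:

```python
import numpy as np

# Illustrative Monte Carlo for (5.39): evaluate threshold rules
# tau_a = inf{t : X_t >= a} for Brownian motion with drift mu1 reflected
# at 0 (A = cumulative reflection push), then scan the level a.  The true
# optimiser a* solves a free-boundary problem (Schmidli [43, Thm 2.53]).
rng = np.random.default_rng(2)
mu1, sigma, rho = 0.5, 1.0, 0.25
dt, T, n_paths, x0 = 1e-2, 20.0, 2000, 0.0

def value_of_threshold(a):
    X = np.full(n_paths, x0)
    A = np.zeros(n_paths)
    payoff = np.zeros(n_paths)
    alive = np.ones(n_paths, dtype=bool)
    for k in range(int(T / dt)):
        step = mu1 * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)
        Xn = X + step
        push = np.maximum(0.0, -Xn)          # Skorokhod reflection at 0
        A = A + alive * push
        X = np.where(alive, Xn + push, X)
        hit = alive & (X >= a)
        payoff[hit] = np.exp(2 * mu1 / sigma**2 * A[hit] - rho * (k + 1) * dt)
        alive &= ~hit
        if not alive.any():
            break
    # paths that never reach a by T: truncated, heavily discounted value
    payoff[alive] = np.exp(2 * mu1 / sigma**2 * A[alive] - rho * T)
    return payoff.mean()

levels = [0.5, 1.0, 1.5, 2.0, 3.0]
values = [value_of_threshold(a) for a in levels]
print([round(v, 3) for v in values])
```

Scanning \(a\) over a grid locates (roughly) the maximising threshold, in line with the stopping set \([a^{*},\infty )\); the Monte Carlo error and the time-truncation make this a sanity check rather than a way to compute \(a^{*}\).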

We now notice that using Girsanov’s theorem and (4.30), we obtain from (4.19) that

$$\begin{aligned} U\big(x,\varphi /(1+\varphi )\big) &=\frac{\overline{U}(x,\varphi )}{1+ \varphi }=\lim _{n\to \infty }\frac{\overline{U}^{n}(x,\varphi )}{1+ \varphi } \\ &=\lim _{n\to \infty }\frac{\varphi }{1+\varphi } \sup _{\tau \le \zeta _{n}}\bigg(\frac{1}{\varphi }\overline{E}\Big[e ^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x}_{\tau }-\rho \tau }\Big]+E^{\theta }\Big[e^{\frac{2\mu _{1}}{\sigma ^{2}}A^{x}_{\tau }-\rho \tau }\Big] \bigg) \\ &=\frac{\varphi }{1+\varphi }\sup _{\tau \ge 0}\bigg(\frac{1}{\varphi }\overline{E}\Big[e^{\frac{2\mu _{0}}{\sigma ^{2}}A^{x}_{\tau }-\rho \tau }\Big]+E^{\theta }\Big[e^{\frac{2\mu _{1}}{\sigma ^{2}}A^{x}_{ \tau }-\rho \tau }\Big]\bigg). \end{aligned}$$

Letting \(\pi \to 1\) (or equivalently \(\varphi \to \infty \)), this yields

$$\begin{aligned} \lim _{\pi \to 1} U(x,\pi )=U^{\mu _{1}}(x) \qquad \text{for all $x\in [0,\infty )$}. \end{aligned}$$
(5.40)

We also need two simple facts, which follow from (4.19) and straightforward calculations. For all \((x,\pi ) \in \mathcal{O}\), we have

$$\begin{aligned} U_{x}(x,\pi )=\frac{1}{1+\varphi }\overline{U}_{x}(x,\varphi ), \qquad U_{\pi }(x,\pi )=-\overline{U}(x,\varphi )+(1+\varphi )\overline{U}_{ \varphi }(x,\varphi ). \end{aligned}$$

Thanks to (4.31) and (4.36), the above and (4.19) imply that there is a constant \(c>0\) such that

$$\begin{aligned} |U(x,\pi )|+|U_{x}(x,\pi )|+(1-\pi )|U_{\pi }(x,\pi )|\le c \qquad \text{for $(x,\pi )\in \mathcal{O}$}. \end{aligned}$$
(5.41)

We can now state our next result.

Proposition 5.14

Under Assumption 5.1, we have

$$\begin{aligned} \lim _{\pi \to 0}d(\pi )=0\quad \textit{and}\quad \lim _{\pi \to 1} d( \pi )=a^{*}, \end{aligned}$$

where\(a^{*}\)is the optimal boundary for (5.39).

Proof

1) Limit as \(\pi \to 1\): Recall that \(d(\cdot )\) is increasing and left-continuous. So

$$\begin{aligned} d(1)=\lim _{\pi \to 1}d(\pi ). \end{aligned}$$
(5.42)

Thanks to (5.41), we have

$$\begin{aligned} \left |U\big(d(\pi ),\pi \big) - U^{\mu _{1}}\big(d(1)\big)\right | & \le \left |U\big(d(\pi ),\pi \big) - U\big(d(1),\pi \big)\right | \\ & \phantom{=:}+ \left |U\big(d(1),\pi \big) - U^{\mu _{1}}\big(d(1) \big)\right | \\ & \le c \big(d(1)-d(\pi )\big)+\left |U\big(d(1),\pi \big)-U^{\mu _{1}} \big(d(1)\big)\right |. \end{aligned}$$

Recall that \(U(d(\pi ),\pi )=1\) for all \(\pi \in (0,1)\). Hence taking limits as \(\pi \uparrow 1\) in the expression above and using (5.40) and (5.42), we obtain

$$\begin{aligned} 1=\lim _{\pi \to 1} U\big(d(\pi ),\pi \big)=U^{\mu _{1}}\big(d(1)\big). \end{aligned}$$

This implies \(d(1)\ge a^{*}\) by the definition of \(a^{*}\).

Let us now assume that \(d(1)>a^{*}\) and take an interval \([x_{1},x_{2}]\subseteq (a^{*},d(1))\). Pick an arbitrary positive function \(\phi \in C^{\infty }_{c}(x_{1},x_{2})\) with \(\int _{\mathbb{R}_{+}} \phi (\zeta )d\zeta =1\). Rewriting (4.52) in \((x,\pi )\)-coordinates gives \((\mathcal{L}_{X,\pi }-\rho )U=0\) in \(\mathcal{C}\). By left-continuity of \(d(\cdot )\), we can choose \(\varepsilon >0\) sufficiently small such that \(\mathcal{R}_{\varepsilon }:=[x_{1},x_{2}]\times [1-\varepsilon ,1)\subseteq \mathcal{C}\) and

$$\phi (x)(\mathcal{L}_{X,\pi }U-\rho U)(x,\pi )=0\qquad \text{for $(x,\pi )\in \mathcal{R}_{\varepsilon }$}. $$

Integration by parts gives

$$\begin{aligned} 0 &=\int ^{x_{2}}_{x_{1}}\phi (\zeta )(\mathcal{L}_{X,\pi }-\rho )U( \zeta ,\pi )d\zeta \\ &=\int ^{x_{2}}_{x_{1}} U(\zeta ,\pi )(\mathcal{G}-\rho )\phi (\zeta )d\zeta \\ & \phantom{=:}+\pi (1-\pi )\int ^{x_{2}}_{x_{1}}\bigg(\frac{1}{2}\theta ^{2}\pi (1-\pi )U_{\pi \pi }(\zeta ,\pi )+\hat{\mu }U_{x\pi }(\zeta , \pi )\bigg)\phi (\zeta ) d\zeta , \end{aligned}$$
(5.43)

where \(\mathcal{G}=\frac{1}{2}\sigma ^{2}\frac{\partial ^{2}}{\partial x^{2}}-(\mu _{0}+\hat{\mu }\pi )\frac{\partial }{\partial x}\). Set

$$F_{\phi }(\pi ):=\int ^{x_{2}}_{x_{1}}\bigg(\frac{1}{2}\theta ^{2} \pi (1-\pi )U_{\pi \pi }(\zeta ,\pi )+\hat{\mu }U_{x\pi }(\zeta , \pi )\bigg)\phi (\zeta ) d\zeta $$

and let \(\pi \to 1\) in (5.43). Then (5.40) and dominated convergence give

$$\begin{aligned} \lim _{\pi \to 1}\pi (1-\pi )F_{\phi }(\pi )=-\int ^{x_{2}}_{x_{1}} U ^{\mu _{1}}(\zeta )(\mathcal{G}-\rho )\phi (\zeta )d\zeta . \end{aligned}$$

Since \(U^{\mu _{1}}(x)=1\) for \(x\in (x_{1},x_{2})\), undoing the integration by parts yields

$$\begin{aligned} \lim _{\pi \to 1}\pi (1-\pi )F_{\phi }(\pi )=\rho , \end{aligned}$$

which says that \(F_{\phi }(\pi )\) behaves as \(\rho /(1-\pi )\) for \(\pi \to 1\). This implies that

$$\begin{aligned} \int ^{1}_{1-\varepsilon }F_{\phi }(\pi )d\pi =\infty , \end{aligned}$$
(5.44)

and we now show that (5.44) is impossible. For \(\varepsilon >0\) as above and \(0<\delta <\varepsilon \), Fubini’s theorem and integration by parts give

$$\begin{aligned} &\int ^{1-\delta }_{1-\varepsilon }F_{\phi }(\pi )d\pi \\ &=\int _{x_{1}}^{x_{2}}\bigg(\int ^{1-\delta }_{1-\varepsilon }\Big( \frac{1}{2}\theta ^{2}\pi (1-\pi )U_{\pi \pi }(\zeta ,\pi )+\hat{\mu }U _{x\pi }(\zeta ,\pi )\Big)d\pi \bigg) \phi (\zeta )d\zeta \\ &=\frac{1}{2}\theta ^{2}\int _{x_{1}}^{x_{2}}\bigg(\Big(\pi (1-\pi )U _{\pi }(\zeta ,\pi )-(1-2\pi )U(\zeta ,\pi )\Big)\Big|^{\pi =1-\delta }_{\pi =1-\varepsilon } \\ & \phantom{=:\frac{1}{2}\theta ^{2}\int _{x_{1}}^{x_{2}}\bigg(}-2 \int _{1-\varepsilon }^{1-\delta }U(\zeta ,\pi )d\pi \bigg)\phi ( \zeta )d\zeta \\ & \phantom{=:}+\hat{\mu }\int _{x_{1}}^{x_{2}} U_{x}(\zeta ,\pi )\big|^{ \pi =1-\delta }_{\pi =1-\varepsilon } \,\phi (\zeta )d\zeta \le c', \end{aligned}$$

where the last inequality uses (5.41) and \(c'>0\) is independent of \(\delta \). Letting \(\delta \to 0\), we reach a contradiction with (5.44).

2) Limit as \(\pi \to 0\): The proof follows the same steps as above. Assume that \(d(0+):=\lim _{\pi \to 0}d(\pi ) >0\). Then take a closed interval \([x_{1},x_{2}]\subseteq (0,d(0+))\) and an arbitrary positive function \(\phi \in C^{\infty }_{c}(x_{1},x_{2})\) with \(\int _{\mathbb{R}_{+}}\phi (\zeta )d\zeta =1\). Repeating the same steps as above, we write (5.43) and notice that (iii) in Proposition 4.4 implies that \(\lim _{\pi \to 0}U(x,\pi )=1\) for all \(x\ge 0\). Hence taking \(\pi \to 0\) in (5.43) gives

$$\begin{aligned} \lim _{\pi \to 0}\pi (1-\pi )F_{\phi }(\pi )=\rho , \end{aligned}$$

which also implies \(\int ^{\varepsilon }_{0}F_{\phi }(\pi )d\pi = \infty \). The latter leads to a contradiction, exactly as in 1) above. □

Using (5.37) and (5.38), we can conclude that the boundaries \(c\) and \(b\) are also bounded above by \(a^{*}\) and have the same limits.

Corollary 5.15

We have \(0 \le c(\varphi ) \le a^{*}\) for \(\varphi \in (0,\infty )\) and \(0 \le b(y) \le a^{*}\) for \(y \in \mathbb{R}\). Moreover,

$$\lim _{\varphi \to 0}c(\varphi )=\lim _{y\to -\infty }b(y)=0\quad \textit{and}\quad \lim _{\varphi \to \infty }c(\varphi )=\lim _{y\to \infty }b(y)=a^{*}. $$

6 Solution of the dividend problem

At this point, we can construct a candidate for the value function \(V\) in (2.6) by setting

$$\begin{aligned} v(x,\pi ):=\int _{0}^{x} U(\zeta ,\pi )d\zeta , \qquad (x,\pi )\in \overline{\mathcal{O}}. \end{aligned}$$
(6.1)

Thanks to Corollary 5.11 and dominated convergence, we immediately obtain

Corollary 6.1

Under Assumption 5.1, the function \(v\) belongs to \(C(\overline{\mathcal{O}})\cap C^{1}(\mathcal{O})\). Moreover, \(v_{xx}\) and \(v_{x\pi }\) are continuous in \(\mathcal{O}\).

In order to apply Theorem 3.1, it remains to show that \(v_{\pi \pi }\in L^{\infty }_{\mathrm{loc}}(\mathcal{O})\) and \(v _{\pi \pi }\in C(\overline{\mathcal{C}}\cap \mathcal{O})\). This is a nontrivial task and relies on a semi-explicit characterisation of the weak derivative \(v_{\pi \pi }\).

Proposition 6.2

Let Assumption 5.1 hold. The function \(v\) in (6.1) admits a weak derivative \(v_{\pi \pi }\in L^{\infty }_{\mathrm{loc}}(\mathcal{O})\). Moreover, we can select an element of the equivalence class of \(v_{\pi \pi }\in L^{\infty }_{\mathrm{loc}}(\mathcal{O})\) (denoted again by \(v_{\pi \pi }\)) given by

$$\begin{aligned} v_{\pi \pi }(x,\pi ) &=2\bigg(\rho \int _{0}^{x\wedge d_{+}(\pi )}U( \zeta ,\pi )d\zeta -\frac{1}{2}\sigma ^{2}U_{x}\big(x\wedge d_{+}( \pi ),\pi \big) \\ & \phantom{=:2\big(}-\hat{\mu }\pi (1-\pi )U_{\pi }\big(x\wedge d_{+}( \pi ),\pi \big) \\ & \phantom{=:2\big(}-(\mu _{0}+\hat{\mu }\pi )U\big(x\wedge d_{+}( \pi ),\pi \big)\bigg)\big(\theta \pi (1-\pi )\big)^{-2}, \end{aligned}$$
(6.2)

with \(d_{+}(\pi ):=\lim _{\varepsilon \downarrow 0}d(\pi +\varepsilon )\).

Proof

Since \(v_{\pi }(x,\,\cdot \,)\) is a continuous function for all \(x>0\), we say as usual that its weak derivative with respect to \(\pi \) is a function \(f\in L^{1}_{\mathrm{loc}}(\mathcal{O})\) such that for any \(\phi \in C^{\infty }_{c}(0,1)\), we have

$$\begin{aligned} \int _{0}^{1}v_{\pi }(x,z)\phi '(z)dz=-\int _{0}^{1}f(x,z)\phi (z)dz. \end{aligned}$$

Our aim is to compute \(f\), show that it equals the right-hand side of (6.2) and therefore conclude that \(f\in L^{\infty }_{ \mathrm{loc}}(\mathcal{O})\), due to \(U\in C^{1}(\mathcal{O})\).

Recalling that \(U_{\pi }=0\) in \(\mathcal{S}\) and that \(x< d(\pi )\) is equivalent to \(\pi > \lambda (x)\) (cf. (5.37)), using Fubini’s theorem allows us to write

$$\begin{aligned} &\int _{0}^{1}v_{\pi }(x,z)\phi '(z)dz \\ &=\int _{0}^{1} \bigg(\int _{0}^{x\wedge d(z)}U_{\pi }(\zeta ,z)d \zeta \bigg)\phi '(z)dz \\ &=\int _{0}^{x} \bigg(\int _{\lambda (\zeta )}^{1} U_{\pi }(\zeta ,z) \phi '(z)dz \bigg)d\zeta \\ &=\int _{0}^{x}\bigg(U_{\pi }(\zeta ,1)\phi (1)-U_{\pi }\big(\zeta , \lambda (\zeta )\big)\phi \big(\lambda (\zeta )\big) - \int _{\lambda (\zeta )}^{1} U_{\pi \pi }(\zeta ,z)\phi (z)dz \bigg)d \zeta \\ &=-\int _{0}^{x}\bigg(\int _{\lambda (\zeta )}^{1} U_{\pi \pi }(\zeta ,z)\phi (z)dz \bigg)d\zeta , \end{aligned}$$
(6.3)

where the final equality holds because \(U_{\pi }(\zeta ,\lambda ( \zeta ))=0\) for all \(\zeta \in (0,x)\) and \(\phi (1)=0\). Now we rewrite the last expression by using that

$$\begin{aligned} \frac{1}{2}\theta ^{2}\pi ^{2}(1-\pi )^{2} U_{\pi \pi } &=-\frac{1}{2} \sigma ^{2}U_{xx} - \hat{\mu }\pi (1-\pi )U_{x\pi } \\ & \phantom{=:}-(\mu _{0} + \hat{\mu }\pi )U_{x}+\rho U \qquad \text{in $\mathcal{C}\setminus \big(\{0\} \times (0,1)\big)$}, \end{aligned}$$

thanks to (4.52) written in \((x,\pi )\)-coordinates. Hence, using Fubini’s theorem again, we get

$$\begin{aligned} &-\int _{0}^{x}\left (\int _{\lambda (\zeta )}^{1} U_{\pi \pi }(\zeta ,z) \phi (z)dz \right )d\zeta \\ &=2\int _{0}^{1} \bigg(\int _{0}^{x\wedge d(z)}\Big(\frac{1}{2}\sigma ^{2}U_{xx}(\zeta ,z)+\hat{\mu }z(1-z)U_{x\pi }(\zeta ,z)+(\mu _{0}+ \hat{\mu }z)U_{x}(\zeta ,z) \\ & \phantom{=:2\int _{0}^{1} \bigg(\int _{0}^{x\wedge d(z)}\Big(} -\rho U(\zeta , z)\Big)d\zeta \bigg)\big(\theta z(1-z)\big)^{-2} \phi (z)dz. \end{aligned}$$
(6.4)

Now consider the integral with respect to \(\zeta \) and notice that we need only look at \(z\in [0,1]\) with \(d(z)>0\), as otherwise the integral is zero. Using (5.32) for \(U\) gives

$$\begin{aligned} &\int _{0}^{x\wedge d(z)} \bigg(\frac{1}{2}\sigma ^{2}U_{xx}(\zeta ,z) + \hat{\mu }z(1 - z)U_{x\pi }(\zeta ,z) + (\mu _{0} + \hat{\mu }z)U_{x}( \zeta ,z)\bigg)d\zeta \\ &=\frac{1}{2}\sigma ^{2}\Big(U_{x}\big(x\wedge d(z),z\big) - U_{x}(0+,z) \Big) \\ & \phantom{=:}+\hat{\mu }z(1 - z)\Big(U_{\pi }\big(x\wedge d(z),z\big) - U_{\pi }(0+,z)\Big) \\ & \phantom{=:}+(\mu _{0}+\hat{\mu }z)\Big(U\big(x\wedge d(z),z\big)-U(0+,z) \Big) \\ &=\frac{1}{2}\sigma ^{2}U_{x}\big(x\wedge d(z),z\big)+\hat{\mu }z(1-z)U _{\pi }\big(x\wedge d(z),z\big) \\ & \phantom{=:}+(\mu _{0}+\hat{\mu }z)U\big(x\wedge d(z),z\big). \end{aligned}$$
(6.5)

Combining (6.3)–(6.5), we get

$$\begin{aligned} \int _{0}^{1}v_{\pi }(x,z)\phi '(z)dz &=2\int _{0}^{1}\bigg(\frac{1}{2} \sigma ^{2}U_{x}\big(x\wedge d(z),z\big)+\hat{\mu }z(1-z)U_{\pi } \big(x\wedge d(z),z\big) \\ & \phantom{=:2\int _{0}^{1}\big(}+(\mu _{0}+\hat{\mu }z)U\big(x\wedge d(z),z \big)-\rho \!\int _{0}^{x\wedge d(z)}\!U(\zeta ,z)d\zeta \bigg) \\ & \phantom{=:2\int _{0}^{1}} \times \big(\theta z(1-z)\big)^{-2}\phi (z)dz, \end{aligned}$$

from which we deduce

$$\begin{aligned} f(x,\pi ) &=2\bigg(\rho \int _{0}^{x\wedge d(\pi )} U(\zeta ,\pi )d \zeta - \frac{1}{2}\sigma ^{2}U_{x}\big(x \wedge d(\pi ),\pi \big) \\ & \phantom{=:2\big(}- \hat{\mu }\pi (1 - \pi )U_{\pi }\big(x \wedge d( \pi ),\pi \big)-(\mu _{0}+\hat{\mu }\pi )U\big(x\wedge d(\pi ),\pi \big)\bigg) \\ & \phantom{=:2}\times \big(\theta \pi (1-\pi )\big)^{-2}. \end{aligned}$$

Finally, notice that \(\pi \mapsto d(\pi )\) has at most countably many jumps for \(\pi \in [0,1]\), so that \(f(x,\pi )=\lim _{\varepsilon \to 0}f(x,\pi +\varepsilon )\) for a.e. \(\pi \in [0,1]\). Moreover, let \((\pi ^{J}_{k})_{k\ge 1}\) be the collection of jump points of \(d\) and denote

$$\mathcal{N}:=\bigcup _{k\ge 1}\Big(\big[d(\pi ^{J}_{k}),\infty \big) \times \{\pi ^{J}_{k}\}\Big). $$

Then

$$f(x,\pi )=\lim _{\varepsilon \to 0}f(x,\pi +\varepsilon )\qquad \text{for $(x,\pi )\in \mathcal{O}\setminus \mathcal{N}$}. $$

Since \(\mathcal{N}\) has zero Lebesgue measure in \(\mathcal{O}\), we conclude that (6.2) holds. □

In the remainder of the paper, we always consider the representative of \(v_{\pi \pi }\) given by the expression in (6.2). From (6.2) and \(U\in C^{1}(\mathcal{O})\), we derive the next result.

Corollary 6.3

Under Assumption 5.1, the function \(v_{\pi \pi }\) in (6.2) is continuous in \(\overline{\mathcal{C}}\cap \mathcal{O}\).

Proof

It is sufficient to notice that for any \((x,\pi )\in \overline{ \mathcal{C}}\cap \mathcal{O}\), we have \(x\le d_{+}(\pi )\). Hence

$$\begin{aligned} v_{\pi \pi }(x,\pi ) &=2\bigg(\rho \int _{0}^{x}U(\zeta ,\pi )d\zeta - \frac{1}{2}{\sigma ^{2}}U_{x}(x,\pi ) \\ & \phantom{=:2\bigg(}\!-\hat{\mu }\pi (1-\pi )U_{\pi }(x,\pi )-(\mu _{0}+\hat{\mu }\pi )U(x,\pi )\bigg)\big(\theta \pi (1-\pi )\big)^{-2} \end{aligned}$$

for all \((x,\pi )\in \overline{\mathcal{C}}\cap \mathcal{O}\). Continuity of \(v_{\pi \pi }\) now follows from \(U\in C^{1}(\mathcal{O})\). □

Now that we have a candidate solution for the variational problem in Theorem 3.1, we would also like to construct a candidate optimal control. Recalling \(\mathcal{I}_{v}\) from (3.4) and noticing that \(v_{x}=U\), we immediately see that \(\mathcal{I}_{v}= \mathcal{C}\). Then, given \((x,\pi )\in \mathcal{O}\), we define \(P_{x,\pi }\)-a.s. the process

$$\begin{aligned} \widehat{D}_{t}:=\sup _{0\le s\le t}\big(X_{s}-d(\pi _{s})\big)^{+}, \end{aligned}$$
(6.6)

where we recall that \(X\) has the uncontrolled dynamics

$$X_{t}=x+\int _{0}^{t}(\mu _{0}+\hat{\mu }\pi _{s})ds+\sigma W_{t} \qquad \text{$P_{x,\pi }$-a.s.} $$

We also recall the notation \(\gamma ^{\widehat{D}}:=\inf \{t\ge 0 : X ^{\widehat{D}}_{t}\le 0\}\).
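To make the construction concrete, here is a minimal Euler-scheme sketch in Python. It relies on assumptions beyond what this section states: that \(\pi \) solves the filtering SDE \(d\pi _{t}=\theta \pi _{t}(1-\pi _{t})dW_{t}\) with \(\theta =\hat{\mu }/\sigma \), driven by the same innovation Brownian motion as \(X\), and that \(d(\pi )=2\pi \) is a hypothetical increasing boundary chosen purely for illustration. By construction of the running supremum in (6.6), the controlled path \(X^{\widehat{D}}=X-\widehat{D}\) never exceeds \(d(\pi )\), mirroring (6.7).

```python
import numpy as np

# Euler sketch of the candidate control D̂_t = sup_{s<=t} (X_s - d(pi_s))^+
# from (6.6).  Assumptions beyond this section: pi follows the filtering SDE
# d(pi) = theta*pi*(1-pi) dW, theta = mu_hat/sigma, with the same Brownian
# motion W driving X; d(.) below is a hypothetical increasing boundary.

rng = np.random.default_rng(0)
mu0, mu_hat, sigma = -0.5, 1.0, 0.3   # drift is mu0 (low) or mu0 + mu_hat (high)
theta = mu_hat / sigma                # signal-to-noise ratio

def d(pi):
    return 2.0 * pi                   # hypothetical increasing boundary

T, n = 1.0, 100_000
dt = T / n
X, pi = 1.0, 0.5                      # start on the boundary: d(0.5) = 1.0 = X_0
Dhat = max(X - d(pi), 0.0)            # D̂_0
max_violation = X - Dhat - d(pi)

for _ in range(n):
    dW = rng.normal(0.0, np.sqrt(dt))
    X += (mu0 + mu_hat * pi) * dt + sigma * dW    # uncontrolled revenues X
    pi += theta * pi * (1.0 - pi) * dW            # posterior of the high drift
    pi = min(max(pi, 0.0), 1.0)                   # keep pi in [0, 1]
    Dhat = max(Dhat, max(X - d(pi), 0.0))         # running supremum (6.6)
    max_violation = max(max_violation, X - Dhat - d(pi))
    if X - Dhat <= 0.0:                           # absorption time gamma^D̂
        break

# The controlled path X - D̂ stays (weakly) below the boundary, as in (6.7)
print(f"max of X^D - d(pi) along the path: {max_violation:.3f}")
```

Note that \(\widehat{D}\) increases precisely when \(X-\widehat{D}\) attempts to cross the boundary, so the inequality `X - Dhat <= d(pi)` holds at every step by construction.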

Some of the arguments in the proof of the next lemma are borrowed from De Angelis et al. [17, Sect. 5].

Lemma 6.4

Let Assumption 5.1 hold. The process \(\widehat{D}\) in (6.6) belongs to \(\mathcal{A}\) (i.e., it is admissible). The triple \((X^{\widehat{D}}_{t}, \widehat{D}_{t}, \pi _{t})_{t\ge 0}\) solves the Skorokhod reflection problem in \(\mathcal{C}\), that is, for \(P_{x,\pi }\)-a.e. \(\omega \in \Omega \) and all \(0 \le t \le \gamma ^{\widehat{D}}(\omega )\), we have

$$\begin{aligned} (X^{\widehat{D}}_{t}, \pi _{t}) &\in \overline{\mathcal{C}}, \end{aligned}$$
(6.7)
$$\begin{aligned} \widehat{D}_{t}&=\int _{0}^{t}\mathbf{1}_{\{X^{\widehat{D}}_{s-}\ge d(\pi _{s})\}}\,d\widehat{D}_{s}, \end{aligned}$$
(6.8)
$$\begin{aligned} \Delta \widehat{D}_{t}>0 \ &\Longrightarrow \ \Delta d(\pi _{t})\neq 0, \end{aligned}$$
(6.9)

with \(\Delta \widehat{D}_{t}:=\widehat{D}_{t}-\widehat{D}_{t-}\).

Proof

It is immediate to see that \(\widehat{D}\) is increasing and adapted to \((\mathcal{F}_{t})_{t\ge 0}\). Then it also admits left limits at all points. In order to prove right-continuity of paths, we observe that \(d(\cdot )\) is increasing and left-continuous, hence lower semi-continuous. It then follows that \(t\mapsto X_{t}-d(\pi _{t})\) is \(P_{x,\pi }\)-a.s. upper semi-continuous. Now obviously \(\lim _{\varepsilon \to 0}\widehat{D}_{t+\varepsilon }\ge \widehat{D} _{t}\), and the converse inequality follows from

$$\begin{aligned} \lim _{\varepsilon \to 0}\widehat{D}_{t+\varepsilon } &= \lim _{\varepsilon \to 0}\Big(\widehat{D}_{t}\vee \sup _{t< s\le t+\varepsilon }\big(X_{s}-d(\pi _{s})\big)^{+}\Big) \\ &=\widehat{D}_{t}\vee \limsup _{\varepsilon \to 0}\big(X_{t+\varepsilon }-d(\pi _{t+\varepsilon })\big)^{+}\le \widehat{D}_{t}\vee \big(X_{t}-d( \pi _{t})\big)^{+}=\widehat{D}_{t}. \end{aligned}$$

Hence \(\widehat{D}\in \mathcal{A}\).

Let us turn to the study of the Skorokhod reflection problem. Since \(\pi \) is unaffected by \(\widehat{D}\), we have

$$d(\pi _{t})-X^{\widehat{D}}_{t}=d(\pi _{t})-X_{t}+\widehat{D}_{t}\ge 0 \qquad \text{for all $t\ge 0$, $P_{x,\pi }$-a.s.,} $$

where the final inequality follows from (6.6). Recalling that \(x< d(\pi )\) if and only if \((x,\pi )\in \mathcal{C}\), we deduce that (6.7) holds. It remains to prove (6.8). Fix \(\omega \in \Omega \) (outside of a nullset) and \(t_{1}>0\). If \(X^{\widehat{D}} _{t_{1}-}(\omega )< d(\pi _{t_{1}}(\omega ))\), then \(X^{\widehat{D}}=X- \widehat{D}\) implies \(\widehat{D}_{t_{1}-}(\omega )=\widehat{D}_{t_{1}}(\omega )>X_{t_{1}}(\omega )-d( \pi _{t_{1}}(\omega ))\). Combined with the upper semi-continuity of the map \(t\mapsto X_{t}-d(\pi _{t})\), this gives \(\varepsilon _{\omega }:=\varepsilon (\omega ,t_{1})>0\) such that

$$\sup _{t_{1}< s\le t_{1}+\varepsilon _{\omega }}\Big(X_{s}(\omega )-d \big(\pi _{s}(\omega )\big)\Big)^{+}\le \widehat{D}_{t_{1}}(\omega ). $$

Hence for all \(s\in [t_{1},t_{1}+\varepsilon _{\omega }]\), we have

$$\widehat{D}_{s}(\omega )=\widehat{D}_{t_{1}}(\omega )\vee \sup _{t_{1}< u\le s}\Big(X_{u}(\omega )-d \big(\pi _{u}(\omega )\big)\Big)^{+}=\widehat{D}_{t_{1}}(\omega ), $$

which proves (6.8) for all \(0< t\le \gamma ^{\widehat{D}}(\omega )\). By right-continuity, the result extends to \(0\le t\le \gamma ^{\widehat{D}}(\omega )\). Finally, it follows from (6.6) that jumps of \(\widehat{D}\) may only occur along vertical jumps of the boundary \(d\); hence (6.9) holds. □

We can finally conclude the section by providing the solution of the dividend problem with partial information.

Theorem 6.5

Recall \(V\) from (2.6) and \(\widehat{D}\) from (6.6), and let Assumption 5.1 hold. Then we have

$$\begin{aligned} V(x,\pi )=\int _{0}^{x} U(\zeta ,\pi )d\zeta , \qquad (x,\pi )\in \overline{\mathcal{O}}, \end{aligned}$$

and\(D^{*}=\widehat{D}\)is an optimal control.

Proof

We need to check that \(v\) in (6.1) fulfils the assumptions of Theorem 3.1. It is immediate from (5.41) that \(0\le v(x,\pi )\le c x\); hence \(v(0,\pi )=0\). Moreover, Corollary 6.1, Proposition 6.2 and Corollary 6.3 guarantee that \(v\) is smooth enough.

Next we verify that (3.2) holds. Once again, notice that \(\mathcal{I}_{v}=\mathcal{C}\) and let us pick \((x,\pi )\in \mathcal{C}\). By direct calculation,

$$\begin{aligned} (\mathcal{L}_{X,\pi }v-\rho v)(x,\pi ) &=\frac{1}{2}\sigma ^{2} U_{x}(x, \pi )+\hat{\mu }\pi (1-\pi )U_{\pi }(x,\pi )+(\mu _{0}+\hat{\mu } \pi )U(x,\pi ) \\ & \phantom{=:}-\rho \int _{0}^{x} U(\zeta ,\pi )d\zeta +\frac{1}{2}\theta ^{2}\pi ^{2}(1-\pi )^{2}v_{\pi \pi }(x,\pi ). \end{aligned}$$
(6.10)

Substituting the expression (6.2) for \(v_{\pi \pi }\) in the above and recalling that \((x,\pi )\in \mathcal{C}\) was arbitrary, we obtain

$$(\mathcal{L}_{X,\pi }v-\rho v)(x,\pi )=0,\qquad (x,\pi )\in \mathcal{C}. $$

Now pick \((x,\pi )\in \mathcal{S}\), recall that \(U_{x}=U_{\pi }=0\) and \(U=1\) in \(\mathcal{S}\) and repeat the calculations in (6.10). This gives

$$\begin{aligned} (\mathcal{L}_{X,\pi }v-\rho v)(x,\pi ) &=(\mu _{0}+\hat{\mu }\pi )- \rho \!\int _{0}^{x} \!U(\zeta ,\pi )d\zeta +\frac{1}{2}\theta ^{2}\pi ^{2}(1-\pi )^{2}v_{\pi \pi }(x,\pi ) \\ &=-\rho \int _{d(\pi )}^{x}U(\zeta ,\pi )d\zeta =-\rho \big(x-d( \pi )\big)\le 0, \end{aligned}$$

where we have used (6.2), upon noticing that \(U_{x}(d(\pi ), \pi )=U_{\pi }(d(\pi ),\pi )=0\) and \(U(d(\pi ),\pi )=1\). Finally, it was shown in Lemma 6.4 that (3.5)–(3.7) hold with our choice of \(D^{*}=\widehat{D}\). □