1 Introduction

Let \((\Omega ,{\mathcal {F}},{\mathbb {P}})\) be a complete probability space, filtered by \(({\mathcal {F}}_t)_{t \ge 0}\), a nondecreasing right-continuous family of sub-\(\sigma \)-fields of \({\mathcal {F}}\) such that \({\mathcal {F}}_0\) contains all events of probability 0. Let \(X=(X_t)_{t \ge 0}\), \(Y=(Y_t)_{t\ge 0}\) be adapted martingales. We also assume that both processes are càdlàg: their trajectories are right-continuous and have limits from the left. The symbol \([X,X]\) will stand for the square bracket of X: see e.g. Dellacherie and Meyer [8] for the definition. Following Wang [15], we say that Y is differentially subordinate to X if the process \(([X,X]_{t}-[Y,Y]_{t})_{t \ge 0}\) is nondecreasing and nonnegative as a function of t. We will also use the following well-known characterization distinguishing the continuous and jump parts:

Lemma 1.1

If X and Y are semimartingales, then Y is differentially subordinate to X if and only if: (i) The continuous part process \(([X,X]^c_t - [Y,Y]^c_t)_{t \ge 0}\) is nondecreasing and nonnegative, (ii) concerning jumps, the inequality \(|\Delta Y_t| \le |\Delta X_t|\) holds for all \(t>0\), and (iii) for initial points, we have \(|Y_0|\le |X_0|\).

There are many important inequalities for differentially subordinate martingales: see e.g. the monograph [12]. Recall the classical result proved by Burkholder [6] and extended to continuous-time processes by Wang [15]:

Theorem 1.2

Fix \(p \in (1,\infty )\). Let X and Y be two adapted, uniformly integrable, càdlàg martingales such that Y is differentially subordinate to X. Then

$$\begin{aligned} ||Y_{\infty }||_{L^p} \le (p^{*}-1)||X_{\infty }||_{L^p}, \end{aligned}$$
(1.1)

where \(p^{*}=\max \{p,p/(p-1)\}\).

We will be interested in a substitute for this estimate in the endpoint \(p=\infty \). First we will establish the following unweighted exponential bound.

Theorem 1.3

Fix \(\lambda \in (0,1)\). Let X and Y be two adapted, uniformly integrable, càdlàg martingales such that \(||X||_{L^{\infty }}\le 1\) and Y is differentially subordinate to X. Then

$$\begin{aligned} {\mathbb {E}}\exp (\lambda Y_{\infty }) \le C_{\lambda }, \end{aligned}$$
(1.2)

where

$$\begin{aligned} C_{\lambda } := {\left\{ \begin{array}{ll} \exp (\lambda ) &{} \text {for}\quad \lambda \in (0,1/2], \\ \frac{\exp (1-\lambda )}{2(1-\lambda )} &{} \text {for}\quad \lambda \in [1/2,1). \end{array}\right. } \end{aligned}$$
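Note that the two expressions defining \(C_{\lambda }\) agree at \(\lambda =1/2\), so \(C_{\lambda }\) depends continuously on \(\lambda \), and that the constant blows up at the right endpoint:

$$\begin{aligned} \exp (\lambda )\Big |_{\lambda =1/2}=e^{1/2}=\frac{\exp (1-\lambda )}{2(1-\lambda )}\Big |_{\lambda =1/2}, \qquad \lim _{\lambda \uparrow 1}\frac{\exp (1-\lambda )}{2(1-\lambda )}=\infty . \end{aligned}$$

This is consistent with the restriction \(\lambda \in (0,1)\) in the statement.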

This will be established using a procedure invented by Burkholder in [6]: we will find an explicit Bellman function associated with this bound and later use this object to prove a more complicated weighted exponential bound, which is the main goal of this paper. Let us start with the necessary definitions. Assume that \(W=(W_t)_{t \ge 0}\) is a positive and uniformly integrable martingale; this process will be called a weight. It defines a new finite measure on \((\Omega ,{\mathcal {F}})\) by \(W(A) :={\mathbb {E}}W 1_{A}\). Let \(1<p<\infty \) be a fixed parameter. We say that W satisfies Muckenhoupt’s condition \(A_p\) if

$$\begin{aligned} {[}W]_{A_p}:= \sup _{\tau } \left|\left|{\mathbb {E}}\left[ \{W_{\tau }/W_{\infty }\}^{1/(p-1)}|{\mathcal {F}}_{\tau } \right] ^{p-1} \right|\right|_{\infty }<\infty , \end{aligned}$$
(1.3)

where the supremum is taken over all adapted stopping times \(\tau \). There is also a version of this condition for \(p=1\) and \(p=\infty \):

We say that W satisfies Muckenhoupt’s condition \(A_1\) if there exists a finite deterministic constant C such that

$$\begin{aligned} W^{*}_{t} \le C W_{t} \end{aligned}$$

almost surely for all \(t \ge 0\). Here \(W^{*}_{t}=\sup _{s \in [0,t]}|W_{s}|\) is the maximal function of the martingale W. The smallest possible C above is called the \(A_1\) characteristic of W and is denoted by \([W]_{A_1}\).

We say that W satisfies Muckenhoupt’s condition \(A_{\infty }\) if

$$\begin{aligned} {[}W]_{A_\infty }:= \sup _{\tau } \left|\left|\exp \left( {\mathbb {E}}\left[ \log (W_{\tau }/W_{\infty })\,|\,{\mathcal {F}}_{\tau } \right] \right) \right|\right|_{\infty }<\infty . \end{aligned}$$
(1.4)

Let us make an important comment here. It is often assumed that the process \((W_t)_{t \ge 0}\) has continuous paths. This restriction appeared in the first works on this topic by Izumisawa–Kazamaki [10] and is still frequently used in weighted inequalities (see for example [1, 2, 13]). There are several reasons for considering only path-continuous martingales. As we will describe later, Muckenhoupt’s weights can be identified with two-dimensional martingales with values in certain hyperbolic/logarithmic domains. Because these domains are not convex, some difficulties arise when applying the Bellman function method to martingales with jumps: roughly speaking, the main problem is that on nonconvex sets, local concavity does not imply concavity. Moreover, in the general setting of jump processes, some basic structural properties of weights fail; for example, the classical self-improvement property does not hold [4]. In this paper, we do not assume any path-continuity of weights, and hence we will need to overcome these issues.

The weighted extension of Burkholder’s \(L^p\) inequality (1.1) was established by Wittwer [16] under the assumption that Y is a discrete-time martingale transform of X adapted to a one-dimensional dyadic filtration. In the works of Nazarov, Treil, and Volberg [14] and Lacey [11], this result was generalized to general discrete-time filtrations with an arbitrary underlying measure. Finally, Domelevo and Petermichl proved in [9] the weighted extension of (1.1) in the general setting of continuous-time càdlàg differentially subordinate martingales. Our main goal is to prove a weighted extension of the exponential bound (1.2) in this generality. We will establish the following result.

Theorem 1.4

Let W be an \(A_{\infty }\) weight such that \([W]_{A_{\infty }} \le c\) and let X, Y be two adapted, uniformly integrable, càdlàg martingales such that Y is differentially subordinate to X and \(||X||_{L^{\infty }}\le 1\). Then for every \(\lambda \in [0,1/(28c)]\), we have the estimate

$$\begin{aligned} {\mathbb {E}}\exp (\lambda Y_{\infty })W \le 4 \exp (\lambda ){\mathbb {E}}W. \end{aligned}$$
(1.5)

We have the following immediate corollary.

Corollary 1.5

Let X, Y, W be as in the previous theorem. There exist positive constants \(C_1, C_2\) such that

$$\begin{aligned} {\mathbb {E}}\exp (C_1 [W]_{A_{\infty }}^{-1}Y_{\infty })W \le C_2 {\mathbb {E}}W \end{aligned}$$

and

$$\begin{aligned} {\mathbb {E}}\exp (C_1 [W]_{A_{\infty }}^{-1}|Y_{\infty }|)W \le 2C_2 {\mathbb {E}}W. \end{aligned}$$

Clearly, from the above bound on the Laplace transform of \(|Y_{\infty }|\), we can obtain weighted moment estimates. Applying the elementary bound \(|x|^p \le p^p e^{|x|}\), we obtain the following corollary concerning \(L^p\) bounds for differential subordinates of bounded martingales.

Corollary 1.6

Fix \(p \in [1,\infty )\). Let X, Y, W be as in the previous theorem. There exists a constant C such that

$$\begin{aligned} ||Y_{\infty }||_{L^p(W)} \le C p [W]_{A_{\infty }}({\mathbb {E}}W)^{1/p}. \end{aligned}$$
(1.6)
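For completeness, let us sketch how (1.6) follows from Corollary 1.5. Writing \(\kappa = C_{1}[W]_{A_{\infty }}^{-1}\) and applying \(|x|^{p}\le p^{p}e^{|x|}\) with \(x=\kappa Y_{\infty }\), we get

$$\begin{aligned} {\mathbb {E}}|Y_{\infty }|^{p}W=\kappa ^{-p}\,{\mathbb {E}}|\kappa Y_{\infty }|^{p}W \le \left( \frac{p}{\kappa }\right) ^{p}{\mathbb {E}}\exp \left( \kappa |Y_{\infty }|\right) W \le \left( \frac{p\,[W]_{A_{\infty }}}{C_{1}}\right) ^{p}\cdot 2C_{2}\,{\mathbb {E}}W. \end{aligned}$$

Taking p-th roots and using \((2C_{2})^{1/p}\le 1+2C_{2}\) for \(p \ge 1\) yields (1.6) with \(C=(1+2C_{2})/C_{1}\).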

Four comments are in order.

Remark 1.7

We have defined the condition \(A_{\infty }\) (1.4) as a limit of the condition \(A_p\) (1.3). There are several other definitions of this class in the literature. For example, let us define the class \(A^{*}_{\infty }=\bigcup _{1<p<\infty }A_p\). It is known that under additional regularity conditions (e.g. that the underlying probability space is dyadic or that the weight process \((W_t)_{t \ge 0}\) has continuous paths), the classes \(A_{\infty }\) and \(A^{*}_{\infty }\) coincide [7]. However, for general martingale weights with irregular jumps this is no longer the case: because of the aforementioned lack of the self-improvement property [4] of general weights, the class \(A^{*}_{\infty }\) is a proper subset of the class \(A_{\infty }\). Thus the inequalities obtained under the condition \(A_{\infty }\) are stronger than bounds for \(A^{*}_{\infty }\) weights.

Remark 1.8

There is a natural question concerning Corollary 1.6 and \(L^p\) estimates for differential subordinates of bounded martingales: does (1.6) follow from known maximal estimates? More precisely, we are interested in an inequality of the form

$$\begin{aligned} ||Y_{\infty }||_{L^p(W)} \le C p [W]_{A_{\infty }}||X^{*}_{\infty }||_{L^p(W)} \end{aligned}$$
(1.7)

for differentially subordinate martingales X and Y, where X is no longer assumed to be bounded. This bound would clearly be stronger than (1.6). Let us take a look at the known results in this direction for \(p=1\). In [5], the inequality (1.7) was established for martingale transforms under the additional assumption of regularity of the underlying filtration. We also have the following Fefferman–Stein type inequality, established by Bañuelos and Osękowski [3] for differentially subordinate martingales:

$$\begin{aligned} ||Y_{\infty }||_{L^1(W)} \le C ||X^{*}_{\infty }||_{L^1(W^{*})}, \end{aligned}$$

and an immediate corollary:

$$\begin{aligned} ||Y_{\infty }||_{L^1(W)} \le C [W]_{A_{1}}||X^{*}_{\infty }||_{L^1(W)}. \end{aligned}$$

While this result is valid for arbitrary filtrations, it cannot be extended to \(A_q\) weights for \(q>1\) (see the last section in [5]). Hence in the general setting with \(A_{\infty }\) weights with respect to arbitrary filtrations, the \(L^p\) bound (1.6) is not contained in known maximal estimates.

Remark 1.9

All the above results are valid also for discrete-time martingales f and g satisfying the discrete differential subordination, that is, \(|g_0|\le |f_0|\) and \(|g_n-g_{n-1}| \le |f_n-f_{n-1}|\) for \(n=1,2,\ldots \). Indeed, this follows immediately from the continuous-time estimate for the càdlàg martingales \(X_{t}=f_{\lfloor {t}\rfloor }\) and \(Y_{t}=g_{\lfloor {t}\rfloor }\). Proving discrete-time estimates first and then obtaining continuous-time analogues would be less straightforward, because differential subordination is not preserved when sampling a martingale.
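Let us briefly verify the differential subordination of the embedded processes. With the usual convention \([X,X]_{0}=X_{0}^{2}\), the piecewise constant martingales \(X_{t}=f_{\lfloor t\rfloor }\), \(Y_{t}=g_{\lfloor t\rfloor }\) satisfy

$$\begin{aligned} {[}X,X]_{t}-[Y,Y]_{t}=\left( f_{0}^{2}-g_{0}^{2}\right) +\sum _{1\le n\le t}\left( |f_{n}-f_{n-1}|^{2}-|g_{n}-g_{n-1}|^{2}\right) , \end{aligned}$$

which is nonnegative and nondecreasing in t precisely when \(|g_0|\le |f_0|\) and \(|g_n-g_{n-1}|\le |f_n-f_{n-1}|\) for all n.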

Remark 1.10

Observe that in Theorem 1.4 the value of \(\lambda \) is limited to the range [0, 1/(28c)]. This restriction is necessary: while the unweighted exponential bound was valid for all \(\lambda \in [0,1)\), in the weighted setting this is no longer the case. Intuitively, for more irregular weights (that is, weights with a larger characteristic \([W]_{A_{\infty }}\)), the interval on which the weighted Laplace transform is bounded shrinks, roughly at the rate \([W]_{A_{\infty }}^{-1}\). Moreover, the weighted exponential bound is sharp in the following sense: there are no constants \(C_1,C_2 > 0\) and \(\varepsilon > 0\) such that

$$\begin{aligned} {\mathbb {E}}\exp (\lambda Y_{\infty })W \le C_2 \exp (\lambda ){\mathbb {E}}W \end{aligned}$$

for all martingales as in Theorem 1.4 and all \(\lambda \in [0,C_{1}[W]_{A_{\infty }}^{-1 + \varepsilon }]\). Indeed, arguing as before, this stronger exponential bound would imply the following stronger version of the moment inequality (1.6):

$$\begin{aligned} ||Y_{\infty }||_{L^p(W)} \le C p [W]_{A_{\infty }}^{1 - \varepsilon / p}({\mathbb {E}}W)^{1/p}. \end{aligned}$$

However the sublinear dependence on the weight characteristic in the above inequality would contradict the known examples (see the last section in [13] for the construction of such an example based on the Haar system).

The rest of the paper is organized as follows. In the next section, we present a proof of the unweighted exponential inequality. Then, in Section 3, we introduce a special function F of three variables that will allow us later to pass from the unweighted to the weighted case. The final section is devoted to the proof of the main result, the weighted exponential inequality. We will introduce a new Bellman function B of four variables, obtained as a composition of the two simpler special functions, and we will describe how the concavity properties of the auxiliary functions allow us to control the jumps of general càdlàg martingales.

2 Sharp unweighted exponential inequality

In this section, we will establish the unweighted exponential inequality (1.2) of Theorem 1.3. The proof rests on constructing a very simple Bellman function of two variables. We will apply a standard method invented by Burkholder [6] and extended by Wang [15] to the continuous-time setting. The key to the problem is to construct a \(C^{2}\) function \(U_{\lambda }: [-1,1]\times {\mathbb {R}}\rightarrow {\mathbb {R}}\) which satisfies the following conditions:

\(1^{\circ }\):

(Initial condition) For every \(|y|\le |x|\), we have \(U_{\lambda }(x,y) \le C_{\lambda }\).

\(2^{\circ }\):

(Majorization property) We have

$$\begin{aligned} U_{\lambda }(x,y) \ge \exp (\lambda y). \end{aligned}$$
\(3^{\circ }\):

(Concavity-type property) There is a nonnegative function A on \([-1,1]\times {\mathbb {R}}\) such that for any \((x,y) \in [-1,1] \times {\mathbb {R}}\) and any \(d,e \in {\mathbb {R}}\), we have

$$\begin{aligned} \langle D^{2}U_{\lambda }(x,y)(d,e),(d,e) \rangle \le A(x,y)(e^2 - d^2). \end{aligned}$$
(2.1)
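Let us record a direct consequence of \(3^{\circ }\), which explains its name: if \(|e| \le |d|\), then the right-hand side of (2.1) is nonpositive, so

$$\begin{aligned} \frac{d^{2}}{dt^{2}}\,U_{\lambda }(x+td,y+te)=\langle D^{2}U_{\lambda }(x+td,y+te)(d,e),(d,e) \rangle \le 0 \end{aligned}$$

whenever \(x+td \in [-1,1]\); that is, \(U_{\lambda }\) is concave along any line segment of slope at most 1 in absolute value (compare with Remark 2.5 below).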

In what follows, we will sometimes omit the subscript \(\lambda \) in \(U_{\lambda }\) for readability. The connection between the existence of such a function and the validity of (1.2) is described in the two lemmas below.

Lemma 2.1

If there is a function \(U_{\lambda }\) satisfying condition \(3^{\circ }\), then the stochastic process \(\left( U_{\lambda }(X_t,Y_t) \right) _{t \ge 0}\) is a supermartingale for all X, Y as above.

Proof

The argument rests on Itô’s formula. Consider a process \(Z=(X,Y)\). Let \(0 \le s < t\). Since \(U_{\lambda }\) is of class \(C^{2}\), we may write

$$\begin{aligned} U(Z_t) = I_0 + I_1 + I_2/2 + I_3, \end{aligned}$$
(2.2)

where

$$\begin{aligned} I_0&= U(Z_s), \\ I_1&= \int \limits _{s+}^{t}U_{x}(Z_u)\text {d}X_u + \int \limits _{s+}^{t}U_{y}(Z_u)\text {d}Y_u, \\ I_2&= \int \limits _{s+}^{t}D^{2}U(Z_u)\text {d}[Z^c]_{u}, \\ I_3&=\sum _{s<u\le t}(U(Z_u)-U(Z_{u-})-U_x(Z_{u-})\Delta X_u - U_y(Z_{u-})\Delta Y_u). \end{aligned}$$

Here \(\xi ^c\) denotes the continuous part of a semimartingale \(\xi \), \(D^{2}U\) is the Hessian matrix of U, and in the definition of \(I_2\), we have used the shortened notation for the sum of all second-order terms. Let us study the properties of the terms \(I_1\), \(I_2\), and \(I_3\). The conditional expectation of \(I_1\) with respect to \({\mathcal {F}}_s\) is zero by the properties of stochastic integrals. To handle the next term, note that \(3^{\circ }\) implies that \(I_2 \le 0\) by a straightforward approximation of the integral by Riemann sums. Indeed, pick an arbitrary partition \(s<s_0<s_1<\cdots<s_N<t\) and an integer \(i \in \{0,\ldots ,N-1 \}\). Next, set \(\Delta X^c=X^c_{s_{i+1}}-X^c_{s_{i}}\) and \(\Delta Y^c=Y^c_{s_{i+1}}-Y^c_{s_{i}}\). By the concavity condition \(3^{\circ }\),

$$\begin{aligned} \bigg \langle D^{2}U(Z_{s_i})(\Delta X^c,\Delta Y^c), (\Delta X^c, \Delta Y^c) \bigg \rangle \le A(Z_{s_i}) \left( |\Delta Y^c|^2 - |\Delta X^c|^2 \right) . \end{aligned}$$

Summing over all i and letting the diameter of the partition \((s_i)\) go to zero, one obtains

$$\begin{aligned} I_2 \le \int \limits _{s+}^{t} A(Z_{u-})\text {d}([Y^c]_{u}-[X^c]_{u}) \le 0 \end{aligned}$$

by the differential subordination and the fact that A is nonnegative. The property \(3^{\circ }\) also implies that \(I_3 \le 0\): by Taylor's formula with integral remainder, each summand equals \(\int _0^1 (1-r)\langle D^{2}U(Z_{u-}+r\Delta Z_u)\Delta Z_u, \Delta Z_u \rangle \,\text {d}r\) (note that the segment joining \(Z_{u-}\) and \(Z_u\) lies in \([-1,1]\times {\mathbb {R}}\)), which is nonpositive by (2.1) and the jump inequality \(|\Delta Y_u| \le |\Delta X_u|\). Hence, by taking conditional expectations of both sides of (2.2), we obtain that

$$\begin{aligned} {\mathbb {E}}(U_{\lambda }(X_t,Y_t)|{\mathcal {F}}_{s}) \le U_{\lambda }(X_s,Y_s). \end{aligned}$$

\(\square \)

Lemma 2.2

If there is a function \(U_{\lambda }\) satisfying conditions \(1^{\circ }\), \(2^{\circ }\), and \(3^{\circ }\), then the inequality (1.2) holds true for all X, Y as above.

Proof

By the previous lemma, \(\left( U_{\lambda }(X_t,Y_t) \right) _{t \ge 0}\) is a supermartingale. From this fact and \(1^{\circ }\) and \(2^{\circ }\), we immediately obtain that for \(t < \infty \),

$$\begin{aligned} {\mathbb {E}}\exp (\lambda Y_t) \le {\mathbb {E}}U_{\lambda }(X_t,Y_t) \le {\mathbb {E}}U_{\lambda }(X_0,Y_0) \le C_{\lambda }. \end{aligned}$$

It remains to apply Fatou’s lemma to establish that

$$\begin{aligned} {\mathbb {E}}\exp (\lambda Y_{\infty }) \le \liminf _{t \rightarrow \infty } {\mathbb {E}}\exp (\lambda Y_{t}) \le C_{\lambda }. \end{aligned}$$

\(\square \)

Let us make an important comment.

Remark 2.3

The \(C^2\) assumption can be relaxed. It is sufficient that the function U is of class \(C^{2}\) and satisfies \(3^{\circ }\) outside the line \(x=0\), and that it satisfies the symmetry conditions \(U(x,y)=U(-x,y)\) and \(U_{x}(0,y)=0\). This can be proved by a standard mollification argument (consult e.g. Domelevo and Petermichl [9] or Wang [15]).

The function \(U_\lambda \) is given by the simple formula

$$\begin{aligned} U_{\lambda }(x,y) = \exp (\lambda (|x|+y-1))\frac{1-\lambda |x|}{1-\lambda }. \end{aligned}$$

We have the following theorem.

Theorem 2.4

The function U satisfies conditions \(1^{\circ }, 2^{\circ }\) and the condition \(3^{\circ }\) for \(x \ne 0\). Moreover \(U_{x}(0,y)=0\) for every \(y \in {\mathbb {R}}\).

Proof

All the required properties follow from straightforward differentiation. The Hessian in \(3^{\circ }\) for \(x >0\) is equal to

$$\begin{aligned} D^{2} U(x,y) = \frac{\lambda ^2 \exp (\lambda (x+y-1))}{1-\lambda } \begin{pmatrix} -1-\lambda x &{} -\lambda x \\ - \lambda x &{} 1 - \lambda x \end{pmatrix}. \end{aligned}$$

Hence, the condition \(3^{\circ }\) is satisfied with

$$\begin{aligned} A(x,y) = \frac{\lambda ^2 \exp (\lambda (x+y-1))}{1-\lambda }. \end{aligned}$$
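Indeed, for any \(d,e \in {\mathbb {R}}\) and \(x>0\), a direct computation gives

$$\begin{aligned} \langle D^{2}U(x,y)(d,e),(d,e) \rangle - A(x,y)(e^{2}-d^{2}) = -\frac{\lambda ^{3}x\exp (\lambda (x+y-1))}{1-\lambda }\,(d+e)^{2} \le 0, \end{aligned}$$

which is exactly (2.1); the case \(x<0\) follows by the symmetry \(U(x,y)=U(-x,y)\). Concerning \(1^{\circ }\), since U is nondecreasing in y and symmetric in x, it suffices to maximize \(s \mapsto U(s,s)=\exp (\lambda (2s-1))\frac{1-\lambda s}{1-\lambda }\) over \(s \in [0,1]\); its derivative equals \(\frac{\lambda }{1-\lambda }\exp (\lambda (2s-1))(1-2\lambda s)\), so the maximum is \(U(1,1)=\exp (\lambda )\) for \(\lambda \le 1/2\) and \(U(\frac{1}{2\lambda },\frac{1}{2\lambda })=\frac{\exp (1-\lambda )}{2(1-\lambda )}\) for \(\lambda \ge 1/2\); in both cases this is \(C_{\lambda }\). Finally, for fixed y the function \(s \mapsto U(s,y)\) is nonincreasing on [0, 1] (its derivative equals \(-\frac{\lambda ^{2}s}{1-\lambda }\exp (\lambda (s+y-1))\)), whence \(U(x,y) \ge U(1,y)=\exp (\lambda y)\), which is \(2^{\circ }\).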

\(\square \)

Remark 2.5

It can be shown that the constant \(C_{\lambda }\) is optimal in (1.2). Moreover, the function \(U_{\lambda }\) is the smallest function satisfying conditions \(1^{\circ }\), \(2^{\circ }\) which is concave along any line of slope in \([-1,1]\).

In what follows, we will need the following additional property of U. This will be crucial to establish the main inequality (1.5) for processes with jumps.

Lemma 2.6

Fix \(\lambda \in (0,1/2)\). Let \((x,y),(x_1,y_1) \in [-1,1] \times {\mathbb {R}}\) satisfy the condition

$$\begin{aligned} |y-y_1| \le |x-x_1|. \end{aligned}$$

There exists L depending only on y and \(\lambda \) such that both values \(U_{\lambda }(x,y)\) and \(U_{\lambda }(x_1,y_1)\) are in the interval [L, 16L].

Proof

Clearly, the points \((x,y)\) and \((x_1,y_1)\) are in the set \(S_{y} := [-1,1] \times [y-2,y+2]\). From the monotonicity of \(U_{\lambda }\), we have that

$$\begin{aligned} \inf _{(\mathbf{x },\mathbf{y }) \in S_y} U_{\lambda }(\mathbf{x },\mathbf{y })= & {} U_{\lambda }(1,y-2)=\exp (\lambda (y-2)),\\ \sup _{(\mathbf{x },\mathbf{y }) \in S_y} U_{\lambda }(\mathbf{x },\mathbf{y })= & {} U_{\lambda }(0,y+2)=\exp (\lambda (y+1))/(1-\lambda ). \end{aligned}$$

Hence

$$\begin{aligned} \frac{\sup _{(\mathbf{x },\mathbf{y }) \in S_y} U_{\lambda }(\mathbf{x },\mathbf{y })}{\inf _{(\mathbf{x },\mathbf{y }) \in S_y} U_{\lambda }(\mathbf{x },\mathbf{y })} = \frac{\exp (3 \lambda )}{1- \lambda } \le 2 \exp (3/2) \le 9 \le 16. \end{aligned}$$

Now pick \(L=\inf _{(\mathbf{x },\mathbf{y }) \in S_y} U_{\lambda }(\mathbf{x },\mathbf{y })\). For every \((\mathbf{x },\mathbf{y }) \in S_y\), we have

$$\begin{aligned} L \le U_{\lambda }(\mathbf{x },\mathbf{y }) \le 16 L. \end{aligned}$$

This completes the proof of the lemma. \(\square \)

3 A special function of three variables and its properties

In this section, we will introduce a function F of three variables which, composed with the unweighted function U from the previous section, will allow us to pass from the unweighted to the weighted setting. We will also prove size and concavity properties of this function, which will be needed to establish the main result in the last section.

Let \(c \ge 1\) be fixed. Introduce the parameters \(a=3/4, \alpha =1-1/(4c)\), \(\beta =1/(14c)\) and define the domain

$$\begin{aligned} {\mathcal {D}}_c=\{ (w,v) \in {\mathbb {R}}_{+} \times {\mathbb {R}}: 1 \le we^{-v} \le c \}. \end{aligned}$$

The function \(F:{\mathbb {R}}_{+}\times {\mathcal {D}}_{c} \rightarrow {\mathbb {R}}\) is given by

$$\begin{aligned} F(z,w,v)=z^{\beta }\frac{(we^{-v}-a)^{\alpha }}{we^{-v}}w. \end{aligned}$$
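Equivalently, we have the compact expression

$$\begin{aligned} F(z,w,v)=z^{\beta }e^{v}\left( we^{-v}-a\right) ^{\alpha }, \end{aligned}$$

which will be convenient when differentiating F in Lemma 3.2.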

We will need the following size property of this object:

Lemma 3.1

For any \((z,w,v) \in {\mathbb {R}}_{+} \times {\mathcal {D}}_c\) with \(1\le we^{-v}\le c\), we have

$$\begin{aligned} \frac{1}{4}wz^{\beta } \le F(z,w,v) \le wz^{\beta }. \end{aligned}$$
(3.1)

Proof

We must show that

$$\begin{aligned} \frac{1}{4} \le \frac{(we^{-v}-a)^{\alpha }}{we^{-v}} \le 1. \end{aligned}$$

Firstly, observe that the function \(t \longmapsto (t-a)^{\alpha }/t\) is increasing on \([1,3c]\). Indeed, we have

$$\begin{aligned} \left( \frac{(t-a)^{\alpha }}{t} \right) '=\frac{(t-a)^{\alpha -1}((\alpha -1)t+a)}{t^{2}} \ge 0 \quad \text {for } t \le \frac{a}{1-\alpha }=3c. \end{aligned}$$

Thus, the assertion follows from the trivial estimates \(1/4 \le (1-a)^{\alpha }\) and \((c-a)^{\alpha }/c \le 1\). \(\square \)

Next, we turn our attention to concavity-type properties of F. In the next lemma, we will establish local concavity.

Lemma 3.2

The function F is locally concave on the domain \({\mathbb {R}}_{+} \times {\mathcal {D}}_{2.1c}\).

Proof

For brevity, set \(\phi (t) = (t-a)^{\alpha }\) for \(t \ge a\). We will also write \(t=we^{-v}\). The proof rests on Sylvester’s criterion. First, observe that \(F_{ww}(z,w,v) = z^{\beta }e^{-v} \phi ''(t)\) is negative because the function \(\phi \) is concave. Next, since we have \(F_{wv}(z,w,v)=-z^{\beta }we^{-v} \phi ''(t)\) and \( F_{vv}(z,w,v)=z^{\beta }e^{v}\phi (t)+z^{\beta }w^{2}e^{-v}\phi ''(t)-z^{\beta }w\phi '(t)\), we derive that

$$\begin{aligned} \det \begin{pmatrix} F_{ww} &{} F_{wv}\\ F_{vw} &{} F_{vv} \end{pmatrix} = z^{2\beta }\phi (t)\phi ''(t)-z^{2\beta }t\phi '(t)\phi ''(t)=z^{2\beta }\phi ''(t)(\phi (t)-t\phi '(t)). \end{aligned}$$

This is positive, because \(\phi ''(t)<0\) and the expression \(\phi (t)-t\phi '(t)=(t-a)^{\alpha -1}(t(1-\alpha )-a)\) is negative when \(t \le 2.1c\). It remains to show that the determinant of the full Hessian is nonpositive:

$$\begin{aligned} \det \begin{pmatrix} F_{ww} &{} F_{wv} &{} F_{wu}\\ F_{vw} &{} F_{vv} &{} F_{vu}\\ F_{uw} &{} F_{uv} &{} F_{uu} \end{pmatrix} \le 0. \end{aligned}$$

Add to the second column the first column multiplied by w; then add to the second row the first row multiplied by w. Then the above inequality amounts to saying that

$$\begin{aligned} \det \begin{pmatrix} z^{\beta }e^{-v}\phi ''(t) &{} 0 &{} \beta z^{\beta -1}\phi '(t)\\ 0 &{} e^{v}z^{\beta }(\phi (t)-t\phi '(t)) &{} \beta z^{\beta -1}\phi (t)e^{v}\\ \beta z^{\beta -1}\phi '(t) &{} \beta z^{\beta -1}\phi (t)e^{v} &{} \beta (\beta -1)z^{\beta -2}e^{v}\phi (t) \end{pmatrix}\le 0. \end{aligned}$$

Observe that powers of z and the factors \(e^{v}\), \(e^{-v}\) do not change the sign of the determinant. The above inequality is equivalent to

$$\begin{aligned}&\det \begin{pmatrix} \phi ''(t) &{} 0 &{} \beta \phi '(t)\\ 0 &{} \phi (t)-t\phi '(t) &{} \beta \phi (t)\\ \beta \phi '(t) &{} \beta \phi (t) &{} \beta (\beta -1)\phi (t) \end{pmatrix} \\&\quad = \beta (\beta -1)\phi (t)\phi ''(t)(\phi (t)-t\phi '(t)) - \beta ^{2}(\phi '(t))^{2}(\phi (t)-t\phi '(t)) \\&\qquad -\beta ^{2}(\phi (t))^{2}\phi ''(t) \le 0, \end{aligned}$$

or, after some manipulations, \((\alpha -1)^{2}t + (\alpha + \beta \alpha -1)a \le 0\). Since the coefficient of t is nonnegative, this inequality is most restrictive at \(t=2.1c\), where it reads

$$\begin{aligned} \beta \le \frac{(1-\alpha )((\alpha -1)2.1c+a)}{a\alpha }. \end{aligned}$$

This is easily verified by plugging in the formulas for \(\alpha \) and \(\beta \): since \((\alpha -1)\cdot 2.1c+a=-\frac{21}{40}+\frac{3}{4}=\frac{9}{40}\), the right-hand side equals \(\frac{1}{4c}\cdot \frac{9}{40}\cdot \frac{4}{3\alpha }=\frac{3}{40c\alpha }\), and the inequality \(\beta =\frac{1}{14c}\le \frac{3}{40c\alpha }\) amounts to \(\alpha \le \frac{21}{20}\), which holds since \(\alpha <1\). \(\square \)

Local concavity would be sufficient to establish the main estimate under the additional path-continuity assumptions. Because we are also interested in processes with jumps, we will need the following stronger concavity.

Lemma 3.3

For every \(L>0\), the function F is concave on the set \([L,16L] \times {\mathcal {D}}_c\), that is: for any \((z,w,v)\), \((z_1,w_1,v_1) \in [L,16L]\times {\mathcal {D}}_c\), we have

$$\begin{aligned} F(z,w,v)&\le F(z_1,w_1,v_1) + F_{z}(z_1,w_1,v_1)(z-z_1) + F_{w}(z_1,w_1,v_1)(w-w_1) \\&\quad + F_{v}(z_1,w_1,v_1)(v-v_1). \end{aligned}$$

Proof

Because F is homogeneous, without loss of generality, we may and do assume that \(L=1\). It is sufficient to show that F can be extended to a locally concave function \({\bar{F}}\) on a convex domain \(\text {conv}([1,16]\times {\mathcal {D}}_{c}) = [1,16]\times \mathcal {D}_{\infty }\), where

$$\begin{aligned} {\mathcal {D}}_{\infty }=\{ (w,v) \in {\mathbb {R}}_{+} \times {\mathbb {R}}: we^{-v} \ge 1 \}. \end{aligned}$$

Let us denote

$$\begin{aligned} A=\left( \frac{1}{2} \left( \frac{2c-a}{c-a} \right) ^{\alpha } \right) ^{1/ \beta } \end{aligned}$$

and

$$\begin{aligned} K=A^{\beta } \frac{(c-a)^{\alpha }}{c}. \end{aligned}$$

Let \(\xi :[1,A] \rightarrow {\mathbb {R}}\) be a function defined by

$$\begin{aligned} \xi (z)= \inf \left\{ t\ge 1: z^{\beta } \frac{(t-a)^{\alpha }}{t} \ge K \right\} . \end{aligned}$$

Because the function \(t \mapsto (t-a)^{\alpha }/t\) is increasing on [1, 3c], we have that \(\xi \) is decreasing and \(\xi (1)=2c\), \(\xi (A)=c\).
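Indeed, by the definitions of A and K we have

$$\begin{aligned} 1^{\beta }\,\frac{(2c-a)^{\alpha }}{2c}=\frac{1}{2}\left( \frac{2c-a}{c-a}\right) ^{\alpha }\cdot \frac{(c-a)^{\alpha }}{c}=A^{\beta }\,\frac{(c-a)^{\alpha }}{c}=K, \end{aligned}$$

so the infimum in the definition of \(\xi \) is attained at \(t=2c\) for \(z=1\) and at \(t=c\) for \(z=A\).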

Now we are ready to define the function \({\bar{F}}: [1,A] \times {\mathcal {D}}_{\infty } \rightarrow {\mathbb {R}}\) by

$$\begin{aligned} {\bar{F}}(z,w,v) = {\left\{ \begin{array}{ll} z^{\beta }\frac{(we^{-v}-a)^{\alpha }}{we^{-v}}w &{} \text {on}\quad D^1, \\ Kw &{} \text {on}\quad D^2, \end{array}\right. } \end{aligned}$$

where

$$\begin{aligned} D^{1} = \{ (z,w,v) \in [1,A] \times {\mathbb {R}}_{+} \times {\mathbb {R}}: 1 \le we^{-v} \le \xi (z) \}, \end{aligned}$$

and

$$\begin{aligned} D^{2} = \{ (z,w,v) \in [1,A] \times {\mathbb {R}}_{+} \times {\mathbb {R}}: we^{-v} > \xi (z) \}. \end{aligned}$$

We will check that \({\bar{F}}\) is concave. Since its domain is a convex set, it is sufficient to establish local concavity. Of course, the linear function Kw is concave. Concerning \(D^{1}\), the function \(z^{\beta }\frac{(we^{-v}-a)^{\alpha }}{we^{-v}}w\) is locally concave on \({\mathbb {R}}_{+} \times {\mathcal {D}}_{2.1 c}\) by the previous lemma. Since \(D^1 \subset [1,A] \times {\mathcal {D}}_{2c} \subset [1,A] \times {\mathcal {D}}_{2.1 c}\), we conclude that \({\bar{F}}\) is locally concave in the interior of \(D^{1}\). It remains to check local concavity at the common boundary \(\{(z,w,v): we^{-v}=\xi (z)\}\) of \(D^{1}\) and \(D^{2}\). This follows from the construction: every point of this set has a neighbourhood on which \({\bar{F}}\) coincides with the minimum of the two concave functions \(z^{\beta }\frac{(we^{-v}-a)^{\alpha }}{we^{-v}}w\) and Kw.

To complete the proof of the lemma, observe that \(F={\bar{F}}\) on \([1,A] \times {\mathcal {D}}_c\) and

$$\begin{aligned} A \ge \left( 2^{-1/4}e^{3/8} \right) ^{14} \ge 16. \end{aligned}$$
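For the reader's convenience, here is one way to check these two estimates. Using \(\frac{2c-a}{c-a}=2\left( 1+\frac{3}{8c-6}\right) \), \(2^{\alpha -1}=2^{-1/(4c)}\) and the elementary bound \(\ln (1+x)\ge \frac{2x}{2+x}\), we get

$$\begin{aligned} A^{\beta }=2^{\alpha -1}\left( 1+\frac{3}{8c-6}\right) ^{\alpha }\ge 2^{-1/(4c)}\exp \left( \frac{4c-1}{4c}\cdot \frac{6}{16c-9}\right) \ge 2^{-1/(4c)}\exp \left( \frac{3}{8c}\right) \end{aligned}$$

(the last inequality reduces, after cross-multiplication, to \(16c-4 \ge 16c-9\)), and raising this to the power \(1/\beta =14c\) gives \(A \ge \left( 2^{-1/4}e^{3/8}\right) ^{14}=2^{-7/2}e^{21/4}\approx 16.8 \ge 16\).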

\(\square \)

Remark 3.4

The restriction of the first variable to an interval in the above lemma is necessary. More precisely, it can be shown that the function F does not satisfy the full concavity condition on the entire set \({\mathbb {R}}_{+} \times {\mathcal {D}}_c\).

4 Burkholder’s function of four variables

We are ready to prove the main result, Theorem 1.4. The Bellman function B will be defined as a composition of the function F with the unweighted function U. A similar construction appeared in the paper [13] by Osękowski on weak-type weighted inequalities. The important difference and novelty in our approach is the stronger concavity condition (Lemmas 2.6 and 3.3), which will allow us to handle the problematic jump parts of the processes; note that in [13] the path-continuity of the weight process \((W_t)_{t \ge 0}\) was crucial. Fix \(c \ge 1\) and \(\lambda \in (0,\beta /2)\), where \(\beta = 1/(14c)\) is as defined at the beginning of Section 3. Define the function \(B: [-1,1] \times {\mathbb {R}}\times {\mathcal {D}}_c \rightarrow {\mathbb {R}}\) by the formula

$$\begin{aligned} B(x,y,w,v) = 4 F\left( U_{\lambda /\beta }(x,y),w,v\right) . \end{aligned}$$
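Since \(\lambda /\beta \in (0,1/2)\), the results of Section 2 apply to \(U_{\lambda /\beta }\) with the constant

$$\begin{aligned} C_{\lambda /\beta }=\exp (\lambda /\beta ), \qquad \text {so that}\qquad C_{\lambda /\beta }^{\beta }=\exp (\lambda ). \end{aligned}$$

This identity will be used in Lemma 4.1 below; the restriction \(\lambda /\beta <1/2\) also allows us to apply Lemma 2.6 in the proof of Theorem 1.4.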

In the next two lemmas, we will establish the initial and majorization properties of the function B.

Lemma 4.1

For every \(|y| \le |x|\) and \((w,v) \in {\mathcal {D}}_{c}\), we have

$$\begin{aligned} B(x,y,w,v) \le 4 \exp (\lambda )w. \end{aligned}$$

Proof

We know that for \(|y|\le |x|\), the function \(U_{\lambda /\beta }\) satisfies the initial condition, that is:

$$\begin{aligned} U_{\lambda /\beta }(x,y) \le \exp \left( \lambda /\beta \right) . \end{aligned}$$

It remains to apply Lemma 3.1 to obtain that

$$\begin{aligned} B(x,y,w,v) = 4 F\left( U_{\lambda /\beta }(x,y),w,v \right) \le 4 \left( U_{\lambda /\beta }(x,y) \right) ^{\beta }w&\le 4C_{\lambda /\beta }^{\beta } w \\&= 4\exp (\lambda )w. \end{aligned}$$

\(\square \)

Lemma 4.2

We have

$$\begin{aligned} B(x,y,w,v) \ge \exp (\lambda y)w. \end{aligned}$$

Proof

This is an immediate consequence of Lemma 3.1 and the majorization property of U. Indeed, we have

$$\begin{aligned} B(x,y,w,v)=4F(U_{\lambda /\beta }(x,y),w,v) \ge \left( U_{\lambda /\beta }(x,y) \right) ^{\beta }w \ge \exp (\lambda y)w. \end{aligned}$$

\(\square \)

Proof of Theorem 1.4

We will use the following useful interpretation of \(A_{\infty }\) weights. Fix such a weight W; by assumption, \([W]_{A_{\infty }} \le c\). Furthermore, let \(V=(V_t)_{t \ge 0}\) be the martingale given by \(V_t ={\mathbb {E}}\left( \log (W_{\infty })| {\mathcal {F}}_t \right) \), \(t \ge 0\). Note that Jensen’s inequality implies \(W_t e^{-V_t} \ge 1\) almost surely; furthermore, the condition \(A_{\infty }\) yields the reversed bound

$$\begin{aligned} W_{t}e^{-V_t} \le c. \end{aligned}$$

In other words, an \(A_{\infty }\) weight with characteristic at most c gives rise to a two-dimensional martingale (W, V) taking values in the domain \({\mathcal {D}}_c\). Moreover, this martingale terminates at the boundary \(\{we^{-v}=1\}\) of this domain: \(W_{\infty }e^{-V_{\infty }}=1\) almost surely.
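Let us briefly justify this identification. Since \(W_{\tau }\) is \({\mathcal {F}}_{\tau }\)-measurable and, by optional sampling, \(V_{\tau }={\mathbb {E}}[\log W_{\infty }|{\mathcal {F}}_{\tau }]\), for any stopping time \(\tau \) we have

$$\begin{aligned} \exp \left( {\mathbb {E}}\left[ \log (W_{\tau }/W_{\infty })\,|\,{\mathcal {F}}_{\tau } \right] \right) =W_{\tau }\exp \left( -{\mathbb {E}}\left[ \log W_{\infty }\,|\,{\mathcal {F}}_{\tau } \right] \right) =W_{\tau }e^{-V_{\tau }}, \end{aligned}$$

so the condition (1.4) with \([W]_{A_{\infty }}\le c\) is precisely the bound \(W_{\tau }e^{-V_{\tau }}\le c\), while the conditional Jensen inequality applied to the concave function \(\log \) gives \(V_{\tau }\le \log W_{\tau }\), i.e. \(W_{\tau }e^{-V_{\tau }}\ge 1\). Finally, \(V_{\infty }=\log W_{\infty }\), which yields \(W_{\infty }e^{-V_{\infty }}=1\).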

We are ready for the proof of the main estimate (1.5). Let X, Y, W be martingales as in the statement of Theorem 1.4 and let \(\lambda \in (0,\beta /2)\); the case \(\lambda =0\) is trivial, and the endpoint \(\lambda =\beta /2=1/(28c)\) then follows by letting \(\lambda \uparrow \beta /2\) in (1.5) and applying Fatou's lemma. Then, from Lemma 2.1, the process \(Z_t=U_{\lambda /\beta }(X_t,Y_t)\) is a supermartingale; let \(Z=Z_0+M+A\) be the Doob–Meyer decomposition for Z (cf. [8]). Let us also consider the auxiliary process \(\xi _t=(Z_t,W_t,V_t), t \ge 0\), where V is given as above. The function F is of class \(C^{\infty }\) (more precisely, it extends to a \(C^{\infty }\) function on some open set containing \({\mathbb {R}}_{+} \times {\mathcal {D}}_c\)), so we are allowed to apply Itô’s formula to obtain

$$\begin{aligned} F(\xi _t) = I_0 + I_1 + I_2 + I_3/2 + I_4, \end{aligned}$$
(4.1)

where

$$\begin{aligned} I_0&= F(\xi _0), \\ I_1&= \int \limits _{0+}^{t}F_{z}(\xi _{s-})\text {d}M_s + \int \limits _{0+}^{t}F_{w}(\xi _{s-})\text {d}W_s + \int \limits _{0+}^{t}F_{v}(\xi _{s-})\text {d}V_s, \\ I_2&= \int \limits _{0+}^{t}F_z(\xi _{s-})\text {d}A_s, \\ I_3&= \int \limits _{0+}^{t}D^{2}F(\xi _{s-})\text {d}[Z^c,W^c,V^c]_{s}, \\ I_4&=\sum _{0<s\le t} \left( F(\xi _{s}) - F(\xi _{s-}) - F_{z}(\xi _{s-})\Delta Z_s - F_{w}(\xi _{s-})\Delta W_s - F_{v}(\xi _{s-})\Delta V_s \right) , \end{aligned}$$

where \(I_3\) is the abbreviated form of the sum of all second-order terms. Let us analyze the summands \(I_1,\ldots ,I_4\). The stochastic integrals in \(I_1\) have expectation zero. Since Z is a supermartingale, the finite variation process A in its Doob–Meyer decomposition is nonincreasing; combined with \(F_z \ge 0\), this shows that the term \(I_2\) is nonpositive. We also have \(I_3 \le 0\), which follows from the local concavity of F (Lemma 3.2) using again a standard approximation by Riemann-type sums. It remains to handle the last term. We will show that for every \(\omega \in \Omega \), each summand in \(I_4\) is nonpositive. By the differential subordination we have \(|\Delta Y_s| \le |\Delta X_s|\), so Lemma 2.6 (applied with the parameter \(\lambda /\beta < 1/2\)) yields an \(L>0\) such that both values \(U_{\lambda /\beta }(X_{s-}(\omega ),Y_{s-}(\omega ))\) and \(U_{\lambda /\beta }(X_{s}(\omega ),Y_{s}(\omega ))\) belong to the interval [L, 16L]. From this and Lemma 3.3, we have that

$$\begin{aligned} F(\xi _s(\omega ))&= F(U_{\lambda /\beta }(X_s(\omega ),Y_s(\omega )),W_s(\omega ),V_s(\omega ))\\&\le F(\xi _{s-}(\omega ))+ F_{z}(\xi _{s-}(\omega )) \Delta Z_{s}(\omega ) + F_{w}(\xi _{s-}(\omega )) \Delta W_{s}(\omega ) \\&\quad + F_{v}(\xi _{s-}(\omega )) \Delta V_{s}(\omega ). \end{aligned}$$

Hence \(I_{4}\le 0\). By taking expected values of both sides of (4.1), we obtain that

$$\begin{aligned} {\mathbb {E}}F(\xi _t) \le {\mathbb {E}}F(\xi _0). \end{aligned}$$

Combining this with Lemmas 4.1 and 4.2, we get that

$$\begin{aligned} {\mathbb {E}}\exp (\lambda Y_{t})W_{t}&\le {\mathbb {E}}B(X_t,Y_t,W_t,V_t) = 4\,{\mathbb {E}}F(\xi _t) \le 4\,{\mathbb {E}}F(\xi _0)\\&= {\mathbb {E}}B(X_0,Y_0,W_0,V_0) \le 4 \exp (\lambda ) {\mathbb {E}}W. \end{aligned}$$

Since \(W_{t}={\mathbb {E}}(W|{\mathcal {F}}_t)\) and \(\exp (\lambda Y_t)\) is \({\mathcal {F}}_t\)-measurable, the left-hand side equals \({\mathbb {E}}\exp (\lambda Y_{t})W\). Letting \(t \rightarrow \infty \) and applying Fatou's lemma (recall that \(Y_t \rightarrow Y_{\infty }\) almost surely), we obtain (1.5), which completes the proof. \(\square \)