Concentration inequalities for additive functionals: A martingale approach

doi:10.1016/j.spa.2021.01.004

Stochastic Processes and their Applications

Volume 135, May 2021, Pages 103-138

https://doi.org/10.1016/j.spa.2021.01.004

Abstract

This work shows how exponential concentration inequalities for additive functionals of stochastic processes over a finite time interval can be derived from concentration inequalities for martingales. The approach is entirely probabilistic and naturally includes time-inhomogeneous and non-stationary processes as well as initial laws concentrated on a single point. The class of processes studied includes martingales, Markov processes and general square integrable càdlàg processes. The general approach is complemented by a simple and direct method for martingales, diffusions and discrete-time Markov processes. The method is illustrated by deriving concentration inequalities for the Polyak–Ruppert algorithm, SDEs with time-dependent drift coefficients “contractive at infinity” with both Lipschitz and squared Lipschitz observables, some classical martingales and non-elliptic SDEs.

Introduction

In this work we consider concentration inequalities for additive functionals of the form $\int_{0}^{T} X_{t} d t$ where $X$ is a real-valued stochastic process. The methods we develop apply to a broad class of processes, and we will give theorems and examples that go beyond the classical setting of stationary Markov processes. We will treat in depth the cases where $X$ is a martingale or $X_{t} = f (t, Y_{t})$ for a Markov process $Y$ and $f$ in an appropriate class of functions. The concentration inequalities will be derived for the additive functionals centered around their expectation, which allows us to naturally treat non-stationary processes such as time-inhomogeneous diffusions.

We proceed to give an overview of the main results. In Section 2 we derive some representative results using short, self-contained proofs based on direct calculations. First, we show in Proposition 2.1 that for any continuous local martingale $X$ such that $X_{0} = 0$ we have the following concentration inequality for any $T, R > 0$: $P (\int_{0}^{T} X_{u} d u \geq R; \int_{0}^{T} {(T - u)}^{2} d {[X]}_{u} \leq σ^{2}) \leq \exp (- \frac{R^{2}}{2 σ^{2}}) .$ To the author’s knowledge, a systematic treatment of concentration inequalities for additive functionals of martingales has not appeared in the literature before.
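As a rough numerical illustration of the inequality from Proposition 2.1, the following sketch takes $X$ to be a standard Brownian motion, for which ${[X]}_{u} = u$ and hence $\int_{0}^{T} {(T - u)}^{2} d {[X]}_{u} = T^{3} / 3$ deterministically, so one may take $σ^{2} = T^{3} / 3$. The function name, the Riemann-sum discretization and all parameter values are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def integrated_bm_tail(T=1.0, R=1.0, n_steps=200, n_paths=10000, seed=0):
    """Monte Carlo estimate of P(int_0^T B_u du >= R) next to exp(-R^2 / (2 sigma^2)).

    Illustrative sketch: B is a simulated standard Brownian motion and the
    time integral is approximated by a right-endpoint Riemann sum.
    """
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    # Independent Gaussian increments, cumulated into discretized paths.
    increments = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    paths = np.cumsum(increments, axis=1)
    # Riemann-sum approximation of the additive functional.
    integral = paths.sum(axis=1) * dt
    empirical = float(np.mean(integral >= R))
    # For Brownian motion, int_0^T (T - u)^2 d[B]_u = T^3 / 3.
    sigma2 = T**3 / 3.0
    bound = float(np.exp(-R**2 / (2.0 * sigma2)))
    return empirical, bound

if __name__ == "__main__":
    empirical, bound = integrated_bm_tail()
    print(f"empirical tail: {empirical:.4f}, bound: {bound:.4f}")
```

In this Gaussian case the empirical tail frequency should fall well below the bound, since the bound only uses the variance proxy $σ^{2}$ and not the exact law of the integral.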

We will then move on to solutions to SDEs of the form $d X_{t} = b (t, X_{t}) d t + σ d B_{t},$ with $X_{0}$ deterministic. In particular, denoting by $P_{s, t}$ the Markov transition operator associated to $X$, we will show in Corollary 2.7 that if there exist constants $c, κ > 0$ such that $| σ^{⊤} \nabla P_{t, u} f (x) | \leq c e^{- κ (u - t)}, x \in R^{n}, 0 \leq t \leq u$ then we have the following Gaussian concentration inequality for all $R, T > 0$: $P (\frac{1}{T} \int_{0}^{T} f (u, X_{u}) d u - E [\frac{1}{T} \int_{0}^{T} f (u, X_{u}) d u] \geq R) \leq \exp (- \frac{κ^{2} R^{2} T}{2 c^{2}}) .$ The main novelty in the corollary is the treatment of time-inhomogeneous SDEs and the method of proof. The proposition from which the corollary is derived also provides a novel, refined statement in terms of a bound on $| σ^{⊤} \nabla P_{t, u} f |$ along trajectories of $X$.
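The bound from Corollary 2.7 can be compared with an exact computation in the simplest case. The sketch below assumes a one-dimensional Ornstein–Uhlenbeck process $d X_{t} = - κ X_{t} d t + σ d B_{t}$ with the observable $f (x) = x$, chosen here purely for illustration. In that case $P_{t, u} f (x) = x e^{- κ (u - t)}$, so the gradient condition holds with $c = σ$, and the centered time average is Gaussian with an explicit variance. All function names are hypothetical.

```python
import math

def ou_average_variance(kappa, sigma, T):
    """Exact variance of (1/T) int_0^T X_u du for dX = -kappa X dt + sigma dB, X_0 fixed.

    Uses int_0^T X_u du - E[...] = sigma * int_0^T (1 - exp(-kappa (T - s))) / kappa dB_s.
    """
    integral = (
        T
        - 2.0 * (1.0 - math.exp(-kappa * T)) / kappa
        + (1.0 - math.exp(-2.0 * kappa * T)) / (2.0 * kappa)
    )
    return sigma**2 * integral / (kappa**2 * T**2)

def exact_tail(kappa, sigma, T, R):
    """Exact Gaussian tail P(centered time average >= R)."""
    variance = ou_average_variance(kappa, sigma, T)
    return 0.5 * math.erfc(R / math.sqrt(2.0 * variance))

def corollary_bound(kappa, c, T, R):
    """Right-hand side exp(-kappa^2 R^2 T / (2 c^2)) of the stated inequality."""
    return math.exp(-(kappa**2) * R**2 * T / (2.0 * c**2))

if __name__ == "__main__":
    kappa, sigma, T, R = 1.0, 1.0, 10.0, 0.5
    print(exact_tail(kappa, sigma, T, R), corollary_bound(kappa, sigma, T, R))
```

Because the exact variance is at most $σ^{2} / (κ^{2} T)$, the exact tail stays below the bound for every choice of parameters in this example.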

Section 2 concludes with the case of discrete-time processes in Proposition 2.12, again giving careful consideration to the time-inhomogeneous case and controlling the relevant quantities along trajectories of the process. Concretely, we show that for a discrete-time stochastic process $X_{t}$ and function $f$ such that $| X_{t} - X_{t - 1} | \leq C_{t}, t \geq 1,$ and $| P_{s, t} f (x) - P_{s, t} f (y) | \leq σ_{s} {(1 - κ_{s})}^{t - s} | x - y |, x, y \in R^{n}, 0 \leq s \leq t,$ we have $P (\sum_{u = 1}^{t} f (u, X_{u}) - E [\sum_{u = 1}^{t} f (u, X_{u})] \geq R; \sum_{t = 1}^{T} \frac{σ_{t}^{2} C_{t}^{2}}{κ_{t}^{2}} \leq a^{2}) \leq \exp (- \frac{R^{2}}{8 a^{2}}) .$ The careful treatment of the time-inhomogeneous case and the control at the level of trajectories make it possible, in particular, to derive for the first time concentration inequalities of the correct order for the Polyak–Ruppert algorithm in Section 4.1 (concentration inequalities for the linear case were published concurrently with this work in [30]).
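To make the discrete-time conditions concrete, the following sketch uses a hypothetical autoregressive chain $X_{t} = (1 - κ) X_{t - 1} + ξ_{t}$ with $X_{0} = 0$, noise $ξ_{t}$ uniform on $[- 1, 1]$ and $f (x) = x$. Under these assumptions $| X_{t} | \leq 1 / κ$, so $| X_{t} - X_{t - 1} | \leq 2$, and $P_{s, t} f$ is Lipschitz with constant ${(1 - κ)}^{t - s}$, giving $C_{t} = 2$, $σ_{s} = 1$, $κ_{s} = κ$ and $a^{2} = 4 T / κ^{2}$. The example reads the sum inside the probability as running up to $T$; this reading, the chain and all parameter values are assumptions made for illustration.

```python
import numpy as np

def ar1_tail_vs_bound(kappa=0.5, T=50, R=60.0, n_paths=20000, seed=0):
    """Empirical tail of sum_{u=1}^T X_u for a bounded AR(1) chain versus exp(-R^2 / (8 a^2))."""
    rng = np.random.default_rng(seed)
    noise = rng.uniform(-1.0, 1.0, size=(n_paths, T))
    x = np.zeros(n_paths)      # X_0 = 0 for every path
    total = np.zeros(n_paths)  # running value of sum_u X_u
    for t in range(T):
        x = (1.0 - kappa) * x + noise[:, t]
        total += x
    # The noise is centered and X_0 = 0, so the expectation of the sum is 0.
    empirical = float(np.mean(total >= R))
    # a^2 = sum_t sigma_t^2 C_t^2 / kappa_t^2 with C_t = 2, sigma_t = 1, kappa_t = kappa.
    a2 = T * 4.0 / kappa**2
    bound = float(np.exp(-R**2 / (8.0 * a2)))
    return empirical, bound

if __name__ == "__main__":
    empirical, bound = ar1_tail_vs_bound()
    print(f"empirical tail: {empirical:.4f}, bound: {bound:.4f}")
```

In this toy case the bound is far from tight, because the worst-case increment bound $C_{t} = 2$ is much larger than the typical increment; the sketch is only meant to show how the constants enter.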

Section 3 is dedicated to a wide-ranging generalization of the results from the previous section. In Section 3.1 we introduce a family of auxiliary martingales $Z_{t}^{u} = E^{F_{t}} X_{u}$ and show that for general square integrable processes, the concentration properties of $\int_{0}^{T} X_{u} d u$ are intimately linked to the predictable quadratic covariation $〈 Z^{u}, Z^{v} 〉$ and jumps $Δ Z^{u}$ of the auxiliary martingales. In Section 3.2 we recover and extend the martingale results from Section 2 to the discontinuous setting using the general method. In Section 3.3 we apply the general method to general Markov processes and recover expressions for $〈 Z^{u}, Z^{v} 〉$ in terms of the squared field operator $Γ$ , generalizing the results on Markov processes from Section 2 to general Markov processes on Polish spaces. All of the results from the preceding subsections are novel in their generality. We conclude Section 3 with Section 3.4 where we show how to incorporate arbitrary distributions of $X_{0}$ and recall a number of martingale inequalities. The subsection also includes in Corollary 3.16 a novel approach to obtain Bernstein-type inequalities in some “self-bounding” cases.

The final Section 4 illustrates how to apply the results from the preceding section on a number of concrete cases. Section 4.1 contains the novel results on Polyak–Ruppert mentioned above.

Section 4.2 provides a concrete example of an SDE case with explicit conditions on the drift and diffusion coefficients as well as the observable function. Using known results on gradient bounds for $P_{s, t}$ when the drift coefficient is “contractive at infinity”, characterized by constants $ρ, κ$, we show that for any initial law $ν$ satisfying a $T_{1} (C)$ transport inequality and any Lipschitz function $f$, we have $P (\frac{1}{T} \int_{0}^{T} f (X_{t}) d t - \frac{1}{T} \int_{0}^{T} μ_{t} (f) d t \geq R + ρ ‖ f ‖_{Lip} \frac{1 - e^{- κ T}}{κ T} W_{1} (μ_{0}, ν)) \leq \exp (- \frac{κ^{2} R^{2} T}{2 ρ^{2} ‖ f ‖_{Lip}^{2} (1 + C \frac{1 - e^{- κ T}}{T})})$ for a unique evolution system of measures $μ_{t}$ (if the process has a stationary measure $μ$ then $μ_{t} = μ$ for all $t$).
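To get a feeling for how the constants in this inequality interact, the two sides can be evaluated numerically. The helper functions below simply transcribe the displayed threshold shift and exponential bound; their names and the sample parameter values are hypothetical and serve only as an illustration.

```python
import math

def shifted_threshold(R, rho, lip, kappa, T, w1):
    """Deviation level R plus the bias term rho * ||f||_Lip * (1 - e^{-kappa T}) / (kappa T) * W_1."""
    return R + rho * lip * (1.0 - math.exp(-kappa * T)) / (kappa * T) * w1

def tail_bound(R, rho, lip, kappa, T, C):
    """Right-hand side exp(-kappa^2 R^2 T / (2 rho^2 ||f||_Lip^2 (1 + C (1 - e^{-kappa T}) / T)))."""
    correction = 1.0 + C * (1.0 - math.exp(-kappa * T)) / T
    return math.exp(-(kappa**2) * R**2 * T / (2.0 * rho**2 * lip**2 * correction))

if __name__ == "__main__":
    # Illustrative values only: rho, kappa, C and W_1 are not taken from the paper.
    for T in (1.0, 10.0, 100.0):
        level = shifted_threshold(R=0.5, rho=2.0, lip=1.0, kappa=1.0, T=T, w1=1.0)
        bound = tail_bound(R=0.5, rho=2.0, lip=1.0, kappa=1.0, T=T, C=1.0)
        print(f"T={T:6.1f}  threshold={level:.4f}  bound={bound:.3e}")
```

Both the bias term in the threshold and the correction involving $C$ vanish as $T$ grows, so for large $T$ the bound behaves like $\exp (- κ^{2} R^{2} T / (2 ρ^{2} ‖ f ‖_{Lip}^{2}))$.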

Section 4.3 provides some concrete examples using classical martingales as integrands: Brownian motion, the compensated Poisson process and compensated squared Brownian motion $B_{t}^{2} - t$ .

Sections 4.4 and 4.5 treat the cases where the integrand $X$ is, respectively, the squared Ornstein–Uhlenbeck process or, more generally, the square of a Lipschitz function. These cases go beyond the scope of most previously published approaches to concentration inequalities for additive functionals. The final Section 4.6 presents a simple case of a highly non-elliptic SDE, which yields easily to the probabilistic methods presented here but is outside the scope of previous approaches based, for example, on Poisson equations.

About the literature. In the Markovian setting, our approach is most closely related to the work of Joulin [25], and we recover and extend the results from that work (Proposition 3.7, Proposition 4.10). The cases of martingales and general square integrable processes do not seem to have been systematically studied in the literature.

Most previous results on concentration inequalities for functionals of the form $S_{t}$ have been obtained for time-homogeneous Markov processes using functional inequalities. The works [7], [19], [21] require the existence of a stationary measure and an initial distribution that has an integrable density with respect to the stationary measure. The same holds true for the combinatorial and perturbation arguments in the classic paper [28]. In [40] the authors establish concentration inequalities around the expectation using stochastic calculus and Girsanov’s theorem under strong contractivity conditions. Some concentration inequalities for inhomogeneous functionals have previously been established in [20]. A different approach using renewal processes has been used in the work [29] to establish concentration inequalities for functionals with bounded integrands.

For Markov processes, the mixing conditions in this work are most naturally formulated in terms of bounds on either the Lipschitz seminorm, the gradient or the squared field (carré du champ) operator of the semigroup. Bounds on the Lipschitz seminorm are closely related to contractivity in the $L^{1}$ transportation distance and can for instance be found in [1], [15] for elliptic diffusions, in [39] for the Riemannian case, in [16] for Langevin dynamics or in [23] for stochastic delay equations. See also [26], [31] for the discrete-time case and a large number of examples in both discrete and continuous time. Gradient estimates for semigroups can be obtained using Bismut-type formulas, see for example [12], [17], [36], the works [9], [10] for the non-autonomous case as well as the textbook [37]. Finally, in terms of the squared field operator, our mixing conditions are a relaxation of the Bakry–Émery curvature-dimension condition [2] since we allow for a prefactor strictly greater than $1$.

Notation. For a right-continuous process ${(X_{t})}_{t \geq 0}$ with left limits we write $X_{t -} = \lim_{ε \to 0^{+}} X_{t - ε}$ and $Δ X_{t} = X_{t} - X_{t -}$. For a $σ$-field $F$ and a random variable $X$, we denote by $E^{F} X$ the conditional expectation of $X$ with respect to $F$.

Section snippets

Continuous martingales

In this and the following subsections we will establish concentration inequalities by focusing on the cases of continuous local martingales, continuous solutions to SDEs and discrete-time stochastic processes, all with their initial law concentrated on a single point. In the first two cases, we provide alternative and more direct proofs of results that also follow from the general approach in Section 3. Compared to the general approach, the direct proofs also provide an explicit martingale

Processes bounded in $L^{2} (Ω)$

Consider a filtered probability space $(Ω, F, P, {(F_{t})}_{t \geq 0})$ satisfying the usual conditions from the general theory of semimartingales, meaning that $F$ is $P$-complete, $F_{0}$ contains all $P$-null sets in $F$ and $F_{t}$ is right-continuous. In this section ${(X_{t})}_{t \geq 0}$ will denote a real-valued stochastic process adapted to $F_{t}$, bounded in $L^{2} (Ω)$ in the sense that $\sup_{t} E X_{t}^{2} < \infty$.

Define an adapted continuous finite-variation process ${(S_{t})}_{t \geq 0}$ by $S_{t} = \int_{0}^{t} X_{u} d u .$ Fix $T > 0$ and define a martingale ${(M_{t}^{T})}_{t \geq 0}$ by $M_{t}^{T} = E^{F_{t}} S_{T} .$ By the boundedness

Polyak–Ruppert averages

In this section, we use the notation $Δ X_{t} = X_{t} - X_{t - 1}$ for a discrete-time process $X$ . The symbols $t, s, u, T$ will always denote time variables taking values in $Z^{+}$ .

Consider the real-valued process $X$ defined by the recursion $X_{t} = X_{t - 1} - α_{t} g (X_{t - 1}, W_{t}), X_{0} = x$ with $x \in R$ , ${(α_{t})}_{t \in N}$ a sequence in $R$ , $W_{t}$ a sequence of independent identically distributed random variables with common law $μ$ such that $μ$ has compact support, and $g : R \times R \to R$ is a function such that $0 < m (w) \leq \partial_{x} g (x, w) \leq M (w) < \infty, x, w \in R$ for some functions $m : R \to R$ and $M : R \to R$ .
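The recursion above can be simulated directly. The sketch below uses a hypothetical choice of $g$, namely $g (x, w) = (1.5 + 0.5 w) x - w$ with $W_{t}$ uniform on $[- 1, 1]$, so that $\partial_{x} g (x, w) = 1.5 + 0.5 w$ lies in $[1, 2]$ and the law of $W_{t}$ has compact support. The step sizes $α_{t} = 0.5 / \sqrt{t}$ and the starting point are likewise assumptions made for the example; the averaged iterate $\frac{1}{T} \sum_{t = 1}^{T} X_{t}$ is the Polyak–Ruppert average.

```python
import math
import numpy as np

def g(x, w):
    """Hypothetical update function: d/dx g(x, w) = 1.5 + 0.5 w, which lies in [1, 2] for w in [-1, 1]."""
    return (1.5 + 0.5 * w) * x - w

def polyak_ruppert(x0=5.0, n_steps=2000, seed=0):
    """Run X_t = X_{t-1} - alpha_t g(X_{t-1}, W_t) and return (last iterate, averaged iterate)."""
    rng = np.random.default_rng(seed)
    x = x0
    running_sum = 0.0
    for t in range(1, n_steps + 1):
        w = rng.uniform(-1.0, 1.0)    # compactly supported noise (assumed uniform)
        alpha = 0.5 / math.sqrt(t)    # assumed step-size sequence
        x = x - alpha * g(x, w)
        running_sum += x
    return x, running_sum / n_steps

if __name__ == "__main__":
    last, average = polyak_ruppert()
    print(f"last iterate: {last:.4f}, Polyak-Ruppert average: {average:.4f}")
```

For this choice of $g$ the mean field $E g (x, W_{t}) = 1.5 x$ vanishes at $x = 0$, so both the last iterate and the average should end up close to zero, with the average typically fluctuating less.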

The

Declaration of Competing Interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.spa.2021.01.004.

Funding

This work was supported by the National Research Fund, Luxembourg.

References (40)

Dzhaparidze K. et al.
On Bernstein-type inequalities for martingales
Stoch. Process. Appl.
(2001)

Elworthy K.D. et al.
Formulae for the derivatives of heat semigroups
J. Funct. Anal.
(1994)

Guillin Arnaud
Moderate deviations of inhomogeneous functionals of Markov processes and application to averaging
Stoch. Process. Appl.
(2001)

Kuwada Kazumasa
Duality on gradient estimates and Wasserstein controls
J. Funct. Anal.
(2010)

Ollivier Yann
Ricci curvature of Markov chains on metric spaces
J. Funct. Anal.
(2009)

Wu Liming
Gradient estimates of Poisson equations on Riemannian manifolds and applications
J. Funct. Anal.
(2009)

Assaraf Roland et al.
Computation of sensitivities for the invariant measure of a parameter dependent diffusion
Stoch. Partial Differ. Equ. Anal. Comput.
(2017)

Bakry Dominique et al.

Bercu Bernard et al.

Boucheron Stéphane et al.
Concentration inequalities: a nonasymptotic theory of independence
(2013)

Bryc Wlodzimierz et al.
Large deviations for quadratic functionals of Gaussian processes
J. Theor. Probab.
(1997)

Cattiaux Patrick et al.
Central limit theorems for additive functionals of ergodic Markov diffusions processes
ALEA Lat. Am. J. Probab. Math. Stat.
(2012)

Cattiaux Patrick et al.
Deviation bounds for additive functionals of Markov processes
ESAIM: Probab. Statist.
(2008)

Cattiaux P. et al.
Semi Log-Concave Markov diffusions

Cheng Li-Juan et al.
Evolution systems of measures and semigroup properties on evolving manifolds
Electron. J. Probab.
(2018)

Cheng Li-Juan et al.
Exponential contraction in Wasserstein distance on static and evolving manifolds
(2020)

Crisan D. et al.
Uniform in time estimates for the weak error of the Euler method for SDEs and a Pathwise Approach to Derivative Estimates for Diffusion Semigroups
(2019)

Crisan Dan et al.
Pointwise gradient bounds for degenerate semigroups (of UFG type)
Proc. R. Soc. A
(2016)

Da Prato Giuseppe et al.
A note on evolution systems of measures for time-dependent stochastic differential equations

Eberle Andreas
Reflection couplings and contraction rates for diffusions
Probab. Theory Related Fields
(2015)

