Concentration inequalities for additive functionals: A martingale approach

https://doi.org/10.1016/j.spa.2021.01.004

Abstract

This work shows how exponential concentration inequalities for additive functionals of stochastic processes over a finite time interval can be derived from concentration inequalities for martingales. The approach is entirely probabilistic and naturally includes time-inhomogeneous and non-stationary processes as well as initial laws concentrated on a single point. The class of processes studied includes martingales, Markov processes and general square integrable càdlàg processes. The general approach is complemented by a simple and direct method for martingales, diffusions and discrete-time Markov processes. The method is illustrated by deriving concentration inequalities for the Polyak–Ruppert algorithm, SDEs with time-dependent drift coefficients “contractive at infinity” with both Lipschitz and squared Lipschitz observables, some classical martingales and non-elliptic SDEs.

Introduction

In this work we consider concentration inequalities for additive functionals of the form $\int_0^T X_t\,dt$, where $X$ is a real-valued stochastic process. The methods we develop apply to a broad class of processes, and we will give theorems and examples that go beyond the classical setting of stationary Markov processes. We will treat in depth the cases where $X$ is a martingale or $X_t = f(t, Y_t)$ for a Markov process $Y$ and $f$ in an appropriate class of functions. The concentration inequalities will be derived for the additive functionals centered around their expectation, which allows us to naturally treat non-stationary processes such as time-inhomogeneous diffusions.

We proceed to give an overview of the main results. In Section 2 we derive some representative results using short, self-contained proofs based on direct calculations. First, we show in Proposition 2.1 that for any continuous local martingale $X$ such that $X_0 = 0$ we have the following concentration inequality for any $T, R > 0$:
$$P\left(\int_0^T X_u\,du \ge R;\ \int_0^T (T-u)^2\,d[X]_u \le \sigma^2\right) \le \exp\left(-\frac{R^2}{2\sigma^2}\right).$$
To the author’s knowledge, a systematic treatment of concentration inequalities for additive functionals of martingales has not appeared in the literature before.
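As a sanity check of the inequality in Proposition 2.1, the following sketch simulates the simplest case. It assumes that $X = B$ is a standard Brownian motion, so that $[B]_u = u$ and the constraint $\int_0^T (T-u)^2\,d[B]_u \le \sigma^2$ holds surely with $\sigma^2 = T^3/3$. The function names, the Riemann-sum discretization and the parameter values are illustrative choices and are not taken from the paper.

```python
import math
import random


def integrated_brownian_motion(T, n_steps, rng):
    """Left-endpoint Riemann sum approximating int_0^T B_u du for one path."""
    dt = T / n_steps
    b = 0.0       # current value of the Brownian path, B_0 = 0
    total = 0.0   # running value of the time integral
    for _ in range(n_steps):
        total += b * dt
        b += rng.gauss(0.0, math.sqrt(dt))  # independent N(0, dt) increment
    return total


def empirical_tail(T, R, n_paths=2000, n_steps=200, seed=0):
    """Fraction of simulated paths with int_0^T B_u du >= R."""
    rng = random.Random(seed)
    hits = sum(
        integrated_brownian_motion(T, n_steps, rng) >= R for _ in range(n_paths)
    )
    return hits / n_paths


def gaussian_bound(R, sigma2):
    """Right-hand side exp(-R^2 / (2 sigma^2)) of the inequality."""
    return math.exp(-R * R / (2.0 * sigma2))


if __name__ == "__main__":
    T, R = 1.0, 0.8
    sigma2 = T ** 3 / 3.0  # int_0^T (T - u)^2 du, since [B]_u = u
    print("empirical tail:", empirical_tail(T, R))
    print("bound:         ", gaussian_bound(R, sigma2))
```

In this Gaussian special case $\int_0^T B_u\,du$ is centered normal with variance $T^3/3$, so the simulated tail frequency should lie well below the exponential bound.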

We will then move on to solutions to SDEs of the form
$$dX_t = b(t, X_t)\,dt + \sigma\,dB_t,$$
with $X_0$ deterministic. In particular, denoting by $P_{s,t}$ the Markov transition operator associated to $X$, we will show in Corollary 2.7 that if there exist constants $c, \kappa > 0$ such that
$$|\sigma^\top \nabla P_{t,u} f(x)| \le c\,e^{-\kappa(u-t)}, \quad x \in \mathbb{R}^n,\ 0 \le t \le u,$$
then we have the following Gaussian concentration inequality for all $R, T > 0$:
$$P\left(\frac{1}{T}\int_0^T f(u, X_u)\,du - E\,\frac{1}{T}\int_0^T f(u, X_u)\,du \ge R\right) \le \exp\left(-\frac{\kappa^2 R^2 T}{2c^2}\right).$$
The main novelty in the corollary is the treatment of time-inhomogeneous SDEs and the method of proof. The Proposition from which the corollary is derived also provides a novel, refined statement in terms of a bound of $|\sigma^\top \nabla P_{t,u} f|$ along trajectories of $X$.
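To make the Gaussian bound of Corollary 2.7 concrete, here is a minimal sketch for a one-dimensional Ornstein–Uhlenbeck process $dX_t = -\kappa X_t\,dt + \sigma\,dB_t$ with $f(x) = x$. It assumes that the hypothesis is a bound $c\,e^{-\kappa(u-t)}$ on the $\sigma$-weighted gradient of $P_{t,u}f$; for this example $P_{t,u}f(x) = x\,e^{-\kappa(u-t)}$, so one can take $c = \sigma$. The Euler–Maruyama discretization, the function names and the parameter values are illustrative assumptions, not part of the paper.

```python
import math
import random


def ou_time_average(x0, kappa, sigma, T, n_steps, rng):
    """Euler-Maruyama path of dX = -kappa X dt + sigma dB.

    Returns the left-endpoint approximation of (1/T) int_0^T X_u du.
    """
    dt = T / n_steps
    x = x0
    total = 0.0
    for _ in range(n_steps):
        total += x * dt
        x += -kappa * x * dt + sigma * rng.gauss(0.0, math.sqrt(dt))
    return total / T


def expected_time_average(x0, kappa, T):
    """Exact mean of the time average, using E X_u = x0 exp(-kappa u)."""
    return x0 * (1.0 - math.exp(-kappa * T)) / (kappa * T)


def corollary_bound(R, T, kappa, c):
    """Right-hand side exp(-kappa^2 R^2 T / (2 c^2))."""
    return math.exp(-(kappa ** 2) * (R ** 2) * T / (2.0 * c ** 2))


def ou_empirical_tail(R, x0=1.0, kappa=1.0, sigma=1.0, T=5.0,
                      n_paths=1000, n_steps=250, seed=0):
    """Fraction of paths whose centered time average is at least R."""
    rng = random.Random(seed)
    mean = expected_time_average(x0, kappa, T)
    hits = sum(
        ou_time_average(x0, kappa, sigma, T, n_steps, rng) - mean >= R
        for _ in range(n_paths)
    )
    return hits / n_paths


if __name__ == "__main__":
    R, T, kappa, sigma = 0.5, 5.0, 1.0, 1.0
    print("empirical tail:", ou_empirical_tail(R, kappa=kappa, sigma=sigma, T=T))
    print("bound:         ", corollary_bound(R, T, kappa, c=sigma))
```

The centering uses the exact mean of the continuous-time process, so a small discretization bias remains. It is negligible compared with the gap between the simulated tail and the bound for these parameters.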

Section 2 concludes with the case of discrete-time processes in Proposition 2.12, again giving careful consideration to the time-inhomogeneous case and controlling the relevant quantities along trajectories of the process. Concretely, we show that for a discrete-time stochastic process $X_t$ and function $f$ such that
$$|X_t - X_{t-1}| \le C_t, \quad t \ge 1, \qquad |P_{s,t}f(x) - P_{s,t}f(y)| \le \sigma_s (1-\kappa_s)^{t-s}\,|x-y|, \quad x, y \in \mathbb{R}^n,\ 0 \le s \le t,$$
we have
$$P\left(\sum_{u=1}^{T} f(u, X_u) - E\sum_{u=1}^{T} f(u, X_u) \ge R;\ \sum_{t=1}^{T} \frac{\sigma_t^2 C_t^2}{\kappa_t^2} \le a^2\right) \le \exp\left(-\frac{R^2}{8a^2}\right).$$
The careful treatment of the time-inhomogeneous case and the control at the level of trajectories make it possible, in particular, to derive for the first time concentration inequalities of the correct order for the Polyak–Ruppert algorithm in Section 4.1 (concentration inequalities for the linear case were published concurrently with this work in [30]).
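The hypotheses of Proposition 2.12 can be checked by hand on a toy chain, sketched below. The chain $X_t = (1-\kappa)X_{t-1} + \kappa W_t$ with $X_0 = 0$ and $W_t$ uniform on $[-1, 1]$ is a hypothetical example chosen here, not one from the paper. It stays in $[-1, 1]$, so $|X_t - X_{t-1}| \le 2\kappa$, and for $f(x) = x$ the map $x \mapsto P_{s,t}f(x)$ has Lipschitz constant $(1-\kappa)^{t-s}$. Reading the constants as $C_t = 2\kappa$, $\sigma_s = 1$ and $\kappa_s = \kappa$ gives $a^2 = 4T$, and the bound becomes $\exp(-R^2/(32T))$.

```python
import math
import random


def chain_sum(kappa, T, rng):
    """Sum X_1 + ... + X_T for X_t = (1 - kappa) X_{t-1} + kappa W_t, X_0 = 0."""
    x = 0.0
    total = 0.0
    for _ in range(T):
        w = rng.uniform(-1.0, 1.0)  # bounded, mean-zero noise
        x = (1.0 - kappa) * x + kappa * w
        total += x
    return total


def discrete_bound(R, a2):
    """Right-hand side exp(-R^2 / (8 a^2))."""
    return math.exp(-R * R / (8.0 * a2))


def chain_empirical_tail(R, kappa=0.2, T=100, n_paths=2000, seed=0):
    """Fraction of paths with X_1 + ... + X_T >= R (the sum has mean zero)."""
    rng = random.Random(seed)
    hits = sum(chain_sum(kappa, T, rng) >= R for _ in range(n_paths))
    return hits / n_paths


if __name__ == "__main__":
    T, R = 100, 10.0
    a2 = 4.0 * T  # sum over t of sigma_t^2 C_t^2 / kappa_t^2 with the constants above
    print("empirical tail:", chain_empirical_tail(R, T=T))
    print("bound:         ", discrete_bound(R, a2))
```

With these worst-case constants the bound is far from tight for this particular chain. The sketch only illustrates how the constants enter, not the sharpness of the result.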

Section 3 is dedicated to a wide-ranging generalization of the results from the previous section. In Section 3.1 we introduce a family of auxiliary martingales $Z^u_t = E^{\mathcal{F}_t} X_u$ and show that for general square integrable processes, the concentration properties of $\int_0^T X_u\,du$ are intimately linked to the predictable quadratic covariation $\langle Z^u, Z^v \rangle$ and jumps $\Delta Z^u$ of the auxiliary martingales. In Section 3.2 we recover and extend the martingale results from Section 2 to the discontinuous setting using the general method. In Section 3.3 we apply the general method to general Markov processes and recover expressions for $\langle Z^u, Z^v \rangle$ in terms of the squared field operator $\Gamma$, generalizing the results on Markov processes from Section 2 to general Markov processes on Polish spaces. All of the results from the preceding subsections are novel in their generality. We conclude Section 3 with Section 3.4 where we show how to incorporate arbitrary distributions of $X_0$ and recall a number of martingale inequalities. The subsection also includes in Corollary 3.16 a novel approach to obtain Bernstein-type inequalities in some “self-bounding” cases.
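The role of the auxiliary martingales can be illustrated by a short worked computation. It is a sketch, not the paper's proof. It writes $M^T_t$ for the martingale $E^{\mathcal{F}_t}\int_0^T X_u\,du$ and assumes enough integrability to exchange conditional expectation and time integral. The second part specializes to a continuous martingale $X$, where the auxiliary martingale with index $u$ equals $X_t$ at times $t \le u$, and recovers the quadratic variation term $\int_0^T (T-u)^2\,d[X]_u$ of Proposition 2.1.

```latex
% General case: split the time integral at t (X is adapted, so
% E^{F_t} X_u = X_u for u <= t) and apply conditional Fubini on [t, T].
\begin{align*}
M^T_t = E^{\mathcal{F}_t} \int_0^T X_u \, du
      &= \int_0^t X_u \, du + \int_t^T Z^u_t \, du ,
      \qquad Z^u_t = E^{\mathcal{F}_t} X_u .
\end{align*}

% Special case: X a continuous martingale, so Z^u_t = X_t for u >= t.
\begin{align*}
M^T_t   &= \int_0^t X_u \, du + (T - t) X_t , \\
dM^T_t  &= X_t \, dt + (T - t) \, dX_t - X_t \, dt = (T - t) \, dX_t , \\
[M^T]_T &= \int_0^T (T - t)^2 \, d[X]_t .
\end{align*}
```

Since $M^T_T = \int_0^T X_u\,du$ and $M^T_0$ is its expectation when the initial law is concentrated on a single point, deviation bounds for the martingale $M^T$ translate directly into bounds for the centered additive functional.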

The final Section 4 illustrates how to apply the results from the preceding section to a number of concrete cases. Section 4.1 contains the novel results on Polyak–Ruppert mentioned above.

Section 4.2 provides a concrete example of an SDE case with explicit conditions on the drift and diffusion coefficients as well as the observable function. Using known results on gradient bounds for $P_{s,t}$ when the drift coefficient is “contractive at infinity” characterized by constants $\rho, \kappa$, we show that for any initial law $\nu$ satisfying a $T_1(C)$ transport inequality and Lipschitz function $f$, we have
$$P\left(\frac{1}{T}\int_0^T f(X_t)\,dt - \frac{1}{T}\int_0^T \mu_t(f)\,dt \ge R + \rho\,\|f\|_{\mathrm{Lip}}\,\frac{1-e^{-\kappa T}}{\kappa T}\,W_1(\mu_0, \nu)\right) \le \exp\left(-\frac{\kappa^2 R^2 T}{2\rho^2 \|f\|_{\mathrm{Lip}}^2 \left(1 + C\,\frac{1-e^{-\kappa T}}{T}\right)}\right)$$
for a unique evolution system of measures $\mu_t$ (if the process has a stationary measure $\mu$ then $\mu_t = \mu$ for all $t$).

Section 4.3 provides some concrete examples using classical martingales as integrands: Brownian motion, the compensated Poisson process and compensated squared Brownian motion $B_t^2 - t$.

Sections 4.4 and 4.5 treat the cases where the integrand $X$ is either the squared Ornstein–Uhlenbeck process or, more generally, the square of a Lipschitz function. These cases go beyond the scope of most previously published approaches to concentration inequalities for additive functionals. The final Section 4.6 presents a simple case of a highly non-elliptic SDE, which yields easily to the probabilistic methods presented here but is outside of the scope of previous approaches based, for example, on Poisson equations.

About the literature. In the Markovian setting, our approach is most closely related to the work of Joulin [25], and we recover and extend the results from that work (Proposition 3.7, Proposition 4.10). The cases of martingales and general square integrable processes do not seem to have been systematically studied in the literature.

Most previous results on concentration inequalities for functionals of the form $S_t$ have been obtained for time-homogeneous Markov processes using functional inequalities. The works [7], [19], [21] require the existence of a stationary measure and an initial distribution that has an integrable density with respect to the stationary measure. The same holds true for the combinatorial and perturbation arguments in the classic paper [28]. In [40] the authors establish concentration inequalities around the expectation using stochastic calculus and Girsanov’s theorem under strong contractivity conditions. Some concentration inequalities for inhomogeneous functionals have previously been established in [20]. A different approach using renewal processes has been used in the work [29] to establish concentration inequalities for functionals with bounded integrands.

For Markov processes, the mixing conditions in this work are most naturally formulated in terms of bounds on either the Lipschitz seminorm, gradient or squared field (carré du champ) operator of the semigroup. Bounds on the Lipschitz seminorm are closely related to contractivity in the $L^1$ transportation distance and can for instance be found in [1], [15] for elliptic diffusions, in [39] for the Riemannian case, in [16] for Langevin dynamics or in [23] for stochastic delay equations. See also [26], [31] for the discrete-time case and a large number of examples in both discrete and continuous time. Gradient estimates for semigroups can be obtained using Bismut-type formulas, see for example [12], [17], [36], the works [9], [10] for the non-autonomous case as well as the textbook [37]. Finally, in terms of the squared field operator, our mixing conditions are a relaxation of the Bakry–Émery curvature-dimension condition [2] since we allow for a prefactor strictly greater than 1.

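Notation. For a right-continuous process $(X_t)_{t \ge 0}$ with left limits we write $X_{t-} = \lim_{\varepsilon \to 0+} X_{t-\varepsilon}$ and $\Delta X_t = X_t - X_{t-}$. For a σ-field $\mathcal{F}$ and a random variable $X$, we denote by $E^{\mathcal{F}} X$ the conditional expectation of $X$ with respect to $\mathcal{F}$.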

Section snippets

Continuous martingales

In this and the following subsections we will establish concentration inequalities by focusing on the cases of continuous local martingales, continuous solutions to SDEs and discrete-time stochastic processes, all with their initial law concentrated on a single point. In the first two cases, we provide alternative and more direct proofs of results that also follow from the general approach in Section 3. Compared to the general approach, the direct proofs also provide an explicit martingale

Processes bounded in $L^2(\Omega)$

Consider a filtered probability space $(\Omega, \mathcal{F}, P, (\mathcal{F}_t)_{t \ge 0})$ satisfying the usual conditions from the general theory of semimartingales, meaning that $\mathcal{F}$ is $P$-complete, $\mathcal{F}_0$ contains all $P$-null sets in $\mathcal{F}$ and $(\mathcal{F}_t)$ is right-continuous. In this section $(X_t)_{t \ge 0}$ will denote a real-valued stochastic process adapted to $(\mathcal{F}_t)$, bounded in $L^2(\Omega)$ in the sense that $\sup_t E X_t^2 < \infty$.

Define an adapted continuous finite-variation process $(S_t)_{t \ge 0}$ by
$$S_t = \int_0^t X_u\,du.$$
Fix $T > 0$ and define a martingale $(M^T_t)_{t \ge 0}$ by
$$M^T_t = E^{\mathcal{F}_t} S_T.$$
By the boundedness

Polyak–Ruppert averages

In this section, we use the notation $\Delta X_t = X_t - X_{t-1}$ for a discrete-time process $X$. The symbols $t, s, u, T$ will always denote time variables taking values in $\mathbb{Z}_+$.

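Consider the real-valued process $X$ defined by the recursion
$$X_t = X_{t-1} - \alpha_t\,g(X_{t-1}, W_t), \quad X_0 = x$$
with $x \in \mathbb{R}$, $(\alpha_t)_{t \in \mathbb{N}}$ a sequence in $\mathbb{R}$, $(W_t)$ a sequence of independent identically distributed random variables with common law $\mu$ such that $\mu$ has compact support, and $g: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ a function such that
$$0 < m(w) \le \partial_x g(x, w) \le M(w) < \infty, \quad x, w \in \mathbb{R}$$
for some functions $m: \mathbb{R} \to \mathbb{R}$ and $M: \mathbb{R} \to \mathbb{R}$.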
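The recursion can be sketched in a few lines of code. The example below is only illustrative and rests on several assumptions not taken from the paper: the update is read with a minus sign in front of the step, the Polyak–Ruppert average is taken to be the plain average of the iterates $X_1, \dots, X_T$, and the concrete choices $g(x, w) = x - w$ (so that $\partial_x g = 1$), $W_t$ uniform on $[-1, 1]$ and $\alpha_t = 0.5\,t^{-0.7}$ are hypothetical.

```python
import random


def polyak_ruppert(x0, n_iter, step, g, sample_w):
    """Run X_t = X_{t-1} - step(t) * g(X_{t-1}, W_t) for t = 1..n_iter.

    Returns the last iterate and the average of the iterates X_1..X_{n_iter}.
    """
    x = x0
    running_sum = 0.0
    for t in range(1, n_iter + 1):
        x = x - step(t) * g(x, sample_w())
        running_sum += x
    return x, running_sum / n_iter


def toy_example(seed=0, n_iter=5000):
    """Toy instance: g(x, w) = x - w, W uniform on [-1, 1], step 0.5 * t^(-0.7)."""
    rng = random.Random(seed)
    return polyak_ruppert(
        x0=1.0,
        n_iter=n_iter,
        step=lambda t: 0.5 * t ** -0.7,           # slowly decreasing step size
        g=lambda x, w: x - w,                     # derivative in x is 1
        sample_w=lambda: rng.uniform(-1.0, 1.0),  # compactly supported noise
    )


if __name__ == "__main__":
    last, average = toy_example()
    print("last iterate:    ", last)
    print("averaged iterate:", average)
```

In this toy instance the function $x \mapsto E\,g(x, W)$ vanishes at $0$, and the averaged iterate typically fluctuates much less around $0$ than the last iterate does. Averages of this type are the additive functionals to which the discrete-time concentration inequalities are applied.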

The

Declaration of Competing Interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.spa.2021.01.004.

Funding

This work was supported by the National Research Fund, Luxembourg.

References (40)

  • Bryc, Wlodzimierz et al. Large deviations for quadratic functionals of Gaussian processes. J. Theor. Probab. (1997)
  • Cattiaux, Patrick et al. Central limit theorems for additive functionals of ergodic Markov diffusions processes. ALEA Lat. Am. J. Probab. Math. Stat. (2012)
  • Cattiaux, Patrick et al. Deviation bounds for additive functionals of Markov processes. ESAIM: Probab. Statist. (2008)
  • Cattiaux, P. et al. Semi Log-Concave Markov diffusions
  • Cheng, Li-Juan et al. Evolution systems of measures and semigroup properties on evolving manifolds. Electron. J. Probab. (2018)
  • Cheng, Li-Juan et al. Exponential contraction in Wasserstein distance on static and evolving manifolds (2020)
  • Crisan, D. et al. Uniform in time estimates for the weak error of the Euler method for SDEs and a Pathwise Approach to Derivative Estimates for Diffusion Semigroups (2019)
  • Crisan, Dan et al. Pointwise gradient bounds for degenerate semigroups (of UFG type). Proc. R. Soc. A (2016)
  • Da Prato, Giuseppe et al. A note on evolution systems of measures for time-dependent stochastic differential equations
  • Eberle, Andreas. Reflection couplings and contraction rates for diffusions. Probab. Theory Related Fields (2015)