Concentration inequalities for additive functionals: A martingale approach
Introduction
In this work we consider concentration inequalities for additive functionals of a real-valued stochastic process. The methods we develop apply to a broad class of processes, and we give theorems and examples that go beyond the classical setting of stationary Markov processes. We treat in depth the cases where the process is a martingale or a function of a Markov process, with the function taken from an appropriate class. The concentration inequalities are derived for the additive functionals centered around their expectation, which allows us to treat non-stationary processes such as time-inhomogeneous diffusions in a natural way.
We proceed to give an overview of the main results. In Section 2 we derive some representative results using short, self-contained proofs based on direct calculations. First, we show in Proposition 2.1 that for any continuous local martingale such that we have the following concentration inequality for any : To the author’s knowledge, the systematic treatment of concentration inequalities for additive functionals of martingales has not appeared in the literature before.
We will then move on to solutions to SDEs of the form with deterministic. In particular, denoting the Markov transition operator associated to , we will show in Corollary 2.7 that if there exist constants such that then we have the following Gaussian concentration inequality for all : The main novelty in the corollary is the treatment of time-inhomogeneous SDEs and the method of proof. The Proposition from which the corollary is derived also provides a novel, refined statement in terms of a bound of along trajectories of .
Section 2 concludes with the case of discrete-time processes in Proposition 2.12, again giving careful consideration to the time-inhomogeneous case and controlling the relevant quantities along trajectories of the process. Concretely, we show that for a discrete-time stochastic process and function such that we have The careful treatment of the time-inhomogeneous case and control on the level of trajectories enables in particular for the first time the derivation of concentration inequalities for the Polyak–Ruppert algorithm of the correct order in Section 4.1 (concentration inequalities for the linear case were published concurrently with this work in [30]).
Section 3 is dedicated to a wide-ranging generalization of the results from the previous section. In Section 3.1 we introduce a family of auxiliary martingales and show that for general square integrable processes, the concentration properties of are intimately linked to the predictable quadratic covariation and jumps of the auxiliary martingales. In Section 3.2 we recover and extend the martingale results from Section 2 to the discontinuous setting using the general method. In Section 3.3 we apply the general method to general Markov processes and recover expressions for in terms of the squared field operator , generalizing the results on Markov processes from Section 2 to general Markov processes on Polish spaces. All of the results from the preceding subsections are novel in their generality. We conclude Section 3 with Section 3.4 where we show how to incorporate arbitrary distributions of and recall a number of martingale inequalities. The subsection also includes in Corollary 3.16 a novel approach to obtain Bernstein-type inequalities in some “self-bounding” cases.
The final Section 4 illustrates how to apply the results from the preceding section to a number of concrete cases. Section 4.1 contains the novel results on Polyak–Ruppert mentioned above.
Section 4.2 provides a concrete example of an SDE case with explicit conditions on the drift and diffusion coefficients as well as the observable function. Using known results on gradient bounds for when the drift coefficient is “contractive at infinity” characterized by constants , we show that for any initial law satisfying a transport inequality and Lipschitz function , we have for a unique evolution system of measures (if the process has a stationary measure then for all ).
Section 4.3 provides some concrete examples using classical martingales as integrands: Brownian motion, the compensated Poisson process and compensated squared Brownian motion.
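The Brownian case can be explored numerically. The sketch below is an illustration only and does not reproduce the inequality proved in the paper: it relies on the elementary fact that the time integral of a standard Brownian motion over [0, T] is a centered Gaussian with variance T^3/3, and compares simulated upper tails with the corresponding Gaussian (Chernoff) bound exp(-3 r^2 / (2 T^3)). The function names, the left-endpoint discretization and the parameter values are assumptions made for this example.

```python
import math
import random


def integrated_brownian_motion(T, n_steps, rng):
    """Left-endpoint Riemann sum approximating the integral of B_t over [0, T]."""
    dt = T / n_steps
    b = 0.0          # current value of the simulated Brownian path
    integral = 0.0   # running Riemann sum
    for _ in range(n_steps):
        integral += b * dt
        b += rng.gauss(0.0, math.sqrt(dt))  # independent Gaussian increment
    return integral


def empirical_tail(samples, r):
    """Fraction of samples that are at least r."""
    return sum(1 for s in samples if s >= r) / len(samples)


# Illustrative parameters (assumptions, not taken from the paper).
rng = random.Random(0)
T, n_steps, n_paths = 1.0, 100, 4000
samples = [integrated_brownian_motion(T, n_steps, rng) for _ in range(n_paths)]

variance_exact = T ** 3 / 3.0  # variance of the exact integral
variance_empirical = sum(s * s for s in samples) / n_paths
print(f"variance: exact {variance_exact:.4f}, simulated {variance_empirical:.4f}")

for r in (0.5, 1.0, 1.5):
    bound = math.exp(-r * r / (2.0 * variance_exact))  # Gaussian Chernoff bound
    print(f"r={r}: simulated tail {empirical_tail(samples, r):.4f}, bound {bound:.4f}")
```

The simulated variance carries a small discretization bias from the Riemann sum, so agreement with T^3/3 is only approximate.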
Sections 4.4 and 4.5 treat the cases where the integrand is either the squared Ornstein–Uhlenbeck process or, more generally, the square of a Lipschitz function. These cases go beyond the scope of most previously published approaches to concentration inequalities for additive functionals. The final Section 4.6 presents a simple case of a highly non-elliptic SDE, which yields easily to the probabilistic methods presented here but is outside the scope of previous approaches based, for example, on Poisson equations.
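To make the squared Ornstein–Uhlenbeck case concrete, here is a minimal simulation sketch. It is not the construction used in the paper: the specific SDE dX = -X dt + sqrt(2) dB, the Euler–Maruyama discretization, the starting point 0 and all parameter values are assumptions chosen for illustration. For this process started at 0 one has E[X_t^2] = 1 - exp(-2t), so the additive functional of X^2 over [0, T] has mean T - (1 - exp(-2T))/2, and the code centers the simulated functional around that non-stationary expectation.

```python
import math
import random


def squared_ou_functional(T, dt, x0, rng):
    """Riemann sum of X_t^2 over [0, T] along an Euler-Maruyama path of
    dX = -X dt + sqrt(2) dB started at x0 (an illustrative choice of SDE)."""
    n_steps = int(round(T / dt))
    noise_scale = math.sqrt(2.0 * dt)
    x = x0
    total = 0.0
    for _ in range(n_steps):
        total += x * x * dt
        x += -x * dt + noise_scale * rng.gauss(0.0, 1.0)
    return total


# Illustrative parameters (assumptions, not taken from the paper).
rng = random.Random(0)
T, dt, n_paths = 5.0, 0.01, 1000
values = [squared_ou_functional(T, dt, 0.0, rng) for _ in range(n_paths)]

# Mean of the exact functional for the process started at 0.
exact_mean = T - (1.0 - math.exp(-2.0 * T)) / 2.0
empirical_mean = sum(values) / n_paths
print(f"mean: exact {exact_mean:.4f}, simulated {empirical_mean:.4f}")

# Empirical frequency of large deviations from the (non-stationary) expectation.
centered = [v - exact_mean for v in values]
for r in (2.0, 4.0, 6.0):
    freq = sum(1 for c in centered if abs(c) >= r) / n_paths
    print(f"r={r}: fraction of paths with |deviation| >= r is {freq:.4f}")
```

Centering around the exact expectation rather than the stationary mean T is what accommodates the deterministic starting point.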
About the literature. In the Markovian setting, our approach is most closely related to the work of Joulin [25], and we recover and extend the results from that work (Proposition 3.7, Proposition 4.10). The cases of martingales and general square integrable processes do not seem to have been systematically studied in the literature.
Most previous results on concentration inequalities for functionals of the form have been obtained for time-homogeneous Markov processes using functional inequalities. The works [7], [19], [21] require the existence of a stationary measure and an initial distribution that has an integrable density with respect to the stationary measure. The same holds true for the combinatorial and perturbation arguments in the classic paper [28]. In [40] the authors establish concentration inequalities around the expectation using stochastic calculus and Girsanov’s theorem under strong contractivity conditions. Some concentration inequalities for inhomogeneous functionals have previously been established in [20]. A different approach using renewal processes has been used in the work [29] to establish concentration inequalities for functionals with bounded integrands.
For Markov processes, the mixing conditions in this work are most naturally formulated in terms of bounds on either the Lipschitz seminorm, gradient or squared field (carré du champ) operator of the semigroup. Bounds on the Lipschitz seminorm are closely related to contractivity in the transportation distance and can for instance be found in [1], [15] for elliptic diffusions, in [39] for the Riemannian case, in [16] for Langevin dynamics or in [23] for stochastic delay equations. See also [26], [31] for the discrete-time case and a large number of examples in both discrete and continuous time. Gradient estimates for semigroups can be obtained using Bismut-type formulas, see for example [12], [17], [36], the works [9], [10] for the non-autonomous case as well as the textbook [37]. Finally, in terms of the squared field operator, our mixing conditions are a relaxation of the Bakry–Émery curvature-dimension condition [2] since we allow for a prefactor strictly greater than 1.
Notation. For a right-continuous process with left limits we write and . For a σ-field and a random variable , we denote the conditional expectation of with respect to .
Continuous martingales
In this and the following subsections we will establish concentration inequalities by focusing on the cases of continuous local martingales, continuous solutions to SDEs and discrete-time stochastic processes, all with their initial law concentrated on a single point. In the first two cases, we provide alternative and more direct proofs of results that also follow from the general approach in Section 3. Compared to the general approach, the direct proofs also provide an explicit martingale
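Processes bounded in $L^2$
Consider a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$ satisfying the usual conditions from the general theory of semimartingales, meaning that $\mathcal{F}$ is $\mathbb{P}$-complete, $\mathcal{F}_0$ contains all $\mathbb{P}$-null sets in $\mathcal{F}$ and $(\mathcal{F}_t)_{t \ge 0}$ is right-continuous. In this section $X$ will denote a real-valued stochastic process adapted to $(\mathcal{F}_t)_{t \ge 0}$, bounded in $L^2$ in the sense that $\sup_t \mathbb{E}[X_t^2] < \infty$.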
Define an adapted continuous finite-variation process by Fix and define a martingale by By the boundedness
Polyak–Ruppert averages
In this section, we use the notation for a discrete-time process . The symbols will always denote time variables taking values in .
Consider the real-valued process defined by the recursion with , a sequence in , a sequence of independent identically distributed random variables with common law such that has compact support, and is a function such that for some functions and .
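As a rough illustration of Polyak–Ruppert averaging, the sketch below runs a stochastic approximation recursion with decreasing step sizes and bounded i.i.d. noise, and returns both the last iterate and the running average of the iterates. It is a simplified stand-in rather than the recursion analysed in the paper: the linear drift x - target, the step sizes n^(-2/3), the uniform noise on [-1, 1] and the function name polyak_ruppert are all assumptions made for this example.

```python
import random


def polyak_ruppert(n_steps, x0, target, rng):
    """Run a stochastic approximation recursion and return
    (last iterate, average of the iterates X_1, ..., X_n)."""
    x = x0
    running_sum = 0.0
    for n in range(1, n_steps + 1):
        step = n ** (-2.0 / 3.0)        # hypothetical step-size schedule
        noise = rng.uniform(-1.0, 1.0)  # i.i.d. noise with compact support
        # Noisy observation of the (assumed linear) drift x - target.
        x = x - step * ((x - target) + noise)
        running_sum += x
    return x, running_sum / n_steps


# Illustrative run (all values are assumptions, not taken from the paper).
last_iterate, averaged_iterate = polyak_ruppert(
    n_steps=20000, x0=5.0, target=1.0, rng=random.Random(1)
)
print(f"last iterate:     {last_iterate:.4f}")
print(f"averaged iterate: {averaged_iterate:.4f}")
```

The averaged iterate is a normalized additive functional of the discrete-time process, which is the type of quantity the concentration inequalities are meant to control.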
Declaration of Competing Interest
No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.spa.2021.01.004.
Funding
This work was supported by the National Research Fund, Luxembourg.
References (40)
- et al., On Bernstein-type inequalities for martingales, Stoch. Process. Appl. (2001)
- et al., Formulae for the derivatives of heat semigroups, J. Funct. Anal. (1994)
- Moderate deviations of inhomogeneous functionals of Markov processes and application to averaging, Stoch. Process. Appl. (2001)
- Duality on gradient estimates and Wasserstein controls, J. Funct. Anal. (2010)
- Ricci curvature of Markov chains on metric spaces, J. Funct. Anal. (2009)
- Gradient estimates of Poisson equations on Riemannian manifolds and applications, J. Funct. Anal. (2009)
- et al., Computation of sensitivities for the invariant measure of a parameter dependent diffusion, Stoch. Partial Differ. Equ. Anal. Comput. (2017)
- et al., Concentration inequalities: a nonasymptotic theory of independence (2013)
- Large deviations for quadratic functionals of Gaussian processes, J. Theor. Probab.
- Central limit theorems for additive functionals of ergodic Markov diffusions processes, ALEA Lat. Am. J. Probab. Math. Stat.
- Deviation bounds for additive functionals of Markov processes, ESAIM: Probab. Statist.
- Semi Log-Concave Markov diffusions
- Evolution systems of measures and semigroup properties on evolving manifolds, Electron. J. Probab.
- Exponential contraction in Wasserstein distance on static and evolving manifolds
- Uniform in time estimates for the weak error of the Euler method for SDEs and a Pathwise Approach to Derivative Estimates for Diffusion Semigroups
- Pointwise gradient bounds for degenerate semigroups (of UFG type), Proc. R. Soc. A
- A note on evolution systems of measures for time-dependent stochastic differential equations
- Reflection couplings and contraction rates for diffusions, Probab. Theory Related Fields