Efficient Large Deviation Estimation Based on Importance Sampling

Guyader, Arnaud; Touchette, Hugo

doi:10.1007/s10955-020-02589-x

Published: 06 July 2020

Volume 181, pages 551–586, (2020)
Journal of Statistical Physics

Abstract

We present a complete framework for determining the asymptotic (or logarithmic) efficiency of estimators of large deviation probabilities and rate functions based on importance sampling. The framework relies on the idea that importance sampling in that context is fully characterized by the joint large deviations of two random variables: the observable defining the large deviation probability of interest and the likelihood factor (or Radon–Nikodym derivative) connecting the original process and the modified process used in importance sampling. We recover with this framework known results about the asymptotic efficiency of the exponential tilting and obtain new necessary and sufficient conditions for a general change of process to be asymptotically efficient. This allows us to construct new examples of efficient estimators for sample means of random variables that do not have the exponential tilting form. Other examples involving Markov chains and diffusions are presented to illustrate our results.

Notes

We could identify the new copies with a different symbol, say ${\tilde{{\mathbf {X}}}}_n^{(i)}$, since they are generated from a different distribution and so represent a different random variable. Here, we keep ${\mathbf {X}}_n^{(i)}$ but always specify the distribution, $P_n$ or $Q_n$, used. The same applies to the observable.
We use the same letter $\lambda $ for the Legendre–Fenchel transform and for the SCGF in (23), since, as already mentioned, the Gärtner–Ellis theorem ensures that, under appropriate conditions, both functions coincide.
A corner in $I_P(m)$ or $I_Q(m)$ signals physically a dynamical phase transition in the fluctuations of $M_n$. Here, we assume, for simplicity, that no such phase transition occurs. Note that a corner in the function $I_Q^B(w)$ is not related to a dynamical phase transition, since this function is obtained by conditioning. It can have a corner, as the example of the exponential tilting shows, regardless of whether $I_P(m)$ or $I_Q(m)$ is smooth.

References

1. Shwartz, A., Weiss, A.: Large Deviations for Performance Analysis. Stochastic Modeling Series. Chapman and Hall, London (1995)
2. Wales, D.: Energy Landscapes: Applications to Clusters, Biomolecules and Glasses. Cambridge University Press, Cambridge (2004)
3. E, W., Ren, W., Vanden-Eijnden, E.: Minimum action method for the study of rare events. Commun. Pure Appl. Math. 57, 637–656 (2004)
4. Lelièvre, T., Rousset, M., Stoltz, G. (eds.): Free Energy Computations: A Mathematical Perspective. Imperial College Press, London (2010)
5. Ellis, R.S.: Entropy, Large Deviations, and Statistical Mechanics. Springer, New York (1985)
6. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer, New York (1998)
7. den Hollander, F.: Large Deviations, Fields Institute Monograph. AMS, Providence (2000)
8. Touchette, H.: The large deviation approach to statistical mechanics. Phys. Rep. 478, 1–69 (2009)
9. Garrahan, J.P., Jack, R.L., Lecomte, V., Pitard, E., van Duijvendijk, K., van Wijland, F.: Dynamical first-order phase transition in kinetically constrained models of glasses. Phys. Rev. Lett. 98, 195702 (2007)
10. Garrahan, J.P., Lesanovsky, I.: Thermodynamics of quantum jump trajectories. Phys. Rev. Lett. 104, 160601 (2010)
11. Espigares, C.P., Garrido, P.L., Hurtado, P.I.: Dynamical phase transition for current statistics in a simple driven diffusive system. Phys. Rev. E 87, 032115 (2013)
12. Bunin, G., Kafri, Y., Podolsky, D.: Cusp singularities in boundary-driven diffusive systems. J. Stat. Phys. 152, 112–135 (2013)
13. Tsobgni Nyawo, P., Touchette, H.: A minimal model of dynamical phase transition. Europhys. Lett. 116, 50009 (2016)
14. Lazarescu, A.: Generic dynamical phase transition in one-dimensional bulk-driven lattice gases with exclusion. J. Phys. A 50, 254004 (2017)
15. Gallavotti, G., Cohen, E.G.D.: Dynamical ensembles in nonequilibrium statistical mechanics. Phys. Rev. Lett. 74, 2694–2697 (1995)
16. Kurchan, J.: Fluctuation theorem for stochastic dynamics. J. Phys. A 31, 3719–3729 (1998)
17. Lebowitz, J.L., Spohn, H.: A Gallavotti-Cohen-type symmetry in the large deviation functional for stochastic dynamics. J. Stat. Phys. 95, 333–365 (1999)
18. Harris, R.J., Schütz, G.M.: Fluctuation theorems for stochastic dynamics. J. Stat. Mech. 2007, P07020 (2007)
19. Baiesi, M., Maes, C., Wynants, B.: Fluctuations and response of nonequilibrium states. Phys. Rev. Lett. 103, 010602 (2009)
20. Derrida, B.: Non-equilibrium steady states: Fluctuations and large deviations of the density and of the current. J. Stat. Mech. 2007, P07023 (2007)
21. Bertini, L., De Sole, A., Gabrielli, D., Jona-Lasinio, G., Landim, C.: Stochastic interacting particle systems out of equilibrium. J. Stat. Mech. 2007, P07014 (2007)
22. Harris, R.J., Touchette, H.: Large deviation approach to nonequilibrium systems. In: Klages, R., Just, W., Jarzynski, C. (eds.) Nonequilibrium Statistical Physics of Small Systems: Fluctuation Relations and Beyond, Reviews of Nonlinear Dynamics and Complexity, vol. 6, pp. 335–360. Wiley-VCH, Weinheim (2013)
23. Garrahan, J.P.: Aspects of non-equilibrium in classical and quantum systems: slow relaxation and glasses, dynamical large deviations, quantum non-ergodicity, and open quantum dynamics. Physica A 504, 130–154 (2018)
24. Sekimoto, K.: Stochastic Energetics, Lect. Notes Phys., vol. 799. Springer, New York (2010)
25. Seifert, U.: Stochastic thermodynamics, fluctuation theorems and molecular machines. Rep. Prog. Phys. 75, 126001 (2012)
26. Seifert, U.: Stochastic thermodynamics: from principles to the cost of precision. Physica A 504, 176–191 (2018)
27. Ciliberto, S.: Experiments in stochastic thermodynamics: short history and perspectives. Phys. Rev. X 7, 021051 (2017)
28. Cérou, F., Guyader, A.: Adaptive multilevel splitting for rare event analysis. Stoch. Anal. Appl. 25, 417–443 (2007)
29. Dean, T., Dupuis, P.: Splitting for rare event simulation: a large deviation approach to design and analysis. Stoch. Proc. Appl. 119, 562–587 (2009)
30. Cérou, F., Guyader, A., Lelièvre, T., Pommier, D.: A multiple replica approach to simulate reactive trajectories. J. Chem. Phys. 134, 054108 (2011)
31. Cérou, F., Delyon, B., Guyader, A., Rousset, M.: On the asymptotic normality of adaptive multilevel splitting. SIAM J. Uncertain. Quant. 7, 1–30 (2019)
32. Cérou, F., Guyader, A., Rousset, M.: Adaptive multilevel splitting: historical perspective and recent results. Chaos 29, 043108 (2019)
33. Bréhier, C.-E., Lelièvre, T.: On a new class of score functions to estimate tail probabilities of some stochastic processes with adaptive multilevel splitting. Chaos 29, 033126 (2019)
34. Grassberger, P.: Go with the winners: a general Monte Carlo strategy. Comput. Phys. Commun. 147, 64–70 (2002)
35. Giardina, C., Kurchan, J., Peliti, L.: Direct evaluation of large-deviation functions. Phys. Rev. Lett. 96, 120603 (2006)
36. Lecomte, V., Tailleur, J.: A numerical approach to large deviations in continuous time. J. Stat. Mech. 2007, P03004 (2007)
37. Angeli, L., Grosskinsky, S., Johansen, A.M., Pizzoferrato, A.: Rare event simulation for stochastic dynamics in continuous time. J. Stat. Phys. 176, 1185–1210 (2019)
38. Torrie, G.M., Valleau, J.P.: Nonphysical sampling distributions in Monte Carlo free-energy estimation: umbrella sampling. J. Comput. Phys. 23, 187–199 (1977)
39. Juneja, S., Shahabuddin, P.: Rare-event simulation techniques: an introduction and recent advances, Chap. 11, pp. 291–350. Elsevier, Amsterdam (2006)
40. Asmussen, S., Glynn, P.W.: Stochastic Simulation: Algorithms and Analysis. Stochastic Modelling and Applied Probability. Springer, New York (2007)
41. Bucklew, J.A.: Introduction to Rare Event Simulation. Springer, New York (2004)
42. Sadowsky, J.S., Bucklew, J.A.: Large deviations theory techniques in Monte Carlo simulation. In: MacNair, E.A., Musselman, K.J., Heidelberger, P. (eds.) Proceedings of the 1989 Winter Simulation Conference, pp. 505–513. ACM, New York (1989)
43. Sadowsky, J.S., Bucklew, J.A.: On large deviations theory and asymptotically efficient Monte Carlo estimation. IEEE Trans. Inf. Theory 36, 579–588 (1990)
44. Bucklew, J.A., Ney, P., Sadowsky, J.S.: Monte Carlo simulation and large deviations theory for uniformly recurrent Markov chains. J. Appl. Prob. 27, 44–59 (1990)
45. Schlebusch, H.-J.: On the asymptotic efficiency of importance sampling techniques. IEEE Trans. Inf. Theory 39, 710–715 (1993)
46. Dieker, A.B., Mandjes, M.: On asymptotically efficient simulation of large deviation probabilities. Adv. Appl. Prob. 37, 539–552 (2005)
47. Efron, B., Truax, D.: Large deviations theory in exponential families. Ann. Math. Stat. 39, 1402–1424 (1968)
48. Touchette, H.: Asymptotic equivalence of probability measures and stochastic processes. J. Stat. Phys. 170, 962–978 (2018a)
49. Cottrell, M., Fort, J.-C., Malgouyres, G.: Large deviations and rare events in the study of stochastic algorithms. IEEE Trans. Autom. Control 28, 907–920 (1983)
50. Freidlin, M.I., Wentzell, A.D.: Random Perturbations of Dynamical Systems, Grundlehren der Mathematischen Wissenschaften, vol. 260. Springer, New York (1984)
51. Graham, R.: Macroscopic potentials, bifurcations and noise in dissipative systems. In: Moss, F., McClintock, P.V.E. (eds.) Noise in Nonlinear Dynamical Systems, vol. 1, pp. 225–278. Cambridge University Press, Cambridge (1989)
52. Luchinsky, D.G., McClintock, P.V.E., Dykman, M.I.: Analogue studies of nonlinear systems. Rep. Prog. Phys. 61, 889–997 (1998)
53. Touchette, H.: Introduction to dynamical large deviations of Markov processes. Physica A 504, 5–19 (2018b)
54. Bertini, L., De Sole, A., Gabrielli, D., Jona-Lasinio, G., Landim, C.: Macroscopic fluctuation theory. Rev. Mod. Phys. 87, 593–636 (2015)
55. Touchette, H.: Equivalence and nonequivalence of ensembles: thermodynamic, macrostate, and measure levels. J. Stat. Phys. 159, 987–1016 (2015)
56. Rubinstein, R.Y., Kroese, D.P.: The Cross-Entropy Method. Springer, New York (2004)
57. Engel, A., Monasson, R., Hartmann, A.K.: On large deviation properties of Erdös-Rényi random graphs. J. Stat. Phys. 117, 387–426 (2004)
58. Hartmann, A.K.: Large-deviation properties of largest component for random graphs. Eur. Phys. J. B 84, 627–634 (2011)
59. Dewenter, T., Hartmann, A.K.: Large-deviation properties of resilience of power grids. New J. Phys. 17, 015005 (2015)
60. Guasoni, P., Robertson, S.: Optimal importance sampling with explicit formulas in continuous time. Financ. Stoch. 12, 1–19 (2008)
61. Vanden-Eijnden, E., Weare, J.: Rare event simulation of small noise diffusions. Commun. Pure Appl. Math. 65, 1770–1803 (2012)
62. Kundu, A., Sabhapandit, S., Dhar, A.: Application of importance sampling to the computation of large deviations in nonequilibrium processes. Phys. Rev. E 83, 031119 (2011)
63. Klymko, K., Geissler, P.L., Garrahan, J.P., Whitelam, S.: Rare behavior of growth processes via umbrella sampling of trajectories. Phys. Rev. E 97, 032123 (2018)
64. Whitelam, S.: Sampling rare fluctuations of discrete-time Markov chains. Phys. Rev. E 97, 032122 (2018)
65. Jacobson, D., Whitelam, S.: Direct evaluation of dynamical large-deviation rate functions using a variational ansatz. Phys. Rev. E 100, 052139 (2019)
66. Glasserman, P., Wang, Y.: Counterexamples in importance sampling for large deviations probabilities. Ann. Appl. Prob. 7, 731–746 (1997)
67. Puhalskii, A., Spokoiny, V.: On large-deviation efficiency in statistical inference. Bernoulli 4, 203–272 (1998)
68. Ellis, R.S., Haven, K., Turkington, B.: Large deviation principles and complete equivalence and nonequivalence results for pure and mixed ensembles. J. Stat. Phys. 101, 999–1064 (2000)
69. Varadhan, S.R.S.: Asymptotic probabilities and differential equations. Commun. Pure Appl. Math. 19, 261–286 (1966)
70. Touchette, H.: A basic introduction to large deviations: theory, applications, simulations. In: Leidl, R., Hartmann, A.K. (eds.) Modern Computational Science 11: Lecture Notes from the 3rd International Oldenburg Summer School. BIS-Verlag der Carl von Ossietzky Universität Oldenburg, Oldenburg (2011)
71. Chetrite, R., Touchette, H.: Nonequilibrium Markov processes conditioned on large deviations. Ann. Henri Poincaré 16, 2005–2057 (2015a)
72. Harris, R.J., Touchette, H.: Current fluctuations in stochastic systems with long-range memory. J. Phys. A 42, 342001 (2009)
73. Küchler, U., Sørensen, M.: On exponential families of Markov processes. J. Stat. Plan. Inference 66, 3–19 (1998)
74. Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Springer, New York (1979)
75. Chetrite, R., Touchette, H.: Variational and optimal control representations of conditioned and driven processes. J. Stat. Mech. 2015, P12001 (2015b)
76. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)
77. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis, vol. 317, p. 1988. Springer, New York (1988)
78. Borwein, J., Lewis, A.: Convex Analysis and Nonlinear Optimization, 2nd edn. Springer, New York (2006)

Acknowledgements

A.G. thanks Maxime Sangnier for fruitful discussions during the writing of this paper. We also thank Grégoire Ferré and Gabriel Stoltz for carefully reading the paper. H.T. is supported by Stellenbosch University (Establishment Funds) and the National Research Foundation of South Africa (Grant No. 96199).

Author information

Authors and Affiliations

Laboratoire de Probabilités, Statistique et Modélisation, Sorbonne Université, Paris, France
Arnaud Guyader
CERMICS, École des Ponts ParisTech, Champs-sur-Marne, France
Arnaud Guyader
Department of Mathematical Sciences, Stellenbosch University, Stellenbosch, South Africa
Hugo Touchette

Corresponding author

Correspondence to Hugo Touchette.

Additional information

Communicated by Abhishek Dhar.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Convex Analysis

We collect in this section basic results of convex analysis used in the paper in relation to the rate function $I_Q^B(w)$, defined in (50), and its Legendre–Fenchel transform $\lambda _Q^B(k)$, defined in (57). Both are functions of a single real variable, so we state the necessary results only for this simple case. We assume further that all convex functions are proper closed convex functions. For more general results and proofs, we refer to [76,77,78].

1.1 Subdifferentials

Let $f:{\mathbb {R}}\rightarrow {\bar{{\mathbb {R}}}}$ be a real function taking values in the set of extended reals ${\bar{{\mathbb {R}}}}$. The subdifferential $\partial f(x)$ of f at the point x is the set of all values $k\in {\mathbb {R}}$ such that

$$\begin{aligned} f(y)\ge f(x) +k(y-x) \end{aligned}$$

(A.1)

for all $y\in {\mathbb {R}}$ [76, Sect. 23]. Put differently, and as illustrated in Fig. 7a, $\partial f(x)$ is the set of slopes of all possible supporting lines of $f$ at $x$. If $f$ has no supporting line at $x$, then $\partial f(x)=\emptyset $. We will see next that this may happen when $f$ is nonconvex.

For convex functions, subdifferentials exist everywhere in the domain of f(x), except possibly at boundary points [76, Theorem 23.4]. For this class of functions, we have in fact $\partial f(x) = [f'(x^-),f'(x^+)]$, where $f'(x^-)$ is the left-derivative and $f'(x^+)$ the right-derivative [76, Theorem 24.3]. If these are equal, f is differentiable at x so that $\partial f(x) = \{f'(x)\}$ [76, Theorem 25.1]. In all cases, $\partial f(x)$ is a closed convex interval [76, p. 215].
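As a rough numerical illustration of definition (A.1), the following Python sketch (ours, not part of the original analysis) approximates $\partial f(x)$ by testing the supporting-line inequality on a finite grid. The function name `subdifferential`, the grid on $[-3,3]$ and the two test functions are assumptions chosen for illustration. Because only finitely many points $y$ are tested, the returned interval is an outer approximation that can only shrink as the grid is refined.

```python
import math

def subdifferential(f, x, grid):
    """Approximate the subdifferential of f at x using (A.1) on a finite grid.

    Returns (k_min, k_max), the interval of slopes k such that
    f(y) >= f(x) + k*(y - x) for every grid point y, or None if it is empty.
    """
    lo, hi = -math.inf, math.inf
    fx = f(x)
    for y in grid:
        if y > x:
            # For y > x the inequality requires k <= (f(y) - f(x)) / (y - x).
            hi = min(hi, (f(y) - fx) / (y - x))
        elif y < x:
            # For y < x dividing by (y - x) < 0 flips the inequality.
            lo = max(lo, (f(y) - fx) / (y - x))
    return (lo, hi) if lo <= hi else None

# Hypothetical test setup: grid on [-3, 3] and a nonconvex double-well function.
grid = [i / 100 for i in range(-300, 301)]
double_well = lambda y: (y * y - 1.0) ** 2

print(subdifferential(abs, 0.0, grid))          # (-1.0, 1.0): corner of |x| at 0
print(subdifferential(double_well, 0.0, grid))  # None: no supporting line at 0
print(subdifferential(double_well, 1.0, grid))  # small interval containing 0
```

The first call recovers $[f'(0^-),f'(0^+)]=[-1,1]$ for $f(x)=|x|$, while the second shows an empty subdifferential at a nonconvex point of the double well.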

1.2 Legendre–Fenchel Transforms

The Legendre–Fenchel transform of f is the real function defined by

$$\begin{aligned} f^*(k) = \sup _{x\in {\mathbb {R}}} \{kx-f(x)\},\qquad k\in {\mathbb {R}}. \end{aligned}$$

(A.2)

This function is also called the dual or conjugate of $f$ and has the property of being convex [76, Theorem 12.2]. The double dual or biconjugate of $f$ is the Legendre–Fenchel transform of $f^*$:

$$\begin{aligned} f^{**}(x) = \sup _{k\in {\mathbb {R}}} \{kx-f^*(k)\}. \end{aligned}$$

(A.3)

This is also a convex function, corresponding to the convex envelope or convex hull of f [77, Theorem 11.1], as illustrated in Fig. 7b.

With this geometric interpretation of $f^{**}$, it is natural to say that x is a convex point of f if $f(x)=f^{**}(x)$ and a nonconvex point of f if $f(x)\ne f^{**}(x)$. An important result proved in [68, Lem. 4.1] is that the set of convex points coincides with the set of points admitting supporting lines, except possibly at boundary points. With this proviso, we then have $f(x)=f^{**}(x)$ if and only if $\partial f(x)\ne \emptyset $. This is illustrated in Fig. 7a. The same result also implies that, if $f(x)=f^{**}(x)$, then $\partial f(x)=\partial f^{**}(x)$.

In this paper, we deal with rate functions, which always have at least one global minimum. Denoting one such minimizer by $x^*$, we then have $0\in \partial f(x^*)$. Hence, $x^*$ is a convex point such that $f(x^*)=f^{**}(x^*)$ and $\partial f(x^*)=\partial f^{**}(x^*)$.
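The distinction between convex and nonconvex points can be checked numerically. The sketch below (an illustration under our own assumptions, not the authors' code) evaluates (A.2) and (A.3) by brute force on grids for the double-well function $f(x)=(x^2-1)^2$. The grids `xs` and `ks` and the helper name `legendre_fenchel` are hypothetical choices, and the suprema are restricted to these grids.

```python
def legendre_fenchel(values, points, slopes):
    """Discrete Legendre-Fenchel transform.

    values[i] is the function value at points[i]; the result lists
    max_i (s * points[i] - values[i]) for each s in slopes.
    """
    return [max(s * p - v for p, v in zip(points, values)) for s in slopes]

# Assumed grids: x in [-2, 2] and slopes k in [-30, 30], wide enough to
# contain the derivative of f on the x-grid (|f'(2)| = 24).
xs = [i / 100 for i in range(-200, 201)]
ks = [j / 10 for j in range(-300, 301)]

f = [(x * x - 1.0) ** 2 for x in xs]        # nonconvex double well
f_star = legendre_fenchel(f, xs, ks)         # f^*(k), as in (A.2)
f_bistar = legendre_fenchel(f_star, ks, xs)  # f^{**}(x), as in (A.3)

for x in (0.0, 1.0, 1.5):
    i = xs.index(x)
    print(x, f[i], f_bistar[i])
```

At $x=0$ the biconjugate vanishes although $f(0)=1$, so $x=0$ is a nonconvex point. At the minimizer $x^*=1$ and at $x=1.5$ the two functions agree up to rounding, as expected for convex points.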

1.3 Duality

The proof of our main result, Theorem 4, is based on another important result about convex functions stating (see [76, Cor. 23.5.1] or [77, Prop. 11.3]) that

$$\begin{aligned} k\in \partial f(x)\iff x\in \partial f^*(k). \end{aligned}$$

(A.4)

This property expresses a form of duality or conjugacy between the slopes of f and the slopes of $f^*$, illustrated in Fig. 8a. From this result, it is easy to see that convex, affine parts of f correspond to cusps of $f^*$, and vice versa, as shown in Fig. 8b.
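As a simple worked example of (A.4), chosen here for illustration, take $f(x)=|x|$, which has a cusp at $x=0$ with $\partial f(0)=[-1,1]$. Its Legendre–Fenchel transform is

$$\begin{aligned} f^*(k)=\sup _{x\in {\mathbb {R}}}\{kx-|x|\}={\left\{ \begin{array}{ll} 0 &{} |k|\le 1\\ \infty &{} |k|>1, \end{array}\right. } \end{aligned}$$

so $f^*$ is affine (in fact constant) on $[-1,1]$ and $0\in \partial f^*(k)$ for every $k\in [-1,1]$, in agreement with $k\in \partial f(0)$. Conversely, $\partial f(x)=\{1\}$ for $x>0$ and $1\in \partial f(0)$, and correspondingly $\partial f^*(1)=[0,\infty )$ contains every $x\ge 0$.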

The duality in (A.4) also holds for $f^{**}$, since this function is convex and is the Legendre–Fenchel transform of $f^*$. Therefore,

$$\begin{aligned} k\in \partial f^{**}(x)\iff x\in \partial f^*(k). \end{aligned}$$

(A.5)

This result implies that $f^*$ has a cusp also when $f$ is nonconvex, as shown in Fig. 8, since $f^{**}$ is affine where $f$ is nonconvex. Thus, $f^*$ has a cusp if $f$ either has an affine part or is nonconvex.

Since subdifferentials of f and $f^{**}$ match at convex points, it is also clear from (A.5) that the first duality (A.4) holds locally at these points even if f is not globally convex. We use this result in this paper when dealing with the subdifferential of $I_Q^B$ at its global minimum $w^*$, which is a convex point, as mentioned. In this case, the first duality result can be applied at that point even though $I_Q^B$ might be nonconvex at other points, as in Figs. 2c or 6.

Appendix B: Contraction Principle

The contraction principle is an important result in large deviation theory relating the rate functions of random variables that can be mapped to one another. Let $(A_n)_{n>0}$ be a sequence of random variables satisfying the LDP with good rate function $I_A$ and let $(B_n)_{n>0}$ be another sequence such that $B_n=f(A_n)$ with f continuous. Then $(B_n)_{n>0}$ also satisfies the LDP with good rate function

$$\begin{aligned} I_B(b)=\inf _{a:f(a)=b} I_A(a). \end{aligned}$$

(A.6)

See [6, Theorem 4.2.1] for details.
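To make (A.6) concrete, the following Python sketch (ours, under stated assumptions) approximates the infimum over the preimage of $b$ by a grid search. We assume, purely for illustration, that $A_n$ is the sample mean of $n$ standard Gaussian random variables, so that $I_A(a)=a^2/2$, and that the contraction is $f(a)=a^2$, for which $I_B(b)=b/2$ if $b\ge 0$ and $I_B(b)=\infty $ otherwise. The function name `contract`, the grid and the tolerance `tol` are hypothetical.

```python
import math

def contract(rate_a, f, b, grid, tol):
    """Grid approximation of I_B(b) = inf { I_A(a) : f(a) = b }, as in (A.6).

    The constraint f(a) = b is relaxed to |f(a) - b| <= tol on the grid;
    an empty preimage gives an infinite rate, as for an infimum over the empty set.
    """
    candidates = [rate_a(a) for a in grid if abs(f(a) - b) <= tol]
    return min(candidates) if candidates else math.inf

# Assumed example: Gaussian sample mean with I_A(a) = a^2 / 2 and f(a) = a^2.
rate_a = lambda a: 0.5 * a * a
square = lambda a: a * a
grid = [i * 1e-3 - 5.0 for i in range(10001)]  # a in [-5, 5]

for b in (0.0, 1.0, 4.0, -1.0):
    print(b, contract(rate_a, square, b, grid, tol=1e-2))
```

The printed values are close to $b/2$ for $b\ge 0$, up to an error controlled by the tolerance, and infinite for $b=-1$, where the preimage is empty.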

Instead of considering a single continuous function f as the contraction, one can also consider a sequence $(f_n)_{n>0}$ of continuous functions. In this case, the contraction principle also applies provided that $f_n$ is “close enough” to f with respect to $P_n$. To be more precise, let ${\mathcal {A}}$ denote the space of $A_n$ and define

$$\begin{aligned} \Gamma _{n,\delta }=\{a\in {\mathcal {A}}: \Vert f_n(a)-f(a)\Vert >\delta \} \end{aligned}$$

(A.7)

as the set of points for which $f_n$ differs from $f$ by more than $\delta >0$ with respect to any metric $\Vert \cdot \Vert $ on ${\mathcal {B}}$, the space of $B_n$. Then, according to [6, Cor. 4.2.21], $B_n=f_n(A_n)$ satisfies the LDP with good rate function $I_B$ given by (A.6) with $f$ as the contraction if, for all $\delta >0$,

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{n}\log P_n(\Gamma _{n,\delta })=-\infty . \end{aligned}$$

(A.8)

This condition means that the probability that $f_n$ differs from $f$ by more than $\delta $ decays faster than exponentially with $n$ in the large deviation limit. It is met in most cases when $f_n$ is smooth and $I_A$ is a good rate function.

Two particular applications of this result are considered in the paper.

Example 4

Consider two real random variables $A_n$ and $B_n$ related by the simple rescaling $B_n=c_n A_n$ with $c_n\rightarrow 1$ as $n\rightarrow \infty $. Here, the limit function is the identity $f(a)=a$, so one expects $A_n$ and $B_n$ to have the same rate function. This is verified by noting that, for every $M>0$, there exists $n_0=n_0(M,\delta )$ such that for all $n\ge n_0$, one has $\Gamma _{n,\delta }\subseteq (-\infty ,-M]\cup [M,\infty )$. Therefore, from the definition of the LDP, we obtain

$$\begin{aligned} \limsup _{n\rightarrow \infty } \frac{1}{n}\log P_n(\Gamma _{n,\delta })\le -\inf _{|a|\ge M} I_A(a). \end{aligned}$$

(A.9)

But, since the rate function $I_A$ of $A_n$ is good, it is coercive, so that

$$\begin{aligned} \lim _{|a|\rightarrow \infty } I_A(a)=\infty . \end{aligned}$$

(A.10)

Therefore, the limit on the left-hand side of (A.9) must give $-\infty $, implying $I_B(b) = I_A(b)$ from the condition (A.8).
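The superexponential decay in Example 4 can be seen explicitly in a simple case. The sketch below (an illustration under our own assumptions) takes $A_n$ Gaussian with mean 0 and variance $1/n$, so that $I_A(a)=a^2/2$, and the hypothetical rescaling $c_n=1+1/n$. Since $|c_n a-a|>\delta $ is equivalent to $|a|>\delta /|c_n-1|$, the standard Gaussian tail bound $P_n(|A_n|>t)\le 2\mathrm {e}^{-nt^2/2}$ gives an upper bound on the quantity appearing in (A.8).

```python
import math

def log_prob_bound(n, delta):
    """Upper bound on (1/n) log P_n(Gamma_{n,delta}) for the rescaling example.

    Assumptions (ours): A_n ~ N(0, 1/n) and c_n = 1 + 1/n.
    """
    c_n = 1.0 + 1.0 / n
    # Gamma_{n,delta} = {a : |c_n*a - a| > delta} = {a : |a| > threshold}.
    threshold = delta / abs(c_n - 1.0)
    # Gaussian tail bound: P(|A_n| > t) <= 2 * exp(-n * t**2 / 2).
    return (math.log(2.0) - n * threshold ** 2 / 2.0) / n

for n in (10, 100, 1000):
    print(n, log_prob_bound(n, delta=0.1))
```

The bound behaves like $-\delta ^2 n^2/2$ and therefore diverges to $-\infty $, consistent with the condition (A.8).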

Example 5

Let $B_n =f(A_n)+c_n$ with $c_n\rightarrow c$. Then the rate function of $B_n$ is obtained from (A.6) with the contraction $B_n=f(A_n)+c$. This follows trivially because the distance between $f(a)+c_n$ and $f(a)+c$ is constant in a. Since $c_n\rightarrow c$, there must be an n beyond which $|c_n-c|<\delta $, leading to $P_n(\Gamma _{n,\delta })=0$, so the condition (A.8) is also satisfied.

These results also hold if $\Gamma _{n,\delta }$ is defined on a subset of ${\mathcal {A}}$, since any restriction or constraint on $A_n$ can be included in the definition of $f_n$. This arises, for example, when considering the contraction of $J_Q(m,w)$ to $I_Q^B(w)$, which involves the restriction $m\in B$.

About this article

Guyader, A., Touchette, H. Efficient Large Deviation Estimation Based on Importance Sampling. J Stat Phys 181, 551–586 (2020). https://doi.org/10.1007/s10955-020-02589-x

Received: 11 March 2020
Accepted: 12 June 2020
Published: 06 July 2020
Issue Date: October 2020
DOI: https://doi.org/10.1007/s10955-020-02589-x
