Efficient Large Deviation Estimation Based on Importance Sampling

Journal of Statistical Physics

Abstract

We present a complete framework for determining the asymptotic (or logarithmic) efficiency of estimators of large deviation probabilities and rate functions based on importance sampling. The framework relies on the idea that importance sampling in that context is fully characterized by the joint large deviations of two random variables: the observable defining the large deviation probability of interest and the likelihood factor (or Radon–Nikodym derivative) connecting the original process and the modified process used in importance sampling. We recover with this framework known results about the asymptotic efficiency of the exponential tilting and obtain new necessary and sufficient conditions for a general change of process to be asymptotically efficient. This allows us to construct new examples of efficient estimators for sample means of random variables that do not have the exponential tilting form. Other examples involving Markov chains and diffusions are presented to illustrate our results.

Notes

  1. We could identify the new copies with a different symbol, say \({\tilde{{\mathbf {X}}}}_n^{(i)}\), since they are generated from a different distribution and so represent a different random variable. Here, we keep \({\mathbf {X}}_n^{(i)}\) but always specify the distribution, \(P_n\) or \(Q_n\), used. The same applies to the observable.

  2. We use the same letter \(\lambda \) for the Legendre–Fenchel transform and for the SCGF in (23), since, as already mentioned, the Gärtner–Ellis theorem ensures that, under appropriate conditions, both functions coincide.

  3. A corner in \(I_P(m)\) or \(I_Q(m)\) signals physically a dynamical phase transition in the fluctuations of \(M_n\). Here, we assume, for simplicity, that no such phase transition occurs. Note that a corner in the function \(I_Q^B(w)\) is not related to a dynamical phase transition, since this function is obtained by conditioning. It can have a corner, as the example of the exponential tilting shows, regardless of whether \(I_P(m)\) or \(I_Q(m)\) is smooth.

References

  1. Shwartz, A., Weiss, A.: Large Deviations for Performance Analysis. Stochastic Modeling Series. Chapman and Hall, London (1995)

  2. Wales, D.: Energy Landscapes: Applications to Clusters, Biomolecules and Glasses. Cambridge University Press, Cambridge (2004)

  3. E, W., Ren, W., Vanden-Eijnden, E.: Minimum action method for the study of rare events. Commun. Pure Appl. Math. 57, 637–656 (2004)

  4. Lelièvre, T., Rousset, M., Stoltz, G. (eds.): Free Energy Computations: A Mathematical Perspective. Imperial College Press, London (2010)

  5. Ellis, R.S.: Entropy, Large Deviations, and Statistical Mechanics. Springer, New York (1985)

  6. Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications, 2nd edn. Springer, New York (1998)

  7. den Hollander, F.: Large Deviations, Fields Institute Monograph. AMS, Providence (2000)

  8. Touchette, H.: The large deviation approach to statistical mechanics. Phys. Rep. 478, 1–69 (2009)

  9. Garrahan, J.P., Jack, R.L., Lecomte, V., Pitard, E., van Duijvendijk, K., van Wijland, F.: Dynamical first-order phase transition in kinetically constrained models of glasses. Phys. Rev. Lett. 98, 195702 (2007)

  10. Garrahan, J.P., Lesanovsky, I.: Thermodynamics of quantum jump trajectories. Phys. Rev. Lett. 104, 160601 (2010)

  11. Espigares, C.P., Garrido, P.L., Hurtado, P.I.: Dynamical phase transition for current statistics in a simple driven diffusive system. Phys. Rev. E 87, 032115 (2013)

  12. Bunin, G., Kafri, Y., Podolsky, D.: Cusp singularities in boundary-driven diffusive systems. J. Stat. Phys. 152, 112–135 (2013)

  13. Tsobgni Nyawo, P., Touchette, H.: A minimal model of dynamical phase transition. Europhys. Lett. 116, 50009 (2016)

  14. Lazarescu, A.: Generic dynamical phase transition in one-dimensional bulk-driven lattice gases with exclusion. J. Phys. A 50, 254004 (2017)

  15. Gallavotti, G., Cohen, E.G.D.: Dynamical ensembles in nonequilibrium statistical mechanics. Phys. Rev. Lett. 74, 2694–2697 (1995)

  16. Kurchan, J.: Fluctuation theorem for stochastic dynamics. J. Phys. A 31, 3719–3729 (1998)

  17. Lebowitz, J.L., Spohn, H.: A Gallavotti-Cohen-type symmetry in the large deviation functional for stochastic dynamics. J. Stat. Phys. 95, 333–365 (1999)

  18. Harris, R.J., Schütz, G.M.: Fluctuation theorems for stochastic dynamics. J. Stat. Mech. 2007, P07020 (2007)

  19. Baiesi, M., Maes, C., Wynants, B.: Fluctuations and response of nonequilibrium states. Phys. Rev. Lett. 103, 010602 (2009)

  20. Derrida, B.: Non-equilibrium steady states: Fluctuations and large deviations of the density and of the current. J. Stat. Mech. 2007, P07023 (2007)

  21. Bertini, L., De Sole, A., Gabrielli, D., Jona-Lasinio, G., Landim, C.: Stochastic interacting particle systems out of equilibrium. J. Stat. Mech. 2007, P07014 (2007)

  22. Harris, R.J., Touchette, H.: Large deviation approach to nonequilibrium systems. In: Klages, R., Just, W., Jarzynski, C. (eds.) Nonequilibrium Statistical Physics of Small Systems: Fluctuation Relations and Beyond, Reviews of Nonlinear Dynamics and Complexity, vol. 6, pp. 335–360. Wiley-VCH, Weinheim (2013)

  23. Garrahan, J.P.: Aspects of non-equilibrium in classical and quantum systems: slow relaxation and glasses, dynamical large deviations, quantum non-ergodicity, and open quantum dynamics. Physica A 504, 130–154 (2018)

  24. Sekimoto, K.: Stochastic Energetics, Lect. Notes Phys., vol. 799. Springer, New York (2010)

  25. Seifert, U.: Stochastic thermodynamics, fluctuation theorems and molecular machines. Rep. Prog. Phys. 75, 126001 (2012)

  26. Seifert, U.: Stochastic thermodynamics: from principles to the cost of precision. Physica A 504, 176–191 (2018)

  27. Ciliberto, S.: Experiments in stochastic thermodynamics: short history and perspectives. Phys. Rev. X 7, 021051 (2017)

  28. Cérou, F., Guyader, A.: Adaptive multilevel splitting for rare event analysis. Stoch. Anal. Appl. 25, 417–443 (2007)

  29. Dean, T., Dupuis, P.: Splitting for rare event simulation: a large deviation approach to design and analysis. Stoch. Proc. Appl. 119, 562–587 (2009)

  30. Cérou, F., Guyader, A., Lelièvre, T., Pommier, D.: A multiple replica approach to simulate reactive trajectories. J. Chem. Phys. 134, 054108 (2011)

  31. Cérou, F., Delyon, B., Guyader, A., Rousset, M.: On the asymptotic normality of adaptive multilevel splitting. SIAM J. Uncertain. Quant. 7, 1–30 (2019)

  32. Cérou, F., Guyader, A., Rousset, M.: Adaptive multilevel splitting: historical perspective and recent results. Chaos 29, 043108 (2019)

  33. Bréhier, C.-E., Lelièvre, T.: On a new class of score functions to estimate tail probabilities of some stochastic processes with adaptive multilevel splitting. Chaos 29, 033126 (2019)

  34. Grassberger, P.: Go with the winners: a general Monte Carlo strategy. Comput. Phys. Commun. 147, 64–70 (2002)

  35. Giardina, C., Kurchan, J., Peliti, L.: Direct evaluation of large-deviation functions. Phys. Rev. Lett. 96, 120603 (2006)

  36. Lecomte, V., Tailleur, J.: A numerical approach to large deviations in continuous time. J. Stat. Mech. 2007, P03004 (2007)

  37. Angeli, L., Grosskinsky, S., Johansen, A.M., Pizzoferrato, A.: Rare event simulation for stochastic dynamics in continuous time. J. Stat. Phys. 176, 1185–1210 (2019)

  38. Torrie, G.M., Valleau, J.P.: Nonphysical sampling distributions in Monte Carlo free-energy estimation: umbrella sampling. J. Comput. Phys. 23, 187–199 (1977)

  39. Juneja, S., Shahabuddin, P.: Rare-event simulation techniques: an introduction and recent advances, Chap. 11, pp. 291–350. Elsevier, Amsterdam (2006)

  40. Asmussen, S., Glynn, P.W.: Stochastic Simulation: Algorithms and Analysis. Stochastic Modelling and Applied Probability. Springer, New York (2007)

  41. Bucklew, J.A.: Introduction to Rare Event Simulation. Springer, New York (2004)

  42. Sadowsky, J.S., Bucklew, J.A.: Large deviations theory techniques in Monte Carlo simulation. In: MacNair, E.A., Musselman, K.J., Heidelberger, P. (eds.) Proceedings of the 1989 Winter Simulation Conference, pp. 505–513. ACM, New York (1989)

  43. Sadowsky, J.S., Bucklew, J.A.: On large deviations theory and asymptotically efficient Monte Carlo estimation. IEEE Trans. Inf. Theory 36, 579–588 (1990)

  44. Bucklew, J.A., Ney, P., Sadowsky, J.S.: Monte Carlo simulation and large deviations theory for uniformly recurrent Markov chains. J. Appl. Prob. 27, 44–59 (1990)

  45. Schlebusch, H.-J.: On the asymptotic efficiency of importance sampling techniques. IEEE Trans. Inf. Theory 39, 710–715 (1993)

  46. Dieker, A.B., Mandjes, M.: On asymptotically efficient simulation of large deviation probabilities. Adv. Appl. Prob. 37, 539–552 (2005)

  47. Efron, B., Truax, D.: Large deviations theory in exponential families. Ann. Math. Stat. 39, 1402–1424 (1968)

  48. Touchette, H.: Asymptotic equivalence of probability measures and stochastic processes. J. Stat. Phys. 170, 962–978 (2018a)

  49. Cottrell, M., Fort, J.-C., Malgouyres, G.: Large deviations and rare events in the study of stochastic algorithms. IEEE Trans. Autom. Control 28, 907–920 (1983)

  50. Freidlin, M.I., Wentzell, A.D.: Random Perturbations of Dynamical Systems, Grundlehren der Mathematischen Wissenschaften, vol. 260. Springer, New York (1984)

  51. Graham, R.: Macroscopic potentials, bifurcations and noise in dissipative systems. In: Moss, F., McClintock, P.V.E. (eds.) Noise in Nonlinear Dynamical Systems, vol. 1, pp. 225–278. Cambridge University Press, Cambridge (1989)

  52. Luchinsky, D.G., McClintock, P.V.E., Dykman, M.I.: Analogue studies of nonlinear systems. Rep. Prog. Phys. 61, 889–997 (1998)

  53. Touchette, H.: Introduction to dynamical large deviations of Markov processes. Physica A 504, 5–19 (2018b)

  54. Bertini, L., De Sole, A., Gabrielli, D., Jona-Lasinio, G., Landim, C.: Macroscopic fluctuation theory. Rev. Mod. Phys. 87, 593–636 (2015)

  55. Touchette, H.: Equivalence and nonequivalence of ensembles: thermodynamic, macrostate, and measure levels. J. Stat. Phys. 159, 987–1016 (2015)

  56. Rubinstein, R.Y., Kroese, D.P.: The Cross-Entropy Method. Springer, New York (2004)

  57. Engel, A., Monasson, R., Hartmann, A.K.: On large deviation properties of Erdös-Rényi random graphs. J. Stat. Phys. 117, 387–426 (2004)

  58. Hartmann, A.K.: Large-deviation properties of largest component for random graphs. Eur. Phys. J. B 84, 627–634 (2011)

  59. Dewenter, T., Hartmann, A.K.: Large-deviation properties of resilience of power grids. New J. Phys. 17, 015005 (2015)

  60. Guasoni, P., Robertson, S.: Optimal importance sampling with explicit formulas in continuous time. Financ. Stoch. 12, 1–19 (2008)

  61. Vanden-Eijnden, E., Weare, J.: Rare event simulation of small noise diffusions. Commun. Pure Appl. Math. 65, 1770–1803 (2012)

  62. Kundu, A., Sabhapandit, S., Dhar, A.: Application of importance sampling to the computation of large deviations in nonequilibrium processes. Phys. Rev. E 83, 031119 (2011)

  63. Klymko, K., Geissler, P.L., Garrahan, J.P., Whitelam, S.: Rare behavior of growth processes via umbrella sampling of trajectories. Phys. Rev. E 97, 032123 (2018)

  64. Whitelam, S.: Sampling rare fluctuations of discrete-time Markov chains. Phys. Rev. E 97, 032122 (2018)

  65. Jacobson, D., Whitelam, S.: Direct evaluation of dynamical large-deviation rate functions using a variational ansatz. Phys. Rev. E 100, 052139 (2019)

  66. Glasserman, P., Wang, Y.: Counterexamples in importance sampling for large deviations probabilities. Ann. Appl. Prob. 7, 731–746 (1997)

  67. Puhalskii, A., Spokoiny, V.: On large-deviation efficiency in statistical inference. Bernoulli 4, 203–272 (1998)

  68. Ellis, R.S., Haven, K., Turkington, B.: Large deviation principles and complete equivalence and nonequivalence results for pure and mixed ensembles. J. Stat. Phys. 101, 999–1064 (2000)

  69. Varadhan, S.R.S.: Asymptotic probabilities and differential equations. Commun. Pure Appl. Math. 19, 261–286 (1966)

  70. Touchette, H.: A basic introduction to large deviations: theory, applications, simulations. In: Leidl, R., Hartmann, A.K. (eds.) Modern Computational Science 11: Lecture Notes from the 3rd International Oldenburg Summer School. BIS-Verlag der Carl von Ossietzky Universität Oldenburg, Oldenburg (2011)

  71. Chetrite, R., Touchette, H.: Nonequilibrium Markov processes conditioned on large deviations. Ann. Henri Poincaré 16, 2005–2057 (2015a)

  72. Harris, R.J., Touchette, H.: Current fluctuations in stochastic systems with long-range memory. J. Phys. A 42, 342001 (2009)

  73. Küchler, U., Sørensen, M.: On exponential families of Markov processes. J. Stat. Plan. Inference 66, 3–19 (1998)

  74. Stroock, D.W., Varadhan, S.R.S.: Multidimensional Diffusion Processes. Springer, New York (1979)

  75. Chetrite, R., Touchette, H.: Variational and optimal control representations of conditioned and driven processes. J. Stat. Mech. 2015, P12001 (2015b)

  76. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

  77. Rockafellar, R.T., Wets, R.J.-B.: Variational Analysis, Grundlehren der Mathematischen Wissenschaften, vol. 317. Springer, New York (1998)

  78. Borwein, J., Lewis, A.: Convex Analysis and Nonlinear Optimization, 2nd edn. Springer, New York (2006)

Acknowledgements

A.G. thanks Maxime Sangnier for fruitful discussions during the writing of this paper. We also thank Grégoire Ferré and Gabriel Stoltz for carefully reading the paper. H.T. is supported by Stellenbosch University (Establishment Funds) and the National Research Foundation of South Africa (Grant No. 96199).

Author information

Corresponding author

Correspondence to Hugo Touchette.

Additional information

Communicated by Abhishek Dhar.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Convex Analysis

We collect in this section basic results of convex analysis used in the paper in relation to the rate function \(I_Q^B(w)\), defined in (50), and its Legendre–Fenchel transform \(\lambda _Q^B(k)\), defined in (57). Both are functions of a single real variable, so we state the necessary results only for this simple case. We assume further that all convex functions are proper closed convex functions. For more general results and proofs, we refer to [76,77,78].

1.1 Subdifferentials

Let \(f:{\mathbb {R}}\rightarrow {\bar{{\mathbb {R}}}}\) be a real function taking values in the set of extended reals \({\bar{{\mathbb {R}}}}\). The subdifferential \(\partial f(x)\) of f at the point x is the set of all values \(k\in {\mathbb {R}}\) such that

$$\begin{aligned} f(y)\ge f(x) +k(y-x) \end{aligned}$$
(A.1)

for all \(y\in {\mathbb {R}}\) [76, Sect. 23]. Put differently, and as illustrated in Fig. 7a, \(\partial f(x)\) is the set of slopes of all possible supporting lines of f at x. If f has no supporting line at x, then \(\partial f(x)=\emptyset \). We will see next that this may happen when f is nonconvex.

For convex functions, subdifferentials exist everywhere in the domain of f(x), except possibly at boundary points [76, Theorem 23.4]. For this class of functions, we have in fact \(\partial f(x) = [f'(x^-),f'(x^+)]\), where \(f'(x^-)\) is the left-derivative and \(f'(x^+)\) the right-derivative [76, Theorem 24.3]. If these are equal, f is differentiable at x so that \(\partial f(x) = \{f'(x)\}\) [76, Theorem 25.1]. In all cases, \(\partial f(x)\) is a closed convex interval [76, p. 215].

Fig. 7
a Function f(x) with a unique supporting line at the point a, no supporting line at the point b, and many supporting lines at the point c, leading to \(\partial f(a)=\{f'(a)\}\), \(\partial f(b)=\emptyset \), and \(\partial f(c)=[f'(c^-),f'(c^+)]\). b Function f(x) and its convex envelope \(f^{**}(x)\)
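
As a rough numerical illustration of definition (A.1), which is not part of the original paper, the following Python sketch checks the supporting-line inequality on a finite grid. The helper names (`one_sided_slopes`, `is_subgradient`, `double_well`), the two test functions, and the grids are assumptions made for this example. A grid check can only suggest, not prove, that a slope belongs to the subdifferential.

```python
def one_sided_slopes(f, x, h=1e-6):
    """Finite-difference approximations of the left and right derivatives of f at x."""
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right


def is_subgradient(f, x, k, grid, tol=1e-12):
    """Check f(y) >= f(x) + k*(y - x), inequality (A.1), for every y on the grid."""
    return all(f(y) >= f(x) + k * (y - x) - tol for y in grid)


def double_well(x):
    """Hypothetical nonconvex example: (x^2 - 1)^2 has no supporting line at x = 0."""
    return (x * x - 1.0) ** 2


grid = [i / 100 for i in range(-300, 301)]    # y values in [-3, 3]
slopes = [i / 10 for i in range(-50, 51)]     # candidate slopes k in [-5, 5]

# Convex function with a corner: the subdifferential of |x| at 0 is [-1, 1].
left, right = one_sided_slopes(abs, 0.0)
inside = is_subgradient(abs, 0.0, 0.5, grid)
outside = is_subgradient(abs, 0.0, 1.5, grid)

# Nonconvex point: no candidate slope gives a supporting line at x = 0.
supporting = [k for k in slopes if is_subgradient(double_well, 0.0, k, grid)]

print(left, right, inside, outside, supporting)
```

The corner of |x| at 0 plays the role of the point c in Fig. 7a, while the local maximum of the double well plays the role of the point b.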

1.2 Legendre–Fenchel Transforms

The Legendre–Fenchel transform of f is the real function defined by

$$\begin{aligned} f^*(k) = \sup _{x\in {\mathbb {R}}} \{kx-f(x)\},\qquad k\in {\mathbb {R}}. \end{aligned}$$
(A.2)

This function is also called the dual or conjugate of f and has the property of being convex [76, Theorem 12.2]. The double dual or biconjugate of f is the Legendre–Fenchel transform of \(f^*\):

$$\begin{aligned} f^{**}(x) = \sup _{k\in {\mathbb {R}}} \{kx-f^*(k)\}. \end{aligned}$$
(A.3)

This is also a convex function, corresponding to the convex envelope or convex hull of f [77, Theorem 11.1], as illustrated in Fig. 7b.

With this geometric interpretation of \(f^{**}\), it is natural to say that x is a convex point of f if \(f(x)=f^{**}(x)\) and a nonconvex point of f if \(f(x)\ne f^{**}(x)\). An important result proved in [68, Lem. 4.1] is that the set of convex points coincides with the set of points admitting supporting lines, except possibly at boundary points. With this proviso, we then have \(f(x)=f^{**}(x)\) if and only if \(\partial f(x)\ne \emptyset \). This is illustrated in Fig. 7a. The same result also implies that, if \(f(x)=f^{**}(x)\), then \(\partial f(x)=\partial f^{**}(x)\).

In this paper, we deal with rate functions, which always have at least one global minimum. Denoting one such minimizer by \(x^*\), we then have \(0\in \partial f(x^*)\). Hence, \(x^*\) is a convex point such that \(f(x^*)=f^{**}(x^*)\) and \(\partial f(x^*)=\partial f^{**}(x^*)\).
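
The transforms (A.2) and (A.3) can be approximated on finite grids, which gives a simple way to locate convex and nonconvex points. The sketch below is only an illustration under stated assumptions: the double-well function, the grids, the tolerance, and the name `legendre_fenchel` are choices made here, not taken from the paper, and the suprema are replaced by maxima over the grids.

```python
def legendre_fenchel(values, points, slopes):
    """Discrete transform: for each slope k, the maximum of k*x - f(x) over the grid points."""
    return [max(k * x - fx for x, fx in zip(points, values)) for k in slopes]


xs = [i / 50 for i in range(-100, 101)]    # x grid on [-2, 2]
ks = [i / 10 for i in range(-300, 301)]    # slope grid on [-30, 30]

f = [(x * x - 1.0) ** 2 for x in xs]       # nonconvex double well (assumed example)
f_star = legendre_fenchel(f, xs, ks)       # approximates f*(k), Eq. (A.2)
f_bi = legendre_fenchel(f_star, ks, xs)    # approximates f**(x), Eq. (A.3)

# Convex points are those where f and its biconjugate agree (up to a tolerance).
convex_points = [x for x, fx, gx in zip(xs, f, f_bi) if abs(fx - gx) < 1e-9]

i0 = xs.index(0.0)
print(f[i0], f_bi[i0])                     # f and f** at the local maximum x = 0
print(1.0 in convex_points, 0.0 in convex_points)
```

In this example the two global minimizers \(x^*=\pm 1\) are convex points, in line with the remark above, whereas the biconjugate flattens the region between them.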

1.3 Duality

The proof of our main result, Theorem 4, is based on another important result about convex functions stating (see [76, Cor. 23.5.1] or [77, Prop. 11.3]) that

$$\begin{aligned} k\in \partial f(x)\iff x\in \partial f^*(k). \end{aligned}$$
(A.4)

This property expresses a form of duality or conjugacy between the slopes of f and the slopes of \(f^*\), illustrated in Fig. 8a. From this result, it is easy to see that convex, affine parts of f correspond to cusps of \(f^*\), and vice versa, as shown in Fig. 8b.

The duality in (A.4) also holds for \(f^{**}\), since this function is convex and is the Legendre–Fenchel transform of \(f^*\). Therefore,

$$\begin{aligned} k\in \partial f^{**}(x)\iff x\in \partial f^*(k). \end{aligned}$$
(A.5)

This result implies that \(f^*\) has a cusp also when f is nonconvex, as shown in Fig. 8b, since \(f^{**}\) is affine where f is nonconvex. Thus, \(f^*\) has a cusp if f either has an affine part or is nonconvex.

Since subdifferentials of f and \(f^{**}\) match at convex points, it is also clear from (A.5) that the first duality (A.4) holds locally at these points even if f is not globally convex. We use this result in this paper when dealing with the subdifferential of \(I_Q^B\) at its global minimum \(w^*\), which is a convex point, as mentioned. In this case, the first duality result can be applied at that point even though \(I_Q^B\) might be nonconvex at other points, as in Figs. 2c or 6.

Fig. 8
a Illustration of the duality between the slopes of f(x) and the slopes of its Legendre–Fenchel transform \(f^*(k)\). b Functions with affine or nonconvex parts give rise to a Legendre–Fenchel transform having a cusp
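
The duality (A.4) and the appearance of a cusp can be checked numerically on a simple case. The following sketch is an illustration only: it assumes the double-well function \((x^2-1)^2\), a finite grid in place of the supremum, and finite differences in place of derivatives; the names `conjugate` and `double_well` are hypothetical.

```python
def double_well(x):
    """Assumed nonconvex example with minima at x = -1 and x = +1."""
    return (x * x - 1.0) ** 2


def conjugate(f, k, points):
    """Grid approximation of the Legendre-Fenchel transform f*(k), Eq. (A.2)."""
    return max(k * x - f(x) for x in points)


xs = [i / 1000 for i in range(-2000, 2001)]   # x grid on [-2, 2]
h = 1e-3                                      # finite-difference step in k

# Cusp of f* at k = 0: the one-sided slopes are the two minimizers -1 and +1,
# because f** is affine (flat) on [-1, 1], where f is nonconvex.
f0 = conjugate(double_well, 0.0, xs)
left_slope = (f0 - conjugate(double_well, -h, xs)) / h
right_slope = (conjugate(double_well, h, xs) - f0) / h

# Duality at a convex point: k = f'(1.5) = 7.5 lies in the subdifferential of f
# at x = 1.5, so the slope of f* at k = 7.5 should be x = 1.5.
dual_slope = (conjugate(double_well, 7.5 + h, xs)
              - conjugate(double_well, 7.5 - h, xs)) / (2.0 * h)

print(left_slope, right_slope, dual_slope)
```

The jump of the slope of \(f^*\) from \(-1\) to \(+1\) at \(k=0\) corresponds to the interval \([-1,1]\) on which \(f^{**}\) is affine, as expected from (A.5).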

Appendix B: Contraction Principle

The contraction principle is an important result in large deviation theory relating the rate functions of random variables that can be mapped to one another. Let \((A_n)_{n>0}\) be a sequence of random variables satisfying the LDP with good rate function \(I_A\) and let \((B_n)_{n>0}\) be another sequence such that \(B_n=f(A_n)\) with f continuous. Then \((B_n)_{n>0}\) also satisfies the LDP with good rate function

$$\begin{aligned} I_B(b)=\inf _{a:f(a)=b} I_A(a). \end{aligned}$$
(A.6)

See [6, Theorem 4.2.1] for details.
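
As a minimal sketch of how (A.6) is used, consider an example that is assumed here and does not appear in the paper: \(A_n=(A_n^{(1)},A_n^{(2)})\) is a pair of independent sample means of standard Gaussian variables, with rate function \(I_A(a_1,a_2)=(a_1^2+a_2^2)/2\), and the contraction is \(f(a_1,a_2)=a_1+a_2\). The infimum in (A.6) then gives \(I_B(b)=b^2/4\), which the code below recovers by a grid search along the constraint \(a_1+a_2=b\). The function names and the grid are hypothetical.

```python
def rate_A(a1, a2):
    """Assumed rate function of two independent Gaussian sample means."""
    return 0.5 * (a1 * a1 + a2 * a2)


def contracted_rate(b, grid):
    """Grid version of (A.6): minimize I_A over the set {a1 + a2 = b}."""
    return min(rate_A(a1, b - a1) for a1 in grid)


grid = [i / 1000 for i in range(-5000, 5001)]   # a1 values in [-5, 5]

for b in (0.0, 1.0, 2.0):
    numeric = contracted_rate(b, grid)
    exact = b * b / 4.0                         # closed form for this example
    print(b, numeric, exact)
```

The minimizer is \(a_1=a_2=b/2\), the most likely way for the two sample means to add up to b.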

Instead of considering a single continuous function f as the contraction, one can also consider a sequence \((f_n)_{n>0}\) of continuous functions. In this case, the contraction principle also applies provided that \(f_n\) is “close enough” to f with respect to \(P_n\). To be more precise, let \({\mathcal {A}}\) denote the space of \(A_n\) and define

$$\begin{aligned} \Gamma _{n,\delta }=\{a\in {\mathcal {A}}: \Vert f_n(a)-f(a)\Vert >\delta \} \end{aligned}$$
(A.7)

as the set of points for which \(f_n\) differs from f by more than \(\delta >0\) with respect to any metric \(\Vert \cdot \Vert \) on \({\mathcal {B}}\), the space of \(B_n\). Then, according to [6, Cor. 4.2.21], \(B_n=f_n(A_n)\) satisfies the LDP with good rate function \(I_B\) given by (A.6) with f as the contraction if, for all \(\delta >0\),

$$\begin{aligned} \lim _{n\rightarrow \infty } \frac{1}{n}\log P_n(\Gamma _{n,\delta })=-\infty . \end{aligned}$$
(A.8)

This condition only means that the probability that \(f_n\) differs from f by more than \(\delta \) decreases faster than exponentially with n in the large deviation limit. This is met in most cases when \(f_n\) is smooth and \(I_A\) is a good rate function.

Two particular applications of this result are considered in the paper.

Example 4

Consider two real random variables \(A_n\) and \(B_n\) related by the simple rescaling \(B_n=c_n A_n\) with \(c_n\rightarrow 1\) as \(n\rightarrow \infty \). Here, the limit function is the identity \(f(a)=a\), so one expects \(A_n\) and \(B_n\) to have the same rate function. This is verified by noting that, for every \(M>0\), there exists \(n_0=n_0(M,\delta )\) such that for all \(n\ge n_0\), one has \(\Gamma _{n,\delta }\subseteq (-\infty ,-M]\cup [M,\infty )\). Therefore, from the definition of the LDP, we obtain

$$\begin{aligned} \limsup _{n\rightarrow \infty } \frac{1}{n}\log P_n(\Gamma _{n,\delta })\le -\inf _{|a|\ge M} I_A(a). \end{aligned}$$
(A.9)

But, since the rate function \(I_A\) of \(A_n\) is good, it is coercive, so that

$$\begin{aligned} \lim _{|a|\rightarrow \infty } I_A(a)=\infty . \end{aligned}$$
(A.10)

Since M is arbitrary, the limit superior on the left-hand side of (A.9) must be \(-\infty \), so that the condition (A.8) is satisfied and \(I_B(b) = I_A(b)\).
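
The argument of Example 4 can be made concrete under additional assumptions that are not in the paper: \(A_n\) is Gaussian with mean 0 and variance 1/n, so that \(I_A(a)=a^2/2\), and \(c_n=1+1/n\). In that case \(\Gamma _{n,\delta }=\{a:|a|>\delta /|c_n-1|\}\), and the standard Gaussian tail bound \(P(|A_n|>M)\le 2\mathrm{e}^{-nM^2/2}\) gives an explicit upper bound for the quantity in (A.8). The sketch below evaluates this bound and the finite-n estimate of the rate function of \(B_n\); all function names are hypothetical.

```python
import math


def gamma_threshold(n, delta):
    """Gamma_{n,delta} = {a : |c_n*a - a| > delta} = {|a| > delta/|c_n - 1|}, with c_n = 1 + 1/n."""
    c_n = 1.0 + 1.0 / n
    return delta / abs(c_n - 1.0)


def gamma_log_bound(n, delta):
    """Upper bound on (1/n) log P_n(Gamma_{n,delta}) from P(|A_n| > M) <= 2 exp(-n M^2 / 2)."""
    M = gamma_threshold(n, delta)
    return math.log(2.0) / n - 0.5 * M * M


def rate_estimate(n, b):
    """-(1/n) log of the density of B_n = c_n*A_n at b, where B_n ~ N(0, c_n^2/n)."""
    c_n = 1.0 + 1.0 / n
    var = c_n * c_n / n
    log_density = -0.5 * math.log(2.0 * math.pi * var) - b * b / (2.0 * var)
    return -log_density / n


bounds = [gamma_log_bound(n, 0.1) for n in (10, 100, 1000)]
print(bounds)                        # decreases without bound, consistent with (A.8)
print(rate_estimate(10**6, 1.0))     # approaches I_A(1) = 1/2 for large n
```

The threshold \(\delta /|c_n-1|\) grows linearly with n here, which is the role played by the arbitrary M in (A.9).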

Example 5

Let \(B_n =f(A_n)+c_n\) with \(c_n\rightarrow c\). Then the rate function of \(B_n\) is obtained from (A.6) with the contraction \(B_n=f(A_n)+c\). This follows trivially because the distance between \(f(a)+c_n\) and \(f(a)+c\) equals \(|c_n-c|\) for every a. Since \(c_n\rightarrow c\), there must be an n beyond which \(|c_n-c|<\delta \), so that \(\Gamma _{n,\delta }\) is empty and \(P_n(\Gamma _{n,\delta })=0\). The condition (A.8) is therefore also satisfied.

These results also hold if \(\Gamma _{n,\delta }\) is defined on a subset of \({\mathcal {A}}\), since any restriction or constraint on \(A_n\) can be included in the definition of \(f_n\). This arises, for example, when considering the contraction of \(J_Q(m,w)\) to \(I_Q^B(w)\), which involves the restriction \(m\in B\).

Cite this article

Guyader, A., Touchette, H. Efficient Large Deviation Estimation Based on Importance Sampling. J Stat Phys 181, 551–586 (2020). https://doi.org/10.1007/s10955-020-02589-x
