On Bayesian Posterior Mean Estimators in Imaging Sciences and Hamilton–Jacobi Partial Differential Equations

Published in: Journal of Mathematical Imaging and Vision

Abstract

Variational and Bayesian methods are two widely used sets of approaches to solve image denoising problems. In a Bayesian setting, these approaches correspond, respectively, to using maximum a posteriori estimators and posterior mean estimators for reconstructing images. In this paper, we propose novel theoretical connections between Hamilton–Jacobi partial differential equations (HJ PDEs) and a broad class of posterior mean estimators with quadratic data fidelity term and log-concave prior. Whereas solutions to some first-order HJ PDEs with initial data describe maximum a posteriori estimators, here we show that solutions to some viscous HJ PDEs with initial data describe a broad class of posterior mean estimators. We use these connections to establish representation formulas and various properties of posterior mean estimators. In particular, we use these connections to show that some Bayesian posterior mean estimators can be expressed as proximal mappings of smooth functions and derive representation formulas for these functions.

References

  1. Alberti, G., Ambrosio, L., Cannarsa, P.: On the singularities of convex functions. Manuscripta Math. 76(3–4), 421–435 (1992)

  2. Aubert, G., Kornprobst, P.: Mathematical Problems in Image Processing: Partial Differential Equations and the Calculus of Variations, vol. 147. Springer Science & Business Media, Berlin (2006)

  3. Aubin, J.P., Cellina, A.: Differential Inclusions: Set-Valued Maps and Viability Theory, vol. 264. Springer Science & Business Media, Berlin (2012)

  4. Banerjee, A., Guo, X., Wang, H.: On the optimality of conditional expectation as a Bregman predictor. IEEE Trans. Inform. Theory 51(7), 2664–2669 (2005)

  5. Boncelet, C.: Image noise models. In: Bovik, A. (ed.) The Essential Guide to Image Processing, Chapter 7, pp. 143–167. Academic Press, Cambridge (2009)

  6. Bouman, C., Sauer, K.: A generalized Gaussian image model for edge-preserving MAP estimation. IEEE Trans. Image Process. 2(3), 296–310 (1993)

  7. Boyat, A.K., Joshi, B.K.: A review paper: noise models in digital image processing. Signal Image Process. Int. J. (SIPIJ) 6(2), 63–75 (2015)

  8. Burger, M., Dong, Y., Sciacchitano, F.: Bregman cost for non-Gaussian noise (2016). arXiv preprint arXiv:1608.07483

  9. Burger, M., Lucka, F.: Maximum a posteriori estimates in linear inverse problems with log-concave priors are proper Bayes estimators. Inverse Probl. 30(11), 114004 (2014)

  10. Candès, E., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inform. Theory 52(2), 489–509 (2006)

  11. Chambolle, A., Darbon, J.: On total variation minimization and surface evolution using parametric maximum flows. Int. J. Comput. Vis. 84(3), 288 (2009)

  12. Chambolle, A., Lions, P.L.: Image recovery via total variation minimization and related problems. Numer. Math. 76(2), 167–188 (1997)

  13. Chambolle, A., Pock, T.: An introduction to continuous optimization for imaging. Acta Numer. 25, 161–319 (2016)

  14. Chan, T., Marquina, A., Mulet, P.: High-order total variation-based image restoration. SIAM J. Sci. Comput. 22(2), 503–516 (2000)

  15. Chaudhari, P., Choromanska, A., Soatto, S., LeCun, Y., Baldassi, C., Borgs, C., Chayes, J., Sagun, L., Zecchina, R.: Entropy-SGD: biasing gradient descent into wide valleys. J. Stat. Mech. Theory Exp. 2019(12), 124018 (2019)

  16. Chaudhari, P., Oberman, A., Osher, S., Soatto, S., Carlier, G.: Deep relaxation: partial differential equations for optimizing deep neural networks. Res. Math. Sci. 5(3), 30 (2018)

  17. Darbon, J.: On convex finite-dimensional variational methods in imaging sciences and Hamilton–Jacobi equations. SIAM J. Imaging Sci. 8(4), 2268–2293 (2015)

  18. Darbon, J., Meng, T.: On decomposition models in imaging sciences and multi-time Hamilton–Jacobi partial differential equations. SIAM J. Imaging Sci. 13(2), 971–1014 (2020)

  19. Darbon, J., Sigelle, M.: Image restoration with discrete constrained total variation part I: fast and exact optimization. J. Math. Imaging Vis. 26(3), 261–276 (2006)

  20. Daubechies, I., Defrise, M., De Mol, C.: An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math. 57(11), 1413–1457 (2004)

  21. Demoment, G.: Image reconstruction and restoration: overview of common estimation structures and problems. IEEE Trans. Acoust. Speech Signal Process. 37(12), 2024–2036 (1989)

  22. Deuschel, J.D., Stroock, D.: Large Deviations, Pure and Applied Mathematics, vol. 137. Academic Press, Cambridge (1989)

  23. Dobson, D.C., Santosa, F.: Recovery of blocky images from noisy and blurred data. SIAM J. Appl. Math. 56(4), 1181–1198 (1996)

  24. Donoho, D.: Compressed sensing. IEEE Trans. Inform. Theory 52(4), 1289–1306 (2006)

  25. Durand, S., Malgouyres, F., Rougé, B.: Image deblurring, spectrum interpolation and application to satellite imaging. ESAIM Control Optim. Calc. Var. 5, 445–475 (2000)

  26. Evans, L.: Partial Differential Equations. American Mathematical Society, Providence, RI (2010)

  27. Falconer, K.J.: The Geometry of Fractal Sets. Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge (1985). https://doi.org/10.1017/CBO9780511623738

  28. Federer, H.: Geometric Measure Theory. Springer, Berlin (1969)

  29. Figueiredo, M.A., Nowak, R.D.: Wavelet-based image estimation: an empirical Bayes approach using Jeffrey's noninformative prior. IEEE Trans. Image Process. 10(9), 1322–1331 (2001)

  30. Folland, G.B.: Real Analysis: Modern Techniques and Their Applications. John Wiley & Sons, Hoboken (2013)

  31. García Trillos, N., Kaplan, Z., Sanz-Alonso, D.: Variational characterizations of local entropy and heat regularization in deep learning. Entropy 21(5), 511 (2019)

  32. Giusti, E.: Minimal Surfaces and Functions of Bounded Variation, Monographs in Mathematics, vol. 80. Birkhäuser Verlag, Basel (1984)

  33. Gradshtein, I., Ryzhik, I., Jeffrey, A., Zwillinger, D.: Table of Integrals, Series and Products. Academic Press, Cambridge (2007)

  34. Gribonval, R.: Should penalized least squares regression be interpreted as maximum a posteriori estimation? IEEE Trans. Signal Process. 59(5), 2405–2410 (2011)

  35. Gribonval, R., Machart, P.: Reconciling "priors" & "priors" without prejudice? In: Advances in Neural Information Processing Systems, pp. 2193–2201 (2013)

  36. Gribonval, R., Nikolova, M.: A characterization of proximity operators. J. Math. Imaging Vis. 62, 773–789 (2020). https://doi.org/10.1007/s10851-020-00951-y

  37. Gribonval, R., Nikolova, M.: On Bayesian estimation and proximity operators. Appl. Comput. Harmon. Anal. 50, 49–72 (2021). https://doi.org/10.1016/j.acha.2019.07.002

  38. Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms I: Fundamentals, Grundlehren Text Editions, vol. 305. Springer Science & Business Media, Berlin (1993)

  39. Hiriart-Urruty, J.B., Lemaréchal, C.: Convex Analysis and Minimization Algorithms II: Advanced Theory and Bundle Methods, Grundlehren Text Editions, vol. 306. Springer Science & Business Media, Berlin (1993)

  40. Hiriart-Urruty, J.B., Plazanet, P.H.: Moreau's decomposition theorem revisited. In: Annales de l'Institut Henri Poincaré (C) Non Linear Analysis, vol. 6, pp. 325–338. Elsevier (1989)

  41. Hochbaum, D.S.: An efficient algorithm for image segmentation, Markov random fields and related problems. J. ACM 48(4), 686–701 (2001)

  42. Hörmander, L.: The Analysis of Linear Partial Differential Operators I: Distribution Theory and Fourier Analysis. Classics in Mathematics. Springer-Verlag, Berlin (2003). Reprint of the second (1990) edition. https://doi.org/10.1007/978-3-642-61497-2

  43. Kay, S.M.: Fundamentals of Statistical Signal Processing. Prentice Hall PTR, Hoboken (1993)

  44. Keener, R.W.: Theoretical Statistics: Topics for a Core Course. Springer, Berlin (2011)

  45. Lang, R.: A note on the measurability of convex sets. Arch. Math. (Basel) 47(1), 90–92 (1986)

  46. Leindler, L.: On a certain converse of Hölder's inequality. In: Linear Operators and Approximation / Lineare Operatoren und Approximation, pp. 182–184. Springer (1972)

  47. Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16(6), 964–979 (1979)

  48. Louchet, C.: Modèles variationnels et bayésiens pour le débruitage d'images: de la variation totale vers les moyennes non-locales. Ph.D. thesis, Université René Descartes-Paris V (2008)

  49. Louchet, C., Moisan, L.: Posterior expectation of the total variation model: properties and experiments. SIAM J. Imaging Sci. 6(4), 2640–2684 (2013)

  50. Moreau, J.J.: Proximité et dualité dans un espace hilbertien. Bull. Soc. Math. France 93, 273–299 (1965)

  51. Nikolova, M.: Weakly constrained minimization: application to the estimation of images and signals involving constant regions. J. Math. Imaging Vis. 21(2), 155–175 (2004)

  52. Nikolova, M.: Model distortions in Bayesian MAP reconstruction. Inverse Probl. Imaging 1(2), 399 (2007)

  53. Pereyra, M.: Revisiting maximum-a-posteriori estimation in log-concave models. SIAM J. Imaging Sci. 12(1), 650–670 (2019)

  54. Pfeffer, W.F.: Divergence theorem for vector fields with singularities. In: New Integrals, pp. 150–166. Springer (1990)

  55. Phillips, D.L.: A technique for the numerical solution of certain integral equations of the first kind. J. ACM 9(1), 84–97 (1962)

  56. Prékopa, A.: Logarithmic concave measures with application to stochastic programming. Acta Sci. Math. 32, 301–316 (1971)

  57. Roberts, G.O., Rosenthal, J.S.: Optimal scaling for various Metropolis–Hastings algorithms. Stat. Sci. 16(4), 351–367 (2001)

  58. Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton (1970)

  59. Rockafellar, R.T., Wets, R.J.B.: Variational Analysis, Grundlehren der Mathematischen Wissenschaften, vol. 317. Springer-Verlag, Berlin (2009)

  60. Rudin, L.I., Osher, S., Fatemi, E.: Nonlinear total variation based noise removal algorithms. Phys. D 60(1–4), 259–268 (1992)

  61. Rudin, W.: Real and Complex Analysis. Tata McGraw-Hill Education, New York (2006)

  62. Stuart, A.M.: Inverse problems: a Bayesian perspective. Acta Numer. 19, 451–559 (2010)

  63. Tarantola, A.: Inverse Problem Theory and Methods for Model Parameter Estimation. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (2005)

  64. Tikhonov, A.N., Goncharsky, A., Stepanov, V., Yagola, A.G.: Numerical Methods for the Solution of Ill-Posed Problems, vol. 328. Springer Science & Business Media, Berlin (1995)

  65. Vidal, R., Bruna, J., Giryes, R., Soatto, S.: Mathematics of deep learning (2017). arXiv preprint arXiv:1712.04741

  66. Vogel, C.R.: Computational Methods for Inverse Problems, Frontiers in Applied Mathematics, vol. 23. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA (2002)

  67. Winkler, G.: Image Analysis, Random Fields and Markov Chain Monte Carlo Methods: A Mathematical Introduction, Applications of Mathematics, vol. 27, 2nd edn. Springer-Verlag, Berlin (2003)

  68. Woodford, O.J., Rother, C., Kolmogorov, V.: A global perspective on MAP inference for low-level vision. In: 2009 IEEE 12th International Conference on Computer Vision, pp. 2319–2326. IEEE (2009)

  69. Zhou, X.: On the Fenchel duality between strong convexity and Lipschitz continuous gradient (2018). arXiv preprint arXiv:1803.06573

Author information

Corresponding author

Correspondence to Jérôme Darbon.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Research supported by NSF DMS 1820821. Authors’ names are given in last/family name alphabetical order.

Appendices

Proof of Proposition 3.1

We will use the following lemma, which characterizes the partition function (26) in terms of the solution to a Cauchy problem involving the heat equation with initial data \(J \in \Gamma _0({\mathbb {R}}^n)\), to prove parts (i) and (ii)(a)–(d) of Proposition 3.1.

Lemma A.1

(The heat equation with initial data in \(\Gamma _0({\mathbb {R}}^n)\)) Suppose the function \(J:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\cup \{+\infty \}\) satisfies assumptions (A1)–(A3).

  1. (i)

    For every \(\epsilon >0\), the function \(w_\epsilon :{\mathbb {R}}^n\times [0,+\infty )\rightarrow (0,1]\) defined by

    $$\begin{aligned} w_{\epsilon }({\varvec{x}},t)&:=\frac{1}{(2\pi t \epsilon )^{n/2}}Z_J({\varvec{x}},t,\epsilon ) \nonumber \\&= \frac{1}{\left( 2\pi t\epsilon \right) ^{n/2}}\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }d{\varvec{y}} \end{aligned}$$
    (55)

    is the unique smooth solution to the Cauchy problem

    $$\begin{aligned} {\left\{ \begin{array}{ll} \frac{\partial w_{\epsilon }}{\partial t}({\varvec{x}},t)=\frac{\epsilon }{2}\Delta _{{\varvec{x}}}w_{\epsilon }({\varvec{x}},t) &{} \text{ in } {\mathbb {R}}^n\times (0,+\infty ),\\ w_{\epsilon }({\varvec{x}},0)=e^{-J({\varvec{x}})/\epsilon } &{} \text{ in } {\mathbb {R}}^n. \end{array}\right. } \end{aligned}$$
    (56)

    In addition, the domain of integration of the integral (55) can be taken to be \(\mathrm {dom}~J\) or, up to a set of Lebesgue measure zero, \(\mathrm {int}~(\mathrm {dom}~J)\) or \(\mathrm {dom}~\partial J\). Furthermore, for every \({\varvec{x}}\in {\mathbb {R}}^n\) and \(\epsilon > 0\), except possibly at the points \({\varvec{x}}\in (\mathrm {dom}~J)\setminus (\mathrm {int}~\mathrm {dom}~J)\) if such points exist, the pointwise limit of \(w_\epsilon ({\varvec{x}},t)\) as \(t \rightarrow 0\) exists and satisfies

    $$\begin{aligned} \lim _{\begin{array}{c} t\rightarrow 0\\ t>0 \end{array}}w_{\epsilon }({\varvec{x}},t)=e^{-J({\varvec{x}})/\epsilon }, \end{aligned}$$

    with the limit equal to 0 whenever \({\varvec{x}}\notin \mathrm {dom}~J\).

  2. (ii)

    (Log-concavity and monotonicity properties).

    1. (a)

      The function \({\mathbb {R}}^n\times (0,+\infty )\ni ({\varvec{x}},t)\mapsto t^{n/2}w_{\epsilon }({\varvec{x}},t)\) is jointly log-concave.

    2. (b)

      The function \((0,+\infty )\ni t\mapsto t^{n/2}w_{\epsilon }({\varvec{x}},t)\) is strictly monotone increasing.

    3. (c)

      The function \((0,+\infty )\ni \epsilon \mapsto \epsilon ^{n/2}w_{\epsilon }({\varvec{x}},t)\) is strictly monotone increasing.

    4. (d)

      The function \({\mathbb {R}}^n\ni {\varvec{x}}\mapsto e^{\frac{1}{2t\epsilon }\left\| {{\varvec{x}}} \right\| _{2}^{2}}w_{\epsilon }({\varvec{x}},t)\) is strictly log-convex.

The proof of (i) follows from classical PDE arguments for the Cauchy problem (56), tailored to the initial data \(({\varvec{x}},\epsilon )\mapsto e^{-J({\varvec{x}})/\epsilon }\) with J satisfying assumptions (A1)–(A3), and the proof of the log-concavity and monotonicity properties (ii)(a)–(d) follows from the Prékopa–Leindler and Hölder inequalities [30, 46, 56]; we present the details below.
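As a complement (not part of the original argument), a minimal numerical sketch of (55)–(56) may help fix ideas; it assumes NumPy, and the quadratic prior \(J(y)=y^2/2\) and all parameter values are illustrative choices. For \(n=1\) and this prior, the integral (55) has the closed form \(w_\epsilon (x,t) = (1+t)^{-1/2}e^{-x^2/(2(1+t)\epsilon )}\), and the heat equation (56) can be checked by finite differences.

```python
# Sanity check of (55)-(56) for n = 1 and the illustrative prior J(y) = y^2/2.
import numpy as np

ys = np.linspace(-12.0, 12.0, 200001)  # quadrature grid for the y-integral
dy = ys[1] - ys[0]

def w_eps(x, t, eps):
    # definition (55), evaluated by a Riemann sum on the grid
    integrand = np.exp(-((x - ys) ** 2 / (2 * t) + 0.5 * ys ** 2) / eps)
    return integrand.sum() * dy / np.sqrt(2 * np.pi * t * eps)

x, t, eps = 0.7, 0.5, 0.1
closed_form = np.exp(-x ** 2 / (2 * (1 + t) * eps)) / np.sqrt(1 + t)
assert abs(w_eps(x, t, eps) - closed_form) < 1e-8

# finite-difference check of the heat equation in (56)
h = 1e-3
dw_dt = (w_eps(x, t + h, eps) - w_eps(x, t - h, eps)) / (2 * h)
lap = (w_eps(x + h, t, eps) - 2 * w_eps(x, t, eps) + w_eps(x - h, t, eps)) / h ** 2
assert abs(dw_dt - 0.5 * eps * lap) < 1e-4
```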

Proof

Proof of Lemma A.1(i): This result follows directly from the theory of convolution of Schwartz distributions ([42], Chapter 2, Sect. 2.1, Chapter 4, Sect. 4.2 and 4.4, and in particular Theorem 4.4.1 on page 110). To see why this is the case, note that by assumptions (A1)–(A3) the initial condition \({\varvec{y}}\mapsto e^{-J({\varvec{y}})/\epsilon }\) is a locally integrable function, and locally integrable functions are Schwartz distributions.

Proof of Lemma A.1(ii)(a): The log-concavity property will be shown using the Prékopa–Leindler inequality.

Theorem A.1

(Prékopa–Leindler inequality [46, 56]) Let f, g, and h be non-negative, real-valued, measurable functions on \({\mathbb {R}}^n\), and suppose

$$\begin{aligned} h(\lambda {\varvec{y}}_{1}+(1-\lambda ){\varvec{y}}_{2})\geqslant f({\varvec{y}}_{1})^{\lambda }g({\varvec{y}}_{2})^{(1-\lambda )} \end{aligned}$$

for every \({\varvec{y}}_{1},\) \({\varvec{y}}_{2}\in {\mathbb {R}}^n\) and \(\lambda \in (0,1)\). Then,

$$\begin{aligned} \int _{{\mathbb {R}}^n}h({\varvec{y}})d{\varvec{y}}\geqslant \left( \int _{{\mathbb {R}}^n}f({\varvec{y}})d{\varvec{y}}\right) ^{\lambda }\left( \int _{{\mathbb {R}}^n}g({\varvec{y}})d{\varvec{y}}\right) ^{(1-\lambda )}. \end{aligned}$$

Proof of Lemma A.1(ii)(a) (continued): Let \(\epsilon >0\), \(\lambda \in (0,1)\), \({\varvec{x}}=\lambda {\varvec{x}}_{1}+(1-\lambda ){\varvec{x}}_{2}\), \({\varvec{y}}=\lambda {\varvec{y}}_{1}+(1-\lambda ){\varvec{y}}_{2}\), and \(t=\lambda t_{1}+(1-\lambda )t_{2}\) for any \({\varvec{x}}_{1},\,{\varvec{x}}_{2},\,{\varvec{y}}_{1},\,{\varvec{y}}_{2}\in {\mathbb {R}}^n\) and \(t_{1},\,t_{2}\in (0,+\infty )\). The joint convexity of the function \({\mathbb {R}}^n\times (0,+\infty )\ni ({\varvec{z}},t)\mapsto \frac{1}{2t}\left\| {\varvec{z}}\right\| _{2}^{2}\) and convexity of J imply

$$\begin{aligned}&\frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}}) \\&\quad \leqslant \frac{\lambda }{2t_{1}}\left\| {\varvec{x}}_{1}-{\varvec{y}}_{1}\right\| _{2}^{2}+\frac{(1-\lambda )}{2t_{2}}\left\| {\varvec{x}}_{2}-{\varvec{y}}_{2}\right\| _{2}^{2}\\&\qquad +\lambda J({\varvec{y}}_{1})+(1-\lambda )J({\varvec{y}}_{2}). \end{aligned}$$

This gives

$$\begin{aligned}&\frac{e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }}{(2\pi \epsilon )^{n/2}}\\&\quad \geqslant \left( \frac{e^{-\left( \frac{1}{2t_{1}}\left\| {\varvec{x}}_{1}-{\varvec{y}}_{1}\right\| _{2}^{2}+J({\varvec{y}}_{1})\right) /\epsilon }}{(2\pi \epsilon )^{n/2}} \right) ^{\lambda }\\&\qquad \left( \frac{e^{-\left( \frac{1}{2t_{2}}\left\| {\varvec{x}}_{2}-{\varvec{y}}_{2}\right\| _{2}^{2}+J({\varvec{y}}_{2})\right) /\epsilon }}{(2\pi \epsilon )^{n/2}}\right) ^{1-\lambda }. \end{aligned}$$

Applying the Prékopa–Leindler inequality with

$$\begin{aligned} h({\varvec{y}})= & {} \frac{e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }}{(2\pi \epsilon )^{n/2}},\\ f({\varvec{y}})= & {} \frac{e^{-\left( \frac{1}{2t_{1}}\left\| {\varvec{x}}_{1}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }}{(2\pi \epsilon )^{n/2}}, \end{aligned}$$

and

$$\begin{aligned} g({\varvec{y}})=\frac{e^{-\left( \frac{1}{2t_{2}}\left\| {\varvec{x}}_{2}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }}{(2\pi \epsilon )^{n/2}}, \end{aligned}$$

and using the definition (55) of \(w_{\epsilon }({\varvec{x}},t)\), we get

$$\begin{aligned} t^{n/2}w_{\epsilon }({\varvec{x}},t)\geqslant \left( t_{1}^{n/2}w_{\epsilon } ({\varvec{x}}_{1},t_{1})\right) ^{\lambda }\left( t_{2}^{n/2}w_{\epsilon }({\varvec{x}}_{2},t_{2})\right) ^{(1-\lambda )}. \end{aligned}$$

As a result, the function \(({\varvec{x}},t)\mapsto t^{n/2}w_{\epsilon }({\varvec{x}},t)\) is jointly log-concave on \({\mathbb {R}}^n\times (0,+\infty )\).

Proof of Lemma A.1(ii)(b): Since \(t\mapsto \frac{1}{t}\) is strictly monotone decreasing on \((0,+\infty )\), then for every \({\varvec{x}}\in {\mathbb {R}}^n\), \({\varvec{y}}\in \mathrm {dom}~J\), \(\epsilon >0\), and \(0<t_1<t_2\),

$$\begin{aligned}&\frac{e^{-\left( \frac{1}{2t_1}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }}{(2\pi \epsilon )^{n/2}}\\&\quad <\frac{e^{-\left( \frac{1}{2t_2}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }}{(2\pi \epsilon )^{n/2}} \end{aligned}$$

whenever \({\varvec{x}}\ne {\varvec{y}}\). Integrating both sides of the inequality with respect to \({\varvec{y}}\) over \(\mathrm {dom}~J\) yields

$$\begin{aligned}&\frac{1}{(2\pi \epsilon )^{n/2}}\int _{\mathrm {dom}~J}e^{-\left( \frac{1}{2t_1}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }d{\varvec{y}}\\&\quad <\frac{1}{(2\pi \epsilon )^{n/2}}\int _{\mathrm {dom}~J}e^{-\left( \frac{1}{2t_2}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }d{\varvec{y}}. \end{aligned}$$

As a result, the function \(t\mapsto t^{n/2}w_{\epsilon }({\varvec{x}},t)\) is strictly monotone increasing on \((0,+\infty )\).

Proof of Lemma A.1(ii)(c): Since \(\epsilon \mapsto \frac{1}{\epsilon }\) is strictly monotone decreasing on \((0,+\infty )\) and \(\mathrm {dom}~J \ni {\varvec{y}}\mapsto J({\varvec{y}})\) is non-negative by assumption (A3), then for every \({\varvec{x}}\in {\mathbb {R}}^n\), \(t>0\), and \(0<\epsilon _1<\epsilon _2\) we have

$$\begin{aligned} e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon _1}<e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon _2} \end{aligned}$$

whenever \({\varvec{x}}\ne {\varvec{y}}\). Integrating both sides of the inequality with respect to \({\varvec{y}}\) over \(\mathrm {dom}~J\) yields

$$\begin{aligned}&\int _{\mathrm {dom}~J}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon _1}d{\varvec{y}}\\&\quad <\int _{\mathrm {dom}~J}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon _2}d{\varvec{y}}. \end{aligned}$$

As a result, the function \(\epsilon \mapsto \epsilon ^{n/2}w_{\epsilon }({\varvec{x}},t)\) is strictly monotone increasing on \((0,+\infty )\).

Proof of Lemma A.1(ii)(d): Let \(\epsilon >0\), \(t>0\), \(\lambda \in (0,1)\), \({\varvec{x}}_1,{\varvec{x}}_2\in {\mathbb {R}}^n\) with \({\varvec{x}}_1\ne {\varvec{x}}_2\) and \({\varvec{x}}=\lambda {\varvec{x}}_{1}+(1-\lambda ){\varvec{x}}_{2}\). Then,

$$\begin{aligned}&e^{\frac{1}{2t\epsilon }\left\| {{\varvec{x}}} \right\| _{2}^{2}}w_\epsilon ({\varvec{x}},t) \\&\quad = \frac{1}{(2\pi t\epsilon )^{n/2}}\int _{\mathrm {dom}~J} e^{\left( \left\langle {\varvec{x}},{\varvec{y}}\right\rangle /t -\frac{1}{2t}\left\| {{\varvec{y}}} \right\| _{2}^{2} - J({\varvec{y}})\right) /\epsilon }d{\varvec{y}}\\&\quad =\int _{\mathrm {dom}~J} \left( \frac{e^{\left( \left\langle {\varvec{x}}_1,{\varvec{y}}\right\rangle /t-\frac{1}{2t}\left\| {{\varvec{y}}} \right\| _{2}^{2} - J({\varvec{y}})\right) /\epsilon }}{(2\pi t \epsilon )^{n/2}}\right) ^\lambda \\&\qquad \left( \frac{e^{\left( \left\langle {\varvec{x}}_2,{\varvec{y}}\right\rangle /t-\frac{1}{2t}\left\| {{\varvec{y}}} \right\| _{2}^{2} - J({\varvec{y}})\right) /\epsilon }}{(2\pi t \epsilon )^{n/2}}\right) ^{1-\lambda } d{\varvec{y}}. \end{aligned}$$

Hölder’s inequality [30, Theorem 6.2] then implies

$$\begin{aligned}&e^{\frac{1}{2t\epsilon }\left\| {{\varvec{x}}} \right\| _{2}^{2}}w_\epsilon ({\varvec{x}},t) \\&\quad \leqslant \left( \int _{\mathrm {dom}~J} \frac{e^{\left( \left\langle {\varvec{x}}_1,{\varvec{y}}\right\rangle /t-\frac{1}{2t}\left\| {{\varvec{y}}} \right\| _{2}^{2} - J({\varvec{y}})\right) /\epsilon }}{(2\pi t \epsilon )^{n/2}} d{\varvec{y}}\right) ^\lambda \\&\qquad \left( \int _{\mathrm {dom}~J} \frac{e^{\left( \left\langle {\varvec{x}}_2,{\varvec{y}}\right\rangle /t-\frac{1}{2t}\left\| {{\varvec{y}}} \right\| _{2}^{2} - J({\varvec{y}})\right) /\epsilon }}{(2\pi t \epsilon )^{n/2}} d{\varvec{y}}\right) ^{1-\lambda } \\&\quad =\left( e^{\frac{1}{2t\epsilon }\left\| {{\varvec{x}}_1} \right\| _{2}^{2}}w_\epsilon ({\varvec{x}}_1,t)\right) ^\lambda \left( e^{\frac{1}{2t\epsilon }\left\| {{\varvec{x}}_2} \right\| _{2}^{2}}w_\epsilon ({\varvec{x}}_2,t)\right) ^{1-\lambda }, \end{aligned}$$

where the inequality in the equation above is an equality if and only if there exists a constant \(\alpha \in {\mathbb {R}}\) such that \(\alpha e^{\left\langle {\varvec{x}}_1,{\varvec{y}}\right\rangle /t\epsilon } = e^{\left\langle {\varvec{x}}_2,{\varvec{y}}\right\rangle /t\epsilon }\) for almost every \({\varvec{y}}\in \mathrm {dom}~J\). This does not hold here since \({\varvec{x}}_1\ne {\varvec{x}}_2\). As a result, the function \({\mathbb {R}}^n\ni {\varvec{x}}\mapsto e^{\frac{1}{2t\epsilon }\left\| {{\varvec{x}}} \right\| _{2}^{2}}w_{\epsilon }({\varvec{x}},t)\) is strictly log-convex. \(\square \)
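The monotonicity and convexity properties (ii)(a)–(d) lend themselves to a direct numerical check. The sketch below is illustrative only; it assumes NumPy, the prior \(J(y)=\left| y\right| \), which satisfies (A1)–(A3) with \(\mathrm {dom}~J = {\mathbb {R}}\), and arbitrary parameter values, and verifies midpoint versions of the four statements.

```python
# Numerical check of Lemma A.1(ii)(a)-(d) for n = 1 and J(y) = |y| (illustrative).
import numpy as np

ys = np.linspace(-15.0, 15.0, 300001)
dy = ys[1] - ys[0]

def w(x, t, eps):
    # definition (55) with J(y) = |y|, evaluated by a Riemann sum
    e = np.exp(-((x - ys) ** 2 / (2 * t) + np.abs(ys)) / eps)
    return e.sum() * dy / np.sqrt(2 * np.pi * t * eps)

eps, lam = 0.2, 0.4
(x1, t1), (x2, t2) = (0.3, 0.5), (-0.8, 1.5)
xm, tm = lam * x1 + (1 - lam) * x2, lam * t1 + (1 - lam) * t2

# (ii)(a): midpoint log-concavity of (x, t) -> t^(n/2) w_eps(x, t), n = 1
F = lambda x, t: 0.5 * np.log(t) + np.log(w(x, t, eps))
assert F(xm, tm) >= lam * F(x1, t1) + (1 - lam) * F(x2, t2)

# (ii)(b) and (ii)(c): strict monotonicity in t and in eps
assert np.sqrt(t1) * w(x1, t1, eps) < np.sqrt(t2) * w(x1, t2, eps)
assert np.sqrt(0.1) * w(x1, t1, 0.1) < np.sqrt(0.4) * w(x1, t1, 0.4)

# (ii)(d): strict midpoint log-convexity of x -> exp(x^2/(2 t eps)) w_eps(x, t)
G = lambda x: x ** 2 / (2 * t1 * eps) + np.log(w(x, t1, eps))
assert G(xm) < lam * G(x1) + (1 - lam) * G(x2)
```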

Proof of Proposition 3.1(i) and (ii)(a)–(d): The proof of these statements follows from Lemma A.1 and classic results about the Cole–Hopf transform (see, e.g., [26, Section 4.4.1]), with \(S_\epsilon ({\varvec{x}},t) :=-\epsilon \log (w_\epsilon ({\varvec{x}},t))\).

Proof of Proposition 3.1(iii): The formulas follow from a straightforward calculation of the gradient, divergence, and Laplacian of \(S_\epsilon ({\varvec{x}},t)\) that we omit here. Since the function \({\varvec{x}}\mapsto \frac{1}{2}\left\| {{\varvec{x}}} \right\| _{2}^{2}-tS_{\epsilon }({\varvec{x}},t)\) is strictly convex, we can invoke [58, Corollary 26.3.1] to conclude that its gradient \({\varvec{x}}\mapsto {\varvec{x}}- t\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t)\), which gives the posterior mean \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\), is bijective.
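As an aside, the representation formula (30) can be illustrated numerically: the posterior mean computed directly as a Gibbs average coincides with \({\varvec{x}}- t\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t)\) evaluated by finite differences. The sketch below assumes NumPy and the illustrative prior \(J(y)=\left| y\right| \); the step size and tolerance are arbitrary.

```python
# Two routes to the posterior mean: Gibbs average vs. x - t * dS_eps/dx, as in (30).
import numpy as np

ys = np.linspace(-15.0, 15.0, 300001)
dy = ys[1] - ys[0]
t, eps = 0.7, 0.15
J = np.abs(ys)                          # illustrative prior J(y) = |y| on the grid

def gibbs(x):
    return np.exp(-((x - ys) ** 2 / (2 * t) + J) / eps)

def S(x):                               # S_eps(x, t) = -eps * log(w_eps(x, t))
    w = gibbs(x).sum() * dy / np.sqrt(2 * np.pi * t * eps)
    return -eps * np.log(w)

x = 0.9
rho = gibbs(x)
rho /= rho.sum() * dy                   # posterior density
u_pm_mean = (ys * rho).sum() * dy       # posterior mean E_J[y]
h = 1e-4
u_pm_pde = x - t * (S(x + h) - S(x - h)) / (2 * h)
assert abs(u_pm_mean - u_pm_pde) < 1e-6
```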

Proof of Proposition 3.1(iv): We will prove this result in three steps. First, we will show that

$$\begin{aligned} \limsup _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array} }S_{\epsilon }({\varvec{x}},t)\leqslant \inf _{{\varvec{y}}\in \mathrm {int}~(\mathrm {dom}~J)}\left\{ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right\} \end{aligned}$$

and

$$\begin{aligned}&\inf _{{\varvec{y}}\in \mathrm {int}~(\mathrm {dom}~J)}\left\{ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right\} \\&\quad = \inf _{{\varvec{y}}\in \mathrm {dom}~J}\left\{ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right\} \equiv S_0({\varvec{x}},t). \end{aligned}$$

Next, we will show that \(\liminf _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array} }S_{\epsilon }({\varvec{x}},t)\geqslant S_0({\varvec{x}},t)\). Finally, we will use steps 1 and 2 to conclude that \(\lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}S_{\epsilon }({\varvec{x}},t)=S_0({\varvec{x}},t)\). Pointwise and local uniform convergence of the gradient \(\lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}\nabla _{{\varvec{x}}}S_{\epsilon }({\varvec{x}},t)=\nabla _{{\varvec{x}}}S_0({\varvec{x}},t)\), the partial derivative \(\lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}\frac{\partial S_{\epsilon }({\varvec{x}},t)}{\partial t}=\frac{\partial S_0({\varvec{x}},t)}{\partial t}\), and the Laplacian \(\lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}\frac{\epsilon }{2}\Delta _{{\varvec{x}}}S_{\epsilon }({\varvec{x}},t)=0\) then follow from the convexity and differentiability of the solutions \(({\varvec{x}},t)\mapsto S_0({\varvec{x}},t)\) and \(({\varvec{x}},t)\mapsto S_{\epsilon }({\varvec{x}},t)\) to the HJ PDEs (20) and (29).

In what follows, we will use the following large deviation principle result [22]: For every Lebesgue measurable set \({\mathcal {A}} \subset {\mathbb {R}}^n\),

$$\begin{aligned}&\lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}-\epsilon \ln \left( \frac{1}{(2\pi t\epsilon )^{n/2}} \int _{{\mathcal {A}}}e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}}d{\varvec{y}}\right) \\&\quad =\mathop {{\mathrm{ess\,inf}}}\limits _{{\varvec{y}}\in {\mathcal {A}}}\left\{ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}\right\} , \end{aligned}$$

where

$$\begin{aligned}&\mathop {{\mathrm{ess\,inf}}}\limits _{{\varvec{y}}\in {\mathcal {A}}} \left\{ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}\right\} \\&\quad = \sup \left\{ a \in {\mathbb {R}} :a \leqslant \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} \text { for a.e. } {\varvec{y}}\in {\mathcal {A}} \right\} . \end{aligned}$$
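This limit can be observed numerically. In the sketch below (illustrative; it assumes NumPy, the set \({\mathcal {A}} = [1,2]\), \(x=0\), and \(t=1/2\)), the exponent is shifted by its minimum before exponentiating to avoid numerical underflow for small \(\epsilon \).

```python
# Numerical illustration of the large deviation limit (illustrative values).
import numpy as np

x, t = 0.0, 0.5
ys = np.linspace(1.0, 2.0, 200001)   # the set A = [1, 2]
dy = ys[1] - ys[0]
phi = (x - ys) ** 2 / (2 * t)
target = phi.min()                   # ess-inf of |x - y|^2 / (2t) over A, here 1.0

def lhs(eps):
    m = phi.min()                    # shift to avoid underflow of exp(-phi/eps)
    val = np.exp(-(phi - m) / eps).sum() * dy
    return m - eps * np.log(val / np.sqrt(2 * np.pi * t * eps))

for eps in (1e-1, 1e-2, 1e-3):
    print(eps, lhs(eps), target)     # lhs(eps) approaches target as eps -> 0
```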

Step 1. (Adapted from Deuschel and Stroock [22], Lemma 2.1.7.) By convexity, the function J is continuous at every point \({\varvec{y}}_{0}\in \mathrm {int}~(\mathrm {dom}~J)\), the latter set being open. Therefore, for every such \({\varvec{y}}_{0}\) there exists a number \(r_{{\varvec{y}}_{0}}>0\) such that for every \(0<r\leqslant r_{{\varvec{y}}_{0}}\) the open ball \(B_{r}({\varvec{y}}_{0})\) is contained in \(\mathrm {int}~(\mathrm {dom}~J)\). Hence,

$$\begin{aligned} S_{\epsilon }({\varvec{x}},t)&:=-\epsilon \ln \left( \frac{1}{(2\pi t\epsilon )^{n/2}}\int _{\mathrm {int}~(\mathrm {dom}~J)}e^{-(\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}+J({\varvec{y}}))/\epsilon }d{\varvec{y}}\right) \\&\leqslant -\epsilon \ln \left( \frac{1}{(2\pi t\epsilon )^{n/2}}\int _{B_{r}({\varvec{y}}_{0})}e^{-(\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} +J({\varvec{y}}))/\epsilon }d{\varvec{y}}\right) \\&\leqslant -\epsilon \ln \left( \frac{1}{(2\pi t\epsilon )^{n/2}}\int _{B_{r}({\varvec{y}}_{0})}e^{-\frac{1}{2t\epsilon } \left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}}d{\varvec{y}}\right) \\&\quad +\sup _{{\varvec{y}}\in B_{r}({\varvec{y}}_{0})}J({\varvec{y}}). \end{aligned}$$

Take \(\limsup _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}\) and apply the large deviation principle to the term on the right to get

$$\begin{aligned} \limsup _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}S_{\epsilon }({\varvec{x}},t) \leqslant \mathop {{\mathrm{ess\,inf}}}\limits _{{\varvec{y}}\in B_{r}({\varvec{y}}_{0})}\left\{ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}\right\} +\sup _{{\varvec{y}}\in B_{r}({\varvec{y}}_{0})}J({\varvec{y}}). \end{aligned}$$

Take \(\lim _{r\rightarrow 0}\) on both sides of the inequality to find

$$\begin{aligned} \limsup _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}S_{\epsilon }({\varvec{x}},t)\leqslant \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}_{0}\right\| _{2}^{2}+J({\varvec{y}}_{0}). \end{aligned}$$

Since the inequality holds for every \({\varvec{y}}_{0}\in \mathrm {int}~(\mathrm {dom}~J)\), we can take the infimum over all \({\varvec{y}}_{0}\in \mathrm {int}~(\mathrm {dom}~J)\) on the right-hand side of the inequality to get

$$\begin{aligned} \limsup _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}S_{\epsilon }({\varvec{x}},t)\leqslant \inf _{{\varvec{y}}\in \mathrm {int}~(\mathrm {dom}~J)}\left\{ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right\} . \end{aligned}$$
(57)

By assumptions (A1) and (A2) that \(J \in \Gamma _0({\mathbb {R}}^n)\) and \(\mathrm {int}~(\mathrm {dom}~J) \ne \varnothing \), the infimum on the right hand side is equal to that taken over \(\mathrm {dom}~J\) [58, Corollary 7.3.2], i.e.,

$$\begin{aligned}&\inf _{{\varvec{y}}\in \mathrm {int}~(\mathrm {dom}~J)}\left\{ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right\} \nonumber \\&\quad = \inf _{{\varvec{y}}\in \mathrm {dom}~J}\left\{ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right\} \equiv S_0({\varvec{x}},t). \end{aligned}$$
(58)

We combine (57) and (58) to obtain

$$\begin{aligned} \limsup _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}S_{\epsilon }({\varvec{x}},t)\leqslant S_0({\varvec{x}},t), \end{aligned}$$

which is the desired result.

Step 2. We can invoke Lemma 2.1.8 in [22] because its conditions are satisfied (in the notation of [22], \(\Phi =-J\), which is upper semicontinuous, \({\varvec{y}}\mapsto \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}\) is the rate function, and note that the tail condition (2.1.9) is satisfied in that \(\sup _{{\varvec{y}}\in {\mathbb {R}}^n} -J({\varvec{y}}) = -\inf _{{\varvec{y}}\in {\mathbb {R}}^n} J({\varvec{y}}) = 0\) by assumption (A3)) to get

$$\begin{aligned} \liminf _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}S_{\epsilon }({\varvec{x}},t)\geqslant S_0({\varvec{x}},t). \end{aligned}$$

Step 3. Combining the two limits derived in steps 1 and 2 yields

$$\begin{aligned} \lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}S_{\epsilon }({\varvec{x}},t)=S_0({\varvec{x}},t) \end{aligned}$$

for every \({\varvec{x}}\in {\mathbb {R}}^n\) and \(t>0\), where the convergence is uniform on every compact subset of \({\mathbb {R}}^n\times (0,+\infty )\) [58, Theorem 10.8].
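This convergence is easy to observe numerically. For the illustrative prior \(J(y)=\left| y\right| \) with \(n=1\), the limit \(S_0(\cdot ,t)\) is the Moreau envelope of \(\left| \cdot \right| \), i.e., the Huber function \(x^2/(2t)\) for \(\left| x\right| \leqslant t\) and \(\left| x\right| -t/2\) otherwise. The sketch below (assuming NumPy; all values are arbitrary) again shifts the exponent by its minimum to avoid underflow.

```python
# S_eps(x, t) -> S_0(x, t), the Moreau envelope of |.|, as eps -> 0 (illustrative).
import numpy as np

t = 0.5
ys = np.linspace(-10.0, 10.0, 400001)
dy = ys[1] - ys[0]

def S_eps(x, eps):
    phi = (x - ys) ** 2 / (2 * t) + np.abs(ys)   # MAP objective
    m = phi.min()                                # shift to avoid underflow
    val = np.exp(-(phi - m) / eps).sum() * dy
    return m - eps * np.log(val / np.sqrt(2 * np.pi * t * eps))

S0 = lambda x: x ** 2 / (2 * t) if abs(x) <= t else abs(x) - t / 2

for x in (0.2, 1.3):
    print(x, [round(S_eps(x, e), 4) for e in (1e-1, 1e-2, 1e-3)], S0(x))
```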

By differentiability and joint convexity of both \({\mathbb {R}}^n\times (0,+\infty ) \ni ({\varvec{x}},t) \mapsto S_0({\varvec{x}},t)\) and \({\mathbb {R}}^n\times (0,+\infty ) \ni ({\varvec{x}},t)\mapsto S_{\epsilon }({\varvec{x}},t)-\frac{n\epsilon }{2}\ln t\) (Proposition 2.2 (i), and Proposition 3.1 (i) and (ii)(a)), we can invoke [58, Theorem 25.7] to get

$$\begin{aligned} \lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon>0 \end{array}}\nabla _{{\varvec{x}}}S_{\epsilon } ({\varvec{x}},t)= & {} \nabla _{{\varvec{x}}}S_0({\varvec{x}},t)\,\text{ and }\,\lim _{\begin{array}{c} \epsilon \rightarrow 0 \\ \epsilon>0 \end{array}}\left( \frac{\partial S_{\epsilon }({\varvec{x}},t)}{\partial t}-\frac{n\epsilon }{2t}\right) \\= & {} \lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}\frac{\partial S_{\epsilon }({\varvec{x}},t)}{\partial t}=\frac{\partial S_0({\varvec{x}},t)}{\partial t}, \end{aligned}$$

for every \({\varvec{x}}\in {\mathbb {R}}^n\) and \(t>0\), where the limit converges uniformly on every compact subset of \({\mathbb {R}}^n\times (0,+\infty )\). Furthermore, the viscous HJ PDE (29) for \(S_\epsilon \) implies that

$$\begin{aligned} \lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon>0 \end{array}}\frac{\epsilon }{2}\Delta _{{\varvec{x}}}S_{\epsilon }({\varvec{x}},t)&=\lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array} }\left( \frac{\partial S_{\epsilon }({\varvec{x}},t)}{\partial t}+\frac{1}{2}\left\| \nabla _{{\varvec{x}}}S_{\epsilon }({\varvec{x}},t)\right\| ^{2}\right) \\&=\frac{\partial S_{0}({\varvec{x}},t)}{\partial t}+\frac{1}{2}\left\| \nabla _{{\varvec{x}}}S_0({\varvec{x}},t)\right\| ^{2}\\&=0, \end{aligned}$$

where the last equality holds thanks to the HJ PDE (20) (see Proposition 2.2). Here, again, the limit holds for every \({\varvec{x}}\in {\mathbb {R}}^n\) and \(t>0\), and the limit converges uniformly over any compact subset of \({\mathbb {R}}^n\times (0,+\infty )\). Finally, the limit \(\lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}} {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) = {\varvec{u}}_{MAP}({\varvec{x}},t)\) holds directly as a consequence of the limit \(\lim _{\begin{array}{c} \epsilon \rightarrow 0\\ \epsilon >0 \end{array}}\nabla _{{\varvec{x}}}S_{\epsilon }({\varvec{x}},t)=\nabla _{{\varvec{x}}}S_0({\varvec{x}},t)\) and the representation formulas (30) (see Proposition 3.1(iii)) and (23) (see Proposition 2.2(ii)) for the posterior mean and MAP estimates, respectively.

Proof of Proposition 4.1

Proof of (i): We will prove that \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\in \mathrm {int}~(\mathrm {dom}~J)\) in two steps. First, we will use the projection operator (10) (see Definition 4) and the posterior mean estimate \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\) to prove by contradiction that \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\in \mathrm {cl}~(\mathrm {dom}~J)\). Second, we will use the following variant of the Hahn–Banach theorem for convex bodies in \({\mathbb {R}}^n\) to show in fact that \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\in \mathrm {int}~(\mathrm {dom}~J)\).

Theorem B.1

[58, Theorem 11.6 and Corollary 11.6.2] Let C be a convex set. A point \({\varvec{u}}\in C\) is a relative boundary point of C if and only if there exist a vector \({\varvec{a}}\in {\mathbb {R}}^n\setminus \{{\varvec{0}}\}\) and a number \(b\in {\mathbb {R}}\) such that

$$\begin{aligned} {\varvec{u}}=\mathop {{\mathrm{arg\,max}}}\limits _{{\varvec{y}}\in C}\left\{ \left\langle {\varvec{a}},{\varvec{y}}\right\rangle +b\right\} , \end{aligned}$$

with \(\left\langle {\varvec{a}},{\varvec{y}}\right\rangle +b < \left\langle {\varvec{a}},{\varvec{u}}\right\rangle +b\) for every \({\varvec{y}}\in \mathrm {int}~(C)\).

Step 1. Suppose \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\notin \mathrm {cl}~(\mathrm {dom}~J)\). Since the set \(\mathrm {cl}~(\mathrm {dom}~J)\) is closed and convex, the projection of \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\) onto \(\mathrm {cl}~(\mathrm {dom}~J)\) given by \(\pi _{\text{ cl }(\text{ dom } J)}({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )) \equiv {\bar{{\varvec{u}}}}\) is well-defined and unique (see Definition 4), with \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\ne {\bar{{\varvec{u}}}}\) by assumption. The projection \({\bar{{\varvec{u}}}}\) also satisfies the characterization (11), namely

$$\begin{aligned} \left\langle {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\bar{{\varvec{u}}}},{\varvec{y}}-{\bar{{\varvec{u}}}}\right\rangle \leqslant 0 \end{aligned}$$

for every \({\varvec{y}}\in \mathrm {cl}~(\mathrm {dom}~J)\). Then, by linearity of the posterior mean estimate,

$$\begin{aligned} \begin{aligned} \left\| {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\bar{{\varvec{u}}}}\right\| _2 ^{2}&= \left\langle {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\bar{{\varvec{u}}}}, {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\bar{{\varvec{u}}}}\right\rangle \\&= \left\langle {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\bar{{\varvec{u}}}}, {\mathbb {E}}_{J}\left[ {{\varvec{y}}}\right] -{\bar{{\varvec{u}}}}\right\rangle \\&= {\mathbb {E}}_{J}\left[ {\left\langle {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\bar{{\varvec{u}}}},{\varvec{y}}-{\bar{{\varvec{u}}}}\right\rangle }\right] \\&\leqslant 0, \end{aligned} \end{aligned}$$

which implies that \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) = {\bar{{\varvec{u}}}}\). This contradicts the assumption that \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\notin \mathrm {cl}~(\mathrm {dom}~J)\). Hence, it follows that \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\in \mathrm {cl}~(\mathrm {dom}~J)\).
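The characterization (11) of the projection used in this step can itself be checked numerically. The sketch below is illustrative; it assumes NumPy and uses the box \([-1,1]^2\) as a stand-in for the closed convex set \(\mathrm {cl}~(\mathrm {dom}~J)\), where the projection is a coordinatewise clip.

```python
# Check of the projection characterization (11) on the box [-1, 1]^2 (illustrative).
import numpy as np

rng = np.random.default_rng(0)
proj = lambda u: np.clip(u, -1.0, 1.0)     # projection onto the box

for _ in range(100):
    u = rng.normal(scale=3.0, size=2)      # an arbitrary point of R^2
    p = proj(u)
    samples = rng.uniform(-1.0, 1.0, size=(50, 2))   # points of the set
    # <u - proj(u), y - proj(u)> <= 0 for every y in the set
    assert np.all((samples - p) @ (u - p) <= 1e-12)
```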

Step 2. We now wish to prove that \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \in \mathrm {int}~(\mathrm {dom}~J)\). Note that this inclusion trivially holds if there are no boundary points, i.e., \( \left( \mathrm {cl}~(\mathrm {dom}~J) \setminus \mathrm {int}~(\mathrm {dom}~J)\right) = \varnothing \). Now, we consider the case \(\left( \mathrm {cl}~(\mathrm {dom}~J) \setminus \mathrm {int}~(\mathrm {dom}~J)\right) \ne \varnothing \). Suppose that \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \in \left( \mathrm {cl}~(\mathrm {dom}~J) \setminus \mathrm {int}~(\mathrm {dom}~J)\right) \). Then, Theorem B.1 applies and there exist a vector \({\varvec{a}}\in {\mathbb {R}}^n\setminus \{{\varvec{0}}\}\) and a number \(b\in {\mathbb {R}}\) such that

$$\begin{aligned} {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )=\mathop {{\mathrm{arg\,max}}}\limits _{{\varvec{y}}\in \mathrm {cl}~(\mathrm {dom}~J)}\left\{ \left\langle {\varvec{a}},{\varvec{y}}\right\rangle +b\right\} , \end{aligned}$$

with \(\left\langle {\varvec{a}},{\varvec{y}}\right\rangle +b<\left\langle {\varvec{a}},{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle +b\) for every \({\varvec{y}}\in \mathrm {int}~(\mathrm {dom}~J)\). By linearity of the posterior mean estimate,

$$\begin{aligned} \begin{aligned} \left\langle {\varvec{a}},{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle +b&= \left\langle {\varvec{a}},{\mathbb {E}}_{J}\left[ {{\varvec{y}}}\right] \right\rangle +b\\&= {\mathbb {E}}_{J}\left[ {\left\langle {\varvec{a}},{\varvec{y}}\right\rangle +b}\right] \\&< {\mathbb {E}}_{J}\left[ {\left\langle {\varvec{a}},{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle +b}\right] \\&=\left\langle {\varvec{a}},{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle +b, \end{aligned} \end{aligned}$$

where the strict inequality in the third line follows from integrating over \(\mathrm {int}~(\mathrm {dom}~J)\). This contradicts the assumption that \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \in \left( \mathrm {cl}~(\mathrm {dom}~J) \setminus \mathrm {int}~(\mathrm {dom}~J)\right) \). Hence, \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\in \mathrm {int}~(\mathrm {dom}~J)\).
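A numerical illustration of part (i): for a prior with bounded domain, say \(J = 0\) on \(\mathrm {dom}~J = [-1,1]\) and \(+\infty \) outside (so \(\mathrm {int}~(\mathrm {dom}~J) = (-1,1)\); assumptions (A1)–(A3) hold), the posterior mean lands strictly inside the interior even when the data \(x\) lies far outside the domain. The sketch assumes NumPy; all values are arbitrary.

```python
# Posterior mean stays in int(dom J) for the illustrative prior J = 0 on [-1, 1].
import numpy as np

ys = np.linspace(-1.0, 1.0, 200001)   # dom J = [-1, 1], J = 0 there
t, eps = 0.5, 0.2

def u_pm(x):
    rho = np.exp(-(x - ys) ** 2 / (2 * t * eps))   # Gibbs weights on dom J
    return (ys * rho).sum() / rho.sum()            # grid spacing cancels

for x in (-4.0, 0.3, 3.0):
    u = u_pm(x)
    assert -1.0 < u < 1.0             # strictly inside int(dom J) = (-1, 1)
    print(x, u)
```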

Proof of (ii): First, since \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \in \mathrm {int}~(\mathrm {dom}~J)\) by part (i), the subdifferential of J at \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\) is non-empty, because the subdifferential \(\partial J\) is non-empty at every point \({\varvec{y}}\in \mathrm {int}~(\mathrm {dom}~J)\) [58, Theorem 23.4]. Hence, there exists a subgradient \({\varvec{p}}\in \partial J({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ))\) such that for every \({\varvec{y}}\in {\mathbb {R}}^n\),

$$\begin{aligned} J({\varvec{y}})\geqslant J({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ))+\left\langle {\varvec{p}},{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle . \end{aligned}$$
(59)

Take the expectation \({\mathbb {E}}_{J}\left[ {\cdot }\right] \) on both sides of inequality (59) to find

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {J({\varvec{y}})}\right] \\&\quad \geqslant {\mathbb {E}}_{J}\left[ {J({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ))+\left\langle {\varvec{p}},{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle }\right] \\&\quad = J({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ))+{\mathbb {E}}_{J}\left[ {\left\langle {\varvec{p}},{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle }\right] \\&\quad = J({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ))+\left\langle {\varvec{p}},{\mathbb {E}}_{J}\left[ {{\varvec{y}}}\right] -{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle \\&\quad = J({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ))+\left\langle {\varvec{p}},{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle \\&\quad = J({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )). \end{aligned} \end{aligned}$$
(60)

This gives the lower bound of inequality (38).

Second, use the convex inequality \(1+z\leqslant e^{z}\), which holds for all \(z \in {\mathbb {R}}\), with \(z \equiv J({\varvec{y}})/\epsilon \) for \({\varvec{y}}\in \mathrm {dom}~J\). This gives the inequality \(1 + \frac{1}{\epsilon }J({\varvec{y}}) \leqslant e^{J({\varvec{y}})/\epsilon }\). Multiply this inequality by \(e^{-J({\varvec{y}})/\epsilon }\) and subtract \(e^{-J({\varvec{y}})/\epsilon }\) from both sides to find

$$\begin{aligned} \frac{1}{\epsilon }J({\varvec{y}})e^{-J({\varvec{y}})/\epsilon } \leqslant (1-e^{-J({\varvec{y}})/\epsilon }). \end{aligned}$$
(61)

Multiply both sides by \(e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}}\), divide by the partition function \(Z_J({\varvec{x}},t,\epsilon )\) (see Eq. (26)), integrate with respect to \({\varvec{y}}\in \mathrm {dom}~J\), and use

$$\begin{aligned}&\frac{1}{Z_J({\varvec{x}},t,\epsilon )}\int _{\mathrm {dom}~J} \frac{1}{\epsilon }J({\varvec{y}})e^{-\left( \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} +J({\varvec{y}})\right) /\epsilon } d{\varvec{y}}\\&\quad = \frac{1}{\epsilon }{\mathbb {E}}_{J}\left[ {J({\varvec{y}})}\right] \end{aligned}$$

to obtain

$$\begin{aligned}&\frac{1}{\epsilon }{\mathbb {E}}_{J}\left[ {J({\varvec{y}})}\right] \leqslant \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\nonumber \\&\qquad \int _{\mathrm {dom}~J} \left( e^{-\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}/\epsilon } - e^{-\left( \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} +J({\varvec{y}})\right) /\epsilon }\right) d{\varvec{y}}. \end{aligned}$$
(62)

Now, we can bound the right hand side of (62) as follows

$$\begin{aligned} \begin{aligned}&\frac{1}{Z_J({\varvec{x}},t,\epsilon )}\int _{\mathrm {dom}~J} \left( e^{-\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}/\epsilon } - e^{-\left( \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} +J({\varvec{y}})\right) /\epsilon }\right) d{\varvec{y}}\\&\quad = \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\int _{\mathrm {dom}~J} e^{-\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}/\epsilon } d{\varvec{y}}\\&\qquad - \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\int _{\mathrm {dom}~J} e^{-\left( \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} +J({\varvec{y}})\right) /\epsilon } d{\varvec{y}}\\&\quad \leqslant \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\int _{{\mathbb {R}}^n} e^{-\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}/\epsilon } d{\varvec{y}}- 1 \\&\quad = \frac{(2\pi t\epsilon )^{n/2}}{Z_J({\varvec{x}},t,\epsilon )} - 1. \end{aligned} \end{aligned}$$
(63)

Combining (62) and (63), we get

$$\begin{aligned} {\mathbb {E}}_{J}\left[ {J({\varvec{y}})}\right] \leqslant \epsilon \left( \frac{(2\pi t\epsilon )^{n/2}}{Z_J({\varvec{x}},t,\epsilon )} - 1\right) . \end{aligned}$$
(64)

Using the representation formula (28) for the solution \(({\varvec{x}},t) \mapsto S_\epsilon \) to the viscous HJ PDE (29), we have that \((2\pi t\epsilon )^{n/2}/Z_J({\varvec{x}},t,\epsilon ) = e^{S_\epsilon ({\varvec{x}},t)/\epsilon }\). We can therefore write (64) as follows

$$\begin{aligned} {\mathbb {E}}_{J}\left[ {J({\varvec{y}})}\right] \leqslant \epsilon \left( e^{S_\epsilon ({\varvec{x}},t)/\epsilon } - 1\right) < +\infty . \end{aligned}$$

Combining the latter inequalities with (60), we obtain the desired set of inequalities (38).
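The chain of inequalities (38) can also be checked numerically. The sketch below is illustrative; NumPy, the prior \(J(y)=\left| y\right| \), and the parameter values are assumptions. It evaluates \(J({\varvec{u}}_{PM})\), \({\mathbb {E}}_{J}\left[ J({\varvec{y}})\right] \), and the upper bound \(\epsilon (e^{S_\epsilon /\epsilon }-1)\) by quadrature.

```python
# Numerical check of 0 <= J(u_PM) <= E_J[J(y)] <= eps * (exp(S_eps/eps) - 1).
import numpy as np

ys = np.linspace(-15.0, 15.0, 300001)
dy = ys[1] - ys[0]
x, t, eps = 0.9, 0.7, 0.15
J = np.abs(ys)                          # illustrative prior J(y) = |y|

rho = np.exp(-((x - ys) ** 2 / (2 * t) + J) / eps)
Z = rho.sum() * dy                      # partition function (26)
rho /= Z                                # posterior density
u_pm = (ys * rho).sum() * dy            # posterior mean; J(u_pm) = |u_pm| here
E_J = (J * rho).sum() * dy              # posterior expectation of J
S = -eps * np.log(Z / np.sqrt(2 * np.pi * t * eps))
upper = eps * (np.exp(S / eps) - 1.0)

assert 0.0 <= abs(u_pm) <= E_J <= upper
```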

Proof of Proposition 4.2

Proof of (i): We will show that \({\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J ({\varvec{y}})}({\varvec{0}})} \right\| _{2}}\right] < +\infty \) and derive formulas (39), (40), (41), (42), and (43) in four steps. To describe these steps, let us first introduce some notation. Recall that J satisfies assumptions (A1)–(A3) and \(\mathrm {dom}~J = {\mathbb {R}}^n\). Define the set

$$\begin{aligned} D_J:=\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \partial J({\varvec{y}}) = \{\nabla J({\varvec{y}})\} \right\} . \end{aligned}$$

We can invoke [58, Theorem 25.5] to conclude that \(D_J\) is a dense subset of \({\mathbb {R}}^n\), the n-dimensional Lebesgue measure of the set \(({\mathbb {R}}^n\setminus D_J)\) is zero, and the function \({\varvec{y}}\mapsto \nabla J({\varvec{y}})\) is continuous on \(D_J\). Now, let \({\varvec{x}}\in {\mathbb {R}}^n\), \(t>0\), \(\epsilon > 0\), and \({\varvec{y}}_0 \in {\mathbb {R}}^n\). Define the function \(\varphi _J:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) as

$$\begin{aligned} \varphi _J({\varvec{y}}|{\varvec{x}},t) = \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) + \pi _{\partial J({\varvec{y}})}({\varvec{0}}). \end{aligned}$$

Note that for every \({\varvec{y}}\in {\mathbb {R}}^n\) we have \(\varphi _J({\varvec{y}}|{\varvec{x}},t) \in \partial \left( {\mathbb {R}}^n\ni {\varvec{u}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + J({\varvec{u}}) \right) ({\varvec{y}})\), i.e., \(\varphi _J({\varvec{y}}|{\varvec{x}},t)\) is a subgradient of the function \({\varvec{u}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + J({\varvec{u}})\) evaluated at \({\varvec{u}}= {\varvec{y}}\). Let
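For a concrete instance of this subgradient field, take \(n=1\) and the illustrative prior \(J = \left| \cdot \right| \): the minimal-norm selection \(\pi _{\partial J(y)}({\varvec{0}})\) is \(\mathrm {sign}(y)\), which equals 0 at the kink \(y=0\) since \(\partial J(0) = [-1,1]\) contains 0. The sketch below (assuming NumPy; all values are arbitrary) checks that \(\varphi _J(y|x,t)\) satisfies the subgradient inequality for the MAP objective.

```python
# phi_J(y | x, t) is a subgradient of u -> (x-u)^2/(2t) + |u| (illustrative, n = 1).
import numpy as np

t, x = 0.7, 0.9
pi_dJ = lambda y: np.sign(y)      # minimal-norm subgradient of |.|; 0 at y = 0

def phi_J(y):
    return (y - x) / t + pi_dJ(y)

psi = lambda u: (x - u) ** 2 / (2 * t) + np.abs(u)
us = np.linspace(-5.0, 5.0, 1001)
for y in (-1.0, 0.0, 2.0):
    # subgradient inequality: psi(u) >= psi(y) + phi_J(y) * (u - y) for all u
    assert np.all(psi(us) >= psi(y) + phi_J(y) * (us - y) - 1e-12)
```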

$$\begin{aligned} C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon ) = \int _{{\mathbb {R}}^n} \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}e^{-\frac{1}{2t\epsilon }\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}}d{{\varvec{y}}}, \end{aligned}$$
(65)

and note that by assumption (A3), the expected value \({\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}}\right] \) is bounded as follows

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}}\right] \\&\quad = \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\int _{{\mathbb {R}}^n} \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}e^{-(\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + J({\varvec{y}}))/\epsilon }d{{\varvec{y}}} \\&\quad \leqslant \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\int _{{\mathbb {R}}^n}\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} e^{-\frac{1}{2t\epsilon }\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}}d{{\varvec{y}}} \\&\quad = \frac{C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon )}{Z_J({\varvec{x}},t,\epsilon )}. \end{aligned} \end{aligned}$$
(66)

Define the vector field \({\varvec{V}}:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\) as

$$\begin{aligned} {\varvec{V}}({\varvec{y}}) = ({\varvec{y}}-{\varvec{y}}_0) e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }, \end{aligned}$$
(67)

which is continuous on \({\mathbb {R}}^n\). It is also bounded on \({\mathbb {R}}^n\); to see this, use the triangle inequality, assumption (A3), and the fact that the function \((0,+\infty ) \ni r \mapsto re^{-\frac{1}{2t\epsilon }r^2}\) attains its maximum at \(r^* = \sqrt{t\epsilon }\) to get

$$\begin{aligned} \begin{aligned} \left\| {{\varvec{V}}({\varvec{y}})} \right\| _{2}&= \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \\&=\left\| {{\varvec{y}}-{\varvec{x}}+ {\varvec{x}}-{\varvec{y}}_0} \right\| _{2} e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \\&\leqslant (\left\| {{\varvec{x}}-{\varvec{y}}_0} \right\| _{2} + \left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}) e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \\&\leqslant \left\| {{\varvec{x}}-{\varvec{y}}_0} \right\| _{2} + \left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \\&\leqslant \left\| {{\varvec{x}}-{\varvec{y}}_0} \right\| _{2} + \left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}} \\&\leqslant \left\| {{\varvec{x}}-{\varvec{y}}_0} \right\| _{2} + \sup _{{\varvec{y}}\in {\mathbb {R}}^n}\left( \left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}}\right) \\&= \left\| {{\varvec{x}}-{\varvec{y}}_0} \right\| _{2} + \sqrt{t\epsilon }\,e^{-1/2}. \end{aligned} \end{aligned}$$
(68)
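The elementary maximization used in the last step can be confirmed directly: on \((0,+\infty )\), the function \(r \mapsto re^{-r^2/(2t\epsilon )}\) attains its maximum at \(r^* = \sqrt{t\epsilon }\) with value \(\sqrt{t\epsilon }\,e^{-1/2}\). A grid check (assuming NumPy; the parameter values are arbitrary):

```python
# max of r * exp(-r^2/(2*t*eps)) is at r* = sqrt(t*eps), with value sqrt(t*eps)*e^(-1/2).
import numpy as np

t, eps = 0.7, 0.15
rs = np.linspace(1e-6, 5.0, 2000001)
f = rs * np.exp(-rs ** 2 / (2 * t * eps))
assert abs(rs[f.argmax()] - np.sqrt(t * eps)) < 1e-5
assert abs(f.max() - np.sqrt(t * eps) * np.exp(-0.5)) < 1e-9
```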

The divergence \(\nabla _{{\varvec{y}}} \cdot {\varvec{V}}({\varvec{y}})\), which is well-defined and continuous on \(D_J\), is given for every \({\varvec{y}}\in D_J\) by

$$\begin{aligned} \begin{aligned}&\nabla _{{\varvec{y}}}\cdot {\varvec{V}}({\varvec{y}}) \\&\quad = \nabla _{{\varvec{y}}}\cdot \left( ({\varvec{y}}-{\varvec{y}}_0) e^{-\left( \frac{1}{2t} \left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }\right) \\&\quad = (\nabla _{{\varvec{y}}}\cdot ({\varvec{y}}-{\varvec{y}}_0))e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \\&\qquad + \left\langle \nabla _{{\varvec{y}}}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }, {\varvec{y}}-{\varvec{y}}_0 \right\rangle \\&\quad = ne^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \\&\qquad - \left\langle \frac{1}{\epsilon }\left( \frac{{\varvec{y}}- {\varvec{x}}}{t} + \nabla J({\varvec{y}})\right) e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }, {\varvec{y}}-{\varvec{y}}_0 \right\rangle \\&\quad =\left( n - \left\langle \frac{1}{\epsilon }\left( \frac{{\varvec{y}}- {\varvec{x}}}{t} + \nabla J({\varvec{y}})\right) , {\varvec{y}}-{\varvec{y}}_0 \right\rangle \right) e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }. \end{aligned} \end{aligned}$$
(69)
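The divergence formula (69) can be verified against a finite-difference derivative. The sketch below is illustrative; it assumes NumPy, \(n=1\), and the prior \(J=\left| \cdot \right| \), for which \(\nabla J = \mathrm {sign}\) away from 0, and compares the two quantities at points of \(D_J\).

```python
# Finite-difference check of the divergence identity (69) for n = 1 (illustrative).
import numpy as np

x, y0, t, eps = 0.9, -0.3, 0.7, 0.15
J, dJ = np.abs, np.sign                 # J = |.| is differentiable away from 0

def V(y):                               # the vector field (67) with n = 1
    return (y - y0) * np.exp(-((x - y) ** 2 / (2 * t) + J(y)) / eps)

def div_formula(y):                     # right-hand side of (69) with n = 1
    g = np.exp(-((x - y) ** 2 / (2 * t) + J(y)) / eps)
    return (1.0 - ((y - x) / t + dJ(y)) * (y - y0) / eps) * g

h = 1e-6
for y in (-1.2, 0.4, 2.0):              # points of D_J (away from the kink y = 0)
    fd = (V(y + h) - V(y - h)) / (2 * h)
    assert abs(fd - div_formula(y)) < 1e-6
```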

We now outline the four steps that will be used to prove Proposition 4.2(i). In the first step, we will show that the divergence of the vector field \({\varvec{V}}\) on \(D_J\) integrates to zero in the sense that

$$\begin{aligned} \lim _{r \rightarrow +\infty } \left| \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} \cap \,D_J} \nabla _{{\varvec{y}}} \cdot {\varvec{V}}({\varvec{y}}) d{{\varvec{y}}}\right| = 0. \end{aligned}$$
(70)

In the second step, we will show that \({\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] < +\infty \), with \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle }\right] = n\epsilon \), thereby proving formula (39), using the convexity of the function \({\varvec{y}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + J({\varvec{y}})\), Fatou's lemma [30, Lemma 2.18], and Eq. (70) derived in the first step. In the third step, we will combine the results from the first and second steps to show that \({\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J ({\varvec{y}})}({\varvec{0}})} \right\| _{2}}\right] < +\infty \) and conclude that the representation formulas (40) and (41) hold. Finally, in the fourth step we will conclude that the representation formulas (42) and (43) hold using Eqs. (40) and (41) and Proposition 3.1(iii).

Step 1. The proof of the limit result (70) that we present here is based on an application of Theorem 4.14 in [54] to the vector field \({\varvec{V}}(\cdot )\). As this result is fairly technical, we first introduce some terminology and definitions that will be used exclusively in this part of the proof of (i).

Let C be a non-empty convex subset of \({\mathbb {R}}^n\). The dimension of the set C is defined as the smallest dimension of a non-empty affine set containing C, with the dimension of a non-empty affine set being the dimension of the subspace parallel to it [58, pages 4 and 12]. If C consists of a single point, then its dimension is taken to be zero.

Let \(k \in \{0,\dots ,n\}\). Denote by \({\mathcal {H}}^{n-k}\) the \((n-k)\)-dimensional outer Hausdorff measure in \({\mathbb {R}}^n\) as defined in [28, Sect. 2.10.2, p. 171]. The measure \({\mathcal {H}}^{n-k}\), in particular, coincides up to a constant multiple with the \((n-k)\)-dimensional Lebesgue measure (see [27], Section 1.2, p. 7, and Theorem 1.12, p. 13).

A subset \(S \subset {\mathbb {R}}^n\) is called slight if \({\mathcal {H}}^{n-1}(S) = 0\), and a subset \(T \subset {\mathbb {R}}^n\) is called thin if T is \(\sigma \)-finite for \({\mathcal {H}}^{n-1}\), i.e., if T can be expressed as a countable union \(T = \cup _{k=1}^{+\infty } T_k\) with \({\mathcal {H}}^{n-1}(T_k) < +\infty \) for each \(k \in {\mathbb {N}}^+\) (see, e.g., [54]). For instance, every hyperplane in \({\mathbb {R}}^n\) is thin (though not slight), and every slight set is thin.

Let \(k \in \{0,\dots ,n\}\). A non-empty, measurable subset \(\Omega \subset {\mathbb {R}}^n\) is said to be countably \({\mathcal {H}}^{n-k}\)-rectifiable if it is contained, up to a set of \({\mathcal {H}}^{n-k}\) measure zero, in a countable union of continuously differentiable surfaces of dimension \((n-k)\) (see, e.g., [1] and the references therein). A non-empty, measurable, and countably \({\mathcal {H}}^{n-k}\)-rectifiable subset of \({\mathbb {R}}^n\), in particular, is \(\sigma \)-finite for \({\mathcal {H}}^{n-k}\).

A subset \(A\subset {\mathbb {R}}^n\) is called admissible if its boundary \(\mathrm {bd}~A\) is thin and if the distributional gradient of the characteristic function of A is a vector measure on Borel subsets of \({\mathbb {R}}^n\) with finite variation (see [54], p. 151, and the references therein). For the purpose of our proof, we will use the fact that closed balls \(\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}\) of radius \(r>0\) are admissible sets (see [32], Example 1.10, and note that admissible sets are also called Caccioppoli sets [32, pp. 5–6]).

Let A be an admissible set and let \(f:A\rightarrow {\mathbb {R}}\) be a function. In the terminology of [54], we say that f is integrable over the admissible set A if f satisfies Definition 4.1 of [54], and in that case the number \(I(f,A)\) is called the integral of f over A. Note that the notion of integrability considered in [54] differs from the usual Lebesgue integrability. Nevertheless, if f is integrable in the sense of [54], then f is Lebesgue measurable ([54], Corollary 4.9), and if the Lebesgue integral \(\int _{A}|f({\varvec{y}})|d{{\varvec{y}}}\) is finite, then \(I(f,A) = \int _{A}f({\varvec{y}})d{{\varvec{y}}}\) ([54], Proposition 4.7).

Let E be a non-empty subset of \({\mathbb {R}}^n\), let \({\varvec{v}}:E \rightarrow {\mathbb {R}}^n\) be a vector field, and let \(D_{{\varvec{v}}}\) denote the set of points of \(\mathrm {int}~{E}\) at which \({\varvec{v}}\) is differentiable (for the definition of differentiability of vector fields, see [61], page 150, Definition 7.22). In the terminology of [54], we call a divergence of \({\varvec{v}}\) any function \(g :E \rightarrow {\mathbb {R}}\) such that \(g({\varvec{y}}) = \nabla \cdot {\varvec{v}}({\varvec{y}})\) for each \({\varvec{y}}\in (\mathrm {int}~{E}) \cap D_{{\varvec{v}}}\).

In addition to these definitions, we will need the following two results due to, respectively, [1] and [54].

Theorem C.1

[1, Theorem 4.1] (for convex functions) Let \(\Omega \) be a bounded, open, convex subset of \({\mathbb {R}}^n\), and let \(f:\Omega \rightarrow {\mathbb {R}}\) be a convex and Lipschitz continuous function. Denote the subdifferential of f at \({\varvec{y}}\in \Omega \) by \(\partial f({\varvec{y}})\). Then, for each \(k\in \{0,\dots ,n\}\), the set

$$\begin{aligned} \{{\varvec{y}}\in \Omega \mid \text {dim}\left( \partial f({\varvec{y}})\right) \geqslant k\} \end{aligned}$$

is countably \({\mathcal {H}}^{n-k}\)-rectifiable.
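
As an illustration (our example; it is not part of [1]), take \(f({\varvec{y}}) = |y_1|\), which is convex and Lipschitz continuous on any such \(\Omega \). Its subdifferential is

$$\begin{aligned} \partial f({\varvec{y}}) = {\left\{ \begin{array}{ll} \{\mathrm {sign}(y_1)\, {\varvec{e}}_1\} &{} \text {if } y_1 \ne 0, \\ \{s\, {\varvec{e}}_1 \mid s \in [-1,1]\} &{} \text {if } y_1 = 0, \end{array}\right. } \end{aligned}$$

where \({\varvec{e}}_1\) denotes the first standard basis vector, so \(\{{\varvec{y}}\in \Omega \mid \text {dim}\left( \partial f({\varvec{y}})\right) \geqslant 1\} = \Omega \cap \{y_1 = 0\}\) is contained in a single hyperplane and is therefore countably \({\mathcal {H}}^{n-1}\)-rectifiable, while the corresponding sets for \(k \geqslant 2\) are empty.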

Theorem C.2

[54, Theorem 4.14] Let A be an admissible set, and let S and T be, respectively, a slight and a thin subset of \(\mathrm {cl}~{A}\). Let \({\varvec{v}}\) be a bounded vector field on \(\mathrm {cl}~{A}\) that is continuous in \((\mathrm {cl}~{A})\setminus S\) and differentiable in \((\mathrm {int}~{A})\setminus T\). Then, every divergence of \({\varvec{v}}\) is integrable in A. Moreover, there exists a vector field \(\mathrm {bd}~A \ni {\varvec{y}}\mapsto {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\) with \(\left\| {{\varvec{n}}_{{\varvec{v}}}({\varvec{y}})} \right\| _{2} = 1\) for every \({\varvec{y}}\in \mathrm {bd}~A\) such that, if \(\text {div }{\varvec{v}}\) denotes any divergence of \({\varvec{v}}\), then

$$\begin{aligned} I(\text {div }{\varvec{v}},A) = \int _{\mathrm {bd}~{A}} \left\langle {\varvec{v}}({\varvec{y}}), {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\right\rangle \, d{\mathcal {H}}^{n-1}. \end{aligned}$$
(71)
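
As an elementary consistency check of (71) (a remark we add here; it is not needed in the sequel), take A to be the closed unit ball and \({\varvec{v}}({\varvec{y}}) = {\varvec{y}}\): every divergence of \({\varvec{v}}\) then equals n on \(\mathrm {int}~{A}\), and the unit vector field in (71) is the outward normal \({\varvec{n}}_{{\varvec{v}}}({\varvec{y}}) = {\varvec{y}}\) on \(\mathrm {bd}~A\). Both sides of (71) reduce to n times the volume of the unit ball:

$$\begin{aligned} I(\text {div }{\varvec{v}},A) = \int _{A} n \, d{{\varvec{y}}} = \frac{n\pi ^{n/2}}{\Gamma \left( \frac{n}{2}+1\right) } = \int _{\mathrm {bd}~A} \left\langle {\varvec{y}}, {\varvec{y}}\right\rangle \, d{\mathcal {H}}^{n-1}, \end{aligned}$$

which is the classical divergence theorem and also recovers the surface-area constant that appears in (84) below.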

Step 1 (Continued). Fix \(r>0\) and let \(A = \{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}\) denote the closed ball of radius r centered at the origin in \({\mathbb {R}}^n\). Note that A is bounded, convex, closed, and admissible. Consider now the restriction of the convex function J to \(\mathrm {int}~{A}\). As \(\mathrm {int}~{A}\) is bounded, open, and convex, the function J is Lipschitz continuous on \(\mathrm {int}~{A}\) [58, Theorem 10.4]. All conditions in Theorem C.1 are therefore satisfied (with \(\Omega = \mathrm {int}~{A}\) and \(f = J\)), and we can invoke that theorem to conclude that the set

$$\begin{aligned} T = \{{\varvec{y}}\in \mathrm {int}~{A} \mid \text {dim}\left( \partial J({\varvec{y}})\right) \geqslant 1\} \end{aligned}$$

is countably \({\mathcal {H}}^{n-1}\)-rectifiable, and therefore \(\sigma \)-finite for \({\mathcal {H}}^{n-1}\). In particular, the set T is thin. Moreover, recalling the definition of the set \(D_J:=\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \partial J({\varvec{y}}) = \{\nabla J({\varvec{y}})\} \right\} \), we find that the set \((\mathrm {int}~{A})\setminus T\) comprises the points \({\varvec{y}}\in \mathrm {int}~{A}\) at which the subdifferential \(\partial J({\varvec{y}})\) is a singleton, i.e., \(T = (\mathrm {int}~{A})\cap ({\mathbb {R}}^n\setminus D_J)\).

Now, consider the vector field \({\varvec{V}}\) defined by (67). This vector field is continuous on \({\mathbb {R}}^n\) because J is convex with \(\mathrm {dom}~J = {\mathbb {R}}^n\) (and hence continuous on \({\mathbb {R}}^n\)), and it is bounded by (68). Define the function \(g :A \rightarrow {\mathbb {R}}\) via

$$\begin{aligned} g({\varvec{y}}) = {\left\{ \begin{array}{ll} \nabla \cdot {\varvec{V}}({\varvec{y}}) &{} \text {if } {\varvec{y}}\in A\cap D_{J}, \\ 0 &{} \text {if } {\varvec{y}}\in A\cap ({\mathbb {R}}^n\setminus D_{J}). \end{array}\right. } \end{aligned}$$
(72)

The function g constitutes a divergence of the vector field \({\varvec{V}}\) because it coincides with the divergence \(\nabla \cdot {\varvec{V}}({\varvec{y}})\) at every \({\varvec{y}}\in (\mathrm {int}~{A})\cap D_J\). Moreover, its Lebesgue integral over A is finite; to see this, first note that for every \({\varvec{y}}\in A\cap D_J\) the absolute value of \(g({\varvec{y}})\) can be bounded using (69), the triangle inequality, the Cauchy–Schwarz inequality, and assumption (A3) as follows

$$\begin{aligned} \begin{aligned} |g({\varvec{y}})|&= \left| \nabla \cdot {\varvec{V}}({\varvec{y}}) \right| \\&= \left| \left( n - \left\langle \frac{1}{\epsilon }\left( \frac{{\varvec{y}}- {\varvec{x}}}{t} + \nabla J({\varvec{y}})\right) , {\varvec{y}}-{\varvec{y}}_0 \right\rangle \right) \right. \\&\quad \left. e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }\right| \\&\leqslant \left( n + \left| \left\langle \frac{1}{\epsilon }\left( \frac{{\varvec{y}}- {\varvec{x}}}{t} + \nabla J({\varvec{y}})\right) , {\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| \right) \\&\quad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \\&\leqslant \left( n + \frac{1}{\epsilon }\left\| {\frac{{\varvec{y}}- {\varvec{x}}}{t} + \nabla J({\varvec{y}})} \right\| _{2}\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} \right) \\&\quad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \\&\leqslant \left( n + \frac{1}{\epsilon }\left( \left\| {\frac{{\varvec{y}}- {\varvec{x}}}{t}} \right\| _{2} + \left\| {\nabla J({\varvec{y}})} \right\| _{2}\right) \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} \right) \\&\quad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \\&\leqslant \left( n + \frac{1}{\epsilon }\left( \left\| {\frac{{\varvec{y}}- {\varvec{x}}}{t}} \right\| _{2} + \left\| {\nabla J({\varvec{y}})} \right\| _{2}\right) \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} \right) \\&\quad e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}} \end{aligned} \end{aligned}$$
(73)

Second, as the set A is a closed, bounded subset of \(\mathrm {dom}~J = {\mathbb {R}}^n\), the function J is Lipschitz continuous relative to A, and therefore there exists a number \(L_A>0\) such that \(\left\| {\nabla J({\varvec{y}})} \right\| _{2} \leqslant L_A\) for every \({\varvec{y}}\in A \cap D_J\). As a consequence, we can further bound \(|g({\varvec{y}})|\) for every \({\varvec{y}}\in A\cap D_J\) in (73) as

$$\begin{aligned} |g({\varvec{y}})| \leqslant \left( n + \frac{1}{\epsilon }\left( \left\| {\frac{{\varvec{y}}- {\varvec{x}}}{t}} \right\| _{2} + L_A\right) \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} \right) e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}}. \end{aligned}$$
(74)

In particular, using the definition of g given by (72), we have that (74) holds for every \({\varvec{y}}\in A\). We can now use (72) and (74) to get

$$\begin{aligned} \begin{aligned}&\int _{A}|g({\varvec{y}})|d{\varvec{y}}\\&\quad = \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}}|g({\varvec{y}})|d{\varvec{y}}\\&\quad = \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}\cap D_J} \left| \nabla \cdot {\varvec{V}}({\varvec{y}}) \right| d{\varvec{y}}\\&\quad \leqslant \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}\cap D_J} \left( n + \frac{1}{\epsilon }\left( \left\| {\frac{{\varvec{y}}- {\varvec{x}}}{t}} \right\| _{2} + L_A\right) \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} \right) \\&\qquad e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}} d{\varvec{y}}\\&\quad \leqslant \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}} \left( n + \frac{1}{\epsilon }\left( \left\| {\frac{{\varvec{y}}- {\varvec{x}}}{t}} \right\| _{2} + L_A\right) \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} \right) \\&\qquad e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}} d{\varvec{y}}. \\ \end{aligned} \end{aligned}$$
(75)

Since the function

$$\begin{aligned} {\varvec{y}}\mapsto \left( n + \frac{1}{\epsilon }\left( \left\| {\frac{{\varvec{y}}- {\varvec{x}}}{t}} \right\| _{2} + L_A\right) \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} \right) e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}} \end{aligned}$$

is continuous, it is bounded on the compact set \(A = \{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}\). Its integral over A is therefore finite, and using (75) we find that \(\int _{A}|g({\varvec{y}})|d{\varvec{y}}\) is finite as well.

The previous considerations show that all conditions in Theorem C.2 are satisfied (with \(A = \{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}\), \({\varvec{v}}= {\varvec{V}}\), \(S=\varnothing \), and \(T = \{{\varvec{y}}\in \mathrm {int}~{A} \mid \text {dim}\left( \partial J({\varvec{y}})\right) \geqslant 1\} = (\mathrm {int}~{A})\cap ({\mathbb {R}}^n\setminus D_J)\)). We can therefore invoke [54, Theorem 4.14] to conclude that the divergence g of \({\varvec{V}}\) is integrable (in the sense described by [54]), with integral I(g,A), and that there exists a vector field \(\mathrm {bd}~A \ni {\varvec{y}}\mapsto {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\) with \(\left\| {{\varvec{n}}_{{\varvec{v}}}({\varvec{y}})} \right\| _{2} = 1\) for every \({\varvec{y}}\in \mathrm {bd}~A\) such that

$$\begin{aligned} I(g,A) = \int _{\mathrm {bd}~A} \left\langle {\varvec{V}}({\varvec{y}}), {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\right\rangle \, d{\mathcal {H}}^{n-1}. \end{aligned}$$
(76)

Since the Lebesgue integral of |g| over A is finite, we also have [54, Proposition 4.7]

$$\begin{aligned} I(g,A) = \int _{A} g({\varvec{y}}) d{\varvec{y}}. \end{aligned}$$
(77)

Using \(A = \{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}\) together with Eqs. (72), (76), and (77), we obtain

$$\begin{aligned}&\int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}} g({\varvec{y}}) d{\varvec{y}}\nonumber \\&\quad = \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}\cap D_J} \nabla \cdot {\varvec{V}}({\varvec{y}}) d{\varvec{y}}\nonumber \\&\quad = \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} \left\langle {\varvec{V}}({\varvec{y}}), {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\right\rangle \, d{\mathcal {H}}^{n-1}. \end{aligned}$$
(78)

As r was an arbitrary positive number, we can take the absolute value and then the limit \(r \rightarrow +\infty \) on both sides of (78) to find

$$\begin{aligned}&\lim _{r\rightarrow +\infty } \left| \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\}\cap D_J} \nabla \cdot {\varvec{V}}({\varvec{y}}) d{\varvec{y}}\right| \nonumber \\&\quad = \lim _{r\rightarrow +\infty } \left| \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} \left\langle {\varvec{V}}({\varvec{y}}), {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\right\rangle \, d{\mathcal {H}}^{n-1} \right| . \end{aligned}$$
(79)

We will now show that the limit on the right side of (79) is equal to zero. To show this, first take the absolute value inside the integral on the right side of (79) to find

$$\begin{aligned} \begin{aligned}&\left| \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} \left\langle {\varvec{V}}({\varvec{y}}), {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\right\rangle \, d{\mathcal {H}}^{n-1} \right| \\&\quad \leqslant \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} \left| \left\langle {\varvec{V}}({\varvec{y}}), {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\right\rangle \right| \, d{\mathcal {H}}^{n-1}. \end{aligned} \end{aligned}$$
(80)

Use the Cauchy–Schwarz inequality, Eq. (67), assumption (A3) (\(\inf _{{\varvec{y}}\in {\mathbb {R}}^n}J({\varvec{y}})=0\)), and the fact that \(\left\| {{\varvec{n}}_{{\varvec{v}}}({\varvec{y}})} \right\| _{2} = 1\) to further bound the right side of (80) as follows

$$\begin{aligned} \begin{aligned}&\int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} \left| \left\langle {\varvec{V}}({\varvec{y}}), {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\right\rangle \right| \, d{\mathcal {H}}^{n-1}\\&\quad \leqslant \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} \left\| {{\varvec{V}}({\varvec{y}})} \right\| _{2}\left\| {{\varvec{n}}_{{\varvec{v}}}({\varvec{y}})} \right\| _{2} \, d{\mathcal {H}}^{n-1} \\&\quad \leqslant \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2} e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \, d{\mathcal {H}}^{n-1}\\&\quad \leqslant \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} (\left\| {{\varvec{y}}} \right\| _{2} + \left\| {{\varvec{y}}_0} \right\| _{2}) e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}\right) /\epsilon } \, d{\mathcal {H}}^{n-1}. \end{aligned} \end{aligned}$$
(81)

Use the parallelogram law \(2(\left\| {{\varvec{x}}} \right\| _{2}^{2} + \left\| {{\varvec{u}}} \right\| _{2}^{2}) = \left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + \left\| {{\varvec{x}}+{\varvec{u}}} \right\| _{2}^{2}\) with \({\varvec{u}}= {\varvec{x}}-{\varvec{y}}\) to bound the exponential \(e^{-\frac{1}{2t}\left\| {{\varvec{x}}- {\varvec{y}}} \right\| _{2}^{2}/\epsilon }\) by

$$\begin{aligned} \begin{aligned}&e^{-\frac{1}{2t}\left\| {{\varvec{x}}- {\varvec{y}}} \right\| _{2}^{2}/\epsilon } \\&\quad = e^{-\frac{1}{2t}(\frac{1}{2}(\left\| {{\varvec{y}}} \right\| _{2}^{2} + \left\| {2{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}) - \left\| {{\varvec{x}}} \right\| _{2}^{2})/\epsilon } \\&\quad \leqslant e^{-\frac{1}{2t}(\frac{1}{2}\left\| {{\varvec{y}}} \right\| _{2}^{2} - \left\| {{\varvec{x}}} \right\| _{2}^{2})/\epsilon } \end{aligned} \end{aligned}$$
(82)

and use it in (81) to get

$$\begin{aligned}&\int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} \left| \left\langle {\varvec{V}}({\varvec{y}}), {\varvec{n}}_{{\varvec{v}}}({\varvec{y}})\right\rangle \right| \, d{\mathcal {H}}^{n-1} \nonumber \\&\quad \leqslant \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} (\left\| {{\varvec{y}}} \right\| _{2} + \left\| {{\varvec{y}}_0} \right\| _{2}) e^{-\frac{1}{2t}(\frac{1}{2}\left\| {{\varvec{y}}} \right\| _{2}^{2} - \left\| {{\varvec{x}}} \right\| _{2}^{2})/\epsilon } \, d{\mathcal {H}}^{n-1}. \end{aligned}$$
(83)

Since the domain of integration in (83) is the sphere of radius r in \({\mathbb {R}}^n\), the integral on the right side of (83) can be evaluated explicitly:

$$\begin{aligned}&\int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} (\left\| {{\varvec{y}}} \right\| _{2} + \left\| {{\varvec{y}}_0} \right\| _{2}) e^{-\frac{1}{2t}(\frac{1}{2}\left\| {{\varvec{y}}} \right\| _{2}^{2} - \left\| {{\varvec{x}}} \right\| _{2}^{2})/\epsilon } \, d{\mathcal {H}}^{n-1}\nonumber \\&\quad = (r + \left\| {{\varvec{y}}_0} \right\| _{2}) e^{-\frac{1}{2t}(\frac{1}{2}r^2 - \left\| {{\varvec{x}}} \right\| _{2}^{2})/\epsilon } \int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} d{\mathcal {H}}^{n-1} \nonumber \\&\quad =(r + \left\| {{\varvec{y}}_0} \right\| _{2})e^{-\frac{1}{2t}\left( \frac{1}{2}r^2 - \left\| {{\varvec{x}}} \right\| _{2}^{2} \right) /\epsilon }\, \frac{n\pi ^{n/2}}{\Gamma \left( \frac{n}{2} + 1\right) }\, r^{n-1}, \end{aligned}$$
(84)

where \(r^{n-1}n\pi ^{n/2}/\Gamma \left( \frac{n}{2} + 1\right) \) is the surface area of the sphere of radius r in \({\mathbb {R}}^n\) (the constant \(n\pi ^{n/2}/\Gamma \left( \frac{n}{2} + 1\right) \) being the surface area of the unit sphere), with \(\Gamma \) denoting the Gamma function. Since

$$\begin{aligned} \lim _{r\rightarrow +\infty } (r + \left\| {{\varvec{y}}_0} \right\| _{2})\, r^{n-1}\, e^{-\frac{1}{2t}\left( \frac{1}{2}r^2 - \left\| {{\varvec{x}}} \right\| _{2}^{2} \right) /\epsilon } = 0, \end{aligned}$$

the right side of (84) tends to zero as \(r\rightarrow +\infty \), i.e.,

$$\begin{aligned} \lim _{r\rightarrow +\infty }\int _{\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}} \right\| _{2} = r\}} (\left\| {{\varvec{y}}} \right\| _{2} + \left\| {{\varvec{y}}_0} \right\| _{2}) e^{-\frac{1}{2t}(\frac{1}{2}\left\| {{\varvec{y}}} \right\| _{2}^{2} - \left\| {{\varvec{x}}} \right\| _{2}^{2})/\epsilon } \, d{\mathcal {H}}^{n-1} = 0. \end{aligned}$$
(85)
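
As a quick check on the surface-area constant (a verification we include for concreteness; it plays no role in the argument), for \(n = 2\) and \(n = 3\) one finds

$$\begin{aligned} \frac{2\pi }{\Gamma (2)} = 2\pi \qquad \text {and}\qquad \frac{3\pi ^{3/2}}{\Gamma \left( \frac{5}{2}\right) } = \frac{3\pi ^{3/2}}{\frac{3}{4}\sqrt{\pi }} = 4\pi , \end{aligned}$$

the circumference of the unit circle and the surface area of the unit sphere in \({\mathbb {R}}^3\), respectively.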

Combining (79), (80), (83), and (85) yields

$$\begin{aligned} \lim _{r \rightarrow +\infty } \left| \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} \cap \,D_J} \nabla _{{\varvec{y}}} \cdot {\varvec{V}}({\varvec{y}}) d{{\varvec{y}}}\right| = 0, \end{aligned}$$

which proves the limit result (70).

Step 2. Recall that the divergence of the vector field \({\varvec{y}}\mapsto {\varvec{V}}({\varvec{y}})\) on \(D_J\) is given by (69). Multiplying the integrand by the constant \(\epsilon > 0\) and combining (70) and (69), we conclude that

$$\begin{aligned}&\lim _{r \rightarrow +\infty } \left| \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} \cap \,D_J} \left( n\epsilon - \left\langle \left( \frac{{\varvec{y}}- {\varvec{x}}}{t} + \nabla J({\varvec{y}})\right) , {\varvec{y}}-{\varvec{y}}_0 \right\rangle \right) \right. \nonumber \\&\quad \left. e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\right| = 0. \end{aligned}$$
(86)

Note that the minimal-norm subgradient satisfies \(\pi _{\partial J({\varvec{y}})}({\varvec{0}}) = \nabla J({\varvec{y}})\) for every \({\varvec{y}}\in D_J\). We can therefore substitute the minimal subgradient \(\pi _{\partial J({\varvec{y}})}({\varvec{0}})\) for the gradient \(\nabla J({\varvec{y}})\) inside the integral in the limit (86) without changing its value. Moreover, since the set \(D_J\) is dense in \({\mathbb {R}}^n\) and the n-dimensional Lebesgue measure of \({\mathbb {R}}^n\setminus D_J\) is zero, we can further replace the domain of integration \(\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} \cap \,D_J\) of the integral in the limit (86) by \(\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} \), again without changing its value. With these two changes, the limit (86) can be written as

$$\begin{aligned}&\lim _{r \rightarrow +\infty } \left| \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } \left( n\epsilon - \left\langle \left( \frac{{\varvec{y}}- {\varvec{x}}}{t} + \pi _{\partial J({\varvec{y}})}({\varvec{0}})\right) , {\varvec{y}}-{\varvec{y}}_0 \right\rangle \right) \right. \\&\quad \left. e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\right| = 0. \end{aligned}$$

Using the notation \(\varphi _J({\varvec{y}}|{\varvec{x}},t) = \left( \frac{{\varvec{y}}- {\varvec{x}}}{t}\right) + \pi _{\partial J({\varvec{y}})}({\varvec{0}})\), we can write this limit more succinctly as

$$\begin{aligned}&\lim _{r \rightarrow +\infty } \left| \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } \left( n\epsilon - \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{y}}_0 \right\rangle \right) \right. \nonumber \\&\quad \left. e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\right| = 0. \end{aligned}$$
(87)

Now, consider the function \({\mathbb {R}}^n\ni {\varvec{y}}\mapsto \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t) - \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \). Note here that as J is convex with \(\mathrm {dom}~J ={\mathbb {R}}^n\), both \(\varphi _J({\varvec{y}}|{\varvec{x}},t)\) and \(\varphi _J({\varvec{y}}_0|{\varvec{x}},t)\) are subgradients of the convex function \({\varvec{u}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + J({\varvec{u}})\) at \({\varvec{u}}= {\varvec{y}}\) and \({\varvec{u}}= {\varvec{y}}_0\), respectively [58, Theorem 23.4]. We can therefore apply inequality (13) (with \(p = \varphi _J({\varvec{y}}|{\varvec{x}},t)\), \(p_0 = \varphi _J({\varvec{y}}_0|{\varvec{x}},t)\), and \(m = 0\)) to find \(\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t) - \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \geqslant 0\). Define \(F:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) and \(G :{\mathbb {R}}^n\rightarrow {\mathbb {R}}\) as follows:

$$\begin{aligned} F({\varvec{y}}) = \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \end{aligned}$$

and

$$\begin{aligned} G({\varvec{y}}) = \left\langle \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }. \end{aligned}$$

Note that \(F({\varvec{y}})-G({\varvec{y}}) = \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t) - \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } \geqslant 0\) for every \({\varvec{y}}\in {\mathbb {R}}^n\). Integrate \({\varvec{y}}\mapsto F({\varvec{y}})-G({\varvec{y}})\) over \({\mathbb {R}}^n\) and use Fatou’s lemma to find

$$\begin{aligned} \begin{aligned}&0 \leqslant \int _{{\mathbb {R}}^n} F({\varvec{y}})-G({\varvec{y}}) d{{\varvec{y}}} \\&\quad \leqslant \lim _{r \rightarrow +\infty } \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } F({\varvec{y}})-G({\varvec{y}}) d{{\varvec{y}}} \\&\quad = \lim _{r \rightarrow +\infty } \left( \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } F({\varvec{y}}) d{{\varvec{y}}} \right. \\&\qquad \left. + \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } (-G({\varvec{y}})) d{{\varvec{y}}}\right) \end{aligned} \end{aligned}$$
(88)

Use the Cauchy–Schwarz inequality and assumption (A3) (\(\inf _{{\varvec{y}}\in {\mathbb {R}}^n} J({\varvec{y}}) = 0\)) to bound the second integral on the right-hand side of (88) as follows

$$\begin{aligned} \begin{aligned}&\int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } (-G({\varvec{y}})) d{{\varvec{y}}} \\&\quad = \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } -\left\langle \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} \\&\quad \leqslant \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } \left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}\\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\\&\quad \leqslant \left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2} \int _{{\mathbb {R}}^n}\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}\\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}\right) /\epsilon } d{{\varvec{y}}} \\&\quad = \left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon ), \end{aligned} \end{aligned}$$
(89)

where \(C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon )\) was defined in (65). Combine (88) and (89) to find

$$\begin{aligned} \begin{aligned}&0 \leqslant \int _{{\mathbb {R}}^n} F({\varvec{y}})-G({\varvec{y}}) d{{\varvec{y}}} \\&\quad \leqslant \lim _{r \rightarrow +\infty } \left( \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } F({\varvec{y}}) d{{\varvec{y}}} + \left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon )\right) \\&\quad = \left( \lim _{r \rightarrow +\infty } \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } F({\varvec{y}}) d{{\varvec{y}}}\right) + \left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon ). \end{aligned} \end{aligned}$$
(90)

The integral on the right hand side of (90) can be bounded using assumption (A3) as follows

$$\begin{aligned}&\int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } F({\varvec{y}}) d{{\varvec{y}}}\nonumber \\&\quad = \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \nonumber \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\nonumber \\&\quad = \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } (\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle +(n\epsilon - n\epsilon ))\nonumber \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} \nonumber \\&\quad = \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } (\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle - n\epsilon )\nonumber \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} \\&\qquad + n\epsilon \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} }\nonumber \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\nonumber \\&\quad \leqslant \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } (\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle - n\epsilon ) \nonumber \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\nonumber \\&\qquad + n\epsilon \int _{{\mathbb {R}}^n} e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}\right) /\epsilon } d{{\varvec{y}}}\nonumber \\&\quad = \int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } (\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle - n\epsilon ) \nonumber \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} + n\epsilon (2\pi t\epsilon )^{n/2}.\nonumber \end{aligned}$$
(91)
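
The last step in (91) uses the value of the Gaussian integral, which we recall for completeness:

$$\begin{aligned} \int _{{\mathbb {R}}^n} e^{-\frac{1}{2t\epsilon }\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}} d{{\varvec{y}}} = \prod _{i=1}^{n} \int _{{\mathbb {R}}} e^{-\frac{(x_i-y_i)^2}{2t\epsilon }} dy_i = (2\pi t\epsilon )^{n/2}. \end{aligned}$$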

Combine (90) and (91) to get

$$\begin{aligned} \begin{aligned}&0 \leqslant \int _{{\mathbb {R}}^n} F({\varvec{y}})-G({\varvec{y}}) d{{\varvec{y}}} \\&\quad \leqslant \left( \lim _{r \rightarrow +\infty }\int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } (\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle - n\epsilon )\right. \\&\qquad \left. e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\right) \\&\qquad + \left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon ) + n\epsilon (2\pi t\epsilon )^{n/2}. \end{aligned} \end{aligned}$$
(92)

Combine (87) and (92) to get

$$\begin{aligned} \begin{aligned}&0 \leqslant \int _{{\mathbb {R}}^n} F({\varvec{y}})-G({\varvec{y}}) d{{\varvec{y}}} \\&\quad = \int _{{\mathbb {R}}^n} \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t)-\varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} \\&\quad \leqslant \left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon ) + n\epsilon (2\pi t\epsilon )^{n/2}. \end{aligned} \end{aligned}$$
(93)

Divide (93) by the partition function \(Z_J({\varvec{x}},t,\epsilon )\) (see Eq. (26)) to get

$$\begin{aligned} 0\leqslant & {} {\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t)-\varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle }\right] \nonumber \\\leqslant & {} \frac{\left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon ) + n\epsilon (2\pi t\epsilon )^{n/2}}{Z_J({\varvec{x}},t,\epsilon )} < +\infty . \end{aligned}$$
(94)

Now, using the Cauchy–Schwarz inequality and (66), we can bound \({\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right| }\right] \) as follows

$$\begin{aligned}&{\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] \nonumber \\&\quad = \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\int _{{\mathbb {R}}^n} \left| \left\langle \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| \nonumber \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\nonumber \\&\quad \leqslant \left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}\frac{1}{Z_J({\varvec{x}},t,\epsilon )} \int _{{\mathbb {R}}^n}\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}\nonumber \\&\qquad e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}\right) /\epsilon } d{{\varvec{y}}}\nonumber \\&\quad = \frac{\left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon )}{Z_J({\varvec{x}},t,\epsilon )}. \end{aligned}$$
(95)

Use the triangle inequality and the upper bounds in (94) and (95) to obtain

$$\begin{aligned}&{\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] \nonumber \\&\quad = {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t)-(\varphi _J({\varvec{y}}_0|{\varvec{x}},t) - \varphi _J({\varvec{y}}_0|{\varvec{x}},t)),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] \nonumber \\&\quad \leqslant {\mathbb {E}}_{J}\left[ \left| \left\langle \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| \right. \nonumber \\&\qquad \left. + \left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t)-\varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| \right] \nonumber \\&\quad = {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] \nonumber \\&\qquad + {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t)-\varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] \nonumber \\&\quad \leqslant \frac{\left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon )}{Z_J({\varvec{x}},t,\epsilon )}\nonumber \\&\qquad + \frac{\left\| {\varphi _J({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}C_1({\varvec{x}},{\varvec{y}}_0,t,\epsilon ) + n\epsilon (2\pi t\epsilon )^{n/2}}{Z_J({\varvec{x}},t,\epsilon )} \nonumber \\&\quad <+\infty . \end{aligned}$$
(96)

Since \({\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] < +\infty \), we can use (87) and the definition of the partition function \(Z_J\) in Eq. (26) to conclude that

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle }\right] \\&\quad = \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\int _{{\mathbb {R}}^n} \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} \\&\quad = \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\lim _{r \rightarrow +\infty }\int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} \\&\quad = \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\lim _{r \rightarrow +\infty }\int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } (n\epsilon - n\epsilon + \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle ) e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} \\&\quad = n\epsilon - \frac{1}{Z_J({\varvec{x}},t,\epsilon )}\lim _{r \rightarrow +\infty }\int _{\left\{ {\varvec{y}}\in {\mathbb {R}}^n\mid \, \left\| {{\varvec{y}}} \right\| _{2} \leqslant r\right\} } (n\epsilon - \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle ) e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} \\&\quad = n\epsilon . \end{aligned} \end{aligned}$$
(97)

Inequality (96) and equality (97) show the desired results \({\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] < +\infty \) and \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle }\right] = n\epsilon \), which, after recalling the definition \(\varphi _J({\varvec{y}}|{\varvec{x}},t) = \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) + \pi _{\partial J({\varvec{y}})}({\varvec{0}})\), also proves formula (39).

Step 3. Thanks to Step 2, we have \({\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] < +\infty \) and \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle }\right] = n\epsilon \) for every \({\varvec{y}}_0 \in {\mathbb {R}}^n\). In particular, the choice of \({\varvec{y}}_0 = {\varvec{0}}\) yields \({\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}\right\rangle \right| }\right] < +\infty \) and \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}\right\rangle }\right] = n\epsilon \). As a consequence, we have that

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}_0 \right\rangle \right| }\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}\right\rangle - \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] \\&\quad \leqslant {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0 \right\rangle \right| }\right] + {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}\right\rangle \right| }\right] \\&\quad <+\infty , \end{aligned} \end{aligned}$$
(98)

and

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}_0 \right\rangle }\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}\right\rangle }\right] - {\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}- {\varvec{y}}_0 \right\rangle }\right] \\&\quad = n\epsilon - n\epsilon \\&\quad = 0, \end{aligned} \end{aligned}$$
(99)

for every \({\varvec{y}}_0 \in {\mathbb {R}}^n\). Now, let \(\{{\varvec{e}}_i\}_{i=1}^{n}\) denote the standard basis in \({\mathbb {R}}^n\) and let \(\{\varphi _J({\varvec{y}}|{\varvec{x}},t)_i\}_{i=1}^{n}\) denote the components of the vector \(\varphi _J({\varvec{y}}|{\varvec{x}},t)\), i.e., \(\varphi _J({\varvec{y}}|{\varvec{x}},t) = (\varphi _J({\varvec{y}}|{\varvec{x}},t)_1,\dots ,\varphi _J({\varvec{y}}|{\varvec{x}},t)_n)\). Using (98) with the choice of \({\varvec{y}}_0 = {\varvec{e}}_i\) for \(i\in \{1,\dots ,n\}\), we get \({\mathbb {E}}_{J}\left[ {|\varphi _J({\varvec{y}}|{\varvec{x}},t)_i|}\right] < +\infty \) for every \(i \in \{1,\dots ,n\}\). Using the norm inequality \(\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2} \leqslant \sum _{i=1}^{n}\left| \varphi _J({\varvec{y}}|{\varvec{x}},t)_i\right| \), we can bound \({\mathbb {E}}_{J}\left[ {\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2}}\right] \) as follows

$$\begin{aligned} \begin{aligned} {\mathbb {E}}_{J}\left[ {\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2}}\right]&\leqslant {\mathbb {E}}_{J}\left[ {\sum _{i=1}^{n}\left| \varphi _J({\varvec{y}}|{\varvec{x}},t)_i\right| }\right] \\&= \sum _{i=1}^{n} {\mathbb {E}}_{J}\left[ {\left| \varphi _J({\varvec{y}}|{\varvec{x}},t)_i\right| }\right] \\&< +\infty . \end{aligned} \end{aligned}$$
(100)

We can therefore combine (99) and (100) to get \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}_0 \right\rangle }\right] = \left\langle {\mathbb {E}}_{J}\left[ {\varphi _J({\varvec{y}}|{\varvec{x}},t)}\right] ,{\varvec{y}}_0 \right\rangle = 0\) for every \({\varvec{y}}_0 \in {\mathbb {R}}^n\), which yields the following equality:

$$\begin{aligned} {\mathbb {E}}_{J}\left[ {\varphi _J({\varvec{y}}|{\varvec{x}},t)}\right] = 0. \end{aligned}$$
(101)

Moreover, recalling the definition \(\varphi _J({\varvec{y}}|{\varvec{x}},t) = \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) + \pi _{\partial J({\varvec{y}})}({\varvec{0}})\) and using (66) (with \({\varvec{y}}_0 = {\varvec{x}}\)) and (100), we can bound \({\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J({\varvec{y}})}({\varvec{0}})} \right\| _{2}}\right] \) as follows

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J({\varvec{y}})}({\varvec{0}})} \right\| _{2}}\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J({\varvec{y}})}({\varvec{0}}) + \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) - \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) } \right\| _{2}}\right] \\&\quad \leqslant {\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J({\varvec{y}})}({\varvec{0}}) + \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) } \right\| _{2} + \left\| {\left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) } \right\| _{2}}\right] \\&\quad ={\mathbb {E}}_{J}\left[ {\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2}}\right] + \frac{1}{t}{\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{x}}} \right\| _{2}}\right] \\&\quad \leqslant {\mathbb {E}}_{J}\left[ {\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2}}\right] + \frac{C_1({\varvec{x}},{\varvec{x}},t,\epsilon )}{tZ_J({\varvec{x}},t,\epsilon )} \\&\quad < +\infty . \end{aligned} \end{aligned}$$
(102)

We can now combine (66), (101), and (102) to expand \({\mathbb {E}}_{J}\left[ {\varphi _J({\varvec{y}}|{\varvec{x}},t)}\right] \) as follows

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\varphi _J({\varvec{y}}|{\varvec{x}},t)}\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) + \pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) }\right] + {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \\&\quad = \left( \frac{{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{x}}}{t}\right) + {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \\&\quad ={\varvec{0}}. \end{aligned} \end{aligned}$$
(103)

Solving for \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\) in (103) yields \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) = {\varvec{x}}-t{\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \), which gives the representation formula (40).

We now derive the second representation formula (41). Let \({\varvec{y}}_0 = {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\) in Eq. (97) and use the representation formula (40) to find

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\langle \pi _{\partial J({\varvec{y}})}({\varvec{0}}), {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle }\right] \\&\quad = {\mathbb {E}}_{J}\left[ \left\langle \pi _{\partial J({\varvec{y}})}({\varvec{0}}) + \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) - \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) , \right. \right. \\&\qquad \left. \left. {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle \right] \\&\quad = {\mathbb {E}}_{J}\left[ \left\langle \pi _{\partial J({\varvec{y}})}({\varvec{0}}) + \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) , {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle \right. \\&\qquad \left. - \left\langle \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) , {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle \right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left\langle \pi _{\partial J({\varvec{y}})}({\varvec{0}}) + \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) , {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle }\right] \\&\qquad - {\mathbb {E}}_{J}\left[ {\left\langle \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) , {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle }\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t), {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle }\right] \\&\qquad - {\mathbb {E}}_{J}\left[ {\left\langle \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) , {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle }\right] \\&\quad = n\epsilon - {\mathbb {E}}_{J}\left[ {\left\langle \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) , {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle }\right] . \end{aligned} \end{aligned}$$
(104)

We will use (104) to derive a representation formula for \({\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2}}\right] \). Multiply (104) by t and rearrange to get

$$\begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\langle {\varvec{y}}- {\varvec{x}}, {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle }\right] \nonumber \\&\quad = nt\epsilon - t{\mathbb {E}}_{J}\left[ {\left\langle \pi _{\partial J({\varvec{y}})}({\varvec{0}}), {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle }\right] . \end{aligned}$$
(105)

The left-hand side of (105) can be expanded, using \({\mathbb {E}}_{J}\left[ {{\varvec{y}}}\right] = {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\), as

$$\begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\langle {\varvec{y}}- {\varvec{x}}, {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle }\right] \nonumber \\&\quad = {\mathbb {E}}_{J}\left[ \left\langle ({\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )) + ({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - {\varvec{x}}), {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle \right] \nonumber \\&\quad = {\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2}}\right] + \left\langle {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - {\varvec{x}},{\mathbb {E}}_{J}\left[ {{\varvec{y}}}\right] -{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle \nonumber \\&\quad = {\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2}}\right] + \left\langle {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - {\varvec{x}},{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle \nonumber \\&\quad = {\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2}}\right] . \end{aligned}$$
(106)

Combine Eqs. (105) and (106) to get

$$\begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2}}\right] \\&\quad = nt\epsilon - t{\mathbb {E}}_{J}\left[ {\left\langle \pi _{\partial J({\varvec{y}})}({\varvec{0}}), {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle }\right] , \end{aligned}$$

which gives the representation formula (41).
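
As a simple sanity check of (40) and (41) (a remark we add; it is not used elsewhere), consider \(J \equiv 0\), which satisfies \(\mathrm {dom}~J = {\mathbb {R}}^n\) and assumptions (A1)–(A3). Then \(\pi _{\partial J({\varvec{y}})}({\varvec{0}}) = {\varvec{0}}\) for every \({\varvec{y}}\in {\mathbb {R}}^n\), the posterior density is proportional to \(e^{-\frac{1}{2t\epsilon }\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}}\), and formulas (40) and (41) give

$$\begin{aligned} {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) = {\varvec{x}}\qquad \text {and}\qquad {\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2}}\right] = nt\epsilon , \end{aligned}$$

the mean and the trace of the covariance \(t\epsilon I_n\) of this Gaussian.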

Step 4. Thanks to Step 3, the representation formulas (40) and (41) hold. Recall that by Proposition 3.1(iii), the gradient \(\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t)\) and Laplacian \(\Delta _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t)\) of the solution \(S_\epsilon \) to the viscous HJ PDE (29) satisfy the representation formulas

$$\begin{aligned} {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) = {\varvec{x}}- t\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t) \end{aligned}$$
(107)

and

$$\begin{aligned} {\mathbb {E}}_{J}\left[ {\left\| {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\| _{2}^{2}}\right] = nt\epsilon - t^2\epsilon \Delta _{{\varvec{x}}}S_{\epsilon }({\varvec{x}},t). \end{aligned}$$
(108)

Use (40) and (107) to get

$$\begin{aligned} \nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t) = {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] , \end{aligned}$$

which is the representation formula (42). Use (41) and (108) to get

$$\begin{aligned} t^2\epsilon \Delta _{{\varvec{x}}}S_{\epsilon }({\varvec{x}},t) = t{\mathbb {E}}_{J}\left[ {\left\langle \pi _{\partial J({\varvec{y}})}({\varvec{0}}), {\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle }\right] \end{aligned}$$

which is, after dividing both sides by \(t^2\epsilon \), the representation formula (43). This concludes Step 4.

Proof of (ii): Here, we only assume that J satisfies assumptions (A1)–(A3); we do not assume that \(\mathrm {dom}~J = {\mathbb {R}}^n\). Let \(\{\mu _k\}_{k=1}^{+\infty }\) be a sequence of positive real numbers converging to zero. Define \(f_k:{\mathbb {R}}^n\times (0,+\infty )\times (0,+\infty ) \rightarrow {\mathbb {R}}\) by

$$\begin{aligned}&f_k({\varvec{x}},t,\epsilon ) \nonumber \\&\quad = -\epsilon \log \left( \frac{1}{(2\pi t\epsilon )^{n/2}}\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon }d{\varvec{y}}\right) \end{aligned}$$
(109)

and let \(S_0({\varvec{x}},\mu _k)\) denote the solution to the first-order HJ PDE (20) with initial data J evaluated at \(({\varvec{x}},\mu _k)\), that is,

$$\begin{aligned} S_0({\varvec{x}},\mu _k) = \inf _{{\varvec{y}}\in {\mathbb {R}}^n}\left\{ \frac{1}{2\mu _k}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + J({\varvec{y}})\right\} . \end{aligned}$$
(110)
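
For instance (a standard one-dimensional example that we include for illustration), for \(n = 1\) and \(J(y) = |y|\), formula (110) evaluates to the Huber function

$$\begin{aligned} S_0(x,\mu _k) = {\left\{ \begin{array}{ll} \frac{x^2}{2\mu _k} &{} \text {if } |x| \leqslant \mu _k, \\ |x| - \frac{\mu _k}{2} &{} \text {if } |x| > \mu _k, \end{array}\right. } \end{aligned}$$

which is continuously differentiable, convex, and increases to \(|x|\) as \(\mu _k \rightarrow 0\), consistent with the properties recalled next.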

By Proposition 2.2(i), the function \({\mathbb {R}}^n\ni {\varvec{x}}\mapsto S_0({\varvec{x}},\mu _k)\) is continuously differentiable and convex for each \(k \in {\mathbb {N}}\), and the sequence of real numbers \(\{S_0({\varvec{x}},\mu _k)\}_{k=1}^{+\infty }\) converges to \(J({\varvec{x}})\) for every \({\varvec{x}}\in \mathrm {dom}~J\). Moreover, by assumption (A3) (\(\inf _{{\varvec{y}}\in {\mathbb {R}}^n} J({\varvec{y}}) = 0\)) the sequence \(\{S_0({\varvec{x}},\mu _k)\}_{k=1}^{+\infty }\) is uniformly bounded from below by 0, that is,

$$\begin{aligned} S_0({\varvec{x}},\mu _k)&=\inf _{{\varvec{y}}\in {\mathbb {R}}^n}\left\{ \frac{1}{2\mu _k}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + J({\varvec{y}})\right\} \\&\geqslant \inf _{{\varvec{y}}\in {\mathbb {R}}^n}\left\{ \frac{1}{2\mu _k}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}\right\} +\inf _{{\varvec{y}}\in {\mathbb {R}}^n}J({\varvec{y}})\\&= 0. \end{aligned}$$

As a consequence, we can invoke Proposition 3.1(i) to conclude that for each \(k \in {\mathbb {N}}\), the function \(({\varvec{x}},t) \mapsto f_k({\varvec{x}},t,\epsilon )\) corresponds to the solution to the viscous HJ PDE (29) with initial data \(f_k({\varvec{x}},0,\epsilon ) = S_0({\varvec{x}},\mu _k)\). Moreover, \({\mathbb {R}}^n\ni {\varvec{x}}\mapsto f_k({\varvec{x}},t,\epsilon )\) is continuously differentiable and convex by Proposition 3.1(i) and (ii)(a). Finally, as the domain of the function \({\varvec{x}}\mapsto S_0({\varvec{x}},\mu _k)\) is \({\mathbb {R}}^n\), we can use the representation formula (42) in Proposition 4.2(i) (proved above in this Appendix) to express the gradient \(\nabla _{{\varvec{x}}}f_{k}({\varvec{x}},t,\epsilon )\) as follows

$$\begin{aligned} \nabla _{{\varvec{x}}}f_k({\varvec{x}},t,\epsilon ) = \frac{\int _{{\mathbb {R}}^n} \nabla _{{\varvec{y}}}S_0({\varvec{y}},\mu _k)e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}{\int _{{\mathbb {R}}^n} e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}. \end{aligned}$$
(111)

Now, since \(S_0({\varvec{x}},\mu _k) \geqslant 0\) for every \(k \in {\mathbb {N}}\), we can bound the integrand in (109) as follows

$$\begin{aligned} \frac{1}{(2\pi t\epsilon )^{n/2}}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon } \leqslant \frac{1}{(2\pi t\epsilon )^{n/2}}e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}}, \end{aligned}$$
(112)

where \(\int _{{\mathbb {R}}^n} \frac{1}{(2\pi t\epsilon )^{n/2}}e^{-\frac{1}{2t\epsilon }\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}} d{{\varvec{y}}} = 1\). We can therefore invoke the Lebesgue dominated convergence theorem [30, Theorem 2.24] and use (109) and the limit \(\lim _{k\rightarrow +\infty }e^{-S_0({\varvec{x}},\mu _k)/\epsilon }=e^{-J({\varvec{x}})/\epsilon }\) (with \(\lim _{k\rightarrow +\infty }e^{-S_0({\varvec{x}},\mu _k)/\epsilon }=0\) for every \({\varvec{x}}\notin \mathrm {dom}~J\)) to find

$$\begin{aligned} \begin{aligned}&\lim _{k\rightarrow +\infty } f_\epsilon ({\varvec{x}},t,k) \\&\quad = \lim _{k\rightarrow +\infty } -\epsilon \log \left( \frac{1}{(2\pi t\epsilon )^{n/2}}\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon }d{\varvec{y}}\right) \\&\quad = -\epsilon \log \left( \frac{1}{(2\pi t\epsilon )^{n/2}}\int _{\mathrm {dom}~J}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+J({\varvec{y}})\right) /\epsilon }d{\varvec{y}}\right) \\&\quad = S_\epsilon ({\varvec{x}},t), \end{aligned} \end{aligned}$$
(113)

which gives the limit (44). By continuous differentiability and convexity of \({\mathbb {R}}^n\ni {\varvec{x}}\mapsto f_\epsilon ({\varvec{x}},t,k)\) and \({\mathbb {R}}^n\ni {\varvec{x}}\mapsto S_\epsilon ({\varvec{x}},t)\) and the limit (113), we can invoke [58, Theorem 25.7] to conclude that the gradient \(\nabla _{{\varvec{x}}}f_\epsilon ({\varvec{x}},t,k)\) converges to the gradient \(\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t)\) as \(k \rightarrow +\infty \). Hence, we can take the limit \(k\rightarrow +\infty \) in (111) to find

$$\begin{aligned} \begin{aligned}&\lim _{k\rightarrow +\infty } \nabla _{{\varvec{x}}}f_\epsilon ({\varvec{x}},t,k) \\&\quad = \lim _{k\rightarrow +\infty } \left( \frac{\int _{{\mathbb {R}}^n} \nabla _{{\varvec{y}}}S_0({\varvec{y}},\mu _k)e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}{\int _{{\mathbb {R}}^n} e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}\right) \\&\quad = \nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t), \end{aligned} \end{aligned}$$
(114)

which gives the limit (45). Finally, using the definition of the posterior mean estimate (3), (114), and the representation formula (30) derived in Proposition 3.1(iii), namely \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) = {\varvec{x}}- t\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t)\), we find the two limits

$$\begin{aligned} \begin{aligned}&{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \\&\quad = \lim _{k\rightarrow +\infty } \left( \frac{\int _{{\mathbb {R}}^n} {\varvec{y}}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}{\int _{{\mathbb {R}}^n} e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}\right) \\&\quad = {\varvec{x}}- t\lim _{k\rightarrow +\infty } \left( \frac{\int _{{\mathbb {R}}^n} \nabla _{{\varvec{y}}}S_0({\varvec{y}},\mu _k)e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}{\int _{{\mathbb {R}}^n} e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}+S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}\right) , \end{aligned} \end{aligned}$$

which establishes (46). This concludes the proof of (ii).
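The limits (44)–(46) can also be observed numerically. The sketch below is our illustration, not the paper's: for the nonsmooth prior \(J(y) = |y|\) in one dimension, it computes the posterior mean once with the weight built from J itself and once with the weight built from the Moreau envelope \(S_0(\cdot ,\mu _k)\) (the Huber function), and prints the gap as \(\mu _k \rightarrow 0\); the parameter values are arbitrary.

```python
import numpy as np

trapz = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

def posterior_mean(x, t, eps, prior, grid):
    # Weighted average of y under exp(-(|x - y|^2/(2t) + prior(y))/eps), n = 1
    w = np.exp(-((x - grid) ** 2 / (2 * t) + prior(grid)) / eps)
    return trapz(grid * w, grid) / trapz(w, grid)

def huber(y, mu):
    # Moreau envelope of |.| with parameter mu
    return np.where(np.abs(y) <= mu, y**2 / (2 * mu), np.abs(y) - mu / 2)

x, t, eps = 0.7, 0.5, 0.1
grid = np.linspace(-20.0, 20.0, 400001)
u_pm = posterior_mean(x, t, eps, np.abs, grid)       # weight built from J = |.|
for mu in [1.0, 0.1, 0.01, 0.001]:
    u_k = posterior_mean(x, t, eps, lambda y: huber(y, mu), grid)
    print(mu, u_k, abs(u_k - u_pm))                  # gap shrinks as mu -> 0
```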

Proof of Proposition 4.3

Let us first introduce some notation. Let \({\varvec{x}}\in {\mathbb {R}}^n\), \(t>0\), \(\epsilon >0\), and \({\varvec{y}}_0 \in \mathrm {dom}~\partial J\). Let \(\{\mu _k\}_{k=1}^{+\infty }\) be a sequence of positive real numbers converging to zero and let \(S_0:{\mathbb {R}}^n\times (0,+\infty ) \rightarrow {\mathbb {R}}\) denote the solution to the first-order HJ PDE (20) with initial data J (see Proposition 2.2). Define the functions

$$\begin{aligned} \mathrm {dom}~\partial J \ni {\varvec{y}}\mapsto \varphi _J({\varvec{y}}|{\varvec{x}},t)= & {} \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) + \pi _{\partial J({\varvec{y}})}({\varvec{0}}),\\ \mathrm {dom}~\partial J \ni {\varvec{y}}\mapsto \Phi _J({\varvec{y}}|{\varvec{x}},t)= & {} \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + J({\varvec{y}}), \end{aligned}$$

and

$$\begin{aligned} \varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}|{\varvec{x}},t) = \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) + \nabla _{{\varvec{y}}}S_0({\varvec{y}},\mu _k). \end{aligned}$$

Note that for every \({\varvec{y}}\in \mathrm {dom}~\partial J\), \(\varphi _J({\varvec{y}}|{\varvec{x}},t)\) is a subgradient of the function \({\varvec{u}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + J({\varvec{u}})\) evaluated at \({\varvec{u}}= {\varvec{y}}\) and, for every \({\varvec{y}}\in {\mathbb {R}}^n\), \(\varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}|{\varvec{x}},t)\) is the gradient of the function \({\varvec{u}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + S_0({\varvec{u}},\mu _k)\) evaluated at \({\varvec{u}}= {\varvec{y}}\). Note also that the sequence \(\{S_0({\varvec{y}},\mu _k)\}_{k=1}^{+\infty }\) is uniformly bounded from below since

$$\begin{aligned} \begin{aligned} S_0({\varvec{y}},\mu _k)&= \inf _{{\varvec{u}}\in {\mathbb {R}}^n}\left\{ \frac{1}{2\mu _k}\left\| {{\varvec{y}}-{\varvec{u}}} \right\| _{2}^{2} + J({\varvec{u}})\right\} \\&\geqslant \inf _{{\varvec{u}}\in {\mathbb {R}}^n} J({\varvec{u}}) \\&= 0. \end{aligned} \end{aligned}$$
(115)

Now, define the function \(F:\mathrm {dom}~\partial J \times \mathrm {dom}~\partial J\times {\mathbb {R}}^n\times (0,+\infty )\rightarrow {\mathbb {R}}\) as

$$\begin{aligned}&F({\varvec{y}},{\varvec{y}}_0,{\varvec{x}},t)\nonumber \\&\quad = \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t)-\varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle \nonumber \\&\qquad \frac{e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }}{\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}} \end{aligned}$$
(116)

and the sequence of functions \(\{F_{\mu _k}\}_{k=1}^{+\infty }\) with \(F_{\mu _k}:{\mathbb {R}}^n\times {\mathbb {R}}^n\times {\mathbb {R}}^n\times (0,+\infty )\rightarrow {\mathbb {R}}\) as

$$\begin{aligned}&F_{\mu _k}({\varvec{y}},{\varvec{y}}_0,{\varvec{x}},t)\nonumber \\&\quad =\left\langle \varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}|{\varvec{x}},t)-\varphi _{S_0(\cdot ,\mu _k)} ({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle \nonumber \\&\qquad \frac{e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon }}{\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{{\varvec{y}}}}. \end{aligned}$$
(117)

Since \(\lim _{k\rightarrow +\infty } S_0({\varvec{y}},\mu _k) = J({\varvec{y}})\) and \(\lim _{k\rightarrow +\infty } \nabla _{{\varvec{y}}}S_0({\varvec{y}},\mu _k) = \pi _{\partial J({\varvec{y}})}({\varvec{0}})\) for every \({\varvec{y}}\in \mathrm {dom}~\partial J\) by Proposition 2.2(i) and (iv), and

$$\begin{aligned}&\lim _{k\rightarrow +\infty } \int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{{\varvec{y}}} \nonumber \\&\quad = \int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}} \end{aligned}$$
(118)

by (44) in Proposition 4.2(ii) and continuity of the logarithm, the limit \(\lim _{k \rightarrow +\infty } F_{\mu _k}({\varvec{y}},{\varvec{y}}_0,{\varvec{x}},t) = F({\varvec{y}},{\varvec{y}}_0,{\varvec{x}},t)\) holds for every \({\varvec{y}}\in \mathrm {dom}~\partial J\), \({\varvec{y}}_0 \in \mathrm {dom}~\partial J\), \({\varvec{x}}\in {\mathbb {R}}^n\), and \(t > 0\). Note that as J is strongly convex with parameter \(m\geqslant 0\), the functions \({\varvec{y}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + J({\varvec{y}})\) and \({\varvec{y}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\) are strongly convex with parameter \(\left( \frac{1+mt}{t}\right) > 0\). As a consequence, for every pair \(({\varvec{y}},{\varvec{y}}_0) \in \mathrm {dom}~\partial J \times \mathrm {dom}~\partial J\), the following monotonicity inequalities hold (see Definition 5, Eq. (13)):

$$\begin{aligned} 0\leqslant & {} \left( \frac{1+mt}{t}\right) \left\| {{\varvec{y}}-{\varvec{y}}_{0}} \right\| _{2}^{2}\nonumber \\\leqslant & {} \left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t)-\varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle \end{aligned}$$
(119)

and

$$\begin{aligned} 0\leqslant & {} \left( \frac{1+mt}{t}\right) \left\| {{\varvec{y}}-{\varvec{y}}_{0}} \right\| _{2}^{2}\nonumber \\\leqslant & {} \left\langle \varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}|{\varvec{x}},t)-\varphi _{S_0(\cdot ,\mu _k)} ({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle . \end{aligned}$$
(120)

Multiply the first set of inequalities by \(e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }/\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}\) and the second set of inequalities by \(e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon }/\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{{\varvec{y}}}\), and use the definitions of F and \(F_{\mu _k}\) to get the inequalities

$$\begin{aligned} 0\leqslant & {} \left( \frac{1+mt}{t}\right) \left\| {{\varvec{y}}-{\varvec{y}}_{0}} \right\| _{2}^{2} \frac{e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon }}{\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + J({\varvec{y}})\right) /\epsilon } d{{\varvec{y}}}}\nonumber \\\leqslant & {} F({\varvec{y}},{\varvec{y}}_0,{\varvec{x}},t)\\ 0\leqslant & {} \left( \frac{1+mt}{t}\right) \left\| {{\varvec{y}}-{\varvec{y}}_{0}} \right\| _{2}^{2} \frac{e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon }}{\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{{\varvec{y}}}} \nonumber \\\leqslant & {} F_{\mu _k}({\varvec{y}},{\varvec{y}}_0,{\varvec{x}},t).\nonumber \end{aligned}$$
(121)

These inequalities show, in particular, that F and \(F_{\mu _k}\) are both non-negative functions for every \(({\varvec{y}},{\varvec{y}}_0) \in \mathrm {dom}~\partial J \times \mathrm {dom}~\partial J\), \({\varvec{x}}\in {\mathbb {R}}^n\), and \(t>0\). As a consequence, Fatou’s lemma [30, Lemma 2.18] applies to the sequence of functions \(\{F_{\mu _k}\}_{k=1}^{+\infty }\), and hence

$$\begin{aligned} \begin{aligned}&\int _{\mathrm {dom}~\partial J}F({\varvec{y}},{\varvec{y}}_0,{\varvec{x}},t) d{\varvec{y}}\\&\quad \leqslant \liminf _{k \rightarrow +\infty } \int _{\mathrm {dom}~\partial J} F_{\mu _k}({\varvec{y}},{\varvec{y}}_0,{\varvec{x}},t) d{\varvec{y}}\\&\quad \leqslant \liminf _{k \rightarrow +\infty } \int _{{\mathbb {R}}^n} F_{\mu _k}({\varvec{y}},{\varvec{y}}_0,{\varvec{x}},t) d{\varvec{y}}\\&\quad = \liminf _{k \rightarrow +\infty } \left( \frac{\int _{{\mathbb {R}}^n} \left\langle \varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}|{\varvec{x}},t)-\varphi _{S_0(\cdot ,\mu _k)} ({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle e^{-\left( \frac{1}{2t} \left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}{\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{{\varvec{y}}}}\right) \\&\quad = \liminf _{k \rightarrow +\infty } \left( \frac{\int _{{\mathbb {R}}^n} \left\langle \varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}{\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{{\varvec{y}}}} \right. \\&\qquad - \left. \frac{\int _{{\mathbb {R}}^n} \left\langle \varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}{\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{{\varvec{y}}}} \right) . \end{aligned} \end{aligned}$$
(122)

We now wish to compute the limit in (122). On the one hand, we can apply formula (39) in Proposition 4.2(i) (with initial data \(S_0(\cdot ,\mu _k)\) and using \(\varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}|{\varvec{x}},t) = \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) + \nabla _{{\varvec{y}}}S_0({\varvec{y}},\mu _k)\)) to the first integral on the right-hand side of the last line of (122) to get

$$\begin{aligned}&\frac{\int _{{\mathbb {R}}^n} \left\langle \varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}{\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{{\varvec{y}}}} \nonumber \\&\quad = n\epsilon . \end{aligned}$$
(123)

On the other hand, applying the limit result (46) in Proposition 4.2(ii) for the posterior mean estimate \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\) and the limit \(\lim _{k\rightarrow +\infty } \varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}_0|{\varvec{x}},t) = \varphi _{J}({\varvec{y}}_0|{\varvec{x}},t) = \left( \frac{{\varvec{y}}_0-{\varvec{x}}}{t}\right) + \pi _{\partial J({\varvec{y}}_0)}({\varvec{0}})\) to the second integral on the right-hand side of the last line of (122), we get

$$\begin{aligned}&\liminf _{k\rightarrow +\infty } \left( \frac{\int _{{\mathbb {R}}^n} \left\langle \varphi _{S_0(\cdot ,\mu _k)}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{\varvec{y}}}{\int _{{\mathbb {R}}^n}e^{-\left( \frac{1}{2t}\left\| {\varvec{x}}- {\varvec{y}}\right\| _{2}^{2} + S_0({\varvec{y}},\mu _k)\right) /\epsilon } d{{\varvec{y}}}}\right) \nonumber \\&\quad = \left\langle \varphi _{J}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{y}}_0\right\rangle . \end{aligned}$$
(124)

Combine (27), (116), (121), (122), (123), and (124) to get

$$\begin{aligned}&\left( \frac{1+mt}{t}\right) {\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}^{2}}\right] \\&\quad \leqslant {\mathbb {E}}_{J}\left[ {\left\langle \varphi _J({\varvec{y}}|{\varvec{x}},t)-\varphi _J({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_{0}\right\rangle }\right] \\&\quad \leqslant n\epsilon - \left\langle \varphi _{J}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{y}}_0\right\rangle . \end{aligned}$$

This establishes the set of inequalities (47).
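As a concrete sanity check (ours, with arbitrary parameters), the inequalities (47) can be evaluated by quadrature for the strongly convex prior \(J(y) = \frac{m}{2}y^2\) in one dimension, where \(\pi _{\partial J(y)}({\varvec{0}}) = my\). For a quadratic prior the posterior is Gaussian and both inequalities in (47) in fact hold with equality, so the three printed numbers should coincide up to quadrature error.

```python
import numpy as np

trapz = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

# Check of (47) for J(y) = m*y^2/2, n = 1; our sketch, arbitrary parameters.
x, t, eps, m, y0 = 0.7, 0.5, 0.1, 2.0, -1.3
y = np.linspace(-20.0, 20.0, 400001)
w = np.exp(-((x - y) ** 2 / (2 * t) + m * y**2 / 2) / eps)
Z = trapz(w, y)
E = lambda g: trapz(g * w, y) / Z        # expectation under the posterior
phi = lambda u: (u - x) / t + m * u      # phi_J(u | x, t); grad J(u) = m*u
u_pm = E(y)                              # posterior mean estimate
lower = (1 + m * t) / t * E((y - y0) ** 2)
middle = E((phi(y) - phi(y0)) * (y - y0))
upper = eps - phi(y0) * (u_pm - y0)      # n*eps = eps since n = 1
print(lower, middle, upper)              # equal here because J is quadratic
```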

Next, we show that \({\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J({\varvec{y}})}({\varvec{0}})} \right\| _{2}}\right] < +\infty \) indirectly using the set of inequalities (47). By Proposition 4.1, \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\in \mathrm {int}~(\mathrm {dom}~J)\). Hence, there exists a number \(\delta >0\) such that the open ball \(\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2} < \delta \}\) is contained in \(\mathrm {int}~(\mathrm {dom}~J)\). Let \({\varvec{y}}_0 \in \{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2} < \delta \}\) with \({\varvec{y}}_0 \ne {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\). Recall that \(\mathrm {int}~(\mathrm {dom}~J) \subset \mathrm {dom}~\partial J\), so that both \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\) and \({\varvec{y}}_0\) are in the set \(\mathrm {dom}~\partial J\). We claim that \({\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{y}}_0\right\rangle \right| }\right] < +\infty \). Indeed, using the triangle inequality, the set of inequalities (47) proven previously, the Cauchy–Schwarz inequality, and the bound \({\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}}\right] \leqslant \left( \int _{{\mathbb {R}}^n} \left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}e^{-\frac{1}{2t\epsilon }\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}} d{\varvec{y}}\right) /Z_J({\varvec{x}},t,\epsilon ) < +\infty \), which follows from assumption (A3),

$$\begin{aligned} 0&\leqslant {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{y}}_0\right\rangle \right| }\right] \nonumber \\&= {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{y}}_0 + ({\varvec{y}}-{\varvec{y}})\right\rangle \right| }\right] \nonumber \\&= {\mathbb {E}}_{J}\left[ \left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right. \right. \nonumber \\&\quad \left. \left. - \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle \right| \right] \nonumber \\&\leqslant {\mathbb {E}}_{J}\left[ \left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right| \right. \nonumber \\&\quad \left. + \left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle \right| \right] \nonumber \\&= {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right| }\right] \nonumber \\&\quad + {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\rangle \right| }\right] \nonumber \\&\leqslant {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right| }\right] + n\epsilon \nonumber \\&= {\mathbb {E}}_{J}\left[ \left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t) \right. \right. \right. \nonumber \\&\quad \left. \left. \left. + (\varphi _{J}({\varvec{y}}_0|{\varvec{x}},t) -\varphi _{J}({\varvec{y}}_0|{\varvec{x}},t)),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right| \right] + n\epsilon \nonumber \\&= {\mathbb {E}}_{J}\left[ \left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t) -\varphi _{J}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right. \right. \nonumber \\&\quad \left. \left. + \left\langle \varphi _{J}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right| \right] + n\epsilon \nonumber \\&\leqslant {\mathbb {E}}_{J}\left[ \left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t) -\varphi _{J}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right| \right. \nonumber \\&\quad \left. 
+ \left| \left\langle \varphi _{J}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right| \right] + n\epsilon \nonumber \\&\leqslant {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t) -\varphi _{J}({\varvec{y}}_0|{\varvec{x}},t), {\varvec{y}}-{\varvec{y}}_0\right\rangle \right| }\right] \nonumber \\&\quad + {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{y}}-{\varvec{y}}_0\right\rangle \right| }\right] + n\epsilon \nonumber \\&\leqslant {\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t) -\varphi _{J}({\varvec{y}}_0|{\varvec{x}},t), {\varvec{y}}-{\varvec{y}}_0\right\rangle }\right] \nonumber \\&\quad + {\mathbb {E}}_{J}\left[ {\left\| {\varphi _{J}({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}}\right] + n\epsilon \nonumber \\&\leqslant n\epsilon - \left\langle \varphi _{J}({\varvec{y}}_0|{\varvec{x}},t),{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{y}}_0 \right\rangle \nonumber \\&\quad + \left\| {\varphi _{J}({\varvec{y}}_0|{\varvec{x}},t)} \right\| _{2}{\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-{\varvec{y}}_0} \right\| _{2}}\right] + n\epsilon \nonumber \\&< +\infty . \end{aligned}$$
(125)

This shows that \({\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{y}}_0\right\rangle \right| }\right] < +\infty \) for every \({\varvec{y}}_0 \in \{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2} < \delta \}\) different from \({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\). Now, let \(\{{\varvec{e}}_i\}_{i=1}^{n}\) denote the standard basis in \({\mathbb {R}}^n\) and let \(\{\varphi _J({\varvec{y}}|{\varvec{x}},t)_i\}_{i=1}^{n}\) denote the components of the vector \(\varphi _J({\varvec{y}}|{\varvec{x}},t)\), i.e., \(\varphi _J({\varvec{y}}|{\varvec{x}},t) = (\varphi _J({\varvec{y}}|{\varvec{x}},t)_1,\dots ,\varphi _J({\varvec{y}}|{\varvec{x}},t)_n)\). Using (125) with the choice of \({\varvec{y}}_0 = {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - \frac{\delta }{2}{\varvec{e}}_i\), which is contained in the open ball \(\{{\varvec{y}}\in {\mathbb {R}}^n\mid \left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2} < \delta \}\) for each \(i\in \{1,\dots ,n\}\), we get

$$\begin{aligned} \begin{aligned} 0&\leqslant {\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t), {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - ({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - \frac{\delta }{2}{\varvec{e}}_i)\right\rangle \right| }\right] \\&={\mathbb {E}}_{J}\left[ {\left| \left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t), \frac{\delta }{2}{\varvec{e}}_i\right\rangle \right| }\right] \\&= \frac{\delta }{2}{\mathbb {E}}_{J}\left[ {|\varphi _J({\varvec{y}}|{\varvec{x}},t)_i|}\right] \\&\leqslant 2n\epsilon - \left\langle \varphi _{J}({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - \frac{\delta }{2}{\varvec{e}}_i|{\varvec{x}},t),\frac{\delta }{2} {\varvec{e}}_i \right\rangle \\&\quad + \left\| {\varphi _{J}({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - \frac{\delta }{2}{\varvec{e}}_i|{\varvec{x}},t)} \right\| _{2}\\&\quad {\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-({\varvec{u}}_{PM}({\varvec{x}},t, \epsilon )-\frac{\delta }{2}{\varvec{e}}_i)} \right\| _{2}}\right] \\&< +\infty . \end{aligned} \end{aligned}$$
(126)

Using (126) and the norm inequality \(\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2} \leqslant \sum _{i=1}^{n}\left| \varphi _J({\varvec{y}}|{\varvec{x}},t)_i\right| \), we can bound \({\mathbb {E}}_{J}\left[ {\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2}}\right] \) as follows

$$\begin{aligned} 0\leqslant & {} {\mathbb {E}}_{J}\left[ {\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2}}\right] \nonumber \\\leqslant & {} {\mathbb {E}}_{J}\left[ {\sum _{i=1}^{n}\left| \varphi _J({\varvec{y}}|{\varvec{x}},t)_i\right| }\right] \nonumber \\= & {} \sum _{i=1}^{n} {\mathbb {E}}_{J}\left[ {\left| \varphi _J({\varvec{y}}|{\varvec{x}},t)_i\right| }\right] \nonumber \\\leqslant & {} \frac{2}{\delta }\left( 2n^2\epsilon -\sum _{i=1}^{n} \left\langle \varphi _{J}({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - \frac{\delta }{2}{\varvec{e}}_i|{\varvec{x}},t),\frac{\delta }{2} {\varvec{e}}_i \right\rangle \right. \nonumber \\&\left. + \sum _{i=1}^{n} \left\| {\varphi _{J}({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - \frac{\delta }{2}{\varvec{e}}_i|{\varvec{x}},t)} \right\| _{2} {\mathbb {E}}_{J}\left[ {\left\| {{\varvec{y}}-({\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - \frac{\delta }{2}{\varvec{e}}_i)} \right\| _{2}}\right] \right) \nonumber \\< & {} +\infty . \end{aligned}$$
(127)

This shows that \({\mathbb {E}}_{J}\left[ {\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2}}\right] < + \infty \). Finally, use (127), \(\varphi _J({\varvec{y}}|{\varvec{x}},t) = \frac{{\varvec{y}}-{\varvec{x}}}{t} + \pi _{\partial J({\varvec{y}})}({\varvec{0}})\), and assumption (A3) to find

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J({\varvec{y}})}({\varvec{0}})} \right\| _{2}}\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J({\varvec{y}})}({\varvec{0}}) + \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) - \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) } \right\| _{2}}\right] \\&\quad \leqslant {\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J({\varvec{y}})}({\varvec{0}}) + \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) } \right\| _{2} + \left\| {\left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) } \right\| _{2}}\right] \\&\quad ={\mathbb {E}}_{J}\left[ {\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2}}\right] + {\mathbb {E}}_{J}\left[ {\left\| {\frac{{\varvec{y}}-{\varvec{x}}}{t}} \right\| _{2}}\right] \\&\quad \leqslant {\mathbb {E}}_{J}\left[ {\left\| {\varphi _J({\varvec{y}}|{\varvec{x}},t)} \right\| _{2}}\right] \\&\qquad + \frac{1}{tZ_J({\varvec{x}},t,\epsilon )}\int _{{\mathbb {R}}^n} \left\| {{\varvec{y}}-{\varvec{x}}} \right\| _{2} e^{-\frac{1}{2t\epsilon }\left\| {{\varvec{y}}-{\varvec{x}}} \right\| _{2}^{2}} d{\varvec{y}}\\&\quad < +\infty . \end{aligned} \end{aligned}$$

This shows that \({\mathbb {E}}_{J}\left[ {\left\| {\pi _{\partial J({\varvec{y}})}({\varvec{0}})} \right\| _{2}}\right] < +\infty \).

Proof of Proposition 4.5

Proof of (i): Let \({\varvec{x}}\in {\mathbb {R}}^n\) and \(t > 0\) and define the functions

$$\begin{aligned} \mathrm {dom}~\partial J \ni {\varvec{y}}\mapsto \varphi _J({\varvec{y}}|{\varvec{x}},t)= & {} \left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) + \pi _{\partial J({\varvec{y}})}({\varvec{0}}),\\ \mathrm {dom}~\partial J \ni {\varvec{y}}\mapsto \Phi _J({\varvec{y}}|{\varvec{x}},t)= & {} \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + J({\varvec{y}}). \end{aligned}$$

Note that for every \({\varvec{y}}\in \mathrm {dom}~\partial J\), \(\varphi _J({\varvec{y}}|{\varvec{x}},t)\) is a subgradient of the function \({\varvec{u}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + J({\varvec{u}})\) evaluated at \({\varvec{u}}= {\varvec{y}}\).

Let \({\varvec{u}}\in \mathrm {dom}~\partial J\). The Bregman divergence of the function \(\mathrm {dom}~\partial J\ni {\varvec{y}}\mapsto \Phi _{J}({\varvec{y}}|{\varvec{x}},t)\) at \(({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t))\) is given by

$$\begin{aligned}&D_{\Phi _{J}}({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t)) \\&\quad =\Phi _{J}({\varvec{u}}|{\varvec{x}},t)-\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{u}}\right\rangle +\Phi _{J}^{*}(\varphi _{J}({\varvec{y}}|{\varvec{x}},t))\\&\quad \equiv \Phi _{J}({\varvec{u}}|{\varvec{x}},t)-\Phi _{J}({\varvec{y}}|{\varvec{x}},t)+\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{u}}\right\rangle , \end{aligned}$$

where the second equality follows from the definition of the subdifferential (see Definition 6) and the fact that \(\varphi _{J}({\varvec{y}}|{\varvec{x}},t) \in \partial \Phi _J({\varvec{y}}|{\varvec{x}},t)\).
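As a side check (our construction, not from the paper), the equivalence of the two expressions above rests on the Fenchel–Young equality \(\Phi _{J}^{*}(\varphi ) = \left\langle \varphi ,{\varvec{y}}\right\rangle - \Phi _{J}({\varvec{y}}|{\varvec{x}},t)\) whenever \(\varphi \in \partial \Phi _J({\varvec{y}}|{\varvec{x}},t)\). The minimal Python sketch below verifies this numerically for an arbitrary quadratic example in one dimension; all names and parameter values are ours.

```python
import numpy as np

# Two expressions for the Bregman divergence of Phi(u) = (x-u)^2/(2t) + m*u^2/2,
# whose gradient is phi(u) = (u-x)/t + m*u; n = 1, arbitrary parameters (ours).
x, t, m = 0.7, 0.5, 2.0
Phi = lambda u: (x - u) ** 2 / (2 * t) + m * u**2 / 2
phi = lambda u: (u - x) / t + m * u
u_grid = np.linspace(-50.0, 50.0, 400001)
Phi_star = lambda p: np.max(p * u_grid - Phi(u_grid))  # Fenchel conjugate on a grid
u, y = 0.3, -0.8
d1 = Phi(u) - phi(y) * u + Phi_star(phi(y))            # first expression
d2 = Phi(u) - Phi(y) + phi(y) * (y - u)                # second expression
print(d1, d2)  # equal up to grid error, and non-negative
```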

Take the expected value with respect to the variable \({\varvec{y}}\) over \(\mathrm {dom}~\partial J\) to find

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {D_{\Phi _{J}}({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \\&\quad =\Phi _{J}({\varvec{u}}|{\varvec{x}},t)-{\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{u}}\right\rangle }\right] + {\mathbb {E}}_{J}\left[ {\Phi _{J}^{*}(\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \\&\quad \equiv \Phi _{J}({\varvec{u}}|{\varvec{x}},t)-{\mathbb {E}}_{J}\left[ {\Phi _{J}({\varvec{y}}|{\varvec{x}},t)}\right] + {\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{u}}\right\rangle }\right] . \end{aligned} \end{aligned}$$
(128)

We claim that the expected value \({\mathbb {E}}_{J}\left[ {D_{\Phi _{J}}({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \) is finite. We will show this by proving, in turn, that the expected values \({\mathbb {E}}_{J}\left[ {\Phi _{J}({\varvec{y}}|{\varvec{x}},t)}\right] \) and \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{y}}-{\varvec{u}}\right\rangle }\right] \) are finite. Establishing the finiteness of \({\mathbb {E}}_{J}\left[ {D_{\Phi _{J}}({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \) will enable us to conclude that the expected value \({\mathbb {E}}_{J}\left[ {\Phi _{J}^{*}(\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \) on the right-hand side of the first equality of (128) is also finite.

First, using the definition of \(\Phi _{J}({\varvec{y}}|{\varvec{x}},t)\) we have

$$\begin{aligned} {\mathbb {E}}_{J}\left[ {\Phi _{J}({\varvec{y}}|{\varvec{x}},t)}\right] \equiv {\mathbb {E}}_{J}\left[ {\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2} + J({\varvec{y}})}\right] . \end{aligned}$$

The expected value \({\mathbb {E}}_{J}\left[ {\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}}\right] \) is finite because we can use the definition of the posterior mean estimate and inequality (48) (with \(m = 0\) in (48)) to bound it as follows:

$$\begin{aligned} \begin{aligned} 0&< {\mathbb {E}}_{J}\left[ {\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}}\right] \\&= {\mathbb {E}}_{J}\left[ {\frac{1}{2t}\left\| {({\varvec{x}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )) - ({\varvec{y}}- {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ))} \right\| _{2}^{2}}\right] \\&= \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2} \\&\quad + {\mathbb {E}}_{J}\left[ {\frac{1}{2t}\left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2}}\right] \\&\quad - \frac{1}{t}\left\langle {\varvec{x}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ),{\mathbb {E}}_{J}\left[ {{\varvec{y}}}\right] - {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle \\&= \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2} \\&\quad + {\mathbb {E}}_{J}\left[ {\frac{1}{2t}\left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2}}\right] \\&\quad - \frac{1}{t}\left\langle {\varvec{x}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ),{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) - {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon ) \right\rangle \\&= \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2} + {\mathbb {E}}_{J}\left[ {\frac{1}{2t}\left\| {{\varvec{y}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2}}\right] \\&\leqslant \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )} \right\| _{2}^{2} + \frac{n\epsilon }{2}. \end{aligned} \end{aligned}$$

The expected value \({\mathbb {E}}_{J}\left[ {J({\varvec{y}})}\right] \) is also finite because it is bounded by the set of inequalities (38) in Proposition 4.1. Hence, the expected value \({\mathbb {E}}_{J}\left[ {\Phi _{J}({\varvec{y}}|{\varvec{x}},t)}\right] \equiv {\mathbb {E}}_{J}\left[ {\frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{y}}} \right\| _{2}^{2}}\right] + {\mathbb {E}}_{J}\left[ {J({\varvec{y}})}\right] \) is finite.
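The bound just derived is easy to verify by quadrature. The following sketch is ours (arbitrary parameters, \(n = 1\), and the nonsmooth prior \(J = |\cdot |\)); it checks that \({\mathbb {E}}_{J}\left[ \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{y}}\right\| _{2}^{2}\right] \leqslant \frac{1}{2t}\left\| {\varvec{x}}-{\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )\right\| _{2}^{2} + \frac{n\epsilon }{2}\).

```python
import numpy as np

trapz = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

# Check E_J[(1/2t)(x - y)^2] <= (1/2t)(x - u_PM)^2 + eps/2 for J = |.|, n = 1.
x, t, eps = 0.7, 0.5, 0.1
y = np.linspace(-20.0, 20.0, 400001)
w = np.exp(-((x - y) ** 2 / (2 * t) + np.abs(y)) / eps)
Z = trapz(w, y)
u_pm = trapz(y * w, y) / Z
lhs = trapz((x - y) ** 2 / (2 * t) * w, y) / Z
rhs = (x - u_pm) ** 2 / (2 * t) + eps / 2
print(lhs, rhs, bool(lhs <= rhs))
```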

Second, note that the expected value \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \) can be written as

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t)-\varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle + \left\langle \varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t)-\varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \\&\qquad + \left\langle \varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\mathbb {E}}_{J}\left[ {{\varvec{y}}}\right] -{\varvec{u}}\right\rangle \\&\quad = {\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t)-\varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \\&\qquad + \left\langle \varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{u}}\right\rangle . \end{aligned} \end{aligned}$$
(129)

Apply the monotonicity property (47) to the expected value \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t)-\varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \) (with \({\varvec{y}}_0 \equiv {\varvec{u}}\) in (47)) in the previous equation to find

$$\begin{aligned} 0\leqslant & {} {\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t)-\varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \\\leqslant & {} n\epsilon - \left\langle \varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{u}}\right\rangle . \end{aligned}$$

Add the term \(\left\langle \varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{u}}\right\rangle \) to both sides of these inequalities to get

$$\begin{aligned}&\left\langle \varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{u}}\right\rangle \nonumber \\&\quad \leqslant {\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t)-\varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \nonumber \\&\qquad + \left\langle \varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{u}}\right\rangle \leqslant n\epsilon . \end{aligned}$$
(130)

Combine the inequalities (130) with the equality (129) to find

$$\begin{aligned}&\left\langle \varphi _{J}({\varvec{u}}|{\varvec{x}},t), {\varvec{u}}_{PM}({\varvec{x}},t,\epsilon )-{\varvec{u}}\right\rangle \\&\quad \leqslant {\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \leqslant n\epsilon . \end{aligned}$$

These bounds prove that the expected value \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t), {\varvec{y}}-{\varvec{u}}\right\rangle }\right] \) is finite.

The previous arguments show that the expected value \({\mathbb {E}}_{J}\left[ {D_{\Phi _{J}}({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \) is finite. Now, we claim that the expected value \({\mathbb {E}}_{J}\left[ {\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{u}}\right\rangle }\right] \equiv \left\langle {\mathbb {E}}_{J}\left[ {\varphi _{J}({\varvec{y}}|{\varvec{x}},t)}\right] ,{\varvec{u}}\right\rangle \) is finite. Indeed, we can use the representation formula (30) for expressing the posterior mean estimate in terms of the gradient \(\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t)\) of the solution to the viscous HJ PDE (29) and use that \({\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \) is finite (Proposition 4.3) to write

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {\varphi _{J}({\varvec{y}}|{\varvec{x}},t)}\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) + \pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\left( \frac{{\varvec{y}}-{\varvec{x}}}{t}\right) }\right] + {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \\&\quad \equiv -\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t) + {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] , \end{aligned} \end{aligned}$$

where both terms on the right-hand side are finite. This shows that \({\mathbb {E}}_{J}\left[ {\varphi _{J}({\varvec{y}}|{\varvec{x}},t)}\right] \) is finite.

Using that \({\mathbb {E}}_{J}\left[ {D_{\Phi _{J}}({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \) and \({\mathbb {E}}_{J}\left[ {\varphi _{J}({\varvec{y}}|{\varvec{x}},t)}\right] \) are finite in Eq. (128), we conclude that the expected value \({\mathbb {E}}_{J}\left[ {\Phi _{J}^{*}(\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \) is also finite. We can now use the definitions of \(\Phi _J\) and \(\varphi _J\) to express Eq. (128) as

$$\begin{aligned} \begin{aligned}&{\mathbb {E}}_{J}\left[ {D_{\Phi _{J}}({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \\&\quad = {\mathbb {E}}_{J}\left[ {\Phi _{J}({\varvec{u}}|{\varvec{x}},t)-\left\langle \varphi _{J}({\varvec{y}}|{\varvec{x}},t),{\varvec{u}}\right\rangle +\Phi _{J}^{*}(\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \\&\quad = \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + J({\varvec{u}}) \\&\qquad + \left\langle \nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t) - {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] ,{\varvec{u}}\right\rangle \\&\qquad + {\mathbb {E}}_{J}\left[ {\Phi _{J}^{*}(\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] , \end{aligned} \end{aligned}$$
(131)

where, again, we used that \({\mathbb {E}}_{J}\left[ {\varphi _{J}({\varvec{y}}|{\varvec{x}},t)}\right] = -\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t) + {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \). Now, let

$$\begin{aligned} {\tilde{J}}({\varvec{u}}) = J({\varvec{u}}) + \left\langle \nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t) - {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] ,{\varvec{u}}\right\rangle . \end{aligned}$$

Take the infimum over \({\varvec{u}}\in {\mathbb {R}}^n\) on both sides of Eq. (131) to find:

$$\begin{aligned}&\inf _{{\varvec{u}}\in {\mathbb {R}}^n}{\mathbb {E}}_{J}\left[ {D_{\Phi _{J}}({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \\&\quad = \inf _{{\varvec{u}}\in {\mathbb {R}}^n}\left\{ \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + {\tilde{J}}({\varvec{u}})\right\} + {\mathbb {E}}_{J}\left[ {\Phi _{J}^{*}(\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \end{aligned}$$

Now, note that by assumption (A1) \({\varvec{y}}\mapsto J({\varvec{y}}) \in \Gamma _0({\mathbb {R}}^n)\), and therefore the function \({\varvec{u}}\mapsto {\tilde{J}}({\varvec{u}}) \in \Gamma _0({\mathbb {R}}^n)\). Hence the function \({\mathbb {R}}^n\ni {\varvec{u}}\mapsto \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + {\tilde{J}}({\varvec{u}})\) is strictly convex and has a unique minimizer, denoted by \({\bar{{\varvec{u}}}}\), so the infimum in the equality above can be replaced by a minimum. In addition, recall that \(\min _{{\varvec{u}}\in {\mathbb {R}}^n} \left\{ \frac{1}{2t}\left\| {{\varvec{x}}-{\varvec{u}}} \right\| _{2}^{2} + {\tilde{J}}({\varvec{u}})\right\} \) corresponds to the solution to the first-order HJ PDE (20) with initial data \({\tilde{J}}\) evaluated at \(({\varvec{x}},t)\). Using Proposition 2.2(ii), the unique minimizer \({\bar{{\varvec{u}}}}\) can be expressed via the inclusion relation

$$\begin{aligned} \left( \frac{{\varvec{x}}- {\bar{{\varvec{u}}}}}{t}\right) \in \partial J({\bar{{\varvec{u}}}}) + \left( \nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t) - {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \right) . \end{aligned}$$
(132)

Therefore, the minimizer \({\bar{{\varvec{u}}}}\) is also the unique minimizer of \({\varvec{u}}\mapsto {\mathbb {E}}_{J}\left[ {D_{\Phi _{J}}({\varvec{u}},\varphi _{J}({\varvec{y}}|{\varvec{x}},t))}\right] \).

Proof of (ii): If \(\mathrm {dom}~J = {\mathbb {R}}^n\), then the representation formula \(\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t) = {\mathbb {E}}_{J}\left[ {\pi _{\partial J({\varvec{y}})}({\varvec{0}})}\right] \) derived in Proposition 4.2 holds and the characterization of the unique minimizer \({\bar{{\varvec{u}}}}\) in equation (132) reduces to

$$\begin{aligned} \left( \frac{{\varvec{x}}- {\bar{{\varvec{u}}}}}{t}\right) \in \partial J({\bar{{\varvec{u}}}}). \end{aligned}$$

By Proposition 2.2(ii), the unique minimizer that satisfies this characterization is the MAP estimate \({\varvec{u}}_{MAP}({\varvec{x}},t)\), i.e., \({\bar{{\varvec{u}}}} = {\varvec{u}}_{MAP}({\varvec{x}},t)\).\(\square \)
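A numerical illustration of part (ii) (our sketch, not the paper's): for a smooth prior with full domain, here the Huber function with parameter \(\mu \) in one dimension, the identity \(\nabla _{{\varvec{x}}}S_\epsilon ({\varvec{x}},t) = {\mathbb {E}}_{J}\left[ \pi _{\partial J({\varvec{y}})}({\varvec{0}})\right] = {\mathbb {E}}_{J}\left[ \nabla J({\varvec{y}})\right] \) can be checked by quadrature via the representation formula (30), and \({\varvec{u}}_{MAP}({\varvec{x}},t)\) can be recovered by minimizing \(\frac{1}{2t}\left\| {\varvec{x}}-{\varvec{u}}\right\| _{2}^{2}+J({\varvec{u}})\) on the same grid. The posterior mean and the MAP estimate differ in general, while the Bregman-optimal point satisfies \({\bar{{\varvec{u}}}} = {\varvec{u}}_{MAP}({\varvec{x}},t)\) by part (ii); all parameter values below are arbitrary.

```python
import numpy as np

trapz = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

# Smooth prior with dom J = R: J = Huber(mu). Our sketch, arbitrary parameters.
x, t, eps, mu = 0.7, 0.5, 0.1, 0.2
y = np.linspace(-20.0, 20.0, 400001)
J = np.where(np.abs(y) <= mu, y**2 / (2 * mu), np.abs(y) - mu / 2)
dJ = np.where(np.abs(y) <= mu, y / mu, np.sign(y))
w = np.exp(-((x - y) ** 2 / (2 * t) + J) / eps)
Z = trapz(w, y)
u_pm = trapz(y * w, y) / Z                         # posterior mean estimate
grad_S = (x - u_pm) / t                            # representation formula (30)
print(grad_S, trapz(dJ * w, y) / Z)                # should agree: grad S_eps = E_J[J'(y)]
u_map = y[np.argmin((x - y) ** 2 / (2 * t) + J)]   # grid minimizer, i.e. u_MAP
print(u_pm, u_map)                                 # distinct in general; u_bar = u_MAP
```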
