
Nonparametric estimation for interacting particle systems: McKean–Vlasov models

Probability Theory and Related Fields

Abstract

We consider a system of N interacting particles, governed by transport and diffusion, that converges in a mean-field limit to the solution of a McKean–Vlasov equation. From the observation of a trajectory of the system over a fixed time horizon, we investigate nonparametric estimation of the solution of the associated nonlinear Fokker–Planck equation, together with the drift term that controls the interactions, in a large population limit \(N \rightarrow \infty \). We build data-driven kernel estimators and establish oracle inequalities, following Lepski’s principle. Our results are based on a new Bernstein concentration inequality in McKean–Vlasov models for the empirical measure around its mean, possibly of independent interest. We obtain adaptive estimators over anisotropic Hölder smoothness classes built upon the solution map of the Fokker–Planck equation, and prove their optimality in a minimax sense. In the specific case of the Vlasov model, we derive an estimator of the interaction potential and establish its consistency.
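To fix ideas, here is a minimal simulation sketch of the setting; it is not the authors' procedure. It assumes dimension one, a hypothetical linear interaction F(u) = -u, no external force, a constant diffusion coefficient, an Euler–Maruyama discretisation, and a fixed-bandwidth Gaussian kernel in place of the data-driven bandwidth selection mentioned above. All function names and parameter values are illustrative.

```python
import math
import random


def simulate_particles(n, horizon, n_steps, drift_F, sigma, rng):
    """Euler-Maruyama scheme for dX^i = (1/n) sum_j F(X^i - X^j) dt + sigma dB^i (d = 1)."""
    dt = horizon / n_steps
    # Assumption: i.i.d. standard Gaussian initial conditions.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(n_steps):
        # Mean-field drift: average of the interaction over all particles.
        drift = [sum(drift_F(xi - xj) for xj in x) / n for xi in x]
        x = [
            xi + bi * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            for xi, bi in zip(x, drift)
        ]
    return x


def kernel_density(points, x0, h):
    """Gaussian-kernel estimate at x0 of the density of the empirical measure, bandwidth h."""
    c = 1.0 / (len(points) * h * math.sqrt(2.0 * math.pi))
    return c * sum(math.exp(-0.5 * ((x0 - p) / h) ** 2) for p in points)


rng = random.Random(0)
particles = simulate_particles(
    n=100, horizon=1.0, n_steps=50, drift_F=lambda u: -u, sigma=1.0, rng=rng
)
# Fixed bandwidth chosen by hand; an adaptive rule would select h from the data.
estimate = kernel_density(particles, x0=0.0, h=0.4)
```

The sketch only uses the particle positions at the terminal time, whereas the abstract considers the observation of a whole trajectory of the system.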


Notes

  1. Usually required to control the model for convergence to equilibrium when T is large.

  2. In particular, it is a first step toward the interesting problem of testing the hypothesis \(F=0\) against a set of local alternatives that quantify how far F is from being constant.

References

  1. Abraham, K., Nickl, R.: On statistical Calderón problems. Mathematical Statistics and Machine Learning. To appear (2019). arXiv:1906.03486

  2. Angiuli, L., Lorenzi, L.: Compactness and invariance properties of evolution operators associated with Kolmogorov operators with unbounded coefficients. J. Math. Anal. Appl. 379(1), 125–149 (2011)


  3. Baladron, J., Fasoli, D., Faugeras, O., Touboul, J.: Mean-field description and propagation of chaos in networks of Hodgkin-Huxley and FitzHugh-Nagumo neurons. J. Math. Neurosci. 2, 1–50 (2012)


  4. Barlow, M.T., Yor, M.: Semimartingale inequalities via the Garsia-Rodemich-Rumsey lemma, and applications to local times. J. Funct. Anal. 49(2), 198–229 (1982)


  5. Benachour, S., Roynette, B., Talay, D., Vallois, P.: Nonlinear self-stabilizing processes I: existence, invariant probability, propagation of chaos. Stoch. Process. Appl. 75(2), 173–201 (1998)


  6. Birgé, L., Massart, P.: Minimum contrast estimators on sieves: exponential bounds and rates of convergence. Bernoulli 4(3), 329–375 (1998)


  7. Bogachev, V., Krylov, N., Röckner, M., Shaposhnikov, S.: Fokker–Planck–Kolmogorov Equations. Mathematical Surveys and Monographs (2015)

  8. Bolley, F., Guillin, A., Villani, C.: Quantitative concentration inequalities for empirical measures on non-compact spaces. Probab. Theory Relat. Fields 137(3–4), 541–593 (2007)


  9. Boumezoued, A., Hoffmann, M., Jeunesse, P.: Nonparametric adaptive inference of birth and death processes in a large population limit. Mathematical Statistics and Machine Learning. To appear (2019). arXiv:1903.00673

  10. Buldygin, V.V., Kozačenko, J.V.: Sub-Gaussian random variables. Ukrain. Mat. Zh. 32(6), 723–730 (1980)


  11. Burger, M., Capasso, V., Morale, D.: On an aggregation model with long and short range interactions. Nonlinear Anal. Real World Appl. 8(3), 939–958 (2007)


  12. Burkholder, D.L., Pardoux, É., Sznitman, A.S.: École d’Été de Probabilités de Saint-Flour XIX—1989, volume 1464 of Lecture Notes in Mathematics. Springer-Verlag, Berlin (1991). Papers from the school held in Saint-Flour, August 16–September 2, 1989, Edited by P. L. Hennequin

  13. Canuto, C., Fagnani, F., Tilli, P.: An Eulerian approach to the analysis of Krause’s consensus models. SIAM J. Control Optim. 50(1), 243–265 (2012)


  14. Cardaliaguet, P., Delarue, F., Lasry, J.-M., Lions, P.-L.: The master equation and the convergence problem in mean field games. In: Annals of Mathematics Studies, vol. 201. Princeton University Press, Princeton, NJ (2019)

  15. Cardaliaguet, P., Lehalle, C.: Mean field game of controls and an application to trade crowding. Math. Financ. Econ. 12(3), 335–363 (2019)


  16. Carmona, R., Delarue, F., et al.: Probabilistic Theory of Mean Field Games with Applications I–II. Springer (2018)

  17. Cattiaux, P., Guillin, A., Malrieu, F.: Probabilistic approach for granular media equations in the non-uniformly convex case. Probab. Theory Relat. Fields 140(1–2), 19–40 (2008)


  18. Chassagneux, J.-C., Szpruch, L., Tse, A.: Weak quantitative propagation of chaos via differential calculus on the space of measures (2019). arXiv:1901.02556v1

  19. Chazelle, B., Jiu, Q., Li, Q., Wang, C.: Well-posedness of the limiting equation of a noisy consensus model in opinion dynamics. J. Differ. Equ. 263(1), 365–397 (2017)


  20. Coghi, M., Deuschel, J.-M., Friz, P., Maurelli, M.: Pathwise McKean–Vlasov theory with additive noise (2018). arXiv:1812.11773

  21. Comte, F., Genon-Catalot, V.: Nonparametric drift estimation for IID paths of stochastic differential equations. Ann. Statist., to appear (2019)

  22. Doumic, M., Hoffmann, M., Krell, N., Robert, L., et al.: Statistical estimation of a growth-fragmentation model observed on a genealogical tree. Bernoulli 21(3), 1760–1799 (2015)


  23. Doumic, M., Hoffmann, M., Reynaud-Bouret, P., Rivoirard, V.: Nonparametric estimation of the division rate of a size-structured population. SIAM J. Numer. Anal. 50(2), 925–950 (2012)


  24. Fernandez, B., Méléard, S.: A Hilbertian approach for fluctuations on the McKean-Vlasov model. Stoch. Process. Appl. 71(1), 33–53 (1997)


  25. Fouque, J.P., Sun, L.H.: Systemic risk illustrated. In: Fouque, J.P., Langsam, J. (eds.) Handbook on Systemic Risk (2013)

  26. Fournier, N., Guillin, A.: On the rate of convergence in Wasserstein distance of the empirical measure. Probab. Theory Relat. Fields 162(3–4), 707–738 (2015)


  27. Friedman, A.: Partial Differential Equations of Parabolic Type. Courier Dover Publications (2008)

  28. Gärtner, J.: On the McKean-Vlasov limit for interacting diffusions. Math. Nachr. 137, 197–248 (1988)


  29. Genon-Catalot, V.: Estimation of the diffusion coefficient for diffusion processes: random sampling. Scand. J. Statist. 21(3), 193–221 (1994)


  30. Giesecke, K., Schwenkler, G., Sirignano, J.A.: Inference for large financial systems. Math. Finance 30(1), 3–46 (2020)


  31. Goldenshluger, A., Lepski, O.: Universal pointwise selection rule in multivariate function estimation. Bernoulli 14(4), 1150–1190 (2008)


  32. Goldenshluger, A.: Bandwidth selection in kernel density estimation: oracle inequalities and adaptive minimax optimality. Ann. Statist. 39(3), 1608–1632 (2011)


  33. Goldenshluger, A., Lepski, O.: On adaptive minimax density estimation on \(R^d\). Probab. Theory Relat. Fields 159(3–4), 479–543 (2014)


  34. Herrmann, S., Imkeller, P., Peithmann, D., et al.: Large deviations and a Kramers' type law for self-stabilizing diffusions. Ann. Appl. Probab. 18(4), 1379–1423 (2008)


  35. Hoang, V.H., Ngoc, T.M.P., Rivoirard, V., Tran, V.C.: Nonparametric estimation of the fragmentation kernel based on a PDE stationary distribution approximation (2019). arXiv:1710.09172v3

  36. Hoffmann, M., Olivier, A.: Nonparametric estimation of the division rate of an age dependent branching process. Stoch. Process. Appl. 126(5), 1433–1471 (2016)


  37. Johannes, J.: Deconvolution with unknown error distribution. Ann. Statist. 37(5A), 2301–2323 (2009)


  38. Jourdain, B.: Propagation of chaos and fluctuations for a moderate model with smooth initial data. Ann. Inst. H. Poincaré Probab. Statist. 34(6), 727–766 (1998)


  39. Jourdain, B., Tse, A.: Central limit theorem over non-linear functionals of empirical measures with applications to the mean-field fluctuation of interacting particle systems (2020). arXiv:2002.01458

  40. Karatzas, I., Shreve, S.E.: Brownian motion. In: Brownian Motion and Stochastic Calculus, pp. 47–127. Springer (1998)

  41. Kasonga, R.A.: Maximum likelihood theory for large interacting systems. SIAM J. Appl. Math. 50(3), 865–875 (1990)


  42. Krylov, N.V.: Lectures on Elliptic and Parabolic Equations in Hölder Spaces. Number 12. American Mathematical Soc. (1996)

  43. Lacker, D.: Mean field games and interacting particle systems. Preprint (2018)

  44. Lacker, D.: On a strong form of propagation of chaos for McKean-Vlasov equations. Electron. Commun. Probab. 23, Paper No. 45, 11 (2018)

  45. Lacour, C., Massart, P., Rivoirard, V.: Estimator selection: a new method with applications to kernel density estimation. Sankhya A 79(2), 298–335 (2017)


  46. Le Cam, L.: Asymptotic Methods in Statistical Decision Theory. Springer, New York (1986)


  47. Lepskiĭ, O.V.: A problem of adaptive estimation in Gaussian white noise. Teor. Veroyatnost. i Primenen. 35(3), 459–470 (1990)


  48. Low, M.G.: Nonexistence of an adaptive estimator for the value of an unknown probability density. Ann. Statist. 20(1), 598–602 (1992)


  49. Maïda, M., Nguyen, T.D., Ngoc, T.M.P., Rivoirard, V., Tran, V.C.: Statistical deconvolution of the free Fokker–Planck equation at fixed time (2020). arXiv:2006.11899

  50. Malrieu, F.: Logarithmic Sobolev inequalities for some nonlinear PDE’s. Stoch. Process. Appl. 95(1), 109–132 (2001)


  51. Manita, O., Shaposhnikov, S.: Nonlinear parabolic equations for measures. St. Petersburg Math. J. 25(1), 43–62 (2014)


  52. Massart, P.: Concentration inequalities and model selection, vol. 6. Springer (2007)

  53. McKean, H.P., Jr.: A class of Markov processes associated with nonlinear parabolic equations. Proc. Natl. Acad. Sci. USA 56(6), 1907 (1966)


  54. Méléard, S.: Asymptotic behaviour of some interacting particle systems; McKean–Vlasov and Boltzmann models. In: Probabilistic models for nonlinear partial differential equations, pp. 42–95. Springer (1996)

  55. Mogilner, A.: A non-local model for a swarm. J. Math. Biol. 38(6), 534–570 (1999)


  56. Monard, F., Nickl, R., Paternain, G.P.: Consistent inversion of noisy non-Abelian X-ray transforms (2019). arXiv:1905.00860

  57. Nadaraja, È.A.: On a regression estimate. Teor. Verojatnost. i Primenen. 9, 157–159 (1964)


  58. Nickl, R.: Bernstein–von Mises theorems for statistical inverse problems I: Schrödinger equation (2017). arXiv:1707.01764

  59. Nickl, R.: On Bayesian inference for some statistical inverse problems with partial differential equations. Bernoulli News 24(2), 5–9 (2017)


  60. Oelschläger, K.: A law of large numbers for moderately interacting diffusion processes. Z. Wahrsch. Verw. Gebiete 69(2), 279–322 (1985)


  61. Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, volume 293 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 3rd edn. Springer, Berlin (1999)

  62. Stroock, D.W., Srinivasa Varadhan, S.R.: Multidimensional Diffusion Processes, Volume 233 of Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer, Berlin (1979)

  63. Sznitman, A.-S.: Nonlinear reflecting diffusion process, and the propagation of chaos and fluctuations associated. J. Funct. Anal. 56(3), 311–336 (1984)


  64. Tanaka, H.: Central limit theorem for a simple diffusion model of interacting particles. Hiroshima Math. J. 11(2), 415–423 (1981)



Acknowledgements

Informal discussions with colleagues at CEREMADE are gratefully acknowledged; we thank in particular, Pierre Cardaliaguet, Djalil Chafaï and Stéphane Mischler. We also thank Denis Belomestny and Nicolas Fournier for insightful comments. This work partially answers a problem that was posed to us by Sylvie Méléard almost two decades ago (at a time we did not have the proper tools to address it!).

Author information

Corresponding author

Correspondence to Marc Hoffmann.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix


1.1 Characterisation of sub-Gaussian random variables

We recall a classical definition of a sub-Gaussian random variable. A recommended reference is [10].

Definition 30

A real-valued random variable Z such that \(\mathbb {E}[Z]=0\) is \(\lambda ^2\) sub-Gaussian if one of the following conditions is satisfied, each statement implying the next:

  (i) Laplace transform condition:

    $$\begin{aligned} \mathbb {E}\big [\exp (zZ)\big ] \le \exp \big (\tfrac{1}{2}\lambda ^2 z^2\big )\;\;\text {for every}\;\;z \in \mathbb {R}. \end{aligned}$$

  (ii) Moment condition:

    $$\begin{aligned} \mathbb {E}\big [Z^{2p}\big ] \le p! (4\lambda ^2)^p\;\;\text {for every integer}\;\;p \ge 1. \end{aligned}$$

  (iii) Orlicz condition:

    $$\begin{aligned} \mathbb {E}\big [\exp \big (\tfrac{1}{8\lambda ^2}Z^2\big )\big ] \le 2. \end{aligned}$$

  (iv) Laplace transform condition (bis):

    $$\begin{aligned} \mathbb {E}\big [\exp (zZ)\big ] \le \exp \big (\tfrac{24}{2}\lambda ^2 z^2\big )\;\;\text {for every}\;\;z \in \mathbb {R}. \end{aligned}$$

We will also use the following additive property of sub-Gaussian random variables: if the random variables \(Z_i\) are independent and \(\lambda _i^2\) sub-Gaussian, then \(\rho (Z_1+Z_2)\) is \(|\rho |^2(\lambda _1^2+\lambda _2^2) \) sub-Gaussian for every \(\rho \in \mathbb {R}\).
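As a reader's check of this additive property, here is a short derivation, assuming that sub-Gaussianity is understood in the sense of the Laplace transform condition (i). By independence, for every \(z \in \mathbb {R}\),

$$\begin{aligned} \mathbb {E}\big [\exp \big (z\rho (Z_1+Z_2)\big )\big ]&= \mathbb {E}\big [\exp (z\rho Z_1)\big ]\,\mathbb {E}\big [\exp (z\rho Z_2)\big ] \\&\le \exp \big (\tfrac{1}{2}\lambda _1^2 \rho ^2 z^2\big )\exp \big (\tfrac{1}{2}\lambda _2^2 \rho ^2 z^2\big ) = \exp \big (\tfrac{1}{2}|\rho |^2(\lambda _1^2+\lambda _2^2) z^2\big ). \end{aligned}$$

Moreover \(\mathbb {E}[\rho (Z_1+Z_2)]=0\), so the centring requirement of Definition 30 is preserved.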

1.2 Proof of Lemma 20

By Assumption 4, the estimate

$$\begin{aligned} |b(t,x,\mu _t)|\le b_0+|b|_{\mathrm {Lip}}\big (|x|+\mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_t^i\big |\big ]\big ) \end{aligned}$$

holds for every \((t,x)\in [0,T] \times \mathbb {R}^d\), where \( b_0 = \sup _{t \in [0,T]}|b(t,0,\delta _0)|\). Remember that

$$\begin{aligned} \overline{B}^i_t = \int _0^t c(s,X_s^i)^{-1/2}\big (dX_s^i-b(s,X_s^i,\mu _s) ds\big ), \;\;1 \le i \le N, \end{aligned}$$

are independent d-dimensional \(\overline{\mathbb P}^{N}\)-Brownian motions. By Minkowski’s and Jensen’s inequality, we have

$$\begin{aligned} \big |X_t^i \big |&\le \big |X_0^i\big |+\int _0^t \big |b(s,X_s^i,\mu _s)\big |ds + \big |\int _0^t\sigma (s,X_s^i)d\overline{B}_{s}^i\big | \\&\le \big |X_0^i\big |+ b_0t+|b|_{\mathrm {Lip}}\int _0^t \big (\big |X_s^i\big |+ \mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_s^i\big |\big ]\big )ds+ \big |\int _0^t\sigma (s,X_s^i)d\overline{B}_{s}^i\big | \\&\le \big |X_0^i\big |+ b_0T+|b|_{\mathrm {Lip}}\int _0^t \big (\big |X_s^i\big |+ \mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_s^i\big |\big ]\big )ds+\zeta _{T}^{i}, \end{aligned}\tag{72}$$

where \(\zeta _{T}^{i} = \sup _{0 \le t \le T}\big |\int _0^t \sigma (s,X_s^i)d\overline{B}_{s}^i\big | \). Integrating w.r.t. \(\overline{\mathbb P}^N\), we also have

$$\begin{aligned} \mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_t^i \big |\big ] \le \mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_0^i\big |\big ]+ b_0T+2|b|_{\mathrm {Lip}}\int _0^t \mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_s^i\big |\big ]ds+\mathbb {E}_{\overline{\mathbb P}^N}\big [\zeta _{T}^{i}\big ]. \end{aligned}$$

We infer by Grönwall’s lemma

$$\begin{aligned} \mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_t^i \big |\big ] \le \big (\mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_0^i\big |\big ]+ b_0T+\mathbb {E}_{\overline{\mathbb P}^N}\big [\zeta _{T}^{i}\big ]\big )\mathrm {e}^{2|b|_{\mathrm {Lip}}t} \end{aligned}$$

and plugging this estimate in (72) we infer

$$\begin{aligned} \big |X_t^i \big |&\le \big |X_0^i\big |+ b_0T+|b|_{\mathrm {Lip}}\int _0^t \big |X_s^i\big |ds \\&\quad + \big (\mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_0^i\big |\big ]+ b_0T+\mathbb {E}_{\overline{\mathbb P}^N}\big [\zeta _{T}^{i}\big ]\big )\mathrm {e}^{2|b|_{\mathrm {Lip}}T}+\zeta _T^i. \end{aligned}$$

Applying Grönwall’s lemma again, we derive

$$\begin{aligned} \big |X_t^i \big |&\le \big (\big |X_0^i\big |+ b_0T+ \big (\mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_0^i\big |\big ]+ b_0T+\mathbb {E}_{\overline{\mathbb P}^N}\big [\zeta _{T}^{i}\big ]\big )\mathrm {e}^{2|b|_{\mathrm {Lip}}T}+\zeta _T^i\big )\mathrm {e}^{|b|_{\mathrm {Lip}}t} \\&\le \big (\big |X_0^i\big |+\mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_0^i\big |\big ]+2b_0T+\zeta _T^i+\mathbb {E}_{\overline{\mathbb P}^N}\big [\zeta _{T}^{i}\big ]\big )\mathrm {e}^{3|b|_{\mathrm {Lip}}T}. \end{aligned}$$

Taking the exponent 2p and expectation w.r.t. \(\overline{\mathbb P}^N\), we further obtain

$$\begin{aligned} \mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_t^i \big |^{2p}\big ]&\le 5^{2p-1}\big (2\mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_0^i\big |^{2p}\big ]+(2b_0T)^p+2\mathbb {E}_{\overline{\mathbb P}^N}\big [\big (\zeta _{T}^{i})^{2p}\big ]\big )\mathrm {e}^{3|b|_{\mathrm {Lip}}Tp} \\&\le C_6^p\big (\mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_0^i\big |^{2p}\big ]+(b_0T)^p+\mathbb {E}_{\overline{\mathbb P}^N}\big [\big (\zeta _{T}^{i})^{2p}\big ]\big ) \end{aligned}$$

with \(C_6=50\,\mathrm {e}^{3|b|_{\mathrm {Lip}}T}\). By Assumption 1, the initial condition \(|X_0^i|\) satisfies

$$\begin{aligned} \mathbb {E}_{\overline{\mathbb {P}}^N}\big [\exp (\gamma _0|X_0^i|^2)\big ] = 1 + \sum _{p \ge 1}\frac{\gamma _0^p}{p!}\mathbb {E}_{\overline{\mathbb {P}}^N}\big [\big |X_0^i\big |^{2p}\big ] \le \gamma _1 \end{aligned}$$

hence for every \(p \ge 1\), we obtain

$$\begin{aligned} \mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_0^i\big |^{2p}\big ] \le p! \big (\tfrac{\gamma _1}{\gamma _0}\big )^p \end{aligned}$$

since \(\gamma _1 \ge 1\). By Doob’s maximal inequality and the Burkholder–Davis–Gundy inequality with constant \((C^\star )^{p/2} p^{p/2}\) for some numerical constant \(C^\star \), see e.g. Barlow and Yor [4], we also have

$$\begin{aligned} \mathbb {E}_{\overline{\mathbb P}^N}\big [\big (\zeta _{T}^{i})^{2p}\big ]&\le \Big (\frac{2p}{2p-1}\Big )^{2p}\,\mathbb {E}_{\overline{\mathbb P}^N}\Big [\Big |\int _0^T\sigma (t,X_t^i)d\overline{B}^i_t\Big |^{2p}\Big ] \\&\le \Big (\frac{2p}{2p-1}\Big )^{2p}(2C^\star )^p p^{p}\mathbb {E}_{\overline{\mathbb P}^N}\Big [\big (\int _0^T\mathrm {Tr}\big (c(t,X_t^i)\big )dt\big )^p\Big ] \\&\le p^{p} (8C^\star T\big |\mathrm {Tr}(c)\big |_\infty )^p \le p! \, \big (8C^\star \mathrm {e}T \big |\mathrm {Tr}(c)\big |_\infty \big )^p. \end{aligned}$$
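The numerical constants in the last line can be verified as follows; this is a reader's check, assuming only that \(p \ge 1\) is an integer. The map \(p \mapsto (2p/(2p-1))^{2p}\) is nonincreasing and equals 4 at \(p=1\), the integral is bounded by \(T|\mathrm {Tr}(c)|_\infty \), and the exponential series gives the comparison between \(p^p\) and \(p!\):

$$\begin{aligned} \Big (\frac{2p}{2p-1}\Big )^{2p} \le 4, \qquad 4\,(2C^\star )^p \le (8C^\star )^p, \qquad p^p \le \mathrm {e}^p\, p! \;\;\text {since}\;\; \mathrm {e}^p = \sum _{k \ge 0}\frac{p^k}{k!} \ge \frac{p^p}{p!}. \end{aligned}$$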

Putting these estimates together, we conclude

$$\begin{aligned} \mathbb {E}_{\overline{\mathbb P}^N}\big [\big |X_t^i \big |^{2p}\big ]&\le p! \,C_6^{p}\Big (\tfrac{\gamma _1}{\gamma _0}+T(b_0+8C^\star \mathrm {e}\big |\mathrm {Tr}(c)\big |_\infty )\Big )^p \end{aligned}$$

and Lemma 20 is established with \(C_2 = C_6\big (\tfrac{\gamma _1}{\gamma _0}+T(b_0+8C^\star \mathrm {e}\big |\mathrm {Tr}(c)\big |_\infty )\big )\).

1.3 Proof of Lemma 22

Fix \(\mathcal I_k = \{i_1,\ldots , i_k\} \subset \{1,\ldots , N\}\). For \(g: [0,T] \times (\mathbb {R}^d)^{k} \times (\mathbb {R}^d)^\ell \rightarrow \mathbb {R}^d\), we define

$$\begin{aligned} g_{\mathcal I_k}(t,y^\ell ) = g(t,X_t^{i_1},\ldots , X_t^{i_k}, y^\ell ).\end{aligned}$$

For technical convenience, we establish a slightly stronger result, replacing \(\mathcal V_{2p}^N\big (f(t,\cdot )\big )\) in (46) by

$$\begin{aligned} {\mathcal V}_{2p, \ell }^N\big (g_{\mathcal I_{k-\ell +1}}(t,\cdot )\big ) = \mathbb {E}_{\overline{\mathbb {P}}^N}\Big [\big |\int _{(\mathbb {R}^d)^{\ell }} g(t,X_t^{i_1},X_t^{i_2},\ldots , X_t^{i_{k-\ell +1}}, y^{\ell })(\mu _t^N-\mu _t)^{\otimes \ell }(dy^{\ell })\big |^{2p}\Big ] \end{aligned}$$

for every \(\mathcal I_{k-\ell +1} \subset \{1,\ldots , N\}\) with cardinality \(k-\ell +1\) and every function \(g: [0,T] \times (\mathbb {R}^d)^{k-\ell +1} \times (\mathbb {R}^d)^\ell \rightarrow \mathbb {R}^d\), Lipschitz continuous in the space variables; such functions define in turn a class \(\mathcal G_{k-\ell +1,\ell }\). In particular, \(\mathcal V_{2p,\ell }^N\big (f(t,\cdot )\big )\) and \({\mathcal V}_{2p, \ell }^N\big (g_{\mathcal I_{k-\ell +1}}(t,\cdot )\big )\) agree for \(\ell = k\), in which case the class \(\mathcal G_{1,k}\) coincides with \(\mathcal G_k\), and we obtain Lemma 22. We prove the result by induction.

Step 1: The case \(\ell = 1\). For \(g\in \mathcal G_{k,1}\), \(x^{k}\in (\mathbb {R}^d)^{k}\) and \(\mathcal I \subset \{1,\ldots ,N\}\), let

$$\begin{aligned} \Lambda _t^{\mathcal I}(g,x^{k}) = \int _{\mathbb {R}^d} g(t,x^{k},y)(\mu ^{\mathcal I}_t- \mu _t)(dy), \end{aligned}$$

where we write \(\mu ^{\mathcal I}_t(dx) = |\mathcal I|^{-1}\sum _{i \in \mathcal I}\delta _{X_t ^i}(dx)\) for the empirical measure restricted to \(\mathcal I\). Observe that \(\Lambda _t^{\mathcal I}(g,x^{k})\) is a sum of independent and centred random variables under \(\overline{\mathbb P}^{N}\). We write

$$\begin{aligned} \Lambda _t^{\{1,\ldots , N\}}(g, X_t^{i_1},\ldots , X_t^{i_k})&= N^{-1}\sum _{i \in \mathcal I_k}\Big (g(t,X_t^{i_1},\ldots , X_t^{i_k},X_t^{i})\\&\quad -\int _{\mathbb {R}^d}g(t, X_t^{i_1},\ldots , X_t^{i_k},y)\mu _t(dy)\Big )\\&\quad +\frac{N-k}{N} \Lambda _t^{\mathcal I_k^c}(g, X_t^{i_1},\ldots , X_t^{i_k}), \end{aligned}$$

since \(|\mathcal I_k|=k\). We obtain the decomposition

$$\begin{aligned} \mathcal V_{2p,1}^N\big (g_{\mathcal I_k}(t,\cdot )\big ) = \mathbb {E}_{\overline{\mathbb {P}}^N}\Big [\big | \Lambda _t^{\{1,\ldots , N\}}(g, X_t^{i_1},\ldots , X_t^{i_k}) \big |^{2p}\Big ] \le 2^{2p-1}(I+II), \end{aligned}$$

with

$$\begin{aligned} I&=\frac{k^{2p-1}}{N^{2p}}\sum _{i \in \mathcal I_k}\Big (\mathbb {E}_{\overline{\mathbb {P}}^N}\Big [\big |g(t,X_t^{i_1},\ldots , X_t^{i_k},X_t^{i})-\int _{\mathbb {R}^d}g(t, X_t^{i_1},\ldots , X_t^{i_k},y)\mu _t(dy)\big |^{2p}\Big ]\Big ),\\ II&= \Big (\frac{N-k}{N}\Big )^{2p}\mathbb {E}_{\overline{\mathbb {P}}^N}\Big [\big |\Lambda _t^{\mathcal I_k^c}(g, X_t^{i_1},\ldots , X_t^{i_k})\big |^{2p}\Big ]. \end{aligned}$$

The term I is controlled by the smoothness of g:

$$\begin{aligned} I&\le \frac{k^{2p-1}}{N^{2p}}|g(t,\cdot )|_{\mathrm {Lip}}^{2p} \sum _{i \in \mathcal I_k}\mathbb {E}_{\overline{\mathbb {P}}^N}\Big [\int _{\mathbb {R}^d}|X_t^i-y|^{2p}\mu _t(dy)\Big ] \le N^{-2p} p! \big (k^{2}4C_2\big )^p |g(t,\cdot )|_{\mathrm {Lip}}^{2p}, \end{aligned}$$

where the last estimate stems from Lemma 20. For the term II, writing \(g = (g^1,\ldots , g^d)\) where the functions \(g^j\) are real-valued, we further have

$$\begin{aligned} II \le \Big (\frac{N-k}{N}\Big )^{2p} d^{2p-1}\sum _{j = 1}^d \mathbb {E}_{\overline{\mathbb {P}}^N}\big [\Lambda _t^{\mathcal I_k^c}(g^j, X_t^{i_1},\ldots , X_t^{i_k})^{2p}\big ]. \end{aligned}\tag{73}$$

Moreover, for every \(x^{k}\in (\mathbb {R}^{d})^{k}\), the term

$$\begin{aligned} \Lambda _t^{\mathcal I_k^c}(g^j,x^{k}) = \frac{1}{N-k}\sum _{i \in \mathcal I_k^c}\big (g^j(t,x^{k},X_t^{i})- \mathbb {E}_{\overline{\mathbb {P}}^{N}}\big [g^j(t,x^{k},X_t^{i})\big ]\big ) \end{aligned}$$

is the sum of independent centred random variables that are independent of \((X_t^{i_1}, \ldots , X_t^{i_k})\) and

$$\begin{aligned} g^j(t,x^{k},X_t^{i})- \mathbb {E}_{\overline{\mathbb {P}}^{N}}\big [g^j(t,x^{k},X_t^{i})\big ] \end{aligned}$$

is \(\lambda ^2\) sub-Gaussian with \(\lambda ^2 = 24C_2|g^j(t,\cdot )|_{\mathrm {Lip}}^{2}\) via the same estimate as for I and the fact that (ii) implies (iv) in Definition 30. Thanks to the additivity property of independent sub-Gaussian random variables, we further infer that \(\Lambda _t^{\mathcal I_k^c}(g^j,x^{k})\) is \(\widetilde{\lambda }^2\) sub-Gaussian with

$$\begin{aligned} \widetilde{\lambda }^2 = \frac{1}{N-k}\lambda ^2 = \frac{1}{N-k}24C_2|g^j(t,\cdot )|_{\mathrm {Lip}}^{2}. \end{aligned}$$

Conditioning on \((X_t^{i_1}, \ldots , X_t^{i_k})\), we derive

$$\begin{aligned} \mathbb {E}_{\overline{\mathbb {P}}^N}\big [\Lambda _t^{\mathcal I_k^c}(g^j, X_t^{i_1},\ldots , X_t^{i_k})^{2p}\big ] \le \frac{p! (96C_2)^p}{(N-k)^p} |g^j(t,\cdot )|_{\mathrm {Lip}}^{2p} \end{aligned}$$

by (ii) of Definition 30. Plugging this estimate in (73), we obtain

$$\begin{aligned} II&\le \frac{p! (96C_2d^2)^p}{(N-k)^p} |g(t,\cdot )|_{\mathrm {Lip}}^{2p} \end{aligned}$$

and putting together our estimates for I and II, we conclude

$$\begin{aligned} \mathcal V_{2p, 1}^N\big (g(t,\cdot )\big ) \le \frac{p! K_1^p}{(N-k)^p} |g(t,\cdot )|_{\mathrm {Lip}}^{2p} \end{aligned}$$

with \(K_1 = 16(k^2+24d^2)C_2\). This establishes Lemma 22 for g in the case \(\ell = 1\).
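The \(N^{-p}\) scaling of the bound in Step 1 can be illustrated numerically. The sketch below is a simplified illustration, not part of the proof: it assumes i.i.d. standard Gaussian samples in dimension one in place of the particles (which are independent under \(\overline{\mathbb P}^{N}\)), takes the hypothetical Lipschitz test function g = sin with no frozen variables (so that its mean under the standard Gaussian law is 0), and estimates the 2p-th moment of the centred empirical mean by Monte Carlo. Function names and sample sizes are illustrative.

```python
import math
import random


def empirical_fluctuation_moment(n, p, g, mean_g, n_rep, rng):
    """Monte Carlo estimate of E| (1/n) sum_i g(X_i) - E g(X) |^(2p) for i.i.d. N(0,1) samples."""
    total = 0.0
    for _ in range(n_rep):
        avg = sum(g(rng.gauss(0.0, 1.0)) for _ in range(n)) / n
        total += abs(avg - mean_g) ** (2 * p)
    return total / n_rep


rng = random.Random(1)
p = 2
# sin is odd and the standard Gaussian is symmetric, hence E sin(X) = 0.
m_small = empirical_fluctuation_moment(50, p, math.sin, 0.0, 2000, rng)
m_large = empirical_fluctuation_moment(200, p, math.sin, 0.0, 2000, rng)
# An N^{-p} decay suggests a ratio of roughly (200 / 50) ** p = 16, up to Monte Carlo error.
ratio = m_small / m_large
```

Multiplying the sample size by four should divide the fourth moment by roughly sixteen, in line with the factor \((N-k)^{-p}\) in the bound.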

Step 2: We assume that (46) holds for \(\mathcal V_{2p,\ell }^N\big (g_{{\mathcal I}_{k-\ell +1}}(t,\cdot )\big )\), for every \(\mathcal I_{k-\ell +1} \subset \{1,\ldots , N\}\) with cardinality \(k-\ell +1\) and every \(g \in \mathcal G_{k-\ell +1,\ell }\) with \(\ell < k\). Let \(g \in {\mathcal G}_{k-\ell ,\ell +1}\) and \({\mathcal I}_{k-\ell } \subset \{1,\ldots , N\}\). We have:

$$\begin{aligned} \mathcal V_{2p,\ell +1}^N\big (g_{{\mathcal I}_{k-\ell }}(t,\cdot )\big )&= \mathbb {E}_{\overline{\mathbb {P}}^{N}}\Big [\big |\int _{(\mathbb {R}^d)^{\ell +1}} g_{\mathcal I_{k-\ell }}(t,y^{\ell +1})(\mu _t^N-\mu _t)^{\otimes (\ell +1)}(dy^{\ell +1})\big |^{2p}\Big ] \\&\le 2^{2p-1}(III + IV), \end{aligned}$$

with

$$\begin{aligned} III&= N^{-1}\sum _{i = 1}^N\mathbb {E}_{\overline{\mathbb {P}}^{N}}\Big [\big |\int _{(\mathbb {R}^d)^{\ell }} g\big (t,X_t^{i_1},\ldots , X_{t}^{i_{k-\ell }}, (X_t^i,y^\ell )\big )(\mu _t^N-\mu _t)^{\otimes \ell }(dy^{\ell }) \big |^{2p}\Big ], \\ IV&= \int _{\mathbb {R}^d}\mathbb {E}_{\overline{\mathbb {P}}^{N}}\Big [\big |\int _{(\mathbb {R}^d)^{\ell }} g\big (t,X_t^{i_1},\ldots , X_{t}^{i_{k-\ell }}, (y,y^\ell )\big )(\mu _t^N-\mu _t)^{\otimes \ell }(dy^{\ell }) \big |^{2p}\Big ]\mu _t(dy). \end{aligned}$$

Let \(i_0 \in \mathcal I_{k-\ell }^c\) and put \(\mathcal I_{k-\ell +1} = \mathcal I_{k-\ell } \cup \{i_0\}\). The term IV can be rewritten as

$$\begin{aligned} IV = \int _{\mathbb {R}^d}\mathcal V^N_{2p,\ell }\big (g_{\mathcal I_{k-\ell +1}}'(t,\cdot )(y)\big )\mu _t(dy), \end{aligned}$$

where, for fixed \(y \in \mathbb {R}^d\), the function \(g'(t,x_{i_1}, \ldots x_{i_{k-\ell }}, x_{i_0},y^\ell )(y) = g(t,x_{i_1}, \ldots x_{i_{k-\ell }},(y,y^\ell ))\) with the artificial variable \(x_{i_0}\) belongs to \(\mathcal G_{k-\ell +1,\ell }\). By the induction hypothesis and noting that \(\sup _{y \in \mathbb {R}^d}|g'(t,\cdot ,y)|_{\mathrm {Lip}} \le |g(t,\cdot )|_{\mathrm {Lip}}\), we infer

$$\begin{aligned} IV \le \frac{p! K_\ell ^p}{(N-k)^p}|g(t,\cdot )|_{\mathrm {Lip}}^{2p}. \end{aligned}$$

We split the sum in III over indices in \(\mathcal I_{k-\ell }\) and \(\mathcal I_{k-\ell }^c\). If \(i \in \mathcal I_{k-\ell }\), in the same way as for IV, we write

$$\begin{aligned} g\big (t,X_t^{i_1},\ldots , X_{t}^{i_{k-\ell }}, (X_t^i,y^\ell )\big ) = g''_{\mathcal I_{k-\ell +1}}(t,y^\ell ) \end{aligned}$$

with \(\mathcal I_{k-\ell +1} = \mathcal I_{k-\ell } \cup \{i_0\}\) for some arbitrary \(i_0 \in \mathcal I_{k-\ell }^c\) and with \(g''(t,x_{i_1},\ldots , x_{i_{k-\ell }},x_{i_0},y^\ell ) =g\big (t,x_{i_1},\ldots , x_{i_{k-\ell }},(x_i,y^\ell )\big )\), where i coincides with one of the \(i_j \in \mathcal I_{k-\ell }\). Also, \(g''\) belongs to \(\mathcal G_{k-\ell +1,\ell }\). If \(i \in \mathcal I_{k-\ell }^c\), we write

$$\begin{aligned} g\big (t,X_t^{i_1},\ldots , X_{t}^{i_{k-\ell }}, (X_t^i,y^\ell )\big ) = g'''_{ \mathcal I_{k-\ell } \cup \{i\}}(t,y^\ell ) \end{aligned}$$

with \(g'''(t,x_{i_1},\ldots , x_{i_{k-\ell }}, x_i,y^\ell ) =g\big (t,x_{i_1},\ldots , x_{i_{k-\ell }},(x_i,y^\ell )\big )\) and \(g'''\) belongs to \(\mathcal G_{k-\ell +1,\ell }\) as well. We infer

$$\begin{aligned} III&\le (k-\ell )N^{-1}\mathcal V^N_{2p,\ell }\big (g''_{\mathcal I_{k-\ell +1}}(t,\cdot )\big )+ N^{-1}\sum _{i \in \mathcal I_{k-\ell }^c}\mathcal V^N_{2p,\ell }\big (g'''_{ \mathcal I_{k-\ell } \cup \{i\}}(t,\cdot )\big ) \\&\le \frac{p! K_\ell ^p}{(N-k)^p}|g(t,\cdot )|_{\mathrm {Lip}}^{2p} \end{aligned}$$

by the induction hypothesis and noting again that \(|g''(t,\cdot )|_{\mathrm {Lip}}\) and \(|g'''(t,\cdot )|_{\mathrm {Lip}}\) are controlled by \(|g(t,\cdot )|_{\mathrm {Lip}}\). We conclude

$$\begin{aligned} \mathcal V_{2p,\ell +1}^N\big (g_{{\mathcal I}_{k-\ell }}(t,\cdot )\big ) \le 2^{2p}\frac{p! K_\ell ^p}{(N-k)^p}|g(t,\cdot )|_{\mathrm {Lip}}^{2p} =\frac{p! K_{\ell +1}^p}{(N-k)^p}|g(t,\cdot )|_{\mathrm {Lip}}^{2p} \end{aligned}$$

with \(K_{\ell +1} = 4 K_\ell \). The proof of Lemma 22 is complete.
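For reference, the recursion can be unrolled explicitly; this is a reader's computation from the constants stated above, not a statement taken from the source. Starting from \(K_1 = 16(k^2+24d^2)C_2\) and \(K_{\ell +1} = 4K_\ell \),

$$\begin{aligned} K_\ell = 4^{\ell -1}K_1 \;\;\text {for}\;\;1 \le \ell \le k, \qquad \text {hence}\qquad K_k = 4^{k+1}(k^2+24d^2)C_2. \end{aligned}$$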

1.4 (Sketch of) proof of Proposition 13

Step 1: Thanks to Chapters 6 and 9 of [7], since F and G are bounded and \(\mu _0\) satisfies Assumption 1, it can be shown that (3) admits a unique probability solution \(\mu \) in the sense of [7], absolutely continuous w.r.t. the Lebesgue measure, whose density we still denote by \(\mu (t,x) = \mu _t(x)\). Moreover \(\mu \in \mathcal {H}_{\text {loc}}^{\delta /2,\delta } = \cap _{(t_0,x_0)\in (0,T)\times \mathbb {R}^d} \mathcal H^{\delta /2,\delta }(t_0,x_0)\) for every \(0< \delta < 1\). The main arguments for these properties rely on the existence of a suitable Lyapunov function associated to (3), following the terminology of [51] and [7] (for instance \(x \mapsto 1+ |x|^2\)), together with Sobolev embeddings.

Step 2: Define

$$\begin{aligned} \widetilde{a}_k(t,x)=G^k(x)+\int _{\mathbb {R}^d}F^k(x-y)\mu _t(y)dy,\;\;k=1,\ldots ,d,\end{aligned}$$

and

$$\begin{aligned} \widetilde{a}(t,x) = \text {div} (G(x) + \int _{\mathbb {R}^d}F(x-y)\mu _t(y)dy), \end{aligned}$$

which are well defined since \(\beta ,\beta ' >1\). Consider next the Cauchy problem associated to (3) in its strong form:

$$\begin{aligned} \left\{ \begin{array}{l} \partial _{t}\widetilde{\mu }_t = \frac{1}{2}\sigma ^2\Delta \widetilde{\mu }_t - \sum _{k=1}^d \widetilde{a}_k(t,\cdot ) \partial _{k}\widetilde{\mu }_t - \widetilde{a}(t,\cdot )\widetilde{\mu }_t, \\ \widetilde{\mu }_{t = 0} = \mu _0. \end{array} \right. \end{aligned}\tag{74}$$

Taking \(\delta =\beta -\lfloor \beta \rfloor \) we obtain \(\widetilde{a}_k, \widetilde{a} \in \mathcal {C}^{(\beta -\lfloor \beta \rfloor )/2,\beta -\lfloor \beta \rfloor }_{\text {loc}}\) by Step 1.

Step 3: Using \(\text {inf} \; \widetilde{a} > -\infty \) and the existence of a Lyapunov function associated to the problem, by Theorem 2.3 of [2], there exists a unique solution \(\widetilde{\mu }\) of (74). Moreover, \(\widetilde{\mu }\) is continuous on \((0,T) \times \mathbb {R}^d\) and

$$\begin{aligned} \widetilde{\mu } \in \mathcal {C}^{1+(\beta -\lfloor \beta \rfloor )/2,2+ \beta -\lfloor \beta \rfloor }_{\text {loc}}. \end{aligned}$$

It is also the unique solution defined in Theorem 12 of Chapter 1 of [27], therefore the unique integrable solution of the problem (3). By uniqueness, \(\mu = \widetilde{\mu }\).

Step 4: If \(\lfloor \beta \rfloor = 1 \), we obtain \(\mu \in \mathcal {H}^{(1+\beta )/2, 1+\beta }(t_0,x_0)\) for every \((t_0,x_0)\in (0,T)\times \mathbb {R}^d\). Otherwise, we can iterate the process thanks to the results of Section 8.12 in [42]. Successively:

  • Since \(\partial _{x_{k'}}\widetilde{a}_k\) and \(\partial _{x_k} \widetilde{a}\) are in \(\mathcal {C}^{(\beta -\lfloor \beta \rfloor )/2,\beta -\lfloor \beta \rfloor }_{\text {loc}}\), we have

    $$\begin{aligned} \partial _{x_k} \mu \in \mathcal {C}^{1+(\beta -\lfloor \beta \rfloor )/2,2+ \beta -\lfloor \beta \rfloor }_{\text {loc}}. \end{aligned}$$
  • Since \(\partial _{t}\widetilde{a}_k\) and \(\partial _{t} \widetilde{a}\) are now in \(\mathcal {C}^{(\beta -\lfloor \beta \rfloor )/2,\beta -\lfloor \beta \rfloor }_{\text {loc}}\), we have

    $$\begin{aligned} \partial _{t} \mu \in \mathcal {C}^{1+(\beta -\lfloor \beta \rfloor )/2,2+ \beta -\lfloor \beta \rfloor }_{\text {loc}}. \end{aligned}$$

Therefore, if \(\lfloor \beta \rfloor = 2 \), we obtain \(\mu \in \mathcal {H}^{(1+\beta )/2, 1+\beta }(t_0,x_0)\) for every \((t_0,x_0)\in (0,T)\times \mathbb {R}^d\). Otherwise, we can iterate the process again, and so on. The result follows.


Cite this article

Della Maestra, L., Hoffmann, M. Nonparametric estimation for interacting particle systems: McKean–Vlasov models. Probab. Theory Relat. Fields 182, 551–613 (2022). https://doi.org/10.1007/s00440-021-01044-6


  • DOI: https://doi.org/10.1007/s00440-021-01044-6
