Modeling of missing dynamical systems: deriving parametric models using a nonparametric framework

Jiang, Shixiao W.; Harlim, John

doi:10.1007/s40687-020-00217-4

Published: 08 July 2020

Volume 7, article number 16, (2020)

Research in the Mathematical Sciences

Shixiao W. Jiang^1 & John Harlim^{1,2,3}

Abstract

In this paper, we consider modeling missing dynamics with a nonparametric non-Markovian model, constructed using the theory of kernel embedding of conditional distributions on appropriate reproducing kernel Hilbert spaces (RKHS), equipped with orthonormal basis functions. Depending on the choice of the basis functions, the closure model resulting from this nonparametric formulation takes the form of a parametric model. This suggests that the success of various parametric modeling approaches proposed in different application domains can be understood through their RKHS representations. When the missing dynamical terms evolve faster than the relevant observable of interest, the proposed approach is consistent with the effective dynamics derived from the classical averaging theory. In the linear Gaussian case without the time-scale gap, we show that the proposed non-Markovian model with a very long memory yields an accurate estimation of the nontrivial autocovariance function for the relevant variable of the full dynamics. The supporting numerical results on instructive nonlinear dynamics show that the proposed approach is able to replicate high-dimensional missing dynamical terms on problems with and without the separation of temporal scales.

References

1. Berry, T., Harlim, J.: Linear theory for filtering nonlinear multiscale systems with model error. Proc. R. Soc. A 20140168, 168 (2014)
2. Berry, T., Harlim, J.: Semiparametric modeling: correcting low-dimensional model error in parametric models. J. Comput. Phys. 308, 305–321 (2016)
3. Berry, T., Harlim, J.: Correcting biased observation model error in data assimilation. Mon. Weather Rev. 145(7), 2833–2853 (2017)
4. Chorin, A., Hald, O., Kupferman, R.: Optimal prediction with memory. Phys. D Nonlinear Phenom. 166(3), 239–257 (2002)
5. Chorin, A., Stinis, P.: Problem reduction, renormalization, and memory. Commun. Appl. Math. Comput. Sci. 1(1), 1–27 (2007)
6. Christmann, A., Steinwart, I.: Support Vector Machines. Springer, Berlin (2008)
7. Crommelin, D., Vanden-Eijnden, E.: Subgrid-scale parameterization with conditional Markov chains. J. Atmos. Sci. 65(8), 2661–2675 (2008)
8. Fatkullin, I., Vanden-Eijnden, E.: A computational strategy for multiscale systems with applications to Lorenz 96 model. J. Comput. Phys. 200(2), 605–638 (2004)
9. Frederiksen, J., O’Kane, T.: Entropy, closures and subgrid modeling. Entropy 10, 635–683 (2008)
10. Givon, D., Kupferman, R., Stuart, A.: Extracting macroscopic dynamics: model problems and algorithms. Nonlinearity 17(6), R55 (2004)
11. Gottwald, G.A., Harlim, J.: The role of additive and multiplicative noise in filtering complex dynamical systems. Proc. R. Soc. A Math. Phys. Eng. Sci. 469(2155), 20130096 (2013)
12. Gouasmi, A., Parish, E.J., Duraisamy, K.: A priori estimation of memory effects in reduced-order models of nonlinear systems using the Mori–Zwanzig formalism. Proc. R. Soc. A Math. Phys. Eng. Sci. 473(2205), 20170385 (2017)
13. Grabowski, W.: An improved framework for superparameterization. J. Atmos. Sci. 61, 1940–1952 (2004)
14. Hamill, T.M.: Interpretation of rank histograms for verifying ensemble forecasts. Mon. Weather Rev. 129(3), 550–560 (2001)
15. Harlim, J.: Data-Driven Computational Methods: Parameter and Operator Estimations. Cambridge University Press, Cambridge (2018)
16. Harlim, J., Jiang, S., Liang, S., Yang, H.: Machine learning for prediction with missing dynamics. arXiv:1910.05861 (2019)
17. Harlim, J., Li, X.: Parametric reduced models for the nonlinear Schrödinger equation. Phys. Rev. E. 91, 053306 (2015)
18. Harlim, J., Mahdi, A., Majda, A.: An ensemble Kalman filter for statistical estimation of physics constrained nonlinear regression models. J. Comput. Phys. 257(Part A), 782–812 (2014)
19. Jiang, S.W., Harlim, J.: Parameter estimation with data-driven nonparametric likelihood functions. Entropy 21(6), 559 (2019)
20. Kerstein, A.: A linear-eddy model of turbulent scalar transport and mixing. Combust. Sci. Technol. 60(4–6), 391–421 (1988)
21. Kerstein, A.: One-dimensional turbulence: model formulation and application to homogeneous turbulence, shear flows, and buoyant stratified flows. J. Fluid Mech. 392, 277–334 (1999)
22. Khasminskii, R.: On averaging principle for Itô stochastic differential equations. Kybern. Chekhoslovakia 4(3), 260–279 (1968). (in Russian)
23. Khouider, B., Biello, J.A., Majda, A.J.: A stochastic multicloud model for tropical convection. Commun. Math. Sci. 8, 187–216 (2010)
24. Khouider, B., St-Cyr, A., Majda, A., Tribbia, J.: The MJO and convectively coupled waves in a coarse-resolution GCM with a simple multicloud parameterization. J. Atmos. Sci. 68, 240–264 (2011)
25. Kondrashov, D., Chekroun, M.D., Ghil, M.: Data-driven non-Markovian closure models. Phys. D Nonlinear Phenom. 297, 33–55 (2015)
26. Kraichnan, R.H.: The structure of isotropic turbulence at very high Reynolds numbers. J. Fluid Mech. 5, 497–543 (1959)
27. Kravtsov, S., Kondrashov, D., Ghil, M.: Multilevel regression modeling of nonlinear processes: derivation and applications to climatic variability. J. Clim. 18(21), 4404–4424 (2005)
28. Kurtz, T.: Semigroups of conditional shifts and approximations of Markov processes. Ann. Probab. 3, 618–642 (1975)
29. Kwasniok, F.: Data-based stochastic subgrid-scale parametrization: an approach using cluster-weighted modelling. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 370(1962), 1061–1086 (2012)
30. Lei, H., Baker, N.A., Li, X.: Data-driven parameterization of the generalized Langevin equation. Proc. Natl. Acad. Sci. 113(50), 14183–14188 (2016)
31. Lorenz, E.: Predictability: a problem partly solved. In Seminar on Predictability, 4–8 September 1995, vol 1, pp. 1–18, Shinfield Park, Reading. ECMWF (1995)
32. Lu, F., Lin, K., Chorin, A.: Comparison of continuous and discrete-time data-based modeling for hypoelliptic systems. Commun. Appl. Math. Comput. Sci. 11(2), 187–216 (2016)
33. Lu, F., Lin, K., Chorin, A.: Data-based stochastic model reduction for the Kuramoto–Sivashinsky equation. Phys. D Nonlinear Phenom. 340, 46–57 (2017)
34. Lu, F., Tu, X., Chorin, A.J.: Accounting for model error from unresolved scales in ensemble Kalman filters by stochastic parameterization. Mon. Weather Rev. 145(9), 3709–3723 (2017)
35. Majda, A., Abramov, R.V., Grote, M.J.: Information Theory and Stochastics for Multiscale Nonlinear Systems, vol. 25. American Mathematical Society, Providence (2005)
36. Majda, A., Grooms, I.: New perspectives on superparameterization for geophysical turbulence. J. Comput. Phys. 271, 60–77 (2014)
37. Majda, A., Harlim, J.: Physics constrained nonlinear regression models for time series. Nonlinearity 26, 201–217 (2013)
38. Majda, A., Timofeyev, I., Vanden-Eijnden, E.: Stochastic models for selected slow variables in large deterministic systems. Nonlinearity 19(4), 769 (2006)
39. Majda, A., Timofeyev, I.: Statistical mechanics for truncations of the Burgers-Hopf equation: a model for intrinsic stochastic behavior with scaling. Milan J. Math. 70(1), 39–96 (2002)
40. Majda, A.J., Harlim, J.: Physics constrained nonlinear regression models for time series. Nonlinearity 26(1), 201 (2012)
41. Majda, A.J., Timofeyev, I.: Remarkable statistical behavior for truncated Burgers-Hopf dynamics. Proc. Natl. Acad. Sci. 97(23), 12413–12417 (2000)
42. Majda, A.J., Timofeyev, I., Eijnden, E.V.: Models for stochastic climate prediction. Proc. Natl. Acad. Sci. 96(26), 14687–14691 (1999)
43. Majda, A.J., Timofeyev, I., Eijnden, E.: A mathematical framework for stochastic climate models. Commun. Pure Appl. Math. J. Issued Courant Inst. Math. Sci. 54(8), 891–974 (2001)
44. Mori, H.: Transport, collective motion, and Brownian motion. Prog. Theor. Phys. 33, 423–450 (1965)
45. Nemtsov, A., Averbuch, A., Schclar, A.: Matrix compression using the Nyström method. Intell. Data Anal. 20(5), 997–1019 (2016)
46. Papanicolaou, G.C., et al.: Some probabilistic problems and methods in singular perturbations. Rocky Mt. J. Math. 6(4), 653–674 (1976)
47. Pavliotis, G., Stuart, A.: Multiscale Methods: Averaging and Homogenization. Springer, Berlin (2008)
48. Song, L., Fukumizu, K., Gretton, A.: Kernel embeddings of conditional distributions: a unified kernel framework for nonparametric inference in graphical models. IEEE Signal Process. Mag. 30(4), 98–111 (2013)
49. Song, L., Huang, J., Smola, A., Fukumizu, K.: Hilbert space embeddings of conditional distributions with applications to dynamical systems. In Proceedings of 26th Annual International Conference on Machine Learning, pp. 961–968. ACM (2009)
50. Weinan, E., Engquist, B., Li, X., Ren, W., Vanden-Eijnden, E.: Heterogeneous multiscale methods: a review. Commun. Comput. Phys. 2(3), 367–450 (2007)
51. Wilks, D.S.: Effects of stochastic parametrizations in the Lorenz’96 system. Q. J. R. Meteorol. Soc. 131(606), 389–407 (2005)
52. Zhang, H., Harlim, J., Li, X.: Computing linear response statistics using orthogonal polynomial based estimators: An RKHS formulation. arXiv:1912.11110 (2019)
53. Zwanzig, R.: Statistical mechanics of irreversibility. Lect. Theor. Phys. 3, 106–141 (1961)
54. Zwanzig, R.: Nonlinear generalized Langevin equations. J. Stat. Phys. 9, 215–220 (1973)

Acknowledgements

It is a great pleasure to dedicate this paper to Andrew Majda on the occasion of his 70th birthday. The research of J.H. was partially supported by the ONR Grant N00014-16-1-2888, NSF Grants DMS-1619661 and DMS-1854299. S.W.J. was supported as a postdoctoral fellow under the ONR Grant N00014-16-1-2888.

Author information

Authors and Affiliations

Department of Mathematics, The Pennsylvania State University, 109 McAllister Building, University Park, PA, 16802-6400, USA
Shixiao W. Jiang & John Harlim
Department of Meteorology and Atmospheric Science, The Pennsylvania State University, 503 Walker Building, University Park, PA, 16802-5013, USA
John Harlim
Institute for Computational and Data Sciences, The Pennsylvania State University, 224B Computer Building, University Park, PA, 16802, USA
John Harlim

Corresponding author

Correspondence to John Harlim.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Kernel mean embedding of conditional distributions

The purpose of this review is to verify Eq. (7). While the derivation here closely follows the description in [48, 49], we present a formulation with Mercer-type kernels induced by an orthonormal basis of $L^2$-spaces. The basic theory of RKHS can be found in many texts, such as [6].

First, let us repeat the discussion in Sect. 2.1 on $\mathcal {Z}$. Let $\mathcal {Z}$ be a compact set and let $\hat{K}:\mathcal {Z}\times \mathcal {Z}\rightarrow \mathbb {R}$ be a bounded kernel, that is, a bounded symmetric positive definite function. By the Moore–Aronszajn theorem, there exists a unique RKHS $\mathcal {H}_Z=\overline{\text{ span }\{\hat{K}(\textit{\textbf{z}},\cdot ),\forall \textit{\textbf{z}}\in \mathcal {Z}\}}$ with reproducing kernel $\hat{K}$. Let $\hat{q}:\mathcal {Z}\rightarrow \mathbb {R}$ be a positive weight function and $\{\varphi _k\}_{k\ge 1}$ be a set of eigenfunctions corresponding to eigenvalues $\{\xi _k\}$ of the following integral operator $\mathcal {\hat{K}}:L^2(\mathcal {Z},\hat{q}) \rightarrow L^2(\mathcal {Z},\hat{q})$, defined as

$$\begin{aligned} \mathcal {\hat{K}} f(\textit{\textbf{z}}) := \int _{\mathcal {Z}} \hat{K}(\textit{\textbf{z}},\textit{\textbf{z}}') f(\textit{\textbf{z}}') \hat{q}(\textit{\textbf{z}}') \hbox {d}\textit{\textbf{z}}'. \end{aligned}$$

(34)

By Mercer’s theorem, the kernel $\hat{K}$ has the following representation:

$$\begin{aligned} \hat{K}(\textit{\textbf{z}},\textit{\textbf{z}}') = \sum _{k=1}^{\infty } \xi _k \varphi _k(\textit{\textbf{z}})\varphi _k(\textit{\textbf{z}}'). \end{aligned}$$

(35)

We should point out that if $\mathcal {Z}$ is a non-compact domain such as $\mathbb {R}^n$, then, with an exponentially decaying $\hat{q}$, one can still construct a bounded Mercer-type kernel as in (35) with an appropriate choice of decreasing sequence $\{\xi _k\}$ (see Lemma 3.2 in [52]), and this kernel is a reproducing kernel corresponding to the RKHS $\mathcal {H}_Z$ (see Proposition 3.4 in [52]).

In this case, the RKHS $\mathcal {H}_Z$ induced by the Mercer-type kernel in (35) is a subspace of $L^2(\mathcal {Z},\hat{q})$ with the reproducing property corresponding to an inner product defined as $\langle f,g\rangle _{\mathcal {H}_Z} = \sum _{k=1}^\infty \frac{f_k g_k}{\xi _k}$, for all $f,g\in \mathcal {H}_Z$ where $f_k = \langle f,\varphi _k\rangle _{L^2(\mathcal {Z},\hat{q})}$ and $g_k = \langle g,\varphi _k\rangle _{L^2(\mathcal {Z},\hat{q})}$ . Then, for any $f\in \mathcal {H}_Z$ and $\textit{\textbf{z}}\in \mathcal {Z}$, we can represent

$$\begin{aligned} f(\textit{\textbf{z}}) = \langle f,\hat{K}(\textit{\textbf{z}},\cdot )\rangle _{\mathcal {H}_Z} = \sum _{k=1}^\infty \frac{f_k \xi _k \varphi _k(\textit{\textbf{z}})}{\xi _k} = \sum _{k=1}^\infty f_k \varphi _k(\textit{\textbf{z}}), \end{aligned}$$

(36)

in terms of the orthonormal basis $\{\varphi _k\}$ of $L^2(\mathcal {Z},\hat{q})$, where the convergence of the series holds uniformly (or in $C_0(\mathbb {R}^n)$ for non-compact $\mathcal {Z}=\mathbb {R}^n$).
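The construction in (34)–(36) can be checked numerically on a toy case. The following is a minimal sketch, assuming $\mathcal {Z}=[0,1]$, a constant weight $\hat{q}=1$, a cosine basis, eigenvalues $\xi _k = 2^{-k}$, and a truncation to eight terms; none of these choices come from the paper, and the names `phi`, `kernel`, and `K_TERMS` are hypothetical.

```python
import numpy as np

# Assumed toy setting (not from the paper): Z = [0, 1], weight q_hat = 1,
# cosine basis orthonormal in L^2([0, 1]), eigenvalues xi_k = 2^{-k}.
K_TERMS = 8
xi = 0.5 ** np.arange(K_TERMS)

def phi(z):
    """Rows are phi_k at the points z: phi_0 = 1, phi_k = sqrt(2) cos(k pi z)."""
    z = np.atleast_1d(z)
    k = np.arange(K_TERMS)[:, None]
    basis = np.sqrt(2.0) * np.cos(k * np.pi * z[None, :])
    basis[0, :] = 1.0
    return basis

def kernel(z, zp):
    """Truncated Mercer sum (35): K(z, z') = sum_k xi_k phi_k(z) phi_k(z')."""
    return phi(z).T @ (xi[:, None] * phi(zp))

# Midpoint quadrature nodes and uniform weight on [0, 1].
N = 400
nodes = (np.arange(N) + 0.5) / N
w = 1.0 / N

# Integral operator (34) applied to each phi_j, evaluated at a few test points.
z_test = np.array([0.1, 0.37, 0.9])
K_phi = kernel(z_test, nodes) @ (phi(nodes).T * w)      # shape (3, K_TERMS)
eig_ok = np.allclose(K_phi, phi(z_test).T * xi[None, :])

# Reproducing property (36) for f = sum_k f_k phi_k with chosen coefficients.
f_coef = np.array([1.0, -0.5, 0.25, 0.0, 0.1, 0.0, 0.0, 0.0])
f_at_z = f_coef @ phi(z_test)
k_coef = xi[:, None] * phi(z_test)       # L^2 coefficients of K(z, .)
inner_H = (f_coef[:, None] * k_coef / xi[:, None]).sum(axis=0)
repro_ok = np.allclose(inner_H, f_at_z)
print(eig_ok, repro_ok)
```

The first check confirms that each $\varphi _j$ is an eigenfunction of the quadrature approximation of (34) with eigenvalue $\xi _j$; the second evaluates the weighted inner product $\sum _k f_k g_k/\xi _k$ with $g=\hat{K}(\textit{\textbf{z}},\cdot )$ and compares it with $f(\textit{\textbf{z}})$.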

We call the Hilbert space of functions $\mathcal {H}_Z$ an RKHS induced by the orthonormal basis of $L^2(\mathcal {Z},\hat{q})$. While we have discussed $\mathcal {H}$ as an RKHS induced by the orthonormal basis of $L^2(\mathcal {Y},q^{-1})$ in Sect. 2.1, we can also repeat the argument above and construct $\mathcal {H}_Y$ as an RKHS induced by the orthonormal basis of $L^2(\mathcal {Y},q)$. In this case, recall that while $\{\psi _k q\}$ form an orthogonal eigenbasis of the integral operator in (4), the orthogonal basis functions $\psi _k\in L^2(\mathcal {Y},q)$ are eigenfunctions of an adjoint integral operator of (4). That is, one can verify that

$$\begin{aligned} \langle \psi _kq, \mathcal {K}^* \psi _k \rangle _{L^2(\mathcal {Y})} = \langle \mathcal {K}(\psi _kq), \psi _k\rangle _{L^2(\mathcal {Y})} = \lambda _k\langle \psi _kq,\psi _k\rangle _{L^2(\mathcal {Y})}, \end{aligned}$$

(37)

where for $f\in L^2(\mathcal {Y},q)$,

$$\begin{aligned} \mathcal {K}^* f(x) := \int _\mathcal {Y} K^*(x,y) f(y) q(y)\,\hbox {d}y, \end{aligned}$$

and $K^*(x,y)= q^{-1}(x)K(x,y)q^{-1}(y)$ is also a symmetric positive definite kernel. By Mercer’s theorem, one can write

$$\begin{aligned} K^*(y,y') = \sum _{k=1}^{\infty } \lambda _k \psi _k(y)\psi _k(y'). \end{aligned}$$

(38)

Let $Y$ and $Z$ be random variables on $\mathcal {Y}$ and $\mathcal {Z}$ with joint distribution $P(Y,Z)$. We define the cross-covariance operator $\mathcal {C}_{YZ}:\mathcal {H}_Z\rightarrow \mathcal {H}_Y$ and the covariance operator $\mathcal {C}_{ZZ}:\mathcal {H}_Z\rightarrow \mathcal {H}_Z$ as

$$\begin{aligned} \begin{aligned}&\mathcal {C}_{YZ} := \mathbb {E}_{YZ} [K^*(Y,\cdot )\otimes \hat{K}(Z,\cdot ) ], \\&\mathcal {C}_{ZZ} := \mathbb {E}_{Z} [\hat{K}(Z,\cdot )\otimes \hat{K}(Z,\cdot ) ]. \end{aligned} \end{aligned}$$

(39)

One can immediately see that for any $f\in \mathcal {H}_Y$ and $g\in \mathcal {H}_Z$,

$$\begin{aligned} \mathbb {E}_{YZ}[f(Y)\otimes g(Z)]= & {} \int _{\mathcal {Y}\times \mathcal {Z}} f(y)g(\textit{\textbf{z}}) \hbox {d}P(y,\textit{\textbf{z}})\nonumber \\= & {} \int _{\mathcal {Y}\times \mathcal {Z}} \langle f,K^*(y,\cdot )\rangle _{\mathcal {H}_Y} \langle g,\hat{K}(\textit{\textbf{z}},\cdot )\rangle _{\mathcal {H}_Z} \hbox {d}P(y,\textit{\textbf{z}}) \nonumber \\= & {} \int _{\mathcal {Y}\times \mathcal {Z}} \langle f\otimes g, K^*(y,\cdot )\otimes \hat{K}(\textit{\textbf{z}},\cdot )\rangle _{\mathcal {H}_Y\otimes \mathcal {H}_Z} \hbox {d}P(y,\textit{\textbf{z}})\nonumber \\= & {} \langle f\otimes g, \mathcal {C}_{YZ}\rangle _{\mathcal {H}_Y\otimes \mathcal {H}_Z}. \end{aligned}$$

(40)

Let us define the feature maps $\varPsi :\mathcal {Y}\rightarrow \mathcal {F}_Y\subset \ell _2$ and $\varPhi :\mathcal {Z}\rightarrow \mathcal {F}_Z\subset \ell _2$, respectively, as

$$\begin{aligned} \begin{aligned} \varPsi (y)&= (\sqrt{\lambda _1}\psi _1(y), \sqrt{\lambda _2}\psi _2(y),\ldots ), \\ \varPhi (\textit{\textbf{z}})&= (\sqrt{\xi _1}\varphi _1(\textit{\textbf{z}}), \sqrt{\xi _2}\varphi _2(\textit{\textbf{z}}),\ldots ). \end{aligned} \end{aligned}$$

(41)

Then, we can write

$$\begin{aligned} \hat{K}(\textit{\textbf{z}},\textit{\textbf{z}}')= & {} \langle \varPhi (\textit{\textbf{z}}),\varPhi (\textit{\textbf{z}}')\rangle _{\ell _2} = \langle \hat{K}(\textit{\textbf{z}},\cdot ),\hat{K}(\textit{\textbf{z}}',\cdot )\rangle _{\mathcal {H}_Z},\\ K^*(y,y')= & {} \langle \varPsi (y),\varPsi (y')\rangle _{\ell _2} = \langle K^*(y,\cdot ),K^*(y',\cdot )\rangle _{\mathcal {H}_Y}, \end{aligned}$$

where the inner products in $\mathcal {H}_Z$ and $\mathcal {H}_Y$ can be identified with the $\ell _2$ inner products in the corresponding feature spaces. Also, for any function $f\in \mathcal {H}_Z$ and $\textit{\textbf{z}}\in \mathcal {Z}$, we can rewrite the expansion in (36) as

$$\begin{aligned} f(\textit{\textbf{z}})= & {} \langle f,\hat{K}(\textit{\textbf{z}},\cdot ) \rangle _{\mathcal {H}_Z} = \sum _{k=1}^\infty \langle f,\varphi _k \rangle _{L^2(\mathcal {Z},\hat{q})} \varphi _k(\textit{\textbf{z}})\nonumber \\= & {} \sum _{k=1}^\infty \frac{\langle f,\varphi _k \rangle _{L^2(\mathcal {Z},\hat{q})}}{\sqrt{\xi _k}} \varPhi _k(\textit{\textbf{z}}) = \sum _{k=1}^\infty \langle f, \varPhi _k\rangle _{\mathcal {H}_Z} \varPhi _k(\textit{\textbf{z}}), \end{aligned}$$

(42)

where we have defined the functions $\varPhi _k = \sqrt{\xi _k}\varphi _k \in \mathcal {H}_Z$. For convenience of the discussion below, we also define the functions $\varPsi _k:=\sqrt{\lambda _k}\psi _k\in \mathcal {H}_Y$.

Using the identity in (40), we can represent the cross-operators in (39) on the basis coordinates $\varPsi _k \in \mathcal {H}_Y$ and $\varPhi _\ell \in \mathcal {H}_Z$ as follows:

$$\begin{aligned} \begin{aligned} \left[ C_{YZ}\right] _{k\ell }&:=\mathbb {E}_{YZ}[\varPsi _k(Y)\otimes \varPhi _\ell (Z)] = \langle \varPsi _k\otimes \varPhi _\ell , \mathcal {C}_{YZ}\rangle _{\mathcal {H}_Y\otimes \mathcal {H}_Z},\\ \left[ C_{ZZ}\right] _{k\ell }&:=\mathbb {E}_{ZZ}[\varPhi _k(Z)\otimes \varPhi _\ell (Z)]=\langle \varPhi _k\otimes \varPhi _\ell , \mathcal {C}_{ZZ}\rangle _{\mathcal {H}_Z\otimes \mathcal {H}_Z}=\langle \varPhi _k, \mathcal {C}_{ZZ} \varPhi _\ell \rangle _{\mathcal {H}_Z} . \end{aligned}\nonumber \\ \end{aligned}$$

(43)

Thus, the components of the following matrix multiplication are given as

$$\begin{aligned} \left[ C_{YZ}C_{ZZ}^{-1} \right] _{k\ell }= & {} \sum _{j} \left[ C_{YZ}\right] _{kj}\left[ C_{ZZ}^{-1}\right] _{j\ell } \nonumber \\= & {} \sum _j \langle \varPsi _k\otimes \varPhi _j, \mathcal {C}_{YZ}\rangle _{\mathcal {H}_Y\otimes \mathcal {H}_Z}\langle \varPhi _j, \mathcal {C}_{ZZ}^{-1}\varPhi _\ell \rangle _{\mathcal {H}_Z} \nonumber \\= & {} \left\langle \mathcal {C}_{YZ}, \varPsi _k\otimes \left( \sum _j \langle \varPhi _j, \mathcal {C}_{ZZ}^{-1}\varPhi _\ell \rangle _{\mathcal {H}_Z} \varPhi _j \right) \right\rangle _{\mathcal {H}_Y\otimes \mathcal {H}_Z} \nonumber \\= & {} \left\langle \mathcal {C}_{YZ}, \varPsi _k \otimes \mathcal {C}_{ZZ}^{-1} \varPhi _\ell \right\rangle _{\mathcal {H}_Y\otimes \mathcal {H}_Z} \nonumber \\= & {} \left\langle \mathcal {C}_{YZ} \mathcal {C}_{ZZ}^{-1}, \varPsi _k \otimes \varPhi _\ell \right\rangle _{\mathcal {H}_Y\otimes \mathcal {H}_Z} \nonumber \\= & {} \left\langle \mathcal {C}_{YZ} \mathcal {C}_{ZZ}^{-1}\varPhi _\ell , \varPsi _k \right\rangle _{\mathcal {H}_Y}. \end{aligned}$$

(44)

To clarify this derivation, the second equality uses the definition in (43), the fourth line uses the fact that $\mathcal {C}_{ZZ}^{-1}\varPhi _\ell \in \mathcal {H}_Z$ can be expanded as in (42), and the remaining lines use standard tensor identities.

The theory of kernel mean embedding of conditional distributions (see [48, 49]) suggests that

$$\begin{aligned} \mathbb {E}_{Y|\textit{\textbf{z}}} [\varPsi _k(Y)] = \langle \varPsi _k, \mathcal {C}_{YZ} \mathcal {C}_{ZZ}^{-1} \hat{K}(\textit{\textbf{z}},\cdot ) \rangle _{\mathcal {H}_Y}. \end{aligned}$$

(45)

Since $\hat{K}(\textit{\textbf{z}},\cdot ) \in \mathcal {H}_Z$, we can employ the expansion in (42) and deduce

$$\begin{aligned} \mathbb {E}_{Y|\textit{\textbf{z}}} [\varPsi _k(Y)]= & {} \langle \varPsi _k, \mathcal {C}_{YZ} \mathcal {C}_{ZZ}^{-1} \sum _{j=1}^{\infty } \frac{\langle \hat{K}(\textit{\textbf{z}},\cdot ),\varphi _j\rangle _{L^2(\mathcal {Z},\hat{q})}}{\sqrt{\xi _j}}\varPhi _j \rangle _{\mathcal {H}_Y} \nonumber \\= & {} \sum _{j=1}^{\infty } \frac{\langle \hat{K}(\textit{\textbf{z}},\cdot ),\varphi _j\rangle _{L^2(\mathcal {Z},\hat{q})}}{\sqrt{\xi _j}} \langle \varPsi _k, \mathcal {C}_{YZ} \mathcal {C}_{ZZ}^{-1} \varPhi _j \rangle _{\mathcal {H}_Y} \nonumber \\= & {} \sum _{j=1}^{\infty } \frac{1}{\sqrt{\xi _j}}\left[ C_{YZ}C_{ZZ}^{-1} \right] _{kj} \int _{\mathcal {Z}} \hat{K}(\textit{\textbf{z}},\textit{\textbf{z}}') \varphi _j(\textit{\textbf{z}}')\hat{q}(\textit{\textbf{z}}')\,\hbox {d}\textit{\textbf{z}}' \nonumber \\= & {} \sum _{j=1}^{\infty } \left[ C_{YZ}C_{ZZ}^{-1} \right] _{kj} \varPhi _j(\textit{\textbf{z}}), \end{aligned}$$

(46)

where we have used (44) to deduce the third equality above, and the last equality uses the fact that $\varphi _j$ and $\xi _j$ are an eigenfunction and the corresponding eigenvalue of the integral operator in (34), together with $\varPhi _j=\sqrt{\xi _j}\varphi _j$. Define

$$\begin{aligned} \left[ \textit{\textbf{C}}_{{YZ}}\right] _{ks} =\mathbb {E}_{{YZ}} \left[ \psi _{k}(Y)\otimes \varphi _{s}(Z)\right] , \quad \quad \left[ \textit{\textbf{C}}_{{ZZ}}\right] _{sl} =\mathbb {E}_{{ZZ}} \left[ \varphi _{s}(Z)\otimes \varphi _{l}(Z)\right] , \end{aligned}$$

then from (43) and the definitions of the corresponding feature maps in (41),

$$\begin{aligned}&\left[ C_{{YZ}}\right] _{ks} = \sqrt{\lambda _k\xi _s}\left[ \textit{\textbf{C}}_{{YZ}}\right] _{ks}, \quad \quad \left[ C_{{ZZ}}\right] _{sl} = \sqrt{\xi _s\xi _l}\left[ \textit{\textbf{C}}_{{ZZ}}\right] _{sl}, \\&\quad \left[ C_{YZ}C_{ZZ}^{-1} \right] _{k\ell } = \frac{\sqrt{\lambda _k}}{\sqrt{\xi _\ell }} \left[ \textit{\textbf{C}}_{YZ}\textit{\textbf{C}}_{ZZ}^{-1} \right] _{k\ell }. \end{aligned}$$

Substituting the third equation above into (46) and using the definitions of the feature maps in (41), we obtain

$$\begin{aligned} \mathbb {E}_{Y|\textit{\textbf{z}}} [\psi _k(Y)] = \frac{1}{\sqrt{\lambda _k}} \sum _{j=1}^{\infty } \left[ C_{YZ}C_{ZZ}^{-1} \right] _{kj} \varPhi _j(\textit{\textbf{z}}) = \sum _{j=1}^{\infty } \left[ \textit{\textbf{C}}_{YZ}\textit{\textbf{C}}_{ZZ}^{-1} \right] _{kj} \varphi _j(\textit{\textbf{z}}), \end{aligned}$$

which is exactly the claim in (7).
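The final formula can be tried on synthetic data by replacing the expectations in $\textit{\textbf{C}}_{YZ}$ and $\textit{\textbf{C}}_{ZZ}$ with sample averages. The following is a minimal sketch under stated assumptions: $Z$ is uniform on $[-1,1]$, $Y=Z+$ small Gaussian noise, the $\varphi _j$ are the first three normalized Legendre polynomials, and `psi` holds two illustrative test functions rather than an orthonormal basis of $L^2(\mathcal {Y},q)$. None of these choices, nor the helper names, are taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy data (not from the paper): Z ~ Uniform[-1, 1], Y = Z + small noise.
n_samples = 20000
z = rng.uniform(-1.0, 1.0, n_samples)
y = z + 0.1 * rng.standard_normal(n_samples)

def phi(z):
    """Legendre polynomials, orthonormal w.r.t. the uniform density on [-1, 1]."""
    z = np.atleast_1d(z)
    return np.stack([np.ones_like(z),
                     np.sqrt(3.0) * z,
                     np.sqrt(5.0) * 0.5 * (3.0 * z**2 - 1.0)])

def psi(y):
    """Two illustrative functions of Y: psi_1(y) = y and psi_2(y) = y^2."""
    y = np.atleast_1d(y)
    return np.stack([y, y**2])

# Monte Carlo estimates of C_YZ[k, s] = E[psi_k(Y) phi_s(Z)] and
# C_ZZ[s, l] = E[phi_s(Z) phi_l(Z)].
C_YZ = psi(y) @ phi(z).T / n_samples
C_ZZ = phi(z) @ phi(z).T / n_samples

# Coefficient matrix C_YZ C_ZZ^{-1} via a linear solve (C_ZZ is symmetric).
coef = np.linalg.solve(C_ZZ, C_YZ.T).T

def cond_mean(z_query):
    """Estimate E[psi_k(Y) | z] = sum_j [C_YZ C_ZZ^{-1}]_{kj} phi_j(z)."""
    return coef @ phi(z_query)

est = cond_mean(np.array([0.5]))[:, 0]
print(est)  # compare with the exact values 0.5 and 0.5**2 + 0.1**2
```

With empirical covariances, this estimator coincides with a least-squares regression of $\psi _k(Y)$ on the basis functions $\varphi _j(Z)$, which is one way to see why a particular basis choice yields a parametric closure.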

Appendix B: ACV of the multiscale linear Gaussian model

The full model (18) and (19) can be rewritten as

$$\begin{aligned} \dot{x}= & {} \left( a_{11}x+a_{12}y\right) +\sigma _{x}\xi _{x},\nonumber \\ \dot{y}= & {} \frac{1}{\epsilon }\left( a_{21}x+a_{22}y\right) +\frac{\sigma _{y}}{\sqrt{\epsilon }}\xi _{y}, \end{aligned}$$

(47)

where $\xi _{x}$ and $\xi _{y}$ are independent standard Gaussian white noises. Similarly, the closure model (25) can be rewritten as

$$\begin{aligned} \dot{x}_t=\left( a_{11}x_t+a_{12}\varSigma _{12}\varSigma _{22}^{-1}\textit{\textbf{x}}\right) +\sigma _{x}\xi _{x}, \end{aligned}$$

(48)

where $\textit{\textbf{x}}:=\textit{\textbf{x}}_{t-m:t} = \left[ x_{t-m},x_{t-m+1},\ldots ,x_{t}\right] ^\top $ and $ \varSigma _{12}$ and $\varSigma _{22}$ are defined in Eq. (26). To simplify the notation, we drop the time indices $t-m:t$. We also drop the “hat”-notation in $x_t$ and $\textit{\textbf{x}}_t$, since we will use it to denote Fourier coefficients in this section. In this appendix, we show that the autocovariance (ACV) function of the closure model (48) is approximately equal to that of the full model (47) for any value of $\epsilon $.
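For readers who want sample paths of the full model (47) to experiment with, the following is a minimal Euler–Maruyama sketch. The discretization scheme, the step size, and the parameter values are assumptions made here for illustration; they are not taken from the paper, and `simulate_full_model` is a hypothetical helper name.

```python
import numpy as np

def simulate_full_model(a11, a12, a21, a22, sigma_x, sigma_y, eps,
                        dt=1e-3, n_steps=50_000, seed=0):
    """Euler-Maruyama discretization of the two-scale linear SDE (47)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_steps + 1)
    y = np.zeros(n_steps + 1)
    # Brownian increments over one step have standard deviation sqrt(dt).
    dW = np.sqrt(dt) * rng.standard_normal((n_steps, 2))
    for n in range(n_steps):
        x[n + 1] = x[n] + (a11 * x[n] + a12 * y[n]) * dt + sigma_x * dW[n, 0]
        y[n + 1] = (y[n] + (a21 * x[n] + a22 * y[n]) * dt / eps
                    + sigma_y / np.sqrt(eps) * dW[n, 1])
    return x, y

# Illustrative parameter values only; they are not taken from the paper.
x, y = simulate_full_model(a11=-1.0, a12=1.0, a21=-1.0, a22=-1.0,
                           sigma_x=1.0, sigma_y=1.0, eps=0.1)
```

The step size should stay small relative to $\epsilon $, since the drift of $y$ scales like $1/\epsilon $; with the values above the drift matrix is stable and `dt / eps` equals 0.01.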

The Fourier transform and inverse Fourier transform are defined as

$$\begin{aligned} \widehat{f}\left( \omega \right) =\int f\left( t\right) \hbox {e}^{-i\omega t}\hbox {d}t, \quad f\left( t\right) =\frac{1}{2\pi }\int \widehat{f}\left( \omega \right) \hbox {e}^{i\omega t}\hbox {d}\omega . \end{aligned}$$

The Fourier transforms of variables x and y of the full model (47) can be obtained as

$$\begin{aligned} \widehat{x}= & {} \frac{\left( i\omega -\frac{1}{\epsilon }a_{22}\right) \sigma _{x}\widehat{\xi }_{x}+a_{12}\frac{\sigma _{y}}{\sqrt{\epsilon }}\widehat{ \xi }_{y}}{\left( i\omega -a_{11}\right) \left( i\omega -\frac{1}{\epsilon } a_{22}\right) -a_{12}\frac{1}{\epsilon }a_{21}}, \end{aligned}$$

(49)

$$\begin{aligned} \widehat{y}= & {} \frac{\left( i\omega -a_{11}\right) \frac{\sigma _{y}}{\sqrt{ \epsilon }}\widehat{\xi }_{y}+\frac{1}{\epsilon }a_{21}\sigma _{x}\widehat{ \xi }_{x}}{\left( i\omega -a_{11}\right) \left( i\omega -\frac{1}{\epsilon } a_{22}\right) -a_{12}\frac{1}{\epsilon }a_{21}}. \end{aligned}$$

(50)

Then, for the full model (47), the resulting spectrum of x is

$$\begin{aligned} \left| \widehat{x}\left( \omega \right) \right| ^{2}=\frac{\left( \omega ^{2}+c_{0}^{2}\right) \sigma _{x}^{2}\left| \widehat{\xi } _{x}\right| ^{2}+d_{0}^{2}\sigma _{y}^{2}\left| \widehat{\xi } _{y}\right| ^{2}}{\left( -\omega ^{2}+\omega _{0}^{2}\right) ^{2}+\gamma _{0}^{2}\omega ^{2}}, \end{aligned}$$

where

$$\begin{aligned} c_{0}= & {} \frac{a_{22}}{\epsilon }, \quad d_{0}=\frac{a_{12}}{\sqrt{ \epsilon }}, \quad \omega _{0}=\sqrt{\frac{1}{\epsilon }\left( a_{11}a_{22}-a_{12}a_{21}\right) }, \\ \gamma _{0}= & {} a_{11}+\frac{1}{\epsilon }a_{22}, \quad \left| \widehat{\xi }_{x}\right| ^{2}=1, \quad \left| \widehat{\xi } _{y}\right| ^{2}=1. \end{aligned}$$
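The closed-form spectrum can be cross-checked against a direct evaluation of (49). The sketch below assumes illustrative parameter values (not from the paper) and treats $\xi _x$ and $\xi _y$ as independent with unit spectral density, so that the cross term between them drops out.

```python
import numpy as np

# Illustrative parameter values only; they are not taken from the paper.
a11, a12, a21, a22 = -1.0, 1.0, -1.0, -1.0
sigma_x, sigma_y, eps = 1.0, 0.5, 0.1
omega = np.linspace(-20.0, 20.0, 401)

# Spectrum from (49): squared moduli of the two noise coefficients over |denominator|^2.
denom = (1j * omega - a11) * (1j * omega - a22 / eps) - a12 * a21 / eps
spec_direct = (np.abs(1j * omega - a22 / eps) ** 2 * sigma_x**2
               + (a12 * sigma_y / np.sqrt(eps)) ** 2) / np.abs(denom) ** 2

# Closed form in terms of c_0, d_0, omega_0 and gamma_0.
c0 = a22 / eps
d0 = a12 / np.sqrt(eps)
omega0_sq = (a11 * a22 - a12 * a21) / eps
gamma0 = a11 + a22 / eps
spec_closed = ((omega**2 + c0**2) * sigma_x**2 + d0**2 * sigma_y**2) / (
    (omega0_sq - omega**2) ** 2 + gamma0**2 * omega**2)

max_gap = np.max(np.abs(spec_direct - spec_closed))
```

The two expressions agree because the denominator of (49) expands to $(\omega _0^2-\omega ^2) - i\gamma _0\omega $, whose squared modulus is the denominator of the closed form.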

Now we compute the Fourier transform of the closure model (48),

$$\begin{aligned} i\omega \widehat{X}=a_{11}\widehat{X}+a_{12}\varSigma _{12}\varSigma _{22}^{-1} \left[ \begin{array}{c} 1 \\ \hbox {e}^{-i\omega \tau } \\ \vdots \\ \hbox {e}^{-i\omega m\tau } \end{array} \right] \widehat{X}+\sigma _{x}\widehat{\xi }_{x}, \end{aligned}$$

(51)

where $\widehat{X}$ is the Fourier transform of $x_t$ in Eq. (48). We need to simplify the quantity $\varSigma _{12}\varSigma _{22}^{-1}\left[ \begin{array}{cccc} 1&\hbox {e}^{-i\omega \tau }&\cdots&\hbox {e}^{-i\omega m\tau } \end{array} \right] ^\top $ in Eq. (51). Let $S=\varSigma _{12}\varSigma _{22}^{-1}$ be the $1\times \left( m+1\right) $ row vector with components denoted by $S\left[ n\right] $ for $n=0,\ldots ,m$. Then, we can write

$$\begin{aligned} \varSigma _{12}\varSigma _{22}^{-1}\left[ \begin{array}{c} 1 \\ \hbox {e}^{-i\omega \tau } \\ \vdots \\ \hbox {e}^{-i\omega m\tau } \end{array} \right] =S\left[ \begin{array}{c} 1 \\ \hbox {e}^{-i\omega \tau } \\ \vdots \\ \hbox {e}^{-i\omega m\tau } \end{array} \right] =\sum _{n=0}^{m}S\left[ n\right] \hbox {e}^{-i\omega n\tau } := \widehat{S}_m\left( \omega \right) , \end{aligned}$$

(52)

which is nothing but the discrete Fourier transform of S. Notice that, for any $n=0,\ldots ,m$,

$$\begin{aligned} \sum _{k=0}^m S\left[ k\right] \gamma _{xx,m}\left[ n-k\right] = \sum _{k=0}^m S\left[ k\right] \varSigma _{22}\left[ k, n\right] = \varSigma _{12}\left[ n\right] = \gamma _{xy,m} \left[ n\right] \end{aligned}$$

(53)

where the first equality is due to the fact that the process is stationary such that $\varSigma _{22}[k,n] = \gamma _{xx,m}[n-k]$, the second equality is due to $S\varSigma _{22}=\varSigma _{12}$, and the last equality is by the definition of the covariance function. By the discrete convolution theorem, we have

$$\begin{aligned} \widehat{S}_m\left( \omega \right) \widehat{\gamma }_{xx,m}\left( \omega \right) =\widehat{\gamma }_{xy,m} \left( \omega \right) , \end{aligned}$$

(54)

where $\widehat{\gamma }_{xx,m}$ and $\widehat{\gamma }_{xy,m}$ are the discrete Fourier transforms of $\gamma _{xx,m}$ and $\gamma _{xy,m}$, respectively. Solving Eq. (54) for $ \widehat{S}_m\left( \omega \right) $ and substituting the result into Eq. (52), we obtain

$$\begin{aligned} \varSigma _{12}\varSigma _{22}^{-1}\left[ \begin{array}{c} 1 \\ \hbox {e}^{-i\omega \tau } \\ \vdots \\ \hbox {e}^{-i\omega m\tau } \end{array} \right] = \frac{\widehat{\gamma }_{xy,m}\left( \omega \right) }{\widehat{ \gamma }_{xx,m}\left( \omega \right) } \longrightarrow \frac{\widehat{\gamma }_{xy}\left( \omega \right) }{\widehat{\gamma }_{xx}\left( \omega \right) } \quad \text{ as } m\rightarrow \infty , \end{aligned}$$

(55)

where $\widehat{\gamma }_{xx}$ and $\widehat{\gamma }_{xy}$ denote the Fourier transform of the covariance functions $\gamma _{xx}$ and $\gamma _{xy}$.
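The linear algebra behind (52) and (53) can be sketched with stand-in covariance sequences. The geometric sequences below, the values of `m` and `tau`, and the helper `S_hat` are assumptions for illustration only; they are not the covariances of the linear Gaussian model.

```python
import numpy as np

# Assumed stand-in covariance sequences (not those of the linear model).
m, tau = 20, 0.1
lags = np.arange(m + 1)
gamma_xx = 0.8 ** lags            # autocovariance of x at lags 0..m
gamma_xy = 0.5 * 0.6 ** lags      # cross-covariance entries Sigma_12[n]

# Stationarity gives the Toeplitz structure Sigma_22[k, n] = gamma_xx[|n - k|].
Sigma22 = gamma_xx[np.abs(lags[:, None] - lags[None, :])]
Sigma12 = gamma_xy

# S = Sigma_12 Sigma_22^{-1}; Sigma_22 is symmetric, so one solve suffices.
S = np.linalg.solve(Sigma22, Sigma12)

# Check the normal equations (53): sum_k S[k] Sigma_22[k, n] = Sigma_12[n].
residual = np.max(np.abs(S @ Sigma22 - Sigma12))

def S_hat(omega):
    """Finite sum (52): sum_n S[n] exp(-i omega n tau)."""
    return np.sum(S * np.exp(-1j * omega * lags * tau))
```

Solving the linear system instead of forming $\varSigma _{22}^{-1}$ explicitly is a numerical convenience; the memory length $m$ enters only through the size of the Toeplitz matrix and the number of terms in `S_hat`.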

Substituting the limiting case of Eq. (55) into Eq. (51), we can simplify the Fourier transform of the closure model as follows,

$$\begin{aligned} i\omega \widehat{X}=a_{11}\widehat{X}+a_{12}\frac{\widehat{\gamma } _{xy}\left( \omega \right) }{\widehat{\gamma }_{xx}\left( \omega \right) } \widehat{X}+\sigma _{x}\widehat{\xi }_{x}. \end{aligned}$$

(56)

Moreover, based on the Wiener–Khinchin theorem and the cross-correlation theorem, we can further simplify Eq. (56) as

$$\begin{aligned} i\omega \widehat{X}=a_{11}\widehat{X}+a_{12}\frac{\widehat{y}}{\widehat{x}} \widehat{X}+\sigma _{x}\widehat{\xi }_{x}. \end{aligned}$$

(57)
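The step from (56) to (57) can be sketched as follows, under an assumed ordering convention $\gamma _{xx}(s)=\mathbb {E}[x(t)x(t+s)]$ and $\gamma _{xy}(s)=\mathbb {E}[x(t)y(t+s)]$ (this appendix does not restate the convention, so it is an assumption here) and with a common normalization constant omitted:

$$\begin{aligned} \widehat{\gamma }_{xx}(\omega ) \propto \overline{\widehat{x}(\omega )}\,\widehat{x}(\omega ), \quad \widehat{\gamma }_{xy}(\omega ) \propto \overline{\widehat{x}(\omega )}\,\widehat{y}(\omega ), \quad \text{ so } \text{ that } \quad \frac{\widehat{\gamma }_{xy}(\omega )}{\widehat{\gamma }_{xx}(\omega )} = \frac{\widehat{y}(\omega )}{\widehat{x}(\omega )}. \end{aligned}$$

The first relation is the Wiener–Khinchin theorem and the second is the cross-correlation theorem; the factor $\overline{\widehat{x}}$ and the normalization cancel in the ratio.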

Substituting Eqs. (49) and (50) into Eq. (57), we obtain the Fourier transform of the relevant variable, $\widehat{X}$, of the closure model,

$$\begin{aligned} \widehat{X}=\frac{\left( i\omega -\frac{1}{\epsilon }a_{22}\right) \sigma _{x}\widehat{\xi }_{x}+a_{12}\frac{\sigma _{y}}{\sqrt{\epsilon }}\widehat{ \xi }_{y}}{\left( i\omega -a_{11}\right) \left( i\omega -\frac{1}{\epsilon } a_{22}\right) -a_{12}\frac{1}{\epsilon }a_{21}}, \end{aligned}$$

(58)

which is the same as the $\widehat{x}$ of the full model in Eq. (49). Therefore, the ACV of the closure model (48) is consistent with that of the full model (47) in the limit of $m\rightarrow \infty $. In the numerics, the error comes from truncating the memory to a finite number of terms, $m$, in Eq. (52).
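The agreement between (58) and (49) can also be seen without expanding the fractions. Taking the Fourier transform of the first equation in (47) and comparing it with (57) gives

$$\begin{aligned} \left( i\omega -a_{11}-a_{12}\frac{\widehat{y}}{\widehat{x}}\right) \widehat{x} = \sigma _{x}\widehat{\xi }_{x}, \quad \quad \left( i\omega -a_{11}-a_{12}\frac{\widehat{y}}{\widehat{x}}\right) \widehat{X} = \sigma _{x}\widehat{\xi }_{x}, \end{aligned}$$

so that $\widehat{X}=\widehat{x}$ wherever the common factor is nonzero. The first identity is the $x$-equation of the full model with $a_{12}\widehat{y}$ rewritten as $a_{12}(\widehat{y}/\widehat{x})\widehat{x}$, and the second is (57) rearranged.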

About this article

Cite this article

Jiang, S.W., Harlim, J. Modeling of missing dynamical systems: deriving parametric models using a nonparametric framework. Res Math Sci 7, 16 (2020). https://doi.org/10.1007/s40687-020-00217-4

Received: 01 April 2020
Accepted: 23 June 2020
Published: 08 July 2020
DOI: https://doi.org/10.1007/s40687-020-00217-4
