The following article is Open access

State-space Representation of Matérn and Damped Simple Harmonic Oscillator Gaussian Processes

, , and

Published May 2021 © 2021. The Author(s). Published by the American Astronomical Society.
, , Citation Andrés Jordán et al 2021 Res. Notes AAS 5 107 DOI 10.3847/2515-5172/abfe68

2515-5172/5/5/107

Abstract

Gaussian processes (GPs) are used widely in the analysis of astronomical time series. GPs with rational spectral densities have state-space representations which allow ${ \mathcal O }(n)$ evaluation of the likelihood. We calculate analytic state space representations for the damped simple harmonic oscillator and the Matérn 1/2, 3/2 and 5/2 processes.

Export citation and abstract BibTeX RIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Gaussian processes (GPs) are now widely used in the analysis of astronomical time series. Efficient algorithms have been devised that allow likelihood evaluation in ${ \mathcal O }(n)$ operations, where n is the number of observations (see Foreman-Mackey et al. 2017, and references therein). Linear Gaussian state space models allow likelihood evaluation in ${ \mathcal O }(n)$ operations via the Kalman filter (Kalman 1960). In this note we present analytic linear Gaussian state space model representations for the popular Matérn and Damped Simple Harmonic Oscillator (DSHO) GPs, offering an alternative route to efficient likelihood evaluation for these processes. This representation can offer advantages in terms of numerical stability for the Matérn case (D. Foreman-Mackey 2021, personal communication).

There is a correspondence between a stationary GP over ${\mathbb{R}}$ with kernel $k(\cdot )$ and power spectral density $S(\omega )$ that is a rational function of ω, i.e., $S(\omega )\propto p(\omega )/q(\omega )$ with p and q polynomials, and the prior implied by an order p linear stochastic differential equation given by

Equation (1)

where W is a Wiener process (Brownian motion) with variance ${\sigma }_{w}^{2}$ (Hartikainen & Sarkka 2010; Saatçi 2012; Brockwell & Davis 2013). Equation (1) is equivalent to a state space model with the following state equation

Equation (2)

where ${\boldsymbol{x}}=(f(t),\tfrac{{df}}{{dt}},\ldots ,\tfrac{{d}^{p-1}f}{{{dt}}^{p-1}})$, ${\boldsymbol{L}}={\left(\mathrm{0,0},\ldots ,1\right)}^{\top }$ and

and the following measurement equation

where ${\boldsymbol{b}}={({b}_{0},{b}_{1},\ldots ,{b}_{p-1})}^{\top }$ and bj =0 for $j\gt q$. The observed process y would in the astronomical context used to model a time series y(t) observed at times ${\{{t}_{i}\}}_{i=1}^{n}$ with associated measurement errors ${\{{\sigma }_{i}\}}_{i=1}^{n}$.

If ${\boldsymbol{x}}(t)$ is stationary the process y defined by the above state space model can be shown (e.g., Brockwell 2009) to be a zero-mean Gaussian CARMA($p,q$) model with power spectral density

Equation (3)

and variance ${\boldsymbol{\Sigma }}$ that satisfies

Equation (4)

We can express the process given by Equation (2) as a linear Gaussian state space model observed at a set of input points $\{{\boldsymbol{x}}({t}_{i})\}$

Equation (5)

where ${\delta }_{i}\equiv {t}_{i}-{t}_{i-1}$, ${{\boldsymbol{\Phi }}}_{i-1}\equiv \exp ({\boldsymbol{A}}{\delta }_{i})$ and

Equation (6)

where ${\sigma }_{n}^{2}$ is the measurement error, ${{\boldsymbol{c}}}_{i-1}$ is the last column of the matrix $\exp ({\boldsymbol{A}}({\delta }_{i}-h))$ and ${\boldsymbol{H}}$ is a projection matrix that selects the first component of the vector, which is the process in our state space model. With all the conditional probabilities of our process determined, we can perform inference using a filtering algorithm.

A filtering algorithm computes the conditional probabilities of the states ${{\boldsymbol{x}}}_{k}$ given all previous measurements $\{{{\boldsymbol{y}}}_{1},{{\boldsymbol{y}}}_{2},\ldots ,{{\boldsymbol{y}}}_{k}\}\equiv {{\boldsymbol{y}}}_{1:k}$ and a set of parameters ${\boldsymbol{\theta }}$, i.e., $p({{\boldsymbol{x}}}_{k}| {{\boldsymbol{y}}}_{1:k},{\boldsymbol{\theta }})$. For the linear Gaussian case this can done in ${ \mathcal O }(n)$ operations with a set of recursive equations known as the Kalman filter, which are given by 4

Equation (7)

The mean and covariance matrices of the distributions above are obtained with the Kalman filter prediction and update steps (note that the matrices can depend on ${\boldsymbol{\theta }}$ although we have not written explicitly that dependence). The prediction step is given by

Equation (8)

and the update step by

Equation (9)

The recursion is initialized with a prior mean ${{\boldsymbol{\mu }}}_{0}$ and variance ${{\boldsymbol{P}}}_{0}$. To learn the parameters of a GP that has a corresponding state space representation we are interested in the marginal likelihood which can be computed recursively as

Equation (10)

where we define $p({{\boldsymbol{y}}}_{1}| {{\boldsymbol{y}}}_{0:1})\equiv p({{\boldsymbol{y}}}_{1})$. The terms in the product are part of the output of the Kalman filter (see Equation (7)). Thus, the marginal likelihood can be calculated in ${ \mathcal O }(n)$ operations. Armed with the likelihood, we can sample the posterior using a variety of techniques. We note that the use of the Kalman filter to evaluate the likelihood of a CARMA($p,q$) model in the astronomical context has been presented before in Kelly et al. (2014). In their work the likelihood for a CARMA($p,q$) process is evaluated numerically using the algorithm of Jones & Ackerson (1990) which relies on the diagonalization of the ${\boldsymbol{A}}$ matrix using the roots of the a(z) polynomial. In this note we provide analytic expressions for the ${\boldsymbol{\Phi }}$ and ${\boldsymbol{Q}}$ matrices for some GP kernels of wide astronomical application.

2. State Space Models Corresponding to GP Kernels of Interest

2.1. The Matérn Family

One of the most widely used kernels is the Matérn family, which are indexed by ν and have an associated spectral density for a one-dimensional input given by Rasmussen & Williams (2006)

Equation (11)

and is thus of the form of Equation (3) for $\nu =n+1/2$, $n\in {\mathbb{N}}$. More precisely, we have $a{(z)=(\lambda +z)}^{n+1}$ and $b(z)=1$. As an example, for the popular $\nu =3/2$ kernel the corresponding stochastic differential equation is

Equation (12)

where ${\boldsymbol{x}}({\boldsymbol{t}})\equiv (x(t),\tfrac{{dx}(t)}{{dt}})$ and the variance of dW is given in terms of ${\sigma }_{f}^{2}$ and λ by

in order that the power spectra density of the process corresponds to that of Equation (11) for $\nu =3/2$.

For the Matérn family with $\nu =n+1/2$ the matrices ${{\boldsymbol{\Phi }}}_{i-1}$ and ${{\boldsymbol{Q}}}_{i-1}$ defined in Equation (6) can be computed analytically using the Laplace transform and a symbolic mathematics package. We provide the expressions of these matrices for a set of values of ν below.

2.2. Stochastically Driven Damped Simple Harmonic Oscillator

Foreman-Mackey et al. (2017) proposed the stochastic process given by the following stochastic differential equation (SDE) as a physically motivated process to describe stellar photometric variations

Equation (13)

The corresponding power spectral density is given by

Equation (14)

where ${\sigma }_{w}^{2}$ is the spectral density of the white noise process dW. In terms of the power at ${\omega }_{0}$ we have that ${\sigma }_{w}^{2}=S({\omega }_{0}){\omega }_{0}^{4}/{Q}^{2}$, and in terms of the total power ${S}_{\mathrm{tot}}$ we have 5 that ${\sigma }_{w}^{2}=2{S}_{\mathrm{tot}}{\omega }_{0}^{3}/Q$.

From Equation (13) we see that the DSHO is equivalent to a CARMA(2,0) process with $a(z)={w}_{0}^{2}+({w}_{0}/Q)z+{z}^{2}$ and $b(z)=1$. The corresponding vectorial Markovian SDE is

Equation (15)

where ${\boldsymbol{x}}({\boldsymbol{t}})\equiv (x(t),\tfrac{{dx}(t)}{{dt}})$. Just as the case of the Matérn family, the matrices ${{\boldsymbol{\Phi }}}_{i-1}$ and ${{\boldsymbol{Q}}}_{i-1}$ defined in Equation (6) can be computed analytically. Note that when $Q=1/2$ this process reduces to a Matérn 3/2 process with $\lambda ={\omega }_{0}$.

3. Calculation of Matrices needed for Likelihood Calculation

We follow the procedure described in Saatçi (2012) to calculate analytic expressions for the matrices needed for the likelihood calculation for the Matérn and DSHO processes. The matrices ${\boldsymbol{\Phi }}$ are calculated using the identity

Equation (16)

where ${{ \mathcal L }}^{-1}$ denotes the inverse Laplace transform and ${\boldsymbol{I}}$ is the identity matrix. In the case of the Matérn family of processes, the ${\boldsymbol{Q}}$ matrices can be computed analytically via the integral given in Equation (6). In the case of the DSHO, we use the matrix fraction decomposition. If we define matrices ${\boldsymbol{C}}$ and ${\boldsymbol{D}}$ such that ${\boldsymbol{\Sigma }}={\boldsymbol{C}}{{\boldsymbol{D}}}^{-1}$, then Equation (4) is satisfied when ${\boldsymbol{C}}$ and ${\boldsymbol{D}}$ take the form

Equation (17)

and ${\boldsymbol{\Sigma }}{(t)={\boldsymbol{C}}(t){\boldsymbol{D}}(t)}^{-1}$. To calculate the ${\boldsymbol{Q}}$ matrices for the DSHO we use Equation (17) with ${\boldsymbol{C}}(0)=0$ and ${\boldsymbol{D}}(0)={\boldsymbol{I}}$, calculate the matrix exponential using the inverse Laplace transform, and then set ${{\boldsymbol{Q}}}_{i-1}={\boldsymbol{\Sigma }}({\delta }_{i})$. Note that we set ${\sigma }_{w}^{2}=1$ to reduce the notational clutter, ${\boldsymbol{Q}}$ matrices presented here should be multiplied by the actual value of this variance when using them. We also denote ${\omega }_{0}$ for the damped simple harmonic oscillator simply by ω.

3.1. Matérn $\nu =1/2$ (Exponential or Ohrstein–Uhlenbeck Process)

Equation (18)

Equation (19)

3.2. Matérn $\nu =3/2$

Equation (20)

Equation (21)

3.3. Matérn $\nu =5/2$

Equation (22)

Equation (23)

Equation (24)

3.4. DSHO with Parameters ω and Q. Case $Q\geqslant 1/2$.

Define $\beta =\sqrt{{Q}^{2}-1/4}$.

Equation (25)

Equation (26)

Equation (27)

3.5. DSHO with Parameters ω and Q. Case $Q\lt 1/2$.

Define $\beta =\sqrt{1/4-{Q}^{2}}$. The expressions for ${\boldsymbol{\Phi }}$ and ${\boldsymbol{Q}}$ correspond to those of the case $Q\geqslant 1/2$ using the fact that $\cos ({ix})=\cosh (x)$ and $\sin ({ix})=i\sinh (x)$.

Equation (28)

Equation (29)

Equation (30)

Note that the ${\boldsymbol{\Phi }}$ and ${\boldsymbol{Q}}$ matrices for the DSHO process have well-defined limits as $Q\to {0.5}^{+}$ and $Q\to {0.5}^{-}$, and are given by the corresponding matrices for the Matérn $\nu =3/2$ process. This follows from the fact that ${\mathrm{lim}}_{x\to 0}\sin (x)/x={\mathrm{lim}}_{x\to 0}\sinh (x)/x\,=\,1$.

Footnotes

  • 4  

    An excellent introduction to Bayesian filtering is given by Särkkä (2013), from which we take our notation.

  • 5  

    This follows from ${\int }_{-\infty }^{\infty }\,S(\omega )d\omega =\tfrac{{\sigma }_{w}^{2}Q}{2{\omega }_{0}^{3}}$ which is obtained using ${\int }_{-\infty }^{\infty }\tfrac{1}{{\left({a}^{2}-{z}^{2}\right)}^{2}+{b}^{2}{z}^{2}}{dz}=\tfrac{\pi }{{a}^{2}b}.$

Please wait… references are loading.
10.3847/2515-5172/abfe68