
A second-order iterated smoothing algorithm

Published in Statistics and Computing.

Abstract

Simulation-based inference for partially observed stochastic dynamic models is currently receiving much attention because direct computation of the likelihood is impossible in many practical situations. Iterated filtering methodologies enable maximization of the likelihood function using simulation-based sequential Monte Carlo filters. Doucet et al. (2013) developed an approximation to the first and second derivatives of the log likelihood via simulation-based sequential Monte Carlo smoothing and proved that the approximation has some attractive theoretical properties. We investigated an iterated smoothing algorithm that carries out likelihood maximization using these derivative approximations. Further, we developed a new iterated smoothing algorithm, using a modification of these derivative estimates, for which we establish both theoretical results and effective practical performance. On benchmark computational challenges, this method beat the first-order iterated filtering algorithm, and its performance was comparable to a recently developed iterated filtering algorithm based on an iterated Bayes map. Our iterated smoothing algorithm and its theoretical justification provide new directions for future developments in simulation-based inference for latent variable models such as partially observed Markov process models.


Figures 1–4 (not reproduced).

References

  • Andrieu, C., Doucet, A., Holenstein, R.: Particle Markov chain Monte Carlo methods. J. R. Stat. Soc. Ser. B 72(3), 269–342 (2010)

  • Bhadra, A., Ionides, E.L., Laneri, K., Pascual, M., Bouma, M., Dhiman, R.C.: Malaria in Northwest India: data analysis via partially observed stochastic differential equation models driven by Lévy noise. J. Am. Stat. Assoc. 106, 440–451 (2011)

  • Bjørnstad, O.N., Grenfell, B.T.: Noisy clockwork: time series analysis of population fluctuations in animals. Science 293, 638–643 (2001)

  • Blackwood, J.C., Cummings, D.A.T., Broutin, H., Iamsirithaworn, S., Rohani, P.: Deciphering the impacts of vaccination and immunity on pertussis epidemiology in Thailand. Proc. Natl. Acad. Sci. USA 110, 9595–9600 (2013a)

  • Blackwood, J.C., Streicker, D.G., Altizer, S., Rohani, P.: Resolving the roles of immunity, pathogenesis, and immigration for rabies persistence in vampire bats. Proc. Natl. Acad. Sci. USA 110, 20837–20842 (2013b)

  • Blake, I.M., Martin, R., Goel, A., Khetsuriani, N., Everts, J., Wolff, C., Wassilak, S., Aylward, R.B., Grassly, N.C.: The role of older children and adults in wild poliovirus transmission. Proc. Natl. Acad. Sci. USA 111(29), 10604–10609 (2014)

  • Bretó, C., He, D., Ionides, E.L., King, A.A.: Time series analysis via mechanistic models. Ann. Appl. Stat. 3, 319–348 (2009)

  • Camacho, A., Ballesteros, S., Graham, A.L., Carrat, F., Ratmann, O., Cazelles, B.: Explaining rapid reinfections in multiple-wave influenza outbreaks: Tristan da Cunha 1971 epidemic as a case study. Proc. R. Soc. Lond. Ser. B 278(1725), 3635–3643 (2011)

  • Chopin, N., Jacob, P.E., Papaspiliopoulos, O.: SMC\(^2\): an efficient algorithm for sequential analysis of state space models. J. R. Stat. Soc. Ser. B 75(3), 397–426 (2013)

  • Dahlin, J., Lindsten, F., Schön, T.B.: Particle Metropolis-Hastings using gradient and Hessian information. Stat. Comput. 25(1), 81–92 (2015)

  • Douc, R., Cappé, O., Moulines, E.: Comparison of resampling schemes for particle filtering. In: Proceedings of the 4th International Symposium on Image and Signal Processing and Analysis, 2005, pp 64–69. IEEE, New York (2005)

  • Doucet, A., Jacob, P. E., and Rubenthaler, S.: Derivative-free estimation of the score vector and observed information matrix with application to state-space models (version 2). arXiv:1304.5768v2 (2013)

  • Earn, D.J., He, D., Loeb, M.B., Fonseca, K., Lee, B.E., Dushoff, J.: Effects of school closure on incidence of pandemic influenza in Alberta. Ann. Int. Med. 156(3), 173–181 (2012)

  • Gordon, N.J., Salmond, D.J., Smith, A.F.: Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc. F. Radar Signal Process. 140, 107–113 (1993)

  • He, D., Dushoff, J., Day, T., Ma, J., Earn, D.J.D.: Inferring the causes of the three waves of the 1918 influenza pandemic in England and Wales. Proc. R. Soc. Lond. Ser. B 280(1766), 20131345 (2013)

  • He, D., Ionides, E.L., King, A.A.: Plug-and-play inference for disease dynamics: measles in large and small populations as a case study. J. R. Soc. Interface 7(43), 271–283 (2010)

  • Ionides, E.L., Bhadra, A., Atchadé, Y., King, A.: Iterated filtering. Ann. Stat. 39, 1776–1802 (2011)

  • Ionides, E.L., Bretó, C., King, A.A.: Inference for nonlinear dynamical systems. Proc. Natl. Acad. Sci. USA 103, 18438–18443 (2006)

  • Ionides, E.L., Nguyen, D., Atchadé, Y., Stoev, S., King, A.A.: Inference for dynamic and latent variable models via iterated, perturbed Bayes maps. Proc. Natl. Acad. Sci. USA 112(3), 719–724 (2015)

  • Kevrekidis, I.G., Gear, C.W., Hummer, G.: Equation-free: the computer-assisted analysis of complex, multiscale systems. Am. Inst. Chem. Eng. J. 50, 1346–1354 (2004)

  • King, A.A., Domenech de Cellès, M., Magpantay, F.M.G., Rohani, P.: Avoidable errors in the modelling of outbreaks of emerging pathogens, with special reference to Ebola. Proc. R. Soc. Lond. Ser. B 282, 20150347 (2015)

  • King, A.A., Ionides, E.L., Pascual, M., Bouma, M.J.: Inapparent infections and cholera dynamics. Nature 454, 877–880 (2008)

  • King, A.A., Nguyen, D., Ionides, E.L.: Statistical inference for partially observed Markov processes via the R package pomp. J. Stat. Softw. 69, 1–43 (2016)

  • Kloeden, P.E., Platen, E.: Numerical Solution of Stochastic Differential Equations, 3rd edn. Springer, New York (1999)

  • Kushner, H.J., Clark, D.S.: Stochastic Approximation Methods for Constrained and Unconstrained Systems. Springer, New York (1978)

  • Laneri, K., Bhadra, A., Ionides, E.L., Bouma, M., Dhiman, R.C., Yadav, R.S., Pascual, M.: Forcing versus feedback: epidemic malaria and monsoon rains in Northwest India. PLoS Comput. Biol. 6(9), e1000898 (2010)

  • Laneri, K., Paul, R.E., Tall, A., Faye, J., Diene-Sarr, F., Sokhna, C., Trape, J.-F., Rodó, X.: Dynamical malaria models reveal how immunity buffers effect of climate variability. Proc. Natl. Acad. Sci. USA 112(28), 8786–8791 (2015)

  • Lavine, J.S., King, A.A., Andreasen, V., Bjørnstad, O.N.: Immune boosting explains regime-shifts in prevaccine-era pertussis dynamics. PLoS ONE 8(8), e72086 (2013)

  • Lavine, J.S., Rohani, P.: Resolving pertussis immunity and vaccine effectiveness using incidence time series. Expert Rev. Vaccines 11, 1319–1329 (2012)

  • Macdonald, G.: The Epidemiology and Control of Malaria. Oxford University Press, Oxford (1957)

  • Martinez-Bakker, M., King, A.A., Rohani, P.: Unraveling the transmission ecology of polio. PLoS Biol. 13(6), e1002172 (2015)

  • Nemeth, C., Fearnhead, P., Mihaylova, L.: Particle approximations of the score and observed information matrix for parameter estimation in state space models with linear computational cost. arXiv:1306.0735 (2013)

  • Nguyen, D.: is2: iterated smoothing R package. https://r-forge.r-project.org/projects/is2 (2015)

  • Olsson, J., Cappé, O., Douc, R., Moulines, E.: Sequential Monte Carlo smoothing with application to parameter estimation in nonlinear state space models. Bernoulli 14(1), 155–179 (2008)

  • Poyiadjis, G., Doucet, A., Singh, S.S.: Particle approximations of the score and observed information matrix in state space models with application to parameter estimation. Biometrika 98(1), 65–80 (2011)

  • Romero-Severson, E., Volz, E., Koopman, J., Leitner, T., Ionides, E.: Dynamic variation in sexual contact rates in a cohort of HIV-negative gay men. Am. J. Epidemiol. 182, 255–262 (2015)

  • Ross, R.: The Prevention of Malaria. Dutton, Boston (1910)

  • Roy, M., Bouma, M.J., Ionides, E.L., Dhiman, R.C., Pascual, M.: The potential elimination of Plasmodium vivax malaria by relapse treatment: insights from a transmission model and surveillance data from NW India. PLoS Negl. Trop. Dis. 7, e1979 (2013)

  • Shrestha, S., Foxman, B., Weinberger, D.M., Steiner, C., Viboud, C., Rohani, P.: Identifying the interaction between influenza and pneumococcal pneumonia using incidence data. Sci. Transl. Med. 5(191), 191ra84 (2013)

  • Shrestha, S., King, A.A., Rohani, P.: Statistical inference for multi-pathogen systems. PLoS Comput. Biol. 7(8), e1002135 (2011)

  • Sisson, S.A., Fan, Y., Tanaka, M.M.: Sequential Monte Carlo without likelihoods. Proc. Natl. Acad. Sci. USA 104(6), 1760–1765 (2007)

  • Spall, J.C.: Introduction to Stochastic Search and Optimization. Wiley, Hoboken (2003)

  • Toni, T., Welch, D., Strelkowa, N., Ipsen, A., Stumpf, M.P.: Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems. J. R. Soc. Interface 6, 187–202 (2009)

  • Wood, S.N.: Statistical inference for noisy nonlinear ecological dynamic systems. Nature 466(7310), 1102–1104 (2010)

  • Yıldırım, S., Singh, S.S., Dean, T., Jasra, A.: Parameter estimation in hidden Markov models with intractable likelihoods using sequential Monte Carlo. J. Comput. Graph. Stat. 24, 846–865 (2015)

Acknowledgments

This research was funded in part by National Science Foundation Grant DMS-1308919 and National Institutes of Health Grants 1-U54-GM111274 and 1-U01-GM110712.

Author information

Correspondence to Edward L. Ionides.

Electronic supplementary material

Supplementary material 1 (pdf 603 KB)

Appendix: Proofs

1.1 Proof of Theorem 3

Let

$$\begin{aligned} R=\left[ \begin{array}{cccc} \tau _{0}I_{d\times d} &{}\quad 0_{d\times d} &{}\quad \cdots &{}\quad 0_{d\times d}\\ \tau _{0}I_{d\times d} &{}\quad \tau _{1}I_{d\times d} &{}\quad \ddots &{}\quad \vdots \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ \tau _{0}I_{d\times d} &{}\quad \tau _{1}I_{d\times d} &{}\quad \cdots &{}\quad \tau _{N}I_{d\times d} \end{array}\right] , \end{aligned}$$
(8)

where \(I_{d\times d}\) is the \(d\times d\) identity matrix and \(0_{d\times d}\) is the \(d\times d\) zero matrix; the random walk noise is then \(R\tau Z_{0:N}\). From Assumption 5, \(\breve{\ell }\) is four times continuously differentiable in \(\theta ^{[N+1]}\). Since N is fixed, we can apply Theorem 1, with \(\Sigma =\mathrm {Cov}(RZ_{0:N})=\breve{\Psi }_N\), to obtain the existence of an \(\eta \) and a \(C_8\) such that for every \(\tau < \eta \) we have,

$$\begin{aligned}&\left| \breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{\left[ N+1\right] }\left| \breve{Y}_{1:N}=y^*_{1:N}\right. \right) -\tau ^{2}\breve{\Psi }_N\nabla {\breve{\ell }}\left( \theta ^{\left[ N+1\right] }\right) \right| \nonumber \\&\quad <C_8\tau ^{4}, \end{aligned}$$
(9)

where

$$\begin{aligned} \breve{\Psi }_N=\left[ \begin{array}{cccc} \tau _{0}^2\Psi &{}\quad \tau _{0}^2\Psi &{}\quad \cdots &{}\quad \tau _{0}^2\Psi \\ \tau _{0}^2\Psi &{}\quad (\tau _{0}^2+\tau _{1}^2)\Psi &{}\quad \ddots &{}\quad (\tau _{0}^2+\tau _{1}^2)\Psi \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ \tau _{0}^2\Psi &{}\quad (\tau _{0}^2+\tau _{1}^2)\Psi &{}\quad \cdots &{}\quad \left( \sum _{i=0}^N\tau _{i}^2\right) \Psi \end{array}\right] . \end{aligned}$$
(10)
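As a numerical sanity check (our own sketch, not part of the paper; it assumes \(d=1\) and \(\Psi =1\), so each \(d\times d\) block is a scalar, with arbitrary illustrative values of \(\tau _0,\dots ,\tau _N\)), one can verify that \(RR^{\top }=\mathrm {Cov}(RZ_{0:N})\) reproduces the matrix \(\breve{\Psi }_N\) of Eq. (10):

```python
import numpy as np

# Sketch: verify that for the lower-triangular R of Eq. (8),
# Cov(R Z_{0:N}) = R R^T reproduces Psi_breve_N of Eq. (10).
# Here d = 1 and Psi = 1, so each d x d block is a scalar.
N = 4
tau = np.array([0.7, 0.3, 0.25, 0.2, 0.15])  # illustrative tau_0, ..., tau_N

# R from Eq. (8): row m carries tau_0, ..., tau_m followed by zeros.
R = np.tril(np.tile(tau, (N + 1, 1)))

# Psi_breve_N from Eq. (10): entry (m, n) is sum_{k=0}^{min(m,n)} tau_k^2.
cum = np.cumsum(tau ** 2)
Psi_breve = np.minimum.outer(cum, cum)  # min of increasing cumsums = cumsum at min index

assert np.allclose(R @ R.T, Psi_breve)
```

Entry \((m,n)\) of both matrices is \(\sum _{k=0}^{m\wedge n}\tau _k^2\), matching the block pattern displayed in Eq. (10).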

Note that Assumptions 1 and 2 are automatically satisfied for the multivariate normal distribution with mean zero and variance \(\breve{\Psi }_N\), corresponding to the random variable \(RZ_{0:N}\). As a result, for fixed \(\tau _{0},\dots ,\tau _{N}\) and for a random walk noise, we have

$$\begin{aligned}&\left| \nabla {\breve{\ell }}\left( \theta ^{\left[ N+1\right] }\right) -\tau ^{-2}\breve{\Psi }_N^{-1}\breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{\left[ N+1\right] }\left| \breve{Y}_{1:N}=y^*_{1:N}\right. \right) \right| \\&\quad <C_9\tau ^{2}. \end{aligned}$$

An application of the Gauss–Jordan inverse method gives

$$\begin{aligned}&\breve{\Psi }_N^{-1}=\\&\quad \left[ \begin{array}{cccc} (\tau _{0}^{-2}+\tau _{1}^{-2})\Psi ^{-1} &{}\quad -\tau _{1}^{-2}\Psi ^{-1} &{}\quad \cdots &{}\quad 0\\ -\tau _{1}^{-2}\Psi ^{-1} &{}\quad (\tau _{1}^{-2}+\tau _{2}^{-2})\Psi ^{-1} &{}\quad \cdots &{}\quad \vdots \\ 0 &{}\quad -\tau _{2}^{-2}\Psi ^{-1}&{}\quad \cdots &{}\quad \vdots \\ \vdots &{}\quad \vdots &{}\quad \vdots &{}\quad \vdots \\ 0 &{}\quad 0 &{}\quad (\tau _{N-1}^{-2}+\tau _{N}^{-2})\Psi ^{-1} &{}\quad -\tau _{N}^{-2}\Psi ^{-1}\\ 0 &{}\quad 0 &{}\quad -\tau _{N}^{-2}\Psi ^{-1} &{}\quad \tau _{N}^{-2}\Psi ^{-1} \end{array}\right] . \end{aligned}$$
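The tridiagonal form above can likewise be confirmed numerically (again a sketch under the hypothetical setting \(d=1\), \(\Psi =1\)); the check also exhibits the column-sum property used below:

```python
import numpy as np

N = 4
tau = np.array([0.7, 0.3, 0.25, 0.2, 0.15])  # illustrative tau_0, ..., tau_N
cum = np.cumsum(tau ** 2)
Psi_breve = np.minimum.outer(cum, cum)       # Eq. (10) with d = 1, Psi = 1

inv = np.linalg.inv(Psi_breve)
it2 = tau ** -2.0                            # tau_n^{-2}

# Tridiagonal form stated in the text: diagonal tau_n^{-2} + tau_{n+1}^{-2}
# (last entry tau_N^{-2}), off-diagonal -tau_{n+1}^{-2}.
expected = np.zeros((N + 1, N + 1))
expected[np.arange(N), np.arange(N)] = it2[:-1] + it2[1:]
expected[N, N] = it2[N]
expected[np.arange(N), np.arange(1, N + 1)] = -it2[1:]
expected[np.arange(1, N + 1), np.arange(N)] = -it2[1:]
assert np.allclose(inv, expected)

# Every column sums to 0 except the first, which sums to tau_0^{-2}.
assert np.allclose(inv.sum(axis=0), it2[0] * np.eye(N + 1)[0])
```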

We write \(\nabla _{n} \breve{\ell }(\theta ^{[N+1]})\) for the d-dimensional vector of partial derivatives of \(\breve{\ell }(\theta ^{[N+1]})\) with respect to each of the d components of \(\theta _{n}\). An application of the chain rule gives the identity

$$\begin{aligned} \nabla {{\ell }}\left( \theta \right) =\sum _{n=0}^{N}\nabla _{n} \breve{\ell }\left( \theta ^{[N+1]}\right) , \end{aligned}$$

giving rise to an inequality,

$$\begin{aligned}&\left| \nabla {{\ell }}\left( \theta \right) -\tau ^{-2}\sum _{n=0}^{N}\left\{ \breve{\Psi }_N^{-1}\breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{[N+1]}|\breve{Y}_{1:N}=y^*_{1:N}\right) \right\} _{n}\right| \\&\quad <C_{10}\tau ^{2}, \end{aligned}$$

where \(\left\{ s\right\} _{n}\) denotes entries \(\left\{ dn+1,\ldots ,d(n+1)\right\} \) of a vector \(s\in \mathbb {R}^{d(N+1)}\). Decomposing the matrix multiplication by \(\breve{\Psi }_N^{-1}\) into \(d\times d\) blocks, we have

$$\begin{aligned}&\tau ^{-2}\sum _{n=0}^{N}\left\{ \breve{\Psi }_N^{-1}\breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{[N+1]}|\breve{Y}_{1:N}=y^*_{1:N}\right) \right\} _{n} \nonumber \\&\quad =\tau ^{-2}\sum _{n=0}^{N}\mathrm {SumCol}_{n}\left( \breve{\Psi }_N^{-1}\right) \breve{\mathbb {E}}\left( \breve{\Theta }_{n}-\theta |\breve{Y}_{1:N}=y^*_{1:N}\right) ,\nonumber \\ \end{aligned}$$
(11)

where \(\mathrm {SumCol}_{n}\) is the sum of the nth column in the \(d\times d\) block construction of \(\breve{\Psi }_N^{-1}\). Every block column of \(\breve{\Psi }_N^{-1}\) except the first sums to 0, and this special structure of \(\breve{\Psi }_N^{-1}\) gives a simple form,

$$\begin{aligned}&\left| \sum _{n=0}^{N}\nabla _{n}{\breve{\ell }}\left( \theta ^{[N+1]}\right) -\tau ^{-2}\Psi ^{-1}\tau _{0}^{-2}\breve{\mathbb {E}}\left( \breve{\Theta }_{0}-\theta |\breve{Y}_{1:N}=y^*_{1:N}\right) \right| \\&\quad <C_{11}\tau ^{2}. \end{aligned}$$

This can be written as

$$\begin{aligned} \left| \nabla {\ell }\left( \theta \right) -\tau ^{-2}\Psi ^{-1}\tau _{0}^{-2}\breve{\mathbb {E}}\left( \breve{\Theta }_{0}-\theta |\breve{Y}_{1:N}=y^*_{1:N}\right) \right| <C_{11}\tau ^{2}. \end{aligned}$$

1.2 Proof of Theorem 4

Using a similar setup to the proof above, let the random walk noise be \(R\tau Z_{0:N}\) with R defined as in Eq. (8). By selecting \(p_{\Theta _{0:N}}\) to follow a multivariate normal distribution, Assumption 4 is also satisfied. From Theorem 2, for fixed \(\tau _{0}, \dots , \tau _{N}\), there exist \(\eta \) and \(C_{12}\) such that for \(0<\tau <\eta \),

$$\begin{aligned}&\left| \nabla ^2{\breve{\ell }}\left( \theta ^{[N+1]}\right) -\tau ^{-4}\right. \nonumber \\&\quad \left. \left[ \breve{\Psi }_N^{-1}\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau ^{2}\breve{\Psi }_N\right) \breve{\Psi }_N^{-1}\right] \right| \nonumber \\&\qquad <C_{12}\tau ^{2}. \end{aligned}$$
(12)

Define \(\nabla ^2_{s,n}{\breve{\ell }}\left( \theta ^{[N+1]}\right) \) as

$$\begin{aligned} \nabla ^{2}_{s,n}\breve{\ell }\left( \theta ^{[N+1]}\right) =\frac{{{\partial }}^{2}\breve{\ell }\left( \theta ^{[N+1]}\right) }{{{\partial \theta }}_{s}{{\partial \theta }}_{n}}. \end{aligned}$$

Applying the chain rule, we have

$$\begin{aligned} \nabla ^2{\ell }\left( \theta \right) =\sum _{s=0}^{N}\sum _{n=0}^{N}\nabla _{s,n}^2{\breve{\ell }}\left( \theta ^{[N+1]}\right) . \end{aligned}$$

Summing the terms of Eq. (12) over all blocks, we get

$$\begin{aligned}&\left| \sum _{s=0}^{N}\sum _{n=0}^{N}\nabla _{s,n}^{2}{\breve{\ell }}(\theta ^{[N+1]})-\tau ^{-4}\right. \\&\left. \sum _{s=0}^{N}\sum _{n=0}^{N}\left[ \breve{\Psi }_N^{-1}\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau ^{2}\breve{\Psi }_N\right) \breve{\Psi }_N^{-1}\right] _{s,n}\right| \\&\quad <C_{14}\tau ^{2}, \end{aligned}$$

where \(\left[ A \right] _{s,n}\) denotes the \(d\times d\) block formed by rows \(\left\{ ds+1,\ldots ,d(s+1)\right\} \) and columns \(\left\{ dn+1,\ldots ,d(n+1)\right\} \) of a matrix \(A\in \mathbb {R}^{d(N+1)\times d(N+1)}\). Therefore,

$$\begin{aligned}&\Bigg |\nabla ^2{\ell }\left( \theta \right) -\tau ^{-4}\Bigg .\\&\left. \sum _{s=0}^{N}\sum _{n=0}^{N}\left[ \breve{\Psi }_N^{-1}\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau ^{2}\breve{\Psi }_N\right) \breve{\Psi }_N^{-1}\right] _{s,n}\right| \\&\quad <C_{14}\tau ^{2}. \end{aligned}$$

Defining \(\mathrm {SumCol}_n\) as in Eq. (11), we have

$$\begin{aligned}&\sum _{s=0}^{N}\sum _{n=0}^{N}\left[ \breve{\Psi }_N^{-1}\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau ^{2}\breve{\Psi }_N\right) \breve{\Psi }_N^{-1}\right] _{s,n}\\&\quad =\sum _{s=0}^{N}\sum _{n=0}^{N}\mathrm{SumCol}_{s}\left( \breve{\Psi }_N^{-1}\right) \\&\qquad \times \left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{s},\breve{\Theta }_{n}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\sum _{k=0}^{s\wedge n}\tau _k^{2}\tau ^2\Psi \right) \mathrm {SumCol}_{n}\left( \breve{\Psi }_N^{-1}\right) \\&\quad =\tau _0^{-4}\Psi ^{-1}\left( {\breve{\mathrm{V}}\mathrm {ar}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_0|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau _0^{2}\tau ^2\Psi \right) \Psi ^{-1}. \end{aligned}$$

The last equality follows since \(\breve{\Psi }_N^{-1}\) is a symmetric matrix of \(d\times d\) blocks for which every block column except the first sums to 0, while the first block column sums to \(\tau _0^{-2}\Psi ^{-1}\). Thus, we obtain

$$\begin{aligned}&\left| \nabla ^2{\ell }\left( \theta \right) -\tau ^{-4}\tau _0^{-4}\Psi ^{-1}\left( {\breve{\mathrm{V}}\mathrm {ar}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\tau _0^{2}\tau ^2\Psi \right) \Psi ^{-1}\right| \\&\quad <C_{15}\tau ^{2}. \end{aligned}$$
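The reduction of the double block sum to the \((0,0)\) block can be verified numerically for an arbitrary symmetric matrix \(M\) standing in for the bracketed covariance term (a sketch with \(d=1\), \(\Psi =1\)):

```python
import numpy as np

N = 4
tau = np.array([0.7, 0.3, 0.25, 0.2, 0.15])  # illustrative tau_0, ..., tau_N
cum = np.cumsum(tau ** 2)
Psi_breve = np.minimum.outer(cum, cum)       # Eq. (10) with d = 1, Psi = 1
P = np.linalg.inv(Psi_breve)

rng = np.random.default_rng(0)
A = rng.standard_normal((N + 1, N + 1))
M = A + A.T                                  # arbitrary symmetric stand-in

# Summing all (s, n) blocks of P M P picks out tau_0^{-4} * M_{0,0},
# because the block-column sums of P vanish except for the first.
lhs = (P @ M @ P).sum()
rhs = M[0, 0] / tau[0] ** 4
assert np.isclose(lhs, rhs)
```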

1.3 Proof of Theorem 5

In order to prove Theorem 5, we use the following corollary to Theorem 1.

Corollary 1

Suppose the perturbation kernel takes a value \(\kappa \in \mathcal {K}\) satisfying Assumptions 6 and 7. Suppose also Assumption 3. There exists an \(\eta \) and a constant \(C_{16}\) such that for every \(0<\tau <\eta \) and every \(\kappa \in \mathcal {K}\),

$$\begin{aligned} \left| \breve{\mathbb {E}}\left( \breve{\Theta }-\theta \left| \breve{Y}=y^*\right. \right) -\tau ^{2}\Sigma \nabla \ell \left( \theta \right) \right| <C_{16}\tau ^{4}. \end{aligned}$$
(13)
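To illustrate the bound (13), consider a hypothetical scalar toy example (our own, not from the paper): \(\breve{Y}\,|\,\breve{\Theta }=\vartheta \sim N(\vartheta ,1)\) with perturbation \(\breve{\Theta }\sim N(\theta ,\tau ^2)\), so \(\Sigma =1\), \(\nabla \ell (\theta )=y^*-\theta \), and the posterior mean shift is available in closed form:

```python
# Hypothetical scalar toy model: y* ~ N(theta, 1), Theta_breve ~ N(theta, tau^2).
# Here l(theta) = log N(y*; theta, 1), so the score is y* - theta, and
# E[Theta_breve - theta | Y_breve = y*] = tau^2 (y* - theta) / (1 + tau^2)
# by standard normal-normal conjugacy.
theta, ystar = 0.3, 1.7
score = ystar - theta
for tau in [0.2, 0.1, 0.05]:
    shift = tau**2 * (ystar - theta) / (1 + tau**2)
    # |shift - tau^2 * Sigma * score| = tau^4 |score| / (1 + tau^2) = O(tau^4),
    # in agreement with the bound (13).
    assert abs(shift - tau**2 * score) <= abs(score) * tau**4
```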

Corollary 1 follows directly from applying Theorem 1, noting that Assumptions 6 and 7 imply a uniform bound on \(C_2\) in Theorem 1. Applying Corollary 1, we obtain the existence of an \(\eta \) and \(C_{18}\) such that for every \(\tau < \eta \) we have

$$\begin{aligned}&\left| \tau ^{2}\breve{\Psi }_N\nabla {\breve{\ell }}\left( \theta ^{\left[ N+1\right] }\right) -\breve{\mathbb {E}}\left( \breve{\Theta }_{0:N}-\theta ^{\left[ N+1\right] }\left| \breve{Y}_{1:N}=y^*_{1:N}\right. \right) \right| \nonumber \\&\quad <C_{18}\tau ^{4}. \end{aligned}$$
(14)

For compactness of notation, we write \(E_n=\breve{\mathbb {E}}\left( \breve{\Theta }_{n}-\theta \left| \breve{Y}_{1:N}=y^*_{1:N}\right. \right) \) and \(D_n=\Psi \nabla _n{\breve{\ell }}\left( \theta ^{\left[ N+1\right] }\right) \). With \(\breve{\Psi }_N\) as in Eq. (10), writing out terms of the vector equation in (14) gives

$$\begin{aligned}&E_0 = \tau ^2 \tau _0^2 \sum _{n=0}^N D_n +O(\tau ^4), \end{aligned}$$
(15)
$$\begin{aligned}&E_1 = \tau ^2 \tau _0^2 \sum _{n=0}^N D_n + \tau ^2 \tau _1^2 \sum _{n=1}^N D_n+O(\tau ^4), \end{aligned}$$
(16)
$$\begin{aligned}&\vdots \end{aligned}$$
(17)
$$\begin{aligned}&E_{N-1} = \tau ^2 \tau _0^2 \sum _{n=0}^N D_n +\cdots + \tau ^2 \tau _{N-1}^2 \sum _{n=N-1}^N D_n +O(\tau ^4), \nonumber \\ \end{aligned}$$
(18)
$$\begin{aligned}&E_N = \tau ^2 \tau _0^2 \sum _{n=0}^N D_n +\cdots + \tau ^2 \tau _{N}^2 D_N +O(\tau ^4). \end{aligned}$$
(19)

Using our assumption that \(\tau _n=O(\tau ^2)\) for all \(n=1,\ldots ,N\), we get \(E_n=E_0+O(\tau ^{4})\), from which we can conclude that

$$\begin{aligned} \frac{1}{N+1}\sum _{n=0}^NE_n=E_0+O(\tau ^{4}). \end{aligned}$$
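Equations (15)–(19) and the averaging step can be checked numerically (a sketch assuming \(d=1\), \(\Psi =1\), with illustrative values \(\tau _0=0.5\) and \(\tau _n=0.01\) for \(n\ge 1\), consistent with \(\tau _n=O(\tau ^2)\)):

```python
import numpy as np

N = 3
tau = 0.2                                    # overall scale
tau_n = np.array([0.5, 0.01, 0.01, 0.01])    # tau_0 fixed, tau_n small for n >= 1
cum = np.cumsum(tau_n ** 2)
Psi_breve = np.minimum.outer(cum, cum)       # Eq. (10) with d = 1, Psi = 1

rng = np.random.default_rng(1)
D = rng.standard_normal(N + 1)               # stands in for D_n = Psi grad_n l
E = tau ** 2 * Psi_breve @ D                 # leading term of Eq. (14)

# Eq. (15): E_0 = tau^2 tau_0^2 sum_n D_n.
assert np.isclose(E[0], tau**2 * tau_n[0]**2 * D.sum())
# Eq. (16): E_1 adds the term tau^2 tau_1^2 sum_{n>=1} D_n.
assert np.isclose(E[1], E[0] + tau**2 * tau_n[1]**2 * D[1:].sum())
# With tau_n = O(tau^2) for n >= 1, the average of the E_n stays within
# O(tau^4) of E_0, as claimed.
assert abs(E.mean() - E[0]) < tau ** 4
```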

From Eq. (14) and the chain rule, for fixed \(\tau _0\) we have

$$\begin{aligned} \left| \nabla {\ell }\left( \theta \right) -\tau ^{-2}\Psi ^{-1}\tau _{0}^{-2}\breve{\mathbb {E}}\left( \breve{\Theta }_{0}-\theta |\breve{Y}_{1:N}=y^*_{1:N}\right) \right| <C_{19}\tau ^{2}, \end{aligned}$$

which then completes the proof.

1.4 Proof of Theorem 6

As for Theorem 5, we need a corollary to Theorem 2 over the kernel set \(\mathcal {K}\), making use of Assumptions 6 and 7.

Corollary 2

Suppose Assumptions 3, 6, and 7. Suppose also that every kernel in \(\mathcal {K}\) satisfies the mesokurtic condition in Assumption 3. There exists an \(\eta \) and a constant \(C_{20}\) such that for every \(0<\tau <\eta \) and every \(\kappa \in \mathcal {K}\),

$$\begin{aligned}&\left| \breve{\mathbb {E}}\left[ \left( \breve{\Theta }-\theta \right) \left( \breve{\Theta }-\theta \right) ^{\top }\left| \breve{Y}=y^*\right. \right] -\tau ^{2}\Sigma -\tau ^{4}\Sigma \left( \nabla ^{2}\ell (\theta )\right) \Sigma \right| \\&\quad < C_{20}\tau ^{6}. \end{aligned}$$

Applying Corollary 2, we have

$$\begin{aligned}&\left| \tau ^{4}\breve{\Psi }_N\nabla ^2{\breve{\ell }}\left( \theta ^{[N+1]}\right) \breve{\Psi }_N\right. \\&\quad \left. -\left( {\breve{\mathrm{C}}\mathrm {ov}}_{\theta ^{[N+1]},\tau }\left( \breve{\Theta }_{0:N}|\breve{Y}_{1:N}{=}y^*_{1:N}\right) {-}\tau ^{2}\breve{\Psi }_N\right) \right| <C_{21}\tau ^{6}. \end{aligned}$$

For compact notation, we write

$$\begin{aligned} {\breve{\mathrm{C}}\mathrm {ov}}_{s,n}=\breve{\mathrm{C}}{\mathrm {ov}}\left( \breve{\Theta }_{s},\breve{\Theta }_{n}|\breve{Y}_{1:N}=y^*_{1:N}\right) -\sum _{k=0}^{s\wedge {n}}\tau _k^{2}\tau ^2\Psi \end{aligned}$$

and

$$\begin{aligned} \nabla ^2_{s,n}\breve{\ell }= \nabla ^2_{s,n}\breve{\ell }\left( \theta ^{[N+1]}\right) . \end{aligned}$$

From the diagonal terms of the above matrix norm inequality, we derive \(N+1\) equations,

$$\begin{aligned}&\breve{\mathrm{C}}{\mathrm {ov}}_{0,0}=\tau ^4\left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{0,0}+O(\tau ^6), \end{aligned}$$
(20)
$$\begin{aligned}&{\breve{\mathrm{C}}\mathrm {ov}}_{1,1}=\tau ^4\left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{1,1}+O(\tau ^6), \end{aligned}$$
(21)
$$\begin{aligned}&\vdots \end{aligned}$$
(22)
$$\begin{aligned}&{\breve{\mathrm{C}}\mathrm {ov}}_{N-1,N-1}=\tau ^4\left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{N-1,N-1}+O(\tau ^6), \end{aligned}$$
(23)
$$\begin{aligned}&{\breve{\mathrm{C}}\mathrm {ov}}_{N,N}=\tau ^4\left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{N,N}+O(\tau ^6). \end{aligned}$$
(24)

Using (20) through (24), and expanding out a matrix multiplication, we get

$$\begin{aligned} \left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{n,n} = \Psi \left[ \sum _{i=0}^{N}\sum _{j=0}^{N}\left( \sum _{k=0}^{n\wedge i}\tau _k^2\right) \left( \sum _{k=0}^{n\wedge j}\tau _k^2\right) \nabla ^2_{i,j}\breve{\ell }\right] \Psi , \end{aligned}$$

since the \((n,i)\)th \(d\times d\) block of \(\breve{\Psi }_N\) is \(\big (\sum _{k=0}^{n\wedge i}\tau _k^2\big )\Psi \).

Using our assumption that \(\tau _n=O(\tau ^2)\) for all \(n=1,\ldots ,N\), we get that

$$\begin{aligned} \breve{\mathrm{C}}\mathrm {ov}_{n,n}=\breve{\mathrm{C}}\mathrm {ov}_{0,0}+O(\tau ^{6}), \end{aligned}$$

from which we can conclude that

$$\begin{aligned} \frac{1}{N+1}\sum _{n=0}^N\breve{\mathrm{C}}\mathrm {ov}_{n,n}=\breve{\mathrm{C}}\mathrm {ov}_{0,0}+O(\tau ^{6}). \end{aligned}$$

For \(n=0\), we have

$$\begin{aligned} \left[ \breve{\Psi }_N\nabla ^2\breve{\ell }\breve{\Psi }_N\right] _{0,0}=\tau _0^4\,\Psi \left( \sum _{j=0}^N\sum _{i=0}^N\nabla ^2_{i,j}\breve{\ell }\right) \Psi , \end{aligned}$$
(25)

from which applying the chain rule identity \(\nabla ^2\ell (\theta )=\sum _{s=0}^{N}\sum _{n=0}^{N}\nabla ^2_{s,n}\breve{\ell }(\theta ^{[N+1]})\) completes the proof.
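Equation (25) can be confirmed numerically for an arbitrary symmetric stand-in \(H\) for the Hessian blocks (a sketch with \(d=1\), \(\Psi =1\)):

```python
import numpy as np

N = 4
tau_n = np.array([0.7, 0.3, 0.25, 0.2, 0.15])  # illustrative tau_0, ..., tau_N
cum = np.cumsum(tau_n ** 2)
Psi_breve = np.minimum.outer(cum, cum)         # Eq. (10) with d = 1, Psi = 1

rng = np.random.default_rng(2)
A = rng.standard_normal((N + 1, N + 1))
H = A + A.T                                    # stands in for the Hessian blocks

# Eq. (25): the (0,0) block of Psi_breve H Psi_breve is tau_0^4 sum_{i,j} H_{i,j},
# because every entry in row/column 0 of Psi_breve equals tau_0^2.
lhs = (Psi_breve @ H @ Psi_breve)[0, 0]
rhs = tau_n[0] ** 4 * H.sum()
assert np.isclose(lhs, rhs)
```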

Cite this article

Nguyen, D., Ionides, E.L. A second-order iterated smoothing algorithm. Stat Comput 27, 1677–1692 (2017). https://doi.org/10.1007/s11222-016-9711-9
