Bayesian prediction of spatial data with non-ignorable missingness

  • Regular Article
Statistical Papers

Abstract

In spatial data, especially geostatistical data whose measurements are often obtained by satellite scanning, parts of the data may be missing. Because of the spatial dependence in the data, these missing values are likely caused by some latent spatial random fields. In this case, ignoring the missingness is not justified and may lead to invalid inferences, so incorporating a model for the missingness process into the inference could improve the results. There are several approaches to accounting for non-ignorable missingness; one of them is the shared parameter model. In this paper, we extend it to spatial data, obtaining a joint spatial Bayesian shared parameter model. The missingness process is modeled jointly with the measurement process, and one or more latent spatial random fields, acting as shared parameters, describe their association. Bayesian inference is implemented by the integrated nested Laplace approximation (INLA). A computationally efficient approach based on a stochastic partial differential equation (SPDE) is used to approximate the latent Gaussian random field. In a simulation study, the proposed spatial joint model is compared with a model that assumes the data are missing at random. Both models are then used to analyze lake surface water temperature data for Lake Vänern in Sweden. The estimation and prediction results confirm the efficiency of the spatial joint model.
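
The idea of a shared latent field can be illustrated with a small simulation. The sketch below is not the model fitted in the paper and does not use INLA or the SPDE approximation; it only assumes, for illustration, that a single Gaussian field with exponential covariance enters both a Gaussian measurement equation and a logistic equation for the probability of being observed. The function name `simulate_shared_field`, the intercepts, and all parameter values are hypothetical.

```python
import numpy as np


def simulate_shared_field(n=400, beta=-2.0, corr_range=0.2, noise_sd=0.3, seed=0):
    """Simulate measurements and an observation indicator sharing one latent field.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)

    # Random locations in the unit square.
    coords = rng.uniform(0.0, 1.0, size=(n, 2))

    # Exponential covariance between all pairs of locations.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    cov = np.exp(-dists / corr_range)

    # Draw the latent Gaussian field w via a Cholesky factor (small jitter for stability).
    chol = np.linalg.cholesky(cov + 1e-8 * np.eye(n))
    w = chol @ rng.standard_normal(n)

    # Measurement process: intercept + shared field + independent noise.
    y = 10.0 + w + noise_sd * rng.standard_normal(n)

    # Missingness process: the same field w drives the probability of being observed.
    p_obs = 1.0 / (1.0 + np.exp(-(0.5 + beta * w)))
    observed = rng.uniform(size=n) < p_obs

    return coords, w, y, observed


if __name__ == "__main__":
    coords, w, y, observed = simulate_shared_field()
    print("mean of all measurements:     ", y.mean())
    print("mean of observed measurements:", y[observed].mean())
```

With a negative `beta`, locations where the field is high are less likely to be observed, so an average over the observed values alone tends to understate the average over all locations. Setting `beta = 0` makes the missingness independent of the field, which corresponds to the situation in which ignoring the missingness process would be harmless.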

Acknowledgements

The authors thank Dr Claire Miller of the School of Mathematics and Statistics, University of Glasgow, for introducing the Arc-Lake project and for her guidance on the data set during the research for this paper. We also thank Professor Ingelin Steinsland of the Department of Mathematical Sciences at the Norwegian University of Science and Technology and Professor Alfred Stein of the Department of Earth Observation Science at the University of Twente for their helpful guidance. Support from the Center of Excellence in Analysis of Spatio-Temporal Correlated Data at Tarbiat Modares University is also acknowledged.

Author information

Corresponding author

Correspondence to Mohsen Mohammadzadeh.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zahmatkesh, S., Mohammadzadeh, M. Bayesian prediction of spatial data with non-ignorable missingness. Stat Papers 62, 2247–2268 (2021). https://doi.org/10.1007/s00362-020-01186-0
