Abstract
Spatially distributed processes can be modeled as random fields. The complex spatial dependence is then incorporated in the joint probability density function. Knowledge of the joint probability density allows predicting missing data. While environmental data often exhibit significant deviations from Gaussian behavior (rainfall, wind speed, and earthquakes being characteristic examples), only a few non-Gaussian joint probability density functions admit explicit expressions. In addition, random field models are computationally costly for big datasets. We propose an “effective distribution” approach which is based on the product of univariate conditional probability density functions modified by local interactions. The effective densities involve local parameters that are estimated by means of kernel regression. The prediction of missing data is based on the median value from an ensemble of simulated states generated from the effective distribution model. The latter can capture non-Gaussian dependence and is applicable to large spatial datasets, since it does not require the storage and inversion of large covariance matrices. We compare the predictive performance of the effective distribution approach with classical geostatistical methods using Gaussian and non-Gaussian synthetic data. We also apply the effective distribution approach to the reconstruction of gaps in large raster data.
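The two ingredients named above, kernel regression for the local parameters and a median taken over an ensemble of simulated states, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`nadaraya_watson`, `predict_missing`), the Gaussian kernel, the single fixed bandwidth, and the independent Gaussian draws are all assumptions made for illustration. The effective densities of the paper may be non-Gaussian and include local interactions, which this sketch omits.

```python
import numpy as np


def nadaraya_watson(coords_obs, values_obs, coords_query, bandwidth):
    """Kernel-weighted local average (Nadaraya-Watson) with a Gaussian kernel.

    coords_obs: (n, d) observed locations; values_obs: (n,) observed values;
    coords_query: (m, d) locations where the local average is wanted.
    """
    # Pairwise squared distances between query and observed locations.
    diff = coords_query[:, None, :] - coords_obs[None, :, :]
    sq_dist = np.sum(diff**2, axis=-1)
    # Gaussian kernel weights, normalized so that each row sums to one.
    weights = np.exp(-0.5 * sq_dist / bandwidth**2)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values_obs


def predict_missing(coords_obs, values_obs, coords_query, bandwidth,
                    n_sim=500, seed=0):
    """Median of an ensemble drawn from locally parameterized densities."""
    rng = np.random.default_rng(seed)
    # Local mean and local second moment, both estimated by kernel regression.
    local_mean = nadaraya_watson(coords_obs, values_obs, coords_query, bandwidth)
    local_sq = nadaraya_watson(coords_obs, values_obs**2, coords_query, bandwidth)
    local_std = np.sqrt(np.maximum(local_sq - local_mean**2, 0.0))
    # Assumption: independent Gaussian draws per location (illustration only).
    ensemble = rng.normal(local_mean, local_std,
                          size=(n_sim, len(coords_query)))
    # The prediction is the ensemble median at each query location.
    return np.median(ensemble, axis=0)
```

Because only kernel-weighted sums are needed, the sketch never forms or inverts a covariance matrix. For very large datasets, the dense weight matrix would have to be restricted to local neighborhoods, and a query point far from every observation relative to the bandwidth would need special handling, since its weights can underflow to zero.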
Cite this article
Hristopulos, D.T., Baxevani, A. Effective probability distribution approximation for the reconstruction of missing data. Stoch Environ Res Risk Assess 34, 235–249 (2020). https://doi.org/10.1007/s00477-020-01765-5
DOI: https://doi.org/10.1007/s00477-020-01765-5