Abstract
We develop a method for probabilistic prediction of extreme value hot-spots in a spatio-temporal framework, tailored to large datasets with substantial gaps. In this setting, direct calculation of summaries from data, such as the minimum over a space-time domain, is not possible. To obtain predictive distributions for such cluster summaries, we propose a two-step approach. We first model marginal distributions with a focus on accurate modeling of the right tail and then, after transforming the data to a standard Gaussian scale, we estimate a Gaussian space-time dependence model defined locally in the time domain for the space-time subregions where we want to predict. In the first step, we detrend the mean and standard deviation of the data and fit a spatially resolved generalized Pareto distribution to apply a correction of the upper tail. To ensure spatial smoothness of the estimated trends, we either pool data using nearest-neighbor techniques, or apply generalized additive regression modeling. To cope with the high space-time resolution of the data, the local Gaussian models use a Markov representation of the Matérn correlation function based on the stochastic partial differential equations (SPDE) approach. In the second step, these local models are fitted in a Bayesian framework through the integrated nested Laplace approximation implemented in R-INLA. Finally, posterior samples are generated to provide statistical inferences through Monte Carlo estimation. Motivated by the 2019 Extreme Value Analysis data challenge, we illustrate our approach by predicting the distribution of local space-time minima in anomalies of Red Sea surface temperatures, using a gridded dataset (11315 days, 16703 pixels) with artificially generated gaps. In particular, we show the improved performance of our two-step approach over a purely Gaussian model without tail transformations.
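As an illustration of the first step described above, the following is a minimal Python sketch of a tail-corrected transformation to the standard Gaussian scale. It is not the authors' implementation (which is R-based and spatially resolved). It assumes the input has already been detrended in mean and standard deviation, uses a rank-based empirical distribution function below the threshold (an assumption made here for simplicity), and fits a generalized Pareto distribution to the threshold exceedances. The function name `to_gaussian_scale` and the parameter `tail_prob` are hypothetical.

```python
import numpy as np
from scipy.stats import genpareto, norm


def to_gaussian_scale(x, tail_prob=0.05):
    """Map a 1-D sample to the standard Gaussian scale with a GPD upper tail.

    Assumption: x is already detrended (mean and standard deviation removed).
    Below the threshold, a rank-based empirical CDF is used; above it, the
    CDF is 1 - P(X > u) * (GPD survival function of the excess over u).
    """
    x = np.asarray(x, dtype=float)
    n = x.size

    # High threshold taken as an empirical quantile (illustrative choice).
    u = np.quantile(x, 1.0 - tail_prob)
    above = x > u
    excess = x[above] - u

    # Fit GPD shape and scale to the exceedances, with location fixed at 0.
    shape, _, scale = genpareto.fit(excess, floc=0.0)

    # Rank-based empirical CDF, kept strictly inside (0, 1).
    ranks = np.argsort(np.argsort(x)) + 1
    p = ranks / (n + 1.0)

    # Replace upper-tail probabilities with the fitted GPD tail.
    p_exceed = above.mean()
    p[above] = 1.0 - p_exceed * genpareto.sf(excess, shape, loc=0.0, scale=scale)

    # Guard against probabilities numerically equal to 0 or 1.
    p = np.clip(p, 1e-10, 1.0 - 1e-10)

    # Probit transform to the standard Gaussian scale.
    return norm.ppf(p), (u, shape, scale)


# Usage on synthetic heavy-tailed data (not the Red Sea dataset).
rng = np.random.default_rng(0)
sample = rng.standard_t(df=4, size=5000)
z, (threshold, gpd_shape, gpd_scale) = to_gaussian_scale(sample)
```

In the setting of the abstract, a transformation of this kind would be applied with spatially varying parameters before the Gaussian dependence model is fitted, and inverted on posterior samples to return to the original scale.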
Acknowledgements
This work started when Daniela Castro-Camilo was a postdoctoral fellow at King Abdullah University of Science and Technology (KAUST). Support from the KAUST Supercomputing Laboratory and access to Shaheen is therefore gratefully acknowledged. Linda Mhalla acknowledges the financial support of the Swiss National Science Foundation.
About this article
Cite this article
Castro-Camilo, D., Mhalla, L. & Opitz, T. Bayesian space-time gap filling for inference on extreme hot-spots: an application to Red Sea surface temperatures. Extremes 24, 105–128 (2021). https://doi.org/10.1007/s10687-020-00394-z
DOI: https://doi.org/10.1007/s10687-020-00394-z