Research paper
atakrig: An R package for multivariate area-to-area and area-to-point kriging predictions

https://doi.org/10.1016/j.cageo.2020.104471

Highlights

  • Point-scale direct- and cross-variograms can be deconvoluted from irregular areas.

  • Spatial scaling conversion between different supports implemented based on Kriging.

  • A flexible area kriging package for general application developed.

Abstract

Geostatistical interpolation methods are used in diverse disciplines, such as environmental science, ecology, and hydrology. With the increasing availability of areal spatial data, area-to-area and area-to-point interpolations have great application potential. In this study, based on the variogram deconvolution algorithm proposed by Goovaerts (2008), an open-source area-to-area kriging package, atakrig, is developed in the R environment. In atakrig, point-scale variograms and cross-variograms can be automatically deconvoluted from spatial areal samples. The package provides a general framework for area-to-area and area-to-point ordinary kriging and cokriging. Two applications show that the package works well in river runoff prediction and in missing data interpolation for remote sensing aerosol optical depth. The package can be deployed on different operating systems and computer hardware platforms.

Introduction

Spatial data conversion between different spatial scales is a common challenge in geographic applications because of spatial mismatch between different data sources, such as various types of observation sensors, sample collection methods, or administrative units. In human population mapping, for example, national census data are usually reported in terms of administrative units. However, accurate downscaled population distributions are very important for making decisions pertaining to public matters, such as environmental pollution mitigation, public health, natural disaster relief operations, and infrastructure allocation (Liu et al., 2008; Dmowska and Stepinski, 2017). The purposes of scaling conversion and prediction might be different for different model inputs or analyses. Geostatistics-based spatial scaling methods, such as area-to-point kriging and area-to-area kriging, have been widely used in diverse applications, including remote sensing downscaling (Pardo-Igúzquiza et al., 2010; Wang et al., 2015), crop yield prediction (Brus et al., 2018), determination of soil organic carbon distribution (Kerry et al., 2012), and disease mapping (Goovaerts, 2009). These methods can be used to disaggregate areal data into predictions at the level of points or of different areas (Gotway and Young, 2002; Kyriakidis, 2004; Yoo and Kyriakidis, 2006). Schirrmann et al. (2012) mapped soil phosphorus content on a fine scale from coarse samples. Based on co-registered multivariate satellite sensor images, Atkinson et al. (2008) and Pardo-Igúzquiza et al. (2006) proposed area-to-point downscaling cokriging for super-resolution mapping with remotely sensed images, where the pixel size to be predicted is smaller than the pixel sizes of the input images. Jin et al. (2018) proposed geographically weighted area-to-area regression kriging to downscale soil moisture data and obtained 1-km resolution soil moisture products from 25-km resolution soil moisture products. In addition to downscaling, to validate a large-scale remote sensing product, Hu et al. (2015) applied area-to-area kriging to multiple irregular small-scale observations for predicting large-scale sensible heat flux, an important index of land surface water and heat balance. Instead of explicitly obtaining a point-scale variogram model from areal samples by deconvolution, Müller and Thompson (2015) proposed a topological restricted maximum likelihood method that accounts for spatial correlations between irregular areas and applied it to predict runoff signatures in ungauged basins.

A myriad of packages and software applications are available for geostatistical prediction, such as gslib (Deutsch and Journel, 1997), gstat (Pebesma and Wesseling, 1998; Pebesma, 2004), geoR (Ribeiro and Diggle, 2018), SAGA (Conrad et al., 2015), and ArcGIS (ESRI, 2018). According to the CRAN task view, more than 10 packages related to geostatistics have been developed in the R language alone (Bivand, 2019). gstat, which offers rich kriging functions, is one of the most frequently used R packages (Pebesma, 2004). The package can be used for area-to-point kriging. It discretizes an area into a regular grid, and each grid cell has equal weight in calculating the areal average value. However, it does not implement a function for fitting a point-scale variogram from areal data, which is necessary for area-to-area kriging. There are also some R packages that handle change-of-support prediction for spatial or spatio-temporal data, such as spatialCovariance for covariance matrix computing (Clifford, 2015), stcos (Bradley et al., 2015), stUPscales (Torres-Matallana, 2019), and rtop (Skøien et al., 2014). These packages mainly consider correlation within a single variable. DSCOKRI is another downscaling cokriging program, intended for remote sensing only and written in Fortran 77 (Pardo-Igúzquiza et al., 2010). Although area-to-area kriging has been applied successfully in many studies, and the associated calculation flow is similar to that of traditional point support kriging, an easily accessible and applicable open-source package for general area-to-area cokriging has not yet been developed. We extend the framework of Goovaerts (2008) with cokriging and implement it in the R environment (R Core Team, 2019).

The remainder of the paper is organized as follows. In Section 2, we present the general theories of area-to-area and area-to-point kriging. The package implementation of the theories is described in Section 3. Then two applications are presented in Section 4. Finally, conclusions are presented in Section 5.

Section snippets

Model description

The theory of area-to-area kriging is well established and has been addressed by many approaches (Goovaerts, 2008; Gottschalk, 1993; Gottschalk et al., 2006; Skøien et al., 2006, 2014). Although it is similar to the theory of traditional point kriging, one of the most important differences between the two is the calculation of covariance between samples. In ordinary area-to-area kriging, for example, the predictor of an area with unknown value is calculated from a linear combination of
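
To illustrate how the covariance calculation differs between area support and point support, the following minimal Python sketch approximates the covariance between two areas as the weighted mean of point covariances over all pairs of discretization points. It is not code from atakrig: the exponential covariance model, its parameters, and the helper names (exp_cov, area_cov, grid) are assumptions chosen for illustration only.

```python
import math

def exp_cov(h, sill=1.0, range_=10.0):
    """Assumed point covariance model: C(h) = sill * exp(-h / range_)."""
    return sill * math.exp(-h / range_)

def area_cov(pts_a, pts_b, w_a=None, w_b=None, cov=exp_cov):
    """Approximate the covariance between two areas as the weighted mean of
    point covariances over all pairs of discretization points.
    Weights default to equal weights that sum to 1 within each area."""
    w_a = w_a or [1.0 / len(pts_a)] * len(pts_a)
    w_b = w_b or [1.0 / len(pts_b)] * len(pts_b)
    total = 0.0
    for (xa, ya), wa in zip(pts_a, w_a):
        for (xb, yb), wb in zip(pts_b, w_b):
            total += wa * wb * cov(math.hypot(xa - xb, ya - yb))
    return total

def grid(x0, y0, size, n):
    """Centres of an n x n grid discretizing a square of side `size`
    whose lower-left corner is (x0, y0)."""
    step = size / n
    return [(x0 + (i + 0.5) * step, y0 + (j + 0.5) * step)
            for i in range(n) for j in range(n)]

# Two hypothetical square areas of side 4, separated by a gap of 2 units.
area_a = grid(0.0, 0.0, 4.0, 4)
area_b = grid(6.0, 0.0, 4.0, 4)

c_aa = area_cov(area_a, area_a)  # within-area covariance (areal variance)
c_ab = area_cov(area_a, area_b)  # covariance between the two areas
print(c_aa, c_ab)
```

With point support, each of these quantities would reduce to a single evaluation of the covariance model. With area support, the within-area term falls below the point sill because values are averaged over the area, and irregular areas can be handled by supplying unequal discretization weights.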

The atakrig package

We developed atakrig in the R environment, a very popular open-source environment for statistics. The package tries to bridge the gap between geostatistical areal prediction and point prediction. The main functions of atakrig include deconvolution of point-scale variograms from irregular/regular spatial areal data and implementation of area-to-area and area-to-point kriging and cokriging. This package was developed as a supplement to geostatistical point-to-point and point-to-area prediction.
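
One building block of variogram deconvolution is regularization: computing the areal variogram implied by a candidate point-scale variogram, so that it can be compared with the variogram estimated from areal data. The Python sketch below shows this forward step only, using the standard relation gamma_R(v_i, v_j) = mean_gamma(v_i, v_j) - 0.5 * [mean_gamma(v_i, v_i) + mean_gamma(v_j, v_j)]. It is an illustration under stated assumptions, not the package's implementation: the exponential model, the parameter values, and all function names are hypothetical.

```python
import math

def exp_vgm(h, psill=1.0, range_=10.0):
    """Assumed point-scale exponential variogram without nugget."""
    return psill * (1.0 - math.exp(-h / range_))

def mean_gamma(pts_a, pts_b, vgm=exp_vgm):
    """Average point variogram over all pairs of discretization points
    (equal weights assumed)."""
    total = sum(vgm(math.hypot(xa - xb, ya - yb))
                for xa, ya in pts_a for xb, yb in pts_b)
    return total / (len(pts_a) * len(pts_b))

def regularized_gamma(pts_a, pts_b, vgm=exp_vgm):
    """Areal variogram value implied by the point model for two areas:
    between-area average minus the mean of the two within-area averages."""
    within = 0.5 * (mean_gamma(pts_a, pts_a, vgm) + mean_gamma(pts_b, pts_b, vgm))
    return mean_gamma(pts_a, pts_b, vgm) - within

def grid(x0, y0, size, n):
    """Centres of an n x n grid discretizing a square of side `size`."""
    step = size / n
    return [(x0 + (i + 0.5) * step, y0 + (j + 0.5) * step)
            for i in range(n) for j in range(n)]

# Two hypothetical square areas whose centroids are 6 units apart.
area_a = grid(0.0, 0.0, 4.0, 4)
area_b = grid(6.0, 0.0, 4.0, 4)

g_ab = regularized_gamma(area_a, area_b)
print(g_ab, exp_vgm(6.0))
```

In an iterative deconvolution in the spirit of Goovaerts (2008), the point model would be adjusted repeatedly until its regularized form is close to the variogram fitted to the areal data; that outer loop is omitted here.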

River runoff prediction

Runoff prediction at unobserved locations is a fundamental problem in hydrology. In many ungauged or poorly gauged basins, reliable runoff prediction remains a major challenge. In this demonstration, we use data from the rtop package developed by Skøien et al. (2014). This dataset contains the recorded average summer runoff of 57 catchment polygons in the federal state of Upper Austria. It is a subset of a full dataset consisting of 134 catchment polygons, which can be downloaded from //www.hydro.tuwien.ac.at/fileadmin/mediapool-hydro/Downloads/rtopData.zip

Conclusion

Area-to-area interpolation is increasingly used, especially because samples with different supports can be obtained easily from different sources. Herein, we developed an area-to-area kriging interpolation package in the popular R environment under the geostatistical framework. It supplements the rich set of existing geostatistical packages. The developed package can be distributed on different operating systems and on different computer hardware platforms, ranging from personal laptops to

Author contribution statement

MH designed the package and wrote the manuscript. YH edited various sections of the manuscript and wrote some functions of the package.

Declaration of competing interest

We declare that we have no conflicts of interest.

Acknowledgments

This work was supported by the National Science and Technology Major Project (grant numbers 2017YFA0604804 and 2017ZX10201302) and the National Natural Science Foundation of China (grant numbers 41771434 and 41601608). We thank Dr. Jon Olav Skøien and two anonymous reviewers for their constructive comments and suggestions, which helped to improve the quality of this manuscript.

References (37)

  • P.M. Atkinson et al. (2008). Downscaling cokriging for super-resolution mapping of continua in remotely sensed images. IEEE Trans. Geosci. Rem. Sens.
  • R. Bivand (2019). CRAN Task View: Analysis of Spatial Data.
  • J.R. Bradley et al. (2015). Spatio-temporal change of support with application to American Community Survey multi-year period estimates. Stat.
  • D. Clifford (2015). spatialCovariance: Computation of Spatial Covariance Matrices for Data on Rectangles.
  • O. Conrad et al. (2015). System for automated geoscientific analyses (SAGA) v. 2.1.4. Geosci. Model Dev.
  • C.V. Deutsch et al. (1997). GSLIB: Geostatistical Software Library and User's Guide.
  • D. Eddelbuettel et al. (2011). Rcpp: seamless R and C++ integration. J. Stat. Software.
  • D. Eddelbuettel (2013).