Elsevier

Spatial Statistics

Volume 43, June 2021, 100520

Compositionally-warped additive mixed modeling for a wide variety of non-Gaussian spatial data

https://doi.org/10.1016/j.spasta.2021.100520

Abstract

With the advancement of geographical information systems, non-Gaussian spatial data sets are becoming larger and more diverse. This study develops a general framework for fast and flexible non-Gaussian regression, especially for spatial/spatiotemporal modeling. The developed model, termed the compositionally-warped additive mixed model (CAMM), combines an additive mixed model (AMM) and the compositionally-warped Gaussian process to model a wide variety of non-Gaussian continuous data including spatial and other effects. A specific advantage of the proposed CAMM is that, unlike existing AMMs, it requires no explicit assumption about the data distribution. Monte Carlo experiments demonstrate the estimation accuracy and computational efficiency of CAMM for modeling non-Gaussian data, including fat-tailed and/or skewed distributions. Finally, the model is applied to crime data to examine its empirical performance in regression analysis and prediction. The results show that CAMM provides intuitively reasonable coefficient estimates and outperforms AMM in terms of prediction accuracy. CAMM is verified to be a fast and flexible model that potentially covers a wide variety of non-Gaussian data modeling.

Introduction

A wide variety of spatial and spatiotemporal data is now becoming available. In addition to conventional spatial data collected and published in a top-down manner (e.g., census statistics), an increasing number of sensing data sets (e.g., remotely sensed images, smart sensor data) and other spatial data assembled, estimated, and disseminated by private companies and volunteers are available in the era of open data (volunteered geographic information; see Haklay, 2013). Such databases include Google Earth Engine (https://earthengine.google.com/), WorldPop (https://www.worldpop.org/), and Worldometer (https://www.worldometers.info/). This rapid growth of spatiotemporal open data drives people to use regression modeling to reveal hidden factors behind social issues such as crime occurrence (e.g., Kajita and Kajita, 2020) and the spread of COVID-19 (e.g., Sannigrahi et al., 2020).

Together with the increase in spatial data, regression modeling for a wide variety of non-Gaussian spatial data is becoming increasingly important. According to Yan et al. (2020), representative approaches to modeling non-Gaussian spatial data are classified into (a) variable transformation and (b) generalized linear modeling. The former converts explained variables, which have a non-Gaussian distribution, to Gaussian variables through a transformation function. Logarithmic transformation (Dowd, 1982), Box–Cox transformation (Kitanidis and Shen, 1996), Tukey g-and-h transformation (Xu and Genton, 2017), and other transformation functions have been used in geostatistics (see Cressie and Wikle, 2011). In the machine learning literature, the warped Gaussian process (WGP; Snelson et al., 2003) is a general framework including the aforementioned transformations. It has been extended to the Bayesian WGP (Lázaro-Gredilla, 2012) and the compositionally-warped GP (CWGP; Rios and Tobar, 2019), which we will focus on later.

The generalized linear model (b) is widely used for spatial modeling as well. This model accommodates spatial random effects and has been developed and extended under the Bayesian framework (Gotway and Stroup, 1997; Diggle et al., 1998). While Bayesian models can be slow because of the simulation step, fast Bayesian inference methods, including the integrated nested Laplace approximation (Rue et al., 2009) and the Vecchia–Laplace approximation (Zilber and Katzfuss, 2021), have been developed for non-Gaussian spatial regression modeling. The generalized additive (mixed) model (e.g., Wood, 2017) is another popular model that is applicable to non-Gaussian spatial data modeling (see Umlauf et al., 2015). Computationally efficient estimation algorithms for the generalized additive model, including fast restricted maximum likelihood (REML; Wood, 2011) and the separation of anisotropic penalties algorithm (Rodríguez-Álvarez et al., 2015), have been developed and implemented in a wide variety of software packages (see Mai and Zhang, 2018 for a review).

A critical limitation of the above-mentioned non-Gaussian models is the need to assume the data distribution a priori. Because the true distribution behind data is usually unknown in empirical studies, a distributional assumption can lead to model misspecification. As an exception, CWGP, which is categorized in (a), estimates the data distribution without explicitly assuming an a priori distribution (see Section 3). In short, CWGP is a distribution-free model that is potentially useful for a wide variety of spatial and spatiotemporal data, although the original CWGP cannot consider spatial and/or temporal effects.

This study proposes compositionally-warped additive mixed modeling (CAMM) as a unified regression framework for a wide variety of Gaussian and non-Gaussian continuous data. It is developed by combining CWGP and additive mixed models (AMM): the former estimates the data distribution without explicit prior information, and the latter quantifies the spatial and other smooth and/or group effects depending on covariates. The remainder of this paper is organized as follows. Sections 2 and 3 introduce AMM and CWGP, respectively. Then, Section 4 develops CAMM by combining them. Section 5 performs Monte Carlo experiments to examine the coefficient estimation accuracy and computational efficiency of the developed model. Section 6 applies CAMM to crime analysis in Tokyo, Japan. Finally, Section 7 concludes our discussion.

Section snippets

Additive mixed model (AMM)

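AMM is a regression model which accommodates spatial, temporal, and many other effects. The linear AMM is defined as follows:

$$\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \sum_{k=1}^{K} \mathbf{b}_k + \boldsymbol{\varepsilon}, \quad \boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I}),$$

where $\mathbf{y} = [y_1, \ldots, y_N]'$ is an $N \times 1$ vector of explained variables, $\mathbf{X}$ is an $N \times J$ matrix of $J$ covariates assuming fixed effects, and $\boldsymbol{\beta}$ is a $J \times 1$ vector of fixed coefficients. The prime "$'$" represents the matrix transpose. $\sigma^2$ is the variance parameter, $\mathbf{0}$ is a zero vector, and $\mathbf{I}$ is an identity matrix. $K$ is the number of random effects.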

The parameter $\mathbf{b}_k$ is a vector
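
As a rough illustration of the linear AMM above, the following Python sketch simulates data with a single random effect (K = 1) and recovers the fixed coefficients. It assumes, for illustration only, that the random effect is a group-level intercept written as a one-hot design matrix times Gaussian coefficients, with its own standard deviation `tau`. The function names, the group structure, and the parameter values are hypothetical, and the generalized least squares (GLS) step with known variances is a stand-in, not the estimation procedure used in the paper.

```python
import numpy as np


def simulate_amm(n=500, n_groups=20, beta=(1.0, -0.5), sigma=0.5, tau=1.0, seed=0):
    """Simulate y = X beta + b + eps with one group-level random effect (K = 1)."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(n), rng.normal(size=n)])  # N x J fixed-effect design
    groups = rng.integers(0, n_groups, size=n)              # hypothetical group labels
    E = np.eye(n_groups)[groups]                            # N x G one-hot design matrix
    gamma = rng.normal(scale=tau, size=n_groups)            # group-level random coefficients
    b = E @ gamma                                           # random-effect vector b_1
    eps = rng.normal(scale=sigma, size=n)                   # Gaussian noise
    y = X @ np.asarray(beta) + b + eps
    return y, X, E


def gls_beta(y, X, E, sigma, tau):
    """GLS estimate of beta under the marginal covariance sigma^2 I + tau^2 E E'."""
    V = sigma**2 * np.eye(len(y)) + tau**2 * (E @ E.T)
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)


y, X, E = simulate_amm()
beta_hat = gls_beta(y, X, E, sigma=0.5, tau=1.0)
print(beta_hat)  # estimates of (intercept, slope); the simulated truth is (1.0, -0.5)
```

In practice the variance parameters are unknown and would be estimated jointly, for example by restricted maximum likelihood; they are fixed here only to keep the sketch short.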

Compositionally-warped Gaussian process (CWGP)

Snelson et al. (2003) developed the warped GP (WGP) that converts non-Gaussian explained variables $\mathbf{y}$ to Gaussian variables through a transformation (or warping) function $\varphi(\cdot)$. For independent samples, the WGP is defined as

$$\varphi_{\omega}(\mathbf{y}) \sim N(\boldsymbol{\mu}, \sigma^2 \mathbf{I}),$$

where $\varphi_{\omega}(\mathbf{y}) = [\varphi_{\omega}(y_1), \ldots, \varphi_{\omega}(y_N)]'$, $\omega$ denotes the parameters characterizing the transformation, and $\boldsymbol{\mu}$ is a mean vector. Logarithmic, Box–Cox, and other transformations are available for the $\varphi_{\omega}(\cdot)$ function. The log-normal kriging and trans-Gaussian kriging (see Cressie, 2003
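
To make the warping idea concrete, the sketch below evaluates the log-likelihood of independent samples under a warped Gaussian model with a Box–Cox warping function. By the change-of-variables formula, the log-density of y is the Gaussian log-density of the warped value plus the log-derivative of the warping function, which for Box–Cox is (λ − 1) log y. The function names, the simulated log-normal data, and the grid search over λ are illustrative assumptions, not the estimation procedure of the paper.

```python
import numpy as np


def box_cox(y, lam):
    """Box-Cox warping for positive y; reduces to log(y) when lam is 0."""
    y = np.asarray(y, dtype=float)
    if abs(lam) < 1e-12:
        return np.log(y)
    return (y**lam - 1.0) / lam


def warped_loglik(y, lam, mu, sigma):
    """Log-likelihood of independent y when box_cox(y, lam) ~ N(mu, sigma^2)."""
    y = np.asarray(y, dtype=float)
    z = box_cox(y, lam)
    gauss = -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * ((z - mu) / sigma) ** 2
    log_jac = (lam - 1.0) * np.log(y)  # log of d box_cox / dy = y**(lam - 1)
    return float(np.sum(gauss + log_jac))


def profile_loglik(y, lam):
    """Plug in the maximum likelihood mean and standard deviation for a given lam."""
    z = box_cox(y, lam)
    return warped_loglik(y, lam, z.mean(), z.std())


# Hypothetical skewed data: log-normal samples, for which the log warp is appropriate.
rng = np.random.default_rng(0)
y = rng.lognormal(mean=1.0, sigma=0.5, size=2000)

grid = np.linspace(-1.0, 1.0, 21)
lam_hat = grid[np.argmax([profile_loglik(y, lam) for lam in grid])]
print(lam_hat)  # for log-normal data the maximizer should lie near 0 (the log transform)
```

The Jacobian term is what allows likelihoods under different warping parameters to be compared on the original scale of y; omitting it would favor transformations that simply shrink the data.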

Model

This section combines CWGP and AMM and develops the CAMM for non-Gaussian data modeling without explicitly assuming any distribution.

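The CAMM is defined as

$$\varphi_{\omega}(\mathbf{y}) = \mathbf{X}\boldsymbol{\beta} + \sum_{k=1}^{K} \mathbf{x}_k \circ \mathbf{b}_k + \boldsymbol{\varepsilon}, \quad \boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I}), \tag{15}$$

with

$$\varphi_{\omega}(\mathbf{y}) = \varphi_{\omega_D}(\varphi_{\omega_{D-1}}(\cdots(\varphi_{\omega_2}(\varphi_{\omega_1}(\mathbf{y}))))), \tag{16}$$

$$\mathbf{b}_k = \mathbf{E}_k \boldsymbol{\gamma}_k, \quad \boldsymbol{\gamma}_k \sim N(\mathbf{0}_k, \sigma^2 \mathbf{V}_k(\boldsymbol{\theta}_k)),$$

where $\circ$ denotes the element-wise product. CAMM describes a wide variety of non-Gaussian explained variables through the transformation Eq. (16). While we consider Eq. (15) assuming the varying coefficients $\beta_k \mathbf{1} + \mathbf{b}_k$ on $\mathbf{x}_k$, CAMM can consider many other effects by specifying the random effects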
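
The composition of D warping functions used by CAMM can be sketched as a chain of simple monotone layers. Each layer below returns a forward map together with the log of its derivative, and the chain rule gives the log-Jacobian of the whole composition as the sum of the layer terms evaluated at each layer's input. The two layer types (affine and sinh-arcsinh), their parameter values, and all function names are illustrative assumptions; this section does not state which elementary transformations the paper uses.

```python
import numpy as np


def affine(a, b):
    """Layer z -> a + b * z, with b > 0 so the map is increasing."""
    return (lambda z: a + b * z,
            lambda z: np.full_like(z, np.log(b)))


def sinh_arcsinh(eps, delta):
    """Layer z -> sinh(delta * arcsinh(z) - eps), with delta > 0."""
    def forward(z):
        return np.sinh(delta * np.arcsinh(z) - eps)

    def log_deriv(z):
        # d/dz sinh(u) = cosh(u) * delta / sqrt(1 + z^2), with u = delta * arcsinh(z) - eps
        u = delta * np.arcsinh(z) - eps
        return np.log(delta) + np.log(np.cosh(u)) - 0.5 * np.log1p(z**2)

    return forward, log_deriv


def compose(layers, y):
    """Apply the layers in order; return the warped values and the log-Jacobian."""
    z = np.asarray(y, dtype=float)
    log_jac = np.zeros_like(z)
    for forward, log_deriv in layers:
        log_jac += log_deriv(z)  # derivative evaluated at this layer's input
        z = forward(z)
    return z, log_jac


# Hypothetical three-layer warp (D = 3) with arbitrary parameter values.
layers = [affine(0.5, 2.0), sinh_arcsinh(0.3, 1.5), affine(-1.0, 0.7)]
y = np.linspace(-2.0, 3.0, 7)
z, log_jac = compose(layers, y)
print(z)
print(log_jac)
```

Because every layer is increasing, the composition is invertible, so predictions made on the warped scale can be mapped back to the original scale of y.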

Setting

This section compares coefficient estimation accuracy and computational time through Monte Carlo experiments. Among coefficient specifications, we focus on spatially varying coefficients (SVCs) for the following reasons: (i) SVC modeling is popular in spatial fields (see Fotheringham et al., 2003); (ii) SVC estimates tend to be unstable (e.g., Wheeler and Tiefelsdorf, 2005; Cho et al., 2009; Murakami et al., 2017); (iii) SVC estimation can be very slow depending on

Outline

Crime modeling attracts attention because it can be used to maximize crime prevention with limited resources by forecasting offenses (Meijer and Wessels, 2019; Kajita and Kajita, 2020). For example, in Santa Cruz, Los Angeles, and other cities in the USA, crime modeling and prediction results are used for policing arrangements such as designing effective patrol routes (PredPol: https://www.predpol.com/). The number of crimes decreased in each city, as explained in //www.predpol.com/results/

Concluding remarks

This study proposed the compositionally-warped additive mixed modeling (CAMM), which is a general framework for fast non-Gaussian regression modeling. Unlike other non-Gaussian additive models, CAMM does not require any explicit assumptions on data distribution. CAMM will be useful for estimating spatial, temporal, and other effects while avoiding misspecification relating to data distribution. The Monte Carlo experiments verify the accuracy and computational efficiency of our model, and the

Acknowledgments

This work was supported by JSPS, Japan KAKENHI Grant Numbers 17H02046, 18H03628, 20K13261. These research results were also obtained from research commissioned by the National Institute of Information and Communications Technology (NICT), Japan.

References (56)

  • Zilber, D. et al. (2021). Vecchia-Laplace approximations of generalized Gaussian processes for big non-Gaussian spatial data. Comput. Statist. Data Anal.
  • Autant-Bernard, C. et al. (2011). Quantifying knowledge spillovers using spatial econometric models. J. Reg. Sci.
  • Bates, D.M. (2010). lme4: Mixed-effects modeling with R.
  • Bini, L.M. et al. (2009). Coefficient shifts in geographical ecology: an empirical evaluation of spatial and non-spatial regression. Ecography.
  • Brunsdon, C. et al. (1998). Geographically weighted regression. J. R. Stat. Soc. Ser. D Statist.
  • Cho, S. et al. (2009). Extreme coefficients in geographically weighted regression and their effects on mapping. GISci. Remote Sens.
  • Cressie, N. (2003). Statistics for Spatial Data.
  • Cressie, N. et al. (2011). Statistics for Spatio-Temporal Data.
  • Damianou, A., Lawrence, N., 2013. Deep gaussian processes. In: Proceedings of the International Conference on...
  • Diggle, P.J. et al. (1998). Model-based geostatistics. J. R. Stat. Soc. Ser. C. Appl. Stat.
  • Dowd, P.A. (1982). Lognormal kriging—the general case. J. Int. Assoc. Math. Geol.
  • Egozcue, J.J. et al. (2003). Isometric logratio transformations for compositional data analysis. Math. Geol.
  • Farrell, G. (1995). Preventing repeat victimization. Crime Justice.
  • Felson, M. (1994). Crime and Everyday Life.
  • Fonseca, T.C. et al. (2011). Non-Gaussian spatiotemporal modelling through scale mixing. Biometrika.
  • Fotheringham, A.S. et al. (2003). Geographically Weighted Regression: The Analysis of Spatially Varying Relationships.
  • Gotway, C.A. et al. (1997). A generalized linear model approach to spatial data analysis and prediction. J. Agric. Biol. Environ. Stat.
  • Griffith, D.A. (2003). Spatial Autocorrelation and Spatial Filtering: Gaining Understanding Through Theory and Scientific Visualization.