Bayesian inference for big spatial data using non-stationary spectral simulation
Introduction
There is increasing interest in using spatial statistical methods to model environmental processes. This is partially due to the emergence of remote sensing instruments and the popularity of Geographic Information Systems (GIS) software (e.g. see, Stein et al., 2006, Kalkhan, 2011 for standard references). The main goal of these analyses is to make predictions at observed and unobserved locations and provide uncertainty quantification. Early works make the assumption that the process is weakly stationary (e.g., see Cressie, 1993 for a review); that is, the covariance between the response at two different locations is a function of the spatial lag. However, non-stationary processes are much more common in environmental systems observed over large heterogeneous spatial domains (see Bradley et al., 2016 for a discussion). There are many models for non-stationary spatial data, and reduced ranks basis function expansions have become a popular choice (Banerjee et al., 2008, Cressie and Johannesson, 2008). However, there are inferential issues with reduced rank methods in the spatial setting (Stein, 2014), and consequently, there is renewed interest in proposing computationally efficient approximately full-rank models (Nychka et al., 2015, Datta et al., 2016a, Katzfuss, 2017, Bradley et al., 2020, Katzfuss et al., 2020). Thus, in this article our primary goal is to develop an efficient full rank non-stationary spatial statistical model.
There are numerous methods available to model non-stationary spatial data. For example, process convolution (Higdon, 1998, Paciorek and Schervish, 2006, Neto et al., 2014) convolves a known spatially referenced function with a spatial process typically assumed to be Gaussian. There are several related, but different approaches available. For example, using a finite integral representation of a process convolution results in a basis function expansion (Cressie and Wikle, 2011 page 157). Several parameterizations of basis function expansions are available, including: fixed rank kriging (Cressie and Johannesson, 2008), lattice kriging (Nychka et al., 2015), the predictive process (Banerjee et al., 2008), and a stochastic partial differential equation approach (Lindgren et al., 2011), among others.
An alternative to modeling non-stationarity with spatial basis functions is to assume a deformation (Sampson and Guttorp, 1992). Here, Euclidean space is “deformed”, or warped, so that far away locations can be more correlated, and vice versa. The parameter space for this method is considerably smaller than many parameterizations using spatial basis function expansions (e.g., see Cressie and Johannesson, 2008, Kang and Cressie, 2011 for examples), and is full rank. A similar but different approach to deformation is referred to as “dimension expansion” (Bornn et al., 2012). This method involves extending the dimension of the locations to a higher dimensional space. This methodology is based on the surprising result that every non-stationary covariance function in can be written as a stationary covariogram defined on locations in (Perrin and Meiring, 2003). Recently, Bornn et al. (2012) proposed a method of moments approach to analyzing spatio-temporal data using dimension expansion. To our knowledge, the dimension expansion approach has not been implemented using a Bayesian framework.
Thus, our first contribution is to introduce dimension expansion to the Bayesian setting to analyze big spatial data. To achieve a computationally efficient approach to dimension expansion in the Bayesian setting we offer three technical results. In our first technical result, we provide a “non-stationary version” of Bochner’s Theorem (Bochner, 1959). That is, we show that a non-stationary covariance function can be written as a convolution of the cosine function with a spectral density. The proof of this result simply involves combining Perrin and Meiring’s (2003) dimension expansion result with Bochner’s Theorem. This result opens up new opportunities to use spectral methods to model non-stationary spatial process. Other methods exist (e.g. see, Priestley, 1965, Martin, 1982) to model non-stationary data using spectral densities. However, these methods involve difficult to interpret types of “quasi-stationarity” assumptions (see, Sayeed and Jones, 1995 for a discussion), while our approach can be easily interpreted through dimension expansion. Castruccio and Guinness (2017) have also proposed an approach that uses evolutionary spectrum and incorporates an axial symmetric structure into their model.
The second technical result developed in this manuscript follows from our non-stationary version of Bochner’s Theorem. Specifically, we extend Mejía and Rodríguez-Iturbe’s (1974) method for spectral simulation of a stationary spatial processes to non-stationary spatial processes. This makes it straightforward to simulate in the high-dimensional non-stationary setting because spectral simulation does not require the inverse and storage of a high-dimensional covariance matrix (i.e., is matrix free). In practice, Gaussian spatial datasets correspond to a likelihood that is difficult to compute in high dimensions (i.e., when the dimension of the data is large) because this requires computation and dynamic memory. While non-stationary spectral simulation is “matrix free” the implementation of our model will require additional steps that include operations on low dimensional matrices. Consequently, we describe our method as “large-matrix-free”.
To further aid in computation we develop a new technique to re-weight the data model in a spatial hierarchical model. Our re-weighting method imposes conditional independence between the spatially co-varying random effect at observed locations and the data given the covariance parameters, and conditional independence between variance/covariance parameters and a large subset of the data. As a result, our re-weighting method allows us to use non-stationary spectral simulation to update the spatially co-varying random vector within a collapsed Gibbs sampler (Liu, 1994), and update other parameters based on a low-dimensional sub-sample. Furthermore, these conditional independence assumptions do not change the marginal distribution for the data, and consequently, the marginal statistical properties of the data are invariant to our added assumptions of conditional independence. This re-weighting technique is similar to what is done in a recent paper by Bradley (2021). Overall, our method is computationally feasible, full-rank, does not require storage of large matrices, and can be implemented on irregularly spaced locations. This last feature is particularly important as spectral methods based on the discrete Fourier transform often require regularly spaced locations (Fuentes, 2002, Fuentes et al., 2008).
The remaining sections of this article are organized as follows. Section 2 introduces our proposed statistical model, our first two theoretical results, and our re-weighting technique. In Section 3, we describe our implementation using a collapsed Gibbs sampler. In Section 4, we present a simulation study and compare our approach to the Nearest Neighbor Gaussian Process (NNGP; Datta et al., 2016b) and the general Vecchia approximation (Katzfuss and Guinness, 2019) when data are generated from an additive model with nonlinear fixed effects and stationary spatial random effects. Additionally, we present a simulation study where we compare to a spatial process convolution (SPC; Paciorek and Schervish, 2006) approach, and a stochastic partial differential equation (SPDE; Lindgren et al., 2011) approach in the purely non-stationary setting. In Section 5, we implement our model using the benchmark ozone dataset analyzed in Cressie and Johannesson (2008) and Zhang et al. (2019). Finally, Section 6 contains a discussion. For ease of exposition, all proofs are given in the appendices.
Section snippets
Methodology
Let be a spatial process defined for all , where is the spatial domain of interest in -dimensional Euclidean space, . We observe the value of at a finite set of locations , , . The data is decomposed additively with where , is the Gaussian process of principal interest, and the Gaussian process represents measurement error. The measurement error is assumed to be uncorrelated with mean-zero and variance .
The process is further
Computation: Collapsed Gibbs sampling
In this section, we outline the steps needed for collapsed Gibbs sampling. Gibbs sampling requires simulating from full-conditional distributions (Gelfand and Smith, 1990). In a collapsed Gibbs sampler, some of the events conditioned on in the full-conditional distribution are integrated out (Liu, 1994). In Algorithm 1, we present the steps needed for our proposed collapsed Gibbs sampler. The expressions for the full-conditional distributions listed in Algorithm 1 are derived in Appendix B.
Simulation studies
We simulate data in a variety of settings, and compare to several state-of-the-art methods in spatial statistics including the nearest neighbor Gaussian process (NNGP) model (Datta et al., 2016b), the general Vecchia approximation (Katzfuss and Guinness, 2019, Katzfuss et al., 2020), spatial process convolution (SPC; Paciorek and Schervish, 2006), and a stochastic partial differential equation (SPDE; Lindgren et al., 2011) approach. We refer to the general Vecchia approximation to a Gaussian
Ozone data application: Data description
As an illustration, we analyze the ozone dataset used in Cressie and Johannesson (2008), which has become a benchmark dataset in the spatial statistics literature (e.g., see Zhang et al., 2019). This dataset consists of values of total column ozone (TCO) in Dobson units (see Figure 6 for a plot of the data). The dataset was obtained through a Dobson spectrophotometer on board the Nimbus-7 polar orbiting satellite on October 1st, 1988. For details on how these data were collected see
Discussion
Bayesian analysis of big Gaussian spatial data is a challenging and important problem. We propose a Bayesian approach using non-stationary spectral simulation. To develop non-stationary spectral simulation we combine Bochner’s theorem with dimension expansion (Perrin and Meiring, 2003), and apply Mejía and Rodríguez-Iturbe’s (1974) spectral simulation method. The advantage is that no large matrix inversion or storage is needed to approximately simulate a non-stationary full-rank Gaussian
Acknowledgments
Jonathan Bradley and Hou-Cheng Yang’s research was partially supported by the U.S. National Science Foundation (NSF) grant SES-1853099. Jonathan Bradley’s research was also partially supported by the National Institutes of Health under grant 1R03AG070669-01. The authors would like to thank the reviewers for their helpful comments, and they are especially indebted to one reviewer’s coments, which led to the development of Section 3.2.
References (56)
- et al.
Improving the performance of predictive process modeling for large datasets
Comput. Statist. Data Anal.
(2009) Limitations on low rank approximations for covariance matrices of spatial data
Spatial Stat.
(2014)- et al.
Hierarchical Modeling and Analysis for Spatial Data
(2015) - et al.
Gaussian predictive process models for large spatial data sets
J. R. Stat. Soc. Ser. B Stat. Methodol.
(2008) Lectures on Fourier Integrals
(1959)- et al.
Modeling nonstationary processes through dimension expansion
J. Amer. Statist. Assoc.
(2012) An approach to incorporate subsampling into a generic Bayesian hierarchical model
J. Comput. Graph. Statist.
(2021)- et al.
Selection of rank and basis functions in the Spatial Random Effects model
- et al.
A comparison of spatial predictors when datasets could be very large
Stat. Surv.
(2016) - et al.
Hierarchical models for spatial data with errors that are correlated with the latent process
Statist. Sinica
(2020)
An evolutionary spectrum approach to incorporate large-scale geographical descriptors on global processes
J. R. Stat. Soc. Ser. C. Appl. Stat.
Statistics for Spatial Data
Fixed rank kriging for very large spatial data sets
J. R. Stat. Soc. Ser. B Stat. Methodol.
Statistics for Spatio-Temporal Data
Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets
J. Amer. Statist. Assoc.
On nearest-neighbor Gaussian process models for massive spatial data
Wiley Interdiscip. Rev. Comput. Stat.
Estimating covariances of locally stationary processes: Consistency of best basis methods
Package ‘spNNGP’
Spectral methods for nonstationary spatial processes
Biometrika
A class of nonseparable and nonstationary spatial temporal covariance functions
Environmetrics
Sampling-based approaches to calculating marginal densities
J. Am. Stat. Assoc.
Local Gaussian process approximation for large computer experiments
J. Comput. Graph. Statist.
GpGp: fast Gaussian process computation using Vecchia’s approximation
A process-convolution approach to modelling temperatures in the North Atlantic Ocean
Environ. Ecol. Stat.
Non-stationary spatial modeling
Bayesian Stat.
Spatial Statistics: Geospatial Information Modeling and Thematic Mapping
Bayesian inference for the Spatial Random Effects model
J. Amer. Statist. Assoc.
A multi-resolution approximation for massive spatial datasets
J. Amer. Statist. Assoc.
Cited by (4)
A review of industrial big data for decision making in intelligent manufacturing
2022, Engineering Science and Technology, an International JournalCitation Excerpt :proposed a novel method for the holistic, simulation driven ship design optimization under uncertainty in the big data era. [68] put forward a method of simulation for big data system based on Markov model and IoT system. [69] introduced Bayesian hierarchical modeling to dimension expansion, which originally has only been modeled using a method of moments approach and combined dimension expansion with a spectral method to model big non-stationary spatial fields in a computationally efficient manner. [70]
Spatial Clustering Regression of Count Value Data via Bayesian Mixture of Finite Mixtures
2023, Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining