Elsevier

Spatial Statistics

Volume 38, August 2020, 100459
Kernel mean embedding based hypothesis tests for comparing spatial point patterns

https://doi.org/10.1016/j.spasta.2020.100459

Abstract

This paper introduces an approach for detecting differences in the first-order structures of spatial point patterns. The proposed approach leverages the kernel mean embedding in a novel way by introducing its approximate version tailored to spatial point processes. While the original embedding is infinite-dimensional and implicit, our approximate embedding is finite-dimensional and comes with explicit closed-form formulas. With its help we reduce the pattern comparison problem to the comparison of means in the Euclidean space. Hypothesis testing is based on conducting t-tests on each dimension of the embedding and combining the resulting p-values using one of the recently introduced p-value combination techniques. If desired, corresponding Bayes factors can be computed and averaged over all tests to quantify the evidence against the null. The main advantages of the proposed approach are that it can be applied to both single and replicated pattern comparisons and that neither bootstrap nor permutation procedures are needed to obtain or calibrate the p-values. Our experiments show that the resulting tests are powerful and the p-values are well-calibrated; two applications to real world data are presented.

Introduction

Comparison of spatial point patterns is of practical importance in a number of scientific fields including ecology, epidemiology, and criminology. For example, such comparisons may reveal differential effects of the environment on plant species spread, uncover spatial variation in disease risk, or detect seasonal differences in crime locations (see e.g. Baddeley et al., 2015). While exploratory analyses are vital for obtaining deep insights about pattern differences, such analyses can be subjective unless supplemented with formal hypothesis tests.

In this paper we are interested in comparing the first-order structures of point patterns. Consider two point processes $P$ and $Q$ over the region $A \subset \mathbb{R}^2$ with first-order intensities $\lambda_P(\cdot)$ and $\lambda_Q(\cdot)$. Given realizations from these processes, we would like to detect whether there are statistically significant differences in the first-order intensities. However, testing for equality, $\lambda_P(\cdot) = \lambda_Q(\cdot)$, is not flexible enough. For example, when studying the spatial variation in disease risk, the diseased population is only a small fraction of the control population; naturally, the corresponding observed patterns will differ significantly in the overall counts of points, yet this is irrelevant to the substantive question. The more appropriate null hypothesis posits that there exists a constant $c$ such that $\lambda_P(\cdot) = c\,\lambda_Q(\cdot)$. Equality up to a constant factor means that the intensities have the same functional form of spatial variation. To avoid dealing with the nuisance parameter $c$, one can normalize the intensities to integrate to 1, giving rise to probability distributions $p(\cdot)$ and $q(\cdot)$ over the region $A$; in Cucala (2006) and Fuentes-Santos et al. (2017) these are called the densities of event locations. Our null hypothesis is then equivalent to the equality $p(\cdot) = q(\cdot)$, which is an instance of the two-sample hypothesis testing problem (see, e.g., Anderson et al., 1994).
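The normalization step can be written out explicitly. The following is a short worked derivation, assuming both intensities are integrable over $A$ with strictly positive total mass; the integration variable $u$ is introduced here only for illustration.

```latex
\[
  p(x) = \frac{\lambda_P(x)}{\int_A \lambda_P(u)\,du},
  \qquad
  q(x) = \frac{\lambda_Q(x)}{\int_A \lambda_Q(u)\,du},
  \qquad x \in A .
\]
% If \lambda_P = c\,\lambda_Q for some c > 0, the constant cancels:
\[
  p(x) = \frac{c\,\lambda_Q(x)}{c \int_A \lambda_Q(u)\,du} = q(x).
\]
% Conversely, if p = q, then \lambda_P = c\,\lambda_Q with
\[
  c = \frac{\int_A \lambda_P(u)\,du}{\int_A \lambda_Q(u)\,du}.
\]
```

Hence the proportionality hypothesis on the intensities and the equality hypothesis on the normalized densities describe the same null.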

In practice it is desirable to have nonparametric hypothesis testing approaches to pattern comparison that: (a) capture a particular aspect of difference; (b) can be applied to both single and replicated patterns; and (c) do not depend on resampling methods for (re-)calibration. Early nonparametric tests for pattern comparison (Diggle et al., 1991; Hahn, 2012) probe for differences in the K-functions (Ripley, 1976) of point patterns. Because these tests are based on a second-order property, the differences they detect conflate the spatial variation in intensities with the interaction properties. Concentrating on the first-order properties, Kelsall and Diggle (1995a, 1995b) and Davies and Hazelton (2010) estimate the logarithm of the ratio between the intensities using kernel density estimation. Other approaches rely on counts of events (Andresen, 2009; Alba-Fernández et al., 2016) or normalized counts of events (Zhang and Zhuang, 2017) within pre-specified areas. The recent work of Fuentes-Santos et al. (2017) detects differences in the first-order structure by looking at the $L_2$-distance between kernel density estimates of the probability distributions $p(\cdot)$ and $q(\cdot)$. All of these truly first-order comparison approaches are limited to single patterns, and with the exception of Zhang and Zhuang (2017) they are calibrated with resampling methods. The reliance on resampling can result in prohibitive computational costs in industrial settings, where thousands of pattern comparisons may be needed and high-precision p-values are required to account for multiple testing corrections.

In this paper, we introduce an approach that leverages the kernel mean embedding (KME) (Berlinet and Thomas-Agnan, 2004; Smola et al., 2007; Gretton et al., 2012; Muandet et al., 2017) to test for the equality $p(\cdot) = q(\cdot)$, which allows us to detect differences in the first-order structure of point patterns. Our approach is based on introducing an approximate version of the kernel mean embedding, aKME. While the original KME is infinite-dimensional and implicit, our approximate kernel mean embedding is finite-dimensional and comes with explicit closed-form formulas. With the help of aKME, we reduce the pattern comparison problem to the comparison of means in the Euclidean space.

The resulting pattern comparison test is surprisingly simple, and a complete implementation is provided in Appendix B. The computation of aKME is illustrated in Fig. 1.1. First, the points in the pattern are projected onto a line, and sin and cos functions with a specific frequency are applied to the projections; this step can be seen as wrapping the line onto a circle of some radius. The resulting sin and cos values are averaged separately to give two numbers that provide a “fingerprint” of the point pattern's behavior with respect to the direction of the line and the scale that corresponds to the frequency (i.e. the circle circumference). The process is repeated with a multitude of lines and frequencies; assuming $m$ lines and $\ell$ frequencies per line, we obtain $m\ell$ such fingerprints, which are concatenated to give an overall $D = 2m\ell$ dimensional aKME. Finally, to compare patterns, we compare their aKMEs by applying t-tests on each coordinate of the embedding. We combine the resulting $D$ separate p-values into a single overall p-value using one of the recently introduced p-value combination techniques, such as the harmonic mean (Good, 1958; Wilson, 2019) or the Cauchy combination test (Liu and Xie, 2020), leading to well-calibrated and powerful tests, as confirmed by the simulations.
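The steps above can be illustrated with a minimal Python sketch for the single-pattern case. It is not the implementation from Appendix B: the function names, the choice of directions and frequencies, and the use of Welch's t-test with each point's sin/cos values treated as one observation are assumptions made for illustration. The paper selects its lines and frequencies through the connection to the KME, which is not reproduced here.

```python
import numpy as np
from scipy import stats


def fingerprint_features(points, directions, frequencies):
    """Per-point sin/cos features; their column means give the embedding.

    points: (n, 2) array of event locations.
    directions: (m,) line angles in radians (placeholder choice).
    frequencies: (l,) strictly positive frequencies (placeholder choice).
    Returns an (n, 2*m*l) array.
    """
    units = np.column_stack([np.cos(directions), np.sin(directions)])  # (m, 2)
    proj = points @ units.T                        # project points onto each line
    phase = proj[:, :, None] * np.asarray(frequencies)[None, None, :]  # (n, m, l)
    feats = np.concatenate([np.cos(phase), np.sin(phase)], axis=2)     # (n, m, 2l)
    return feats.reshape(len(points), -1)


def cauchy_combination(pvalues):
    """Equal-weight Cauchy combination of p-values (Liu and Xie, 2020)."""
    p = np.clip(np.asarray(pvalues, dtype=float), 1e-15, 1.0 - 1e-15)
    statistic = np.mean(np.tan((0.5 - p) * np.pi))
    return 0.5 - np.arctan(statistic) / np.pi


def compare_patterns(x, y, directions, frequencies):
    """Return the combined p-value and the per-coordinate p-values."""
    fx = fingerprint_features(x, directions, frequencies)
    fy = fingerprint_features(y, directions, frequencies)
    # One Welch t-test per embedding coordinate (assumption of this sketch).
    _, pvals = stats.ttest_ind(fx, fy, axis=0, equal_var=False)
    return cauchy_combination(pvals), pvals


# Usage: two patterns on the unit square with different spatial variation.
rng = np.random.default_rng(0)
directions = np.linspace(0.0, np.pi, 4, endpoint=False)  # m = 4 lines
frequencies = 2.0 * np.pi * np.array([1.0, 2.0])         # l = 2 frequencies
pattern_a = rng.uniform(size=(400, 2))
pattern_b = rng.beta(2.0, 5.0, size=(400, 2))
overall_p, coordinate_p = compare_patterns(pattern_a, pattern_b,
                                           directions, frequencies)
print(overall_p, coordinate_p.shape)
```

Because the two patterns may contain very different numbers of points, the comparison of means is unaffected by the overall counts, which matches the proportionality null hypothesis.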

The connection to the original KME guides the choice of the parameters for this construction and provides approximation guarantees that are crucial to the consistency of the hypothesis testing. The main advantages of the proposed approach are that it can be applied to both single and replicated pattern comparisons, and that neither bootstrap nor permutation procedures are needed to obtain or calibrate the p-values. In addition, because the approach is based on t-tests, one can compute Bayes factors for each of the tests involved, which quantifies the evidence supporting the hypothesis of difference for each directionality/scale represented in aKME; one can also report the averaged Bayes factor as an overall summary of this evidence.

The ideas developed in this paper are in line with the recent surge of interest in applying reproducing kernel Hilbert space techniques to the comparison of probability distributions. For example, the Maximum Mean Discrepancy (MMD) is a measure of divergence between distributions (Gretton et al., 2012) which has already found numerous applications in statistics and machine learning. Similarly, the kernel mean embedding (Berlinet and Thomas-Agnan, 2004; Smola et al., 2007) has been receiving increased attention; see, for example, the recent review by Muandet et al. (2017) and citations therein. Some of these notions can be traced back to, and seen as closely related to, N-distances (Zinger et al., 1992) and energy distances (Baringhaus and Franz, 2004; Székely and Rizzo, 2005). Our approximate embedding has its roots in Random Fourier Features (Rahimi and Recht, 2007), their improvements (Avron et al., 2016; Yu et al., 2016; Munkhoeva et al., 2018), and their application to the MMD (Zhao and Meng, 2015); the scheme we propose in this paper is tailored to the two-dimensional setting and has the ability to provide higher-order approximations. There has already been some interest in applying the reproducing kernel methodology to spatial point processes, with roots going back to the 1980s (Bartoszynski et al., 1981; Silverman, 1982) and more recent work by Flaxman et al. (2017), Jitkrittum et al. (2017), and Yang et al. (2019). We discuss some of the connections between reproducing kernel machinery and kernel density estimation based methods commonly used with spatial point patterns in Section 2.

The main contributions of this paper are the proposed approximate kernel mean embedding (Section 3) and the hypothesis testing framework for comparison of point patterns (Section 5). After investigating the empirical properties of the resulting tests on simulated data (Section 6.1), we present applications of the methodology to two real world datasets (Section 6.2).

Section snippets

Kernel mean embedding

Mathematically rigorous development of the kernel mean embedding requires the machinery of the reproducing kernel Hilbert spaces, and the interested reader is referred to Muandet et al. (2017). For our purposes, it will be sufficient to have an intuitive understanding of the kernel mean embedding as expressed in terms of the feature maps.

Given a data instance $x \in \mathcal{X}$ (in our context $x$ will be a point in some region of $\mathbb{R}^2$), a nonlinear transformation $\phi: \mathcal{X} \to \mathcal{F}$ can be used to lift this point into a

Approximate kernel mean embedding

Instead of relying on the kernel trick, in this section we take an orthogonal path to avoiding the infinite-dimensionality of the kernel mean embedding. Namely, specializing to the case $\mathcal{X} = \mathbb{R}^2$, we construct a finite-dimensional approximate feature map $\phi: \mathbb{R}^2 \to \mathbb{R}^D$ such that $k(x, y) \approx \phi(x)^\top \phi(y)$. As a result, testing $p = q$ can be reduced to testing $\mu_p = \mu_q$ in the $D$-dimensional Euclidean space.
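To make the approximation $k(x, y) \approx \phi(x)^\top \phi(y)$ concrete, the sketch below uses Random Fourier Features (Rahimi and Recht, 2007) with a Gaussian kernel. This is a stand-in technique chosen for illustration, not the construction of this paper, which is described as a tailored scheme with closed-form formulas; the names `rff_map` and `gaussian_kernel`, the width `sigma`, and the number of random frequencies are all assumptions.

```python
import numpy as np


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel exp(-||x - y||^2 / (2 sigma^2)) for two points."""
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * sigma ** 2))


def rff_map(points, omegas):
    """Map (n, 2) points to (n, D) features, D = 2 * number of frequencies."""
    phase = points @ omegas.T
    scale = 1.0 / np.sqrt(omegas.shape[0])
    return scale * np.hstack([np.cos(phase), np.sin(phase)])


rng = np.random.default_rng(0)
sigma = 0.25
# Frequencies drawn from the spectral distribution of the Gaussian kernel.
omegas = rng.normal(scale=1.0 / sigma, size=(5000, 2))

x = np.array([[0.30, 0.40]])
y = np.array([[0.45, 0.55]])
approx = (rff_map(x, omegas) @ rff_map(y, omegas).T)[0, 0]
exact = gaussian_kernel(x[0], y[0], sigma)
print(approx, exact)

# The empirical mean embedding of a sample is the average of its features.
sample = rng.uniform(size=(200, 2))
mean_embedding = rff_map(sample, omegas).mean(axis=0)
```

With an explicit feature map of this kind, comparing two distributions amounts to comparing two mean vectors in $\mathbb{R}^D$, which is what makes coordinate-wise t-tests applicable.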

Since it allows obtaining closed form formulas, for the rest of the paper we will concentrate on the Gaussian kernel k(x

Spatial point pattern aKME

The goal of this section is to introduce the aKME for a point process and show that it can be estimated in an unbiased manner from the realizations of the point process. As a preliminary, we will go over the notions of the first-order intensity and the density of event locations for spatial point processes. While for an inhomogeneous Poisson process these two are equivalent up to a normalization, in general there are differences that should be taken into consideration when conducting replicated

Comparing spatial point patterns with aKME

Consider two point processes $P$ and $Q$ in $A \subset \mathbb{R}^2$ with first-order intensity functions $\lambda_P(\cdot)$ and $\lambda_Q(\cdot)$. We would like to test the null hypothesis that there exists a constant $c$ such that $\lambda_P(\cdot) = c\,\lambda_Q(\cdot)$. Equality up to a constant factor means that the intensities of the two processes have the same functional form. This is different from testing $\lambda_P(\cdot) = \lambda_Q(\cdot)$ because our null hypothesis can hold true even if the realizations from $P$ and $Q$ have vastly differing numbers of events. We start

Experiments

Our goal in this section is to investigate the size and the power of the proposed tests. We also demonstrate two applications to real world data. The aKME embedding is constructed using four radial projections and four roots for the polar Gauss–Hermite formula (i.e. $m = 4$, $\ell = 4$ in the notation of Section 3) resulting in $D = 32$. To avoid the selection of the kernel width parameter, we concatenate together aKMEs corresponding to $\sigma = 1/16$, $1/8$, and $1/4$ when the point pattern domain is the unit square; the

Conclusion

We have introduced an approach to detect differences in the first-order structure of spatial point patterns. The proposed approach leverages the kernel mean embedding in a novel way, by introducing its approximate version. Hypothesis testing is based on conducting t-tests on each dimension of the approximate embedding and combining them using either the harmonic mean or Cauchy approach. Our experiments confirm that the resulting tests are powerful and the p-values are well-calibrated. Two

Acknowledgments

We are grateful to Alfred Stein for his editorial efforts and to the reviewers for their constructive comments which have led to a much improved version of this article. We thank Tonglin Zhang and Isabel Fuentes-Santos for providing the source code of the methods from their respective papers.

References (61)

  • Zhang, T., et al. (2017). Testing proportionality between the first-order intensity functions of spatial point processes. J. Multivariate Anal.
  • Acosta, J., et al. (2018). On the effective geographic sample size. J. Stat. Comput. Simul.
  • Arias-Castro, E., et al. (2017). Distribution-free multiple testing. Electron. J. Stat.
  • Avron, H., et al. (2016). Quasi-Monte Carlo feature maps for shift-invariant kernels. J. Mach. Learn. Res.
  • Baddeley, A., et al. (2015). Spatial Point Patterns: Methodology and Applications with R.
  • Barber, R.F., et al. (2015). Controlling the false discovery rate via knockoffs. Ann. Statist.
  • Bartoszynski, R., et al. (1981). Some nonparametric techniques for estimating the intensity function of a cancer related nonstationary Poisson process. Ann. Statist.
  • Benjamini, Y., et al. (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol.
  • Berlinet, A., et al. (2004). Reproducing Kernel Hilbert Spaces in Probability and Statistics.
  • Billingsley, P. (1995). Probability and Measure.
  • Chen, S.X., et al. (2010). A two-sample test for high-dimensional data with applications to gene-set testing. Ann. Statist.
  • Cover, T.M. (1965). Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition. IEEE Trans. Electron. Comput.
  • Cucala, L. (2006). Two-Dimensional Spacings and Noisy Observations in the Analysis of Spatial Point Patterns.
  • Davies, T.M., et al. (2010). Adaptive kernel estimation of spatial relative risk. Stat. Med.
  • Diggle, P.J. (1990). A point process modelling approach to raised incidence of a rare phenomenon in the vicinity of a prespecified point. J. R. Stat. Soc. Ser. A Stat. Soc.
  • Diggle, P.J., et al. (1991). Analysis of variance for replicated spatial point patterns in clinical neuroanatomy. J. Amer. Statist. Assoc.
  • Duong, T., et al. (2012). Closed-form density-based framework for automatic detection of cellular morphology changes. Proc. Natl. Acad. Sci.
  • Eric, M., Bach, F.R., Harchaoui, Z. (2008). Testing for homogeneity with kernel Fisher discriminant analysis. In:...
  • Flaxman, S., et al. (2017). Poisson intensity estimation with reproducing kernels. Electron. J. Stat.
  • Good, I.J. (1958). Significance tests in parallel and in series. J. Amer. Statist. Assoc.