Skip to main content
Log in

A Multiscale Spatially Varying Coefficient Model for Regional Analysis of Topsoil Geochemistry

  • Published:
Journal of Agricultural, Biological and Environmental Statistics Aims and scope Submit manuscript

Abstract

A motivating example for this paper is to study a topsoil geochemical process across a large region. In regional environmental health studies, ambient levels of toxic substances in topsoil are commonly used as surrogates for personal exposure to toxic substances. However, toxicity levels in topsoil are usually sparsely measured at a limited number of point locations. Consequently, topsoil measurements only provide highly localized regional information and cannot be representative of the surrounding area. Instead, it is standard practice to use point-referenced measurements of stream sediments, because they are widely available across a region and are correlated with topsoil measurements at nearby locations. For more effective regional modeling of topsoil geochemistry, we develop a spatially varying coefficient model that integrates point-level topsoil and point-referenced area-level stream sediment data. The proposed model incorporates two spatial characteristics: the local spatial autocorrelation in the latent topsoil process and the spatially varying relationship between the latent topsoil and stream sediment processes. The former is modeled indirectly via a conditional autoregressive model for the stream sediment process, and the latter is modeled by spatially varying coefficients that follow a multivariate Gaussian process. We apply the proposed model to a real dataset of arsenic concentration and demonstrate better performance than competing models.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6

Similar content being viewed by others

References

  • Aelion, C. M., Davis, H. T., McDermott, S., and Lawson, A. B. (2008), “Metal concentrations in rural topsoil in South Carolina: potential for human health impact,” Science of the Total Environment, 402, 149–156.

    Article  Google Scholar 

  • Calder, C., Craigmile, P., and Zhang, J. (2009), “Regional spatial modeling of topsoil geochemistry,” Biometrics, 65, 206–215.

    Article  MathSciNet  Google Scholar 

  • Calder, C. A., Holloman, C. H., Bortnick, S. M., Strauss, W., and Morara, M. (2008), “Relating ambient particulate matter concentration levels to mortality using an exposure simulator,” Journal of the American Statistical Association, 103, 137–148.

    Article  MathSciNet  Google Scholar 

  • Cannon, W. F., Woodruff, L. G., and Pimley, S. (2004), “Some statistical relationships between stream sediment and soil geochemistry in northwestern Wisconsin–can stream sediment compositions be used to predict compositions of soils in glaciated terranes?” Journal of Geochemical Exploration, 81, 29–46.

    Article  Google Scholar 

  • Cressie, N., Buxton, B. E., Calder, C. A., Craigmile, P. F., Dong, C., McMillan, N. J., Morara, M., Santner, T. J., Wang, K., Young, G., et al. (2007), “From sources to biomarkers: a hierarchical Bayesian approach for human exposure modeling,” Journal of Statistical Planning and Inference, 137, 3361–3379.

    Article  MathSciNet  Google Scholar 

  • Gelfand, A. E., Kim, H.-J., Sirmans, C., and Banerjee, S. (2003), “Spatial modeling with spatially varying coefficient processes,” Journal of the American Statistical Association, 98, 387–396.

    Article  MathSciNet  Google Scholar 

  • Gelman, A., et al. (2006), “Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper),” Bayesian Analysis, 1, 515–534.

    Article  MathSciNet  Google Scholar 

  • Geweke, J. (1992), “Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments,” In Proceedings of the Fourth Valencia International Conference on Bayesian Staitiscs, New York: Oxford University, 169–193.

  • Gneiting, T. and Raftery, A. E. (2007), “Strictly proper scoring rules, prediction, and estimation,” Journal of the American Statistical Association, 102, 359–378.

    Article  MathSciNet  Google Scholar 

  • Grunsky, E. C., Drew, L. J., and Sutphin, D. M. (2009), “Process recognition in multi-element soil and stream-sediment geochemical data,” Applied Geochemistry, 24, 1602–1616.

    Article  Google Scholar 

  • Kim, H. and Lee, J. (2017), “Hierarchical spatially varying coefficient process model,” Technometrics, 59, 521–527.

    Article  MathSciNet  Google Scholar 

  • Krueger, F., Lerch, S., Thorarinsdottir, T. L., and Gneiting, T. (2016), “Probabilistic forecasting and comparative model assessment based on Markov chain Monte Carlo output,” arXiv preprint arXiv:1608.06802.

  • Martín, J. A. R., Arias, M. L., and Corbí, J. M. G. (2006), “Heavy metals contents in agricultural topsoils in the Ebro basin (Spain). Application of the multivariate geoestatistical methods to study spatial variations,” Environmental Pollution, 144, 1001–1012.

    Article  Google Scholar 

  • Mielke, H., Gonzales, C., Smith, M., and Mielke, P. (1999), “The urban environment and children’s health: soils as an integrator of lead, zinc, and cadmium in New Orleans, Louisiana, USA,” Environmental Research, 81, 117–129.

    Article  Google Scholar 

  • Minasny, B. and McBratney, A. B. (2007), “Spatial prediction of soil properties using EBLUP with the Matérn covariance function,” Geoderma, 140, 324–336.

    Article  Google Scholar 

  • Robinson, T. and Metternicht, G. (2006), “Testing the performance of spatial interpolation techniques for mapping soil properties,” Computers and electronics in agriculture, 50, 97–108.

    Article  Google Scholar 

  • Seaber, P. R., Kapinos, F. P., and Knapp, G. L. (1987), “Hydrologic unit maps,” Tech. rep., Denver, Colorado, U.S. Geological Survey.

  • Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and Van Der Linde, A. (2002), “Bayesian measures of model complexity and fit,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64, 583–639.

    Article  MathSciNet  Google Scholar 

  • Wei, J.-B., Xiao, D.-N., Zeng, H., and Fu, Y.-K. (2008), “Spatial variability of soil properties in relation to land use and topography in a typical small watershed of the black soil region, northeastern China,” Environmental geology, 53, 1663–1672.

    Article  Google Scholar 

  • WHO. (1981), Arsenic, Geneva: World Health Organization. Environmental Health Criteria 18.

Download references

Acknowledgements

The authors thank the Editor, Associate Editor, and Referee for reviewing the manuscript and providing valuable comments. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2018R1C1B6004511).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Heeyoung Kim.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 51 KB)

Appendix: Derivation of the Full Conditional Distributions

Appendix: Derivation of the Full Conditional Distributions

We define the parameter set \(\Theta =\{\tilde{\varvec{\beta }},\varvec{\mu _{\beta }},T, \phi _0, \tau ^{T}, \omega ^{T}, \mu ^{S}, \tau ^{S}, \gamma , \omega ^{S} \}\). Let \(\Theta _{-\theta }\) denote the parameter set except for the parameter \(\theta \). Similarly, let \(Y^{S}(W_{-j})\) denote the column vector of the latent stream sediment process in the watersheds \(W_1, \ldots , N^W\), expect for the jth watershed, \(W_j\). The full posterior distribution is given by

$$\begin{aligned} p(Y^{T},Y^{S},\Theta \mid Z^{T},Z^{S}) \propto p(Z^{S}\mid Y^{S})p(Z^{T}\mid Y^{T})p(Y^{T}\mid Y^{S})p(Y^{S})p(\Theta ). \end{aligned}$$

With the prior specifications presented in Sect. 3, the full conditional distributions for the parameters in the proposed model are derived. The full conditional distribution for \(Y^{S}(W_{j})\) is

$$\begin{aligned} \begin{aligned}&p(Y^{S}(W_{j})\mid Y^{S}(W_{-j}), Y^{T}, \Theta , Z^{T}, Z^{S})\\&\propto \prod _{k=1}^{N_{j}^{S}}\frac{1}{\sqrt{2\pi \omega ^{S}}} \exp \left( {-\frac{(Z_{k}^{S}(W_{j})-Y^{S}(W_{j}) )^{2}}{2w^{S}} }\right) \\&\quad \prod _{s_{i} \in W_{j} }\frac{1}{\sqrt{2\pi \omega ^{S}}}\exp \left( {-\frac{(Y^{T}(s_{i}) - \beta _{0}(s_{i}) - \beta _{1}(s_{i})Y^{S}(W_{j}))^{2}}{2\tau ^T}}\right) \\&\qquad p(Y^{S}(W_{j}) \mid Y^{S}(W_{-j}), \mu ^{S}, \tau ^{S}, \gamma )\\&\quad \propto \exp \left( -\frac{(Y^{S}(W_{j})-\mu _{1})^{2}}{2\sigma _{1} } \right) \exp \left( -\frac{(Y^{S}(W_{j})-\mu _{2})^{2}}{2\sigma _{2} } \right) \exp \left( -\frac{(Y^{S}(W_{j})-\mu _{3})^{2}}{2\sigma _{3} } \right) , \end{aligned} \end{aligned}$$
(5)

where \(\mu _{1} = \sum _{k=1}^{N_{j}^{s}} Z_{k}^{S}(W_{j}) /N_{j}^{S}\), \( \sigma _{1} = \omega ^{S}/N_{j}^{S}\), \(\mu _{2} = \sum _{s_{i} \in W_{j}} ((Y^{T}(s_i)-\beta _{0}(s_{i}))\beta _{1}(s_{i})) /\sum _{s_{i} \in W_{j}} (\beta _{1}(s_{i}))^{2}\), \( \sigma _{2} = \tau ^T / \sum _{s_{i} \in W_{j}}(\beta _{1}(s_{i}))^{2} \), \(\mu _{3} = \mu _{j} + \Sigma _{j(-j)} \Sigma _{(-j)(-j)}^{-1} (Y^{S}(-W_{j}) - \mu _{(-j)} )\), and \(\sigma _{3} = \Sigma _{jj} -\Sigma _{(-j)(-j)}\Sigma _{(-j)(-j)}^{-1}\Sigma _{j(-j)}\), where \(\mu _{j}\) denotes the jth element of \(\mu ^{S} \varvec{1}\) and \(\mu _{(-j)}\) denotes the vector of all elements in \(\mu ^{S} \varvec{1}\) except for the jth element. Similarly, \(\Sigma _{jj}\) denotes the (jj)th element of \(\tau ^{S} (\varvec{I}-\gamma ^{S} \varvec{A})^{-1}\) and \(\Sigma _{(-j)(-j)}\) denotes all elements in \(\tau ^{S} (\varvec{I}-\gamma ^{S} \varvec{A})^{-1}\) except for the element in the jth row and jth column. Equation (5) is further simplified as

$$\begin{aligned} p(Y^{S}(W_j)\mid Y^{S}(W_{-j}), Y^{T},Z^{T},Z^{S}, \Theta ) \propto N\left( \frac{\mu _{3}/\sigma _{3} + \mu _{4}/\sigma _{4}}{1/\sigma _{3} + 1/\sigma _{4}}, (1/\sigma _{3} + 1/\sigma _{4})^{-1} \right) , \end{aligned}$$

where \(\mu _{4} = \frac{\mu _{1}\sigma _{2}+\mu _{2}\sigma _{1}}{\sigma _{1}+\sigma _{2}}\) and \(\sigma _{4} = \frac{\sigma _{1}\sigma _{2}}{\sigma _{1} + \sigma _{2}}\).

The full conditional distribution for \(Y^{T}(s_{i})\) is

$$\begin{aligned} \begin{aligned}&p(Y^{T}(s_{i}) \mid Y^{S}, Z^{T},Z^{S}, \Theta )\\&\quad \propto \exp \left( {- \frac{(Z^{T}(s_{i})-Y^{T}(s_{i}))^{2}}{2\omega ^{T}} }\right) \exp \left( {-\frac{(Y^{T}(s_{i})-\beta _{0}(s_i)-\beta _{1}(s_{i})Y^{S}( W_{w(s_i)}) )^{2}}{2\tau ^{T}}}\right) \\&\quad \propto N \left( \left( \frac{Z^{T}(s_{i})}{\omega ^{T}} + \frac{\beta _{0}+\beta _{1}(s_{i})Y^{S}(W_{w(s_i)})}{\tau ^{T}}\right) \Bigg /\left( \frac{1}{\omega ^{T}}+\frac{1}{\tau ^{T}} \right) , \left( \frac{1}{\omega ^{T}} +\frac{1}{\tau ^{T}}\right) ^{-1}\right) . \end{aligned} \end{aligned}$$

The full conditional distribution for \(\omega ^{T}\), with the prior \(p(\omega ^{T})\sim IG(a, b)\), is

$$\begin{aligned} \begin{aligned}&p(\omega ^{T} \mid Y^{T}, Y^{S}, Z^{T},Z^{S}, \Theta _{-\omega ^{T}}) \\&\quad \propto \prod _{i=1}^{N^{T}} \exp \left( -{ \frac{(Z^{T}(s_{i})-Y^{T}(s_{i}))^{2}}{2\omega ^{T}} }\right) p(\omega ^{T})\\&\quad \propto IG\left( a+\frac{N^{T}}{2}, b+\frac{\sum _{i=1}^{N^{T}}(Y^{T}(s_{i})-Z^{T}(s_{i}))^{2} }{2}\right) . \end{aligned} \end{aligned}$$

The full conditional distribution for \(\omega ^{S}\), with the prior \(p(\omega ^{S}) \sim IG(a, b)\), is

$$\begin{aligned} \begin{aligned}&p(\omega ^{S} \mid Y^{T}, Y^{S}, Z^{T},Z^{S}, \Theta _{\omega ^{S}}) \\&\quad \propto \prod _{j=1}^{N_{W}} \prod _{k=1}^{N_{j}^{S}}\exp \left( -\frac{(Z_{k}^{S}(W_{j})- Y^{S}(W_{j}))^{2} }{2\omega ^{S}}\right) p(\omega ^{S})\\&\quad \propto IG\left( a+\frac{\sum _{j=1}^{N_{w}}N_{j}^{S}}{2}, b+\frac{\sum _{j=1}^{N_{W}} \sum _{k=1}^{N_{j}^{S}}(Z_{k}^{S}(W_{j})- Y^{S}(W_{j}))^{2} }{2}\right) . \end{aligned} \end{aligned}$$

The full conditional distribution for \(\mu ^{S}\), with the prior \(p(\mu ^{S}) \sim N(a, b)\), is

$$\begin{aligned} \begin{aligned}&p(\mu ^{S} \mid Y^{T}, Y^{S}, Z^{T},Z^{S}, \Theta _{-\mu ^{S}}) \\&\quad \propto \exp {\left( -\frac{1}{2}(Y^{S}-\mu ^{S}\varvec{1})'(\tau ^{S}(\varvec{I}-\gamma \varvec{A})^{-1})^{-1}(Y^{S}-\mu ^{S}\varvec{1})\right) } p(\mu ^{S})\\&\quad \propto \exp \left( -\frac{(\mu ^{S}-\mu _{5})^{2}}{2\sigma _{5} } \right) , \end{aligned} \end{aligned}$$

where \(\mu _{5} = \left( \varvec{1}_{N^{W}}'(\tau ^{S}(\varvec{I}-\gamma \varvec{A})^{-1})^{-1}Y^{S} + a/b\right) / (\varvec{1}_{N^{W}}'(\tau ^{S}(\varvec{I}-\gamma \varvec{A})^{-1})^{-1}\varvec{1}_{N^{W}} +1/b)\) and \(\sigma _{5} = (\varvec{1}_{N^{W}}'(\tau ^{S}(\varvec{I}-\gamma \varvec{A})^{-1})^{-1}\varvec{1}_{N^{W}} +1/b)^{-1}\).

The full conditional distribution for \(\varvec{\mu }^{T}\), with the prior \(p(\varvec{\mu }^{T}) \sim N(\varvec{\mu }_\beta , \Sigma _\beta )\), is

$$\begin{aligned} \begin{aligned}&p(\varvec{\mu }^{T} \mid Y^{T}, Y^{S}, Z^{T},Z^{S}, \Theta _{-\varvec{\mu }^{T}}) \\&\quad \propto N\left( \left[ \Sigma _\beta ^{-1}+ \frac{Y^{S'}Y^{S}}{\tau ^{T}}\right] ^{-1} \left[ \Sigma _\beta ^{-1} \varvec{\mu }_\beta + \frac{Y^{S'}(Z^{T}-{\tilde{Y}}^{S'}\tilde{\varvec{\beta }}) }{ \tau ^{T} } \right] , \left[ \Sigma _\beta ^{-1}+ \frac{Y^{S'}Y^{S}}{\tau ^{T}}\right] ^{-1} \right) . \end{aligned} \end{aligned}$$

The full conditional distribution for \(\tilde{\varvec{\beta }}\), with the prior \(p(\tilde{\varvec{\beta }}) \sim N(\varvec{\mu }_{\beta _{s}}, \Sigma _{\beta _{s}})\), is

$$\begin{aligned} \begin{aligned}&p(\tilde{\varvec{\beta }} \mid Y^{T}, Y^{S}, Z^{T},Z^{S}, \Theta _{-\tilde{\varvec{\beta }}}) \\&\quad \propto N\left( \left[ \Sigma _{\beta _{s}}^{-1}+ \frac{{\tilde{Y}}^{S'}{\tilde{Y}}^{S}}{\tau ^{T}}\right] ^{-1} \left[ \Sigma _{\beta _{s}}^{-1} \varvec{\mu }_{\beta _{s}} + \frac{{\tilde{Y}}^{S'}(Z^{T}-Y^{S'}\varvec{\mu }^{T}) }{ \tau ^{T} } \right] , \left[ \Sigma _{\beta _{s}}^{-1}+ \frac{{\tilde{Y}}^{S'}{\tilde{Y}}^{S}}{\tau ^{T}}\right] ^{-1} \right) . \end{aligned} \end{aligned}$$

To sample \(\tau ^{S}\) from its full conditional distribution with the prior \(p(\tau ^{S}) \sim \) half-Cauchy(ab), we rewrite Eq. (3) as follows:

$$\begin{aligned} Y^S = \mu ^S {\varvec{1}}_{N^W\times 1} + {\mathbf {z}}, \quad {\mathbf {z}} \sim N \left( {\varvec{0}}, \tau ^S(\varvec{I}-\gamma \varvec{A})^{-1} \right) , \end{aligned}$$

where \({\mathbf {z}}=(z_1,\ldots , z_{N^W})^T\) is a vector of spatial random effects with the element \(z_j\) for watershed \(W_j\), and \({\varvec{0}}\) is a vector of zeros. We reparametrize \({\mathbf {z}}\) as described in Gelman et al. (2006): \(z_j = \xi \eta _j\), where \(\eta _j \sim N(0, \sigma ^2_{\eta })\). Then, \(\sqrt{\tau ^{S}} = |\xi |\sigma _\eta \). By the reparametrization, we can sample \(\tau ^{S}\) by sampling \(\xi \) and \(\sigma _\eta \):

$$\begin{aligned}&p(\xi \mid Y^{T}, Y^{S}, Z^{T},Z^{S}, \Theta _{-\tau ^{S}}) \\&\quad \propto N\left( \left[ 1/b + \omega ^S/\sigma ^2_\eta \right] ^{-1}\left[ a/b + \sum (Y^{T}-\mu ^{S}\varvec{1}) / \frac{\omega ^S}{\sigma ^2_\eta }\right] , [1/b + \omega ^S/\sigma ^2_\eta ]^{-1} \right) ,\\&p(\sigma ^2_\eta \mid Y^{T}, Y^{S}, Z^{T},Z^{S}, \Theta _{-\tau ^{S}}) \\&\quad \propto IG\left( 0.5+ \frac{N^{W}}{2}, 0.5 +\frac{ (Y^{S}-\mu ^{S}\varvec{1})'(\varvec{I}-\gamma \varvec{A})(Y^{S}-\mu ^{S}\varvec{1}) }{2\xi }\right) . \end{aligned}$$

Similarly, to sample \(\tau ^{T}\) from its full conditional distribution with the prior \(p(\tau ^{T}) \sim \) half-Cauchy(ab), we reparametrize \(\tau ^{T}\) as \(\sqrt{\tau ^{T}} = |\xi ^{T}|\sigma ^{T}_\eta \). By the reparametrization, we can sample \(\tau ^{T}\) by sampling \(\xi ^{T}\) and \(\sigma ^{T}_\eta \):

$$\begin{aligned}&p(\xi ^{T} \mid Y^{T}, Y^{S}, Z^{T},Z^{S}, \Theta _{-\tau ^{T}}) \\&\quad \propto N\left( \left[ 1/b + \omega ^T/(\sigma ^T_\eta )^2 \right] ^{-1}\left[ a/b + \sum (Y^T - \varvec{\mu }^{T}) / \frac{\omega ^T}{(\sigma _\eta ^T)^2}\right] , [1/b + \omega ^T/(\sigma _\eta ^T)^2]^{-1} \right) ,\\&\qquad p((\sigma _\eta ^T)^2_\eta \mid Y^{T}, Y^{S}, Z^{T},Z^{S}, \Theta _{-\tau ^{T}}) \propto IG\left( 0.5+ \frac{N^{W}}{2}, 0.5 +\frac{ (Y^T-\varvec{\mu }^{T})^2 }{2\xi ^T}\right) . \end{aligned}$$

Rights and permissions

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Kim, K., Kim, H., Kim, V. et al. A Multiscale Spatially Varying Coefficient Model for Regional Analysis of Topsoil Geochemistry. JABES 25, 74–89 (2020). https://doi.org/10.1007/s13253-019-00379-x

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s13253-019-00379-x

Keywords

Navigation