Abstract
A motivating example for this paper is to study a topsoil geochemical process across a large region. In regional environmental health studies, ambient levels of toxic substances in topsoil are commonly used as surrogates for personal exposure to toxic substances. However, toxicity levels in topsoil are usually sparsely measured at a limited number of point locations. Consequently, topsoil measurements only provide highly localized regional information and cannot be representative of the surrounding area. Instead, it is standard practice to use point-referenced measurements of stream sediments, because they are widely available across a region and are correlated with topsoil measurements at nearby locations. For more effective regional modeling of topsoil geochemistry, we develop a spatially varying coefficient model that integrates point-level topsoil and point-referenced area-level stream sediment data. The proposed model incorporates two spatial characteristics: the local spatial autocorrelation in the latent topsoil process and the spatially varying relationship between the latent topsoil and stream sediment processes. The former is modeled indirectly via a conditional autoregressive model for the stream sediment process, and the latter is modeled by spatially varying coefficients that follow a multivariate Gaussian process. We apply the proposed model to a real dataset of arsenic concentration and demonstrate better performance than competing models.
References
Aelion, C. M., Davis, H. T., McDermott, S., and Lawson, A. B. (2008), “Metal concentrations in rural topsoil in South Carolina: potential for human health impact,” Science of the Total Environment, 402, 149–156.
Calder, C., Craigmile, P., and Zhang, J. (2009), “Regional spatial modeling of topsoil geochemistry,” Biometrics, 65, 206–215.
Calder, C. A., Holloman, C. H., Bortnick, S. M., Strauss, W., and Morara, M. (2008), “Relating ambient particulate matter concentration levels to mortality using an exposure simulator,” Journal of the American Statistical Association, 103, 137–148.
Cannon, W. F., Woodruff, L. G., and Pimley, S. (2004), “Some statistical relationships between stream sediment and soil geochemistry in northwestern Wisconsin–can stream sediment compositions be used to predict compositions of soils in glaciated terranes?” Journal of Geochemical Exploration, 81, 29–46.
Cressie, N., Buxton, B. E., Calder, C. A., Craigmile, P. F., Dong, C., McMillan, N. J., Morara, M., Santner, T. J., Wang, K., Young, G., et al. (2007), “From sources to biomarkers: a hierarchical Bayesian approach for human exposure modeling,” Journal of Statistical Planning and Inference, 137, 3361–3379.
Gelfand, A. E., Kim, H.-J., Sirmans, C., and Banerjee, S. (2003), “Spatial modeling with spatially varying coefficient processes,” Journal of the American Statistical Association, 98, 387–396.
Gelman, A., et al. (2006), “Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper),” Bayesian Analysis, 1, 515–534.
Geweke, J. (1992), “Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments,” in Proceedings of the Fourth Valencia International Conference on Bayesian Statistics, New York: Oxford University, 169–193.
Gneiting, T. and Raftery, A. E. (2007), “Strictly proper scoring rules, prediction, and estimation,” Journal of the American Statistical Association, 102, 359–378.
Grunsky, E. C., Drew, L. J., and Sutphin, D. M. (2009), “Process recognition in multi-element soil and stream-sediment geochemical data,” Applied Geochemistry, 24, 1602–1616.
Kim, H. and Lee, J. (2017), “Hierarchical spatially varying coefficient process model,” Technometrics, 59, 521–527.
Krueger, F., Lerch, S., Thorarinsdottir, T. L., and Gneiting, T. (2016), “Probabilistic forecasting and comparative model assessment based on Markov chain Monte Carlo output,” arXiv preprint arXiv:1608.06802.
Martín, J. A. R., Arias, M. L., and Corbí, J. M. G. (2006), “Heavy metals contents in agricultural topsoils in the Ebro basin (Spain). Application of the multivariate geoestatistical methods to study spatial variations,” Environmental Pollution, 144, 1001–1012.
Mielke, H., Gonzales, C., Smith, M., and Mielke, P. (1999), “The urban environment and children’s health: soils as an integrator of lead, zinc, and cadmium in New Orleans, Louisiana, USA,” Environmental Research, 81, 117–129.
Minasny, B. and McBratney, A. B. (2007), “Spatial prediction of soil properties using EBLUP with the Matérn covariance function,” Geoderma, 140, 324–336.
Robinson, T. and Metternicht, G. (2006), “Testing the performance of spatial interpolation techniques for mapping soil properties,” Computers and Electronics in Agriculture, 50, 97–108.
Seaber, P. R., Kapinos, F. P., and Knapp, G. L. (1987), “Hydrologic unit maps,” Tech. rep., Denver, Colorado, U.S. Geological Survey.
Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and Van Der Linde, A. (2002), “Bayesian measures of model complexity and fit,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64, 583–639.
Wei, J.-B., Xiao, D.-N., Zeng, H., and Fu, Y.-K. (2008), “Spatial variability of soil properties in relation to land use and topography in a typical small watershed of the black soil region, northeastern China,” Environmental Geology, 53, 1663–1672.
WHO. (1981), Arsenic, Geneva: World Health Organization. Environmental Health Criteria 18.
Acknowledgements
The authors thank the Editor, Associate Editor, and Referee for reviewing the manuscript and providing valuable comments. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2018R1C1B6004511).
Appendix: Derivation of the Full Conditional Distributions
We define the parameter set \(\Theta =\{\tilde{\varvec{\beta }},\varvec{\mu _{\beta }},T, \phi _0, \tau ^{T}, \omega ^{T}, \mu ^{S}, \tau ^{S}, \gamma , \omega ^{S} \}\). Let \(\Theta _{-\theta }\) denote the parameter set excluding the parameter \(\theta \). Similarly, let \(Y^{S}(W_{-j})\) denote the column vector of the latent stream sediment process in the watersheds \(W_1, \ldots , W_{N^W}\), except for the jth watershed, \(W_j\). The full posterior distribution is given by
With the prior specifications presented in Sect. 3, the full conditional distributions for the parameters in the proposed model are derived. The full conditional distribution for \(Y^{S}(W_{j})\) is
where \(\mu _{1} = \sum _{k=1}^{N_{j}^{S}} Z_{k}^{S}(W_{j}) /N_{j}^{S}\), \( \sigma _{1} = \omega ^{S}/N_{j}^{S}\), \(\mu _{2} = \sum _{s_{i} \in W_{j}} ((Y^{T}(s_i)-\beta _{0}(s_{i}))\beta _{1}(s_{i})) /\sum _{s_{i} \in W_{j}} (\beta _{1}(s_{i}))^{2}\), \( \sigma _{2} = \tau ^T / \sum _{s_{i} \in W_{j}}(\beta _{1}(s_{i}))^{2} \), \(\mu _{3} = \mu _{j} + \Sigma _{j(-j)} \Sigma _{(-j)(-j)}^{-1} (Y^{S}(W_{-j}) - \mu _{(-j)} )\), and \(\sigma _{3} = \Sigma _{jj} -\Sigma _{j(-j)}\Sigma _{(-j)(-j)}^{-1}\Sigma _{(-j)j}\). Here \(\mu _{j}\) denotes the jth element of \(\mu ^{S} \varvec{1}\) and \(\mu _{(-j)}\) denotes the vector of all elements in \(\mu ^{S} \varvec{1}\) except for the jth element. Similarly, \(\Sigma _{jj}\) denotes the (j, j)th element of \(\tau ^{S} (\varvec{I}-\gamma \varvec{A})^{-1}\), \(\Sigma _{j(-j)}\) denotes the jth row of that matrix with its jth element removed (with \(\Sigma _{(-j)j}\) its transpose), and \(\Sigma _{(-j)(-j)}\) denotes the submatrix obtained by deleting the jth row and the jth column. Equation (5) is further simplified as
where \(\mu _{4} = \frac{\mu _{1}\sigma _{2}+\mu _{2}\sigma _{1}}{\sigma _{1}+\sigma _{2}}\) and \(\sigma _{4} = \frac{\sigma _{1}\sigma _{2}}{\sigma _{1} + \sigma _{2}}\).
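The displayed forms of Eq. (5) and its simplification are not reproduced in this text. The definitions of \(\mu _1, \ldots , \mu _4\) and \(\sigma _1, \ldots , \sigma _4\) suggest that the full conditional of \(Y^{S}(W_{j})\) is proportional to a product of three normal densities, two of which are merged into \(N(\mu _4, \sigma _4)\). The following Python sketch works under that assumption, treating every \(\sigma \) as a variance. The function names `combine_normals`, `car_conditional`, and `draw_y_s` are hypothetical and are not taken from the paper.

```python
import numpy as np

def combine_normals(m_a, v_a, m_b, v_b):
    """Mean and variance of the normal density proportional to
    N(x | m_a, v_a) * N(x | m_b, v_b); matches the form of mu_4, sigma_4."""
    v = v_a * v_b / (v_a + v_b)
    m = (m_a * v_b + m_b * v_a) / (v_a + v_b)
    return m, v

def car_conditional(j, y_s, mu_s, tau_s, gamma, A):
    """mu_3 and sigma_3: conditional mean and variance of component j
    under N(mu_s * 1, tau_s * (I - gamma * A)^{-1}), A symmetric."""
    n = len(y_s)
    Sigma = tau_s * np.linalg.inv(np.eye(n) - gamma * A)
    rest = np.delete(np.arange(n), j)
    S_jr = Sigma[j, rest]                     # Sigma_{j(-j)}
    S_rr = Sigma[np.ix_(rest, rest)]          # Sigma_{(-j)(-j)}
    w = np.linalg.solve(S_rr, S_jr)           # Sigma_{(-j)(-j)}^{-1} Sigma_{(-j)j}
    mu3 = mu_s + w @ (y_s[rest] - mu_s)
    sigma3 = Sigma[j, j] - w @ S_jr
    return mu3, sigma3

def draw_y_s(j, z_j, y_t, beta0, beta1, y_s, mu_s, tau_s, tau_t,
             omega_s, gamma, A, rng):
    """One Gibbs draw of Y^S(W_j), assuming the three-normal product form.
    z_j: sediment measurements in W_j; y_t, beta0, beta1: values at the
    topsoil sites s_i inside W_j."""
    mu1, s1 = z_j.mean(), omega_s / len(z_j)
    mu2 = np.sum((y_t - beta0) * beta1) / np.sum(beta1 ** 2)
    s2 = tau_t / np.sum(beta1 ** 2)
    mu4, s4 = combine_normals(mu1, s1, mu2, s2)
    mu3, s3 = car_conditional(j, y_s, mu_s, tau_s, gamma, A)
    m, v = combine_normals(mu4, s4, mu3, s3)
    return rng.normal(m, np.sqrt(v))

# Toy example: four watersheds on a path graph (illustrative values only).
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], float)
y_s = np.array([0.5, 1.0, -0.2, 0.3])
rng = np.random.default_rng(1)
new_value = draw_y_s(1, np.array([0.9, 1.1, 1.3]), np.array([2.0, 2.4]),
                     np.array([0.1, 0.2]), np.array([1.5, 1.8]),
                     y_s, 0.1, 2.0, 0.5, 0.3, 0.3, A, rng)
```

Inverting the full covariance matrix is the most direct reading of the formulas; because the precision matrix \((\varvec{I}-\gamma \varvec{A})/\tau ^{S}\) is available in closed form, \(\mu _3\) and \(\sigma _3\) can also be read off from its jth row without any inversion.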
The full conditional distribution for \(Y^{T}(s_{i})\) is
The full conditional distribution for \(\omega ^{T}\), with the prior \(p(\omega ^{T})\sim IG(a, b)\), is
The full conditional distribution for \(\omega ^{S}\), with the prior \(p(\omega ^{S}) \sim IG(a, b)\), is
The full conditional distribution for \(\mu ^{S}\), with the prior \(p(\mu ^{S}) \sim N(a, b)\), is
where \(\mu _{5} = \left( \varvec{1}_{N^{W}}'(\tau ^{S}(\varvec{I}-\gamma \varvec{A})^{-1})^{-1}Y^{S} + a/b\right) / (\varvec{1}_{N^{W}}'(\tau ^{S}(\varvec{I}-\gamma \varvec{A})^{-1})^{-1}\varvec{1}_{N^{W}} +1/b)\) and \(\sigma _{5} = (\varvec{1}_{N^{W}}'(\tau ^{S}(\varvec{I}-\gamma \varvec{A})^{-1})^{-1}\varvec{1}_{N^{W}} +1/b)^{-1}\).
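As an illustration of the expressions for \(\mu _5\) and \(\sigma _5\), the sketch below computes them with NumPy. It assumes, as the \(1/b\) terms imply, that \(b\) is the prior variance of \(\mu ^{S}\) and that \(\sigma _5\) is a variance; the function name `mu_s_full_conditional` is hypothetical.

```python
import numpy as np

def mu_s_full_conditional(y_s, tau_s, gamma, A, a, b):
    """Return (mu_5, sigma_5) for the normal full conditional of mu^S.
    y_s: latent stream sediment values, one per watershed.
    a, b: prior mean and prior variance of mu^S (assumed reading of N(a, b))."""
    n = len(y_s)
    # (tau_s * (I - gamma * A)^{-1})^{-1} simplifies to (I - gamma * A) / tau_s.
    Q = (np.eye(n) - gamma * A) / tau_s
    one = np.ones(n)
    precision = one @ Q @ one + 1.0 / b
    mu5 = (one @ Q @ y_s + a / b) / precision
    sigma5 = 1.0 / precision
    return mu5, sigma5

# Toy example with no spatial dependence (gamma = 0); values are illustrative.
mu5, sigma5 = mu_s_full_conditional(
    np.array([1.0, 2.0, 3.0, 4.0]), tau_s=2.0, gamma=0.0,
    A=np.zeros((4, 4)), a=0.0, b=1.0)
draw = np.random.default_rng(0).normal(mu5, np.sqrt(sigma5))
```

Working with the precision matrix directly avoids forming and then re-inverting \((\varvec{I}-\gamma \varvec{A})^{-1}\).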
The full conditional distribution for \(\varvec{\mu }^{T}\), with the prior \(p(\varvec{\mu }^{T}) \sim N(\varvec{\mu }_\beta , \Sigma _\beta )\), is
The full conditional distribution for \(\tilde{\varvec{\beta }}\), with the prior \(p(\tilde{\varvec{\beta }}) \sim N(\varvec{\mu }_{\beta _{s}}, \Sigma _{\beta _{s}})\), is
To sample \(\tau ^{S}\) from its full conditional distribution with the prior \(p(\tau ^{S}) \sim \) half-Cauchy(a, b), we rewrite Eq. (3) as follows:
where \({\mathbf {z}}=(z_1,\ldots , z_{N^W})^T\) is a vector of spatial random effects with the element \(z_j\) for watershed \(W_j\), and \({\varvec{0}}\) is a vector of zeros. We reparametrize \({\mathbf {z}}\) as described in Gelman et al. (2006): \(z_j = \xi \eta _j\), where \(\eta _j \sim N(0, \sigma ^2_{\eta })\). Then, \(\sqrt{\tau ^{S}} = |\xi |\sigma _\eta \). By the reparametrization, we can sample \(\tau ^{S}\) by sampling \(\xi \) and \(\sigma _\eta \):
Similarly, to sample \(\tau ^{T}\) from its full conditional distribution with the prior \(p(\tau ^{T}) \sim \) half-Cauchy(a, b), we reparametrize \(\tau ^{T}\) as \(\sqrt{\tau ^{T}} = |\xi ^{T}|\sigma ^{T}_\eta \). By the reparametrization, we can sample \(\tau ^{T}\) by sampling \(\xi ^{T}\) and \(\sigma ^{T}_\eta \):
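The displayed sampling steps for \(\xi \) and \(\sigma _\eta \) are not reproduced in this text, and the sketch below does not attempt to reconstruct them. It only checks the prior side of the reparametrization by simulation, under an assumed parameter-expansion setup in the style of Gelman et al. (2006): \(\xi \sim N(0, A^2)\) and \(\sigma ^2_\eta \sim \text{Inv-Gamma}(1/2, 1/2)\), so that \(|\xi |\sigma _\eta \) is the absolute ratio of two independent centered normals and therefore half-Cauchy with scale \(A\). The scale `A_scale` and the function name are hypothetical, and the paper's own half-Cauchy(a, b) parametrization may differ.

```python
import numpy as np

def sample_implied_sd(A_scale, size, rng):
    """Draw |xi| * sigma_eta under the assumed expanded prior."""
    xi = rng.normal(0.0, A_scale, size)
    # sigma_eta^2 ~ Inv-Gamma(1/2, 1/2) is equivalent to
    # 1 / sigma_eta^2 ~ chi-square with 1 degree of freedom.
    sigma_eta = 1.0 / np.sqrt(rng.chisquare(1, size))
    return np.abs(xi) * sigma_eta

rng = np.random.default_rng(0)
A_scale = 2.0
sd_draws = sample_implied_sd(A_scale, 200_000, rng)

# The median of a half-Cauchy distribution equals its scale, so the
# empirical median should be close to A_scale.
empirical_median = np.median(sd_draws)
variance_draws = sd_draws ** 2   # implied draws of the variance parameter
```

The median is used for the check because the half-Cauchy distribution has no finite mean.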
Cite this article
Kim, K., Kim, H., Kim, V. et al. A Multiscale Spatially Varying Coefficient Model for Regional Analysis of Topsoil Geochemistry. JABES 25, 74–89 (2020). https://doi.org/10.1007/s13253-019-00379-x
DOI: https://doi.org/10.1007/s13253-019-00379-x