Ensemble Spatial Interpolation: A New Approach to Natural or Anthropogenic Variable Assessment

Egaña, Alvaro; Navarro, Felipe; Maleki, Mohammad; Grandón, Francisca; Carter, Francisco; Soto, Fabián

doi:10.1007/s11053-021-09860-2

Original Paper
Published: 14 April 2021

Natural Resources Research, Volume 30, pages 3777–3793 (2021)

Alvaro Egaña ORCID: orcid.org/0000-0001-8720-4783¹,
Felipe Navarro¹,
Mohammad Maleki²,
Francisca Grandón¹,
Francisco Carter¹ &
…
Fabián Soto¹

Abstract

This paper presents a novel and versatile framework for building ensemble spatial interpolation functions. As with all ensemble methods, the central idea is to assemble a voting scheme in which a set of weak interpolation functions is combined, through an aggregation function, to produce a strong ensemble response. In the presented scheme, voter interpolation functions are weak because each deals with only a minimal portion of the sample data, extracted from the elements of random spatial partitions, while the ensemble as a whole uses as much of the available information as possible. The random partition scheme behaves as a bootstrapping strategy applied in a spatial context. Experiments show that the proposed framework can produce robust interpolation functions that both scale to large sample data sets and support uncertainty quantification, even when the weak voter interpolation functions are deterministic or highly data-consuming.
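The voting scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and does not use the pyESI API; it assumes a randomly shifted regular grid as the random partition, inverse distance weighting as the weak voter, and the arithmetic mean as the aggregation function. The names `idw`, `ensemble_interpolate`, `n_partitions`, and `cell_size` are hypothetical and chosen only for illustration.

```python
import numpy as np

def idw(train_xy, train_v, query_xy, power=2.0, eps=1e-12):
    """Inverse distance weighting: a simple deterministic weak voter (assumed here)."""
    d = np.linalg.norm(query_xy[:, None, :] - train_xy[None, :, :], axis=2)
    w = 1.0 / (d ** power + eps)          # eps avoids division by zero at sample points
    return (w @ train_v) / w.sum(axis=1)

def ensemble_interpolate(xy, v, query_xy, n_partitions=50, cell_size=0.25, seed=0):
    """Aggregate local IDW votes over randomly shifted grid partitions.

    Assumption: a shifted regular grid stands in for the paper's random
    partition scheme, which is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    lo = np.minimum(xy.min(axis=0), query_xy.min(axis=0))
    votes = np.full((n_partitions, len(query_xy)), np.nan)
    for p in range(n_partitions):
        # A fresh random origin shift gives a different partition each round.
        shift = rng.uniform(0.0, cell_size, size=xy.shape[1])
        s_cell = np.floor((xy - lo + shift) / cell_size).astype(int)
        q_cell = np.floor((query_xy - lo + shift) / cell_size).astype(int)
        for cell in np.unique(q_cell, axis=0):
            q_mask = np.all(q_cell == cell, axis=1)
            s_mask = np.all(s_cell == cell, axis=1)
            if s_mask.any():
                # Each voter sees only the samples inside its own cell.
                votes[p, q_mask] = idw(xy[s_mask], v[s_mask], query_xy[q_mask])
    # Mean of the votes is the ensemble response; their spread is a crude
    # indication of uncertainty. Queries that never received a vote stay NaN.
    return np.nanmean(votes, axis=0), np.nanstd(votes, axis=0)
```

Calling `ensemble_interpolate(xy, v, query_xy)` with an `(n, 2)` array of positions, an `(n,)` array of measurements, and an `(m, 2)` array of query locations returns two `(m,)` arrays. Because each voter only touches the samples in one cell, the cost per partition stays small even for large sample sets, which is the scaling argument the abstract makes.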

Data Availability

For further experimentation, data used in this work and a Python 3.x open-source library with algorithm implementations can be found here: https://github.com/alges/pyESI.

Notes

By the French mathematician Marie-Jean-Antoine Nicolas de Caritat, 1785.
\(\theta = [a_1^{\theta }, b_1^{\theta }] \times \cdots \times [a_d^{\theta }, b_d^{\theta}].\)
Sub-index \(({{{\mathcal {P}}}, {{\mathcal {M}}}})\) expresses that the interpolation is made using the information provided by positions and measurements.
Tree data structures are organized in levels: root node is at level 0, its child nodes are at level 1, and so on. The depth or height is the maximum level of a leaf (terminal) node in a tree (Preiss 1999).
This can be easily verified by noticing that each internal tree node has two children, so the number of nodes per level grows exponentially with base 2.
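The notes on hyper-rectangles and tree levels can be checked with a small sketch. It assumes, purely for illustration, that each node's box is cut in two along a randomly chosen axis at a uniformly random position; the function name `build_levels` and this splitting rule are hypothetical and are not taken from the paper.

```python
import random

def build_levels(root, height, seed=0):
    """Split an axis-aligned box recursively; return the boxes at every tree level.

    A box is a list of (a, b) intervals, one per dimension. The root is level 0.
    """
    rng = random.Random(seed)
    levels = [[root]]
    for _ in range(height):
        children = []
        for box in levels[-1]:
            axis = rng.randrange(len(box))   # assumed rule: random axis
            a, b = box[axis]
            cut = rng.uniform(a, b)          # assumed rule: uniform cut position
            left, right = list(box), list(box)
            left[axis] = (a, cut)
            right[axis] = (cut, b)
            children += [left, right]        # every node gets exactly two children
        levels.append(children)
    return levels

root = [(0.0, 1.0), (0.0, 1.0)]              # unit square as the root hyper-rectangle
levels = build_levels(root, height=4)
counts = [len(level) for level in levels]
print(counts)  # [1, 2, 4, 8, 16]
```

Level l holds 2^l boxes, so a tree of height 4 has 16 leaves and 31 nodes in total, and the leaf boxes together tile the root box.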

References

Akima, H. (1978). A method of bivariate interpolation and smooth surface fitting for irregularly distributed data points. ACM Transactions on Mathematical Software (TOMS), 4(2), 148–159. https://doi.org/10.1145/355780.355786.
Battalgazy, N., & Madani, N. (2019). Categorization of mineral resources based on different geostatistical simulation algorithms: a case study from an iron ore deposit. Natural Resources Research. https://doi.org/10.1007/s11053-019-09474-9.
Bentley, J. L. (1975). Multidimensional binary search trees used for associative searching. Communications of the ACM, 18(9), 509–517. https://doi.org/10.1145/361002.361007.
Boisvert, J. B., & Deutsch, C. V. (2011). Programs for kriging and sequential Gaussian simulation with locally varying anisotropy using non-Euclidean distances. Computers and Geosciences. https://doi.org/10.1016/j.cageo.2010.03.021.
Breiman, L. (2001). Random forests. Machine Learning. https://doi.org/10.1023/A:1010933404324.
Burrough, P. A. (1986). Principles of geographical information systems for land resources assessment. https://doi.org/10.1097/00010694-198710000-00012.
Chan, P. K., & Stolfo, S. J. (1995). A comparative evaluation of voting and meta-learning on partitioned data. In Machine learning proceedings 1995, ICML’95, https://doi.org/10.1016/b978-1-55860-377-6.50020-7.
Chan, P. K., & Stolfo, S. J. (1997). On the accuracy of meta-learning for scalable data mining. Journal of Intelligent Information Systems. https://doi.org/10.1023/A:1008640732416.
Chilès, J. P., & Delfiner, P. (2012). Geostatistics: Modeling spatial uncertainty. 2nd edition. New York: Wiley. https://doi.org/10.1002/9781118136188.
Cohen, S., & Intrator, N. (2000). A hybrid projection based and radial basis function architecture. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). https://doi.org/10.1007/3-540-45014-9_14.
Cohen, S., & Intrator, N. (2002). A hybrid projection-based and radial basis function architecture: Initial values and global optimisation. Pattern Analysis and Applications. https://doi.org/10.1007/s100440200010.
Collins, M., Schapire, R. E., & Singer, Y. (2002). Logistic regression, AdaBoost and Bregman distances. Machine Learning, 48(1–3), 253–285.
Cressie, N. (2015). Statistics for spatial data. New York: Wiley.
Davies, M. M., & Van Der Laan, M. J. (2016). Optimal spatial prediction using ensemble machine learning. International Journal of Biostatistics. https://doi.org/10.1515/ijb-2014-0060.
Den Hertog, D., Kleijnen, J. P., & Siem, A. Y. (2006). The correct Kriging variance estimated by bootstrapping. Journal of the Operational Research Society. https://doi.org/10.1057/palgrave.jors.2601997.
Dietterich, T. G. (2000). An experimental comparison of three methods for constructing ensembles of decision trees. Machine Learning. https://doi.org/10.1023/A:1007607513941.
Duin, R. P. (2002). The combining classifier: To train or not to train? Proceedings - International Conference on Pattern Recognition. https://doi.org/10.1109/icpr.2002.1048415.
Džeroski, S., & Ženko, B. (2004). Is combining classifiers with stacking better than selecting the best one? Machine Learning. https://doi.org/10.1023/B:MACH.0000015881.36452.6e.
Emery, X., & Arroyo, D. (2018). On a continuous spectral algorithm for simulating non-stationary Gaussian random fields. Stochastic Environmental Research and Risk Assessment. https://doi.org/10.1007/s00477-017-1402-3.
Emery, X., & Maleki, M. (2019). Geostatistics in the presence of geological boundaries: Application to mineral resources modeling. Ore Geology Reviews. https://doi.org/10.1016/j.oregeorev.2019.103124.
Evgeniou, T., Pontil, M., & Elisseeff, A. (2004). Leave one out error, stability, and generalization of voting combinations of classifiers. Machine Learning. https://doi.org/10.1023/B:MACH.0000019805.88351.60.
Fernández-Delgado, M., Cernadas, E., Barro, S., & Amorim, D. (2014). Do we need hundreds of classifiers to solve real world classification problems? The Journal of Machine Learning Research, 15(1), 3133–3181.
Fouedjio, F., & Séguret, S. (2016). Predictive geological mapping using closed-form non-stationary covariance functions with locally varying anisotropy: Case study at El Teniente Mine (Chile). Natural Resources Research. https://doi.org/10.1007/s11053-016-9293-4.
Franco-Villoria, M., & Ignaccolo, R. (2017). Bootstrap based uncertainty bands for prediction in functional kriging. Spatial Statistics. https://doi.org/10.1016/j.spasta.2017.06.005.
Franke, R. (1982). Smooth interpolation of scattered data by local thin plate splines. Computers and Mathematics with Applications. https://doi.org/10.1016/0898-1221(82)90009-8.
Franke, R., & Nielson, G. M. (1991). Scattered data interpolation and applications: A tutorial and survey. In Geometric modeling, Springer (pp. 131–160). https://doi.org/10.1007/978-3-642-76404-2_6.
Friedman, J., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting (with discussion and a rejoinder by the authors). The Annals of Statistics, 28(2), 337–407.
Georganos, S., Grippa, T., Niang Gadiaga, A., Linard, C., Lennert, M., Vanhuysse, S., et al. (2019). Geographical random forests: A spatial extension of the random forest algorithm to address spatial heterogeneity in remote sensing and population modelling. Geocarto International. https://doi.org/10.1080/10106049.2019.1595177.
Gielsdorf, F., & Hillmann, T. (2012). Mathematics and statistics. In Kresse, W., & Danko, D. M. (eds.), Springer handbook of geographic information, Berlin: Springer (pp. 7–10). https://doi.org/10.1007/978-3-540-72680-7_2.
Guhaniyogi, R., & Banerjee, S. (2019). Multivariate spatial meta kriging. Statistics and Probability Letters. https://doi.org/10.1016/j.spl.2018.04.017.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). Elements of statistical learning. 2nd ed. Springer. https://doi.org/10.1007/978-0-387-84858-7.
Hengl, T., Heuvelink, G. B., Kempen, B., Leenaars, J. G., Walsh, M. G., Shepherd, K. D., et al. (2015). Mapping soil properties of Africa at 250 m resolution: Random forests significantly improve current predictions. PLoS ONE. https://doi.org/10.1371/journal.pone.0125814.
Hothorn, T., & Lausen, B. (2005). Bundling classifiers by bagging trees. Computational Statistics and Data Analysis. https://doi.org/10.1016/j.csda.2004.06.019.
Huang, Y. S., & Suen, C. Y. (1995). A method of combining multiple experts for the recognition of unconstrained handwritten numerals. IEEE Transactions on Pattern Analysis and Machine Intelligence. https://doi.org/10.1109/34.368145.
Jacobs, R. A. (1995). Methods for combining experts’ probability assessments. Neural Computation, 7(5), 867–888. https://doi.org/10.1162/neco.1995.7.5.867.
Jacobs, R. A., Jordan, M. I., Nowlan, S. J., & Hinton, G. E. (1991). Adaptive mixtures of local experts. Neural Computation. https://doi.org/10.1162/neco.1991.3.1.79.
Jordan, M. I., & Xu, L. (1995). Convergence results for the EM approach to mixtures of experts architectures. Neural Networks. https://doi.org/10.1016/0893-6080(95)00014-3.
Journel, A. G., & Huijbregts, C. J. (1978). Mining geostatistics (Vol. 600). London: Academic press.
Kleijnen, J. P. C. (2012). Simulation optimization via bootstrapped kriging: Tutorial. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.1860175.
Krcho, J. (1973). Morphometric analysis of relief on the basis of geometric aspect of field theory. Acta Geographica Universitatis Comenianae, Geographico-Physica, 1(1), 7–233.
Kuncheva, L. I. (2014). Combining pattern classifiers: methods and algorithms (2nd ed.). New York: Wiley. https://doi.org/10.1002/9781118914564.
Lakshminarayanan, B., Roy, D. M., & Teh, Y. W. (2014). Mondrian forests: Efficient online random forests. Advances in Neural Information Processing Systems, 4, 3140–3148.
Lakshminarayanan, B., Roy, D. M., & Teh, Y. W. (2016). Mondrian forests for large-scale regression when uncertainty matters. In Proceedings of the 19th international conference on artificial intelligence and statistics, AISTATS 2016.
Lantuéjoul, C. (2002). Geostatistical simulation. Berlin: Springer. https://doi.org/10.1007/978-3-662-04808-5.
Laslett, G. M., McBratney, A. B., Pahl, P. J., & Hutchinson, M. F. (1987). Comparison of several spatial prediction methods for soil pH. Journal of Soil Science, 38(2), 325–341. https://doi.org/10.1111/j.1365-2389.1987.tb02148.x.
Li, J., & Heap, A. D. (2014). Spatial interpolation methods applied in the environmental sciences: A review. Environmental Modelling & Software, 53, 173–189. https://doi.org/10.1016/j.envsoft.2013.12.008.
Li, J., Heap, A. D., Potter, A., & Daniell, J. J. (2011). Application of machine learning methods to spatial interpolation of environmental variables. Environmental Modelling and Software. https://doi.org/10.1016/j.envsoft.2011.07.004.
Liu, Y., Cao, G., Zhao, N., Mulligan, K., & Ye, X. (2018). Improve ground-level PM2.5 concentration mapping using a random forests-based geostatistical approach. Environmental Pollution. https://doi.org/10.1016/j.envpol.2017.12.070.
Matheron, G. (1965). Les variables régionalisées et leur estimation: une application de la théorie des fonctions aléatoires aux sciences de la nature. Masson et CIE.
McCauley, J. D., & Engel, B. A. (1997). Approximation of noisy bivariate traverse data for precision mapping. Transactions of the American Society of Agricultural Engineers, 40(1), 237–245. https://doi.org/10.13031/2013.21236.
Menafoglio, A., Gaetani, G., & Secchi, P. (2018). Random domain decompositions for object-oriented Kriging over complex domains. Stochastic Environmental Research and Risk Assessment, 32(12), 3421–3437. https://doi.org/10.1007/s00477-018-1596-z.
Mitáš, L., & Mitášová, H. (1988). General variational approach to the interpolation problem. Computers and Mathematics with Applications. https://doi.org/10.1016/0898-1221(88)90255-6.
Mitáš, L., & Mitášová, H. (1999). Finding appropriate interpolation methods for. Geographical information systems: Principles, techniques, management and applications, 1, 481–492.
Nwaila, G. T., Zhang, S. E., Frimmel, H. E., Manzi, M. S., Dohm, C., Durrheim, R. J., et al. (2020). Local and target exploration of conglomerate-hosted gold deposits using machine learning algorithms: A case study of the Witwatersrand gold ores, South Africa. Natural Resources Research. https://doi.org/10.1007/s11053-019-09498-1.
Orton, T. G., Pringle, M. J., Bishop, T. F., Menzies, N. W., & Dang, Y. P. (2020). Increment-averaged kriging for 3-D modelling and mapping soil properties: Combining machine learning and geostatistical methods. Geoderma. https://doi.org/10.1016/j.geoderma.2019.114094.
Philip, G. M., & Watson, D. F. (1987). Neighborhood discontinuities in bivariate interpolation of scattered observations. Mathematical Geology, 19(1), 69–74. https://doi.org/10.1007/BF01275435.
Preiss, B. R. (1999). Data structures and algorithms. New York: Wiley.
Re, M., & Valentini, G. (2012). Ensemble methods: A review. In M. J. Way, J. D. Scargle, K. M. Ali, & A. N. Srivastava (Eds.), Advances in machine learning and data mining for astronomy (pp. 563–593). New York: Taylor & Francis.
Reid, S., & Grudic, G. (2009). Regularized linear models in stacked generalization. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics). https://doi.org/10.1007/978-3-642-02326-2_12.
Roy, D. M., & Teh, Y. W. (2009). The Mondrian process. In Advances in neural information processing systems 21: Proceedings of the 2008 conference.
Sagi, O., & Rokach, L. (2018). Ensemble learning: A survey. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 8(4), e1249.
Sekulić, A., Kilibarda, M., Heuvelink, G. B., Nikolić, M., & Bajat, B. (2020). Random forest spatial interpolation. Remote Sensing. https://doi.org/10.3390/rs12101687.
Shepard, D. (1968). A two-dimensional interpolation function for irregularly-spaced data. In Proceedings of the 1968 23rd ACM national conference, ACM 1968.
Sibson, R. (1981). A brief description of natural neighbour interpolation in interpreting multivariate data. New York: Wiley.
Sjöstedt-de Luna, S., & Young, A. (2003). The bootstrap and kriging prediction intervals. Scandinavian Journal of Statistics. https://doi.org/10.1111/1467-9469.00325.
Su, H., Shen, W., Wang, J., Ali, A., & Li, M. (2020). Machine learning and geostatistical approaches for estimating aboveground biomass in Chinese subtropical forests. Forest Ecosystems. https://doi.org/10.1186/s40663-020-00276-7.
Tibshirani, R. J., & Efron, B. (1993). An introduction to the bootstrap. Monographs on Statistics and Applied Probability, 57, 1–436.
Watson, D. F. (1992). Contouring: A guide to the analysis and display of spatial data. Amsterdam: Elsevier. https://doi.org/10.1016/0098-3004(93)90069-h.
Wilkinson, B., & Allen, M. (2004). Parallel programming: Techniques and applications using networked workstations and parallel computers (2nd ed.). New York: Prentice-Hall Inc.
Zhang, S. E., Nwaila, G. T., Tolmay, L., Frimmel, H. E., & Bourdeau, J. E. (2020). Integration of machine learning algorithms with Gompertz curves and kriging to estimate resources in gold deposits. Natural Resources Research. https://doi.org/10.1007/s11053-020-09750-z.

Acknowledgments

The authors acknowledge funding from the Chilean National Agency for Research and Development (ANID) PIA-Project AFB180004. The third author also acknowledges funding from ANID through grant CONICYT/FONDECYT/N°3180655. The authors also acknowledge the valuable support of Prof. Xavier Emery and all team members of the ALGES Lab at the AMTC and the Department of Mining Engineering at the University of Chile. Finally, the authors thank the three anonymous reviewers whose comments and suggestions helped improve and clarify this manuscript.

Author information

Authors and Affiliations

Advanced Laboratory of Geostatistical Supercomputing (ALGES), Advanced Mining Technology Center (AMTC), Department of Mining Engineering, Universidad de Chile, Santiago, Chile
Alvaro Egaña, Felipe Navarro, Francisca Grandón, Francisco Carter & Fabián Soto
Department of Metallurgical and Mining Engineering, Universidad Católica del Norte, Antofagasta, Chile
Mohammad Maleki

Corresponding author

Correspondence to Alvaro Egaña.

About this article

Cite this article

Egaña, A., Navarro, F., Maleki, M. et al. Ensemble Spatial Interpolation: A New Approach to Natural or Anthropogenic Variable Assessment. Nat Resour Res 30, 3777–3793 (2021). https://doi.org/10.1007/s11053-021-09860-2

Received: 05 October 2020
Accepted: 06 March 2021
Published: 14 April 2021
Issue Date: October 2021
DOI: https://doi.org/10.1007/s11053-021-09860-2
