Towards Geostatistical Learning for the Geosciences: A Case Study in Improving the Spatial Awareness of Spectral Clustering
Mathematical Geosciences (IF 2.6), Pub Date: 2020-06-08, DOI: 10.1007/s11004-020-09867-0
H. Talebi, L. J. M. Peeters, U. Mueller, R. Tolosana-Delgado, K. G. van den Boogaart

The particularities of geosystems and geoscience data must be understood before any development or implementation of statistical learning algorithms. Without such knowledge, the predictions and inferences may not be accurate and physically consistent. Accuracy, transparency and interpretability, credibility, and physical realism are minimum criteria for statistical learning algorithms when applied to the geosciences. This study briefly reviews several characteristics of geoscience data and challenges for novel statistical learning algorithms. A novel spatial spectral clustering approach is introduced to illustrate how statistical learners can be adapted for modelling geoscience data. The spatial awareness and physical realism of the spectral clustering are improved by utilising a dissimilarity matrix based on nonparametric higher-order spatial statistics. The proposed model-free technique can identify meaningful spatial clusters (i.e. meaningful geographical subregions) from multivariate spatial data at different scales without the need to define a model of co-dependence. Several mixed (e.g. continuous and categorical) variables can be used as inputs to the proposed clustering technique. The proposed technique is illustrated using synthetic and real mining datasets. The results of the case studies confirm the usefulness of the proposed method for modelling spatial data.
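
To make the general idea concrete, the following is a minimal sketch of spectral clustering driven by a spatially aware dissimilarity matrix. It is not the authors' method: the abstract builds its dissimilarity from nonparametric higher-order spatial statistics, which is not reproduced here. As a stand-in assumption, the sketch simply blends normalised geographic distance with normalised attribute distance. The function name `spatial_spectral_bipartition` and the parameters `alpha` and `sigma` are hypothetical, and the example is restricted to two clusters and to continuous variables.

```python
import numpy as np


def pairwise_dist(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def spatial_spectral_bipartition(coords, values, alpha=0.5, sigma=0.3):
    """Split samples into two spatial clusters (illustrative only).

    coords : (n, 2) sample locations
    values : (n, p) continuous attributes measured at those locations
    alpha  : assumed weight of geographic vs. attribute dissimilarity
    sigma  : assumed bandwidth of the Gaussian kernel
    """
    # Stand-in dissimilarity: a weighted blend of two distances, each
    # rescaled to [0, 1]. The paper uses a different, statistics-based matrix.
    d_geo = pairwise_dist(coords)
    d_att = pairwise_dist(values)
    d = alpha * d_geo / d_geo.max() + (1 - alpha) * d_att / d_att.max()

    # Turn dissimilarity into an affinity (similarity) graph.
    w = np.exp(-(d ** 2) / (2 * sigma ** 2))
    np.fill_diagonal(w, 0.0)

    # Symmetric normalised graph Laplacian: I - D^{-1/2} W D^{-1/2}.
    deg_inv_sqrt = 1.0 / np.sqrt(w.sum(axis=1))
    lap = np.eye(len(w)) - deg_inv_sqrt[:, None] * w * deg_inv_sqrt[None, :]

    # The eigenvector of the second-smallest eigenvalue, rescaled by
    # D^{-1/2}, gives a relaxed normalised-cut indicator; split on its sign.
    _, eigvecs = np.linalg.eigh(lap)
    fiedler = eigvecs[:, 1] * deg_inv_sqrt
    return (fiedler > 0).astype(int)


# Synthetic demo: two spatially separated zones with different grades.
rng = np.random.default_rng(0)
coords = np.vstack([
    rng.normal(0.0, 0.5, size=(20, 2)),
    rng.normal(0.0, 0.5, size=(20, 2)) + np.array([10.0, 0.0]),
])
values = np.vstack([
    rng.normal(0.0, 0.1, size=(20, 1)),
    rng.normal(5.0, 0.1, size=(20, 1)),
])
labels = spatial_spectral_bipartition(coords, values)
print(labels)
```

Setting `alpha` closer to 1 favours spatially compact clusters, while values closer to 0 let the attributes dominate. Handling more than two clusters or categorical inputs, both of which the abstract mentions, would require additional eigenvectors and a suitable mixed-type dissimilarity.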




Updated: 2020-06-08