A Truly Spatial Random Forests Algorithm for Geoscience Data Analysis and Modelling
Mathematical Geosciences (IF 2.6), Pub Date: 2021-07-14, DOI: 10.1007/s11004-021-09946-w
Hassan Talebi, Luk J. M. Peeters, Alex Otto, Raimon Tolosana-Delgado

Spatial data mining helps to find hidden but potentially informative patterns in large, high-dimensional geoscience data. Non-spatial learners generally relate observations to one another only through the feature space, so they cannot account for spatial relationships between regionalised variables. This study introduces a novel spatial random forests technique, based on higher-order spatial statistics, for the analysis and modelling of spatial data. Unlike the classical random forests algorithm, which uses pixelwise spectral information as predictors, the proposed spatial random forests algorithm uses local spatial-spectral information (i.e., vectorised spatial patterns) to learn intrinsic heterogeneity, spatial dependencies, and complex spatial patterns. Algorithms are presented for supervised learning (i.e., regression and classification) and unsupervised learning (i.e., dimension reduction and clustering). Approaches for dealing with big data, multi-resolution data, and missing values are discussed. The superior performance and usefulness of the proposed algorithm relative to the classical random forests method are illustrated with synthetic and real cases, in which remotely sensed geophysical covariates from the North West Minerals Province of Queensland, Australia, are used as input spatial data for geology mapping, geochemical prediction, and process discovery analysis.
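
The contrast the abstract draws between pixelwise predictors and vectorised spatial patterns can be illustrated with a small sketch. This is not the authors' implementation: the function names `vectorise_patterns` and `pixelwise_features`, the `radius` parameter, the square window shape, and the choice to skip cells near the grid edge are all assumptions made for illustration. It only shows how a local window around each cell might be flattened into one predictor row; the rows could then be handed to any random forests implementation.

```python
from typing import List, Sequence

Grid = Sequence[Sequence[float]]  # one band stored as rows x columns


def vectorise_patterns(bands: Sequence[Grid], radius: int = 1) -> List[List[float]]:
    """Build one predictor row per interior cell from its local window.

    Each row concatenates, band by band, the (2*radius+1)**2 values of the
    square window centred on the cell, in row-major order. Cells whose
    window would cross the grid edge are skipped (an assumption; padding
    the grid would be another option).
    """
    n_rows, n_cols = len(bands[0]), len(bands[0][0])
    rows: List[List[float]] = []
    for r in range(radius, n_rows - radius):
        for c in range(radius, n_cols - radius):
            vec: List[float] = []
            for band in bands:
                for dr in range(-radius, radius + 1):
                    for dc in range(-radius, radius + 1):
                        vec.append(band[r + dr][c + dc])
            rows.append(vec)
    return rows


def pixelwise_features(bands: Sequence[Grid], radius: int = 1) -> List[List[float]]:
    """Classical predictors for the same cells: the centre value of each band only."""
    n_rows, n_cols = len(bands[0]), len(bands[0][0])
    return [
        [band[r][c] for band in bands]
        for r in range(radius, n_rows - radius)
        for c in range(radius, n_cols - radius)
    ]


# Hypothetical 3 x 3 single-band grid: only the centre cell has a full window.
band = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
spatial_rows = vectorise_patterns([band], radius=1)
pixel_rows = pixelwise_features([band], radius=1)
```

With a window of radius 1, each band contributes nine values per cell rather than one, so the learner sees the neighbourhood pattern as well as the centre value, at the cost of a wider predictor matrix.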




Updated: 2021-07-14