Analysing the impact of soil spatial sampling on the performances of Digital Soil Mapping models and their evaluation: A numerical experiment on Quantile Random Forest using clay contents obtained from Vis-NIR-SWIR hyperspectral imagery
Geoderma (IF 5.6), Pub Date: 2020-10-01, DOI: 10.1016/j.geoderma.2020.114503
P. Lagacherie, D. Arrouays, H. Bourennane, C. Gomez, L. Nkuba-Kasanda

Abstract: It has long been acknowledged that the soil spatial samplings used as inputs to Digital Soil Mapping (DSM) models are strong drivers, and often limiting factors, of the performances of such models. However, few studies have focused on evaluating this impact and identifying the sampling characteristics responsible for it. In this study, a numerical experiment was conducted on this topic, using pseudo values of topsoil clay content obtained from an airborne Visible Near InfraRed-Short Wave InfraRed (Vis-NIR-SWIR) hyperspectral image of the Cap Bon region (Tunisia) as the source of the spatial samplings. Twelve thousand DSM models were built by running a Random Forest algorithm on soil spatial samplings of different sizes and average spacings (from 200 m to 2000 m) and different spatial distributions (from clustered to regularly distributed), with the aim of mimicking the various situations encountered when handling legacy data. These DSM models were evaluated with regard to both their prediction performances and their ability to estimate their overall and local uncertainties. Three evaluation methods were applied: a model-based one, a classical model-free one using 25% of the sites removed from the initial soil data, and a reference one using a set of 100,000 independent sites selected by stratified random sampling over the entire region.
The results showed that: 1) as expected, the performances of the DSM models increased as the sample spacing decreased, but this gain diminished at the smallest spacings once 50% of the spatially structured variance was captured by the sampling; 2) samplings that were completely and evenly distributed in geographical space and that spanned as wide a range of the target soil property as possible increased DSM performances, whereas complete and even sampling distributions in covariate space had less impact; 3) the overall uncertainty of the DSM models was systematically underestimated, all the more so when sparse samplings poorly covered the real distribution of the target soil property and when dense samplings were unevenly distributed in geographical space; 4) the local uncertainties were underestimated for sparse samplings and overestimated for dense samplings, while being sensitive to the same sampling characteristics as the overall uncertainty. These findings have practical implications for sampling strategies and DSM model evaluation, which are discussed.
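
The experimental design summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes a square study area and shows one simple way to draw regularly spaced and clustered sampling sites, to set aside 25% of the sites for a model-free evaluation, and to check how often observed values fall inside predicted intervals. The function names (`regular_grid`, `clustered`, `holdout_split`, `interval_coverage`), the clustering scheme, and the use of interval coverage as an uncertainty check are all assumptions made for illustration; the abstract does not say which indicators or sampling algorithms were used.

```python
import random


def regular_grid(extent_m, spacing_m):
    """Sites at the cell centres of a square grid (hypothetical helper)."""
    n = int(extent_m // spacing_m)  # number of cells per side
    half = spacing_m / 2
    return [(i * spacing_m + half, j * spacing_m + half)
            for i in range(n) for j in range(n)]


def clustered(extent_m, n_sites, n_clusters, radius_m, rng):
    """Sites scattered around random cluster centres (assumed scheme)."""
    centres = [(rng.uniform(0, extent_m), rng.uniform(0, extent_m))
               for _ in range(n_clusters)]
    sites = []
    for k in range(n_sites):
        cx, cy = centres[k % n_clusters]
        # Clamp to the study area so every site stays inside the extent.
        x = min(max(cx + rng.uniform(-radius_m, radius_m), 0.0), extent_m)
        y = min(max(cy + rng.uniform(-radius_m, radius_m), 0.0), extent_m)
        sites.append((x, y))
    return sites


def holdout_split(sites, fraction, rng):
    """Randomly remove `fraction` of the sites for evaluation."""
    shuffled = list(sites)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (training, evaluation)


def interval_coverage(observed, lower, upper):
    """Share of observations that fall inside their predicted interval."""
    inside = sum(1 for o, lo, hi in zip(observed, lower, upper)
                 if lo <= o <= hi)
    return inside / len(observed)
```

For an arbitrary 10 km square, `regular_grid(10_000, 2000)` gives 25 sites and `regular_grid(10_000, 200)` gives 2500, which spans the range of spacings quoted in the abstract. If the intervals are meant to hold 90% of the values, a coverage well below 0.9 on an independent reference set would be one sign of the underestimated uncertainty that the results describe.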

Updated: 2020-10-01