Prediction of topsoil organic carbon content with Sentinel-2 imagery and spectroscopic measurements under different conditions using an ensemble model approach with multiple pre-treatment combinations, Soil and Tillage Research


Soil and Tillage Research ( IF 6.1 ) Pub Date : 2022-03-27 , DOI: 10.1016/j.still.2022.105379
James Kobina Mensah Biney ₁,₂, Radim Vašát ₁, Stephen Mackenzie Bell ₃, Ndiye Michael Kebonye ₁, Aleš Klement ₁, Kingsley John ₁, Luboš Borůvka ₁


Estimating soil organic carbon (SOC) using visible near-infrared (Vis-NIR) spectroscopy has proven to be a rapid and reliable approach. However, when working across large geographical scales, remote sensing may be more suitable. Because these spectral data are usually acquired under different measurement conditions, artefacts can be introduced that reduce SOC prediction accuracy. A common procedure is to use calibration or multivariate techniques in conjunction with one or more pre-treatment algorithms. The results of several comparative studies in which these predictive calibration techniques were used alone were inconsistent. Moreover, protocols for selecting the most appropriate pre-treatment algorithms rarely exist. This study combines predictions from different techniques into a single model based on an ensemble learning approach. The main objective is to improve the accuracy of SOC prediction by comparing the calibration techniques used individually against an ensemble model consisting of one statistical method, partial least squares regression (PLSR), and three machine learning (ML) algorithms: random forest (RF), support vector machine regression (SVMR), and Cubist. Several pre-treatment algorithms were also applied to improve the spectral data before prediction. Spectral data were collected from three agricultural fields with different soil types, under different spectral measurement conditions (field, wet and dry). Additionally, Sentinel-2 (S2) data were collected from one of these fields. Furthermore, to assess the effectiveness of the developed model on a regional-scale dataset, two options were employed: (1) merged data from all fields, and (2) merged data from fields measured under the same spectral measurement conditions.
The models were evaluated using the root mean square error of prediction (RMSEP_CV), the coefficient of determination (R²_CV), the ratio of performance to interquartile range (RPIQ), the ratio of performance to deviation (RPD) and BIAS. The results show that, across the three agricultural fields, the ensemble model predicted SOC more accurately than each of the individual calibration techniques (R²_CV = 0.92, RMSEP_CV = 1.00 g/kg, RPD = 3.06, RPIQ = 3.74, BIAS = 0.067 g/kg). For the models derived from merged data (regional dataset), the ensemble approach predicted SOC more accurately with option 2 than with option 1. Finally, although the ensemble model improved SOC prediction accuracy with S2 data, the final output was still poor. Further research to determine the underlying problem is strongly recommended. Nonetheless, these results indicate that the ensemble model is advantageous because it improved the prediction accuracy of SOC and reduced the error margin.
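
The five evaluation statistics named in the abstract can be sketched in a few lines of Python. This is an illustration only, not the authors' code: the function name `soc_metrics` is hypothetical, and the formulas are common conventions that the abstract does not confirm (R² as 1 − SS_res/SS_tot, RPD as the sample standard deviation of the observations divided by RMSEP, RPIQ as the interquartile range of the observations divided by RMSEP, and bias as the mean of predicted minus observed). The numbers in the example are made up and are not results from the paper.

```python
import math
import statistics


def soc_metrics(observed, predicted):
    """Return RMSEP, R2, RPD, RPIQ and bias for paired values (e.g. SOC in g/kg).

    Assumed conventions (not confirmed by the abstract):
      bias = mean(predicted - observed)
      R2   = 1 - SS_res / SS_tot
      RPD  = sample SD of observed / RMSEP
      RPIQ = (Q3 - Q1) of observed / RMSEP
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(r * r for r in residuals)
    rmsep = math.sqrt(ss_res / n)

    mean_obs = statistics.fmean(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)

    # Quartiles with linear interpolation between the sorted observations.
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")

    # Note: a perfect fit (rmsep == 0) or constant observations (ss_tot == 0)
    # would raise ZeroDivisionError; this sketch does not handle those cases.
    return {
        "rmsep": rmsep,
        "r2": 1.0 - ss_res / ss_tot,
        "rpd": statistics.stdev(observed) / rmsep,
        "rpiq": (q3 - q1) / rmsep,
        "bias": sum(residuals) / n,
    }


if __name__ == "__main__":
    # Made-up values for illustration only.
    observed = [10.0, 12.0, 14.0, 16.0, 18.0]
    predicted = [11.0, 11.0, 15.0, 15.0, 19.0]
    for name, value in soc_metrics(observed, predicted).items():
        print(f"{name}: {value:.3f}")
```

To compare individual calibration techniques with an ensemble, the same function could be applied to the cross-validated predictions of each model in turn. Higher R², RPD and RPIQ and lower RMSEP indicate a better fit under these conventions.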


Updated: 2022-03-27
