Sampling bias mitigation for species occurrence modeling using machine learning methods
Ecological Informatics (IF 5.1), Pub Date: 2020-04-10, DOI: 10.1016/j.ecoinf.2020.101091
Victor Hugo Gutierrez-Velez, Daniel Wiese

The identification and mitigation of sampling biases are commonly overlooked in species distribution modeling, even though bias can seriously compromise the validity of modeling outcomes. Here we propose methods to 1) detect and mitigate spatial and detection sampling biases in the use of machine learning methods for modeling species occurrence and 2) assess the magnitude of bias and the effectiveness of bias mitigation on model predictions, variable importance, and model performance. We illustrate these techniques through the calibration of boosted decision trees for the prediction of annual occurrences of Aedes albopictus, an invasive disease vector, in southeastern Pennsylvania between 2001 and 2015. The methods consist of applying spatial filters and assigning sampling reliability weights to observed locations. We tested the performance of spatial bias mitigation by comparing the frequency distributions of predictors before and after filtering with the distribution that would be obtained under an ideal sampling design. We also tested the performance of detection bias mitigation by comparing the importance of variables representing detection bias before and after the assignment of reliability weights. Results show that spatial filtering reduced the differences between the frequency distribution obtained with the unfiltered data and the distribution that would be obtained under a reference sampling design. The assignment of sampling reliability weights to observations reduced the relative influence of detection bias on fitted models. The mitigation of spatial bias had a larger effect on model predictions and accuracy estimates than detection bias mitigation did. Spatial sampling bias mitigation largely tended to reduce the number of years of predicted A. albopictus occurrence, while detection bias mitigation tended to increase it. Our results highlight the importance of identifying, quantifying, and mitigating observation biases as a standard practice in the use of machine learning methods for species occurrence modeling, because biases can compromise the reliability of modeling outcomes and their interpretation.
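
The two mitigation steps named in the abstract, spatial filtering and reliability weighting, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the filter or the weighting scheme, so the grid-based thinning rule, the visit-count weighting, and the names grid_thin, reliability_weights, cell_size, and cap are all assumptions made for illustration, and the records are toy data.

```python
import random
from collections import defaultdict


def grid_thin(points, cell_size, seed=0):
    """Spatial filter (assumed form): keep one random record per grid cell."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for p in points:
        # Map each (x, y) record to the square grid cell that contains it.
        key = (int(p[0] // cell_size), int(p[1] // cell_size))
        cells[key].append(p)
    # One representative per occupied cell reduces spatial clustering.
    return [rng.choice(group) for _, group in sorted(cells.items())]


def reliability_weights(visits, cap=5):
    """Hypothetical weight in (0, 1]: more repeat visits, more reliable record."""
    return [min(v, cap) / cap for v in visits]


# Toy data: (x, y) coordinates in km; the first three records share one cell.
records = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.9), (5.5, 5.1), (9.7, 2.3)]
thinned = grid_thin(records, cell_size=1.0)

# Toy visit counts for three sites, capped at 5 visits.
weights = reliability_weights([1, 3, 10])

print(len(records), "records before filtering,", len(thinned), "after")
print("weights:", weights)
```

Weights of this kind could then be passed to a boosted-tree learner that accepts per-observation weights, for example through the sample_weight argument of the fit method in scikit-learn's GradientBoostingClassifier. That is one possible choice of tool, not necessarily the one used in the paper.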




Updated: 2020-04-10