Demonstration and Mitigation of Spatial Sampling Bias for Machine-Learning Predictions
SPE Reservoir Evaluation & Engineering (IF 2.1), Pub Date: 2020-10-01, DOI: 10.2118/203838-pa
Wendi Liu 1, Svetlana Ikonnikova 2, H. Scott Hamlin 1, Livia Sivila 3, Michael J. Pyrcz 1

Machine learning provides powerful methods for inferential and predictive modeling of complicated multivariate relationships, supporting decision-making for spatial problems such as optimizing unconventional reservoir development. Current machine-learning methods have been widely applied to exhaustive spatial data sets such as satellite images. Geological subsurface characterization is significantly different, however, because it is conditioned by sparse spatial data sets that are generally not sampled in a representative manner and are therefore biased. Two critical questions follow: first, does spatial bias in the training data bias machine-learning-based predictive models; and second, if it does, how can that bias be mitigated in spatial machine-learning-based predictions?

The presence of prediction bias caused by spatial sampling bias, and its mitigation, are demonstrated with tree-based machine learning, chosen for its high degree of interpretability. In expectation, training data bias imposes bias on machine-learning predictions over a wide variety of spatial data configurations and degrees of bias, even when the model is applied to unbiased testing data and real-world data. We reduce the prediction bias with a novel spatial weighted tree method over a variety of spatial data configurations and degrees of spatial sampling bias. The proposed method can improve accuracy for reservoir evaluation. We recommend model checking and bias mitigation for all machine-learning prediction models built on sparse spatial data sets, because bias in, bias out.
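The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of the general idea, not the authors' spatial weighted tree method. It assumes cell declustering (a standard geostatistical technique, swapped in here for illustration) as the source of sample weights, and it uses the fact that a regression tree leaf predicts the mean of the training responses that fall in it, so a weighted tree would predict a weighted mean. Only the simplest case is shown: a single leaf covering the whole domain. The synthetic property `true_value`, the sample layout, and the cell size `CELL` are all hypothetical.

```python
from collections import Counter

CELL = 20.0  # declustering cell size (an assumed, illustrative value)


def true_value(x, y):
    """Hypothetical property that increases linearly with x on a 100 x 100 domain."""
    return 0.05 + 0.002 * x


def build_samples():
    """Sparse regular coverage plus a dense cluster in the high-value corner."""
    sparse = [(10.0 + 20.0 * i, 10.0 + 20.0 * j) for i in range(5) for j in range(5)]
    dense = [(81.0 + 2.0 * i, 81.0 + 2.0 * j) for i in range(10) for j in range(10)]
    return sparse + dense


def declustering_weights(points, cell=CELL):
    """Weight each sample by 1 / (number of samples sharing its grid cell)."""
    cells = [(int(x // cell), int(y // cell)) for x, y in points]
    counts = Counter(cells)
    raw = [1.0 / counts[c] for c in cells]
    scale = len(points) / sum(raw)  # normalize so the weights sum to n
    return [w * scale for w in raw]


def weighted_mean(values, weights):
    """Prediction of a single weighted leaf: the weighted mean response."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


points = build_samples()
values = [true_value(x, y) for x, y in points]
weights = declustering_weights(points)

naive_mean = sum(values) / len(values)             # unweighted leaf prediction
declustered_mean = weighted_mean(values, weights)  # weighted leaf prediction
TRUE_MEAN = 0.15  # mean of 0.05 + 0.002 * x over x in [0, 100]

print(f"naive: {naive_mean:.3f}, declustered: {declustered_mean:.3f}, truth: {TRUE_MEAN:.3f}")
```

In this constructed case the clustered samples pull the unweighted mean well above the true domain mean, while the declustered weights recover it. The same weights could in principle be supplied as per-sample weights to a tree learner so that every leaf, not just the root, is corrected; whether that matches the paper's method is not stated in the abstract.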




Updated: 2020-10-05