Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach
arXiv - CS - Artificial Intelligence. Pub Date: 2020-01-21. DOI: arxiv-2001.07417. Carlos Fernández-Loría, Foster Provost, Xintian Han
Lack of understanding of the decisions made by model-based AI systems is an
important barrier to their adoption. We examine counterfactual explanations as
an alternative for explaining AI decisions. The counterfactual approach defines
an explanation as a set of the system's data inputs that causally drives the
decision (meaning that removing them changes the decision) and is irreducible
(meaning that removing any subset of the inputs in the explanation does not
change the decision). We generalize previous work on counterfactual
explanations, resulting in a framework that (a) is model-agnostic, (b) can
address features with arbitrary data types, (c) can explain decisions made by
complex AI systems that incorporate multiple models, and (d) is scalable to
large numbers of features. We also propose a heuristic procedure to find the
most useful explanations depending on the context. We contrast counterfactual
explanations with another alternative: methods that explain model predictions
by weighting features according to their importance (e.g., SHAP, LIME). This
paper presents two fundamental reasons why explaining model predictions is not
the same as explaining the decisions made using those predictions, suggesting
we should carefully consider whether importance-weight explanations are
well-suited to explain decisions made by AI systems. Specifically, we show that
(1) features that have a large importance weight for a model prediction may not
actually affect the corresponding decision, and (2) importance weights are
insufficient to communicate whether and how features influence system
decisions. We demonstrate this with several examples, including three detailed
case studies that compare the counterfactual approach with SHAP to illustrate
various conditions under which counterfactual explanations explain data-driven
decisions better than feature importance weights.
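The counterfactual idea in the abstract can be sketched as a smallest-first subset search: find a set of inputs whose "removal" (replacement by baseline values) flips the decision, where searching smaller sets first guarantees irreducibility. The sketch below is only an illustration of that definition, not the authors' heuristic procedure; the scoring function, feature names, and baseline values are hypothetical. The toy weights also illustrate point (1): `f3` carries the largest weight, yet removing `f3` alone does not change the decision.

```python
# Minimal sketch of a counterfactual explanation search, assuming the AI
# system is a scoring function plus a decision threshold. Illustrative
# only; the real paper proposes a scalable heuristic, not brute force.
from itertools import combinations

def decision(score_fn, x, threshold=0.5):
    """Turn a model score into a binary decision."""
    return score_fn(x) >= threshold

def counterfactual_explanation(score_fn, x, baseline, threshold=0.5):
    """Find an irreducible set of features whose removal (replacement by
    baseline values) flips the decision. Searching smallest sets first
    means no proper subset of the returned set flips the decision."""
    original = decision(score_fn, x, threshold)
    features = list(x)
    for size in range(1, len(features) + 1):
        for subset in combinations(features, size):
            x_cf = dict(x)
            for f in subset:
                x_cf[f] = baseline[f]  # "remove" the feature
            if decision(score_fn, x_cf, threshold) != original:
                return set(subset)  # irreducible by construction
    return None  # no counterfactual exists for this instance

# Hypothetical linear scorer: f3 has the largest weight (importance),
# yet removing f3 alone (score 0.8) does not cross the 0.5 threshold.
def score(x):
    return 0.4 * x["f1"] + 0.4 * x["f2"] + 0.6 * x["f3"]

x = {"f1": 1, "f2": 1, "f3": 1}          # score = 1.4 -> decision True
baseline = {"f1": 0, "f2": 0, "f3": 0}

explanation = counterfactual_explanation(score, x, baseline)
# yields an irreducible flipping set; here {'f1', 'f3'}, since no
# single feature flips the decision but removing f1 and f3 together does
```

Brute-force enumeration is exponential in the number of features, which is exactly why the paper's scalability contribution (point (d)) matters for realistic systems.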
Updated: 2020-05-12