Explaining black-box classifiers using post-hoc explanations-by-example: The effect of explanations and error-rates in XAI user studies
Artificial Intelligence (IF 14.4) Pub Date: 2021-01-26, DOI: 10.1016/j.artint.2021.103459
Eoin M. Kenny, Courtney Ford, Molly Quinn, Mark T. Keane

In this paper, we describe a post-hoc explanation-by-example approach to eXplainable AI (XAI), in which a black-box, deep-learning system is explained by reference to a more transparent proxy model (in this case, a case-based reasoner), based on a feature-weighting analysis of the former that is used to find explanatory cases from the latter (one instance of the so-called Twin Systems approach). A novel method (COLE-HP) for extracting feature-weights from black-box models is demonstrated for a convolutional neural network (CNN) applied to the MNIST dataset, where the extracted feature-weights are used to find explanatory nearest neighbours for test instances. Three user studies are reported that examine people's judgements of right and wrong classifications made by this XAI twin-system, in the presence or absence of explanations-by-example and under different error-rates (from 3% to 60%). The judgements gathered include item-level evaluations of both correctness and reasonableness, and system-level evaluations of trust, satisfaction, correctness, and reasonableness. Several proposals are made about the user's mental model in these tasks and how it is impacted by explanations at the item- and system-level. The wider lessons from this work for XAI and its user studies are reviewed.
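To make the retrieval step concrete, the following is a minimal sketch of COLE-HP-style case retrieval, assuming access to a trained CNN's penultimate-layer activations and final-layer weights. All array names, shapes, and the random stand-in data are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# A minimal sketch of COLE-HP-style case retrieval, assuming a trained CNN
# whose penultimate-layer activations and final dense-layer weights are
# available. All names, shapes, and random stand-in data below are
# illustrative assumptions, not the authors' code.

rng = np.random.default_rng(0)

n_train, n_features, n_classes = 1000, 128, 10
train_activations = rng.normal(size=(n_train, n_features))  # penultimate activations (training set)
final_weights = rng.normal(size=(n_features, n_classes))    # final dense-layer weights


def hadamard_contributions(activations, weights, predicted_class):
    """Feature contributions as the Hadamard (elementwise) product of
    penultimate activations and the weight column feeding the output
    unit of the predicted class -- the 'HP' in COLE-HP."""
    return activations * weights[:, predicted_class]


def explanatory_neighbours(test_activation, predicted_class, k=3):
    """Return the indices of the k training cases nearest to the test
    instance in contribution space; these nearest neighbours serve as
    the post-hoc explanations-by-example."""
    test_contrib = hadamard_contributions(test_activation, final_weights, predicted_class)
    train_contribs = hadamard_contributions(train_activations, final_weights, predicted_class)
    distances = np.linalg.norm(train_contribs - test_contrib, axis=1)
    return np.argsort(distances)[:k]


test_activation = rng.normal(size=n_features)
print(explanatory_neighbours(test_activation, predicted_class=7))
```

In an actual twin-system, the activations would come from the CNN's penultimate layer for each MNIST image, and the retrieved training neighbours would be shown to the user alongside the classification as its explanation.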




Updated: 2021-01-29