Counterfactual state explanations for reinforcement learning agents via generative deep learning
Artificial Intelligence ( IF 5.1 ) Pub Date : 2021-01-27 , DOI: 10.1016/j.artint.2021.103455
Matthew L. Olson , Roli Khanna , Lawrence Neal , Fuxin Li , Weng-Keen Wong

Counterfactual explanations, which deal with "why not?" scenarios, can provide insightful explanations of an AI agent's behavior [Miller [38]]. In this work, we focus on generating counterfactual explanations for deep reinforcement learning (RL) agents that operate in visual input environments such as Atari. We introduce counterfactual state explanations, a novel example-based approach to counterfactual explanations built on generative deep learning. Specifically, a counterfactual state illustrates the minimal change to an Atari game image needed for the agent to choose a different action. We also evaluate the effectiveness of counterfactual states with human participants who are not machine learning experts. Our first user study investigates whether humans can discern if a counterfactual state explanation is produced by the actual game or by a generative deep learning approach. Our second user study investigates whether counterfactual state explanations can help non-expert participants identify a flawed agent; we compare against a baseline approach based on a nearest neighbor explanation that uses images from the actual game. Our results indicate that counterfactual state explanations have sufficient fidelity to the actual game images to enable non-experts to identify a flawed RL agent more effectively than with the nearest neighbor baseline or with no explanation at all.
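The core idea, finding a minimal change to a state such that the agent's chosen action flips, can be sketched in miniature. The toy below uses a simple linear policy and plain gradient search over the raw state, rather than the paper's deep generative model over Atari images; the function names, loss weights, and policy are illustrative assumptions, not the authors' method.

```python
import numpy as np

def policy_logits(state, W):
    """Toy linear policy: action logits for a flattened state vector."""
    return state @ W

def counterfactual_state(state, W, target_action, step=0.05, max_iters=500, lam=0.1):
    """Search for a minimally perturbed state whose argmax action becomes
    `target_action`, by descending a cross-entropy-plus-proximity loss:
        L(s) = -logit[target] + logsumexp(logits) + lam * ||s - state||^2
    The lam term keeps the counterfactual close to the original state.
    """
    s = state.copy()
    for _ in range(max_iters):
        logits = policy_logits(s, W)
        if logits.argmax() == target_action:
            break  # action flipped: stop at the first (near-minimal) change
        # Softmax probabilities, computed stably.
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # For a linear policy, dL_ce/ds = W @ (softmax - onehot(target)).
        grad_ce = W @ (p - np.eye(len(p))[target_action])
        grad = grad_ce + 2.0 * lam * (s - state)
        s = s - step * grad
    return s

# Example: a 4-dim state, 2 actions; the policy initially prefers action 0.
W = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.2], [0.1, 0.3]])
s0 = np.array([1.0, 0.2, 0.0, 0.0])
cf = counterfactual_state(s0, W, target_action=1)
```

In the paper's setting the search runs in the latent space of a generative model so that the perturbed state stays on the manifold of realistic game images; the raw-pixel analogue above would instead produce adversarial-style noise.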




Updated: 2021-02-01