DRLViz: Understanding Decisions and Memory in Deep Reinforcement Learning



Computer Graphics Forum (IF 2.7), Pub Date: 2020-06-01, DOI: 10.1111/cgf.13962
T. Jaunet¹, R. Vuillemot², C. Wolf¹,³


We present DRLViz, a visual analytics interface to interpret the internal memory of an agent (e.g. a robot) trained using deep reinforcement learning. This memory is composed of large temporal vectors updated when the agent moves in an environment, and it is not trivial to understand due to the number of dimensions, dependencies on past vectors, spatial/temporal correlations, and co-correlation between dimensions. It is often referred to as a black box, as only inputs (images) and outputs (actions) are intelligible to humans. DRLViz helps experts interpret decisions using memory reduction interactions and investigate the role of parts of the memory when errors have been made (e.g. wrong direction). We report on DRLViz applied in the context of video game simulators (ViZDoom) for a navigation scenario with item gathering tasks. We also report on an expert evaluation of DRLViz, on the applicability of DRLViz to other scenarios and navigation problems beyond simulation games, and on its contribution to the interpretability and explainability of black-box models in the field of visual analytics.


Updated: 2020-06-01
