How Do You Act? An Empirical Study to Understand Behavior of Deep Reinforcement Learning Agents
arXiv - CS - Artificial Intelligence. Pub Date: 2020-04-07, DOI: arxiv-2004.03237
Richard Meyes, Moritz Schneider, Tobias Meisen

The demand for more transparency in the decision-making processes of deep reinforcement learning agents is greater than ever, due to their increased use in safety-critical and ethically challenging domains such as autonomous driving. In this empirical study, we address this lack of transparency following an idea inspired by research in the field of neuroscience. We characterize the learned representations of an agent's policy network through its activation space and perform partial network ablations to compare the representations of the healthy and the intentionally damaged networks. We show that the healthy agent's behavior is characterized by a distinct correlation pattern between the network's layer activations and the actions performed during an episode, and that network ablations which strongly change this pattern lead to the agent failing its trained control task. Furthermore, the learned representation of the healthy agent is characterized by a distinct pattern in its activation space reflecting its different behavioral stages during an episode, which again, when distorted by network ablations, leads to the agent failing its trained control task. In conclusion, we argue in favor of a new perspective on artificial neural networks as objects of empirical investigation, just like biological neural systems in neuroscientific studies, paving the way towards a new standard of scientific falsifiability in research on transparency and interpretability of artificial neural networks.
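The core procedure the abstract describes — recording a policy network's layer activations during an episode, ablating part of the network, and comparing the activation-action correlation patterns of the healthy and damaged networks — can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual setup: the network sizes, random weights, and the single continuous action are assumptions chosen only to make the ablation-and-correlation idea concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny policy network (sizes are illustrative, not the
# paper's architecture): 4-d observation -> 16 hidden units -> 1
# continuous control action.
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 1))

def forward(obs, ablate_mask=None):
    """Return hidden activations and actions; zero out ("ablate")
    hidden units where ablate_mask is True."""
    h = np.tanh(obs @ W1)
    if ablate_mask is not None:
        h = h * ~ablate_mask          # partial network ablation
    return h, np.tanh(h @ W2).ravel()

obs = rng.normal(size=(500, 4))        # stand-in for one episode's states

h_healthy, a_healthy = forward(obs)

mask = np.zeros(16, dtype=bool)        # ablate the first 4 hidden units
mask[:4] = True
h_ablated, a_ablated = forward(obs, mask)

def activation_action_corr(h, actions):
    """Per-unit Pearson correlation between activation and action."""
    return np.array([
        0.0 if unit.std() == 0 else np.corrcoef(unit, actions)[0, 1]
        for unit in h.T
    ])

corr_healthy = activation_action_corr(h_healthy, a_healthy)
corr_ablated = activation_action_corr(h_ablated, a_ablated)

# Ablated units contribute nothing, so their part of the correlation
# pattern collapses to zero.
print(corr_ablated[:4])   # [0. 0. 0. 0.]
```

In the study's terms, a strong change in this correlation pattern between the healthy and ablated networks is what accompanies the agent failing its control task; the sketch only shows how such a pattern can be computed and how ablation distorts it.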

Updated: 2020-04-08