Uniform State Abstraction For Reinforcement Learning
arXiv - CS - Artificial Intelligence | Pub Date: 2020-04-06 | DOI: arxiv-2004.02919 | John Burden and Daniel Kudenko
Potential-Based Reward Shaping, combined with a potential function derived from
appropriately defined abstract knowledge, has been shown to significantly
improve learning speed in Reinforcement Learning. MultiGrid Reinforcement
Learning (MRL) has further shown that such abstract knowledge, in the form of a
potential function, can be learned almost solely from agent interaction with the
environment. However, we show that MRL does not extend well to Deep Learning. In
this paper we extend and improve MRL to take advantage of modern Deep Learning
algorithms such as Deep Q-Networks (DQN). We show that a DQN augmented with our
approach performs significantly better on continuous control tasks than its
vanilla counterpart and a DQN augmented with MRL.
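The shaping mechanism the abstract builds on can be sketched briefly. In potential-based reward shaping, the agent's reward is augmented with the term F(s, s') = γΦ(s') − Φ(s), which provably leaves the optimal policy unchanged. The chain MDP and distance-based potential below are hypothetical illustrations, not the paper's learned multigrid potential:

```python
# Minimal sketch of potential-based reward shaping on a toy 10-state chain.
# The potential function here (negative distance to the goal) is a hypothetical
# stand-in for the abstract knowledge the paper learns from interaction.

GAMMA = 0.95
GOAL = 9  # rightmost state of the chain


def potential(s):
    # Hypothetical potential: higher (less negative) nearer the goal.
    return -abs(GOAL - s)


def shaping(s, s_next):
    # Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s).
    return GAMMA * potential(s_next) - potential(s)


def shaped_reward(env_reward, s, s_next):
    # The agent trains on the environment reward plus the shaping term.
    return env_reward + shaping(s, s_next)


# Moving toward the goal yields a positive shaping bonus.
print(shaping(0, 1))

# Over any trajectory, the discounted shaping terms telescope to
# gamma^T * Phi(s_T) - Phi(s_0), so shaping cannot change which
# policy is optimal.
traj = list(range(GOAL + 1))  # 0 -> 1 -> ... -> 9
total = sum(GAMMA ** t * shaping(traj[t], traj[t + 1])
            for t in range(len(traj) - 1))
print(round(total, 6))
```

Because the shaping terms telescope, the bonus steers exploration toward high-potential states without altering the task being solved; the paper's contribution is making the potential Φ itself learnable in a way that scales to DQN-style function approximation.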
Updated: 2020-04-08