Uniform State Abstraction For Reinforcement Learning
arXiv - CS - Artificial Intelligence | Pub Date: 2020-04-06 | DOI: arxiv-2004.02919 | John Burden and Daniel Kudenko
Potential-Based Reward Shaping, combined with a potential function derived from
appropriately defined abstract knowledge, has been shown to significantly
improve learning speed in Reinforcement Learning. MultiGrid Reinforcement
Learning (MRL) has further shown that such abstract knowledge, in the form of a
potential function, can be learned almost solely from agent interaction with the
environment. However, we show that MRL does not extend well to Deep Learning. In
this paper we extend and improve MRL to take advantage of modern Deep Learning
algorithms such as Deep Q-Networks (DQN). We show that a DQN augmented with our
approach performs significantly better on continuous control tasks than its
vanilla counterpart and a DQN augmented with MRL.
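The shaping mechanism the abstract builds on can be sketched briefly. In potential-based reward shaping, the agent's reward is augmented with the term F(s, s') = γΦ(s') − Φ(s), which provably leaves the optimal policy unchanged. The chain MDP and distance-based potential below are hypothetical illustrations, not the paper's learned multigrid potential:

```python
# Minimal sketch of potential-based reward shaping on a toy 10-state chain.
# The potential function here (negative distance to the goal) is a hypothetical
# stand-in for the abstract knowledge the paper learns from interaction.

GAMMA = 0.95
GOAL = 9  # rightmost state of the chain


def potential(s):
    # Hypothetical potential: higher (less negative) nearer the goal.
    return -abs(GOAL - s)


def shaping(s, s_next):
    # Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s).
    return GAMMA * potential(s_next) - potential(s)


def shaped_reward(env_reward, s, s_next):
    # The agent trains on the environment reward plus the shaping term.
    return env_reward + shaping(s, s_next)


# Moving toward the goal yields a positive shaping bonus.
print(shaping(0, 1))

# Over any trajectory, the discounted shaping terms telescope to
# gamma^T * Phi(s_T) - Phi(s_0), so shaping cannot change which
# policy is optimal.
traj = list(range(GOAL + 1))  # 0 -> 1 -> ... -> 9
total = sum(GAMMA ** t * shaping(traj[t], traj[t + 1])
            for t in range(len(traj) - 1))
print(round(total, 6))
```

Because the shaping terms telescope, the bonus steers exploration toward high-potential states without altering the task being solved; the paper's contribution is making the potential Φ itself learnable in a way that scales to DQN-style function approximation.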
Updated: 2020-04-08