Deep Residual Reinforcement Learning
arXiv - CS - Artificial Intelligence, Pub Date: 2019-05-03, DOI: arxiv-1905.01072
Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson

We revisit residual algorithms in both model-free and model-based reinforcement learning settings. We propose the bidirectional target network technique to stabilize residual algorithms, yielding a residual version of DDPG that significantly outperforms vanilla DDPG in the DeepMind Control Suite benchmark. Moreover, we find the residual algorithm an effective approach to the distribution mismatch problem in model-based planning. Compared with the existing TD($k$) method, our residual-based method makes weaker assumptions about the model and yields a greater performance boost.
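
As background for the term "residual algorithm", the following is a minimal sketch of the classic update for a linear value function, contrasting the semi-gradient TD(0) direction with the residual-gradient direction and mixing the two. It is not the paper's deep RL method and does not implement the bidirectional target network; the function name `td_update_direction` and the mixing weight `eta` are illustrative assumptions.

```python
import numpy as np

def td_update_direction(w, phi_s, phi_next, reward, gamma, eta):
    """Update direction for a linear value function v(s) = w . phi(s).

    eta = 0 gives the semi-gradient TD(0) direction,
    eta = 1 gives the full residual-gradient direction,
    values in between mix the two (a residual algorithm).
    All names here are hypothetical, chosen for illustration.
    """
    # TD error for the transition (s, reward, s')
    delta = reward + gamma * (w @ phi_next) - (w @ phi_s)
    # Semi-gradient: the bootstrapped target is treated as a constant.
    semi = delta * phi_s
    # Residual gradient: negative gradient of 0.5 * delta**2 w.r.t. w,
    # so the next-state value is differentiated as well.
    residual = delta * (phi_s - gamma * phi_next)
    return (1.0 - eta) * semi + eta * residual

# One gradient-style step with an assumed step size alpha:
# w = w + alpha * td_update_direction(w, phi_s, phi_next, reward, gamma, eta)
```

The mixed direction simplifies to `delta * (phi_s - eta * gamma * phi_next)`, so `eta` controls how much of the gradient flows through the next-state value.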

Updated: 2020-01-27
