MIME: Mutual Information Minimisation Exploration
arXiv - CS - Machine Learning Pub Date : 2020-01-16 , DOI: arxiv-2001.05636
Haitao Xu and Brendan McCane and Lech Szymanski and Craig Atkinson

We show that reinforcement learning agents that learn by surprise (surprisal) get stuck at abrupt environmental transition boundaries because these transitions are difficult to learn. We propose a counter-intuitive solution that we call Mutual Information Minimisation Exploration (MIME), in which an agent learns a latent representation of the environment without trying to predict future states. We show that our agent performs significantly better at sharp transition boundaries while matching the performance of surprisal-driven agents elsewhere. In particular, we show state-of-the-art performance on difficult exploration games such as Gravitar, Montezuma's Revenge and Doom.
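The abstract does not spell out how the mutual information is estimated, so the following is only a minimal sketch of the quantity the method's name refers to: the mutual information I(observation; latent) that a MIME-style encoder would drive towards zero, in contrast to a surprisal-driven agent that rewards prediction error. The joint distributions, variable sizes, and the framing as a discrete table are illustrative assumptions, not the paper's estimator.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(X; Z) in nats from a discrete joint probability table.

    Rows index the observation X, columns index the latent code Z.
    """
    px = joint.sum(axis=1, keepdims=True)   # marginal p(x), shape (|X|, 1)
    pz = joint.sum(axis=0, keepdims=True)   # marginal p(z), shape (1, |Z|)
    mask = joint > 0                        # skip zero cells (0 * log 0 := 0)
    return float((joint[mask] * np.log(joint[mask] / (px @ pz)[mask])).sum())

# Illustrative 2-state observation, 2-state latent (hypothetical numbers):
independent = np.array([[0.25, 0.25],
                        [0.25, 0.25]])     # latent ignores the observation
copying = np.array([[0.5, 0.0],
                    [0.0, 0.5]])           # latent copies the observation

print(mutual_information(independent))     # ~0.0 nats: the MIME objective at its minimum
print(mutual_information(copying))         # ~0.693 nats (= ln 2): fully informative latent
```

In this toy picture, a surprisal-driven encoder would be pushed towards the `copying` table (latents that predict observations well), while a MIME-style encoder is trained towards the `independent` table, minimising I(observation; latent) rather than prediction error.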

Updated: 2020-01-17