Benchmarking End-to-End Behavioural Cloning on Video Games, arXiv - CS - Artificial Intelligence


Benchmarking End-to-End Behavioural Cloning on Video Games
arXiv - CS - Artificial Intelligence, Pub Date: 2020-04-02, DOI: arxiv-2004.00981
Anssi Kanervisto, Joonas Pussinen, Ville Hautamäki

Behavioural cloning, where a computer is taught to perform a task based on demonstrations, has been successfully applied to various video games and robotics tasks, with and without reinforcement learning. This also includes end-to-end approaches, where a computer plays a video game like humans do: by looking at the image displayed on the screen, and sending keystrokes to the game. As a general approach to playing video games, this has many inviting properties: no need for specialized modifications to the game, no lengthy training sessions and the ability to re-use the same tools across different games. However, related work includes game-specific engineering to achieve the results. We take a step towards a general approach and study the general applicability of behavioural cloning on twelve video games, including six modern video games (published after 2010), by using human demonstrations as training data. Our results show that these agents cannot match humans in raw performance but do learn basic dynamics and rules. We also demonstrate how the quality of the data matters, and how recording data from humans is subject to a state-action mismatch, due to human reflexes.
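The state-action mismatch mentioned above can be illustrated with a small sketch. The idea, stated here as an assumption rather than as the authors' procedure, is that a human's keypress at time t is a reaction to a frame shown slightly earlier, so pairing each frame with the action recorded a few steps later may give cleaner training pairs. The function name `shift_actions` and the parameter `delay_frames` are hypothetical, and the delay value used below is arbitrary.

```python
import numpy as np


def shift_actions(frames, actions, delay_frames):
    """Pair each frame with the action recorded `delay_frames` steps later.

    Hypothetical helper: it assumes the recorded action at step t + delay
    is the human's response to the frame shown at step t.
    """
    frames = np.asarray(frames)
    actions = np.asarray(actions)
    if len(frames) != len(actions):
        raise ValueError("frames and actions must have the same length")
    if delay_frames < 0:
        raise ValueError("delay_frames must be non-negative")
    if delay_frames == 0:
        return frames, actions
    # Drop the last frames (no later action exists for them) and the
    # first actions (no earlier frame exists for them).
    return frames[:-delay_frames], actions[delay_frames:]


# Toy recording: five "frames" (plain integers stand in for images)
# and one key state per frame (1 = key pressed, 0 = released).
frames = np.arange(5)
actions = np.array([0, 0, 1, 1, 0])

# Assume, for illustration only, a reaction delay of two frames.
states, labels = shift_actions(frames, actions, delay_frames=2)
```

The resulting `states` and `labels` arrays are two elements shorter than the recording and could be fed to any supervised learner as (observation, action) pairs. Choosing the delay is an open question in this sketch; one would likely treat it as a hyperparameter and compare several values.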


Updated: 2020-05-19
