
Inductive general game playing
Machine Learning (IF 7.5), Pub Date: 2019-11-18, DOI: 10.1007/s10994-019-05843-w
Andrew Cropper, Richard Evans, Mark Law

General game playing (GGP) is a framework for evaluating an agent’s general intelligence across a wide range of tasks. In the GGP competition, an agent is given the rules of a game (described as a logic program) that it has never seen before. The task is for the agent to play the game, thus generating game traces. The winner of the GGP competition is the agent that gets the best total score over all the games. In this paper, we invert this task: a learner is given game traces and the task is to learn the rules that could produce the traces. This problem is central to inductive general game playing (IGGP). We introduce a technique that automatically generates IGGP tasks from GGP games. We introduce an IGGP dataset which contains traces from 50 diverse games, such as Sudoku, Sokoban, and Checkers. We claim that IGGP is difficult for existing inductive logic programming (ILP) approaches. To support this claim, we evaluate existing ILP systems on our dataset. Our empirical results show that most of the games cannot be correctly learned by existing systems. The best performing system solves only 40% of the tasks perfectly. Our results suggest that IGGP poses many challenges to existing approaches. Furthermore, because we can automatically generate IGGP tasks from GGP games, our dataset will continue to grow with the GGP competition, as new games are added every year. We therefore think that the IGGP problem and dataset will be valuable for motivating and evaluating future research.
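To make the inversion concrete, here is a minimal toy sketch in Python. It is not the paper's method and not an ILP system: the game, the `play` and `consistent` helpers, and the four candidate rules are all hypothetical, and the "learner" simply enumerates a tiny hand-written hypothesis space. It only illustrates the two directions: playing from rules produces traces (GGP), while filtering hypotheses against traces recovers rules (IGGP).

```python
from typing import Callable, Dict, List, Tuple

# A trace step: (state, action, next_state). States are plain integers in this toy game.
Step = Tuple[int, str, int]
Rule = Callable[[int, str], int]


def play(rule: Rule, actions: List[str], start: int = 0) -> List[Step]:
    """GGP direction: given a rule, playing the game yields a trace."""
    trace: List[Step] = []
    state = start
    for action in actions:
        nxt = rule(state, action)
        trace.append((state, action, nxt))
        state = nxt
    return trace


def true_rule(state: int, action: str) -> int:
    """Hidden rule of the toy game: 'inc' adds one, any other action changes nothing."""
    return state + 1 if action == "inc" else state


# Hypothetical candidate rules, standing in for a real hypothesis space.
CANDIDATES: Dict[str, Rule] = {
    "always_inc": lambda s, a: s + 1,
    "inc_on_inc": lambda s, a: s + 1 if a == "inc" else s,
    "inc_on_noop": lambda s, a: s + 1 if a == "noop" else s,
    "never_change": lambda s, a: s,
}


def consistent(traces: List[List[Step]]) -> List[str]:
    """IGGP direction: keep the candidates that reproduce every observed step."""
    return [
        name
        for name, rule in CANDIDATES.items()
        if all(rule(s, a) == n for trace in traces for (s, a, n) in trace)
    ]


# Generate traces from the hidden rule, then try to recover it from the traces alone.
traces = [
    play(true_rule, ["inc", "noop", "inc"]),
    play(true_rule, ["noop", "noop"]),
]
learned = consistent(traces)
print(learned)
```

With these two traces only `inc_on_inc` survives. With no traces at all, every candidate is trivially consistent, which hints at why the amount and diversity of trace data matters for the learning task.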

Updated: 2019-11-18
