DinerDash Gym: A Benchmark for Policy Learning in High-Dimensional Action Space
arXiv - CS - Artificial Intelligence. Pub Date: 2020-07-13, DOI: arXiv-2007.06207
Siwei Chen, Xiao Ma, David Hsu

Assessing the progress of policy learning algorithms on hierarchical tasks with high-dimensional action spaces has been difficult due to the lack of a commonly accepted benchmark. In this work, we propose a new lightweight benchmark task, called Diner Dash, for evaluating performance on a complicated task with a high-dimensional action space. In contrast to traditional Atari games, which have only a flat goal structure and very few actions, the proposed benchmark task has a hierarchical task structure and an action space of size 57, and can therefore facilitate the development of policy learning for complicated tasks. On top of that, we introduce Decomposed Policy Graph Modelling (DPGM), an algorithm that combines graph modelling and deep learning to allow explicit embedding of domain knowledge, and which achieves a significant improvement over the baseline. In the experiments, we show the effectiveness of domain knowledge injection via a specially designed imitation algorithm, as well as results for other popular algorithms.
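To illustrate the kind of interface such a benchmark exposes, the sketch below mimics the standard Gym interaction loop (reset/step) against a stand-in environment with the 57-action discrete space described in the abstract. The `DinerDashStub` class, its reward, and the episode length are all hypothetical placeholders, not the actual DinerDash Gym implementation.

```python
import random

# Hypothetical stub: mirrors the Gym-style reset/step interface with the
# 57-action discrete space stated in the abstract. Observations, rewards,
# and episode length are placeholders, not the real environment's values.
class DinerDashStub:
    N_ACTIONS = 57  # size of the discrete action space per the abstract

    def __init__(self, max_steps=100):
        self.max_steps = max_steps
        self.t = 0

    def reset(self):
        self.t = 0
        return 0  # placeholder observation

    def step(self, action):
        assert 0 <= action < self.N_ACTIONS, "action outside discrete space"
        self.t += 1
        done = self.t >= self.max_steps
        return 0, 0.0, done, {}  # obs, reward, done, info


def random_rollout(env, seed=0):
    """Run one episode with a uniform-random policy; return total reward."""
    rng = random.Random(seed)
    env.reset()
    total, done = 0.0, False
    while not done:
        action = rng.randrange(env.N_ACTIONS)
        _, reward, done, _ = env.step(action)
        total += reward
    return total


if __name__ == "__main__":
    print(random_rollout(DinerDashStub()))
```

A random policy over 57 actions performs poorly on hierarchical tasks, which is the gap that structured approaches such as DPGM are intended to close.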

Updated: 2020-07-14