Goal-Aware Prediction: Learning to Model What Matters
arXiv - CS - Artificial Intelligence Pub Date : 2020-07-14 , DOI: arxiv-2007.07170
Suraj Nair, Silvio Savarese, Chelsea Finn

Learned dynamics models combined with both planning and policy learning algorithms have shown promise in enabling artificial agents to learn to perform many diverse tasks with limited supervision. However, one of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model (future state reconstruction) and that of the downstream planner or policy (completing a specified task). This issue is exacerbated in vision-based control tasks in diverse real-world environments, where the complexity of the real world dwarfs model capacity. In this paper, we propose to direct prediction towards task-relevant information, enabling the model to be aware of the current task and encouraging it to model only the relevant quantities of the state space, resulting in a learning objective that more closely matches the downstream task. Further, we do so in an entirely self-supervised manner, without the need for a reward function or image labels. We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
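
The contrast between a reconstruction objective and a goal-aware one can be illustrated with a toy sketch. The abstract does not spell out the prediction target, so the code below assumes, purely for illustration, that a goal-aware model is trained to predict the difference between the goal and the future state. The function names and the three-dimensional toy state are hypothetical and are not taken from the paper.

```python
import numpy as np

def reconstruction_target(next_state):
    # Task-agnostic objective: the model must reproduce every state dimension.
    return np.asarray(next_state, dtype=float)

def goal_residual_target(next_state, goal):
    # ASSUMPTION (illustrative only): the goal-aware target is goal - next_state.
    # Dimensions on which the state already matches the goal become zero.
    return np.asarray(goal, dtype=float) - np.asarray(next_state, dtype=float)

def mse(prediction, target):
    # Mean squared error between a prediction and its target.
    diff = np.asarray(prediction, dtype=float) - np.asarray(target, dtype=float)
    return float(np.mean(diff ** 2))

# Toy state: dimension 0 is task-relevant, dimensions 1 and 2 are a static background.
next_state = np.array([1.0, 5.0, 5.0])
goal = np.array([3.0, 5.0, 5.0])

full_target = reconstruction_target(next_state)
residual_target = goal_residual_target(next_state, goal)

# Loss of a trivial model that outputs all zeros under each objective.
zero_pred = np.zeros(3)
loss_reconstruction = mse(zero_pred, full_target)
loss_residual = mse(zero_pred, residual_target)
```

In this toy setting the residual target is nonzero only on the dimension that still differs from the goal, so background dimensions that already match the goal contribute nothing to the loss. A reconstruction target, by contrast, penalizes errors on every dimension equally.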

Updated: 2020-08-12