Curious Representation Learning for Embodied Intelligence
arXiv - CS - Robotics. Pub Date: 2021-05-03, DOI: arxiv-2105.01060
Yilun Du, Chuang Gan, Phillip Isola

Self-supervised representation learning has achieved remarkable success in recent years. By subverting the need for supervised labels, such approaches are able to utilize the numerous unlabeled images that exist on the Internet and in photographic datasets. Yet to build truly intelligent agents, we must construct representation learning algorithms that can learn not only from datasets but also learn from environments. An agent in a natural environment will not typically be fed curated data. Instead, it must explore its environment to acquire the data it will learn from. We propose a framework, curious representation learning (CRL), which jointly learns a reinforcement learning policy and a visual representation model. The policy is trained to maximize the error of the representation learner, and in doing so is incentivized to explore its environment. At the same time, the learned representation becomes stronger and stronger as the policy feeds it ever harder data to learn from. Our learned representations enable promising transfer to downstream navigation tasks, performing better than or comparably to ImageNet pretraining without using any supervision at all. In addition, despite being trained in simulation, our learned representations can obtain interpretable results on real images.
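The joint objective described above couples two learners adversarially: the policy is rewarded for finding observations the representation model handles poorly, while the representation model trains on exactly those observations. Below is a minimal sketch of that training loop, assuming a toy stand-in environment and a small autoencoder as the representation model; the paper itself uses photorealistic navigation environments and stronger self-supervised objectives, so every class, dimension, and hyperparameter here is illustrative rather than the authors' implementation.

```python
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, HID = 32, 4, 64

class ToyEnv:
    """Stand-in environment: observations depend weakly on the chosen action."""
    def step(self, action: int) -> torch.Tensor:
        return torch.randn(OBS_DIM) + 0.5 * action

class Encoder(nn.Module):
    """Representation model: a tiny autoencoder whose reconstruction error
    acts as the self-supervised loss (and hence the intrinsic reward)."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(OBS_DIM, HID), nn.ReLU(), nn.Linear(HID, 16))
        self.dec = nn.Sequential(nn.Linear(16, HID), nn.ReLU(), nn.Linear(HID, OBS_DIM))
    def loss(self, obs: torch.Tensor) -> torch.Tensor:
        return ((self.dec(self.enc(obs)) - obs) ** 2).mean()

policy = nn.Sequential(nn.Linear(OBS_DIM, HID), nn.ReLU(), nn.Linear(HID, ACT_DIM))
repr_model = Encoder()
opt_pi = torch.optim.Adam(policy.parameters(), lr=1e-3)
opt_repr = torch.optim.Adam(repr_model.parameters(), lr=1e-3)

env, obs = ToyEnv(), torch.zeros(OBS_DIM)
for step in range(1000):
    # Policy samples an action from the current observation.
    dist = torch.distributions.Categorical(logits=policy(obs))
    action = dist.sample()
    next_obs = env.step(action.item())

    # Intrinsic reward: the representation model's error on the new data.
    repr_loss = repr_model.loss(next_obs)
    reward = repr_loss.detach()

    # Policy ascends the reward (REINFORCE), i.e. seeks hard-to-model states.
    opt_pi.zero_grad()
    (-dist.log_prob(action) * reward).backward()
    opt_pi.step()

    # Representation model descends the same loss on the data it was fed.
    opt_repr.zero_grad()
    repr_loss.backward()
    opt_repr.step()

    obs = next_obs
```

In this sketch the two optimizers pull in opposite directions on the same quantity: the detached loss is the exploration reward, and the non-detached loss is the representation objective, which is the essence of the CRL coupling.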

Updated: 2021-05-04