Sequential robot imitation learning from observations
The International Journal of Robotics Research (IF 9.2). Pub Date: 2021-08-06. DOI: 10.1177/02783649211032721
Ajay Kumar Tanwani, Andy Yan, Jonathan Lee, Sylvain Calinon, Ken Goldberg

This paper presents a framework for learning the sequential structure of demonstrations in robot imitation learning. We first present a family of task-parameterized hidden semi-Markov models that extract invariant segments (also called sub-goals or options) from demonstrated trajectories, and optimally follow the sequence of states sampled from the model with a linear quadratic tracking controller. We then extend the concept to learning invariant segments from visual observations that are sequenced together for robot imitation. We present Motion2Vec, which learns a deep embedding space by minimizing a metric learning loss in a Siamese network: images from the same action segment are pulled together while being pushed away from randomly sampled images of other segments, and a time-contrastive loss preserves the temporal ordering of the images. The trained embeddings are segmented with a recurrent neural network and subsequently used to decode the end-effector pose of the robot. We first show its application to a pick-and-place task with the Baxter robot, avoiding a moving obstacle from only four kinesthetic demonstrations, and then to suturing task imitation from publicly available videos of the JIGSAWS dataset, achieving state-of-the-art 85.5% segmentation accuracy and 0.94 cm position error per observation on the test set.
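The first component, following a sampled sequence of states with a linear quadratic tracking controller, can be sketched in isolation. Below is a minimal toy example, not the paper's controller: the double-integrator dynamics, the cost weights, and the step-wise reference `mu` (standing in for segment means sampled from the hidden semi-Markov model) are all illustrative assumptions.

```python
import numpy as np

def lqt_gains(A, B, Q, R, mu):
    """Backward pass for finite-horizon linear quadratic tracking (LQT).
    mu: (T, n) array of reference states to follow, e.g. a step-wise
    sequence of segment means sampled from the model.
    Returns per-step feedback gains K[t] and feedforward terms k[t]."""
    T, n = mu.shape
    m = B.shape[1]
    P = Q.copy()         # quadratic cost-to-go term, P_T = Q
    p = -Q @ mu[-1]      # linear cost-to-go term,    p_T = -Q mu_T
    Ks = np.zeros((T - 1, m, n))
    ks = np.zeros((T - 1, m))
    for t in range(T - 2, -1, -1):
        G = np.linalg.inv(R + B.T @ P @ B)
        K = G @ B.T @ P @ A
        k = G @ B.T @ p
        Abar = A - B @ K
        # Update the linear term first: it needs P_{t+1}, still stored in P.
        p = -Q @ mu[t] + K.T @ R @ k + Abar.T @ (p - P @ B @ k)
        P = Q + K.T @ R @ K + Abar.T @ P @ Abar
        Ks[t], ks[t] = K, k
    return Ks, ks

def rollout(A, B, Ks, ks, x0):
    """Apply the time-varying control law u_t = -K_t x_t - k_t."""
    xs = [x0]
    for K, k in zip(Ks, ks):
        u = -K @ xs[-1] - k
        xs.append(A @ xs[-1] + B @ u)
    return np.array(xs)

if __name__ == "__main__":
    dt = 0.1
    A = np.array([[1.0, dt], [0.0, 1.0]])   # 1-D double integrator (pos, vel)
    B = np.array([[0.0], [dt]])
    Q = np.diag([100.0, 1.0])                # penalize position error most
    R = np.array([[0.1]])
    mu = np.zeros((50, 2))
    mu[25:, 0] = 1.0                         # two "segments": hold 0, then 1
    Ks, ks = lqt_gains(A, B, Q, R, mu)
    xs = rollout(A, B, Ks, ks, np.zeros(2))
    print(xs[-1])                            # approaches [1, 0]
```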
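The two training signals for the Motion2Vec embedding are likewise concrete enough to sketch. The following is a minimal illustration under assumptions, not the authors' implementation: the `Encoder` architecture, the margins, and the batch construction are placeholders; the abstract only specifies that same-segment images are pulled together, images of other segments are pushed away, and a time-contrastive term preserves temporal order.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Tiny stand-in for the Siamese embedding network (placeholder)."""
    def __init__(self, embed_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=1)  # unit-norm embeddings

def metric_loss(f, anchor, positive, negative, margin=0.2):
    """Pull images of the same action segment together; push them at
    least `margin` away from sampled images of other segments."""
    za, zp, zn = f(anchor), f(positive), f(negative)
    d_pos = (za - zp).pow(2).sum(dim=1)
    d_neg = (za - zn).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

def time_contrastive_loss(f, frame_t, frame_near, frame_far, margin=0.2):
    """Keep temporally adjacent frames closer than distant ones, so the
    embedding preserves the temporal ordering of the video."""
    zt, zn, zf = f(frame_t), f(frame_near), f(frame_far)
    d_near = (zt - zn).pow(2).sum(dim=1)
    d_far = (zt - zf).pow(2).sum(dim=1)
    return F.relu(d_near - d_far + margin).mean()

if __name__ == "__main__":
    f = Encoder()
    batch = lambda: torch.randn(8, 3, 64, 64)   # dummy RGB frames
    loss = metric_loss(f, batch(), batch(), batch()) \
         + time_contrastive_loss(f, batch(), batch(), batch())
    loss.backward()
    print(float(loss))
```

In the pipeline described above, such embeddings would then feed a recurrent network for segmentation and end-effector pose decoding; that stage is omitted here.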

Updated: 2021-08-07