Physical interaction as communication: Learning robot objectives online from human corrections, The International Journal of Robotics Research


Physical interaction as communication: Learning robot objectives online from human corrections
The International Journal of Robotics Research (IF 9.2), Pub Date: 2021-10-25, DOI: 10.1177/02783649211050958
Dylan P. Losey₁, Andrea Bajcsy₂, Marcia K. O’Malley₃, Anca D. Dragan₂

When a robot performs a task next to a human, physical interaction is inevitable: the human might push, pull, twist, or guide the robot. The state of the art treats these interactions as disturbances that the robot should reject or avoid. At best, these robots respond safely while the human interacts; but after the human lets go, these robots simply return to their original behavior. We recognize that physical human–robot interaction (pHRI) is often intentional: the human intervenes on purpose because the robot is not doing the task correctly. In this article, we argue that when pHRI is intentional it is also informative: the robot can leverage interactions to learn how it should complete the rest of its current task even after the person lets go. We formalize pHRI as a dynamical system, where the human has in mind an objective function they want the robot to optimize, but the robot does not get direct access to the parameters of this objective: they are internal to the human. Within our proposed framework human interactions become observations about the true objective. We introduce approximations to learn from and respond to pHRI in real-time. We recognize that not all human corrections are perfect: often users interact with the robot noisily, and so we improve the efficiency of robot learning from pHRI by reducing unintended learning. Finally, we conduct simulations and user studies on a robotic manipulator to compare our proposed approach with the state of the art. Our results indicate that learning from pHRI leads to better task performance and improved human satisfaction.
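The idea that a physical correction is an observation about the human's objective can be pictured with a small sketch. The abstract does not give the update rule, so everything below is an illustrative assumption: a cost that is linear in hand-picked trajectory features, a hypothetical 2-D workspace, and a gradient-style step that moves the weights toward the features of the corrected trajectory. The names `features`, `cost`, `update_theta`, and the step size `alpha` are hypothetical.

```python
from typing import List, Tuple

Waypoint = Tuple[float, float]   # (x, z) position; a hypothetical 2-D workspace
Trajectory = List[Waypoint]


def features(traj: Trajectory) -> List[float]:
    """Hypothetical features: [path length, summed height above the table]."""
    length = sum(
        ((x1 - x0) ** 2 + (z1 - z0) ** 2) ** 0.5
        for (x0, z0), (x1, z1) in zip(traj, traj[1:])
    )
    height = sum(z for _, z in traj)
    return [length, height]


def cost(theta: List[float], traj: Trajectory) -> float:
    """Assumed objective: a weighted sum of features (lower is better)."""
    return sum(w * f for w, f in zip(theta, features(traj)))


def update_theta(theta: List[float], planned: Trajectory,
                 corrected: Trajectory, alpha: float = 0.1) -> List[float]:
    """Treat the correction as evidence about theta: take one step that makes
    the corrected trajectory cheaper relative to the planned one."""
    f_planned = features(planned)
    f_corrected = features(corrected)
    return [w - alpha * (fc - fp)
            for w, fp, fc in zip(theta, f_planned, f_corrected)]


# Example: the robot plans a level path; the human pushes the midpoint down.
planned = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
corrected = [(0.0, 1.0), (1.0, 0.5), (2.0, 1.0)]
theta = [1.0, 0.0]               # initially the robot only cares about path length
new_theta = update_theta(theta, planned, corrected)
print(new_theta)                 # the weight on height grows after the push
```

In this sketch the robot would replan with `new_theta` after the person lets go, so the correction changes the rest of the task rather than being rejected as a disturbance. Handling noisy or unintended corrections, which the abstract also mentions, is not modeled here.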

Updated: 2021-10-25
