Multimodal estimation and communication of latent semantic knowledge for robust execution of robot instructions, The International Journal of Robotics Research



The International Journal of Robotics Research (IF 9.2), Pub Date: 2020-06-05, DOI: 10.1177/0278364920917755
Jacob Arkin₁, Daehyung Park₂, Subhro Roy₂, Matthew R Walter₃, Nicholas Roy₂, Thomas M Howard₁, Rohan Paul₂,₄


The goal of this article is to enable robots to perform robust task execution following human instructions in partially observable environments. A robot’s ability to interpret and execute commands is fundamentally tied to its semantic world knowledge. Commonly, robots use exteroceptive sensors, such as cameras or LiDAR, to detect entities in the workspace and infer their visual properties and spatial relationships. However, semantic world properties are often visually imperceptible. We posit the use of non-exteroceptive modalities including physical proprioception, factual descriptions, and domain knowledge as mechanisms for inferring semantic properties of objects. We introduce a probabilistic model that fuses linguistic knowledge with visual and haptic observations into a cumulative belief over latent world attributes to infer the meaning of instructions and execute the instructed tasks in a manner robust to erroneous, noisy, or contradictory evidence. In addition, we provide a method that allows the robot to communicate knowledge dissonance back to the human as a means of correcting errors in the operator’s world model. Finally, we propose an efficient framework that anticipates possible linguistic interactions and infers the associated groundings for the current world state, thereby bootstrapping both language understanding and generation. We present experiments on manipulators for tasks that require inference over partially observed semantic properties, and evaluate our framework’s ability to exploit expressed information and knowledge bases to facilitate convergence, and generate statements to correct declared facts that were observed to be inconsistent with the robot’s estimate of object properties.
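As an illustration of what a cumulative belief over a latent attribute can look like, the following is a minimal sketch and not the authors' model. It assumes a single binary attribute (for example, whether a cup is full), sources that are conditionally independent given that attribute, and made-up reliability numbers; the function name `fuse_belief` and the example values are hypothetical.

```python
import math


def fuse_belief(prior, observations):
    """Recursive Bayesian update for one binary latent attribute.

    prior: P(attribute is true) before any evidence, strictly between 0 and 1.
    observations: iterable of (reports_true, p_hit, p_false_alarm), where
        p_hit         = P(source reports true | attribute is true)
        p_false_alarm = P(source reports true | attribute is false)
    Sources are assumed conditionally independent given the attribute.
    """
    log_odds = math.log(prior / (1.0 - prior))
    for reports_true, p_hit, p_false_alarm in observations:
        if reports_true:
            log_odds += math.log(p_hit / p_false_alarm)
        else:
            log_odds += math.log((1.0 - p_hit) / (1.0 - p_false_alarm))
    # Convert accumulated log-odds back to a probability.
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical evidence about "the cup is full" (all numbers are assumptions).
evidence = [
    (True, 0.8, 0.2),   # declared fact from the human: "the cup is full"
    (False, 0.6, 0.4),  # visual observation, only weakly informative
    (False, 0.9, 0.1),  # haptic observation, e.g. measured weight
]
belief = fuse_belief(0.5, evidence)

# A declared fact that the fused belief contradicts could trigger a correction.
declared_true = True
dissonant = declared_true and belief < 0.3
```

In this toy setup the strong haptic reading outweighs the spoken claim, so the fused belief falls below the illustrative 0.3 threshold and the declared fact is flagged as inconsistent, which loosely mirrors the idea of communicating knowledge dissonance back to the operator.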


Updated: 2020-06-05
