Learning from demonstration using products of experts: Applications to manipulation and task prioritization
The International Journal of Robotics Research (IF 9.2), Pub Date: 2021-09-22, DOI: 10.1177/02783649211040561
Emmanuel Pignat 1,2, João Silvério 1, Sylvain Calinon 1,2

Probability distributions are key components of many learning from demonstration (LfD) approaches, and the spaces chosen to represent tasks play a central role. Although the robot configuration is defined by its joint angles, end-effector poses are often best explained within several task spaces. In many approaches, distributions within the relevant task spaces are learned independently and combined only at the control level. This simplification causes several problems, which are addressed in this work. We show that the fusion of models in different task spaces can be expressed as a product of experts (PoE), where the probabilities of the models are multiplied and renormalized so that the result is a proper distribution over joint angles. Multiple experiments are presented to show that learning the different models jointly in the PoE framework significantly improves the quality of the final model. The proposed approach particularly stands out when the robot has to learn hierarchical objectives, which arise when a task requires the prioritization of several sub-tasks (e.g. in a humanoid robot, keeping balance has a higher priority than reaching for an object). Since training the model jointly usually relies on contrastive divergence, which requires costly approximations that can affect performance, we propose an alternative strategy using variational inference and mixture model approximations. In particular, we show that the proposed approach can be extended to PoE with a nullspace structure (PoENS), where the model is able to recover secondary tasks that are masked by the resolution of tasks of higher importance.
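
As a rough illustration of the multiply-and-renormalize idea in the abstract, the sketch below builds a product of two Gaussian experts for a planar two-link arm: one expert scores the end-effector position obtained through forward kinematics, the other scores the joint angles directly. The arm, link lengths, targets, and standard deviations are all hypothetical and are not taken from the paper. Normalization is done by brute force on a grid, which is only practical in very low dimensions; it is a stand-in for, not an implementation of, the variational inference approach the authors propose.

```python
import numpy as np

# Hypothetical link lengths of a planar two-link arm (illustration only).
LINK_1, LINK_2 = 1.0, 1.0


def forward_kinematics(q):
    """End-effector (x, y) position; q has shape (..., 2)."""
    x = LINK_1 * np.cos(q[..., 0]) + LINK_2 * np.cos(q[..., 0] + q[..., 1])
    y = LINK_1 * np.sin(q[..., 0]) + LINK_2 * np.sin(q[..., 0] + q[..., 1])
    return np.stack([x, y], axis=-1)


def gaussian_log_density(x, mean, std):
    """Log-density of an isotropic Gaussian along the last axis of x."""
    dim = x.shape[-1]
    squared_distance = np.sum((x - mean) ** 2, axis=-1)
    return (-0.5 * squared_distance / std**2
            - dim * np.log(std)
            - 0.5 * dim * np.log(2.0 * np.pi))


def poe_on_grid(n=201):
    """Return grid angles, grid points, and the normalized PoE density."""
    angles = np.linspace(-np.pi, np.pi, n)
    q = np.stack(np.meshgrid(angles, angles, indexing="ij"), axis=-1)

    # Expert 1 (task space): end-effector should be near (1, 1). Assumed values.
    log_p_task = gaussian_log_density(
        forward_kinematics(q), np.array([1.0, 1.0]), std=0.1)
    # Expert 2 (joint space): preferred posture near (0, 1.5). Assumed values.
    log_p_joint = gaussian_log_density(q, np.array([0.0, 1.5]), std=1.0)

    # Product of experts: multiply densities, i.e. add log-densities.
    log_p = log_p_task + log_p_joint

    # Renormalize numerically so the result integrates to one over the grid.
    p = np.exp(log_p - log_p.max())
    cell_area = (angles[1] - angles[0]) ** 2
    p /= p.sum() * cell_area
    return angles, q, p


angles, q, p = poe_on_grid()
q_mode = q[np.unravel_index(np.argmax(p), p.shape)]
print("most probable joint angles:", q_mode)
print("end-effector at the mode:", forward_kinematics(q_mode))
```

The position target alone admits two arm configurations; in this toy setting the joint-space expert is what makes the product concentrate on one of them, which is the kind of interaction between task spaces that the abstract describes.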




Updated: 2021-09-22