Reasoning about uncertain parameters and agent behaviors through encoded experiences and belief planning
Artificial Intelligence ( IF 14.4 ) Pub Date : 2020-03-01 , DOI: 10.1016/j.artint.2019.103228
Akinobu Hayashi , Dirk Ruiken , Tadaaki Hasegawa , Christian Goerick

Abstract Robots are expected to handle increasingly complex tasks. Such tasks often include interaction with objects or collaboration with other agents. One of the key challenges for reasoning in such situations is the lack of accurate models, which hinders the effectiveness of planners. We present a system for online model adaptation that continuously validates and improves models while solving tasks with a belief space planner. We employ the well-known online belief planner POMCP. Particles are used to represent hypotheses about the current state and about models of the world. They are sufficient to configure a simulator to provide transition and observation models. We propose an enhanced particle reinvigoration process that leverages prior experiences encoded in a recurrent neural network (RNN). The network is trained through interaction with a large variety of object and agent parametrizations. The RNN is combined with a mixture density network (MDN) to process the current history of observations in order to propose suitable particles and model parametrizations. The proposed method also ensures that newly generated particles are consistent with the current history. These enhancements to the particle reinvigoration process help alleviate problems arising from poor sampling quality in large state spaces and enable handling of dynamics with discontinuities. The proposed approach can be applied to a variety of domains depending on what uncertainty the decision maker needs to reason about. We evaluate the approach with experiments in several domains and compare against other state-of-the-art methods. Experiments are done in a collaborative multi-agent domain and a single-agent object manipulation domain. The experiments are performed both in simulation and on a real robot. The framework handles reasoning with uncertain agent behaviors and with unknown object and environment parametrizations well. The results show good performance and indicate that the proposed approach can improve existing state-of-the-art methods.
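
The reinvigoration idea described in the abstract (propose new particles from a learned mixture, then keep only those consistent with the observed history) can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed 1-D Gaussian mixture below merely stands in for the output of the RNN and MDN, and the `simulate` function, the friction parameter, the tolerance, and all names are hypothetical choices made for illustration.

```python
import random


def sample_mixture(weights, means, stds, rng):
    """Draw one scalar from a 1-D Gaussian mixture.

    Assumption: this fixed mixture is a stand-in for what an MDN head
    would output after an RNN has processed the observation history.
    """
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.gauss(means[k], stds[k])


def simulate(param, action):
    """Hypothetical simulator: displacement of an object pushed with
    force `action` when its friction-like parameter is `param`."""
    return action / param


def is_consistent(param, history, tol):
    """A particle is kept only if it reproduces every past observation
    in the history to within `tol`."""
    return all(abs(simulate(param, a) - obs) <= tol for a, obs in history)


def reinvigorate(n, mixture, history, tol, rng, max_tries=10000):
    """Propose parameters from the mixture and reject proposals that
    contradict the history (simple rejection sampling)."""
    weights, means, stds = mixture
    particles = []
    for _ in range(max_tries):
        if len(particles) == n:
            break
        p = sample_mixture(weights, means, stds, rng)
        # Guard against non-physical parameters before simulating.
        if p > 0 and is_consistent(p, history, tol):
            particles.append(p)
    return particles


rng = random.Random(0)
true_friction = 2.0  # hidden parameter used only to generate the history
history = [(a, simulate(true_friction, a)) for a in (1.0, 2.0, 4.0)]
# Two-component proposal: one mode near the truth, one far from it.
mixture = ([0.7, 0.3], [2.0, 5.0], [0.3, 0.5])
particles = reinvigorate(50, mixture, history, tol=0.2, rng=rng)
```

In this toy setting, proposals drawn from the mode near 5.0 are rejected because they cannot reproduce the recorded displacements, so the surviving particles concentrate around the parameter that generated the history. In a planner such as POMCP, particles like these would be added to the belief at the current node; how the paper performs that step is not specified in the abstract.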

Updated: 2020-03-01