LoOP: Iterative learning for optimistic planning on robots
Robotics and Autonomous Systems (IF 4.3) Pub Date: 2021-02-01, DOI: 10.1016/j.robot.2020.103693
Francesco Riccio, Roberto Capobianco, Daniele Nardi

Abstract Efficient robotic behaviors require robustness and adaptation to dynamic changes of the environment, whose characteristics vary rapidly during robot operation. Planning and learning techniques have shown the most promising results for generating effective robot action policies; however, considered individually, each has its own limitations. Planning techniques do not generalize across similar states and require experts to define behavioral routines at different levels of abstraction. Conversely, learning methods usually require a considerable number of training samples and algorithm iterations. To overcome these issues, and to generate robot behaviors efficiently, we introduce LoOP, an iterative learning algorithm for optimistic planning that combines state-of-the-art planning and learning techniques to generate action policies. The main contribution of LoOP is the combination of Monte-Carlo Search Planning and Q-learning, which enables focused exploration during policy refinement in different robotic applications. We demonstrate the robustness and flexibility of LoOP in various domains and on multiple robotic platforms by validating the proposed approach with an extensive experimental evaluation.
