Decentralized Function Approximated Q-Learning in Multi-Robot Systems For Predator Avoidance
IEEE Robotics and Automation Letters (IF 5.2), Pub Date: 2020-10-01, DOI: 10.1109/lra.2020.3013920
Revanth Konda , Hung Manh La , Jun Zhang

The nature-inspired behavior of collective motion is found to be an optimal solution in swarming systems for predator avoidance and survival. In this work, we propose a two-level control architecture for multi-robot systems (MRS), which leverages the advantages of flocking control and function approximated reinforcement learning for the predator avoidance task. Reinforcement learning in multi-agent systems has gained a tremendous amount of interest in the past few years. Computationally intensive architectures such as deep reinforcement learning and actor-critic approaches have been extensively developed and have proved to be extremely efficient. The proposed approach, comprising cooperative function approximated Q-learning, is applied such that it ensures formation maintenance in the MRS during predator avoidance. A consensus filter is incorporated into the control architecture to sense predators in close vicinity in a distributed and cooperative fashion, ensuring consensus on states among the robots in the system. The proposed approach is proven to converge and yields superior performance in unexplored states with a reduced number of variables. Simulation results confirm the effectiveness of the proposed approach over existing methods. We expect that the proposed approach can be conveniently applied to many other areas with little modification, such as fire-fighting robots and surveillance and patrolling robots.
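
The abstract names two building blocks without giving their equations: Q-learning with function approximation, and a consensus filter for agreeing on sensed states. As a rough illustration only, the sketch below shows the generic textbook forms of both: a linear Q-function updated by a temporal-difference step, and a discrete-time average-consensus step over a communication graph. The feature vectors, reward, step sizes, and function names (`q_learning_update`, `consensus_step`) are assumptions made for this example; they are not taken from the paper and may differ from the authors' formulation.

```python
import numpy as np

def q_value(weights, features):
    """Linear approximation: Q(s, a) ~ w . phi(s, a)."""
    return float(np.dot(weights, features))

def q_learning_update(weights, features, reward, next_features_per_action,
                      alpha=0.1, gamma=0.9):
    """One generic Q-learning step with linear function approximation.

    features: phi(s, a) for the action just taken (hypothetical features).
    next_features_per_action: phi(s', a') for every action a' in s'.
    """
    best_next = max(q_value(weights, f) for f in next_features_per_action)
    td_error = reward + gamma * best_next - q_value(weights, features)
    # The gradient of a linear Q with respect to the weights is phi(s, a).
    return weights + alpha * td_error * features

def consensus_step(estimates, adjacency, epsilon=0.2):
    """One average-consensus step: x_i += eps * sum_j a_ij * (x_j - x_i).

    estimates: one scalar estimate per robot (e.g. a sensed predator value).
    adjacency: symmetric 0/1 matrix of which robots can communicate.
    epsilon should be below 1 / (maximum node degree) for convergence.
    """
    estimates = np.asarray(estimates, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)
    degree = adjacency.sum(axis=1)
    return estimates + epsilon * (adjacency @ estimates - degree * estimates)

# Usage: three robots on a line graph agree on the mean of their estimates.
line_graph = np.array([[0, 1, 0],
                       [1, 0, 1],
                       [0, 1, 0]])
x = np.array([0.0, 1.0, 2.0])
for _ in range(200):
    x = consensus_step(x, line_graph)
```

With a linear approximator, each robot stores one weight per feature rather than one table entry per state-action pair, which is one common reading of the abstract's "reduced number of variables".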

Updated: 2020-10-01