Hierarchical reinforcement learning via dynamic subspace search for multi-agent planning
Autonomous Robots ( IF 3.7 ) Pub Date : 2019-07-12 , DOI: 10.1007/s10514-019-09871-2
Aaron Ma , Michael Ouimet , Jorge Cortés

We consider scenarios where a swarm of unmanned vehicles (UxVs) seeks to satisfy a number of diverse, spatially distributed objectives. The UxVs strive to determine an efficient plan to service the objectives while operating in a coordinated fashion. We focus on developing autonomous high-level planning, where low-level controls are leveraged from previous work in distributed motion, target tracking, localization, and communication. We rely on the use of state and action abstractions in a Markov decision process framework to introduce a hierarchical algorithm, Dynamic Domain Reduction for Multi-Agent Planning, that enables multi-agent planning for large multi-objective environments. Our analysis establishes the correctness of our search procedure within specific subsets of the environment, termed 'sub-environments', and characterizes the algorithm's performance with respect to the optimal trajectories in single-agent and sequential multi-agent deployment scenarios using tools from submodularity. Simulated results show significant improvement over a standard Monte Carlo tree search in an environment with large state and action spaces.
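The sequential multi-agent deployment that the submodularity analysis refers to can be illustrated with a minimal sketch: each agent in turn picks the candidate plan with the largest marginal gain, conditioned on earlier agents' choices. The `coverage` objective, agent names, and candidate-plan structure below are toy assumptions for illustration, not the paper's actual value function or algorithm.

```python
import itertools


def coverage(assigned_plans):
    """Toy monotone submodular objective: the number of distinct
    objectives covered by the union of the chosen plans."""
    return len(set(itertools.chain.from_iterable(assigned_plans)))


def sequential_greedy(agents, candidate_plans):
    """Sequential (one-agent-at-a-time) greedy assignment.

    Each agent picks the plan maximizing its marginal gain given the
    plans already fixed by earlier agents. For a monotone submodular
    objective over per-agent choice sets, this sequence is within a
    constant factor of the optimal joint assignment, which is the kind
    of guarantee submodularity tools provide.
    """
    chosen = []
    for agent in agents:
        base = coverage(chosen)
        best = max(candidate_plans[agent],
                   key=lambda plan: coverage(chosen + [plan]) - base)
        chosen.append(best)
    return chosen
```

For example, if `uav2`'s best stand-alone plan overlaps with what `uav1` already covers, the conditioning on earlier choices steers it toward the complementary plan instead.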

Updated: 2019-07-12