AirCapRL: Autonomous Aerial Human Motion Capture using Deep Reinforcement Learning, IEEE Robotics and Automation Letters



IEEE Robotics and Automation Letters (IF 4.6), Pub Date: 2020-10-01, DOI: 10.1109/lra.2020.3013906
Rahul Tallamraju, Nitin Saini, Elia Bonetto, Michael Pabst, Yu Tang Liu, Michael J. Black, Aamir Ahmad

In this letter, we introduce a deep reinforcement learning (DRL) based multi-robot formation controller for the task of autonomous aerial human motion capture (MoCap). We focus on vision-based MoCap, where the objective is to estimate the trajectory of the body pose and shape of a single moving person using multiple micro aerial vehicles. State-of-the-art solutions to this problem are based on classical control methods, which depend on hand-crafted system and observation models. Such models are difficult to derive and to generalize across different systems. Moreover, the non-linearities and non-convexities of these models lead to sub-optimal controls. In our work, we formulate this problem as a sequential decision-making task to achieve the vision-based motion capture objectives, and solve it using a deep neural network-based RL method. We leverage proximal policy optimization (PPO) to train a stochastic decentralized control policy for formation control. The neural network is trained in a parallelized setup in synthetic environments. We performed extensive simulation experiments to validate our approach. Finally, real-robot experiments demonstrate that our policies generalize to real-world conditions.
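
For readers unfamiliar with PPO, the following is a minimal sketch of the generic clipped surrogate objective that PPO maximizes. It is not the authors' implementation: the abstract does not specify the reward, network architecture, or hyperparameters, so the function name `ppo_clipped_objective`, its arguments, and the default `clip_eps=0.2` are illustrative assumptions.

```python
import math


def ppo_clipped_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    logp_new / logp_old: log-probabilities of the sampled actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All names and the clip_eps default are illustrative assumptions.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Toy numbers only; these are not data from the paper.
    print(ppo_clipped_objective([0.0, 0.0], [0.0, 0.0], [1.0, 3.0]))
```

The clipping removes the incentive to move the policy far from the one that collected the data, which is one reason PPO is commonly used to train stochastic policies from samples gathered in parallel simulated environments.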


Updated: 2020-10-01
