Autonomous Blimp Control using Deep Reinforcement Learning
arXiv - CS - Systems and Control, Pub Date: 2021-09-22, DOI: arxiv-2109.10719
Yu Tang Liu, Eric Price, Pascal Goldschmid, Michael J. Black, Aamir Ahmad

Aerial robot solutions are becoming ubiquitous for an increasing number of tasks. Among the various types of aerial robots, blimps are very well suited to perform long-duration tasks while being energy efficient, relatively silent and safe. To address the blimp navigation and control task, in our recent work, we have developed a software-in-the-loop simulation and a PID-based controller for large blimps in the presence of wind disturbance. However, blimps have a deformable structure and their dynamics are inherently non-linear and time-delayed, often resulting in large trajectory tracking errors. Moreover, the buoyancy of a blimp is constantly changing due to changes in the ambient temperature and pressure. In the present paper, we explore a deep reinforcement learning (DRL) approach to address these issues. We train only in simulation, while keeping conditions as close as possible to the real-world scenario. We derive a compact state representation to reduce the training time and a discrete action space to enforce control smoothness. Our initial results in simulation show a significant potential of DRL in solving the blimp control task and robustness against moderate wind and parameter uncertainty. Extensive experiments are presented to study the robustness of our approach. We also openly provide the source code of our approach.

Updated: 2021-09-23