Long-Term Planning with Deep Reinforcement Learning on Autonomous Drones, arXiv - CS - Artificial Intelligence


arXiv - CS - Artificial Intelligence, Pub Date: 2020-07-11, DOI: arxiv-2007.05694
Ugurkan Ates

In this paper, we study a long-term planning scenario based on real-life drone racing competitions. We conducted this experiment on a framework created for "Game of Drones: Drone Racing Competition" at NeurIPS 2019. The racing environment was created using Microsoft's AirSim Drone Racing Lab. A reinforcement learning agent, a simulated quadrotor in our case, was trained with the Proximal Policy Optimization (PPO) algorithm and was able to compete successfully against another simulated quadrotor running a classical path planning algorithm. Agent observations consist of data from IMU sensors, the drone's GPS coordinates obtained through simulation, and the opponent drone's GPS information. Using the opponent drone's GPS information during training helps the agent deal with complex state spaces; it serves as expert guidance and allows for an efficient and stable training process. All experiments performed in this paper can be found and reproduced with the code at our GitHub repository.
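To make the two technical ingredients of the abstract concrete, the following is a minimal sketch, not the authors' code. It shows the standard PPO clipped surrogate objective and one possible way to assemble the observation from IMU data, own GPS, and opponent GPS. The function names, the flat-list observation layout, and the default `clip_eps=0.2` are assumptions made for illustration; the abstract does not specify them.

```python
import math


def build_observation(imu, own_gps, opponent_gps):
    """Concatenate sensor readings into one flat observation vector.

    The ordering (IMU, own GPS, opponent GPS) is an assumed layout,
    chosen only for this illustration.
    """
    return list(imu) + list(own_gps) + list(opponent_gps)


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    new_logp / old_logp: log-probabilities of the taken actions under
    the current and the data-collecting policy. advantages: advantage
    estimates for the same samples.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)  # probability ratio r_t
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


if __name__ == "__main__":
    # Hypothetical readings: 6 IMU values, 3 GPS values per drone.
    obs = build_observation([0.0] * 6, [1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
    print(len(obs))
    print(ppo_clip_objective([math.log(2.0)], [0.0], [1.0]))
```

The clipping keeps the probability ratio from moving the policy far from the one that collected the data, which is the usual reason PPO is described as stable to train.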


Updated: 2020-07-14
