Fixed-Wing UAVs flocking in continuous spaces: A Deep reinforcement learning approach
Robotics and Autonomous Systems ( IF 4.3 ) Pub Date : 2020-09-01 , DOI: 10.1016/j.robot.2020.103594
Chao Yan , Xiaojia Xiang , Chang Wang

Abstract Flocking of fixed-wing UAVs (Unmanned Aerial Vehicles) remains a challenging problem because of the complexity of the vehicle kinematics and the dynamics of the environment. In this paper, we solve the leader–follower flocking problem with a novel deep reinforcement learning algorithm that generates roll angle and velocity commands by training an end-to-end controller in continuous state and action spaces. Specifically, we choose CACLA (Continuous Actor–Critic Learning Automaton) as the base algorithm and use a multi-layer perceptron to represent both the actor and the critic. We further improve learning efficiency with experience replay, which stores training data in an experience memory and samples from that memory as needed. We compare the proposed CACER (Continuous Actor–Critic with Experience Replay) algorithm with benchmark algorithms such as DDPG and double DQN in numerical simulation, and we demonstrate the performance of the learned optimal policy in semi-physical simulation without any parameter tuning.
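
To make the two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It pairs a fixed-size experience memory with the standard CACLA update rule, in which the critic follows the temporal-difference (TD) error and the actor moves toward the executed action only when that error is positive. Linear function approximators stand in for the paper's multi-layer perceptrons. The names `ReplayMemory` and `cacla_update`, the scalar action, and the values of `gamma`, `alpha`, and `beta` are all assumptions made for illustration.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size experience memory; the oldest transitions are dropped first."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement, capped at the current size.
        k = min(batch_size, len(self.buffer))
        return random.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def dot(weights, features):
    """Inner product of two equal-length lists."""
    return sum(w * x for w, x in zip(weights, features))


def cacla_update(critic_w, actor_w, transition, gamma=0.9, alpha=0.1, beta=0.1):
    """One CACLA step with a linear critic V(s) = critic_w . s and a linear
    actor A(s) = actor_w . s (hypothetical stand-ins for the paper's MLPs).

    Updates both weight lists in place and returns the TD error.
    """
    state, action, reward, next_state = transition
    td_error = reward + gamma * dot(critic_w, next_state) - dot(critic_w, state)

    # Critic: ordinary TD(0) update.
    for i, x in enumerate(state):
        critic_w[i] += alpha * td_error * x

    # Actor: move toward the executed action only if it did better than expected.
    if td_error > 0:
        actor_out = dot(actor_w, state)
        for i, x in enumerate(state):
            actor_w[i] += beta * (action - actor_out) * x

    return td_error
```

In a training loop under these assumptions, each observed transition would be passed to `store`, and a batch drawn with `sample` would be replayed through `cacla_update`. How CACER schedules, batches, or weights the replayed samples is not stated in the abstract and is not modelled here.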

Updated: 2020-09-01