Maneuvering target tracking of UAV based on MN-DDPG and transfer learning
Defence Technology (IF 5.0), Pub Date: 2020-11-27, DOI: 10.1016/j.dt.2020.11.014
Bo Li , Zhi-peng Yang , Da-qing Chen , Shi-yang Liang , Hao Ma

Tracking a maneuvering target autonomously, accurately, and in real time in an uncertain environment is one of the challenging missions for unmanned aerial vehicles (UAVs). To address the control problem of maneuvering target tracking with obstacle avoidance, this paper develops an online path planning approach for UAVs based on deep reinforcement learning. Through end-to-end learning with neural networks, the approach perceives the environment and outputs continuous motion control commands. It comprises: (1) a deep deterministic policy gradient (DDPG)-based control framework that gives UAVs learning and autonomous decision-making capability; (2) an improved method, MN-DDPG, that introduces a type of mixed noise to help the UAV explore stochastic strategies for online optimal planning; and (3) a task-decomposition and pre-training algorithm for efficient transfer learning, which improves the generalization capability of the UAV control model built on MN-DDPG. Simulation results verify that the approach achieves good self-adaptive adjustment of the UAV's flight attitude in maneuvering target tracking tasks, with a significant improvement in the generalization capability and training efficiency of the UAV tracking controller in uncertain environments.
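The abstract does not say which noises MN-DDPG mixes or how they are weighted. As a rough illustration of the general idea only, the sketch below perturbs a deterministic policy action with a weighted sum of temporally correlated Ornstein-Uhlenbeck noise (the usual DDPG exploration noise) and uncorrelated Gaussian noise, then clips the result to the action bounds. The choice of these two noise types, the names `OUNoise` and `mixed_noise_action`, and every parameter value are assumptions made for this example, not details taken from the paper.

```python
import random


class OUNoise:
    """Ornstein-Uhlenbeck process with zero mean (temporally correlated noise)."""

    def __init__(self, dim, theta=0.15, sigma=0.2, dt=1.0, rng=None):
        self.dim = dim
        self.theta = theta      # pull-back rate toward zero (assumed value)
        self.sigma = sigma      # noise scale (assumed value)
        self.dt = dt
        self.rng = rng if rng is not None else random.Random()
        self.state = [0.0] * dim

    def reset(self):
        """Return the process to zero, e.g. at the start of an episode."""
        self.state = [0.0] * self.dim

    def sample(self):
        # Euler-Maruyama step: x <- x - theta*x*dt + sigma*sqrt(dt)*N(0, 1)
        self.state = [
            x - self.theta * x * self.dt
            + self.sigma * (self.dt ** 0.5) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


def mixed_noise_action(policy_action, ou, rng, gauss_sigma=0.1, mix=0.5,
                       low=-1.0, high=1.0):
    """Add a weighted mix of OU and Gaussian noise to an action, then clip.

    `mix` is a hypothetical weight: 1.0 means pure OU noise, 0.0 pure Gaussian.
    """
    noisy = []
    for a, n_ou in zip(policy_action, ou.sample()):
        n_gauss = rng.gauss(0.0, gauss_sigma)
        value = a + mix * n_ou + (1.0 - mix) * n_gauss
        noisy.append(min(high, max(low, value)))
    return noisy


# Usage: perturb a two-dimensional action (the values are arbitrary placeholders).
rng = random.Random(0)
ou = OUNoise(dim=2, rng=rng)
action = mixed_noise_action([0.3, -0.8], ou, rng)
```

In a DDPG training loop, a function like this would sit between the actor network's output and the environment step, and the noise scales would typically be reduced as training progresses.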




Updated: 2020-11-27