Target tracking strategy using deep deterministic policy gradient
Applied Soft Computing (IF 8.7), Pub Date: 2020-06-23, DOI: 10.1016/j.asoc.2020.106490
Shixun You, Ming Diao, Lipeng Gao, Fulong Zhang, Huan Wang

To address the challenge of keeping target tracking highly robust in a dynamic 3D high-altitude scenario, this paper presents a method for formulating continuous strategic maneuvers for unmanned combat air vehicles (UCAVs) based on the deep deterministic policy gradient (DDPG). DDPG is an efficient reinforcement learning approach that helps UCAVs perform a variety of navigation tasks in real time in a dynamic and random electronic warfare environment, and the authors therefore regard it as having clear advantages over other technologies. First, a target tracking simulator, Tracker, is created within the cognitive electronic warfare framework, and the maneuvering bias produced by environmental observation errors is analyzed theoretically. Tracker can automatically correlate the maximum physical overload with the UCAV's attitude angles and desired movement commands. Second, the agent's behavior rewards are shaped, inspired by vector-based navigation, to ensure that the DDPG output is reliable. Finally, a navigation decision framework based on deep reinforcement learning (DRL) is validated in simulation on target tracking tasks in different environments, where it yields excellent results. For behavior assessment, the agile maneuvers mastered by the agent are dissected by segmenting high-quality trajectories in time.
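For readers unfamiliar with DDPG, the following is a minimal sketch of three generic ingredients of the algorithm: the critic's bootstrapped target, the soft (Polyak) update of the target networks, and exploration noise added to a deterministic action. It is not the paper's implementation. The function names, the flat-list representation of weights, and the Gaussian noise (the original DDPG formulation uses an Ornstein-Uhlenbeck process) are assumptions made for illustration only; the abstract does not give the reward function or the Tracker dynamics, so neither is modeled here.

```python
import random


def td_target(reward, next_q, gamma, done):
    """Critic regression target y = r + gamma * Q'(s', mu'(s')).

    next_q is assumed to be the target critic's value at the next state and
    the target actor's action. No bootstrapping happens at terminal states.
    """
    return reward + (0.0 if done else gamma * next_q)


def soft_update(target_weights, online_weights, tau):
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * t
            for t, w in zip(target_weights, online_weights)]


def noisy_action(action, sigma, low, high, rng):
    """Add Gaussian exploration noise to a deterministic action and clip it.

    Gaussian noise is a simplifying assumption of this sketch.
    """
    return [min(high, max(low, a + rng.gauss(0.0, sigma))) for a in action]


# Hypothetical usage with toy numbers.
rng = random.Random(0)
y = td_target(reward=1.0, next_q=2.0, gamma=0.99, done=False)
new_target = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.01)
explore = noisy_action([0.5, -0.5], sigma=0.1, low=-1.0, high=1.0, rng=rng)
```

The small `tau` makes the target networks change slowly, which is what stabilizes the critic's regression target in DDPG.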




Updated: 2020-06-23