Autonomous Tracking Using a Swarm of UAVs: A Constrained Multi-agent Reinforcement Learning Approach
IEEE Transactions on Vehicular Technology (IF 6.8), Pub Date: 2020-11-01, DOI: 10.1109/tvt.2020.3023733
Yu-Jia Chen, Deng-Kai Chang, Cheng Zhang

In this paper, we aim to design an autonomous tracking system for a swarm of unmanned aerial vehicles (UAVs) to localize a radio frequency (RF) mobile target. In the system, UAVs equipped with omnidirectional received signal strength (RSS) sensors cooperatively search for the target with a specified tracking accuracy. To achieve fast localization and tracking in a highly dynamic channel environment (e.g., time-varying transmit power and intermittent signal), we formulate the flight decision problem as a constrained Markov decision process (CMDP) with the main objective of avoiding redundant UAV flight paths. We then propose an enhanced multi-agent reinforcement learning algorithm to coordinate multiple UAVs performing real-time target tracking. The core of the proposed scheme is a feedback control system that takes into account the uncertainty of the channel estimate. We prove that the proposed algorithm can converge to the optimal decision. Our simulation results show that the proposed scheme outperforms standard Q-learning and multi-agent Q-learning algorithms in terms of search time and probability of successful localization.
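
The abstract does not give the update rules, so the following is only a rough illustration of how a CMDP is often handled in tabular reinforcement learning: Lagrangian relaxation, where the constraint cost is folded into the reward through a multiplier that is itself adjusted over time. This is a generic textbook technique and should not be read as the paper's algorithm. All names and values here (`lagrangian_q_update`, `dual_update`, `lam`, `budget`, the learning rates, the table size) are assumptions made for the sketch.

```python
def lagrangian_q_update(q, s, a, reward, cost, s_next, lam,
                        alpha=0.1, gamma=0.9):
    """One tabular Q-learning step on the penalized reward r - lam * c.

    q is a list of per-state lists of action values (hypothetical layout).
    """
    penalized = reward - lam * cost
    td_target = penalized + gamma * max(q[s_next])
    q[s][a] += alpha * (td_target - q[s][a])
    return q


def dual_update(lam, avg_cost, budget, eta=0.05):
    """Projected subgradient step on the multiplier.

    lam grows while the observed average cost exceeds the budget and
    is clipped at zero otherwise.
    """
    return max(0.0, lam + eta * (avg_cost - budget))


# Toy usage: 4 states, 2 actions, all values assumed for illustration.
q = [[0.0, 0.0] for _ in range(4)]
q = lagrangian_q_update(q, s=0, a=1, reward=1.0, cost=1.0, s_next=1, lam=0.5)
lam = dual_update(0.5, avg_cost=2.0, budget=1.0)
```

In a tracking setting, `cost` could stand for something like revisiting an already searched cell, with `budget` capping how much of that is tolerated; that mapping is an assumption of this sketch and is not stated in the abstract.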

Updated: 2020-11-01