SREC: Proactive Self-Remedy of Energy-Constrained UAV-Based Networks via Deep Reinforcement Learning
arXiv - CS - Systems and Control Pub Date : 2020-09-17 , DOI: arxiv-2009.08528 Ran Zhang, Miao Wang, and Lin X. Cai
Energy-aware control of multiple unmanned aerial vehicles (UAVs) is one of
the major research interests in UAV-based networking. Yet few existing works
have focused on how the network should react around the time when the UAV
lineup changes. In this work, we study proactive self-remedy of
energy-constrained UAV networks when one or more UAVs are short of energy and
about to quit for charging. We target an energy-aware optimal UAV control
policy that proactively relocates the UAVs when any UAV is about to quit the
network, rather than passively dispatching the remaining UAVs after the quit.
Specifically, a deep reinforcement learning (DRL)-based self-remedy approach,
named SREC-DRL, is proposed to maximize the accumulated user satisfaction
scores over a period within which at least one UAV will quit the
network. To handle the continuous state and action spaces of the problem,
deep deterministic policy gradient (DDPG), a state-of-the-art actor-critic
DRL algorithm, is applied for its favorable convergence stability. Numerical
results demonstrate that, compared with the passive reaction method, the
proposed SREC-DRL approach achieves a $12.12\%$ gain in cumulative user
satisfaction score during the remedy period.
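The core mechanism named in the abstract is DDPG's deterministic policy-gradient update: a critic estimates $Q(s, a)$ over continuous states and actions, and the actor is adjusted in the direction that raises the critic's value estimate. The following is a minimal sketch of that update, not the authors' implementation; the linear actor and critic, the state dimension (UAV positions/energy), and the action dimension (a continuous relocation vector) are all illustrative assumptions.

```python
import numpy as np

# Hypothetical toy dimensions: state = UAV positions + residual energy,
# action = continuous relocation vector for the remaining UAVs.
state_dim, action_dim = 4, 2

rng = np.random.default_rng(0)
W_actor = rng.normal(scale=0.1, size=(action_dim, state_dim))   # deterministic policy mu(s)
w_critic = rng.normal(scale=0.1, size=state_dim + action_dim)   # linear critic Q(s,a) = w . [s, a]

def actor(s):
    """Deterministic continuous action mu(s) = W_actor @ s."""
    return W_actor @ s

def critic(s, a):
    """Scalar action-value estimate Q(s, a)."""
    return w_critic @ np.concatenate([s, a])

# One deterministic policy-gradient step at a sampled state:
# grad_W Q(s, mu(s)) = (dQ/da) outer s  for this linear parameterization.
s = rng.normal(size=state_dim)
q_before = critic(s, actor(s))

dQ_da = w_critic[state_dim:]           # gradient of the linear critic w.r.t. the action
lr = 0.05
W_actor += lr * np.outer(dQ_da, s)     # ascend the critic's value estimate

q_after = critic(s, actor(s))          # Q at this state can only go up (or stay equal)
```

For this linear case the improvement is exact: the action changes by `lr * dQ_da * (s @ s)`, so `q_after - q_before = lr * (s @ s) * ||dQ_da||**2 >= 0`. In full DDPG the same step is taken with neural-network approximators, a replay buffer, and target networks, which is where the convergence-stability benefit the abstract mentions comes from.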
Updated: 2020-09-21