Simultaneous Navigation and Radio Mapping for Cellular-Connected UAV With Deep Reinforcement Learning
IEEE Transactions on Wireless Communications (IF 10.4), Pub Date: 2021-02-12, DOI: 10.1109/twc.2021.3056573
Yong Zeng, Xiaoli Xu, Shi Jin, Rui Zhang

Cellular-connected unmanned aerial vehicles (UAVs) are a promising technology for unlocking the full potential of UAVs in the future, since they reuse the cellular base stations (BSs) to enable air-ground communications. However, achieving ubiquitous three-dimensional (3D) communication coverage for UAVs in the sky is a new challenge. In this paper, we tackle this challenge with a new coverage-aware navigation approach, which exploits the UAV's controllable mobility to design its navigation/trajectory so that it avoids the cellular BSs' coverage holes while accomplishing its mission. To this end, we formulate a UAV trajectory optimization problem to minimize the weighted sum of its mission completion time and expected communication outage duration. This problem, however, cannot be solved by standard optimization techniques because an accurate and tractable end-to-end communication model is lacking in practice. To overcome this difficulty, we propose a new solution approach based on deep reinforcement learning (DRL). Specifically, by leveraging the state-of-the-art dueling double deep Q network (dueling DDQN) with multi-step learning, we first propose a UAV navigation algorithm based on direct RL, where the signal measurement at the UAV is used to directly train the action-value function of the navigation policy. To further improve the performance, we propose a new framework called simultaneous navigation and radio mapping (SNARM), where the UAV's signal measurement is used not only to train the DQN directly, but also to create a radio map that can predict the outage probabilities at all locations in the area of interest. This enables the generation of simulated UAV trajectories and the prediction of their expected returns, which are then used to further train the DQN via the Dyna technique, thus greatly improving the learning efficiency.
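
The weighted-sum objective described in the abstract can be illustrated with a small sketch. This is not the authors' formulation or code: it assumes a discretized grid, a fixed time per step, and a hypothetical radio map given as a table of outage probabilities. The names `trajectory_cost`, `outage_map`, `step_time`, and `outage_weight` are made up for illustration.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[int, int]


def trajectory_cost(
    path: Sequence[Cell],
    outage_map: List[List[float]],
    step_time: float = 1.0,      # assumed duration of one grid move
    outage_weight: float = 10.0, # assumed weight on outage duration
) -> float:
    """Weighted sum of mission time and expected outage duration.

    Assumption: the outage probability of the cell entered at each
    step holds for the whole duration of that step.
    """
    mission_time = step_time * (len(path) - 1)
    expected_outage = sum(outage_map[r][c] * step_time for r, c in path[1:])
    return mission_time + outage_weight * expected_outage


# Hypothetical 3x3 radio map with one coverage hole at cell (0, 1).
outage_map = [
    [0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

# Shortest path crosses the hole; the detour goes around it.
direct = [(0, 0), (0, 1), (0, 2)]
detour = [(0, 0), (1, 0), (1, 1), (1, 2), (0, 2)]

direct_cost = trajectory_cost(direct, outage_map)
detour_cost = trajectory_cost(detour, outage_map)
print(direct_cost, detour_cost)
```

With a large outage weight the longer detour is cheaper than the direct path through the coverage hole, and with a small weight the direct path wins. In the paper's setting the radio map is not known in advance and is learned from the UAV's signal measurements, which this sketch does not model.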

Updated: 2021-02-12