UAV-Assisted Wireless Energy and Data Transfer With Deep Reinforcement Learning
IEEE Transactions on Cognitive Communications and Networking (IF 7.4), Pub Date: 2020-09-29, DOI: 10.1109/tccn.2020.3027696
Zehui Xiong, Yang Zhang, Wei Yang Bryan Lim, Jiawen Kang, Dusit Niyato, Cyril Leung, Chunyan Miao

As a typical scenario in future-generation communication network applications, UAV-assisted communication can perform autonomous data delivery for massive machine-type communication (mMTC), in which data generated by Internet of Things (IoT) devices are carried and delivered to destinations that have no direct communication channel to those devices. A wireless energy transfer technique can recharge the UAV while the system is in operation, helping the UAV collect and deliver data continuously. In this work, we formulate a Markov decision process (MDP) model to describe the energy and data transfer optimization problem for the UAV. To maximize the long-term utility of the UAV, the MDP model is solved by a value iteration algorithm to obtain the UAV's optimal strategies for collecting data, delivering data, and receiving transferred energy to replenish its on-board battery. Furthermore, to tackle the issues of system state uncertainty, partially observable states, and a large state space in UAV-assisted communication systems, we extend the MDP model and solve it using Q-learning and deep reinforcement learning (DRL) schemes. Simulations and numerical results show that, compared with baseline schemes, the proposed MDP model with the DRL-based scheme achieves better wireless energy and data transfer strategies, as measured by a higher long-term utility of the UAV.
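To make the value iteration step concrete, the following is a minimal sketch on a toy MDP, not the model from the paper. Everything in it is an assumption chosen for illustration: the state `(energy, has_data)`, the three actions `collect`, `deliver`, and `charge`, the battery capacity `E_MAX`, the charging success probability `P_CHARGE`, the discount factor `GAMMA`, and the unit delivery reward.

```python
from itertools import product

# All constants below are hypothetical, chosen only for illustration.
E_MAX = 3              # battery capacity in energy units
P_CHARGE = 0.6         # probability that one charging attempt adds a unit
GAMMA = 0.9            # discount factor
DELIVERY_REWARD = 1.0  # utility earned per delivered data unit

# A state is (energy, has_data): remaining energy and whether data is on board.
STATES = list(product(range(E_MAX + 1), (0, 1)))


def actions(state):
    """Charging is always allowed; collecting or delivering costs one energy unit."""
    energy, has_data = state
    available = ["charge"]
    if energy >= 1:
        available.append("deliver" if has_data else "collect")
    return available


def outcomes(state, action):
    """Return a list of (probability, next_state, reward) triples."""
    energy, has_data = state
    if action == "charge":
        charged = (min(energy + 1, E_MAX), has_data)
        return [(P_CHARGE, charged, 0.0), (1.0 - P_CHARGE, state, 0.0)]
    if action == "collect":
        return [(1.0, (energy - 1, 1), 0.0)]
    if action == "deliver":
        return [(1.0, (energy - 1, 0), DELIVERY_REWARD)]
    raise ValueError(f"unknown action: {action}")


def q_value(state, action, values):
    """Expected immediate reward plus discounted value of the next state."""
    return sum(p * (r + GAMMA * values[nxt]) for p, nxt, r in outcomes(state, action))


def value_iteration(tol=1e-9, max_iter=10_000):
    """Apply the Bellman optimality update until the values stop changing."""
    values = {s: 0.0 for s in STATES}
    for _ in range(max_iter):
        new_values = {s: max(q_value(s, a, values) for a in actions(s)) for s in STATES}
        delta = max(abs(new_values[s] - values[s]) for s in STATES)
        values = new_values
        if delta < tol:
            break
    # The greedy policy picks the action with the highest Q-value in each state.
    policy = {s: max(actions(s), key=lambda a: q_value(s, a, values)) for s in STATES}
    return values, policy


values, policy = value_iteration()
for s in STATES:
    print(s, round(values[s], 3), policy[s])
```

In this toy setting the UAV can only charge when its battery is empty, and it delivers immediately when it holds data at full battery, because charging a full battery changes nothing. Value iteration needs the full transition model; Q-learning and DRL, as mentioned in the abstract, are typically used when that model is unknown or the state space is too large to enumerate.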

Updated: 2020-09-29