Dynamic Spectrum Interaction of UAV Flight Formation Communication with Priority: A Deep Reinforcement Learning Approach
IEEE Transactions on Cognitive Communications and Networking (IF 8.6) Pub Date: 2020-09-01, DOI: 10.1109/tccn.2020.2973376
Yun Lin, Meiyu Wang, Xianglong Zhou, Guoru Ding, Shiwen Mao

Formation flight of multiple unmanned aerial vehicles (UAVs) can improve on the success probability of a single vehicle. Dynamic spectrum interaction addresses the problem of orderly communication among multiple UAVs under limited bandwidth through spectrum interaction between the UAVs. By introducing a reinforcement learning algorithm, UAVs can learn the optimal strategy through continuous interaction with the environment. In this paper, two UAV formation communication methods are studied. The first allows two UAVs to share information in the same time slot. The second adopts a dynamic time slot allocation scheme in which the UAVs use time slots alternately to share information. Quality of experience (QoE) is introduced to evaluate the results of UAV sharing, and an M/G/1 queuing model is used to handle priority and to evaluate UAV packet loss. On the algorithm side, deep reinforcement learning (DRL) is combined with a long short-term memory (LSTM) network to accelerate convergence. The experimental results show that, compared with the Q-learning and deep Q-network (DQN) methods, the proposed method converges faster and achieves better throughput performance.
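
To make the role of the M/G/1 priority queue more concrete, here is a minimal Python sketch of the classical mean waiting time formula for a non-preemptive priority M/G/1 queue. The abstract does not say which priority discipline, service-time distribution, or packet-loss expression the paper uses, so the non-preemptive assumption, the `TrafficClass` type, the function name, and the numbers are all illustrative. The sketch computes queueing delay only, not packet loss.

```python
from dataclasses import dataclass


@dataclass
class TrafficClass:
    """Hypothetical description of one priority class of UAV traffic."""
    arrival_rate: float   # lambda_i, packets per second (Poisson arrivals)
    mean_service: float   # E[S_i], seconds per packet
    second_moment: float  # E[S_i^2], seconds squared


def priority_waiting_times(classes):
    """Return the mean queueing delay per class, highest priority first.

    Assumes a non-preemptive priority M/G/1 queue and uses the textbook
    result W_k = W_0 / ((1 - sigma_{k-1}) * (1 - sigma_k)), where W_0 is
    the mean residual service time and sigma_k is the cumulative load of
    classes 1..k.
    """
    total_load = sum(c.arrival_rate * c.mean_service for c in classes)
    if total_load >= 1.0:
        raise ValueError("queue is unstable: total load must be below 1")

    # Mean residual service time seen by an arriving packet.
    residual = 0.5 * sum(c.arrival_rate * c.second_moment for c in classes)

    waits = []
    sigma_prev = 0.0
    for c in classes:
        sigma = sigma_prev + c.arrival_rate * c.mean_service
        waits.append(residual / ((1.0 - sigma_prev) * (1.0 - sigma)))
        sigma_prev = sigma
    return waits


if __name__ == "__main__":
    # Two illustrative classes with exponential service of mean 0.1 s,
    # so E[S^2] = 2 * 0.1**2 = 0.02.
    high = TrafficClass(arrival_rate=2.0, mean_service=0.1, second_moment=0.02)
    low = TrafficClass(arrival_rate=4.0, mean_service=0.1, second_moment=0.02)
    print(priority_waiting_times([high, low]))
```

With these illustrative inputs the high-priority class waits about 0.075 s on average and the low-priority class about 0.1875 s, which shows how priority trades delay between classes while the total load stays the same.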

Updated: 2020-09-01