Actor-critic learning-based energy optimization for UAV access and backhaul networks
EURASIP Journal on Wireless Communications and Networking (IF 2.6), Pub Date: 2021-04-07, DOI: 10.1186/s13638-021-01960-0
Yaxiong Yuan 1, Lei Lei 1, Thang X Vu 1, Symeon Chatzinotas 1, Sumei Sun 2, Björn Ottersten 1

In unmanned aerial vehicle (UAV)-assisted networks, a UAV acts as an aerial base station that acquires the requested data via a backhaul link and then serves ground users (GUs) through an access network. In this paper, we investigate an energy minimization problem with a limited power supply for both backhaul and access links. The difficulty in solving such a non-convex and combinatorial problem lies in its high computational complexity and long computation time. In developing solutions, we consider approaches from both the actor-critic deep reinforcement learning (AC-DRL) and optimization perspectives. First, two offline non-learning algorithms, an optimal algorithm and a heuristic algorithm, both based on piecewise linear approximation and relaxation, are developed as benchmarks. Second, toward real-time decision-making, we improve the conventional AC-DRL and propose two learning schemes: AC-based user group scheduling and backhaul power allocation (ACGP), and joint AC-based user group scheduling and optimization-based backhaul power allocation (ACGOP). Numerical results show that the computation time of both ACGP and ACGOP is reduced by a factor of ten to one hundred compared to the offline approaches, and that ACGOP achieves greater energy savings than ACGP. The results also verify the superiority of the proposed learning solutions over the conventional AC-DRL in terms of guaranteeing feasibility and minimizing system energy.




Updated: 2021-04-08