GAN-powered Deep Distributional Reinforcement Learning for Resource Management in Network Slicing
IEEE Journal on Selected Areas in Communications (IF 13.8), Pub Date: 2020-02-01, DOI: 10.1109/jsac.2019.2959185
Yuxiu Hua, Rongpeng Li, Zhifeng Zhao, Xianfu Chen, Honggang Zhang

Network slicing is a key technology in 5G communication systems. Its purpose is to dynamically and efficiently allocate resources for diversified services with distinct requirements over a common underlying physical infrastructure. Therein, demand-aware resource allocation is of significant importance to network slicing. In this paper, we consider a scenario containing several slices in a radio access network whose base stations share the same physical resources (e.g., bandwidth or slots). We leverage deep reinforcement learning (DRL) to solve this problem, treating the varying service demands as the environment state and the allocated resources as the environment action. To reduce the effects of the randomness and noise embedded in the received service level agreement (SLA) satisfaction ratio (SSR) and spectrum efficiency (SE), we first propose the generative adversarial network-powered deep distributional Q network (GAN-DDQN), which learns the action-value distribution by minimizing the discrepancy between the estimated and target action-value distributions. We further put forward a reward-clipping mechanism to stabilize GAN-DDQN training against the effects of widely spanning utility values. Moreover, we develop Dueling GAN-DDQN, which uses a specially designed dueling generator to learn the action-value distribution by estimating the state-value distribution and the action advantage function. Finally, we verify the performance of the proposed GAN-DDQN and Dueling GAN-DDQN algorithms through extensive simulations.
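The dueling decomposition mentioned in the abstract — estimating a state-value distribution and an action advantage function, then combining them into an action-value distribution — can be illustrated with a minimal sketch. This is not the paper's generator network; it is a toy, pure-Python version of the standard dueling identity Q(s, a) = V(s) + A(s, a) − mean over actions of A(s, ·), applied sample-wise to a distributional value estimate. All function names and the toy numbers are hypothetical.

```python
# Illustrative sketch only: sample-based dueling aggregation,
# assuming V(s) is represented by a list of distribution samples
# and A(s, a) by one scalar advantage per action.

def dueling_q_samples(value_samples, advantages):
    """Combine state-value distribution samples with per-action
    advantages via Q(s, a) = V(s) + A(s, a) - mean_a' A(s, a'),
    applied independently to each sample of the value distribution.
    Returns a list of rows, one per value sample, one column per action."""
    mean_adv = sum(advantages) / len(advantages)
    return [[v + a - mean_adv for a in advantages] for v in value_samples]

def expected_q(q_samples, action):
    """Collapse the distribution to a scalar Q estimate for one action
    by averaging over the samples."""
    return sum(row[action] for row in q_samples) / len(q_samples)

if __name__ == "__main__":
    value_samples = [1.0, 2.0, 3.0]   # toy samples of V(s)
    advantages = [0.5, -0.5]          # toy A(s, a) for two actions
    q = dueling_q_samples(value_samples, advantages)
    # greedy resource-allocation action = highest expected Q
    greedy = max(range(len(advantages)), key=lambda a: expected_q(q, a))
```

Subtracting the mean advantage keeps the decomposition identifiable (a constant cannot be shifted freely between V and A), which is the usual motivation for dueling architectures; the paper applies this idea inside a GAN generator that outputs distribution samples rather than point estimates.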

Updated: 2020-02-01