Reinforcement learning-based cost-efficient service function chaining with CoMP zero-forcing beamforming in edge networks


Future Generation Computer Systems (IF 6.2), Pub Date: 2022-11-24, DOI: 10.1016/j.future.2022.11.022
Kan Wang, Xuan Liu, Hongfang Zhou, Dapeng Lan, Zhen Gao, Amir Taherkordi, Yujie Ye, Yuan Gao

As two promising paradigms in emerging 6G wireless systems, service function chaining (SFC) and mobile edge computing (MEC) have attracted intensive attention from both industry and academia. They would bring closer-proximity services to 6G users through communication, computing and caching (3C) resources, yet they also face challenges arising from time-varying channel conditions and resource dynamics. In this work, motivated by recent advances in artificial intelligence and reinforcement learning, we investigate online SFC deployment at the edge of 6G wireless systems via the actor–critic learning framework. First, a long-run cost-efficient SFC deployment problem is investigated, and coordinated multipoint (CoMP)-based zero-forcing beamforming is used to cancel the interference across SFCs. Then, by exploiting the Markov decision process (MDP) property of long-run SFC deployment, a natural gradient-based actor–critic framework is proposed to characterize edge network dynamics while facilitating the training of neural networks toward the global optimum. Next, to reduce the size of the action space, a subproblem is embedded into each state–action pair's critic to solve for the reward function, and both $\ell_p$ ($0 < p < 1$) norm-based successive convex approximation (SCA) and proximal center-based dual decomposition are used to approach the global optimum and accelerate convergence. Finally, numerical results validate the proposed actor–critic approach and show that communication resource management deserves special attention in SFC deployment at the edge of 6G wireless systems.
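To make the interference-cancellation step concrete, the following is a minimal sketch of textbook zero-forcing beamforming, not the paper's actual formulation. It assumes the antennas of all cooperating access points are pooled into one channel matrix `H` (one row per user, one column per antenna), and the channel values, function names, and matrix sizes are all made up for illustration. Power constraints and per-user normalization are omitted.

```python
# Zero-forcing (ZF) beamforming sketch in pure Python (complex numbers are built in).
# Assumption: H[k][n] is the channel from antenna n (pooled over all cooperating
# access points) to user k. The values below are illustrative, not from the paper.

def conj_transpose(A):
    """Hermitian (conjugate) transpose of a matrix given as a list of rows."""
    return [[A[i][j].conjugate() for i in range(len(A))] for j in range(len(A[0]))]

def matmul(A, B):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inverse(A):
    """Gauss-Jordan inverse with partial pivoting for a small square complex matrix."""
    n = len(A)
    # Augment A with the identity matrix, then reduce the left half to the identity.
    M = [list(row) + [1.0 if i == j else 0.0 for j in range(n)] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: ZF needs linearly independent user channels")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

def zf_beamformer(H):
    """Return W = H^H (H H^H)^{-1}, the right pseudo-inverse of H, so H W = I."""
    Hh = conj_transpose(H)
    return matmul(Hh, inverse(matmul(H, Hh)))

# Two users served jointly by three antennas in total.
H = [[1 + 1j, 0.5 - 0.2j, 0.3j],
     [0.2 - 0.4j, 1 - 1j, 0.7 + 0.1j]]
W = zf_beamformer(H)

# Effective channel seen by the users: off-diagonal entries are the residual
# inter-user interference, which ZF drives to zero (up to rounding error).
effective = matmul(H, W)
```

Column `k` of `W` is the beamforming vector for user `k`; because `H W` is the identity, each user's signal is nulled at every other user. The sketch requires at least as many pooled antennas as users, and in practice a numerical library routine such as `numpy.linalg.pinv` would replace the hand-written inverse.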


Updated: 2022-11-24
