Deep reinforcement learning for computation offloading in mobile edge computing environment
Computer Communications (IF 4.5), Pub Date: 2021-04-30, DOI: 10.1016/j.comcom.2021.04.028
Miaojiang Chen, Tian Wang, Shaobo Zhang, Anfeng Liu

Recently, in order to distribute computing, networking resources, and services near terminals, mobile fog has gradually become the mobile edge computing (MEC) paradigm. In a mobile fog environment, the quality of service is affected by offloading speeds and fog processing; however, traditional fog methods struggle to solve the computation resource allocation problem because of the complex distribution of network states (that is, F-AP states, AP states, mobile device states, and code block states). In this paper, to improve the fog resource provisioning performance of mobile devices, a learning-based mobile fog scheme based on the deep deterministic policy gradient (DDPG) algorithm is proposed. The offloading block pulsating discrete event system is modeled as a Markov decision process (MDP), which enables offloading computation without knowing the transition probabilities among different network states. Furthermore, the DDPG algorithm is used to cope with the explosion of the state space and to learn an optimal offloading policy for distributed mobile fog computing. Simulation results show that the proposed scheme achieves 20%, 37%, and 46% improvements in the related performance metrics compared with the policy gradient (PG), deterministic policy gradient (DPG), and actor–critic (AC) methods, respectively. Moreover, compared with the traditional fog provisioning scheme, the proposed scheme shows better cost performance for fog resource provisioning under different numbers of locations and different task arrival rates.
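To make the abstract's approach concrete, the sketch below shows how a DDPG agent could be wired up for such an offloading MDP. This is not the authors' implementation: the state layout (F-AP load, AP load, device queue, code-block size), the continuous action (fraction of a code block offloaded), and the reward placeholder (negative latency/energy cost) are illustrative assumptions; only the actor–critic update structure follows the standard DDPG algorithm named in the paper.

```python
# Minimal DDPG sketch for a computation-offloading MDP (assumed state/action/reward design).
import random
from collections import deque

import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM = 4    # assumed state: [F-AP load, AP load, device queue, code-block size]
ACTION_DIM = 1   # assumed action: fraction of the code block offloaded to the fog node

class Actor(nn.Module):
    """Deterministic policy: state -> offloading fraction in [0, 1]."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACTION_DIM), nn.Sigmoid())
    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    """Q-value of a (state, action) pair."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
actor_tgt, critic_tgt = Actor(), Critic()
actor_tgt.load_state_dict(actor.state_dict())
critic_tgt.load_state_dict(critic.state_dict())
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)
buffer = deque(maxlen=10_000)
GAMMA, TAU, BATCH = 0.99, 0.005, 64

def ddpg_update():
    """One DDPG step: critic regresses the Bellman target, actor follows the
    deterministic policy gradient, target networks are softly updated."""
    if len(buffer) < BATCH:
        return
    s, a, r, s2 = map(torch.stack, zip(*random.sample(buffer, BATCH)))
    with torch.no_grad():
        y = r + GAMMA * critic_tgt(s2, actor_tgt(s2))      # Bellman target
    critic_loss = F.mse_loss(critic(s, a), y)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    actor_loss = -critic(s, actor(s)).mean()               # maximize Q(s, pi(s))
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    for tgt, src in ((actor_tgt, actor), (critic_tgt, critic)):
        for p_t, p in zip(tgt.parameters(), src.parameters()):
            p_t.data.mul_(1 - TAU).add_(TAU * p.data)

# Toy interaction loop with a stand-in environment: transitions and rewards are
# random placeholders, shown only to illustrate how experience feeds the buffer.
state = torch.rand(STATE_DIM)
for step in range(200):
    with torch.no_grad():
        action = (actor(state) + 0.1 * torch.randn(ACTION_DIM)).clamp(0.0, 1.0)
    reward = -torch.rand(1)                 # placeholder for -(latency + energy cost)
    next_state = torch.rand(STATE_DIM)      # placeholder network-state transition
    buffer.append((state, action, reward, next_state))
    ddpg_update()
    state = next_state
```

Because the transition probabilities among network states never appear in the update, this model-free structure matches the abstract's point that the offloading policy can be learned without knowing them.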



Updated: 2021-05-05