GP-NFSP: Decentralized task offloading for mobile edge computing with independent reinforcement learning
Future Generation Computer Systems (IF 6.2) Pub Date: 2022-11-24, DOI: 10.1016/j.future.2022.11.025
Jiaxin Hou, Meng Chen, Haijun Geng, Rongzhen Li, Jianyuan Lu

In Mobile Edge Computing (MEC), offloading tasks from mobile devices to edge servers may accelerate processing and save device energy, thereby improving device users' quality of experience. Recently, reinforcement learning (RL) has been increasingly used for offload decision making. RL seeks long-term cumulative benefits and has proved useful for sequential decision making, and is thus well suited to this problem. Due to privacy and security concerns, mobile devices may be unwilling to expose their local information, leading to a fully decentralized MEC environment. Independent RL (IRL) emerges as a promising solution for this scenario. However, IRL solutions face the non-stationarity issue, which arises when the individual agents keep changing their policies. In this paper, we propose adopting the Neural Fictitious Self-Play (NFSP) architecture for offload decision making. NFSP explicitly tackles the non-stationarity issue with its built-in self-play mechanism, and uses a mixed strategy consisting of deep RL and the past average strategy, the latter approximated by supervised deep learning. Furthermore, we use the Proximal Policy Optimization (PPO) algorithm as the RL component and exploit the Gated Recurrent Unit (GRU) to deal with the partial-observability issue in fully decentralized MEC. We conduct extensive simulation experiments, the results of which show that our method outperforms the raw IRL approaches, validating the effectiveness of the proposed method.
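To make the agent structure concrete, below is a minimal sketch (in PyTorch) of how an NFSP-style offloading agent could combine the pieces named in the abstract: a GRU encoder over the local observation history (for partial observability), a best-response policy head trained with RL such as PPO, and an average-policy head trained by supervised learning on past best-response actions. All names (GRUEncoder, NFSPAgent, eta) and the network sizes are illustrative assumptions, not the paper's actual implementation, and the PPO and supervised-learning updates are omitted.

```python
# Sketch of an NFSP-style agent for decentralized task offloading.
# Assumptions: discrete offloading actions, PyTorch available; hyperparameters are placeholders.
import random
import torch
import torch.nn as nn


class GRUEncoder(nn.Module):
    """Encodes the history of local observations to mitigate partial observability."""
    def __init__(self, obs_dim, hidden_dim):
        super().__init__()
        self.gru = nn.GRU(obs_dim, hidden_dim, batch_first=True)

    def forward(self, obs_seq):
        # obs_seq: (batch, time, obs_dim) -> last hidden state (batch, hidden_dim)
        _, h = self.gru(obs_seq)
        return h.squeeze(0)


class NFSPAgent:
    """Mixes a best-response policy (trained with RL, e.g. PPO) with an
    average policy (trained by supervised learning on past best-response actions)."""
    def __init__(self, obs_dim, n_actions, hidden_dim=64, eta=0.1):
        self.encoder = GRUEncoder(obs_dim, hidden_dim)
        self.best_response = nn.Linear(hidden_dim, n_actions)   # RL (e.g. PPO) policy head
        self.average_policy = nn.Linear(hidden_dim, n_actions)  # supervised policy head
        self.eta = eta          # probability of acting from the best-response policy
        self.sl_buffer = []     # (obs_seq, action) pairs for training the average policy

    def act(self, obs_seq):
        h = self.encoder(obs_seq)
        if random.random() < self.eta:
            # Best response: sample from the RL policy and record the pair
            # so the average policy can later imitate this behaviour.
            logits = self.best_response(h)
            action = torch.distributions.Categorical(logits=logits).sample().item()
            self.sl_buffer.append((obs_seq, action))
        else:
            # Average strategy: follow the supervised approximation of past behaviour.
            logits = self.average_policy(h)
            action = torch.distributions.Categorical(logits=logits).sample().item()
        return action
```

In a fully decentralized setting, each mobile device would run one such agent on its own observation history, so no local state has to be shared with other devices or the edge servers.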




Updated: 2022-11-24