Intelligent VNF Orchestration and Flow Scheduling via Model-assisted Deep Reinforcement Learning
IEEE Journal on Selected Areas in Communications (IF 16.4). Pub Date: 2020-02-01. DOI: 10.1109/jsac.2019.2959182
Lin Gu, Deze Zeng, Wei Li, Song Guo, Albert Y. Zomaya, Hai Jin

Hosting virtualized network functions (VNFs) has been regarded as an effective way to realize network function virtualization (NFV). Considering the cost diversity in cloud computing, from the perspective of service providers it is important to orchestrate the VNFs and schedule the traffic flows for network utility maximization (NUM), as this implies maximal revenue. However, traditional heuristic solutions based on optimization models usually rest on simplifying assumptions that limit their applicability. Recent studies have shown that deep reinforcement learning (DRL) is a promising way to overcome such limitations, but DRL agent training suffers from slow convergence, especially on complex control problems. We observe that optimization models can in fact be applied to accelerate DRL training. Motivated by this, we design a model-assisted DRL framework for VNF orchestration. Rather than letting the agent blindly explore actions, heuristic solutions are used to guide the training process, and the DRL framework is redesigned accordingly. Experiment results validate the high efficiency of our model-assisted DRL framework: it not only converges $23\times$ faster than the traditional DRL algorithm but also achieves higher performance.
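The core idea, steering a DRL agent's exploration with solutions from an optimization-based heuristic instead of uniform random actions, can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' implementation: `heuristic_action`, `guide_prob`, and the state encoding are hypothetical names introduced here, and the heuristic is a placeholder for whatever model-based solver the framework would actually invoke.

```python
# Minimal sketch of heuristic-guided exploration for DRL training.
# All names below are illustrative assumptions, not from the paper.
import random

import numpy as np


def heuristic_action(state, n_actions):
    """Stand-in for a model-based heuristic solver: here, a toy greedy
    rule that picks the action matching the largest state feature."""
    return int(np.argmax(state[:n_actions]))


def select_action(q_values, state, epsilon, guide_prob):
    """Epsilon-greedy action selection, except that a fraction of the
    exploratory steps are delegated to the heuristic rather than drawn
    uniformly at random (the 'blind exploration' the abstract avoids)."""
    n_actions = len(q_values)
    if random.random() < epsilon:             # exploratory step
        if random.random() < guide_prob:      # ...guided by the heuristic
            return heuristic_action(state, n_actions)
        return random.randrange(n_actions)    # ...or blind exploration
    return int(np.argmax(q_values))           # exploit learned policy
```

In such a scheme one would typically decay `guide_prob` (along with `epsilon`) over training, so the agent leans on the heuristic early, when blind exploration is most wasteful, and on its own learned Q-values later; guided exploration of this kind is one plausible mechanism behind the faster convergence the abstract reports.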

Updated: 2020-02-01