Delay-Aware VNF Scheduling: A Reinforcement Learning Approach With Variable Action Set
IEEE Transactions on Cognitive Communications and Networking (IF 7.4) Pub Date: 2020-04-21, DOI: 10.1109/tccn.2020.2988908
Junling Li, Weisen Shi, Ning Zhang, Xuemin Shen

Software defined networking (SDN) and network function virtualization (NFV) are key enabling technologies for service customization in next-generation networks supporting various applications. In this context, virtual network function (VNF) scheduling plays an essential role in improving resource utilization and achieving better quality of service (QoS). In this paper, the VNF scheduling problem is investigated with the goal of minimizing the makespan (i.e., the overall completion time) of all services while satisfying their different end-to-end (E2E) delay requirements. The problem is formulated as a mixed-integer linear program (MILP), which is NP-hard, with computational complexity that increases exponentially as the network size grows. To solve the MILP efficiently and accurately, the original problem is reformulated as a Markov decision process (MDP) with a variable action set. Then, a reinforcement learning (RL) algorithm is developed to learn the best scheduling policy by continuously interacting with the network environment. The proposed learning algorithm determines the variable action set at each decision-making state and captures the different execution times of the actions. The reward function in the proposed algorithm is carefully designed to realize delay-aware VNF scheduling. Simulation results demonstrate the convergence and high accuracy of the proposed approach compared with other benchmark algorithms.
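The optimization target described in the abstract can be sketched compactly. With notation assumed here rather than taken from the paper, let $c_s$ denote the completion time of the last VNF of service $s$ and $D_s$ its E2E delay budget; the MILP then minimizes the makespan subject to per-service delay constraints,

\[
\min \; \max_{s \in \mathcal{S}} c_s
\quad \text{s.t.} \quad c_s \le D_s \;\; \forall s \in \mathcal{S},
\]

together with the VNF precedence and resource constraints that make the program mixed-integer.

The abstract does not give implementation details, but its core mechanism, epsilon-greedy Q-learning in which the feasible action set varies from state to state and actions take different amounts of time, can be sketched as below. Everything in this sketch (the two toy service chains, processing times, deadlines, the +/-10 deadline bonus, and the one-VNF-at-a-time-per-service simplification that ignores shared node capacity) is an illustrative assumption, not the authors' formulation.

```python
import random
from collections import defaultdict

# A minimal tabular Q-learning sketch of delay-aware scheduling with a
# variable action set. All identifiers and numbers are assumptions.
services = {0: ["fw", "nat"], 1: ["fw", "lb", "nat"]}   # VNF chains
proc_time = {(0, 0): 2, (0, 1): 3, (1, 0): 1, (1, 1): 2, (1, 2): 2}
deadline = {0: 8, 1: 10}                                # assumed E2E budgets

def actions_at(progress, free_at, t):
    """The variable action set at time t: the next unscheduled VNF of each
    service whose predecessor has already finished."""
    return [(s, progress[s]) for s in services
            if progress[s] < len(services[s]) and free_at[s] <= t]

Q = defaultdict(float)
alpha, gamma, eps = 0.1, 0.95, 0.2

for _ in range(2000):
    progress = {s: 0 for s in services}   # index of next VNF per chain
    free_at = {s: 0 for s in services}    # time the chain's last VNF ends
    t = 0
    while any(progress[s] < len(services[s]) for s in services):
        acts = actions_at(progress, free_at, t)
        if not acts:  # nothing schedulable now; jump to the next completion
            t = min(free_at[s] for s in services
                    if progress[s] < len(services[s]))
            continue
        # Compact, lossy state encoding, for illustration only.
        state = (t, tuple(sorted(progress.items())))
        # Epsilon-greedy over the *current* (variable) action set only.
        a = (random.choice(acts) if random.random() < eps
             else max(acts, key=lambda x: Q[(state, x)]))
        s, k = a
        dur = proc_time[(s, k)]           # actions take different times
        free_at[s] = t + dur
        progress[s] += 1
        finished = progress[s] == len(services[s])
        # Delay-aware reward sketch: charge elapsed processing time and
        # reward/penalize meeting or missing the service's E2E deadline.
        r = -dur + (0 if not finished
                    else (10 if free_at[s] <= deadline[s] else -10))
        nstate = (t, tuple(sorted(progress.items())))
        nacts = actions_at(progress, free_at, t)
        best = max((Q[(nstate, b)] for b in nacts), default=0.0)
        Q[(state, a)] += alpha * (r + gamma * best - Q[(state, a)])
```

A greedy rollout of the learned table then picks, at each decision state, the highest-Q action from whatever action set is currently available; the makespan of that rollout is max(free_at.values()).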

Updated: 2020-04-21