Hierarchical Deep Double Q-Routing
arXiv - CS - Multiagent Systems. Pub Date: 2019-10-09, DOI: arXiv-1910.04041
Ramy E. Ali, Bilgehan Erman, Ejder Baştuğ and Bruce Cilli

This paper explores a deep reinforcement learning approach to the packet routing problem with the high-dimensional constraints that arise in dynamic and autonomous communication networks. Our approach is motivated by the fact that centralized path calculation approaches are often not scalable, whereas distributed approaches with locally acting nodes are not fully aware of end-to-end performance. We instead hierarchically distribute the path calculation over designated nodes in the network while taking end-to-end performance into account. Specifically, we develop a hierarchical cluster-oriented adaptive per-flow path calculation mechanism by leveraging the Deep Double Q-network (DDQN) algorithm, where end-to-end paths are calculated by the source nodes with the assistance of cluster (group) leaders at different hierarchical levels. In our approach, a deferred composite reward is designed to capture end-to-end performance through a feedback signal from the source nodes to the group leaders, and to capture local network performance through the local resource assessments made by the group leaders. This approach scales to large networks, adapts to dynamic demand, utilizes network resources efficiently, and can be applied to segment routing.
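The core learning update behind the mechanism above is the Deep Double Q-network target: the online network selects the next-hop action and a separate target network evaluates it, which reduces the overestimation bias of plain Q-learning. The following is a minimal sketch of that target computation; the function names (`q_online`, `q_target`, `ddqn_target`) and the toy Q-value functions are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def ddqn_target(q_online, q_target, next_state, reward, gamma=0.9):
    """Double Q-learning target for a routing decision.

    q_online / q_target: callables mapping a state to a vector of
    Q-values, one per candidate next hop (hypothetical interface).
    The online network *selects* the action; the target network
    *evaluates* it -- the decoupling that defines Double Q-learning.
    """
    best_next_hop = int(np.argmax(q_online(next_state)))       # selection
    return reward + gamma * q_target(next_state)[best_next_hop]  # evaluation

# Toy usage: three candidate next hops, a per-hop cost of -1.
q_online = lambda s: np.array([1.0, 3.0, 2.0])  # online net prefers hop 1
q_target = lambda s: np.array([0.5, 1.0, 4.0])  # target net's estimates
target = ddqn_target(q_online, q_target, next_state=None, reward=-1.0)
print(target)  # -1.0 + 0.9 * q_target[1] = -0.1
```

Note that plain DQN would instead take `max(q_target(next_state))` (here 4.0), illustrating the overestimation that the double-estimator form avoids.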

Updated: 2020-03-05