SMART: Situationally-Aware Multi-Agent Reinforcement Learning-Based Transmissions
IEEE Transactions on Cognitive Communications and Networking (IF 8.6), Pub Date: 2021-03-25, DOI: 10.1109/tccn.2021.3068740
Zhiyuan Jiang, Yan Liu, Jernej Hribar, Luiz A. DaSilva, Sheng Zhou, Zhisheng Niu

In future wireless systems, the latency of information needs to be minimized to satisfy the requirements of many mission-critical applications. Meanwhile, not all terminals carry equally urgent packets, given their distinct situations, e.g., status freshness. Leveraging this feature, we propose an on-demand Medium Access Control (MAC) scheme in which each terminal transmits with dynamically adjusted aggressiveness based on its situation, which is modeled as a Markov state. A Multi-Agent Reinforcement Learning (MARL) framework is used, and each agent is trained with a Deep Deterministic Policy Gradient (DDPG) network. A notorious issue for MARL is slow and non-scalable convergence; to address this, a new Situationally-aware MARL-based Transmissions (SMART) scheme is proposed. SMART is shown to significantly shorten the convergence time and to dramatically improve the converged performance compared with state-of-the-art DDPG-based MARL schemes, at the expense of an additional offline training stage. SMART also significantly outperforms conventional MAC schemes, e.g., Carrier Sense Multiple Access (CSMA), in terms of average and peak Age of Information (AoI). In addition, SMART is versatile: different Quality-of-Service (QoS) metrics, and hence various state space definitions, are tested in extensive simulations, where SMART shows robustness and scalability in all considered scenarios.
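
To make the AoI metrics in the abstract concrete, the following is a minimal sketch of a slotted random-access simulation in which each terminal's transmit probability depends on its current age. It is not the SMART scheme and involves no learning. The function name `simulate_aoi`, the collision model (a slot succeeds only when exactly one terminal transmits), the assumption that every terminal always has a fresh status update to send, and the two example policies are all illustrative assumptions that do not come from the paper.

```python
import random


def simulate_aoi(num_terminals, num_slots, tx_prob, seed=0):
    """Simulate slotted random access and return (average AoI, average peak AoI).

    tx_prob is a hypothetical policy: it maps a terminal's current age
    (in slots) to the probability that it transmits in this slot.
    Assumption: a slot succeeds only if exactly one terminal transmits,
    and a successful terminal's age resets to 1.
    """
    rng = random.Random(seed)
    ages = [1] * num_terminals
    age_sum = 0
    peaks = []  # age observed just before each successful reset
    for _ in range(num_slots):
        senders = [i for i in range(num_terminals)
                   if rng.random() < tx_prob(ages[i])]
        winner = senders[0] if len(senders) == 1 else None
        for i in range(num_terminals):
            if i == winner:
                peaks.append(ages[i])
                ages[i] = 1
            else:
                ages[i] += 1
        age_sum += sum(ages)
    average_age = age_sum / (num_slots * num_terminals)
    average_peak = sum(peaks) / len(peaks) if peaks else float("inf")
    return average_age, average_peak


if __name__ == "__main__":
    n = 10
    # Age-independent baseline: every terminal transmits with probability 1/n.
    fixed = simulate_aoi(n, 20000, lambda age: 1.0 / n)
    # Illustrative age-aware policy: stay silent until the age reaches n slots.
    aware = simulate_aoi(n, 20000, lambda age: 0.0 if age < n else 2.0 / n)
    print("fixed      avg AoI, avg peak AoI:", fixed)
    print("age-aware  avg AoI, avg peak AoI:", aware)
```

The two policies differ only in how "aggressiveness" depends on age, which is the quantity the abstract says each terminal adjusts. In the paper this mapping is learned per agent, whereas here it is hard-coded purely for illustration.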

Updated: 2021-03-25