Cooperative channel assignment for VANETs based on multiagent reinforcement learning
Frontiers of Information Technology & Electronic Engineering (IF 2.7), Pub Date: 2020-07-29, DOI: 10.1631/fitee.1900308
Yun-peng Wang, Kun-xian Zheng, Da-xin Tian, Xu-ting Duan, Jian-shan Zhou

Dynamic channel assignment (DCA) plays a key role in extending vehicular ad-hoc network capacity and mitigating congestion. However, channel assignment in vehicular direct-communication scenarios faces challenges such as the mutual influence of large-scale nodes, the lack of centralized coordination, and unknown global state information. To address these challenges, a multiagent reinforcement learning (RL) based cooperative DCA (RL-CDCA) mechanism is proposed. Specifically, each vehicular node learns effective channel-selection and backoff-adaptation strategies from real-time channel state information (CSI) using two cooperative RL models. In addition, neural networks are constructed as nonlinear Q-function approximators, which facilitates mapping the continuously sensed input to a mixed policy output. Nodes locally share and incorporate their individual rewards so that they can optimize their policies in a distributed, collaborative manner. Simulation results show that, relative to the compared schemes, the proposed multiagent RL-CDCA reduces the one-hop packet delay by no less than 73.73%, improves the average packet delivery ratio by no less than 12.66% in a highly dense scenario, and improves the fairness of global network resource allocation.
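The page carries only the abstract, so as a rough illustration of the mechanism it describes, here is a minimal, hypothetical Python sketch: each node holds a small neural Q-function approximator over sensed CSI and, before each update, blends its own reward with the mean of the rewards shared by its one-hop neighbors. Every detail below (network sizes, the toy reward, the mixing weight MIX, the node count, the CSI model) is an assumption rather than the authors' design, and only the channel-selection model is shown; the paper's second cooperative RL model for backoff adaptation is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

N_CHANNELS = 4        # assumed number of selectable channels
CSI_DIM = N_CHANNELS  # assumed: one sensed busy ratio per channel
HIDDEN = 16           # hidden-layer width (assumed)
ALPHA = 0.01          # learning rate (assumed)
GAMMA = 0.9           # discount factor (assumed)
MIX = 0.5             # weight on neighbors' shared rewards (assumed)

class QAgent:
    """One vehicular node: a tiny MLP maps sensed CSI to per-channel Q-values."""
    def __init__(self):
        self.w1 = rng.normal(0.0, 0.1, (CSI_DIM, HIDDEN))
        self.w2 = rng.normal(0.0, 0.1, (HIDDEN, N_CHANNELS))

    def forward(self, csi):
        h = np.tanh(csi @ self.w1)      # hidden activations
        return h, h @ self.w2           # (hidden, per-channel Q-values)

    def act(self, csi, eps=0.1):
        if rng.random() < eps:          # epsilon-greedy exploration
            return int(rng.integers(N_CHANNELS))
        return int(np.argmax(self.forward(csi)[1]))

    def update(self, csi, action, mixed_reward, next_csi):
        _, q_next = self.forward(next_csi)
        h, q = self.forward(csi)
        td = (mixed_reward + GAMMA * q_next.max()) - q[action]
        # one gradient-descent step on 0.5 * td^2 (manual backprop)
        g_h = -td * self.w2[:, action] * (1.0 - h * h)  # uses pre-update w2
        self.w2[:, action] += ALPHA * td * h
        self.w1 -= ALPHA * np.outer(csi, g_h)

def local_rewards(choices, busy):
    """Toy reward (assumption): +1 for an uncontested, mostly idle channel, else -1."""
    counts = np.bincount(choices, minlength=N_CHANNELS)
    return np.array([1.0 if counts[c] == 1 and busy[c] < 0.5 else -1.0
                     for c in choices])

agents = [QAgent() for _ in range(6)]   # 6 nodes, all mutual one-hop neighbors (assumed)
csi = rng.random(CSI_DIM)               # sensed per-channel busy ratios
for t in range(200):
    choices = np.array([a.act(csi) for a in agents])
    local = local_rewards(choices, csi)
    # cooperative step: each node blends its own reward with the neighborhood mean
    mixed = (1.0 - MIX) * local + MIX * local.mean()
    next_csi = np.clip(csi + rng.normal(0.0, 0.05, CSI_DIM), 0.0, 1.0)
    for agent, c, r in zip(agents, choices, mixed):
        agent.update(csi, int(c), float(r), next_csi)
    csi = next_csi

print("final channel choices:", [a.act(csi, eps=0.0) for a in agents])
```

The reward blending in `mixed` is the piece that loosely corresponds to nodes "locally sharing and incorporating their individual rewards"; setting MIX = 0 would reduce the sketch to fully independent Q-learning with no cooperation.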


