Centralized Cooperation for Connected and Automated Vehicles at Intersections by Proximal Policy Optimization
IEEE Transactions on Vehicular Technology ( IF 6.1 ) Pub Date : 2020-11-01 , DOI: 10.1109/tvt.2020.3026111
Yang Guan , Yangang Ren , Shengbo Eben Li , Qi Sun , Laiquan Luo , Keqiang Li

Connected vehicles will change how future transportation is managed and organized, especially at intersections without traffic lights. Centralized coordination methods globally coordinate vehicles approaching the intersection from all sections by considering their states jointly. However, they require substantial computational resources, since a centralized controller must optimize the trajectories of all approaching vehicles in real time. In this paper, we propose a centralized coordination scheme for automated vehicles at an intersection without traffic signals that uses reinforcement learning (RL) to address the low computational efficiency of current centralized coordination methods. We first propose an RL training algorithm, model-accelerated proximal policy optimization (MA-PPO), which incorporates a prior model into the proximal policy optimization (PPO) algorithm to improve sample efficiency and thereby accelerate learning. We then present the design of the state, action, and reward to formulate centralized coordination as an RL problem. Finally, we train a coordination policy in a simulation setting and compare its computing time and traffic efficiency against a coordination scheme based on model predictive control (MPC). Results show that our method requires only 1/400 of the computing time of MPC and increases the efficiency of the intersection by 4.5 times.
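For readers unfamiliar with the base algorithm, the following is a minimal sketch of the standard PPO clipped surrogate objective (Schulman et al., 2017), which MA-PPO extends with a prior model for sample efficiency. The model-acceleration component and the paper's state/action/reward design are not shown; the function and variable names here are illustrative, not from the paper.

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate objective (to be maximized).

    ratio:     pi_new(a|s) / pi_old(a|s) for each sampled action
    advantage: estimated advantage for each sampled action
    eps:       clip range; limits how far the new policy moves
               from the old one in a single update
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Take the elementwise minimum: a pessimistic bound that removes
    # the incentive to push the ratio outside [1 - eps, 1 + eps].
    return np.mean(np.minimum(unclipped, clipped))

# Two sampled transitions: one with positive, one with negative advantage.
ratios = np.array([1.3, 0.7])
advs = np.array([1.0, -1.0])
obj = ppo_clip_objective(ratios, advs)  # -> 0.2 (1.2 and -0.8 averaged)
```

With eps = 0.2, the first term is clipped to 1.2 (ratio 1.3 exceeds the range) and the second to -0.8, so neither transition can push the policy arbitrarily far in one step.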

Updated: 2020-11-01