Preceding vehicle following algorithm with human driving characteristics
Proceedings of the Institution of Mechanical Engineers, Part D: Journal of Automobile Engineering (IF 1.5) Pub Date: 2021-01-04, DOI: 10.1177/0954407020981546
Feng Pan, Hong Bao

This paper proposes a new approach that uses reinforcement learning (RL) to train an agent to follow a preceding vehicle with human driving characteristics. We draw on the idea of inverse reinforcement learning to design the reward function of the RL model: the factors that must be weighed in vehicle following are vectorized into a reward vector, and the reward function is defined as the inner product of the reward vector and a weight vector. Driving data from human drivers were collected and analyzed to estimate the true reward function. Because the state and action spaces are continuous, the RL model was trained with the deterministic policy gradient algorithm. We adjusted the weight vector of the reward function so that the value vector of the RL model progressively approached that of a human driver. After dozens of rounds of training, we selected the policy whose value vector was closest to that of a human driver and tested it in the PanoSim simulation environment. The results show the desired performance: the agent follows the preceding vehicle safely and smoothly.
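The abstract states only that the reward is the inner product of a vectorized set of following-related factors and a weight vector; the concrete factors and weights are not given here. The minimal Python sketch below illustrates that linear-reward construction with assumed features (gap error, relative speed, acceleration, jerk) and a hypothetical weight vector. It is an illustration of the idea under those assumptions, not the paper's implementation.

```python
import numpy as np

# Sketch of the linear reward described in the abstract.
# Feature choices below are assumptions; the paper does not list its exact
# reward components in this abstract.

def reward_features(gap_error, rel_speed, accel, jerk):
    """Vectorize the factors weighed in vehicle following into a reward vector."""
    return np.array([
        -abs(gap_error),   # deviation from the desired following gap
        -abs(rel_speed),   # speed difference to the preceding vehicle
        -abs(accel),       # comfort penalty on acceleration
        -abs(jerk),        # comfort penalty on jerk
    ])

def reward(weights, gap_error, rel_speed, accel, jerk):
    """Reward defined as the inner product of the reward vector and the weights."""
    return float(np.dot(weights, reward_features(gap_error, rel_speed, accel, jerk)))

# Hypothetical weight vector; per the abstract, the weights are adjusted so that
# the value vector of the trained policy approaches the one estimated from
# human driving data.
w = np.array([0.5, 0.3, 0.1, 0.1])
print(reward(w, gap_error=1.2, rel_speed=-0.4, accel=0.3, jerk=0.05))
```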



