Predictive trajectory planning for autonomous vehicles at intersections using reinforcement learning
Transportation Research Part C: Emerging Technologies (IF 7.6), Pub Date: 2023-02-28, DOI: 10.1016/j.trc.2023.104063
Ethan Zhang , Ruixuan Zhang , Neda Masoud

In this work, we put forward a predictive trajectory planning framework to help autonomous vehicles plan future trajectories. We develop a partially observable Markov decision process (POMDP) to model this sequential decision-making problem, and a deep reinforcement learning solution methodology to learn high-quality policies. The POMDP model takes driving scenarios, condensed into graphs, as inputs. More specifically, an input graph contains information on the historical trajectory of the subject vehicle, the predicted trajectories of other agents in the scene (e.g., other vehicles, pedestrians, and cyclists), and the predicted risk levels posed by surrounding vehicles. This information is used to devise safe, comfortable, and energy-efficient trajectories for the subject vehicle to follow. To obtain sufficient driving scenarios to use as training data, we propose a simulation framework that generates socially acceptable driving scenarios from a real-world autonomous vehicle dataset. The simulation framework uses Bayesian Gaussian mixture models to learn the trajectory patterns of different agent types, and Gibbs sampling to ensure that the distribution of simulated scenarios matches that of the real-world dataset collected by an autonomous fleet. We evaluate the proposed work in two complex urban driving environments: a non-signalized T-junction and a non-signalized lane-merge intersection. Both environments provide vastly more complex driving scenarios than a highway driving environment, which has been the focus of most previous studies. The framework demonstrates promising performance for planning horizons as long as five seconds. We compare the safety, comfort, and energy efficiency of the planned trajectories against human-driven trajectories in both experimental driving environments, and demonstrate that the planned trajectories outperform the human-driven ones in a statistically significant fashion in all three aspects.
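The abstract names Gibbs sampling but does not describe how it is configured in the simulation framework. As a generic illustration of the technique only, and not the authors' simulator, the following minimal sketch draws samples from a standard bivariate normal distribution by alternately sampling each coordinate from its conditional distribution given the other. The function names, the target distribution, and the parameters `rho`, `burn_in`, and `seed` are all assumptions introduced here for illustration.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Illustrative toy target (an assumption, not taken from the paper):
    each coordinate is resampled from its exact conditional distribution.
    """
    rng = random.Random(seed)
    cond_std = math.sqrt(1.0 - rho * rho)  # std of x | y and of y | x
    x, y = 0.0, 0.0
    samples = []
    for i in range(burn_in + n_samples):
        x = rng.gauss(rho * y, cond_std)  # x | y ~ N(rho * y, 1 - rho^2)
        y = rng.gauss(rho * x, cond_std)  # y | x ~ N(rho * x, 1 - rho^2)
        if i >= burn_in:  # discard early draws taken before the chain mixes
            samples.append((x, y))
    return samples


def sample_correlation(samples):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n
    var_y = sum((y - mean_y) ** 2 for _, y in samples) / n
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    draws = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
    print(f"empirical correlation: {sample_correlation(draws):.3f}")
```

With enough draws, the empirical correlation of the samples should approach the target `rho`, which mirrors the stated goal of matching the simulated distribution to a reference distribution.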




Updated: 2023-03-01