Microscopic modeling of cyclists interactions with pedestrians in shared spaces: a Gaussian process inverse reinforcement learning approach
Transportmetrica A: Transport Science ( IF 3.6 ) Pub Date : 2021-03-20 , DOI: 10.1080/23249935.2021.1898487
Rushdi Alsaleh 1 , Tarek Sayed 1

This study presents a microsimulation-oriented framework for modeling cyclists' interactions with pedestrians in shared spaces. The objectives of this study are to 1) infer how cyclists in head-on and crossing interactions rationally assess their situation and make guidance decisions regarding acceleration and yaw rate, and 2) use advanced Artificial Intelligence (AI) techniques to model road-user interactions. The Markov Decision Process (MDP) modeling framework is used to account for road-user rationality and intelligence. Road-user trajectories from three shared spaces in North America are extracted by means of computer-vision algorithms. Inverse Reinforcement Learning (IRL) algorithms are utilized to recover continuous linear and nonlinear Gaussian Process (GP) reward functions (RFs). Deep Reinforcement Learning is used to estimate optimal cyclist policies. Results demonstrated that the GP-RF captures more complex interaction behaviour and accounts for road-user heterogeneity. The GP-RF also led to more consistent inferences of road-user behaviour and more accurate predictions of their trajectories compared with the linear RF.
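The MDP formulation summarized above treats the cyclist's guidance decisions, acceleration and yaw rate, as the actions of a sequential decision process. The sketch below illustrates one such state-transition step; the state variables, time step, and kinematic update rule are illustrative assumptions, not the authors' implementation.

```python
import math
from dataclasses import dataclass

@dataclass
class CyclistState:
    x: float        # position along x (m)
    y: float        # position along y (m)
    speed: float    # longitudinal speed (m/s)
    heading: float  # heading angle (rad)

def step(state: CyclistState, accel: float, yaw_rate: float,
         dt: float = 0.1) -> CyclistState:
    """One MDP transition: apply the acceleration and yaw-rate actions."""
    heading = state.heading + yaw_rate * dt
    speed = max(0.0, state.speed + accel * dt)   # speed cannot go negative
    return CyclistState(
        x=state.x + speed * math.cos(heading) * dt,
        y=state.y + speed * math.sin(heading) * dt,
        speed=speed,
        heading=heading,
    )

# A cyclist moving at 4 m/s accelerates gently while holding heading.
s0 = CyclistState(x=0.0, y=0.0, speed=4.0, heading=0.0)
s1 = step(s0, accel=0.5, yaw_rate=0.0)
```

In an IRL setting, observed trajectories are segmented into such state-action pairs, and the reward function over states is what the learning algorithm recovers.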



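As an illustration of the nonlinear GP reward function mentioned in the abstract, the sketch below represents a reward as a Gaussian-process posterior mean over interaction features (e.g. gap to the pedestrian and relative speed). The training points, targets, and kernel hyperparameters are made-up assumptions for demonstration, not values from the study.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between feature matrices A (n,d) and B (m,d)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)

# Assumed training features and reward targets (purely illustrative).
X = np.array([[1.0, 0.0],    # small gap, no relative speed -> low reward
              [3.0, 1.0],
              [5.0, -1.0]])  # large gap -> high reward
r = np.array([-1.0, 0.2, 0.8])
noise = 1e-6

# Precompute the weight vector alpha = K^{-1} r for the posterior mean.
K = rbf_kernel(X, X) + noise * np.eye(len(X))
alpha = np.linalg.solve(K, r)

def gp_reward(features):
    """Posterior-mean reward at new feature vectors."""
    return rbf_kernel(np.atleast_2d(np.asarray(features, float)), X) @ alpha

reward_at_small_gap = float(gp_reward([1.0, 0.0])[0])
```

With near-zero observation noise, the posterior mean interpolates the targets, so `reward_at_small_gap` is close to the assumed value -1.0; the IRL step in the paper is what determines such targets from observed behaviour rather than fixing them by hand.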

Updated: 2021-03-20