Combining reinforcement learning with rule-based controllers for transparent and general decision-making in autonomous driving
Robotics and Autonomous Systems (IF 4.3) | Pub Date: 2020-09-01 | DOI: 10.1016/j.robot.2020.103568
Amarildo Likmeta, Alberto Maria Metelli, Andrea Tirinzoni, Riccardo Giol, Marcello Restelli, Danilo Romano

Abstract: The design of high-level decision-making systems is a topical problem in the field of autonomous driving. In this paper, we combine traditional rule-based strategies and reinforcement learning (RL) with the goal of achieving transparency and robustness. On the one hand, the use of handcrafted rule-based controllers allows for transparency, i.e., it is always possible to determine why a given decision was made, but they struggle to scale to complex driving scenarios in which several objectives need to be considered. On the other hand, black-box RL approaches enable us to deal with more complex scenarios, but they are usually hard to interpret. In this paper, we combine the best properties of these two worlds by designing parametric rule-based controllers, in which interpretable rules can be provided by domain experts and their parameters are learned via RL. After illustrating how to apply parameter-based RL methods (PGPE) to this setting, we present extensive numerical simulations in a highway scenario and in two urban scenarios: intersection and roundabout. For each scenario, we present the formalization as an RL problem and discuss the results of our approach in comparison with handcrafted rule-based controllers and black-box RL techniques.
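To make the approach concrete, below is a minimal, hypothetical Python sketch (not code from the paper): a single interpretable lane-change rule whose gap threshold is tuned with PGPE, i.e., a Gaussian hyper-distribution over the controller parameter is sampled, episode returns are collected, and the distribution's mean and standard deviation follow the estimated return gradient. The toy environment, reward, and the min_gap parameter are invented for illustration only.

# Hypothetical sketch: a parametric rule-based controller tuned with PGPE.
# The environment, reward, and the min_gap parameter are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

def rule_based_controller(gap_ahead, gap_target_lane, min_gap):
    # Interpretable rule: overtake only when the vehicle ahead is close
    # and the gap in the target lane exceeds the learnable threshold (metres).
    if gap_ahead < 20.0 and gap_target_lane > min_gap:
        return "change_lane"
    return "keep_lane"

def rollout(min_gap, horizon=50):
    # Toy highway episode: reward safe overtakes, penalise unsafe lane changes
    # and staying stuck behind a slow vehicle.
    ret = 0.0
    for _ in range(horizon):
        gap_ahead = rng.uniform(5.0, 60.0)
        gap_target = rng.uniform(5.0, 60.0)
        if rule_based_controller(gap_ahead, gap_target, min_gap) == "change_lane":
            ret += 1.0 if gap_target > 30.0 else -5.0
        elif gap_ahead < 20.0:
            ret -= 0.1
    return ret

# PGPE: keep a Gaussian hyper-distribution N(mu, sigma^2) over the parameter,
# sample candidate parameters, and ascend the gradient of the expected return
# with respect to mu and sigma (baseline-subtracted, normalised returns).
mu, sigma = 10.0, 5.0
alpha_mu, alpha_sigma = 1.0, 0.5
for _ in range(300):
    thetas = rng.normal(mu, sigma, size=20)
    returns = np.array([rollout(t) for t in thetas])
    adv = (returns - returns.mean()) / (returns.std() + 1e-8)
    grad_mu = np.mean(adv * (thetas - mu) / sigma**2)
    grad_sigma = np.mean(adv * ((thetas - mu)**2 - sigma**2) / sigma**3)
    mu += alpha_mu * grad_mu
    sigma = max(1e-2, sigma + alpha_sigma * grad_sigma)

print("learned gap threshold (m):", round(mu, 2))

Note the contrast with a black-box RL policy, which would learn an unconstrained mapping from observations to actions: here the learned quantity remains a single human-readable threshold, so the reason for any lane change can still be traced back to an explicit rule.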

Last updated: 2020-09-01