Deductive Reinforcement Learning for Visual Autonomous Urban Driving Navigation
IEEE Transactions on Neural Networks and Learning Systems (IF 10.2), Pub Date: 2021-09-14, DOI: 10.1109/tnnls.2021.3109284
Changxin Huang, Ronghui Zhang, Meizi Ouyang, Pengxu Wei, Junfan Lin, Jiang Su, Liang Lin

Existing deep reinforcement learning (RL) methods focus mainly on video-game applications, e.g., The Open Racing Car Simulator (TORCS) and Atari games. However, vision-based autonomous urban driving navigation (VB-AUDN) remains under-explored. VB-AUDN requires a sophisticated agent that works safely in structured, changing, and unpredictable environments; otherwise, inappropriate operations may lead to irreversible or catastrophic damage. In this work, we propose a deductive RL (DeRL) approach to address this challenge. A deduction reasoner (DR) is introduced to endow the agent with the ability to foresee the future and to promote policy learning. Specifically, DR first predicts future transitions through a parameterized environment model. Then, DR conducts self-assessment on the predicted trajectory to perceive the consequences of the current policy, resulting in a more reliable decision-making process. Additionally, a semantic encoder module (SEM) is designed to extract a compact driving representation from the raw images that is robust to changes in the environment. Extensive experimental results demonstrate that DeRL outperforms state-of-the-art model-free RL approaches on the public CAR Learning to Act (CARLA) benchmark and achieves a superior success rate and driving safety for goal-directed navigation.
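
The abstract does not say how DR is implemented, so the following is only a minimal sketch of the general idea it describes: roll a learned environment model forward under the current policy, then score the imagined trajectory. Every name here (`imagine_rollout`, `assess`, the toy `model`, `policy`, and `value_fn`) is hypothetical and chosen for illustration; the discounted-return scoring is an assumed stand-in for the paper's self-assessment, not the authors' method.

```python
from typing import Callable, List, Tuple

# Hypothetical types: a 1-D state and a 1-D action keep the sketch small.
State = float
Action = float
Step = Tuple[State, Action, float]  # (state, action, predicted reward)


def imagine_rollout(
    state: State,
    policy: Callable[[State], Action],
    model: Callable[[State, Action], Tuple[State, float]],
    horizon: int,
) -> Tuple[List[Step], State]:
    """Predict future transitions by stepping the model under the policy."""
    trajectory: List[Step] = []
    for _ in range(horizon):
        action = policy(state)
        next_state, reward = model(state, action)
        trajectory.append((state, action, reward))
        state = next_state
    return trajectory, state


def assess(
    trajectory: List[Step],
    final_state: State,
    value_fn: Callable[[State], float],
    gamma: float,
) -> float:
    """Score an imagined trajectory: discounted predicted rewards plus a
    bootstrapped value estimate for the last predicted state."""
    total = 0.0
    discount = 1.0
    for _, _, reward in trajectory:
        total += discount * reward
        discount *= gamma
    return total + discount * value_fn(final_state)


# Toy stand-ins (assumptions, not from the paper): the goal is x = 0,
# the model moves the agent by the action and penalizes distance to the goal.
def model(state: State, action: Action) -> Tuple[State, float]:
    next_state = state + action
    return next_state, -abs(next_state)


def policy(state: State) -> Action:
    return -0.5 * state  # step halfway toward the goal


def value_fn(state: State) -> float:
    return 0.0  # no learned critic in this toy example


trajectory, final_state = imagine_rollout(4.0, policy, model, horizon=3)
score = assess(trajectory, final_state, value_fn, gamma=0.5)
```

In a training loop, a score like this could be used as an extra signal when updating the policy, so that actions whose predicted consequences look poor are discouraged before they are executed in the real environment.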

Updated: 2021-09-14