Label Efficient Visual Abstractions for Autonomous Driving
arXiv - CS - Robotics. Pub Date: 2020-05-20, arXiv: 2005.10091
Aseem Behl, Kashyap Chitta, Aditya Prakash, Eshed Ohn-Bar, Andreas Geiger

It is well known that semantic segmentation can be used as an effective intermediate representation for learning driving policies. However, the task of street scene semantic segmentation requires expensive annotations. Furthermore, segmentation algorithms are often trained irrespective of the actual driving task, using auxiliary image-space loss functions which are not guaranteed to maximize driving metrics such as safety or distance traveled per intervention. In this work, we seek to quantify the impact of reducing segmentation annotation costs on learned behavior cloning agents. We analyze several segmentation-based intermediate representations. We use these visual abstractions to systematically study the trade-off between annotation efficiency and driving performance, i.e., the types of classes labeled, the number of image samples used to learn the visual abstraction model, and their granularity (e.g., object masks vs. 2D bounding boxes). Our analysis uncovers several practical insights into how segmentation-based visual abstractions can be exploited in a more label efficient manner. Surprisingly, we find that state-of-the-art driving performance can be achieved with orders of magnitude reduction in annotation cost. Beyond label efficiency, we find several additional training benefits when leveraging visual abstractions, such as a significant reduction in the variance of the learned policy when compared to state-of-the-art end-to-end driving models.
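The pipeline the abstract describes, a perception model that compresses the raw image into a segmentation-based abstraction over a small set of task-relevant classes, and a behavior-cloning policy that acts only on that abstraction, can be sketched as follows. All names, the class list, and both stub models are illustrative assumptions, not code from the paper; a real system would use trained networks for both stages.

```python
# Hypothetical sketch of the two-stage pipeline: (1) a perception model maps
# an RGB image to a compact per-pixel semantic abstraction, and (2) a
# behavior-cloning policy maps that abstraction to driving controls.
# The policy never sees raw pixels, which is what decouples annotation cost
# (for the segmentation stage) from driving performance (of the policy).

from dataclasses import dataclass
from typing import List

# A reduced label set: the paper studies how few classes (and how few
# labeled images) suffice. This particular set is an assumption.
CLASSES = ["road", "lane_marking", "vehicle", "pedestrian", "traffic_light", "other"]


@dataclass
class Controls:
    steer: float     # [-1, 1]
    throttle: float  # [0, 1]
    brake: float     # [0, 1]


def segment(image: List[List[int]]) -> List[List[int]]:
    """Stub perception model: returns a per-pixel class-index map.

    A trained segmentation network would go here; for illustration,
    every pixel is labeled 'road'."""
    road = CLASSES.index("road")
    return [[road for _ in row] for row in image]


def policy(abstraction: List[List[int]]) -> Controls:
    """Stub behavior-cloning policy over the visual abstraction.

    A real policy would be trained on expert demonstrations; this one
    just drives straight at a fixed throttle."""
    return Controls(steer=0.0, throttle=0.5, brake=0.0)


def drive_step(image: List[List[int]]) -> Controls:
    # Only the intermediate abstraction crosses the stage boundary.
    return policy(segment(image))
```

Because the two stages are decoupled, the abstraction's granularity (full masks vs. boxes), class count, and training-set size can be varied independently of the policy, which is the trade-off the paper quantifies.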

Updated: 2020-07-17