Self-Supervised Monocular Depth Estimation: Solving the Dynamic Object Problem by Semantic Guidance
arXiv - CS - Computer Vision and Pattern Recognition Pub Date : 2020-07-14 , DOI: arxiv-2007.06936
Marvin Klingner, Jan-Aike Termöhlen, Jonas Mikolajczyk, Tim Fingscheidt

Self-supervised monocular depth estimation presents a powerful method to obtain 3D scene information from single camera images, which is trainable on arbitrary image sequences without requiring depth labels, e.g., from a LiDAR sensor. In this work we present a new self-supervised semantically-guided depth estimation (SGDepth) method to deal with moving dynamic-class (DC) objects, such as moving cars and pedestrians, which violate the static-world assumptions typically made during training of such models. Specifically, we propose (i) mutually beneficial cross-domain training of (supervised) semantic segmentation and self-supervised depth estimation with task-specific network heads, (ii) a semantic masking scheme providing guidance to prevent moving DC objects from contaminating the photometric loss, and (iii) a detection method for frames with non-moving DC objects, from which the depth of DC objects can be learned. We demonstrate the performance of our method on several benchmarks, in particular on the Eigen split, where we exceed all baselines without test-time refinement.
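
The masking idea in (ii) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `masked_photometric_loss`, the class ids, and the plain L1 error used as the photometric term are all assumptions made for illustration, and the sketch masks every pixel of a dynamic class without checking whether the object is actually moving. It only shows how pixels labeled as a dynamic class by a segmentation map can be left out of the loss so they do not influence training.

```python
from typing import List, Set

Image = List[List[float]]      # grayscale intensities, H x W
LabelMap = List[List[int]]     # per-pixel semantic class ids, H x W


def masked_photometric_loss(target: Image, warped: Image,
                            labels: LabelMap, dc_classes: Set[int]) -> float:
    """Mean absolute error over pixels NOT labeled as a dynamic class.

    Hypothetical helper: a plain L1 error stands in for whatever
    photometric loss the paper actually uses.
    """
    total = 0.0
    kept = 0
    for t_row, w_row, l_row in zip(target, warped, labels):
        for t, w, label in zip(t_row, w_row, l_row):
            if label in dc_classes:
                continue  # skip dynamic-class pixels entirely
            total += abs(t - w)
            kept += 1
    # If every pixel was masked, there is nothing to penalize.
    return total / kept if kept else 0.0


# Toy 2x2 example. Assumed class ids: 0 = road (static), 1 = car (dynamic class).
target = [[0.2, 0.4], [0.6, 0.8]]
warped = [[0.2, 0.9], [0.5, 0.8]]   # the car pixel is badly reconstructed
labels = [[0, 1], [0, 0]]

loss_masked = masked_photometric_loss(target, warped, labels, dc_classes={1})
loss_unmasked = masked_photometric_loss(target, warped, labels, dc_classes=set())
```

In this toy case the large error on the car pixel dominates the unmasked loss, while the masked loss reflects only the static pixels.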

Updated: 2020-07-22