Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-Shot Cross-Dataset Transfer,IEEE Transactions on Pattern Analysis and Machine Intelligence



IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 23.6), Pub Date: 2020-08-27, DOI: 10.1109/tpami.2020.3019967
René Ranftl, Katrin Lasinger, David Hafner, Konrad Schindler, Vladlen Koltun


The success of monocular depth estimation relies on large and diverse training sets. Due to the challenges associated with acquiring dense ground-truth depth across different environments at scale, a number of datasets with distinct characteristics and biases have emerged. We develop tools that enable mixing multiple datasets during training, even if their annotations are incompatible. In particular, we propose a robust training objective that is invariant to changes in depth range and scale, advocate the use of principled multi-objective learning to combine data from different sources, and highlight the importance of pretraining encoders on auxiliary tasks. Armed with these tools, we experiment with five diverse training datasets, including a new, massive data source: 3D films. To demonstrate the generalization power of our approach, we use zero-shot cross-dataset transfer, i.e., we evaluate on datasets that were not seen during training. The experiments confirm that mixing data from complementary sources greatly improves monocular depth estimation. Our approach clearly outperforms competing methods across diverse datasets, setting a new state of the art for monocular depth estimation.
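The abstract does not spell out the form of the training objective, so the following is only a minimal sketch of one way a loss can be made invariant to the range and scale of the annotations: both the prediction and the target are shifted by their median and divided by their mean absolute deviation before being compared. The function names `normalize` and `scale_shift_invariant_mae` are hypothetical, and the choice of median, mean absolute deviation, and an absolute-error comparison is an assumption made for illustration, not a statement of the authors' implementation.

```python
from statistics import median
from typing import List, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Remove shift (median) and scale (mean absolute deviation) from a depth map.

    Hypothetical helper: the robust estimators used here are an assumption.
    """
    if len(values) == 0:
        raise ValueError("values must not be empty")
    shift = median(values)
    scale = sum(abs(v - shift) for v in values) / len(values)
    if scale == 0.0:
        # A constant map carries no relative depth information.
        return [0.0 for _ in values]
    return [(v - shift) / scale for v in values]


def scale_shift_invariant_mae(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute error between the normalized prediction and normalized target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    pred_n = normalize(pred)
    target_n = normalize(target)
    return sum(abs(p - t) for p, t in zip(pred_n, target_n)) / len(pred_n)


if __name__ == "__main__":
    # Flattened toy "depth maps"; real inputs would be per-pixel values.
    pred = [1.0, 2.0, 4.0, 7.0]
    target = [0.5, 1.0, 3.0, 2.0]
    rescaled_target = [3.0 * d + 5.0 for d in target]
    print(scale_shift_invariant_mae(pred, target))
    print(scale_shift_invariant_mae(pred, rescaled_target))
```

Under these assumptions, the two printed values agree up to floating-point error, because a positive rescaling and a shift of the annotations are removed by the normalization. This is the property that would let datasets with incompatible depth ranges contribute to the same objective.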


Updated: 2020-08-27
