Depth prediction from 2D images: A taxonomy and an evaluation study
Image and Vision Computing (IF 4.2), Pub Date: 2019-11-10, DOI: 10.1016/j.imavis.2019.11.003
Ambroise Moreau, Matei Mancas, Thierry Dutoit

Among the various cues that help us understand and interact with our surroundings, depth is of particular importance. It allows us to move in space and grab objects to complete different tasks. Depth prediction has therefore been an active research field for decades, and many algorithms have been proposed to retrieve depth. Some imitate human vision and compute depth through triangulation on correspondences found between pixels or handcrafted features in different views of the same scene. Others rely on simple assumptions and semantic knowledge of the structure of the scene to obtain depth information. Recently, numerous algorithms based on deep learning have emerged from the computer vision community. They implement the same principles as the non-deep-learning methods and leverage the ability of deep neural networks to automatically learn features that help solve the task. By doing so, they produce new state-of-the-art results and show encouraging prospects. In this article, we propose a taxonomy of deep learning methods for depth prediction from 2D images. We retained the training strategy as the sorting criterion. Some methods are trained in a supervised manner, which means depth labels are needed during training, while others are trained in an unsupervised manner. In the unsupervised case, the models learn to perform a different task, such as view synthesis, and depth is only a by-product of this learning. In addition to this taxonomy, we also evaluate nine models on two similar datasets without retraining. Our analysis showed that (i) most models are sensitive to sharp discontinuities created by shadows or colour contrasts and (ii) the post-processing applied to the results before computing the commonly used metrics can change the model ranking. Moreover, we showed that most metrics agree with each other and are thus redundant.
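To make finding (ii) concrete, the sketch below shows how a post-processing step can flip a ranking. The abstract does not name the metrics or the post-processing it refers to, so everything here is an assumption: the metrics are absolute relative error, RMSE, and threshold accuracy (δ < 1.25), which are widely used in the depth prediction literature, and the post-processing is per-image median scaling. The function names `depth_metrics` and `median_scale` and the toy arrays are hypothetical and are not taken from the paper.

```python
import numpy as np


def depth_metrics(pred, gt):
    """Compute three common depth metrics over valid pixels (gt > 0).

    Assumption: these are illustrative choices, not necessarily the
    metrics used in the paper.
    """
    mask = gt > 0                      # ignore pixels without ground truth
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)
    rmse = np.sqrt(np.mean((p - g) ** 2))
    ratio = np.maximum(p / g, g / p)   # symmetric ratio, always >= 1
    delta1 = np.mean(ratio < 1.25)     # fraction of pixels within 25%
    return {"abs_rel": float(abs_rel), "rmse": float(rmse), "delta1": float(delta1)}


def median_scale(pred, gt):
    """Rescale a prediction so its median matches the ground-truth median.

    Assumption: this stands in for "post-processing"; the paper may use
    a different procedure.
    """
    mask = gt > 0
    return pred * (np.median(gt[mask]) / np.median(pred[mask]))


# Toy ground truth; 0.0 marks a pixel with no valid depth.
gt = np.array([[2.0, 4.0], [8.0, 0.0]])
# Hypothetical model A: correct structure, but half the true scale.
pred_a = np.array([[1.0, 2.0], [4.0, 1.0]])
# Hypothetical model B: roughly correct scale, but distorted structure.
pred_b = np.array([[2.5, 3.5], [9.0, 1.0]])

for name, pred in (("A", pred_a), ("B", pred_b)):
    raw = depth_metrics(pred, gt)
    scaled = depth_metrics(median_scale(pred, gt), gt)
    print(name, "raw abs_rel:", raw["abs_rel"], "scaled abs_rel:", scaled["abs_rel"])
```

In this toy setup, model B has the lower absolute relative error on raw predictions, while model A has the lower error once both are median-scaled, so the ranking depends on whether the scaling step is applied.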




Updated: 2019-11-10