Perceptual Monocular Depth Estimation, Neural Processing Letters

Perceptual Monocular Depth Estimation
Neural Processing Letters (IF 3.1), Pub Date: 2021-02-10, DOI: 10.1007/s11063-021-10437-6
Janice Pan, Alan C. Bovik

Monocular depth estimation (MDE), which is the task of using a single image to predict scene depths, has gained considerable interest, in large part owing to the popularity of applying deep learning methods to solve “computer vision problems”. Monocular cues provide sufficient data for humans to instantaneously extract an understanding of scene geometries and relative depths, which is evidence of both the processing power of the human visual system and the predictive power of the monocular data. However, developing computational models to predict depth from monocular images remains challenging. Hand-designed MDE features do not perform particularly well, and even current “deep” models are still evolving. Here we propose a novel approach that uses perceptually-relevant natural scene statistics (NSS) features to predict depths from monocular images in a simple, scale-agnostic way that is competitive with state-of-the-art systems. While the statistics of natural photographic images have been successfully used in a variety of image and video processing, analysis, and quality assessment tasks, they have never been applied in a predictive end-to-end deep-learning model for monocular depth. Correspondingly, no previous work has explicitly incorporated perceptual features in a monocular depth-prediction approach. Here we accomplish this by developing a new closed-form bivariate model of image luminances and using features extracted from this model and from other NSS models to drive a novel deep learning framework for predicting depth given a single image.
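The abstract does not say which NSS features the authors extract or what form their bivariate model takes. As a rough illustration only, the sketch below computes mean-subtracted, contrast-normalized (MSCN) coefficients, a transform commonly used in NSS-based image quality work, and then collects pairs of horizontally adjacent coefficients, which are the kind of samples a bivariate luminance model could be fitted to. The choice of MSCN, the box window (a Gaussian window is more usual), the constant `c`, and all function names here are assumptions made for this example, not details taken from the paper.

```python
import math

def local_stats(img, radius=1):
    """Local mean and standard deviation over a (2r+1)x(2r+1) box window.

    Assumption: a plain box window clipped at the image borders; NSS work
    more commonly uses a Gaussian-weighted window.
    """
    h, w = len(img), len(img[0])
    mu = [[0.0] * w for _ in range(h)]
    sigma = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - radius), min(h, y + radius + 1))
                    for i in range(max(0, x - radius), min(w, x + radius + 1))]
            m = sum(vals) / len(vals)
            mu[y][x] = m
            sigma[y][x] = math.sqrt(sum((v - m) ** 2 for v in vals) / len(vals))
    return mu, sigma

def mscn(img, radius=1, c=1.0):
    """Mean-subtracted, contrast-normalized coefficients of a luminance image.

    `c` is a small stabilizing constant (hypothetical value) that avoids
    division by zero in flat regions.
    """
    mu, sigma = local_stats(img, radius)
    return [[(img[y][x] - mu[y][x]) / (sigma[y][x] + c)
             for x in range(len(img[0]))]
            for y in range(len(img))]

def horizontal_pairs(coeffs):
    """Pairs of horizontally adjacent coefficients (bivariate samples)."""
    return [(row[x], row[x + 1]) for row in coeffs for x in range(len(row) - 1)]

def pair_correlation(pairs):
    """Pearson correlation of the pairs; returns 0.0 if either side is constant."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0.0 or var_b == 0.0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

if __name__ == "__main__":
    # Tiny synthetic luminance patch (made-up values, for illustration only).
    patch = [[10.0, 12.0, 11.0, 40.0],
             [11.0, 13.0, 12.0, 42.0],
             [10.0, 12.0, 11.0, 41.0]]
    pairs = horizontal_pairs(mscn(patch))
    print(len(pairs), pair_correlation(pairs))
```

In a pipeline like the one the abstract describes, summary statistics of such coefficients and their pairwise dependencies would be the sort of perceptual features fed to a depth-prediction network. How the authors actually parameterize and use their model is described in the paper itself.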


Updated: 2021-02-10
