

Hierarchical Object Relationship Constrained Monocular Depth Estimation
Pattern Recognition (IF 7.5), Pub Date: 2021-06-24, DOI: 10.1016/j.patcog.2021.108116
Shuai Li, Jiaying Shi, Wenfeng Song, Aimin Hao, Hong Qin

Monocular depth estimation has gained momentum in recent years. Despite significant advances, accurately predicting depth in scenes with complicated, cluttered spatial arrangements of objects remains challenging, because contextual cues are inherently difficult to capture reliably from RGB images. Instead of relying only on the primary features of a single RGB image, in this paper we propose a hierarchical object relationship constrained network for monocular depth estimation that enables accurate and smooth depth prediction from a monocular RGB image. The key idea of our method is to exploit object-centric hierarchical relationships as contextual constraints to compensate for the regularity of spatial depth changes. In particular, we design a semantics-guided CNN that simultaneously encodes the original image into a global context feature map and the objects' relationships into a local relationship feature map, so that this effective and consolidated coding scheme across scene samples can guide depth prediction more accurately. Benefiting from the local-to-global context constraints, our method respects global depth changes while preserving local depth details. In addition, our approach makes full use of the hierarchical semantic relationships among inner-object components and between neighboring objects to define depth-change constraints. We conduct extensive experiments and comprehensive evaluations on widely used public datasets, and the results confirm that our method outperforms most state-of-the-art depth estimation methods in preserving local depth details.
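The abstract does not give the form of the depth-change constraints, so the following is only a minimal sketch of one way an object-aware constraint could look, not the authors' formulation. It assumes a per-pixel object label map is available and penalizes depth differences only between neighboring pixels that belong to the same object, so depth jumps across object boundaries are not smoothed away. The function name `object_aware_smoothness` and the toy inputs are hypothetical.

```python
def object_aware_smoothness(depth, labels):
    """Mean absolute depth difference over 4-connected neighbor pairs
    that share the same object label (hypothetical illustration only).

    depth  -- 2D list of floats, predicted depth per pixel
    labels -- 2D list of ints, object id per pixel (same shape as depth)
    """
    rows, cols = len(depth), len(depth[0])
    total, count = 0.0, 0
    for r in range(rows):
        for c in range(cols):
            # Compare each pixel with its right and bottom neighbors only,
            # so every adjacent pair is visited exactly once.
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols and labels[r][c] == labels[nr][nc]:
                    total += abs(depth[r][c] - depth[nr][nc])
                    count += 1
    # No same-object pairs means there is nothing to penalize.
    return total / count if count else 0.0


# Toy scene: a near object (label 0) next to a farther object (label 1).
depth = [[1.0, 1.0, 3.0],
         [1.0, 1.2, 3.0]]
labels = [[0, 0, 1],
          [0, 0, 1]]
uniform = [[0, 0, 0],
           [0, 0, 0]]

penalty_objects = object_aware_smoothness(depth, labels)
penalty_uniform = object_aware_smoothness(depth, uniform)
```

With the object labels, the jump between the two objects is ignored and only the small variation inside the near object contributes. With a single uniform label, the same jump is penalized as if it were noise, which is the kind of over-smoothing that object-level constraints are meant to avoid.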


Updated: 2021-07-12
