RGB-D Salient Object Detection With Ubiquitous Target Awareness
IEEE Transactions on Image Processing (IF 10.6), Pub Date: 2021-09-03, DOI: 10.1109/tip.2021.3108412
Yifan Zhao, Jiawei Zhao, Jia Li, Xiaowu Chen

Conventional RGB-D salient object detection (SOD) methods aim to leverage depth as complementary information to find the salient regions in both modalities. However, the detection results rely heavily on the quality of the captured depth data, which is sometimes unavailable. In this work, we make the first attempt to solve the RGB-D salient object detection problem with a novel depth-awareness framework. This framework relies only on RGB data in the testing phase and uses captured depth data as supervision for representation learning. To construct our framework and achieve accurate saliency detection results, we propose a Ubiquitous Target Awareness (UTA) network that addresses three important challenges in the RGB-D SOD task: 1) a depth awareness module that excavates depth information and mines ambiguous regions via adaptive depth-error weights, 2) a spatial-aware cross-modal interaction and a channel-aware cross-level interaction, which exploit low-level boundary cues and amplify high-level salient channels, and 3) a gated multi-scale predictor module that perceives object saliency at different contextual scales. Besides its high performance, our proposed UTA network is depth-free at inference and runs in real time at 43 FPS. Experimental evidence demonstrates that our proposed network not only surpasses the state-of-the-art methods on five public RGB-D SOD benchmarks by a large margin, but also demonstrates its extensibility on five public RGB SOD benchmarks.
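
The abstract does not give the loss formulation, so the following is only a minimal sketch of the general idea of using depth as a training-time supervision signal and re-weighting the saliency loss by a depth-prediction error. The weighting rule `1 + alpha * |error|`, the L1 depth term, and all function and parameter names (`depth_error_weights`, `weighted_bce`, `training_loss`, `alpha`, `lam`) are assumptions made for illustration, not the UTA network's actual definitions. Images are represented as flat lists of per-pixel values in [0, 1].

```python
import math


def depth_error_weights(pred_depth, true_depth, alpha=1.0):
    """Per-pixel weights that grow with the absolute depth-prediction error.

    Assumed rule (not from the paper): w = 1 + alpha * |pred - true|.
    """
    return [1.0 + alpha * abs(p - t) for p, t in zip(pred_depth, true_depth)]


def weighted_bce(pred_sal, true_sal, weights, eps=1e-7):
    """Weighted binary cross-entropy, normalised by the sum of the weights."""
    total = 0.0
    for p, y, w in zip(pred_sal, true_sal, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return total / sum(weights)


def training_loss(pred_sal, true_sal, pred_depth, true_depth, alpha=1.0, lam=1.0):
    """Saliency loss re-weighted by depth error, plus an L1 depth term.

    Captured depth (true_depth) is needed only here, at training time.
    """
    weights = depth_error_weights(pred_depth, true_depth, alpha)
    sal_loss = weighted_bce(pred_sal, true_sal, weights)
    depth_loss = sum(abs(p - t) for p, t in zip(pred_depth, true_depth)) / len(true_depth)
    return sal_loss + lam * depth_loss


if __name__ == "__main__":
    # Two pixels: the second has a large depth error, so its saliency
    # error contributes more to the loss.
    loss = training_loss(
        pred_sal=[0.9, 0.4], true_sal=[1.0, 1.0],
        pred_depth=[0.5, 0.2], true_depth=[0.5, 0.7],
    )
    print(f"training loss: {loss:.4f}")
```

In this sketch, only `training_loss` touches the captured depth map; a model trained this way would need just the RGB input to produce `pred_sal` at test time, which mirrors the depth-free inference described in the abstract.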

Updated: 2021-09-14