Guided residual network for RGB-D salient object detection with efficient depth feature learning, The Visual Computer


The Visual Computer (IF 3.0), Pub Date: 2021-04-17, DOI: 10.1007/s00371-021-02106-5
Jian Wang, Shuhan Chen, Xiao Lv, Xiuqi Xu, Xuelong Hu

RGB-D salient object detection aims to identify the most visually attractive regions in an RGB image and its corresponding depth image, and has been widely applied in many computer vision tasks. However, two challenges remain: (1) how to quickly and effectively integrate the cross-modal features from RGB-D data; and (2) how to mitigate the negative impact of low-quality depth maps. Previous methods mostly employ a two-stream architecture that uses two backbone networks to process the RGB-D data and ignores the quality of the depth map. In this paper, we propose a guided residual network to address these two issues. On the one hand, we design a simpler and more efficient depth branch that uses only one convolutional layer and three residual modules to extract depth features, instead of employing a pre-trained backbone to handle the depth data, and we fuse RGB features and depth features in a multi-scale manner for refinement with top-down guidance. On the other hand, we add adaptive weights to the depth maps to control the fusion between them, which mitigates the negative influence of unreliable depth maps. Experimental comparisons with 13 state-of-the-art methods on 7 datasets demonstrate the validity of the proposed approach both quantitatively and qualitatively, especially in efficiency (102 FPS) and compactness (64.2 MB).
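
The abstract does not give the fusion formula, so the following is only a minimal sketch of the general idea of weighting depth features before fusing them with RGB features. It assumes a residual-style fusion of the form rgb + weight * depth with a scalar weight in [0, 1]; the function name `weighted_residual_fusion`, the scalar weight, and the flat feature lists are all hypothetical and are not taken from the paper.

```python
from typing import List


def weighted_residual_fusion(
    rgb_feat: List[float], depth_feat: List[float], weight: float
) -> List[float]:
    """Add depth features to RGB features, scaled by a reliability weight.

    Assumption (not from the paper): fusion is rgb + weight * depth,
    where weight is a scalar in [0, 1]. A weight of 0 ignores the depth
    features entirely; a weight of 1 adds them at full strength.
    """
    if len(rgb_feat) != len(depth_feat):
        raise ValueError("RGB and depth features must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return [r + weight * d for r, d in zip(rgb_feat, depth_feat)]


# Toy feature vectors standing in for one spatial location.
rgb = [1.0, 2.0, 3.0]
depth = [0.5, -1.0, 2.0]

trusted = weighted_residual_fusion(rgb, depth, 1.0)    # depth fully used
half = weighted_residual_fusion(rgb, depth, 0.5)       # depth down-weighted
untrusted = weighted_residual_fusion(rgb, depth, 0.0)  # depth suppressed
```

With a weight of 0 the output equals the RGB features, which illustrates how a low weight could limit the influence of an unreliable depth map. How the weight is actually computed in the paper is not stated in the abstract.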


Updated: 2021-04-18
