Attentive Cross-Modal Fusion Network for RGB-D Saliency Detection
IEEE Transactions on Multimedia (IF 7.3), Pub Date: 2020-01-01, DOI: 10.1109/tmm.2020.2991523
Di Liu , Kao Zhang , Zhenzhong Chen

In this paper, an attentive cross-modal fusion (ACMF) network is proposed for RGB-D salient object detection. The proposed method selectively fuses features in a cross-modal manner and uses a fusion refinement module to merge output features from different resolutions. The attentive cross-modal fusion network is built on residual attention: at each level of the ResNet outputs, both the RGB and depth features are transformed into an identity map and a weighted attention map, and each identity map is reweighted by the attention map of the paired modality. Moreover, lower-level features with higher resolution are adopted to refine the boundaries of detected targets. The whole network can be trained end to end. Experimental results show that ACMF exceeds state-of-the-art methods on five recent RGB-D salient object detection datasets, with an average gain of 9.0% in F-measure, an average gain of 6.7% in S-measure, and an average reduction of 37.2% in MAE.
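The residual cross-modal reweighting described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the attention branch here is a simple channel-averaged sigmoid gate standing in for the learned attention module, and the function name `acmf_fuse` is hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def acmf_fuse(rgb_feat, depth_feat):
    """Residual cross-modal fusion sketch for features of shape (C, H, W).

    Each modality contributes an identity map (the features themselves)
    and an attention map; the identity map of one modality is reweighted
    by the attention map of the *paired* modality, in residual form.
    """
    # Stand-in attention maps: channel-averaged sigmoid gates, shape (1, H, W).
    rgb_attn = sigmoid(rgb_feat.mean(axis=0, keepdims=True))
    depth_attn = sigmoid(depth_feat.mean(axis=0, keepdims=True))

    # Cross-modal residual reweighting: output = identity + identity * paired attention.
    rgb_out = rgb_feat + rgb_feat * depth_attn
    depth_out = depth_feat + depth_feat * rgb_attn
    return rgb_out, depth_out
```

Note the residual form: even where the paired attention is near zero, the identity map passes through unchanged, which matches the residual-attention design the abstract describes.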

Updated: 2020-01-01