Residual Learning for Salient Object Detection
IEEE Transactions on Image Processing (IF 10.6), Pub Date: 2020-02-28, DOI: 10.1109/tip.2020.2975919
Mengyang Feng, Huchuan Lu, Yizhou Yu

Recent deep-learning-based salient object detection methods improve performance by introducing multi-scale strategies into fully convolutional neural networks (FCNs). The final result is obtained by integrating the predictions from all scales. However, existing multi-scale methods suffer from two problems: 1) it is difficult to directly learn discriminative features and filters that regress high-resolution saliency masks at each scale; 2) rescaling the multi-scale features can pull in many redundant and inaccurate values, which weakens the representational ability of the network. In this paper, we propose a residual learning strategy that gradually refines the coarse prediction scale by scale. Concretely, instead of directly predicting the finest-resolution result at each scale, we learn to predict residuals that remedy the errors between the coarse saliency map and the scale-matching ground truth masks. We employ a Dilated Convolutional Pyramid Pooling (DCPP) module to generate the coarse prediction and guide the residual learning process through several novel Attentional Residual Modules (ARMs). We name our network the Residual Refinement Network (R²Net). We demonstrate the effectiveness of the proposed method against other state-of-the-art algorithms on five released benchmark datasets. R²Net is a fully convolutional network that needs no post-processing and achieves a real-time speed of 33 FPS when run on one GPU.
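The scale-by-scale residual idea in the abstract can be illustrated with a small toy sketch. This is not the authors' R²Net: the learned DCPP and ARM modules are replaced by a fixed coarse map and an "oracle" residual computed from the ground truth, only to show what target a residual predictor would be trained towards. The nearest-neighbour upsampling, the 2x2 average pooling used to build scale-matching masks, the clipping to [0, 1], and all function names are assumptions made for illustration.

```python
def upsample2x(m):
    """Nearest-neighbour 2x upsampling of a 2D list (assumed; the paper may differ)."""
    out = []
    for row in m:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def downsample2x(m):
    """2x2 average pooling; assumes even height and width."""
    return [[(m[i][j] + m[i][j + 1] + m[i + 1][j] + m[i + 1][j + 1]) / 4.0
             for j in range(0, len(m[0]), 2)]
            for i in range(0, len(m), 2)]


def residual_target(coarse, gt_at_scale):
    """Residual that would correct the upsampled coarse map at this scale."""
    up = upsample2x(coarse)
    return [[g - u for g, u in zip(g_row, u_row)]
            for g_row, u_row in zip(gt_at_scale, up)]


def refine(coarse, residual):
    """Upsample the coarse map, add the residual, and clip to [0, 1]."""
    up = upsample2x(coarse)
    return [[min(1.0, max(0.0, u + r)) for u, r in zip(u_row, r_row)]
            for u_row, r_row in zip(up, residual)]


# Hypothetical 4x4 ground truth mask and its scale-matching 2x2 version.
gt = [[0.0, 0.0, 1.0, 1.0],
      [0.0, 1.0, 1.0, 1.0],
      [0.0, 0.0, 1.0, 1.0],
      [0.0, 0.0, 0.0, 1.0]]
gt_pyramid = [downsample2x(gt), gt]  # coarse-to-fine targets

pred = [[0.5]]  # stand-in for a coarse 1x1 prediction
for gt_s in gt_pyramid:
    # In a trained network the residual would be predicted from features;
    # here it is computed from the ground truth (oracle) for illustration.
    pred = refine(pred, residual_target(pred, gt_s))
```

With oracle residuals the refined map matches the ground truth at every scale, which is the sense in which each stage only has to model the remaining error rather than the full high-resolution mask.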

Updated: 2020-04-22