CNN-based encoder-decoder networks for salient object detection: A comprehensive review and recent advances, Information Sciences



CNN-based encoder-decoder networks for salient object detection: A comprehensive review and recent advances
Information Sciences, Pub Date: 2020-09-09, DOI: 10.1016/j.ins.2020.09.003
Yuzhu Ji, Haijun Zhang, Zhao Zhang, Ming Liu

Convolutional neural network (CNN)-based encoder-decoder models have strongly influenced recent work on salient object detection (SOD). Although encoder-decoder models have developed rapidly for most pixel-level dense prediction tasks, no empirical study has yet evaluated a large body of these models on SOD tasks. In this paper, instead of limiting our survey to SOD methods, we take a broader view and review the fundamental architectures of the key modules and structures in CNN-based encoder-decoder models for pixel-level dense prediction tasks. We then focus on performing SOD with deep encoder-decoder models and present an extensive empirical study of baseline encoder-decoder models across different encoder backbones, loss functions, training batch sizes, and attention structures. State-of-the-art encoder-decoder models adopted from semantic segmentation, as well as deep CNN-based SOD models, are also investigated. The study identified new baseline models that can surpass state-of-the-art performance. These newly identified baseline models were further evaluated on three video-based SOD benchmark datasets. Experimental results demonstrate their effectiveness on both image- and video-based SOD tasks. The study concludes with a comprehensive summary that offers suggestions for future directions.


Updated: 2020-09-09
