Image saliency detection via multi-scale iterative CNN

Huang, Kun; Gao, Shenghua

doi:10.1007/s00371-019-01734-2

Original Article
Published: 06 August 2019

Volume 36, pages 1355–1367 (2020)

The Visual Computer

Abstract

Salient object detection has received increasing attention and made significant progress in recent years, owing to the powerful features learned by deep convolutional neural networks (CNNs). In this work, we propose a multi-scale iterative CNN for salient object detection, which consists of two complementary subnetworks operating at different spatial scales. For each subnetwork, we augment the CNN structure with an iterative learning process: the early stages of the CNN give a rough estimate of the saliency map, and the remaining errors are then gradually learned to refine it. Merging the predictions of the two subnetworks significantly reduces the training error and makes the estimated saliency map more accurate. Unlike some previous CNN-based methods, which often rely on superpixel segmentation, the proposed model is built entirely from CNNs and can therefore estimate the saliency map much more efficiently. Extensive experiments on standard benchmarks demonstrate that our method outperforms some state-of-the-art methods in both accuracy and speed, and performs as well as some recent state-of-the-art end-to-end methods under fair settings.
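The iterative idea in the abstract (a rough estimate followed by learned corrections, then a merge across two scales) can be illustrated with a toy NumPy sketch. This is not the authors' network: the stage function `half_error_stage` is a hypothetical stand-in for a learned CNN stage, and merging by weighted average in `merge_scales` is an assumption, since the abstract only says the two predictions are merged.

```python
import numpy as np

def iterative_refine(initial, stages):
    """Refine a saliency map by adding one predicted correction per stage."""
    estimate = initial
    history = [estimate]
    for predict_residual in stages:
        # Each stage estimates the remaining error and adds it to the map.
        estimate = np.clip(estimate + predict_residual(estimate), 0.0, 1.0)
        history.append(estimate)
    return estimate, history

def merge_scales(coarse, fine, weight=0.5):
    """Blend a low-resolution map with a high-resolution one.

    Assumption: a plain weighted average after nearest-neighbour upsampling.
    """
    fy = fine.shape[0] // coarse.shape[0]
    fx = fine.shape[1] // coarse.shape[1]
    upsampled = np.kron(coarse, np.ones((fy, fx)))
    return weight * upsampled + (1.0 - weight) * fine

# Toy ground truth: a 4x4 salient square inside an 8x8 image.
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0

# Hypothetical stand-in for a learned stage: it removes half of the remaining
# error. A real stage would predict the residual from image features instead.
def half_error_stage(estimate):
    return 0.5 * (target - estimate)

fine, history = iterative_refine(np.full((8, 8), 0.5), [half_error_stage] * 3)
errors = [float(np.abs(h - target).mean()) for h in history]
print(errors)  # the mean absolute error halves at every stage

coarse = np.zeros((4, 4))
coarse[1:3, 1:3] = 1.0
merged = merge_scales(coarse, fine)
```

In this toy setting the coarse map happens to match the target exactly, so the merged map has a lower error than the fine map alone; with real subnetworks the benefit would depend on how complementary the two scales are.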

Notes

Note that iterative error correction has been a common practice in other fields such as automatic control and signal processing for decades.
Essentially, S is a weighted sum of the output feature maps of all convolutional filters at each stage.
Note that the result of the MC method [19] on MSRA-10k is not reported, since it is fine-tuned on MSRA-10k (after pretraining) with an 8000/2000 train/validation split.
Note that DISC is also trained on MSRA-10k with a 9000/1000 split, so we compare with it on ECSSD, SED1 and Pascal-1500 rather than on the MSRA-10k dataset.
The result of the non-CRF version of DSS is not provided.
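The note describing S as a weighted sum of feature maps can be read as a 1x1 convolution with a single output channel. The sketch below shows that reading in NumPy; the function name `weighted_sum_map`, the `(C, H, W)` layout, and the optional bias are illustrative assumptions, and the note does not say how the weights are obtained.

```python
import numpy as np

def weighted_sum_map(feature_maps, weights, bias=0.0):
    """Collapse C feature maps of shape (C, H, W) into one (H, W) map."""
    # Contract the channel axis: S[y, x] = sum_c weights[c] * feature_maps[c, y, x].
    return np.tensordot(weights, feature_maps, axes=([0], [0])) + bias

# Three toy 2x2 feature maps and one weight per map.
features = np.arange(12, dtype=float).reshape(3, 2, 2)
weights = np.array([1.0, 0.0, -1.0])
S = weighted_sum_map(features, weights)

# The same result written as an explicit sum over channels.
S_loop = sum(w * f for w, f in zip(weights, features))
```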

References

Wu, R., Yu, Y., Wang, W.: Scale: supervised and cascaded Laplacian eigenmaps for visual object recognition based on nearest neighbors. In: CVPR, pp. 867–874 (2013)
Lv, X., Zou, D., Zhang, L., Jia, S.: Feature coding for image classification based on saliency detection and fuzzy reasoning and its application in elevator videos. WSEAS Trans. Comput. 13(1), 266–276 (2014)
Wei, Y., Liang, X., Chen, Y., Shen, X., Cheng, M.-M., Feng, J., Zhao, Y., Yan, S.: Stc: a simple to complex framework for weakly-supervised semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(11), 2314–2320 (2017)
Itti, L., Koch, C., Niebur, E.: A model of saliency-based visual attention for rapid scene analysis. IEEE Trans. Pattern Anal. Mach. Intell. 20(11), 1254–1259 (1998)
Bharath, R., Jian Nicholas, L.Z., Cheng, X.: Scalable scene understanding using saliency-guided object localization. In: 2013 10th IEEE International Conference on Control and Automation (ICCA), pp. 1503–1508 (2013)
Hadizadeh, H., Bajic, I.V.: Saliency-aware video compression. IEEE Trans. Image Process. 23(1), 19–33 (2014)
Liu, T., Yuan, Z., Sun, J., Wang, J., Zheng, N., Tang, X., Shum, H.-Y.: Learning to detect a salient object. IEEE Trans. Pattern Anal. Mach. Intell. 33(2), 353–367 (2011)
Shen, X., Wu, Y.: A unified approach to salient object detection via low rank matrix recovery. In: CVPR, IEEE, pp. 853–860 (2012)
Judd, T., Ehinger, K., Durand, F., Torralba, A.: Learning to predict where humans look. In: ICCV, IEEE, pp. 2106–2113 (2009)
Perazzi, F., Krähenbühl, P., Pritch, Y., Hornung, A.: Saliency filters: contrast based filtering for salient region detection. In: CVPR, IEEE, pp. 733–740 (2012)
Cheng, M., Mitra, N.J., Huang, X., Torr, P.H.S., Hu, S.: Global contrast based salient region detection. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 569–582 (2015)
Zhu, W., Liang, S., Wei, Y., Sun, J.: Saliency optimization from robust background detection. In: CVPR, IEEE, pp. 2814–2821 (2014)
Jiang, H., Wang, J., Yuan, Z., Wu, Y., Zheng, N., Li, S.: Salient object detection: a discriminative regional feature integration approach. In: CVPR, IEEE, pp. 2083–2090 (2013)
Borji, A., Cheng, M.-M., Jiang, H., Li, J.: Salient object detection: a benchmark. IEEE Trans. Image Process. 24(12), 5706–5722 (2015)
Razavian, A.S., Azizpour, H., Sullivan, J., Carlsson, S.: Cnn features off-the-shelf: an astounding baseline for recognition. In: 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 512–519 (2014)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: CVPR, pp. 770–778 (2016)
Li, G., Yu, Y.: Visual saliency based on multiscale deep features. In: CVPR, IEEE, pp. 5455–5463 (2015)
Wang, L., Lu, H., Ruan, X., Yang, M.-H.: Deep networks for saliency detection via local estimation and global search. In: CVPR, IEEE, pp. 3183–3192 (2015)
Zhao, R., Ouyang, W., Li, H., Wang, X.: Saliency detection by multi-context deep learning. In: CVPR, IEEE, pp. 1265–1274 (2015)
Chen, T., Lin, L., Liu, L., Luo, X., Li, X.: Disc: deep image saliency computing via progressive representation learning. IEEE Trans. Neural Netw. Learn. Syst. PP(99), 1–15 (2016)
Yan, Q., Xu, L., Shi, J., Jia, J.: Hierarchical saliency detection. In: CVPR, IEEE, pp. 1155–1162 (2013)
Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: ECCV, Springer, pp. 818–833 (2014)
Srivastava, R.K., Greff, K., Schmidhuber, J.: Training very deep networks. In: Advances in Neural Information Processing Systems, pp. 2368–2376 (2015)
Zhang, J., Shan, S., Kan, M., Chen, X.: Coarse-to-fine auto-encoder networks (CFAN) for real-time face alignment. In: ECCV, Springer, pp. 1–16 (2014)
Carreira, J., Agrawal, P., Fragkiadaki, K., Malik, J.: Human pose estimation with iterative error feedback. In: CVPR, pp. 4733–4742 (2016)
Achanta, R., Hemami, S., Estrada, F., Susstrunk, S.: Frequency-tuned salient region detection. In: CVPR, IEEE, pp. 1597–1604 (2009)
Hou, X., Zhang, L.: Saliency detection: a spectral residual approach. In: CVPR, pp. 1–8 (2007)
Goferman, S., Zelnik-Manor, L., Tal, A.: Context-aware saliency detection. IEEE Trans. Pattern Anal. Mach. Intell. 34(10), 1915–1926 (2012)
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the ACM International Conference on Multimedia, ACM, pp. 675–678 (2014)
Farabet, C., Couprie, C., Najman, L., LeCun, Y.: Learning hierarchical features for scene labeling. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1915–1929 (2013)
Huang, X., Shen, C., Boix, X., Zhao, Q.: Salicon: reducing the semantic gap in saliency prediction by adapting deep neural networks. In: ICCV, pp. 262–270 (2015)
Chen, L.-C., Yang, Y., Wang, J., Xu, W., Yuille, A.L.: Attention to scale: scale-aware semantic image segmentation. In: CVPR, pp. 3640–3649 (2016)
Liu, N., Han, J., Yang, M.-H.: Picanet: learning pixel-wise contextual attention for saliency detection. In: CVPR (2018)
Wang, T., Zhang, L., Wang, S., Lu, H., Yang, G., Ruan, X., Borji, A.: Detect globally, refine locally: a novel approach to saliency detection. In: CVPR (2018)
Wang, T., Borji, A., Zhang, L., Zhang, P., Lu, H.: A stagewise refinement model for detecting salient objects in images. In: ICCV (2017)
Zhang, P., Wang, D., Lu, H., Wang, H., Yin, B.: Learning uncertain convolutional features for accurate saliency detection. In: ICCV (2017)
Chen, X., Zheng, A., Li, J., Lu, F.: Look, perceive and segment: finding the salient objects in images via two-stream fixation-semantic CNNS. In: ICCV (2017)
Luo, Z., Mishra, A., Achkar, A., Eichel, J., Li, S., Jodoin, P.-M.: Non-local deep features for salient object detection. In: CVPR (2017)
Hariharan, B., Arbeláez, P., Girshick, R., Malik, J.: Hypercolumns for object segmentation and fine-grained localization. In: CVPR, IEEE, pp. 447–456 (2015)
Li, G., Yu, Y.: Deep contrast learning for salient object detection. In: CVPR, pp. 478–487 (2016)
Hu, P., Shuai, B., Liu, J., Wang, G.: Deep level sets for salient object detection. In: CVPR (2017)
Zhang, P., Wang, D., Lu, H., Wang, H., Ruan, X.: Amulet: aggregating multi-level convolutional features for salient object detection. In: ICCV, pp. 202–211 (2017)
Hou, Q., Cheng, M.-M., Hu, X., Borji, A., Tu, Z., Torr, P.H.: Deeply supervised salient object detection with short connections. In: CVPR, pp. 3203–3212 (2017)
Zhang, L., Dai, J., Lu, H., He, Y., Wang, G.: A bi-directional message passing model for salient object detection. In: CVPR, pp. 1741–1750 (2018)
Deng, Z., Hu, X., Zhu, L., Xu, X., Qin, J., Han, G., Heng, P.-A.: R3net: recurrent residual refinement network for saliency detection. In: IJCAI, AAAI Press, pp. 684–690 (2018)
Hu, X., Zhu, L., Qin, J., Fu, C.-W., Heng, P.-A.: Recurrently aggregating deep features for salient object detection. In: AAAI (2018)
Li, X., Yang, F., Cheng, H., Liu, W., Shen, D.: Contour knowledge transfer for salient object detection. In: ECCV, pp. 355–370 (2018)
Chen, S., Tan, X., Wang, B., Hu, X.: Reverse attention for salient object detection. In: ECCV, pp. 234–250 (2018)
Li, G., Xie, Y., Lin, L., Yu, Y.: Instance-level salient object segmentation. In: CVPR (2017)
Hou, Q., Cheng, M.-M., Hu, X., Borji, A., Tu, Z., Torr, P. H.: Deeply supervised salient object detection with short connections. IEEE Trans. Pattern Anal. Mach. Intell. 41(1), 1–1
Xie, S., Tu, Z.: Holistically-nested edge detection. In: ICCV, pp. 1395–1403 (2015)
Zhao, T., Wu, X.: Pyramid feature attention network for saliency detection. In: CVPR (2019)
Li, Z., Lang, C., Chen, Y., Liew, J., Feng, J.: Deep reasoning with multi-scale context for salient object detection. arXiv preprint arXiv:1901.08362 (2019)
Zhou, S., Wang, J., Wang, F., Huang, D.: Se2net: siamese edge-enhancement network for salient object detection. arXiv preprint arXiv:1904.00048 (2019)
Liu, Y., Fan, D.-P., Nie, G.-Y., Zhang, X., Petrosyan, V., Cheng, M.-M.: Dna: deeply-supervised nonlinear aggregation for salient object detection. arXiv preprint arXiv:1903.12476 (2019)
Hu, X., Fu, C.-W., Zhu, L., Heng, P.-A.: Sac-net: spatial attenuation context for salient object detection. arXiv preprint arXiv:1903.10152 (2019)
Alpert, S., Galun, M., Brandt, A., Basri, R.: Image segmentation by probabilistic bottom-up aggregation and cue integration. IEEE Trans. Pattern Anal. Mach. Intell. 34(2), 315–327 (2012)
Zou, W., Kpalma, K., Liu, Z., Ronsin, J.: Segmentation driven low-rank matrix recovery for saliency detection. In: BMVC, pp. 1–13 (2013)
Yang, C., Zhang, L., Lu, H., Ruan, X., Yang, M.-H.: Saliency detection via graph-based manifold ranking. In: CVPR, IEEE, pp. 3166–3173 (2013)
Wei, Y., Wen, F., Zhu, W., Sun, J.: Geodesic saliency using background priors. In: ECCV, Springer, pp. 29–42 (2012)
Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)

Author information

Authors and Affiliations

Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai, 200050, People’s Republic of China
Kun Huang
School of Information Science and Technology, ShanghaiTech University, Shanghai, 201210, People’s Republic of China
Kun Huang & Shenghua Gao
University of Chinese Academy of Sciences, Beijing, People’s Republic of China
Kun Huang

Corresponding author

Correspondence to Kun Huang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, K., Gao, S. Image saliency detection via multi-scale iterative CNN. Vis Comput 36, 1355–1367 (2020). https://doi.org/10.1007/s00371-019-01734-2

Published: 06 August 2019
Issue Date: July 2020
DOI: https://doi.org/10.1007/s00371-019-01734-2
