Semantic Edge Detection with Diverse Deep Supervision

Liu, Yun; Cheng, Ming-Ming; Fan, Deng-Ping; Zhang, Le; Bian, Jia-Wang; Tao, Dacheng

doi:10.1007/s11263-021-01539-8

Published: 29 November 2021

Volume 130, pages 179–198, (2022)

International Journal of Computer Vision

Yun Liu¹,
Ming-Ming Cheng ORCID: orcid.org/0000-0001-5550-8758¹,
Deng-Ping Fan¹,
Le Zhang²,
Jia-Wang Bian³ &
Dacheng Tao⁴

Abstract

Semantic edge detection (SED), which aims at jointly extracting edges as well as their category information, has far-reaching applications in domains such as semantic segmentation, object proposal generation, and object recognition. SED naturally requires achieving two distinct supervision targets: locating fine detailed edges and identifying high-level semantics. Our motivation comes from the hypothesis that such distinct targets prevent state-of-the-art SED methods from effectively using deep supervision to improve results. To this end, we propose a novel fully convolutional neural network using diverse deep supervision within a multi-task framework where bottom layers aim at generating category-agnostic edges, while top layers are responsible for the detection of category-aware semantic edges. To overcome the hypothesized supervision challenge, a novel information converter unit is introduced, whose effectiveness has been extensively evaluated on SBD and Cityscapes datasets.
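
The split of supervision targets described in the abstract can be illustrated with a minimal sketch. It assumes, as a simplification not stated in this excerpt, that the category-agnostic target is the union of the per-category edge maps and that every output is trained with plain binary cross-entropy. The function names are hypothetical, and the information converter unit and the paper's actual loss are not modeled here.

```python
import math


def category_agnostic_target(semantic_edges):
    """Collapse K per-category binary edge maps (K x H x W nested lists)
    into one H x W map: a pixel is an edge if any category marks it.
    Assumption: union of categories; the paper may define this differently."""
    height = len(semantic_edges[0])
    width = len(semantic_edges[0][0])
    return [
        [int(any(cat[y][x] for cat in semantic_edges)) for x in range(width)]
        for y in range(height)
    ]


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean per-pixel binary cross-entropy between two H x W maps."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
            total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
            count += 1
    return total / count


def diverse_supervision_loss(bottom_preds, top_pred, semantic_edges):
    """Bottom side outputs are compared with category-agnostic edges;
    the top output (K maps) is compared with category-aware edges."""
    agnostic = category_agnostic_target(semantic_edges)
    loss = sum(binary_cross_entropy(p, agnostic) for p in bottom_preds)
    loss += sum(
        binary_cross_entropy(p, t) for p, t in zip(top_pred, semantic_edges)
    )
    return loss
```

Here `bottom_preds` is a list of H x W probability maps, one per bottom side output, and `top_pred` holds K probability maps, one per category. Summing the terms with equal weight is an illustrative choice only.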

Author information

Authors and Affiliations

College of Computer Science, Nankai University, Tianjin, China
Yun Liu, Ming-Ming Cheng & Deng-Ping Fan
University of Electronic Science and Technology of China, Chengdu, China
Le Zhang
School of Computer Science, University of Adelaide, Adelaide, Australia
Jia-Wang Bian
JD Explore Academy in JD.com, Beijing, China
Dacheng Tao

Corresponding author

Correspondence to Ming-Ming Cheng.

Additional information

Communicated by Yuri Boykov.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This research was supported by the National Key Research and Development Program of China (Grant No. 2018AAA0100400) and NSFC (No. 61922046).

About this article

Cite this article

Liu, Y., Cheng, MM., Fan, DP. et al. Semantic Edge Detection with Diverse Deep Supervision. Int J Comput Vis 130, 179–198 (2022). https://doi.org/10.1007/s11263-021-01539-8

Received: 21 April 2019
Accepted: 22 October 2021
Published: 29 November 2021
Issue Date: January 2022
DOI: https://doi.org/10.1007/s11263-021-01539-8
