SAFD: single shot anchor free face detector

Multimedia Tools and Applications

Abstract

Anchor-free face detection methods can cover a large range of scales and offer better speed. However, their accuracy still lags well behind that of anchor-based methods, especially for small faces, because they suffer from context modeling and scale imbalance problems. To address these problems, we propose a novel single shot anchor-free face detector (SAFD) that detects multi-scale faces by leveraging the multi-scale, context-aware information in multi-layer features. In SAFD, we use dilated convolution layers and an attention mechanism to select informative features that adapt to different scales. We also propose a scale-aware sampling strategy that mitigates the scale imbalance problem by adaptively selecting positive training samples. Experimental results on two public benchmark datasets, WIDER FACE and FDDB, demonstrate that SAFD achieves performance competitive with anchor-based detectors at a lower computational cost.
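The general idea of combining dilated convolutions with attention can be illustrated with a toy one-dimensional sketch. This is not the SAFD module, whose layer configuration the abstract does not specify; the function names, the 3-tap kernel, the dilation rates (1, 2, 3), and the softmax weighting over branches are all assumptions chosen for illustration. It shows that a kernel of size k with dilation d spans k + (k - 1)(d - 1) input samples, so parallel branches see different amounts of context, and that a set of attention scores can weight those branches before they are summed.

```python
import math


def effective_kernel_size(kernel_size, dilation):
    """Number of input samples spanned by one dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d_same(signal, kernel, dilation=1):
    """Zero-padded 1-D dilated cross-correlation (odd kernel length).

    The output has the same length as the input, so branches with
    different dilation rates can be fused element-wise.
    """
    n = len(signal)
    half = (len(kernel) - 1) * dilation // 2  # offset that centres the kernel
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + j * dilation - half
            if 0 <= idx < n:  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(branches, scores):
    """Weighted sum of equal-length branch outputs.

    `scores` stands in for what a learned attention module would
    predict; here they are supplied by hand (an assumption).
    """
    weights = softmax(scores)
    length = len(branches[0])
    return [
        sum(w * branch[i] for w, branch in zip(weights, branches))
        for i in range(length)
    ]


# Hypothetical input: a single impulse in the middle of the signal.
signal = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
kernel = [1.0, 1.0, 1.0]
branches = [dilated_conv1d_same(signal, kernel, d) for d in (1, 2, 3)]
fused = attention_fuse(branches, scores=[0.0, 0.0, 0.0])
```

With equal scores every branch contributes the same weight; raising the score of the branch with the largest dilation would shift the fused response toward wider context, which is the kind of scale-dependent selection the abstract describes in general terms.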

Notes

  1. http://shuoyang1213.me/WIDERFACE/WiderFace_Results.html

  2. http://vis-www.cs.umass.edu/fddb/results.html

Author information

Corresponding authors

Correspondence to Zhiming Luo or Shaozi Li.

About this article

Cite this article

Wang, C., Luo, Z., Zhong, Z. et al. SAFD: single shot anchor free face detector. Multimed Tools Appl 80, 13761–13785 (2021). https://doi.org/10.1007/s11042-020-10401-x

  • DOI: https://doi.org/10.1007/s11042-020-10401-x
