Detecting Soccer Balls with Reduced Neural Networks

A Comparison of Multiple Architectures Under Constrained Hardware Scenarios

Journal of Intelligent & Robotic Systems

Abstract

State-of-the-art object detectors rely on convolutional neural networks, which are usually run on graphics processing units to keep latency low. Some systems, such as mobile robots, operate under constrained hardware but still benefit from object detection. Multiple network models have been proposed that achieve comparable accuracy with reduced architectures and leaner operations. Motivated by the need for a near real-time object detection system for a soccer team of mobile robots operating with x86 CPU-only embedded computers, this work analyses the average precision and inference time of multiple object detection systems in a constrained hardware setting. We train open implementations of MobileNetV2 and MobileNetV3 models with different underlying architectures, obtained by changing their input and width multipliers, as well as YOLOv3, TinyYOLOv3, YOLOv4 and TinyYOLOv4, on an annotated image dataset captured using a mobile robot. We emphasize the speed/accuracy trade-off of the models by reporting their average precision on a test dataset and their inference time on videos at different resolutions, under constrained and unconstrained hardware configurations. Results show that MobileNetV3 models offer a good trade-off between average precision and inference time in constrained scenarios only, while MobileNetV2 models with high width multipliers are appropriate for server-side inference. YOLO models in their official implementations are not suitable for inference on CPUs.
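To illustrate why the width and input multipliers mentioned above change inference cost, the following minimal sketch applies the cost model for a depthwise separable convolution given in the MobileNets paper [18]. It assumes that the "input multiplier" of the abstract plays the role of the resolution multiplier (rho) in that paper; the function name and the layer sizes are hypothetical and are not taken from the authors' scripts.

```python
def separable_conv_mult_adds(kernel, in_ch, out_ch, feat_size, alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable convolution layer.

    Cost model from the MobileNets paper [18]: depthwise cost plus
    pointwise cost, with channels scaled by the width multiplier
    (alpha) and the feature-map side scaled by the resolution
    multiplier (rho).
    """
    m = int(alpha * in_ch)     # thinned number of input channels
    n = int(alpha * out_ch)    # thinned number of output channels
    f = int(rho * feat_size)   # reduced feature-map side length
    depthwise = kernel * kernel * m * f * f  # one k x k filter per input channel
    pointwise = m * n * f * f                # 1 x 1 convolution mixing channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative layer only (3x3 kernel, 32 -> 64 channels, 112x112 map).
    baseline = separable_conv_mult_adds(3, 32, 64, 112)
    for alpha in (1.0, 0.75, 0.5):
        for rho in (1.0, 0.5):
            cost = separable_conv_mult_adds(3, 32, 64, 112, alpha=alpha, rho=rho)
            print(f"alpha={alpha} rho={rho} relative cost={cost / baseline:.3f}")
```

Under this model, halving the input side divides the cost of the layer by four, and the pointwise term shrinks roughly with the square of the width multiplier, which is one reason smaller multipliers suit CPU-only hardware at some cost in precision.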

Materials Availability

The scripts utilized in the experiments presented in this paper are available at https://github.com/douglasrizzo/JINT2020-ball-detection. The image dataset used in the experiments was also made available online [4], along with a static copy of the aforementioned scripts.

References

  1. Alippi, C., Disabato, S., Roveri, M.: Moving convolutional neural networks to embedded systems: the AlexNet and VGG-16 case. In: 2018 17th ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), pp. 212–223 (2018). https://doi.org/10.1109/IPSN.2018.00049

  2. Ba, L.J., Caruana, R.: Do deep nets really need to be deep? arXiv:1312.6184[cs] (2014)

  3. Bettoni, M., Urgese, G., Kobayashi, Y., Macii, E., Acquaviva, A.: A convolutional neural network fully implemented on FPGA for embedded platforms. In: 2017 New Generation of CAS (NGCAS), pp. 49–52 (2017). https://doi.org/10.1109/ngcas.2017.16

  4. Bianchi, R.A.D.C., Perico, D.H., Homem, T.P.D., da Silva, I.J., Meneghetti, D.D.R.: Open soccer ball dataset. IEEE Dataport. https://doi.org/10.21227/0vvr-5c61 (2020)

  5. Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOv4: optimal speed and accuracy of object detection. arXiv:2004.10934[cs, eess] (2020)

  6. Buciluǎ, C., Caruana, R., Niculescu-Mizil, A.: Model compression. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD ’06, p. 535. ACM Press, Philadelphia (2006). https://doi.org/10/fkdh9m

  7. Canziani, A., Culurciello, E., Paszke, A.: Evaluation of neural network architectures for embedded systems. In: 2017 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–4 (2017). https://doi.org/10.1109/ISCAS.2017.8050276

  8. Cheng, Y., Wang, D., Zhou, P., Zhang, T.: A survey of model compression and acceleration for deep neural networks. IEEE Signal Process. Mag. 35(1), 126–136 (2020). https://doi.org/10.1109/MSP.2017.2765695

  9. Chollet, F.: Xception: deep learning with depthwise separable convolutions. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1800–1807 (2017). https://doi.org/10.1109/CVPR.2017.195

  10. Courbariaux, M., Hubara, I., Soudry, D., El-Yaniv, R., Bengio, Y.: Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or -1. arXiv:1602.02830[cs] (2016)

  11. de Oliveira, J.H.R., da Silva, I.J., Homem, T.P.D., Meneghetti, D.D.R., Perico, D.H., Bianchi, R.A.D.C.: Object detection under constrained hardware scenarios: a comparative study of reduced convolutional network architectures. In: 2019 XVI Latin American Robotics Symposium and VII Brazilian Robotics Symposium (LARS/SBR). IEEE (2019)

  12. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255 (2009). https://doi.org/10/cvc7xp

  13. Guo, K., Sui, L., Qiu, J., Yu, J., Wang, J., Yao, S., Han, S., Wang, Y., Yang, H.: Angel-eye: a complete design flow for mapping CNN onto embedded FPGA. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 37(1), 35–47 (2018). https://doi.org/10/gf2ntg

  14. Han, S., Pool, J., Tran, J., Dally, W.: Learning both weights and connections for efficient neural network. Advances in Neural Information Processing Systems 28, 1135–1143 (2015)

  15. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. https://doi.org/10.1109/cvpr.2016.90, pp. 770–778 (2016)

  16. Hinton, G., Vinyals, O., Dean, J.: Distilling the Knowledge in a Neural Network. arXiv:1503.02531[cs, stat] (2015)

  17. Howard, A., Sandler, M., Chu, G., Chen, L.C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., Le, Q.V., Adam, H.: Searching for MobileNetV3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314–1324 (2019). https://openaccess.thecvf.com/content_ICCV_2019/html/Howard_Searching_for_MobileNetV3_ICCV_2019_paper.html

  18. Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H.: MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv:1704.04861 (2017)

  19. Huang, J., Rathod, V., Sun, C., Zhu, M., Korattikara, A., Fathi, A., Fischer, I., Wojna, Z., Song, Y., Guadarrama, S., Murphy, K.: Speed/accuracy trade-offs for modern convolutional object detectors. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr.2017.351. IEEE (2017)

  20. Hubara, I., Courbariaux, M., Soudry, D., El-Yaniv, R., Bengio, Y.: Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations. arXiv:1609.07061[cs] (2016)

  21. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, pp. 448–456 (2015). http://proceedings.mlr.press/v37/ioffe15.html

  22. Jaramillo-Avila, U., Anderson, S.R.: Foveated image processing for faster object detection and recognition in embedded systems using deep convolutional neural networks. In: Martinez-Hernandez, U., Vouloutsi, V., Mura, A., Mangan, M., Asada, M., Prescott, T.J., Verschure, P.F. (eds.) Biomimetic and Biohybrid Systems, Lecture Notes in Computer Science, pp. 193–204. Springer International Publishing, Cham (2019). https://doi.org/10.1007/978-3-030-24741-6_17

  23. Jian, B., Yu, C., Jinshou, Y.: Neural networks with limited precision weights and its application in embedded systems. In: 2010 Second International Workshop on Education Technology and Computer Science, vol. 1, pp. 86–91 (2010). https://doi.org/10.1109/ETCS.2010.448

  24. Jiang, Z., Zhao, L., Li, S., Jia, Y.: Real-time object detection method based on improved YOLOv4-tiny. arXiv:2011.04244[cs] (2020)

  25. Jiao, L., Luo, C., Cao, W., Zhou, X., Wang, L.: Accelerating low bit-width convolutional neural networks with embedded FPGA. In: 2017 27th International Conference on Field Programmable Logic and Applications (FPL), pp. 1–4 (2017). https://doi.org/10.23919/FPL.2017.8056820

  26. Krizhevsky, A.: Convolutional deep belief networks on CIFAR-10. Tech. rep. (2010)

  27. Li, Q., Xiao, Q., Liang, Y.: Enabling high performance deep learning networks on embedded systems. In: IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society, pp. 8405–8410 (2017). https://doi.org/10/ghpz8h

  28. Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., Zitnick, C.L.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) Computer Vision – ECCV 2014, Lecture Notes in Computer Science, pp. 740–755. Springer International Publishing, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48

  29. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: single shot multibox detector. In: Computer Vision – ECCV 2016, vol. 9905, pp. 21–37. Springer International Publishing (2016). https://doi.org/10.1007/978-3-319-46448-0_2

  30. Mao, H., Yao, S., Tang, T., Li, B., Yao, J., Wang, Y.: Towards real-time object detection on embedded systems. IEEE Transactions on Emerging Topics in Computing 6(3), 417–431 (2018). https://doi.org/10/gd7rvr

  31. Niazi-Razavi, M., Savadi, A., Noori, H.: Toward real-time object detection on heterogeneous embedded systems. In: 2019 9th International Conference on Computer and Knowledge Engineering (ICCKE), pp. 450–454 (2019). https://doi.org/10/ghpz8c

  32. Qin, H., Gong, R., Liu, X., Bai, X., Song, J., Sebe, N.: Binary neural networks: a survey. Pattern Recogn. 105, 107281 (2020). https://doi.org/10/ggs3g4

  33. Real, E., Aggarwal, A., Huang, Y., Le, Q.V.: Regularized evolution for image classifier architecture search. In: 33rd AAAI Conference on Artificial Intelligence, AAAI 2019. arXiv:1802.01548 (2019)

  34. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). https://doi.org/10.1109/cvpr.2016.91. IEEE (2015)

  35. Redmon, J., Farhadi, A.: YOLO9000: better, faster, stronger. arXiv:1612.08242[cs] (2016)

  36. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv:1804.02767[cs] (2018)

  37. Ren, S., He, K., Girshick, R.B., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. arXiv:1506.01497. https://doi.org/10.1109/tpami.2016.2577031 (2015)

  38. Roth, W., Schindler, G., Zöhrer, M., Pfeifenberger, L., Peharz, R., Tschiatschek, S., Fröning, H., Pernkopf, F., Ghahramani, Z.: Resource-Efficient Neural Networks for Embedded Systems. arXiv:2001.03048[cs, stat] (2020)

  39. Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.C.: MobileNetV2: inverted residuals and linear bottlenecks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4510–4520. https://doi.org/10/gfxgjz (2018)

  40. Sifre, L.: Rigid-Motion Scattering for Image Classification. Ph. D. Thesis, Ecole Polytechnique, CMAP, Palaiseau, France (2014)

  41. Sze, V., Chen, Y., Yang, T., Emer, J.S.: Efficient processing of deep neural networks: a tutorial and survey. Proc. IEEE 105(12), 2295–2329 (2017). https://doi.org/10/gcnp38

  42. Tan, M., Chen, B., Pang, R., Vasudevan, V., Sandler, M., Howard, A., Le, Q.V.: MnasNet: Platform-Aware Neural Architecture Search for Mobile. arXiv:1807.11626[cs] (2019)

  43. Tan, M., Le, Q.V.: EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. arXiv:1905.11946[cs, stat] (2020)

  44. Tan, M., Pang, R., Le, Q.V.: EfficientDet: Scalable and Efficient Object Detection. arXiv:1911.09070[cs, eess] (2020)

  45. Tripathi, S., Dane, G., Kang, B., Bhaskaran, V., Nguyen, T.: LCDet: low-complexity fully-convolutional neural networks for object detection in embedded systems. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pp. 411–420 (2017). https://doi.org/10/ghpz79

  46. Venieris, S.I., Kouris, A., Bouganis, C.S.: Deploying Deep Neural Networks in the Embedded Space. arXiv:1806.08616[cs] (2018)

  47. Yang, T.J., Howard, A., Chen, B., Zhang, X., Go, A., Sandler, M., Sze, V., Adam, H.: NetAdapt: platform-aware neural network adaptation for mobile applications. In: European Conference on Computer Vision (ECCV). https://openaccess.thecvf.com/content_ECCV_2018/papers/Tien-Ju_Yang_NetAdapt_Platform-Aware_Neural_ECCV_2018_paper.pdf (2018)

  48. Zhao, Z., Zhang, Z., Xu, X., Xu, Y., Yan, H., Zhang, L.: A lightweight object detection network for real-time detection of driver handheld call on embedded devices. https://www.hindawi.com/journals/cin/2020/6616584/ (2020)

  49. Zhao, Z.Q., Zheng, P., Xu, S.T., Wu, X.: Object detection with deep learning: a review. IEEE Transactions on Neural Networks and Learning Systems, pp. 1–21. https://doi.org/10.1109/tnnls.2018.2876865 (2019)

Funding

The authors acknowledge the São Paulo Research Foundation (FAPESP Grant 2019/07665-4) for supporting this project. This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES) – Finance Code 001.

Author information

Contributions

– Conceptualization: D. R. Meneghetti; T. P. D. Homem; J. H. R. de Oliveira; R. A. C. Bianchi

– Methodology: D. R. Meneghetti; T. P. D. Homem; D. H. Perico

– Software: D. R. Meneghetti

– Investigation: D. R. Meneghetti

– Formal Analysis: D. R. Meneghetti

– Validation: D. R. Meneghetti

– Data curation: D. R. Meneghetti; T. P. D. Homem; J. H. R. de Oliveira; I. J. da Silva; D. H. Perico; R. A. C. Bianchi

– Writing – original draft: D. R. Meneghetti; T. P. D. Homem

– Writing – review & editing: D. R. Meneghetti; T. P. D. Homem; D. H. Perico; R. A. C. Bianchi

– Visualization: D. R. Meneghetti; T. P. D. Homem; R. A. C. Bianchi

– Resources: D. R. Meneghetti; R. A. C. Bianchi

– Funding acquisition: D. R. Meneghetti; R. A. C. Bianchi

– Project administration: R. A. C. Bianchi

– Supervision: R. A. C. Bianchi

Corresponding author

Correspondence to Douglas De Rizzo Meneghetti.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Meneghetti, D.D.R., Homem, T.P.D., de Oliveira, J. et al. Detecting Soccer Balls with Reduced Neural Networks. J Intell Robot Syst 101, 53 (2021). https://doi.org/10.1007/s10846-021-01336-y

  • DOI: https://doi.org/10.1007/s10846-021-01336-y
