Multi-scale ResNet for real-time underwater object detection

Pan, Tien-Szu; Huang, Huang-Chu; Lee, Jen-Chun; Chen, Chung-Hsien

doi:10.1007/s11760-020-01818-w

Original Paper
Published: 27 November 2020

Signal, Image and Video Processing, Volume 15, pages 941–949 (2021)

Tien-Szu Pan¹, Huang-Chu Huang², Jen-Chun Lee² & Chung-Hsien Chen³

Abstract

An automatic underwater object recognition system is essential for reducing the cost of underwater inspection. In this study, we propose a novel convolutional neural network architecture trained on underwater video frames. The method, Multi-scale ResNet (M-ResNet), is a modified residual neural network (ResNet) for underwater object detection. It improves efficiency by using multi-scale operations to detect objects of various sizes accurately, especially small objects. Experimental results show that the proposed method achieves a mean average precision (mAP) of 96.5%. On this basis, we propose a novel system for automatic object detection in marine environments.
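
The abstract reports detection quality as mAP. As a rough illustration of how that metric is usually computed, the sketch below averages per-class average precision (AP). It is not the authors' evaluation code: the preview does not state their protocol, so the all-point interpolation used here (as in later PASCAL VOC evaluations) is an assumption, as are the function names and the class labels in the demo. Matching detections to ground-truth boxes (for example by an IoU threshold) is assumed to have happened already and is represented by the is_tp flags.

```python
from typing import Dict, List, Sequence, Tuple


def average_precision(scores: Sequence[float],
                      is_tp: Sequence[bool],
                      num_gt: int) -> float:
    """All-point interpolated AP for one class (assumed protocol).

    scores: confidence of each detection.
    is_tp:  whether each detection was matched to a ground-truth object.
    num_gt: number of ground-truth objects of this class.
    """
    if num_gt <= 0:
        raise ValueError("num_gt must be positive")
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    fp = 0
    recalls: List[float] = []
    precisions: List[float] = []
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_gt)
        precisions.append(tp / (tp + fp))
    # Interpolation: make precision non-increasing as recall grows.
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Area under the step-shaped precision-recall curve.
    ap = 0.0
    prev_recall = 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(
    per_class: Dict[str, Tuple[Sequence[float], Sequence[bool], int]]
) -> float:
    """Mean of the per-class AP values."""
    aps = [average_precision(s, t, n) for s, t, n in per_class.values()]
    return sum(aps) / len(aps)


if __name__ == "__main__":
    # Hypothetical toy detections for two made-up classes.
    demo = {
        "fish": ([0.9, 0.8, 0.7], [True, False, True], 2),
        "crab": ([0.9, 0.6], [True, True], 2),
    }
    print(mean_average_precision(demo))
```

Under this definition, a false positive ranked above a true positive lowers AP, so the metric rewards both finding objects and ranking correct detections with higher confidence.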

Author information

Authors and Affiliations

Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Tien-Szu Pan
Department of Telecommunication Engineering, National Kaohsiung University of Science and Technology, Kaohsiung, Taiwan
Huang-Chu Huang & Jen-Chun Lee
Metal Industries Research & Development Centre (MIRDC), Kaohsiung, Taiwan
Chung-Hsien Chen

Corresponding author

Correspondence to Huang-Chu Huang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Pan, TS., Huang, HC., Lee, JC. et al. Multi-scale ResNet for real-time underwater object detection. SIViP 15, 941–949 (2021). https://doi.org/10.1007/s11760-020-01818-w

Received: 16 July 2020
Revised: 02 November 2020
Accepted: 06 November 2020
Published: 27 November 2020
Issue Date: July 2021
DOI: https://doi.org/10.1007/s11760-020-01818-w
