
Detection and segmentation of underwater objects from forward-looking sonar based on a modified Mask RCNN

  • Original Paper
  • Published in: Signal, Image and Video Processing

Abstract

High-frequency forward-looking sonar is an effective device for obtaining the main information about underwater objects, and the detection and segmentation of underwater objects is a key topic of current research. Deep learning has shown excellent performance in image feature extraction and has been extensively used in image object detection and instance segmentation. However, as network depth increases, training accuracy saturates and the number of training parameters grows rapidly. In this paper, a series of residual blocks is used to build a 32-layer feature extraction network that replaces the Resnet50/101 backbone in Mask RCNN, reducing the training parameters of the network while preserving detection performance. The proposed network has 29% fewer parameters than Resnet50 and 50.2% fewer than Resnet101, which is of great significance for future hardware implementation. In addition, the Adagrad optimizer is introduced to improve detection performance on sonar images. Finally, object detection results on 500 test sonar images show a mAP of 96.97%, only 0.18% lower than Resnet50 (97.15%) and higher than Resnet101 (95.15%).
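The paper's backbone and training pipeline are not reproduced here, but the two ingredients the abstract names — residual skip connections and the Adagrad optimizer — can be sketched in a few lines of NumPy. The `residual_block` helper and the toy quadratic objective below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def residual_block(x, transform):
    """A residual block computes y = F(x) + x; the identity skip
    path lets gradients flow past F, which is what allows deep
    stacks of such blocks to train without accuracy saturating."""
    return transform(x) + x

def adagrad_step(param, grad, cache, lr=0.01, eps=1e-8):
    """One Adagrad update: squared gradients accumulate in `cache`,
    so each parameter's effective learning rate decays individually."""
    cache += grad ** 2
    param -= lr * grad / (np.sqrt(cache) + eps)
    return param, cache

# Toy demonstration (hypothetical): minimize f(w) = (w - 3)^2 with Adagrad.
w = np.array([0.0])
cache = np.zeros_like(w)
for _ in range(500):
    grad = 2.0 * (w - 3.0)
    w, cache = adagrad_step(w, grad, cache, lr=0.5)
# w is now close to the minimizer 3.0
```

The per-parameter step-size decay is what makes Adagrad attractive for sparse or unevenly scaled gradients; the cost is that the accumulated `cache` only grows, so the effective learning rate shrinks monotonically over training.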



Acknowledgements

This work is supported by the National Natural Science Foundation of China (Grant No. 6150010825) and the project of Jiangsu Province’s six talent peak funding: deep sea ROV obstacle avoidance sonar (No. KTHY-026).

Author information

Correspondence to Weijie Xia.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Fan, Z., Xia, W., Liu, X. et al. Detection and segmentation of underwater objects from forward-looking sonar based on a modified Mask RCNN. SIViP 15, 1135–1143 (2021). https://doi.org/10.1007/s11760-020-01841-x
