A quantitative detection algorithm based on improved faster R-CNN for marine benthos

https://doi.org/10.1016/j.ecoinf.2021.101228

Highlights

  • The study of marine biological abundance is of great significance in the field of marine science.

  • A dense marine benthic target detection algorithm based on an improved Faster R-CNN is proposed.

  • Embedding an adaptive selection unit in ResNet101 improves the feature extraction ability.

Abstract

To achieve accurate quantitative detection of marine benthos and to address the difficulty of detecting small, densely distributed benthic organisms in images with overlap and occlusion, a quantitative detection algorithm for marine benthos based on Faster R-CNN is proposed. A convolution kernel adaptive selection unit is embedded in the backbone to enhance the feature extraction ability of the network. On this basis, multi-resolution feature fusion is introduced to design a deconvolution feature pyramid structure for small object detection. In addition, the selection of anchors in the Region Proposal Network is optimized to improve counting accuracy. A transfer learning strategy is also employed to train the proposed model and alleviate the limitation of a small dataset. The results show that, compared with the original Faster R-CNN, the proposed algorithm improves the recognition precision of marine benthos from 93.25% to 96.32% and reduces the mean average error from 16.53 to 7.38. This improvement indicates that the proposed algorithm is better suited to the quantitative detection of small and dense objects on the seafloor.

Introduction

Deep-sea object detection technology is one of the hotspots in the field of marine science. At present, underwater robots use artificial intelligence technology to survey the seabed environment and ecosystem (Huang et al., 2020; Li et al., 2014; Pedersen et al., 2019). For example, underwater robots can be used to lay and maintain underwater optical cables, inspect fishing grounds, and execute underwater military missions. Therefore, research on underwater object recognition and detection algorithms has significant application value and development prospects. This paper studies a quantitative detection algorithm for marine benthos to provide technical support for the protection of deep-sea biological resources and the sustainable development of the marine ecosystem.

Traditional object detectors use hand-crafted feature extractors to obtain data features and then combine them with classifiers to produce the final detection results. Recently, object detection algorithms based on deep learning have made rapid progress. A deep convolutional neural network (DCNN) is used as the feature extractor, and it can extract abundant image information by independently learning features at different levels. Object detectors based on deep learning can be separated into two groups: two-stage detectors, represented by the Region Convolutional Neural Network (R-CNN) (Girshick et al., 2014), Fast R-CNN (Girshick, 2015), and Faster R-CNN (Ren et al., 2017), and single-stage detectors, represented by YOLO (Redmon et al., 2016) and SSD (Liu et al., 2016). A two-stage detector first generates a series of region proposals through a sliding window algorithm or a Region Proposal Network (RPN) (Ren et al., 2017), and then determines the category and location of each object from the features of the proposals. A single-stage detector treats object classification and localization as an end-to-end regression problem without generating additional region proposals, which simplifies the detection process and improves detection efficiency compared with the two-stage detector. In addition to the above anchor-based detectors, researchers have recently proposed some anchor-free detection algorithms. The main representatives of this trend include the FCOS framework (Tian et al., 2020), which determines the bounding box of the object by predicting the position of its four coordinates. CornerNet (Law and Deng, 2018) adopts a similar anchor-free method and determines the location by predicting the coordinates of the top-left and bottom-right corners of the object.
Two-stage detectors are usually more accurate in localization and classification because of their careful processing of each Region of Interest (ROI) (Ren et al., 2017), but they are slower than single-stage detectors (Hurtik et al., 2020).

Underwater detection usually adopts different feature extraction algorithms for different objects. For objects with fixed shapes (such as underwater pipelines and cables), traditional feature descriptors, such as color, shape, and texture features, are often used. Fatan et al. (2016) proposed a method that uses the Hough transform to extract image edge texture information, after which a multilayer perceptron with SVM classifies underwater cables. However, the Hough transform is only adequate for straight line detection and cannot effectively extract complete features. Tharwat et al. (2018) used the AdaBoost classifier to recognize fish, focusing mainly on texture and color features. However, such methods do not use large datasets to train the classifier (SVM, AdaBoost, etc.), which limits the generalization ability of the classification and makes detection suitable only for specific simple scenes. Qiao et al. (2017) proposed an image quality enhancement algorithm for sea cucumber images based on adaptive histogram equalization and wavelet transform to improve prediction precision. In summary, hand-crafted feature descriptors are good at detecting a specific single object but are not suited to diverse features.

Object detection and recognition algorithms based on deep learning benefit from the strong feature learning ability of DCNNs (Cao et al., 2020), which can effectively solve the feature extraction problems of the methods mentioned above. Therefore, they have been successfully applied in many underwater detection scenes (Jin and Liang, 2017; Wang et al., 2020). Using a CNN without any hand-crafted features, Kratzert and Mader (2017) established a monitoring platform for marine fish, and the fish classification accuracy reached 93%. Huang et al. (2019) expanded a small dataset through three data augmentation methods to verify the effectiveness of Faster R-CNN in detecting organisms under different turbulent marine environments. Xia et al. (2018) proposed a sea cucumber detection structure based on the YOLOv2 model (Redmon and Farhadi, 2017), which has an excellent detection effect for sea cucumbers with regular shapes. Jenni and Ekaterina (2018) proposed a benchmark database of aquatic macroinvertebrate taxa and presented the classification results of an AlexNet CNN and other well-known methods to provide a baseline evaluation performance. Modasshir et al. (2018) proposed a new identification method to estimate the number of corals, but the problem of double counting has not been solved. Raphael and Iluz (2020) surveyed recent neural-network-based methods for marine coral reef identification and pointed out their limitations and development prospects, providing a good summary of coral reef identification. Although these methods succeed in some fields, they are all applied to limited scenarios and do not cover counting.
Other algorithms for counting dense objects mostly adopt regression-based deep learning methods, which predict the number of objects by generating a density map; for example, CSRNet (Li et al., 2018) is used in crowd counting and U-Net (Ronneberger et al., 2015) in cell counting. Because regression-based methods only count the objects and do not recognize them, they are not suitable for our research.
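The difference between the two counting styles can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the score threshold of 0.5, and the toy inputs are all assumptions made for illustration. A density map yields only a total, whereas detections carry a class label per object and so can be counted per species.

```python
import numpy as np


def count_from_density_map(density):
    """Regression-style count: the density map sums to the number of objects."""
    return float(np.sum(density))


def count_from_detections(labels, scores, score_threshold=0.5):
    """Detection-style count: confident detections tallied per class label.

    The threshold value is a hypothetical choice, not one stated in the paper.
    """
    counts = {}
    for label, score in zip(labels, scores):
        if score >= score_threshold:
            counts[label] = counts.get(label, 0) + 1
    return counts


# Toy density map: two objects, each contributing a total mass of 1.
density = np.zeros((4, 4))
density[0, 0:2] = 0.5
density[2:4, 3] = 0.5

total = count_from_density_map(density)
per_class = count_from_detections(
    ["shrimp", "mussel", "shrimp", "shrimp"],
    [0.9, 0.8, 0.3, 0.7],
)
```

Here `total` is a single number with no species information, while `per_class` separates shrimp from mussel, which is the property the detection-based approach relies on.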

According to the above analysis, there are two problems in the identification and counting of marine benthic objects. First, because of low light, the underwater medium, and other factors, the quality of underwater images is low, which hinders feature extraction. Second, because of the living environment and survival mode of marine benthic organisms, stacking and occlusion frequently occur, which increases the difficulty of underwater detection and counting.

Based on the above discussion, this paper proposes a detection-based quantitative algorithm, which first obtains high-quality prediction results and then counts the objects. We therefore focus on Faster R-CNN, a two-stage detector with higher detection accuracy, and make improvements to it. The contributions and benefits of our approach are as follows:

  • 1. Embedding the convolution kernel adaptive selection unit, the selective kernel (SK) unit (Li et al., 2019), in ResNet101 to strengthen the backbone's feature extraction ability;

  • 2. Designing a deconvolution feature pyramid network (FPN) (Lin et al., 2017) so that the detection layers used in the network have appropriate resolution and strong semantic features;

  • 3. Optimizing the selection of anchors in the RPN according to the aspect ratio of marine benthic objects, in order to reduce missed detections and improve counting accuracy.
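As an illustration of the third contribution, the sketch below shows how RPN-style anchors can be generated from a set of scales and aspect ratios, and one possible way to derive the ratios from annotated boxes. The paper's actual selection procedure is not given in this excerpt, so `ratios_from_boxes` (quantiles of the ground-truth height/width ratio) is a hypothetical stand-in, and the base size and scale values are assumptions.

```python
import numpy as np


def make_anchors(base_size, scales, aspect_ratios):
    """Build anchors [x1, y1, x2, y2] centred on the origin.

    Aspect ratio is height / width; each anchor keeps the area
    (base_size * scale) ** 2 regardless of its ratio.
    """
    anchors = []
    for scale in scales:
        area = (base_size * scale) ** 2
        for ratio in aspect_ratios:
            width = np.sqrt(area / ratio)
            height = width * ratio
            anchors.append([-width / 2, -height / 2, width / 2, height / 2])
    return np.array(anchors)


def ratios_from_boxes(boxes, n_ratios=3):
    """Hypothetical helper: choose ratios as evenly spaced quantiles of the
    height/width ratios observed in ground-truth boxes [x1, y1, x2, y2]."""
    boxes = np.asarray(boxes, dtype=float)
    widths = boxes[:, 2] - boxes[:, 0]
    heights = boxes[:, 3] - boxes[:, 1]
    quantiles = np.linspace(0.0, 1.0, n_ratios + 2)[1:-1]
    return np.quantile(heights / widths, quantiles)


# Example: ratios derived from three toy annotations, then used for anchors.
toy_boxes = [[0, 0, 10, 10], [0, 0, 10, 20], [0, 0, 20, 10]]
ratios = ratios_from_boxes(toy_boxes)
anchors = make_anchors(base_size=16, scales=[8], aspect_ratios=ratios)
```

Matching the anchor shapes to the objects in the dataset increases the overlap between anchors and ground-truth boxes, which is the usual motivation for tuning these ratios.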

The rest of this paper is structured as follows: the dataset and the proposed algorithm based on Faster R-CNN are introduced in Section 2. Section 3 presents the results of our algorithm. The results and effectiveness of our method are discussed in Section 4. Finally, the conclusion is drawn in Section 5.

Section snippets

Dataset

The dataset in our research includes two parts. One is based on our funded project (Institute of Oceanology, Chinese Academy of Sciences “Deep sea biological in situ intelligent recognition system and quantitative analysis system development project” (KEXUE2019GZ04)), which contains 30,000 pictures (30K-Seafloor dataset) of marine benthos taken under the dark conditions of the seabed. The species mainly include shrimp, mussel, and spider crab. Among them, shrimp and mussel are densely

Results

The model performance is evaluated by Precision (PR) and Omission Ratio (OR). In addition, because the objects in this paper are dense and must be counted, Mean Average Error (MAE) and Root Mean Squared Error (RMSE), which are commonly used in density counting work, are also adopted as performance metrics. PR, OR, MAE, and RMSE are calculated by Eqs. (11)–(14):

$$\mathrm{PR} = \frac{1}{n}\sum_{i=1}^{n}\frac{t_i}{t_i + f_i} \tag{11}$$

$$\mathrm{OR} = \frac{1}{n}\sum_{i=1}^{n}\frac{m_i}{t_i + f_i + m_i} \tag{12}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|k_i - t_i\right| \tag{13}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(k_i - t_i\right)^2} \tag{14}$$

where $t_i$ is the number of true positives,
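The four metrics named above can be sketched in a few lines of Python. Because the symbol definitions are cut off in this excerpt, the sketch makes assumptions: `t`, `f`, and `m` are taken as per-image counts of true positives, false positives, and missed objects, `k` is taken as the per-image reference count, and MAE and RMSE are read in their usual absolute-difference and squared-difference forms. The numbers are toy values, not results from the paper.

```python
import math


def precision(t, f):
    """PR: mean over images of t_i / (t_i + f_i)."""
    return sum(ti / (ti + fi) for ti, fi in zip(t, f)) / len(t)


def omission_ratio(t, f, m):
    """OR: mean over images of m_i / (t_i + f_i + m_i)."""
    return sum(mi / (ti + fi + mi) for ti, fi, mi in zip(t, f, m)) / len(t)


def mean_average_error(k, t):
    """MAE: mean over images of |k_i - t_i| (assumed form)."""
    return sum(abs(ki - ti) for ki, ti in zip(k, t)) / len(k)


def root_mean_squared_error(k, t):
    """RMSE: square root of the mean of (k_i - t_i)^2 (assumed form)."""
    return math.sqrt(sum((ki - ti) ** 2 for ki, ti in zip(k, t)) / len(k))


# Toy per-image counts for two images (hypothetical values).
t = [8, 6]    # true positives
f = [2, 2]    # false positives
m = [0, 2]    # missed objects
k = [10, 7]   # reference counts (assumed meaning of k_i)

pr = precision(t, f)
omission = omission_ratio(t, f, m)
mae = mean_average_error(k, t)
rmse = root_mean_squared_error(k, t)
```

PR and OR are ratios averaged per image, while MAE and RMSE are expressed in numbers of objects, which is why the abstract reports the former as percentages and the latter as plain values.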

Discussion

Recent improvements in hardware, software and camera systems, coupled with the growing need to protect local species, have promoted the development of new methods for monitoring the environment through images and video recordings. The large number of seabed biological images collected by autonomous underwater vehicles (AUVs) must be analyzed automatically.

The main contribution of this paper is to propose an automatic recognition and quantitative algorithm based on deep learning for

Conclusion

In this study, we proposed a quantitative detection algorithm for marine benthos based on Faster R-CNN. Due to the dense distribution of objects and the low quality of underwater imaging, we adopt the SK unit to enhance feature extraction capability of the Faster R-CNN backbone network and change the multi-scale feature fusion FPN structure to improve the detection effect of small objects. Through optimizing the selection of the RPN anchors, a good quantitative result is achieved. Four

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This work was supported by the program of Qingdao University of Science and Technology “Research on key technologies and platforms for regional medical data sharing and analysis” (GG201710030036). This project is partially supported by the program of The Institute of Oceanology, Chinese Academy of Sciences “Deep sea biological in situ intelligent recognition system and quantitative analysis system development project” (KEXUE2019GZ04).

References (36)

  • K. He et al. Deep residual learning for image recognition. Comput. Vision Pattern Recognit. (2016)

  • P. Hurtik et al. Poly-YOLO: higher speed, more precise detection and instance segmentation for YOLOv3. arXiv (2020)

  • R. Jenni et al. Database for fine-grained image classification of benthic macroinvertebrates. Imavis (2018)

  • L. Jin et al. Deep learning for underwater image recognition in small sample size situations

  • D.J. Jobson et al. A multiscale retinex for bridging the gap between color images and the human observation of scenes. IEEE Trans. Image Process. (1997)

  • F. Kratzert et al. Advances of FishNet towards a fully automatic monitoring system for fish migration

  • H. Law et al. CornerNet: detecting objects as paired keypoints. Int. J. Comput. Vis. (2018)

  • Q.-Z. Li et al. Fast multicamera video stitching for underwater wide field-of-view observation. J. Electron Imaging (2014)