An intelligent and cost-effective remote underwater video device for fish size monitoring

https://doi.org/10.1016/j.ecoinf.2021.101311

Highlights

  • The Underwater Detector of Moving Object Size (UDMOS) computer vision system is presented, which records large fishes passing in front of a camera

  • UDMOS can be deployed as an unbaited underwater system and can seamlessly work in shallow or deep waters, and with infrared or visible light

  • A free-to-use standardized Web Service is also proposed to run UDMOS for batch video-processing

  • Three alternative large-object detectors are embedded, based on deep learning, unsupervised modelling, and motion detection respectively

  • UDMOS is cost-effective because it uses minimalistic hardware with low power consumption, and it saves experts the time otherwise spent analysing long BRUV videos

Abstract

Monitoring the size of key indicator species of fish is important to understand ecosystem functions, anthropogenic stress, and population dynamics. Standard methodologies gather data using underwater cameras, but are biased due to the use of baits, limited deployment time, and short field of view. Furthermore, they require experts to analyse long videos to search for species of interest, which is time consuming and expensive. This paper describes the Underwater Detector of Moving Object Size (UDMOS), a cost-effective computer vision system that records events of large fishes passing in front of a camera, using minimalistic hardware and power consumption. UDMOS can be deployed underwater, as an unbaited system, and is also offered as a free-to-use Web Service for batch video-processing. It embeds three alternative large-object detection algorithms based on deep learning, unsupervised modelling, and motion detection, and can work in both shallow and deep waters with infrared or visible light.

Introduction

Monitoring the frequency of detection of key indicator species of marine fishes in their native habitat is a useful method of gathering data to understand characteristics such as population distribution, biomass change, anthropogenic impact, and the function of ecosystem relationships such as mutualistic behaviour. Common approaches to gathering such data use standalone video recording devices, sometimes equipped with baits, that are deployed underwater (baited remote underwater video device, BRUV) and capture footage continuously until on-board storage is exhausted (Cappo et al., 2004; Mallet and Pelletier, 2014; Vos et al., 2014). Continuous recording introduces several biases into data collection: for example, the deployment time is limited to a few hours, and this limitation in turn encourages the use of non-passive baited camera systems, which may affect inference regarding the presence of species. Upon retrieving the devices, experts need to view single time-period samples in the form of long videos to search for species of interest and conduct further analysis such as taxonomic confirmation, maximum abundance of fish per frame (MaxN), and estimation of life stages dependent upon their size. In some cases, dual cameras are used to allow on-screen measurement of fish length by supporting expert observation with shape analysis tools (Costa et al., 2006).

In recent years, researchers have applied artificial intelligence to accelerate the post-capture data processing (González-Rivero et al., 2020; Marini et al., 2018; Qin et al., 2016; Shafait et al., 2016). However, the overall methodology of video collection, post-capture data retrieval, and analysis is time consuming and expensive, and generates substantial video data. Another practical difficulty is finding, hiring, and training professional video operators. Today, operators are mostly university students whose commitment time, interest, and availability are often limited and fragmented. Moreover, this approach offers no alternative to continuous recording. By contrast, an automatic solution, for example an edge-computing vision system, could autonomously monitor underwater for far longer periods while capturing video data only for events of interest. Although analogue solutions for motion sensing may offer alternatives to a passive AI-based approach (Daum, 2005; Hsiao et al., 2014; Salman et al., 2020; Spampinato et al., 2008), many have implications regarding bias and limited capacity underwater (e.g. microwave motion sensors) (Hussey et al., 2015; Yoon et al., 2012). Furthermore, some detection methods may perturb the presence or absence of species (e.g. analogue sonar scanning) or are simply not precise enough to discriminate between debris and smaller organisms. With these considerations in mind, in order to create a detection system for larger indicator species, such as sharks and rays, a solution is needed that detects animate moving objects and classifies them in terms of their size.

This paper describes the Underwater Detector of Moving Object Size (UDMOS) software, a cost-effective computer vision system that can be deployed underwater and is able to identify and record videos of large fishes moving in front of a camera. The minimum fish size to detect is a configurable parameter, defined as the minimum object size, expressed as a percentage of the camera frame size, that should trigger video recording. UDMOS can work in shallow as well as deep waters, and uses minimal hardware and deployment equipment to operate. The hardware is scalable from a very inexpensive solution, based only on a single IR camera and one Raspberry Pi 4 device, to more expensive solutions that use more cameras and more powerful hardware. UDMOS can be deployed to capture the presence of large fishes over long time periods, which reduces the need for bait to concentrate activity in front of the camera. This non-invasive solution is cost-effective and reduces the detection bias that the presence of bait may cause. Additionally, UDMOS is offered as a free-to-use cloud Web service that can post-process videos captured by standard BRUVs.
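The size-based trigger described above can be sketched as a simple geometric check: recording starts when a detected object's bounding box covers at least the configured percentage of the camera frame. The function and parameter names below are hypothetical illustrations, not the actual UDMOS interface.

```python
def should_record(bbox_w, bbox_h, frame_w, frame_h, min_size_pct=10.0):
    """Return True when the detected object's bounding-box area reaches
    the configured minimum percentage of the camera frame area.

    `min_size_pct` plays the role of the configurable minimum-size
    parameter; 10% is an illustrative default, not the UDMOS value.
    """
    object_fraction = (bbox_w * bbox_h) / float(frame_w * frame_h)
    return object_fraction * 100.0 >= min_size_pct
```

For example, a 400x300 detection in a 1280x720 frame covers about 13% of the frame and would trigger recording at the 10% setting, while a 100x80 detection (under 1%) would be ignored.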

In this study, the performance of the workflow is assessed on five real operational cases under different light and depth conditions, using different object detection algorithms that can work on minimal hardware. The performance comparison also involves a movement-detection algorithm embedded in UDMOS. Overall, UDMOS addresses the following research question: can modern single-board computers be used to design a precise, non-invasive, efficient, and cost-effective system for monitoring large fish within a specific size range?

UDMOS embeds several novel features with respect to alternative remote underwater video devices (Codd-Downey et al., 2017; Ebner et al., 2014; Edgington et al., 2006; Schlining and Stout, 2006). Internally, it can use one of three different approaches to detect objects or movement, which work on low-resource hardware with different response times. One advantage of UDMOS over other approaches (Brooks et al., 2011; Quevedo et al., 2017; Schmid et al., 2017; Struthers et al., 2015) is that it can work with one basic camera to estimate the approximate distance of an object. Unlike other solutions (Edgington et al., 2006; Harvey et al., 2003; Palazzo et al., 2014; Schlining and Stout, 2006; Van Damme, 2015), UDMOS is conceived to automatically adapt to both low- and high-power hardware. Unlike camera-trap systems that use motion detection (Apps and McNutt, 2018; Golkarnarenji et al., 2018; Kays et al., 2010; Marcot et al., 2019; Miguel et al., 2016; Zhou et al., 2008), our workflow improves motion detection precision through an adaptive thresholding algorithm and offers alternative object detection models. Similar to other underwater devices (Edgington et al., 2003; Hermann et al., 2020), UDMOS can work in both IR and visible light conditions by automatically selecting the optimal configuration based on the scene brightness. The workflow can also approximately account for common issues found by other systems due to small, close fishes attracted by the recording device (Coghlan et al., 2017; Dunlop et al., 2015; Harvey et al., 2007). Overall, UDMOS strongly reduces the amount of irrelevant video data produced, especially when the events to capture are rare, and is thus beneficial both in terms of human time saved and hardware costs. It can be used to implement an edge-computing system as well as an as-a-service batch-processing system for current BRUVs.
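The adaptive thresholding idea mentioned above can be illustrated by deriving the motion threshold from the recent history of inter-frame differences, so that the trigger adapts to each scene's background variability. The mean-plus-k-standard-deviations rule and all names below are a hedged sketch, not a reproduction of the actual UDMOS algorithm.

```python
def adaptive_motion_threshold(delta_history, k=2.0):
    """Derive a scene-adaptive motion threshold as mean + k standard
    deviations of recent inter-frame difference magnitudes.

    `k` is an illustrative sensitivity constant, not the UDMOS value.
    """
    n = len(delta_history)
    mean = sum(delta_history) / n
    variance = sum((d - mean) ** 2 for d in delta_history) / n
    return mean + k * variance ** 0.5


def is_motion(current_delta, delta_history, k=2.0):
    """Flag motion when the current inter-frame difference exceeds the
    threshold adapted to this scene's recent history."""
    return current_delta > adaptive_motion_threshold(delta_history, k)
```

With this scheme, a scene with naturally noisy background (e.g. drifting particles) yields a higher threshold than a still one, which is the precision benefit adaptive thresholding offers over a fixed cut-off.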

UDMOS can be coupled with modern underwater species identification systems that work on BRUV-collected videos. Both open-source (Dawkins et al., 2017) and commercial fish identification software and abundance estimators (Santana-Garcon et al., 2014) exist that can work on BRUV videos, and thus on the UDMOS videos. These systems usually work offline and are used after the underwater video capture session. Indeed, they have hardware requirements that current low-cost, embeddable technology cannot meet. Most species identification systems and abundance estimators are based on deep learning models (Dawkins et al., 2017), which have demonstrated optimal performance with respect to alternative models (Sheaves et al., 2020) and can outperform human classification on species-specific identification tasks (Knausgård et al., 2020; Konovalov et al., 2019; Sheaves et al., 2020). However, these models are still unreliable in species abundance estimation and generally cannot substitute human experts (Connolly et al., 2021). Furthermore, they have demanding hardware requirements - e.g., 64-bit Operating Systems and powerful GPUs (Dawkins et al., 2017) - and thus are normally provided as-a-service through high-performance computing architectures that maximise both their efficiency and effectiveness (Candela et al., 2016; Coro et al., 2018; Sheaves et al., 2020). Overall, on-board species identification requires powerful and expensive hardware and battery capacity - GPU processing is very power-consuming, and 64-bit Operating Systems on edge computers are still at an early stage - and thus on-board processing is usually limited to motion detection (Sheehan et al., 2020).

UDMOS is principally conceived to reduce the time that either human experts or automatic models require to post-process the captured underwater videos for species recognition, size measurement, and abundance estimation. Thus, one crucial requirement is that its performance on an edge computer be as good as it would be on a cloud computing platform. For this reason, UDMOS addresses the simpler task of triggering the recording when a large fish passes in front of the camera, rather than recognizing the fish. This choice has the advantage of (i) avoiding biases due to misclassification, (ii) being applicable to a large spectrum of species (i.e. not only those for which a model was trained), and (iii) reducing the recorded video length to far less than that of continuous recording (Section 3).
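The reduction in recorded video length comes from keeping only the frames around detection events rather than the whole deployment. The sketch below illustrates that event-triggered logic on a sequence of per-frame detection flags; the `post_roll` extension and all names are hypothetical, not taken from the UDMOS implementation.

```python
def triggered_segments(detections, post_roll=2):
    """Given per-frame detection flags, return the (start, end) frame
    index ranges an event-triggered recorder would keep, extending
    each event by `post_roll` frames (an illustrative parameter)."""
    segments = []
    start = None          # first frame of the current recording
    active_until = -1     # last frame the current recording covers
    for i, hit in enumerate(detections):
        if hit:
            if start is None:
                start = i
            active_until = i + post_roll
        elif start is not None and i > active_until:
            segments.append((start, active_until))
            start = None
    if start is not None:  # recording still open at the end of the clip
        segments.append((start, min(active_until, len(detections) - 1)))
    return segments
```

For a 12-frame clip with detections at frames 2-3 and 8, only the ranges (2, 5) and (8, 10) would be stored, a fraction of the continuous recording.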

The next sections describe the complete UDMOS workflow and target hardware (Section 2) and its effectiveness and efficiency (Section 3). Finally, a discussion of the results and the potential applications of UDMOS is reported (Section 4).

Section snippets

Material and methods

In this section, the general architecture of the UDMOS workflow is described by following the flowchart in Fig. 1.

Results

This section reports the UDMOS parameter estimation on the development set (Section 3.1), and then reports the workflow performance on our five test cases as the detection model varies (Section 3.2).

Discussion and conclusions

In this paper, a workflow to detect large objects underwater, with the objective of detecting large fishes moving in front of a camera, has been described. The presented solution has a number of advantages (Table 4), e.g. it can internally use one among three different models that reported reasonably good performance on the selected test cases. The results also demonstrate that UDMOS can work effectively and efficiently even with low-resource hardware. Deep water scenarios are particularly

Declaration of Competing Interest

None.

Acknowledgments

This work was conducted under the self-funded ISTI CNR-Visual Persistence collaboration agreement Number ISTI-0020363/2020. Gianpaolo Coro acknowledges the courtesy of Dr. Edith Widder and Dr. Nathan Robinson (NOAA OER) for providing a video of a rare giant squid Architeuthis dux captured at a 700 m depth by a baited underwater device, which was included in Test Case 1. The authors also acknowledge the courtesy of Alexander Wilson (University of Plymouth) to allow using a video footage on

References (80)

  • T.B. Letessier et al., Low-cost small action cameras in stereo generates accurate underwater measurements of fish, J. Exp. Mar. Biol. Ecol. (2015)
  • D. Mallet et al., Underwater video techniques for observing coastal marine biodiversity: a review of sixty years of publications (1952–2012), Fish. Res. (2014)
  • H. Qin et al., Deepfish: accurate underwater live fish recognition with a deep architecture, Neurocomputing (2016)
  • E. Quevedo et al., Underwater video enhancement using multi-camera super-resolution, Opt. Commun. (2017)
  • T. Schaner et al., An inexpensive system for underwater video surveys of demersal fishes, J. Great Lakes Res. (2009)
  • A.D. Wilson et al., Activity syndromes and metabolism in giant deep-sea isopods, Deep-Sea Res. I Oceanogr. Res. Pap. (2017)
  • P. Abeles, BoofCV Project Website
  • P.J. Apps et al., How camera traps work and how to work them, Afr. J. Ecol. (2018)
  • Autodesk, Autodesk Maya
  • BoofCV, Polyline Split and Merge Process Documentation
  • E.J. Brooks et al., Validating the use of baited remote underwater video surveys for assessing the diversity, distribution and abundance of sharks in the Bahamas, Endanger. Species Res. (2011)
  • L. Candela et al., Species distribution modeling in the cloud, Concurr. Comput. (2016)
  • M. Cappo et al., Counting and measuring fish with baited video techniques: an overview
  • CBD, Decision adopted by the Conference of the Parties to the Convention on Biological Diversity. 14/8. Protected areas and other effective area-based conservation measures
  • CNR, Underwater Detector of Moving Object Size as-a-service
  • R. Codd-Downey et al., Milton: an open hardware underwater autonomous vehicle
  • R. Connolly et al., Improved accuracy for automated counting of a fish in baited underwater videos for stock assessment, bioRxiv (2021)
  • G. Coro et al., Parallelizing the execution of native data mining algorithms for computational biology, Concurr. Comput. (2015)
  • G. Coro et al., Analysing and forecasting fisheries time series: purse seine in Indian Ocean as a case study, ICES J. Mar. Sci. (2016)
  • G. Coro et al., A web application to publish R scripts as-a-service on a cloud computing platform, Boll. Geofis. Teor. Appl. (2016)
  • G. Coro et al., Cloud computing in a distributed e-infrastructure using the web processing service standard, Concurr. Comput. (2017)
  • D.W. Daum, Monitoring fish wheel catch using event-triggered video technology, N. Am. J. Fish Manag. (2005)
  • M. Dawkins et al., An open-source platform for underwater image and video analytics
  • M. Di Benedetto et al., Learning safety equipment detection using virtual worlds
  • M. Di Benedetto et al., Learning accurate personal protective equipment detection from virtual worlds, Multimed. Tools Appl. (2020)
  • E.M. Ditria et al., Automating the analysis of fish abundance using object detection: optimizing animal ecology with deep learning, Front. Mar. Sci. (2020)
  • DL4j, Deeplearning4j: Open-Source Distributed Deep Learning for the JVM, Apache Software Foundation License 2.0 (2016)
  • J.A. Dominquez et al., Binarization of Gray-Scaled Digital Images via Fuzzy Reasoning
  • K.M. Dunlop et al., Do agonistic behaviours bias baited remote underwater video surveys of fish?, Mar. Ecol. (2015)
  • B.C. Ebner et al., Emergence of field-based underwater video for understanding the ecology of freshwater fishes and crustaceans in Australia, J. R. Soc. West. Aust. (2014)