Automated identification of avian vocalizations with deep convolutional neural networks
Remote Sensing in Ecology and Conservation (IF 3.9), Pub Date: 2019-09-03, DOI: 10.1002/rse2.125
Zachary J. Ruff 1 , Damon B. Lesmeister 1, 2 , Leila S. Duchac 1, 2, 3 , Bharath K. Padmaraju 4 , Christopher M. Sullivan 4

Passive acoustic monitoring is an emerging approach to wildlife monitoring that leverages recent improvements in automated recording units and other technologies. A central challenge of this approach is the task of locating and identifying target species vocalizations in large volumes of audio data. To address this issue, we developed an efficient data processing pipeline using a deep convolutional neural network (CNN) to automate the detection of owl vocalizations in spectrograms generated from unprocessed field recordings. While the project was initially focused on spotted and barred owls, we also trained the network to recognize northern saw‐whet owl, great horned owl, northern pygmy‐owl, and western screech‐owl. Although classification performance varies across species, initial results are promising. Recall, or the proportion of calls in the dataset that are detected and correctly identified, ranged from 63.1% for barred owl to 91.5% for spotted owl based on raw network output. Precision, the rate of true positives among apparent detections, ranged from 0.4% for spotted owl to 77.1% for northern saw‐whet owl based on raw output. In limited tests, the CNN performed as well as or better than human technicians at detecting owl calls. Our model output is suitable for developing species encounter histories for occupancy models and other analyses. We believe our approach is sufficiently general to support long‐term, large‐scale monitoring of a broad range of species beyond our target species list, including birds, mammals, and others.
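
The two metrics defined above can be made concrete with a short sketch. The function name and the counts below are hypothetical and are not taken from the paper; they only illustrate how a detector can combine high recall with low precision, the pattern the abstract reports for spotted owl from raw network output.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from raw detection counts.

    precision: share of apparent detections that are real calls.
    recall: share of real calls that were detected and correctly identified.
    """
    detections = true_positives + false_positives
    actual_calls = true_positives + false_negatives
    precision = true_positives / detections if detections else 0.0
    recall = true_positives / actual_calls if actual_calls else 0.0
    return precision, recall


# Hypothetical counts (an assumption for illustration, not data from the study):
# 100 real calls, of which 90 are found, plus 910 false alarms.
p, r = precision_recall(true_positives=90, false_positives=910, false_negatives=10)
print(f"precision={p:.1%} recall={r:.1%}")  # precision=9.0% recall=90.0%
```

With counts like these, most real calls are recovered, but most apparent detections still need review before they can be trusted.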

Updated: 2019-09-03