Fine‐scale observations of spatio‐spectro‐temporal dynamics of bird vocalizations using robot audition techniques, Remote Sensing in Ecology and Conservation


Remote Sensing in Ecology and Conservation (IF 3.9), Pub Date: 2020-03-15, DOI: 10.1002/rse2.152
Shinji Sumitani ₁, Reiji Suzuki ₁, Shiho Matsubayashi ₂, Takaya Arita ₁, Kazuhiro Nakadai ₃,₄, Hiroshi G. Okuno ₅


Ecoacoustics needs sophisticated acoustic monitoring tools to extract a wide range of features from an observed mixture of sounds. We have developed a portable acoustic monitoring system called ‘HARKBird’, which consists of a laptop PC and an inexpensive commercial microphone array with the robot audition software HARK. HARKBird extracts acoustic events from a recording and provides, for each event, its start and end times, its spatial information (e.g., position or direction relative to the microphone array), and the spectrogram of the sound separated from the original recording. In this study, we report how robot audition techniques contribute to monitoring the spatio‐spectro‐temporal dynamics of bird behaviors, using an extended yet minimal system based on multiple microphone arrays. Dimension reduction of the separated sounds is important for integrating the information from multiple microphone arrays. As the dimension reduction algorithm, we use t‐SNE both to assist manual annotation of each sound and to generate the vocalization distribution automatically. We conduct playback experiments on Spotted Towhees (Pipilo maculatus) to simulate different cases of territorial intrusion (song, call, or no playback). Our hypothesis in the playback experiments is that playback of conspecific vocalizations would invoke aggressive responses from males to song playbacks, and that these effects would be more prominent than those of call playbacks. Our primary aim is to test whether our system can extract the information on the aggressiveness of target individuals that is needed to examine this hypothesis. We show that the system, combined with manual annotation of vocalizations, can extract distinct spatio‐spectro‐temporal dynamics under the different conditions, which supported our hypothesis. We also consider spectral affinity‐based automatic matching of localized sounds from different microphone arrays. The relative number of localized songs under each playback condition showed a trend similar to that of the manual approach, implying that we can grasp the long‐term dynamics of vocalizations without costly annotations.
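
The abstract does not spell out how the spectral affinity‐based matching works. As a rough illustration only, the Python sketch below pairs sounds localized by two microphone arrays when they overlap in time and their spectral feature vectors are similar. The `Sound` record, the cosine-similarity measure, the greedy one-to-one pairing, and the `min_similarity` threshold are all assumptions made for this example; they are not taken from the paper or from the HARK/HARKBird software.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sound:
    """One separated sound from a single microphone array (hypothetical record)."""
    start: float           # onset time in seconds
    end: float             # offset time in seconds
    features: List[float]  # spectral feature vector, e.g. a time-averaged spectrum


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def overlaps(a: Sound, b: Sound) -> bool:
    """True when the two sounds share some interval of time."""
    return a.start < b.end and b.start < a.end


def match_sounds(array_a: List[Sound], array_b: List[Sound],
                 min_similarity: float = 0.8) -> List[Tuple[int, int, float]]:
    """Greedily pair sounds across two arrays, most similar pairs first.

    Returns (index_in_a, index_in_b, similarity) tuples; each sound is used
    at most once. The threshold is an assumed value, not one from the paper.
    """
    candidates = []
    for i, a in enumerate(array_a):
        for j, b in enumerate(array_b):
            if overlaps(a, b):
                similarity = cosine_similarity(a.features, b.features)
                if similarity >= min_similarity:
                    candidates.append((similarity, i, j))
    candidates.sort(reverse=True)  # highest similarity first

    used_a, used_b = set(), set()
    matches = []
    for similarity, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matches.append((i, j, similarity))
    return matches


# Toy data: two sounds heard by array A, three by array B (one unmatched).
array_a = [Sound(0.0, 1.0, [1.0, 0.0, 0.2]), Sound(2.0, 3.0, [0.0, 1.0, 0.1])]
array_b = [Sound(2.1, 3.1, [0.0, 0.9, 0.1]), Sound(0.1, 1.1, [0.9, 0.0, 0.2]),
           Sound(5.0, 6.0, [1.0, 0.0, 0.2])]
pairs = match_sounds(array_a, array_b)
```

In this toy data, the third sound in `array_b` has the same spectrum as the first sound in `array_a` but does not overlap it in time, so it stays unmatched. In a real pipeline the feature vectors could instead be low-dimensional embeddings such as t‐SNE coordinates, and matched pairs would then be passed on for position estimation from the two arrays' direction estimates.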


Updated: 2020-03-15
