SONYC-UST-V2: An Urban Sound Tagging Dataset with Spatiotemporal Context
arXiv - CS - Sound. Pub Date: 2020-09-11. arXiv:2009.05188
Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, and Juan Pablo Bello

We present SONYC-UST-V2, a dataset for urban sound tagging with spatiotemporal information. The dataset is aimed at the development and evaluation of machine listening systems for real-world urban noise monitoring. While datasets of urban audio recordings already exist, this dataset provides the opportunity to investigate how spatiotemporal metadata can aid in the prediction of urban sound tags. SONYC-UST-V2 consists of 18,510 audio recordings from the "Sounds of New York City" (SONYC) acoustic sensor network, each accompanied by the timestamp of audio acquisition and the location of the recording sensor. The dataset contains annotations by volunteers on the Zooniverse citizen science platform, as well as a two-stage verification by our team. In this article, we describe our data collection procedure, propose evaluation metrics for multilabel classification of urban sound tags, and report the results of a simple baseline model that exploits spatiotemporal information.
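As a minimal sketch of how multilabel urban sound tagging is typically scored, the snippet below computes a macro-averaged area under the precision-recall curve (AUPRC): average precision is computed per tag and then averaged across tags, so rare tags count as much as common ones. The tag count, clip count, and all scores here are invented for illustration; the abstract does not specify the exact metric definitions, so this is an assumption about the general evaluation style, not the paper's protocol.

```python
import numpy as np
from sklearn.metrics import average_precision_score

# Hypothetical ground truth for 3 clips over 4 coarse tags
# (1 = tag present in the clip, 0 = absent); values are made up.
y_true = np.array([
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [1, 1, 0, 1],
])

# Hypothetical model scores (per-tag probabilities) for the same clips.
y_score = np.array([
    [0.2, 0.2, 0.7, 0.1],
    [0.9, 0.8, 0.2, 0.4],
    [0.6, 0.7, 0.1, 0.9],
])

# Macro-averaged AUPRC: compute average precision for each tag
# independently, then take the unweighted mean over tags.
macro_auprc = average_precision_score(y_true, y_score, average="macro")
print(f"macro AUPRC: {macro_auprc:.3f}")
```

Macro averaging is a common choice for this kind of taxonomy because class frequencies in sensor-network audio are highly imbalanced; a micro-averaged score would be dominated by the most frequent tags.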

Updated: 2020-09-14