Over-Parameterization and Generalization in Audio Classification
arXiv - CS - Sound, Pub Date: 2021-07-19, DOI: arxiv-2107.08933
Khaled Koutini, Hamid Eghbal-zadeh, Florian Henkel, Jan Schlüter, Gerhard Widmer

Convolutional Neural Networks (CNNs) have been dominating classification tasks in various domains, such as machine vision, machine listening, and natural language processing. In machine listening, while generally exhibiting very good generalization capabilities, CNNs are sensitive to the specific audio recording device used, which has been recognized as a substantial problem in the acoustic scene classification (DCASE) community. In this study, we investigate the relationship between over-parameterization of acoustic scene classification models, and their resulting generalization abilities. Specifically, we test scaling CNNs in width and depth, under different conditions. Our results indicate that increasing width improves generalization to unseen devices, even without an increase in the number of parameters.

Updated: 2021-07-20