PERSA+: A Deep Learning Front-End for Context-Agnostic Audio Classification
arXiv - CS - Sound, Pub Date: 2021-07-20, DOI: arxiv-2107.09311
Lazaros Vrysis, Iordanis Thoidis, Charalampos Dimoulas, George Papanikolaou

Deep learning has been applied to diverse audio semantics tasks, enabling the construction of models that learn hierarchical levels of features from high-dimensional raw data and deliver state-of-the-art performance. But do these algorithms perform as well in real-world conditions, or only on benchmarks, where their high learning capacity allows them to memorize the employed datasets completely? This work presents a deep learning front-end that aims to discard detrimental information before it enters the modeling stage, keeping the learning process focused on the relevant content, with a view toward the development of robust and context-agnostic classification algorithms.

Updated: 2021-07-21