Polyphonic pitch tracking with deep layered learning. The Journal of the Acoustical Society of America


Polyphonic pitch tracking with deep layered learning.
The Journal of the Acoustical Society of America (IF 2.4), Pub Date: 2020-07-30, DOI: 10.1121/10.0001468
Anders Elowsson

This article presents a polyphonic pitch tracking system that extracts both framewise and note-based estimates from audio. The system uses several artificial neural networks, each trained individually, in a deep layered learning setup. First, cascading networks are applied to a spectrogram for framewise fundamental frequency (f₀) estimation. A sparse receptive field is learned by the first network and then used as a filter kernel for parameter sharing throughout the system. The f₀ activations are connected across time to extract pitch contours. These contours define a framework within which subsequent networks perform onset and offset detection, operating simultaneously across time and across smaller pitch fluctuations. As input, these networks use, among other things, variations of latent representations from the f₀ estimation network. Finally, erroneous tentative notes are removed one by one in an iterative procedure that allows a network to classify each note within a correct context. The system was evaluated on four public test sets (MAPS, Bach10, TRIOS, and the MIREX Woodwind quintet) and achieved state-of-the-art results on all four. It performs well across all subtasks: f₀, pitched onset, and pitched offset tracking.
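To make the step "f₀ activations are connected across time to extract pitch contours" more concrete, the following is a minimal sketch of one generic way such linking could be done. It is not the method from the article: the greedy nearest-bin rule, the function names `peak_bins` and `link_contours`, and the parameters `threshold` and `max_jump` are all assumptions chosen for illustration, and the toy activation matrix is made up.

```python
import numpy as np


def peak_bins(frame, threshold):
    """Return pitch bins that exceed the threshold and are local maxima."""
    peaks = []
    for b, value in enumerate(frame):
        left = frame[b - 1] if b > 0 else -np.inf
        right = frame[b + 1] if b < len(frame) - 1 else -np.inf
        if value >= threshold and value > left and value >= right:
            peaks.append(b)
    return peaks


def link_contours(activations, threshold=0.5, max_jump=1):
    """Greedily link framewise f0 peaks into contours (illustrative only).

    activations: 2-D array of shape (n_frames, n_pitch_bins).
    Returns a list of contours, each a list of (frame, pitch_bin) tuples.
    """
    open_contours = []  # contours that reached the previous frame
    finished = []
    for t in range(activations.shape[0]):
        active = peak_bins(activations[t], threshold)
        used = set()
        still_open = []
        for contour in open_contours:
            last_bin = contour[-1][1]
            # Candidate continuations: unused peaks within max_jump bins.
            candidates = [b for b in active
                          if b not in used and abs(b - last_bin) <= max_jump]
            if candidates:
                best = max(candidates, key=lambda b: activations[t, b])
                contour.append((t, best))
                used.add(best)
                still_open.append(contour)
            else:
                finished.append(contour)  # no continuation: contour ends
        # Peaks not claimed by an existing contour start new contours.
        for b in active:
            if b not in used:
                still_open.append([(t, b)])
        open_contours = still_open
    finished.extend(open_contours)
    return finished


# Toy activation matrix (hypothetical values): 4 frames, 6 pitch bins.
act = np.zeros((4, 6))
act[0, 2], act[1, 2], act[2, 3], act[3, 3] = 0.9, 0.8, 0.7, 0.9
act[1, 5], act[2, 5] = 0.6, 0.6
contours = sorted(link_contours(act))
```

In this sketch a contour may drift by at most `max_jump` bins per frame, which loosely mirrors the idea of following small pitch fluctuations; a real system would typically also bridge short gaps and work from learned activations rather than a fixed threshold.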


Updated: 2020-08-01
