Piano multipitch estimation using sparse coding embedded deep learning
EURASIP Journal on Audio, Speech, and Music Processing (IF 1.7), Pub Date: 2018-09-12, DOI: 10.1186/s13636-018-0132-x
Xingda Li, Yujing Guan, Yingnian Wu, Zhongbo Zhang

As the foundation of many applications, multipitch estimation has long been a focus of acoustic music processing; however, existing algorithms perform poorly because of the problem's complexity. In this paper, we apply deep learning to piano multipitch estimation by proposing MPENet, which is built on a novel multimodal sparse incoherent non-negative matrix factorization (NMF) layer. This layer originates from a multimodal NMF problem with a Lorentzian-BlockFrobenius sparsity constraint and incoherence regularization. Experiments show that MPENet achieves state-of-the-art performance (83.65% F-measure at polyphony level 6) on the RAND subset of the MAPS dataset. MPENet enables NMF to learn online and accomplishes multi-label classification using only monophonic samples as training data. In addition, our layer algorithms can be easily modified and redeveloped for a wide variety of problems.
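To make the idea of decomposing a polyphonic frame over monophonic note templates concrete, here is a minimal sketch of plain sparse NMF activation estimation with a fixed dictionary. It is not MPENet or the paper's layer: the Lorentzian-BlockFrobenius constraint, the incoherence term, and the multimodal structure are all omitted, and a simple L1 penalty is swapped in. The function names (`harmonic_template`, `sparse_activations`), the toy 1/h harmonic spectra, the penalty weight `lam`, and the 0.5 detection threshold are assumptions made for illustration only.

```python
def harmonic_template(f0_bin, n_bins):
    """Toy spectrum of one note: peaks at multiples of f0_bin, decaying as 1/h."""
    spectrum = [0.0] * n_bins
    h = 1
    while f0_bin * h < n_bins:
        spectrum[f0_bin * h] = 1.0 / h
        h += 1
    return spectrum


def sparse_activations(templates, frame, lam=0.01, n_iter=200, eps=1e-12):
    """Estimate non-negative activations h with sum_k h[k] * templates[k] ~ frame.

    Multiplicative updates for 0.5 * ||frame - W h||^2 + lam * sum(h),
    where the dictionary W (the templates) is held fixed.
    """
    n_pitches = len(templates)
    h = [1.0] * n_pitches
    # W^T v does not change between iterations, so compute it once.
    wt_v = [sum(t * x for t, x in zip(tmpl, frame)) for tmpl in templates]
    for _ in range(n_iter):
        # Current reconstruction W h, computed before any h[k] is updated.
        recon = [sum(h[k] * templates[k][b] for k in range(n_pitches))
                 for b in range(len(frame))]
        for k in range(n_pitches):
            wt_recon = sum(t * r for t, r in zip(templates[k], recon))
            h[k] *= wt_v[k] / (wt_recon + lam + eps)
    return h


# Hypothetical toy setup: four "notes" identified by their fundamental bin.
N_BINS = 64
F0_BINS = [3, 4, 5, 7]
TEMPLATES = [harmonic_template(f0, N_BINS) for f0 in F0_BINS]

# A two-note "chord" built by summing the monophonic templates for bins 3 and 5.
frame = [a + b for a, b in zip(TEMPLATES[0], TEMPLATES[2])]

activations = sparse_activations(TEMPLATES, frame)
detected = [F0_BINS[k] for k, a in enumerate(activations) if a > 0.5]
print(detected)
```

The dictionary here is built only from single-note spectra, which loosely mirrors the abstract's point about training on monophonic samples; the learned, regularized layer described in the paper would replace both the hand-made templates and the fixed update rule.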

Updated: 2018-09-12