Piano multipitch estimation using sparse coding embedded deep learning,EURASIP Journal on Audio, Speech, and Music Processing

EURASIP Journal on Audio, Speech, and Music Processing (IF 1.7), Pub Date: 2018-09-12, DOI: 10.1186/s13636-018-0132-x
Xingda Li, Yujing Guan, Yingnian Wu, Zhongbo Zhang

As the foundation of many applications, multipitch estimation has long been a focus of acoustic music processing; however, existing algorithms perform poorly because of the problem's complexity. In this paper, we employ deep learning to address the piano multipitch estimation problem by proposing MPENet, a network based on a novel multimodal sparse incoherent non-negative matrix factorization (NMF) layer. This layer originates from a multimodal NMF problem with a Lorentzian-BlockFrobenius sparsity constraint and incoherentness regularization. Experiments show that MPENet achieves state-of-the-art performance (83.65% F-measure for polyphony level 6) on the RAND subset of the MAPS dataset. MPENet enables NMF to do online learning and accomplishes multi-label classification using only monophonic samples as training data. In addition, our layer algorithms can be easily modified and redeveloped for a wide variety of problems.
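
To make the general idea concrete, the following is a minimal sketch of sparse NMF-style pitch detection against a fixed dictionary of per-pitch spectral templates, followed by an F-measure computation. It is not MPENet or the paper's multimodal Lorentzian-BlockFrobenius layer: it swaps in a plain L1 penalty with standard multiplicative updates. All function names, the threshold, the sparsity weight, and the toy six-bin templates are assumptions made for illustration only.

```python
# Illustrative sketch only: plain sparse non-negative decomposition,
# NOT the paper's MPENet layer. All names and numbers are hypothetical.

def estimate_activations(templates, frame, sparsity=0.01, iterations=200, eps=1e-12):
    """Minimise 0.5 * ||frame - W h||^2 + sparsity * sum(h) subject to h >= 0.

    templates: K spectral templates (lists of F non-negative floats), one per
               pitch, assumed to have been learned from monophonic notes.
    frame:     F non-negative magnitudes for a single spectrogram frame.
    """
    n_bins = len(frame)
    h = [1.0] * len(templates)
    for _ in range(iterations):
        # Reconstruction W h with the current activations.
        recon = [sum(h[k] * w[f] for k, w in enumerate(templates))
                 for f in range(n_bins)]
        # Multiplicative update: h <- h * (W^T v) / (W^T W h + sparsity).
        for k, w in enumerate(templates):
            numer = sum(w[f] * frame[f] for f in range(n_bins))
            denom = sum(w[f] * recon[f] for f in range(n_bins)) + sparsity + eps
            h[k] *= numer / denom
    return h


def detect_pitches(templates, pitches, frame, threshold=0.1):
    """Return the set of pitches whose activation is at least threshold * max."""
    h = estimate_activations(templates, frame)
    peak = max(h)
    if peak <= 0.0:
        return set()
    return {p for p, a in zip(pitches, h) if a >= threshold * peak}


def f_measure(estimated, reference):
    """Harmonic mean of precision and recall over two sets of pitch labels."""
    true_positives = len(estimated & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(estimated)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


# Toy dictionary: three made-up templates over six frequency bins,
# labelled with MIDI note numbers (C4, E4, G4).
pitches = [60, 64, 67]
templates = [
    [1.0, 0.0, 0.5, 0.0, 0.2, 0.0],  # "C4", overlaps "G4" in bin 4
    [0.0, 1.0, 0.0, 0.5, 0.0, 0.0],  # "E4"
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.5],  # "G4"
]

# A polyphonic frame built as the sum of the "C4" and "G4" templates.
frame = [a + b for a, b in zip(templates[0], templates[2])]

detected = detect_pitches(templates, pitches, frame)
score = f_measure(detected, {60, 67})
```

The dictionary here is built from single-note templates only, while the test frame is polyphonic, which mirrors in a very reduced form the abstract's point about training on monophonic samples and predicting multiple labels.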

Updated: 2018-09-12
