Audio-to-Score Alignment Using Deep Automatic Music Transcription
arXiv - CS - Multimedia. Pub Date: 2021-07-27, DOI: arxiv-2107.12854
Federico Simonetta, Stavros Ntalampiras, Federico Avanzini

Audio-to-score alignment (A2SA) is a multimodal task that consists of aligning audio signals to music scores. Recent literature confirms the benefits of Automatic Music Transcription (AMT) for A2SA at the frame level. In this work, we elaborate on the use of AMT Deep Learning (DL) models to achieve alignment at the note level. We propose a method that benefits from HMM-based score-to-score alignment and AMT, showing a remarkable advancement beyond the state of the art. We design a systematic procedure to take advantage of large datasets that do not offer an aligned score. Finally, we perform a thorough comparison and extensive tests on multiple datasets.

Updated: 2021-07-28
