

Zero-shot Learning for Audio-based Music Classification and Tagging
arXiv - CS - Sound. Pub Date: 2019-07-05, DOI: arxiv-1907.02670
Jeong Choi, Jongpil Lee, Jiyoung Park, and Juhan Nam

Audio-based music classification and tagging is typically based on categorical supervised learning with a fixed set of labels. Such a setup inherently cannot handle unseen labels, such as newly added music genres or semantic words that users arbitrarily choose for music retrieval. Zero-shot learning can address this problem by leveraging an additional semantic space of labels, in which side information about the labels is used to reveal the relationships among them. In this work, we investigate zero-shot learning in the music domain and organize two different setups of side information. One uses human-labeled attribute information based on the Free Music Archive and OpenMIC-2018 datasets. The other uses general word semantic information based on the Million Song Dataset and Last.fm tag annotations. Considering that a music track is usually multi-labeled in music classification and tagging datasets, we also propose a data split scheme and associated evaluation settings for multi-label zero-shot learning. Finally, we report experimental results and discuss the effectiveness and new possibilities of zero-shot learning in the music domain.
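
To make the idea of a shared semantic space concrete, here is a minimal sketch in plain Python. It is not the authors' model: the 3-dimensional label vectors, the track names, the audio embedding, and the helper functions `cosine`, `rank_labels`, and `split_tracks` are all hypothetical, and cosine similarity stands in for whatever compatibility function a real system would learn. The sketch assumes an audio embedding has already been projected into the same space as the label vectors. Because every label, seen or unseen, has a side-information vector, an unseen label can still be scored. The second helper shows one simple way to group multi-labeled tracks by label type, which is an illustrative assumption rather than the split scheme proposed in the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_labels(audio_vec, label_vecs):
    """Rank every label by similarity to an audio embedding, best first."""
    scores = {name: cosine(audio_vec, vec) for name, vec in label_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def split_tracks(track_labels, seen):
    """Group tracks by whether their labels are all seen, all unseen, or mixed."""
    groups = {"seen_only": [], "unseen_only": [], "mixed": []}
    for track, labels in track_labels.items():
        n_seen = sum(1 for label in labels if label in seen)
        if n_seen == len(labels):
            key = "seen_only"
        elif n_seen == 0:
            key = "unseen_only"
        else:
            key = "mixed"
        groups[key].append(track)
    return groups


# Hypothetical side information: one toy vector per label.
LABEL_VECS = {
    "rock": [1.0, 0.0, 0.2],
    "jazz": [0.0, 1.0, 0.1],
    "ambient": [0.1, 0.2, 1.0],  # assumed to be unseen during training
}
SEEN = {"rock", "jazz"}
TRACKS = {"t1": ["rock"], "t2": ["rock", "ambient"], "t3": ["ambient"]}

# Hypothetical audio embedding, assumed already mapped into the label space.
audio_vec = [0.0, 0.3, 0.9]

ranking = rank_labels(audio_vec, LABEL_VECS)
groups = split_tracks(TRACKS, SEEN)
print(ranking)
print(groups)
```

In this toy setup the unseen label "ambient" can be ranked alongside the seen labels purely because it has a vector in the label space. In a multi-label dataset, tracks such as "t2" that mix seen and unseen labels are the cases a split scheme has to decide how to treat.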


Updated: 2020-03-20
