Spectral-Temporal Receptive Field-Based Descriptors and Hierarchical Cascade Deep Belief Network for Guitar Playing Technique Classification, IEEE Transactions on Cybernetics

IEEE Transactions on Cybernetics (IF 9.4), Pub Date: 2020-09-16, DOI: 10.1109/tcyb.2020.3014207
Chien-Yao Wang, Pao-Chi Chang, Jian-Jiun Ding, Tzu-Chiang Tai, Andri Santoso, Yu-Ting Liu, Jia-Ching Wang

Music information retrieval is of great interest in audio signal processing, but relatively little attention has been paid to the playing techniques of musical instruments. This work proposes an automatic system for classifying guitar playing techniques (GPTs). Automatic GPT classification is challenging because some playing techniques differ only slightly from others. The proposed framework uses a new feature extraction method based on spectral-temporal receptive fields (STRFs) to extract features from guitar sounds, and applies a supervised deep learning approach to classify GPTs. Specifically, a new deep learning model, called the hierarchical cascade deep belief network (HCDBN), is proposed to perform automatic GPT classification. Several simulations were performed, and performance was compared on three datasets: 1) data on onsets of signals; 2) complete audio signals; and 3) audio signals in a real-world environment. The proposed system improves the F-score by approximately 11.47% in setup 1) and yields an F-score of 96.82% in setup 2). The results in setup 3) demonstrate that the proposed system also works well in a real-world environment. These results show that the proposed system is robust and achieves very high accuracy in automatic GPT classification.
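The abstract reports results as F-scores but does not say how the score is averaged across technique classes. As a minimal sketch, assuming a macro-averaged F1 (an assumption, not something the abstract confirms), the metric could be computed as follows. The function name `macro_f_score` and the technique labels are hypothetical and serve only as illustration.

```python
from collections import Counter


def macro_f_score(y_true, y_pred):
    """Macro-averaged F1 over all classes seen in y_true or y_pred.

    Assumption: the abstract does not state the averaging scheme;
    macro averaging is used here purely for illustration.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    per_class = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical technique labels, not taken from the paper
y_true = ["bend", "bend", "slide", "slide", "mute", "mute"]
y_pred = ["bend", "slide", "slide", "slide", "mute", "mute"]
score = macro_f_score(y_true, y_pred)
print(f"macro F-score: {score:.4f}")
```

Macro averaging weights every technique equally, which matters when some techniques are rarer than others; a micro-averaged or weighted score would give different numbers on imbalanced data.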


Updated: 2024-08-22
