The Constant-Q Harmonic Coefficients: A timbre feature designed for music signals [Lecture Notes]
IEEE Signal Processing Magazine (IF 9.4), Pub Date: 5-6-2022, DOI: 10.1109/msp.2021.3138870
Zafar Rafii

Timbre is the attribute of sound that makes, for example, two musical instruments playing the same note sound different. It is typically associated with the spectral (and also the temporal) envelope and is assumed to be independent of the pitch (and also the loudness) of the sound [1]. This article shows how to design a simple but effective pitch-independent timbre feature, well adapted to musical data, by deriving it from the constant-Q transform (CQT), a log-frequency transform that matches the typical Western musical scale [2], [3]. The CQT spectrum is decomposed into an energy-normalized pitch component and a pitch-normalized spectral component, and a number of harmonic coefficients are extracted from the latter. The discriminative power of these constant-Q harmonic coefficients (CQHCs) is then evaluated on the NSynth data set [4], a publicly available, large-scale data set of musical notes, where they are compared with the mel-frequency cepstral coefficients (MFCCs) [5], a feature originally designed for speech recognition but commonly used to characterize timbre in music.
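The abstract does not spell out how the decomposition is computed, so the following is only a minimal sketch of one way it could be done. It assumes that a pitch change shifts a spectrum along the log-frequency axis, that the Fourier magnitude along that axis (which ignores shifts) can serve as the pitch-normalized spectral component, and that the unit-magnitude phase term can serve as the energy-normalized pitch component. The function names `cqhc_sketch` and `synthetic_note`, the default of 20 coefficients, and the toy spectrum are illustrative assumptions, not taken from the article.

```python
import numpy as np


def cqhc_sketch(cqt_spectrum, octave_resolution=12, number_coefficients=20):
    """Split a log-frequency spectrum into spectral and pitch components.

    Hypothetical helper: the Fourier-domain magnitude/phase split is an
    assumed realization of the decomposition described in the abstract.
    """
    number_frequencies = cqt_spectrum.shape[0]

    # Fourier transform along the log-frequency axis, zero-padded so that
    # shifts inside the original range behave like circular shifts.
    ft = np.fft.fft(cqt_spectrum, 2 * number_frequencies - 1, axis=0)
    magnitude = np.abs(ft)

    # Magnitude only: unaffected by a shift along log-frequency (pitch).
    spectral_component = np.real(np.fft.ifft(magnitude, axis=0))[:number_frequencies]

    # Unit-magnitude phase term: carries the shift but not the energy.
    pitch_component = np.real(np.fft.ifft(ft / (magnitude + 1e-16), axis=0))[:number_frequencies]

    # On a log2 axis, harmonic k sits octave_resolution * log2(k) bins
    # above the fundamental; sample the spectral component there.
    harmonics = np.arange(1, number_coefficients + 1)
    indices = np.round(octave_resolution * np.log2(harmonics)).astype(int)
    return spectral_component[indices], pitch_component


def synthetic_note(pitch_bin, number_frequencies=96, octave_resolution=12, number_harmonics=6):
    """Toy log-frequency spectrum: harmonics with 1/k amplitudes (assumed)."""
    spectrum = np.zeros(number_frequencies)
    for k in range(1, number_harmonics + 1):
        offset = int(round(octave_resolution * np.log2(k)))
        spectrum[pitch_bin + offset] += 1.0 / k
    return spectrum


# Same toy timbre played at two pitches, 7 bins (7 semitones here) apart.
coefficients_low, _ = cqhc_sketch(synthetic_note(10))
coefficients_high, _ = cqhc_sketch(synthetic_note(17))
print(np.allclose(coefficients_low, coefficients_high))
```

For the toy spectrum, the two coefficient vectors should agree up to floating-point error, because shifting the note along the log-frequency axis only changes the Fourier phase. A real CQT of recorded audio would only approximately satisfy the shift assumption.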

Updated: 2024-08-26