Robust Harmonic Features for Classification-Based Pitch Estimation.
IEEE/ACM Transactions on Audio, Speech, and Language Processing (IF 5.4), Pub Date: 2017-02-13, DOI: 10.1109/taslp.2017.2667879
Dongmei Wang, Chengzhu Yu, John H. L. Hansen

Pitch estimation in diverse naturalistic audio streams remains a challenge for speech processing and spoken language technology. In this study, we investigate the use of robust harmonic features for classification-based pitch estimation. The proposed pitch estimation algorithm is composed of two stages: pitch candidate generation and target pitch selection. Based on energy intensity and spectral envelope shape, five types of robust harmonic features are proposed to reflect the pitch-associated harmonic structure. A neural network is adopted to model the relationship between the input harmonic features and the output pitch salience for each specific pitch candidate. In the test stage, the feature vector of each pitch candidate is passed through the neural network, and the resulting output salience indicates how likely that candidate is to be the true pitch value. Finally, exploiting the temporal continuity of pitch values, pitch contour tracking is performed using a hidden Markov model (HMM), with the Viterbi algorithm used for HMM decoding. Experimental results show that the proposed algorithm outperforms several state-of-the-art pitch estimation methods in terms of accuracy under both high and low levels of additive noise.
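
The final tracking stage described in the abstract (per-candidate salience followed by Viterbi decoding) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `viterbi_pitch_track`, the `jump_penalty` parameter, and the transition cost (a penalty proportional to the pitch jump in octaves) are assumptions made for illustration, and the salience values are treated as given rather than produced by a trained network.

```python
import math


def viterbi_pitch_track(candidates, saliences, jump_penalty=12.0):
    """Select one pitch candidate per frame with Viterbi decoding.

    candidates[t][i]: i-th candidate pitch in Hz for frame t (must be > 0)
    saliences[t][i]:  salience score in (0, 1] for that candidate
    jump_penalty:     hypothetical cost per octave of pitch change between
                      consecutive frames (an assumption, not from the paper)
    Every frame is assumed to contain at least one candidate.
    """
    if not candidates:
        return []

    # Work in the log domain; the floor avoids log(0).
    emit = [[math.log(max(s, 1e-12)) for s in frame] for frame in saliences]

    score = list(emit[0])  # best log-score of a path ending in each candidate
    back = []              # back[t-1][j]: best predecessor of candidate j in frame t

    for t in range(1, len(candidates)):
        new_score, pointers = [], []
        for j, fj in enumerate(candidates[t]):
            best_i, best = 0, -math.inf
            for i, fi in enumerate(candidates[t - 1]):
                # Assumed transition model: penalise jumps on a log-frequency scale.
                s = score[i] - jump_penalty * abs(math.log2(fj / fi))
                if s > best:
                    best_i, best = i, s
            new_score.append(best + emit[t][j])
            pointers.append(best_i)
        score = new_score
        back.append(pointers)

    # Backtrace from the best final candidate.
    j = max(range(len(score)), key=score.__getitem__)
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]


# Frame 2 has a higher salience on the octave error (204 Hz), but the
# continuity penalty keeps the contour on the lower candidate.
cands = [[100.0, 200.0], [102.0, 204.0], [104.0, 208.0]]
sals = [[0.9, 0.1], [0.4, 0.6], [0.9, 0.1]]
print(viterbi_pitch_track(cands, sals))  # [100.0, 102.0, 104.0]
```

Picking the highest-salience candidate independently in each frame would give 100, 204, 104 Hz here; the continuity term is what suppresses the isolated octave jump.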

Updated: 2019-11-01