Hilbert–Huang–Hurst-based non-linear acoustic feature vector for emotion classification with stochastic models and learning systems
IET Signal Processing (IF 1.1), Pub Date: 2020-10-02, DOI: 10.1049/iet-spr.2019.0383
Vinícius Vieira, Rosângela Coelho, Francisco Marcos Assis

This study presents a broad analysis of affective vocal expression classification systems. The Hilbert–Huang–Hurst coefficient (HHHC) vector is proposed as a non-linear vocal source feature that represents emotional states according to their effects on the speech production mechanism. Affective states are highlighted by an empirical mode decomposition-based method, which exploits the non-stationarity of the acoustic variations. Hurst coefficients are then estimated from the decomposition modes to form the feature vector. Additionally, a vector of the index of non-stationarity (INS) is introduced as dynamic information to complement the HHHC. The proposed feature vector is evaluated in speech emotion classification experiments on three databases in German and English. Three state-of-the-art acoustic feature vectors are adopted as baselines. The α-integrated Gaussian mixture model (α-GMM) is also introduced for emotion representation and classification, and its performance is compared with that of competing stochastic and machine learning classifiers. Results demonstrate that the HHHC leads to significant classification improvement over the baseline acoustic feature vectors. Moreover, the α-GMM outperforms the competing classification methods. Finally, the complementarity of HHHC and INS is evaluated for the GeMAPS and eGeMAPS feature sets.
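
To illustrate the step in which one Hurst coefficient is estimated per decomposition mode, the following is a minimal sketch and not the authors' implementation. The abstract does not say which Hurst estimator or empirical mode decomposition routine the paper uses, so the aggregated-variance estimator here is a simple stand-in, the function names `hurst_aggregated_variance` and `hurst_feature_vector` are hypothetical, and the modes are assumed to be already available as one-dimensional arrays.

```python
import numpy as np


def hurst_aggregated_variance(x, block_sizes=(4, 8, 16, 32, 64)):
    """Estimate a Hurst exponent with the aggregated-variance method.

    Assumption: this estimator is a stand-in chosen for brevity; it is not
    taken from the paper. For a series with Hurst exponent H, the variance
    of block means over blocks of size m scales roughly as m**(2H - 2).
    """
    x = np.asarray(x, dtype=float)
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        if n_blocks < 2:
            continue  # too few blocks to estimate a variance
        # Average the series over non-overlapping blocks of length m.
        means = x[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_m.append(np.log(m))
        log_var.append(np.log(means.var()))
    # Slope of log-variance against log-block-size is about 2H - 2.
    slope, _ = np.polyfit(log_m, log_var, 1)
    return 1.0 + slope / 2.0


def hurst_feature_vector(modes):
    """Stack one Hurst estimate per mode into a feature vector.

    Assumption: `modes` is an iterable of 1-D arrays (for example, the
    intrinsic mode functions returned by some EMD routine).
    """
    return np.array([hurst_aggregated_variance(mode) for mode in modes])
```

As a sanity check on this sketch, white Gaussian noise should give an estimate close to 0.5, while a strongly persistent series such as a random walk should give an estimate close to 1.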

Updated: 2020-10-06