Modified dense convolutional networks based emotion detection from speech using its paralinguistic features
Multimedia Tools and Applications ( IF 3.0 ) Pub Date : 2021-07-22 , DOI: 10.1007/s11042-021-11210-6
Ritika Dhiman 1 , Gurkanwal Singh Kang 1 , Varun Gupta 1

Recognizing emotion from speech is fundamental to human interaction, because modulations in speech convey different emotions and context. In this paper, we propose a modified dense convolutional network (modified DenseNet201) that detects emotion from speech using paralinguistic features such as vocal tract features. The network classifies emotions from spectrograms of the audio files. It outperforms alternative models, including residual networks, AlexNet, VGG16, SVM, XGBoost, and boosted random forest, on speech emotion classification. It also surpasses existing methods reported in the literature and obtains state-of-the-art results in most cases. Finally, the network has been validated on datasets in two different languages, ‘EmoDB’ and ‘SAVEE’, which qualifies it as a language-independent system for detecting emotion from speech.
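
The abstract states that classification runs on spectrograms of the audio files but gives no settings for them. As a rough illustration only, the sketch below computes a log-magnitude spectrogram with a Hann window and a naive DFT, using only the Python standard library. The function name `log_spectrogram`, the frame length, the hop size, and the synthetic test tone are all assumptions made for this example; they are not the authors' preprocessing, and the DenseNet201 modification itself is not shown.

```python
import cmath
import math


def log_spectrogram(signal, frame_len=64, hop=32, eps=1e-10):
    """Return a log-magnitude spectrogram as a list of frames.

    Each frame is a list of frame_len // 2 + 1 magnitudes in decibels.
    Parameter values are illustrative assumptions, not the paper's settings.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT of one frequency bin; fine for a sketch,
            # far too slow for real recordings.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            row.append(20 * math.log10(abs(acc) + eps))
        frames.append(row)
    return frames


# Hypothetical input: a 1 kHz sine sampled at 8 kHz (not data from the paper).
sample_rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(512)]
spec = log_spectrogram(tone)

# Index of the strongest frequency bin in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

With these assumed settings each bin spans 125 Hz, so the energy of the test tone concentrates around the bin for 1 kHz. A resulting time-frequency grid of this kind is what an image classifier would consume; a practical pipeline would use an FFT-based routine rather than the naive DFT shown here.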




Updated: 2021-07-22