Speech Emotion Recognition Based on Multi-feature and Multi-lingual Fusion, arXiv - CS - Computation and Language



arXiv - CS - Computation and Language. Pub Date: 2020-01-16, DOI: arxiv-2001.05908
Chunyi Wang

A speech emotion recognition algorithm based on multi-feature and multi-lingual fusion is proposed to address the low recognition accuracy caused by the lack of large speech datasets and the low robustness of acoustic features. First, handcrafted and deep automatic features are extracted from existing Chinese and English emotional speech data. Then, the various features are fused separately for each language. Finally, the fused features of the different languages are fused again and used to train a classification model. Comparing the fused features with the unfused ones, the results show that the fused features significantly improve the accuracy of the speech emotion recognition algorithm. The proposed solution is evaluated on two Chinese corpora and two English corpora and is shown to provide more accurate predictions than the original solution. The study concludes that the multi-feature and multi-lingual fusion algorithm can significantly improve speech emotion recognition accuracy when the dataset is small.
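The three steps in the abstract (extract features, fuse them per language, then fuse across languages and train a classifier) can be illustrated with a minimal sketch. The abstract does not name the fusion operator, the feature extractors, or the classifier, so everything below is an assumption made for illustration: fusion within a language is taken to be concatenation of standardized feature vectors, fusion across languages is taken to be pooling samples into one training set, and a simple nearest-centroid classifier stands in for the unspecified classification model. The function and class names are hypothetical, and the demo uses synthetic arrays in place of real speech features.

```python
import numpy as np


def zscore(x: np.ndarray) -> np.ndarray:
    """Standardize each feature column to zero mean and unit variance."""
    std = x.std(axis=0)
    # Constant columns have std 0; divide by 1 instead to avoid NaNs.
    return (x - x.mean(axis=0)) / np.where(std > 0, std, 1.0)


def fuse_features(handcrafted: np.ndarray, deep: np.ndarray) -> np.ndarray:
    """Fuse two feature sets for one language.

    Assumption: fusion is plain concatenation of standardized features.
    """
    return np.concatenate([zscore(handcrafted), zscore(deep)], axis=1)


def fuse_languages(per_language):
    """Pool (features, labels) pairs from several languages.

    Assumption: the second fusion stage stacks samples from all languages
    into one training set, so every language must share the same fused
    feature dimension.
    """
    feats = np.concatenate([f for f, _ in per_language], axis=0)
    labels = np.concatenate([y for _, y in per_language], axis=0)
    return feats, labels


class NearestCentroid:
    """Stand-in classifier: assign each sample to the closest class mean."""

    def fit(self, x: np.ndarray, y: np.ndarray) -> "NearestCentroid":
        self.classes_ = np.unique(y)
        self.centroids_ = np.stack(
            [x[y == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict(self, x: np.ndarray) -> np.ndarray:
        # Pairwise Euclidean distances, shape (n_samples, n_classes).
        dists = np.linalg.norm(
            x[:, None, :] - self.centroids_[None, :, :], axis=2
        )
        return self.classes_[dists.argmin(axis=1)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    corpora = []
    for _ in range(2):  # synthetic stand-ins for two language corpora
        labels = rng.integers(0, 4, size=60)
        handcrafted = rng.normal(size=(60, 8)) + labels[:, None]
        deep = rng.normal(size=(60, 16)) + labels[:, None]
        corpora.append((fuse_features(handcrafted, deep), labels))

    x, y = fuse_languages(corpora)
    model = NearestCentroid().fit(x, y)
    print("training accuracy:", (model.predict(x) == y).mean())
```

Standardizing before concatenation keeps one feature family from dominating the distance computation merely because of its scale; other fusion schemes (weighted sums, learned projections) would fit the same outline.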


Updated: 2020-01-17
