Real-Time Speech Emotion Analysis for Smart Home Assistants
IEEE Transactions on Consumer Electronics ( IF 4.3 ) Pub Date : 2021-02-10 , DOI: 10.1109/tce.2021.3056421
Rajdeep Chatterjee , Saptarshi Mazumdar , R. Simon Sherratt , Rohit Halder , Tanmoy Maitra , Debasis Giri

Artificial Intelligence (AI) based Speech Emotion Recognition (SER) has been widely used in the consumer field for the control of smart home personal assistants, with many such devices on the market. However, with increasing computational power and connectivity, and the need to enable people to live in the home for longer through the use of technology, smart home assistants that can detect human emotion would improve communication between the user and the assistant, enabling the assistant to offer more productive feedback. Thus, the aim of this work is to analyze emotional states in speech and to propose a suitable method, considering performance versus complexity, for deployment in Consumer Electronics home products, together with a practical live demonstration of the research. In this article, a comprehensive approach to human speech-based emotion analysis is introduced. A 1-D convolutional neural network (CNN) is implemented to learn and classify the emotions associated with human speech. The approach is evaluated on standard emotion-classification datasets: the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and the Toronto Emotional Speech Set (TESS) (Young and Old). The proposed approach achieves 90.48%, 95.79%, and 94.47% classification accuracy on these datasets, respectively. We conclude that the 1-D CNN classification models used in speaker-independent experiments are highly effective for the automatic prediction of emotion and are well suited to deployment in smart home assistants for emotion detection.
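To make the kind of pipeline described above concrete, the sketch below builds a small 1-D CNN speech emotion classifier in Keras over per-clip MFCC features extracted with librosa. It is a minimal illustration only: the feature choice, layer sizes, and class count (the eight RAVDESS emotion categories) are assumptions for this sketch, not the authors' reported configuration.

```python
# Minimal sketch of a 1-D CNN speech emotion classifier.
# Assumptions (not from the paper): MFCC front end, layer sizes, 8 classes.
import numpy as np
import librosa
from tensorflow.keras import layers, models

NUM_MFCC = 40      # assumed number of MFCC coefficients per frame
NUM_CLASSES = 8    # RAVDESS defines 8 emotion categories

def extract_features(wav_path, sr=22050):
    """Load a clip and return a fixed-length 1-D MFCC feature vector."""
    y, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=NUM_MFCC)
    return np.mean(mfcc.T, axis=0)        # shape: (NUM_MFCC,)

def build_model(input_len=NUM_MFCC, num_classes=NUM_CLASSES):
    """A small 1-D CNN classifier over the per-clip feature vector."""
    model = models.Sequential([
        layers.Input(shape=(input_len, 1)),
        layers.Conv1D(64, kernel_size=5, activation="relu", padding="same"),
        layers.MaxPooling1D(pool_size=2),
        layers.Conv1D(128, kernel_size=5, activation="relu", padding="same"),
        layers.GlobalMaxPooling1D(),
        layers.Dropout(0.3),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Usage: X has shape (num_clips, NUM_MFCC, 1); y holds integer emotion labels.
# model = build_model()
# model.fit(X, y, epochs=50, batch_size=32, validation_split=0.2)
```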

Updated: 2021-02-10