Emo-CNN for Perceiving Stress from Audio Signals: A Brain Chemistry Approach
arXiv - CS - Human-Computer Interaction. Pub Date: 2020-01-08, DOI: arxiv-2001.02329
Anup Anand Deshmukh, Catherine Soladie, Renaud Seguier

Emotion plays a key role in many applications, such as healthcare, where it is used to assess patients' emotional behavior. Certain emotions are given more importance because they are especially informative about human feelings. In this paper, we propose an approach that models human stress from audio signals. The research challenge in speech emotion detection is defining the very meaning of stress and categorizing it precisely. Supervised machine learning models, including state-of-the-art deep learning classification methods, rely on the availability of clean, labelled data. One problem in affective computing and emotion detection is the limited amount of annotated stress data. Moreover, the existing labelled stress datasets are highly subject to the perception of the annotator. We address the first issue, feature selection, by using traditional MFCC (Mel-frequency cepstral coefficient) features in a Convolutional Neural Network. Our experiments show that Emo-CNN consistently and significantly outperforms popular existing methods over multiple datasets. It achieves 90.2% categorical accuracy on the Emo-DB dataset. To tackle the second and more significant problem, the subjectivity of stress labels, we use Lovheim's cube, which is a 3-dimensional projection of emotions. The cube aims to explain the relationship between three monoamine neurotransmitters (serotonin, dopamine, and noradrenaline) and the positions of emotions in 3D space. The emotion representations learnt by Emo-CNN are mapped to the cube using three-component PCA (Principal Component Analysis), and the resulting positions are then used to model human stress. The proposed approach not only circumvents the need for labelled stress data but also complies with the psychological theory of emotions given by Lovheim's cube. We believe that this work is a first step towards creating a connection between Artificial Intelligence and the chemistry of human emotions.
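The mapping step described in the abstract, projecting learnt emotion representations into three dimensions with PCA, can be illustrated with a minimal sketch. This is not the authors' implementation. The random array stands in for Emo-CNN embeddings, the names `pca_3d` and `rescale_to_unit_cube` are hypothetical, and the min-max rescaling into a unit cube is an assumption made only for illustration; the abstract does not say how the principal components are aligned with the cube's axes.

```python
import numpy as np


def pca_3d(embeddings: np.ndarray) -> np.ndarray:
    """Project row-wise embeddings onto their first three principal components."""
    centered = embeddings - embeddings.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T


def rescale_to_unit_cube(points: np.ndarray) -> np.ndarray:
    """Min-max scale each axis to [0, 1] (an illustrative assumption)."""
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on a constant axis
    return (points - lo) / span


# Hypothetical stand-in for learnt representations: 200 utterances, 64 features.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(200, 64))

components = pca_3d(fake_embeddings)
cube_points = rescale_to_unit_cube(components)
print(cube_points.shape)
```

Each row of `cube_points` is then a position inside a unit cube. How such positions relate to the neurotransmitter axes, and how stress is read off from them, is defined in the paper itself and is not reproduced here.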

Updated: 2020-01-09