Towards Robust Deep Neural Networks for Affect and Depression Recognition from Speech
arXiv - CS - Sound. Pub Date: 2019-11-01. arXiv: 1911.00310
Alice Othmani, Daoud Kadoch, Kamil Bentounes, Emna Rejaibi, Romain Alfred, and Abdenour Hadid

Intelligent monitoring systems and affective computing applications have emerged in recent years to enhance healthcare. Examples of these applications include the assessment of affective states such as Major Depressive Disorder (MDD). MDD is characterized by the persistent expression of negative emotions (low valence) and a lack of interest (low arousal). High-performing intelligent systems would enhance MDD diagnosis in its early stages. In this paper, we present a new deep neural network architecture, called EmoAudioNet, for emotion and depression recognition from speech. EmoAudioNet learns jointly from the time-frequency representation of the audio signal and the visual representation of its frequency spectrum. Our model shows very promising results in predicting affect and depression: it performs comparably to or outperforms state-of-the-art methods on several evaluation metrics on the RECOLA and DAIC-WOZ datasets when predicting arousal, valence, and depression. The code of EmoAudioNet is publicly available on GitHub: https://github.com/AliceOTHMANI/EmoAudioNet
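To make the dual-input idea concrete, below is a minimal sketch of a network that learns from both a time-frequency representation (here, MFCCs) and a spectrogram treated as an image, fused before a regression head. This is an illustration under stated assumptions, not the authors' exact EmoAudioNet architecture (see the GitHub repository above for that): the feature choices, layer sizes, and the fixed input length are all hypothetical.

```python
# Sketch of a dual-branch audio model: one branch over a time-frequency
# feature sequence (MFCCs), one over a spectrogram "image", fused for a
# single regression output (e.g. an arousal/valence or depression score).
# Assumptions: librosa for feature extraction, Keras for the model, and
# inputs padded/cropped to a fixed number of frames (t_steps).
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, Model

def extract_inputs(wav_path, sr=16000, n_mfcc=40, n_mels=128):
    """Compute the two input representations from a raw audio file."""
    y, sr = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)      # (n_mfcc, T)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)                # (n_mels, T)
    # Time-major MFCC sequence and image-like spectrogram (add channel axis);
    # both would still need padding/cropping to t_steps frames.
    return mfcc.T, mel_db[..., np.newaxis]

def build_dual_branch_model(t_steps, n_mfcc=40, n_mels=128):
    # Branch 1: 1-D convolutions over the MFCC time series.
    mfcc_in = layers.Input(shape=(t_steps, n_mfcc), name="mfcc")
    x = layers.Conv1D(64, 5, activation="relu")(mfcc_in)
    x = layers.GlobalAveragePooling1D()(x)

    # Branch 2: 2-D convolutions over the spectrogram image.
    spec_in = layers.Input(shape=(n_mels, t_steps, 1), name="spectrogram")
    z = layers.Conv2D(32, (3, 3), activation="relu")(spec_in)
    z = layers.MaxPooling2D((2, 2))(z)
    z = layers.GlobalAveragePooling2D()(z)

    # Fuse both representations before the prediction head.
    fused = layers.concatenate([x, z])
    fused = layers.Dense(64, activation="relu")(fused)
    out = layers.Dense(1, name="score")(fused)
    return Model(inputs=[mfcc_in, spec_in], outputs=out)

model = build_dual_branch_model(t_steps=256)
model.compile(optimizer="adam", loss="mse")
```

The design point the abstract makes is the fusion itself: each branch sees a different view of the same signal, and the concatenated embedding lets the head exploit cues from both.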

Updated: 2020-11-19