Video-Based Depression Level Analysis by Encoding Deep Spatiotemporal Features
IEEE Transactions on Affective Computing (IF 9.6), Pub Date: 2018-01-01, DOI: 10.1109/taffc.2018.2870884
Mohamad Al Jazaery, Guodong Guo

As a serious mood disorder, depression causes severe symptoms that affect how people feel, think, and handle daily activities such as sleeping, eating, or working. In this paper, a novel framework is proposed to estimate Beck Depression Inventory II (BDI-II) values from video data, which uses a 3D convolutional neural network to automatically learn spatiotemporal features at two different face scales. Then, a Recurrent Neural Network (RNN) is used to learn further from the sequence of spatiotemporal information. This formulation, called RNN-C3D, can model the local and global spatiotemporal information from consecutive facial expressions in order to predict depression levels. Experiments on the AVEC2013 and AVEC2014 depression datasets show that our proposed approach is promising when compared to state-of-the-art visual-based depression analysis methods.
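To make the described pipeline concrete, the PyTorch sketch below chains a small 3D-CNN clip encoder with a GRU and a scalar regression head, in the spirit of the RNN-C3D formulation. All layer sizes, the clip length, and the use of a single face scale are assumptions for illustration only; they do not reproduce the authors' configuration, which also combines two face scales.

```python
# Minimal, hypothetical sketch of an RNN-C3D style pipeline in PyTorch.
# Layer sizes, clip length, and the single face scale are assumptions,
# not the configuration reported in the paper.
import torch
import torch.nn as nn


class C3DFeatures(nn.Module):
    """Small 3D CNN that maps a short face clip to a feature vector."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # global pooling over time and space
        )
        self.fc = nn.Linear(128, feat_dim)

    def forward(self, clip):          # clip: (B, 3, T, H, W)
        x = self.conv(clip).flatten(1)
        return self.fc(x)


class RNNC3D(nn.Module):
    """C3D features per clip, GRU over the clip sequence, scalar BDI-II output."""
    def __init__(self, feat_dim=256, hidden=128):
        super().__init__()
        self.c3d = C3DFeatures(feat_dim)
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # regression to a BDI-II score

    def forward(self, clips):             # clips: (B, N, 3, T, H, W)
        b, n = clips.shape[:2]
        feats = self.c3d(clips.flatten(0, 1)).view(b, n, -1)
        _, h = self.rnn(feats)
        return self.head(h[-1]).squeeze(-1)


if __name__ == "__main__":
    model = RNNC3D()
    video = torch.randn(2, 5, 3, 16, 64, 64)  # 2 videos, 5 clips of 16 frames each
    print(model(video).shape)                  # torch.Size([2]) -> one score per video
```

In such a setup the 3D convolutions capture local spatiotemporal dynamics within each short clip, while the recurrent layer aggregates clip-level features over the whole video before the regression head produces a single depression-level estimate.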

Updated: 2018-01-01