Dual Attention and Element Recalibration Networks for Automatic Depression Level Prediction
IEEE Transactions on Affective Computing (IF 9.6), Pub Date: 2022-05-26, DOI: 10.1109/taffc.2022.3177737
Mingyue Niu 1 , Ziping Zhao 1 , Jianhua Tao 2 , Ya Li 3 , Bjorn W. Schuller 4

Physiological studies have found that facial dynamics can serve as biomarkers for analyzing depression severity. This paper accordingly develops a Dual Attention and Element Recalibration (DAER) network that extracts facial changes to predict the depression level. The model introduces two blocks: a Dual Attention (DA) block and an Element Recalibration (ER) block. The DA block uses self-attention to capture the dynamic changes in the representation sequence of a facial video segment, and bilinear attention to examine how the feature components of that sequence influence depression level prediction. To improve the representation ability of the network, the ER block gathers global information and uses it to recalibrate each element of the tensor. For the depression level prediction task, we first divide the long-term video into fixed-length segments and use a trained ResNet50 to encode each frame, which yields one representation sequence per video segment. Second, the representation sequences are fed into the DAER network to obtain depression level scores. Finally, the average of these scores gives the prediction for the long-term video. Experiments on the publicly available AVEC 2013 and AVEC 2014 depression databases illustrate the effectiveness of our method.
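
The three-step inference procedure in the abstract (segment the video, score each segment, average the scores) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: `encode_frame` and `score_segment` are hypothetical placeholders for the trained ResNet50 and the DAER network, and the choice of non-overlapping segments with the trailing remainder dropped is an assumption, since the abstract only says the segments have fixed length.

```python
import numpy as np


def split_into_segments(frames, segment_len):
    """Split a (T, H, W) frame array into fixed-length segments.

    Assumption: segments do not overlap and a trailing remainder shorter
    than segment_len is dropped; the abstract does not specify either.
    """
    n_segments = len(frames) // segment_len
    return [frames[i * segment_len:(i + 1) * segment_len]
            for i in range(n_segments)]


def encode_frame(frame):
    """Hypothetical placeholder for the trained ResNet50 frame encoder.

    Returns a tiny 2-dimensional feature (mean and standard deviation of
    the pixels) so that the sketch runs without a deep learning library.
    """
    return np.array([frame.mean(), frame.std()])


def score_segment(sequence):
    """Hypothetical placeholder for the DAER network.

    Maps a (segment_len, feature_dim) representation sequence to one
    score; here simply the mean of the first feature component.
    """
    return float(sequence[:, 0].mean())


def predict_video(frames, segment_len,
                  encoder=encode_frame, scorer=score_segment):
    """Average the per-segment scores to get one video-level prediction."""
    segments = split_into_segments(frames, segment_len)
    if not segments:
        raise ValueError("video is shorter than one segment")
    scores = []
    for segment in segments:
        # One feature vector per frame gives the representation sequence.
        sequence = np.stack([encoder(frame) for frame in segment])
        scores.append(scorer(sequence))
    return float(np.mean(scores))


# Toy usage: 10 synthetic 4x4 "frames" whose pixels all equal the frame index.
toy_frames = np.ones((10, 4, 4)) * np.arange(10)[:, None, None]
toy_prediction = predict_video(toy_frames, segment_len=4)
```

Passing the encoder and scorer as arguments keeps the averaging logic separate from the two learned components, so real models could be substituted for the placeholders without touching the rest of the sketch.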

Updated: 2024-08-26