STCAM: Spatial-Temporal and Channel Attention Module for Dynamic Facial Expression Recognition



IEEE Transactions on Affective Computing (IF 11.2), Pub Date: 2020-09-29, DOI: 10.1109/taffc.2020.3027340
Weicong Chen, Dong Zhang, Ming Li, Dah-Jye Lee


Capturing the dynamics of facial expression progression in video is an essential and challenging task for facial expression recognition (FER). In this article, we propose an effective framework to address this challenge. We develop a C3D-based network architecture, 3D-Inception-ResNet, to extract spatial-temporal features from a dynamic facial expression image sequence. A Spatial-Temporal and Channel Attention Module (STCAM) is proposed to explicitly exploit the holistic spatial-temporal and channel-wise correlations among the extracted features. Specifically, the proposed STCAM calculates a channel-wise attention map and a spatial-temporal-wise attention map, and uses them to enhance the features along the corresponding dimensions, yielding more representative features. We evaluate our method on three popular dynamic facial expression recognition datasets: CK+, Oulu-CASIA, and MMI. Experimental results show that our method performs better than or comparably to state-of-the-art approaches.
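The abstract does not say how the two attention maps are computed, so the following is only a minimal NumPy sketch of the general idea, not the authors' STCAM. It assumes a single feature tensor of shape (C, T, H, W), a squeeze-and-excitation style bottleneck for the channel map, and a scalar-weighted mix of channel-pooled features for the spatial-temporal map (a stand-in for a learned 3D convolution). Applying the channel map first and the spatial-temporal map second is also an assumption. All function names and weights (`channel_attention`, `spatial_temporal_attention`, `w1`, `w2`, `w_avg`, `w_max`) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feats, w1, w2):
    """Return a (C, 1, 1, 1) map with one weight in (0, 1) per channel.

    Assumption: global average pooling followed by a two-layer bottleneck.
    """
    pooled = feats.mean(axis=(1, 2, 3))      # (C,) one statistic per channel
    hidden = np.maximum(w1 @ pooled, 0.0)    # (C_reduced,) ReLU bottleneck
    return sigmoid(w2 @ hidden).reshape(-1, 1, 1, 1)


def spatial_temporal_attention(feats, w_avg=1.0, w_max=1.0):
    """Return a (1, T, H, W) map with one weight in (0, 1) per position.

    Assumption: scalar weights replace a learned 3D convolution.
    """
    avg = feats.mean(axis=0, keepdims=True)  # average over channels
    mx = feats.max(axis=0, keepdims=True)    # maximum over channels
    return sigmoid(w_avg * avg + w_max * mx)


def attention_sketch(feats, w1, w2):
    """Rescale features by the channel map, then by the spatial-temporal map."""
    feats = feats * channel_attention(feats, w1, w2)
    return feats * spatial_temporal_attention(feats)


# Toy input: 8 channels, 4 frames, 6x6 spatial grid, random weights.
rng = np.random.default_rng(0)
C, T, H, W = 8, 4, 6, 6
feats = rng.standard_normal((C, T, H, W))
w1 = rng.standard_normal((C // 2, C)) * 0.1
w2 = rng.standard_normal((C, C // 2)) * 0.1

out = attention_sketch(feats, w1, w2)
print(out.shape)  # (8, 4, 6, 6)
```

Both maps broadcast against the feature tensor, so the output keeps the input shape and such a module could in principle be dropped between feature-extraction stages without changing the layers around it.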


Updated: 2020-09-29
