Fine-Grained Facial Expression Recognition in the Wild
IEEE Transactions on Information Forensics and Security (IF 6.3), Pub Date: 2020-07-06, DOI: 10.1109/tifs.2020.3007327
Liqian Liang, Congyan Lang, Yidong Li, Songhe Feng, Jian Zhao

Over the past decades, research on facial expression recognition has been restricted to six basic expressions (anger, fear, disgust, happiness, sadness, and surprise). However, these six categories cannot fully describe the richness and diversity of human emotions. To enhance the recognition capabilities of computers, in this paper we focus on fine-grained facial expression recognition in the wild and build a brand-new benchmark, FG-Emotions, which extends the original six classes to thirty-three more elaborate classes and is intended to push the research frontier on this topic. FG-Emotions contains 10,371 images and 1,491 video clips annotated with the corresponding fine-grained facial expression categories and landmarks. It also provides several features (e.g., LBP features and dense trajectory features) to facilitate related research. Moreover, on top of FG-Emotions, we propose a new end-to-end Multi-Scale Action Unit (AU)-based Network (MSAU-Net) for image-based facial expression recognition, which learns a more powerful facial representation by focusing directly on locating facial action units and using a “zoom in” operation to aggregate distinctive local features. For video-based recognition, we further extend MSAU-Net to a two-stream model (TMSAU-Net) by adding an attention module and a temporal stream branch to jointly learn spatial and temporal features. (T)MSAU-Net consistently outperforms existing state-of-the-art solutions on FG-Emotions and several other datasets, and serves as a strong baseline to drive future research on fine-grained facial expression recognition in the wild.
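For readers unfamiliar with the LBP (local binary pattern) features mentioned in the abstract, the following is a minimal sketch of the textbook 8-neighbour, radius-1 variant, written with NumPy. It is an illustration only: the abstract does not say which LBP variant or parameters FG-Emotions uses, and the function name `lbp_histogram` and the neighbour ordering are assumptions made here.

```python
import numpy as np


def lbp_histogram(gray):
    """Basic 8-neighbour, radius-1 LBP over a 2-D grayscale image.

    Returns a normalized 256-bin histogram of LBP codes.
    Hypothetical helper: assumes the image is at least 3x3 pixels.
    """
    gray = np.asarray(gray, dtype=np.int32)
    h, w = gray.shape
    center = gray[1:-1, 1:-1]  # interior pixels only; the border is skipped

    # Neighbour offsets, clockwise from top-left (the ordering is a convention).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]

    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the center.
        codes |= (neighbour >= center).astype(np.int32) << bit

    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```

In practice such histograms are often computed per face region and concatenated into one descriptor; whether the dataset's features follow that scheme is not stated in the abstract.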

Updated: 2020-07-06