Adaptive Multilayer Perceptual Attention Network for Facial Expression Recognition
IEEE Transactions on Circuits and Systems for Video Technology (IF 8.4), Pub Date: 2022-04-06, DOI: 10.1109/tcsvt.2022.3165321
Hanwei Liu 1, Huiling Cai 1, Qingcheng Lin 1, Xuefeng Li 2, Hui Xiao 1

In complex real-world situations, problems such as illumination changes, facial occlusion, and variant poses make facial expression recognition (FER) a challenging task. To solve the robustness problem, this paper proposes an adaptive multilayer perceptual attention network (AMP-Net) that is inspired by the facial attributes and the facial perception mechanism of the human visual system. AMP-Net extracts global, local, and salient facial emotional features at different granularities to learn the underlying diversity and key information of facial emotions. Different from existing methods, AMP-Net can adaptively guide the network to focus on multiple finer and distinguishable local patches with robustness to occlusion and variant poses, improving the effectiveness of learning potential facial diversity information. In addition, the proposed global perception module can learn different receptive field features in the global perception domain, and AMP-Net also supplements salient facial region features with high emotion correlation based on prior knowledge to capture key texture details and avoid important information loss. Extensive experiments show that AMP-Net achieves good generalizability and state-of-the-art results on several real-world datasets, including RAF-DB, AffectNet-7, AffectNet-8, SFEW 2.0, FER-2013, and FED-RO, with accuracies of 89.25%, 64.54%, 61.74%, 61.17%, 74.48%, and 71.75%, respectively. All codes and training logs are publicly available at https://github.com/liuhw01/AMP-Net.
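The abstract describes a three-branch design: global features taken over several receptive fields, adaptively weighted local patches, and salient regions selected from prior knowledge. The authors' actual implementation is in the linked repository; purely as an illustration of that multi-branch idea, below is a minimal PyTorch-style sketch in which every module, layer size, and name (GlobalPerception, PatchAttention, MultiBranchFER) is a hypothetical stand-in, not the AMP-Net code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GlobalPerception(nn.Module):
    # Parallel dilated convolutions stand in for "different receptive field
    # features" over the global feature map.
    def __init__(self, channels):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d) for d in (1, 2, 4)
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([F.relu(b(x)) for b in self.branches], dim=1))

class PatchAttention(nn.Module):
    # Scores each local patch so the most discriminative (unoccluded)
    # regions dominate the pooled local feature.
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Linear(channels, 1)

    def forward(self, patches):                       # list of (B, C, h, w) tensors
        feats = torch.stack([p.mean(dim=(2, 3)) for p in patches], dim=1)  # (B, P, C)
        weights = torch.softmax(self.score(feats), dim=1)                  # (B, P, 1)
        return (weights * feats).sum(dim=1)                                # (B, C)

class MultiBranchFER(nn.Module):
    # Toy three-branch model: global, local-patch, and "salient" features
    # are concatenated before classification.
    def __init__(self, num_classes=7, channels=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 7, stride=2, padding=3), nn.ReLU(), nn.MaxPool2d(2)
        )
        self.global_branch = GlobalPerception(channels)
        self.local_branch = PatchAttention(channels)
        self.classifier = nn.Linear(3 * channels, num_classes)

    def forward(self, x):
        fmap = self.backbone(x)                           # (B, C, H, W)
        g = self.global_branch(fmap).mean(dim=(2, 3))     # global branch
        B, C, H, W = fmap.shape
        patches = [fmap[:, :, i*H//2:(i+1)*H//2, j*W//2:(j+1)*W//2]
                   for i in range(2) for j in range(2)]   # 2x2 grid of local patches
        l = self.local_branch(patches)                    # local branch
        # A real model would crop landmark-based salient regions (eyes, mouth);
        # this sketch just reuses the pooled global map as a placeholder.
        s = fmap.mean(dim=(2, 3))
        return self.classifier(torch.cat([g, l, s], dim=1))

model = MultiBranchFER()
print(model(torch.randn(2, 3, 112, 112)).shape)   # torch.Size([2, 7])

The dilated-convolution branches approximate "different receptive fields in the global perception domain" and the softmax patch weights approximate the adaptive attention over local patches; see the GitHub repository for the real architecture and training logs.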

Updated: 2022-04-06