MAFER: a Multi-resolution Approach to Facial Expression Recognition
arXiv - CS - Computer Vision and Pattern Recognition. Pub Date: 2021-05-06, DOI: arxiv-2105.02481
Fabio Valerio Massoli, Donato Cafarelli, Claudio Gennaro, Giuseppe Amato, Fabrizio Falchi

Emotions play a central role in the social life of every human being, and their study, a multidisciplinary subject, embraces a great variety of research fields. Among these, the analysis of facial expressions is a very active research area due to its relevance to human-computer interaction applications. In this context, Facial Expression Recognition (FER) is the task of recognizing expressions on human faces. Typically, face images are acquired by cameras that differ, by nature, in characteristics such as output resolution. It has already been shown in the literature that Deep Learning models applied to face recognition suffer a performance degradation when tested in multi-resolution scenarios. Since the FER task involves analyzing face images acquired from heterogeneous sources, and thus of varying quality, it is plausible to expect resolution to play an important role here as well. Stemming from this hypothesis, we demonstrate the benefits of multi-resolution training for models tasked with recognizing facial expressions. Hence, we propose a two-step learning procedure, named MAFER, to train DCNNs so that they generate robust predictions across a wide range of resolutions. A relevant feature of MAFER is that it is task-agnostic, i.e., it can be used complementarily to other objective-related techniques. To assess the effectiveness of the proposed approach, we performed an extensive experimental campaign on publicly available datasets: \fer{}, \raf{}, and \oulu{}. In a multi-resolution context, we observe that models trained with our approach improve upon the current SotA, while reporting comparable results in fixed-resolution contexts. Finally, we analyze the performance of our models and observe the higher discriminative power of the deep features they generate.
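The abstract describes MAFER only at a high level. As an illustration of the general multi-resolution training idea it builds on, the sketch below shows a down-sample-then-upsample augmentation, a common way to expose a classifier to many effective resolutions during training. The class name, resolution range, and input size are illustrative assumptions, not taken from the paper; the actual two-step MAFER procedure is defined in the arXiv manuscript.

# Illustrative sketch only (not the authors' code): simulate heterogeneous
# acquisition resolutions by randomly down-sampling a face crop and resizing
# it back to the fixed network input size before training.
import random
from PIL import Image


class RandomResolutionDegradation:
    """Randomly degrade image resolution, then restore the original input size."""

    def __init__(self, min_side=24, max_side=224, out_side=224):
        self.min_side = min_side   # smallest simulated resolution (assumed value)
        self.max_side = max_side   # largest simulated resolution (assumed value)
        self.out_side = out_side   # fixed network input size (assumed value)

    def __call__(self, img):
        side = random.randint(self.min_side, self.max_side)
        degraded = img.resize((side, side), Image.BILINEAR)            # down-sample
        return degraded.resize((self.out_side, self.out_side), Image.BILINEAR)  # back to input size


if __name__ == "__main__":
    # Usage example with a synthetic image; in practice this transform would be
    # composed with normalization inside a training data loader.
    face = Image.new("RGB", (224, 224), color=(128, 128, 128))
    augmented = RandomResolutionDegradation()(face)
    print(augmented.size)  # (224, 224)

Keeping the output size fixed lets the same network architecture see faces at many effective resolutions without any change to the training pipeline, which is the intuition behind multi-resolution training as motivated in the abstract.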

Updated: 2021-05-07