Identity-aware Facial Expression Recognition in Compressed Video
arXiv - CS - Multimedia Pub Date : 2021-01-01 , DOI: arxiv-2101.00317
Xiaofeng Liu, Linghao Jin, Xu Han, Jun Lu, Jane You, Lingsheng Kong

This paper explores facial expression representations in the compressed video domain that are free of inter-subject variation. Most previous methods process the decoded RGB images of a sequence, even though valuable expression-related muscle movement is already available, off the shelf, in the compression format. In the compressed domain, where the data can be up to two orders of magnitude smaller, we can explicitly infer the expression from the residual frames and extract identity factors from the I frame with a pre-trained face recognition network. By enforcing marginal independence between the two, the expression feature is expected to be purer for the expression and more robust to identity shifts. We do not need identity labels or multiple expression samples from the same person for identity elimination. Moreover, when the apex frame is annotated in the dataset, a complementary constraint can be added to further regularize the feature-level game. At test time, only the compressed residual frames are required for expression prediction. Our solution achieves comparable or better performance than recent decoded-image-based methods on typical FER benchmarks, with about 3× faster inference on compressed data.
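
To make the idea of separating expression features from identity features concrete, here is a minimal sketch in plain Python. It is not the method of the paper, which describes a feature-level game. As a simpler stand-in, it computes a cross-covariance penalty between a batch of expression features and a batch of identity features. The function name `cross_covariance_penalty` and the toy inputs are assumptions made for illustration. Note that a zero penalty only means the two feature sets are linearly uncorrelated within the batch, which is weaker than marginal independence.

```python
from typing import Sequence


def cross_covariance_penalty(
    expr: Sequence[Sequence[float]],
    ident: Sequence[Sequence[float]],
) -> float:
    """Squared Frobenius norm of the batch cross-covariance matrix.

    expr  : one expression feature vector per sample (hypothetical input)
    ident : one identity feature vector per sample (hypothetical input)
    Returns 0.0 when every expression dimension is uncorrelated with
    every identity dimension across the batch.
    """
    n = len(expr)
    if n == 0 or n != len(ident):
        raise ValueError("expr and ident need the same non-zero batch size")

    d_e, d_i = len(expr[0]), len(ident[0])

    # Per-dimension batch means, used to center both feature sets.
    mean_e = [sum(row[j] for row in expr) / n for j in range(d_e)]
    mean_i = [sum(row[k] for row in ident) / n for k in range(d_i)]

    penalty = 0.0
    for j in range(d_e):
        for k in range(d_i):
            # Covariance between expression dim j and identity dim k.
            cov = sum(
                (expr[s][j] - mean_e[j]) * (ident[s][k] - mean_i[k])
                for s in range(n)
            ) / n
            penalty += cov * cov
    return penalty


# Toy batch of four samples with one-dimensional features.
expression = [[1.0], [2.0], [3.0], [4.0]]
identity_correlated = [[1.0], [2.0], [3.0], [4.0]]
identity_uncorrelated = [[1.0], [-1.0], [-1.0], [1.0]]

print(cross_covariance_penalty(expression, identity_correlated))
print(cross_covariance_penalty(expression, identity_uncorrelated))
```

In a training setup, a term like this would be added to the expression classification loss so that the expression encoder is discouraged from carrying identity information; how it is weighted is a design choice not specified here.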

Updated: 2021-01-05