Face anti-spoofing with cross-stage relation enhancement and spoof material perception
Neural Networks (IF 7.8), Pub Date: 2024-03-27, DOI: 10.1016/j.neunet.2024.106275
Daiyuan Li, Guo Chen, Xixian Wu, Zitong Yu, Mingkui Tan

Face Anti-Spoofing (FAS) seeks to protect face recognition systems from spoofing attacks and is applied extensively in scenarios such as access control, electronic payment, and security surveillance. Face anti-spoofing requires the integration of local details and global semantic information. Existing CNN-based methods rely on small-stride or image-patch-based feature extraction structures, which struggle to capture spatial and cross-layer feature correlations effectively, while Transformer-based methods have limitations in extracting discriminative detailed features. To address these issues, we introduce a multi-stage CNN-Transformer framework that extracts local features through convolutional layers and long-distance feature relationships via self-attention. Building on this, we propose a cross-attention multi-stage feature fusion, which employs semantically rich high-stage features to query task-relevant features in low-stage features for further cross-stage feature fusion. To enhance the discrimination of local features for subtle differences, we design pixel-wise material classification supervision and add an auxiliary branch in the intermediate layers of the model. Moreover, to address the single acquisition environment and scarcity of acquisition devices in existing Near-Infrared datasets, we create a large-scale Near-Infrared Face Anti-Spoofing dataset with 380K images of 1,040 identities. The proposed method achieves state-of-the-art results on OULU-NPU and our proposed Near-Infrared dataset at just 1.3 GFLOPs and 3.2M parameters, demonstrating its effectiveness.
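The cross-stage fusion described in the abstract can be sketched as plain scaled dot-product cross-attention in which high-stage tokens act as queries over low-stage tokens. This is a minimal illustration, not the paper's implementation: the learned query/key/value projections, multi-head structure, and normalization of the actual model are omitted, and all names and shapes below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_stage_fusion(high, low):
    """High-stage tokens (few, semantically rich) query low-stage tokens
    (many, detail-rich) via scaled dot-product cross-attention, then the
    attended low-stage detail is fused back with a residual connection."""
    d = high.shape[-1]
    scores = high @ low.T / np.sqrt(d)   # (N_high, N_low) similarity
    attn = softmax(scores, axis=-1)      # each query attends over low tokens
    fused = high + attn @ low            # residual cross-stage fusion
    return fused, attn

# toy shapes: 16 high-stage tokens query 64 low-stage tokens, channel dim 32
high = rng.standard_normal((16, 32))
low = rng.standard_normal((64, 32))
fused, attn = cross_stage_fusion(high, low)
```

In the full model the two token sets would come from different stages of the CNN-Transformer backbone (low resolution/high semantics vs. high resolution/fine detail), so this query direction lets the semantic stage pull task-relevant detail upward without processing the full low-stage map densely.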

Updated: 2024-03-27