Face Anti-Spoofing via Adversarial Cross-Modality Translation
IEEE Transactions on Information Forensics and Security (IF 6.3) | Pub Date: 3-10-2021, DOI: 10.1109/tifs.2021.3065495
Ajian Liu, Zichang Tan, Jun Wan, Yanyan Liang, Zhen Lei, Guodong Guo, Stan Z. Li

Face Presentation Attack Detection (PAD) approaches based on multi-modal data have attracted increasing attention from the research community. However, they require multi-modal face data to be available in both the training and testing phases, which severely limits their applicability, since most Face Anti-Spoofing (FAS) systems are equipped only with Visible (VIS) imaging devices, i.e., RGB cameras. Therefore, how to use another modality (e.g., Near-Infrared (NIR)) to improve the performance of VIS-based PAD is an important question for FAS. In this work, we first discuss the large performance gap among different modalities, even when the same backbone network is applied. We then propose a novel Cross-modal Auxiliary (CMA) framework for the VIS-based FAS task. The main trait of CMA is that performance can be greatly improved with the help of another modality, while no additional modality is required at the testing stage. The proposed CMA consists of a Modality Translation Network (MT-Net) and a Modality Assistance Network (MA-Net). The former aims to close the visible gap between different modalities via a generative model that maps inputs from one modality (i.e., RGB) to another (i.e., NIR). The latter focuses on how to use the translated modality (i.e., the target modality) and the RGB modality (i.e., the source modality) together to train a discriminative PAD model. Extensive experiments demonstrate that the proposed framework achieves state-of-the-art (SOTA) performance on both multi-modal datasets (i.e., CASIA-SURF, CeFA, and WMCA) and RGB-based datasets (i.e., OULU-NPU and SiW).
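To make the two-stage idea concrete, below is a minimal PyTorch sketch of an adversarial RGB-to-NIR translator followed by a two-stream PAD classifier. All module names, layer sizes, and loss choices here are illustrative assumptions; the paper's actual MT-Net and MA-Net architectures and training objectives are not reproduced from the source.

```python
# Minimal sketch: adversarial modality translation + two-stream PAD classifier.
# Architectures and losses are placeholders, not the paper's exact design.
import torch
import torch.nn as nn

class MTNet(nn.Module):
    """Toy RGB -> pseudo-NIR generator (stand-in for the Modality Translation Network)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, 3, padding=1), nn.Tanh(),  # 1-channel NIR-like output
        )
    def forward(self, rgb):
        return self.net(rgb)

class Discriminator(nn.Module):
    """Patch-style discriminator providing the adversarial translation loss (assumed)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(64, 1, 3, padding=1),
        )
    def forward(self, nir):
        return self.net(nir)

class MANet(nn.Module):
    """Two-stream live/spoof classifier fed with RGB plus the translated modality."""
    def __init__(self):
        super().__init__()
        def stream(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(inplace=True),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.rgb_stream = stream(3)
        self.nir_stream = stream(1)
        self.head = nn.Linear(64, 2)  # live vs. spoof
    def forward(self, rgb, fake_nir):
        feat = torch.cat([self.rgb_stream(rgb), self.nir_stream(fake_nir)], dim=1)
        return self.head(feat)

if __name__ == "__main__":
    rgb = torch.randn(4, 3, 112, 112)       # batch of RGB face crops
    real_nir = torch.randn(4, 1, 112, 112)   # paired NIR, needed only at training time
    labels = torch.randint(0, 2, (4,))       # 0 = spoof, 1 = live

    mt_net, disc, ma_net = MTNet(), Discriminator(), MANet()
    fake_nir = mt_net(rgb)

    # Adversarial + pixel losses drive the RGB -> NIR translation (illustrative choice).
    d_out = disc(fake_nir)
    adv_loss = nn.functional.binary_cross_entropy_with_logits(d_out, torch.ones_like(d_out))
    pix_loss = nn.functional.l1_loss(fake_nir, real_nir)

    # PAD loss trains the classifier on the RGB input plus the translated modality.
    pad_loss = nn.functional.cross_entropy(ma_net(rgb, fake_nir), labels)
    print(adv_loss.item(), pix_loss.item(), pad_loss.item())
```

The key property mirrored here is that the NIR data are consumed only by the training losses: at test time an RGB frame alone suffices, since the pseudo-NIR input is synthesized on the fly (`fake_nir = mt_net(rgb)`) before being passed to the classifier.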

Updated: 2024-08-22