Dual Adversarial Disentanglement and Deep Representation Decorrelation for NIR-VIS Face Recognition
IEEE Transactions on Information Forensics and Security (IF 6.3), Pub Date: 2020-06-26, DOI: 10.1109/tifs.2020.3005314
Weipeng Hu, Haifeng Hu

The task of near-infrared and visual (NIR-VIS) face recognition refers to matching face data from different modalities, which has broad application prospects in areas such as multimedia information retrieval and criminal investigation. However, it remains a challenging task due to high intra-class variations and the small scale of NIR-VIS datasets. In this paper, we propose a novel approach called Dual Adversarial Disentanglement and deep Representation Decorrelation (DADRD) to solve the NIR-VIS matching problem. To reduce the gap between NIR and VIS images, three key components are designed for the DADRD model: the Cross-modal Margin (CmM) loss, Dual Adversarial Disentangled Variations (DADV), and Deep Representation Decorrelation (DRD). Firstly, the CmM loss captures within-class and between-class information in the data, and it further reduces the modality difference through a center-variation term. Secondly, the Mixed Facial Representation (MFR) layer of the backbone network is divided into three parts: the identity-related layer, the modality-related layer, and the residual-related layer. The DADV, which consists of Adversarial Disentangled Modality Variations (ADMV) and Adversarial Disentangled Residual Variations (ADRV), is designed to reduce intra-class variations. Specifically, the ADMV and ADRV aim to eliminate spectrum variations and residual variations (e.g., lighting, pose, expression, and occlusion), respectively, via an adversarial mechanism. Finally, we impose a DRD on the three decomposed features to make them uncorrelated with one another, which separates the three components more effectively and enhances the feature representations. In particular, we develop a Joint Three-stage Optimization (JTsO) strategy to effectively optimize the network. The joint formulation leads to the purification of identity information and the disentanglement of within-class variation information. Extensive experiments have been carried out on three challenging datasets, and the results demonstrate the effectiveness of our method.
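
The abstract does not give the DRD formula, so the following is only a rough illustration of the general idea of decorrelating three parts of one feature vector. It splits each feature vector into three consecutive blocks (standing in for the identity-related, modality-related, and residual-related parts) and sums the mean squared Pearson correlation between dimensions of different blocks. The function names, the block sizes, and the choice of Pearson correlation are all assumptions made for this sketch, not details taken from the paper.

```python
import math


def split_features(batch, sizes):
    """Split every feature vector in `batch` into consecutive blocks.

    `sizes` is a hypothetical layout, e.g. (identity, modality, residual).
    """
    parts, start = [], 0
    for size in sizes:
        parts.append([row[start:start + size] for row in batch])
        start += size
    return parts


def _pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cross_correlation_penalty(block_a, block_b):
    """Mean squared correlation between each column of a and each column of b."""
    dim_a, dim_b = len(block_a[0]), len(block_b[0])
    total = 0.0
    for i in range(dim_a):
        col_a = [row[i] for row in block_a]
        for j in range(dim_b):
            col_b = [row[j] for row in block_b]
            total += _pearson(col_a, col_b) ** 2
    return total / (dim_a * dim_b)


def decorrelation_penalty(batch, sizes):
    """Sum the cross-correlation penalty over every pair of distinct blocks."""
    parts = split_features(batch, sizes)
    total = 0.0
    for i in range(len(parts)):
        for j in range(i + 1, len(parts)):
            total += cross_correlation_penalty(parts[i], parts[j])
    return total
```

Under these assumptions, the penalty is zero when no dimension of one block is linearly correlated with any dimension of another across the batch, and it grows as the blocks share more linear structure. A training setup would presumably minimize a differentiable version of such a term alongside the other losses; how the paper combines them is not stated in the abstract.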

Updated: 2020-06-26