A novel and universal GAN-based countermeasure to recover adversarial examples to benign examples
Computers & Security (IF 4.8), Pub Date: 2021-09-04, DOI: 10.1016/j.cose.2021.102457
Rui Yang 1,2, Tian-Jie Cao 1,2, Xiu-Qing Chen 3, Feng-Rong Zhang 1,2
Recent studies have demonstrated that deep neural networks (DNNs) are vulnerable to adversarial examples, which contain subtle, human-imperceptible perturbations. Although numerous countermeasures have been proposed and play a significant role, most of them have flaws and are effective only against certain types of adversarial examples. In this paper, we present a novel and universal countermeasure that recovers multiple types of adversarial examples to benign examples before they are fed into the deep neural network. The idea is to model the mapping between adversarial examples and benign examples using a generative adversarial network (GAN). The GAN architecture consists of a generator based on UNET, a discriminator based on ACGAN, and a newly added third-party classifier. The UNET enhances the generator's capacity to recover adversarial examples to benign examples. The loss function combines the advantages of ACGAN and WGAN-GP to stabilize the training process and accelerate its convergence. In addition, a classification loss and a perceptual loss, both derived from the third-party classifier, are employed to further improve the generator's capacity to eliminate adversarial perturbations. Experiments are conducted on the MNIST, CIFAR10, and IMAGENET datasets. First, we perform ablation experiments to validate the proposed countermeasure. Then, we defend against seven types of state-of-the-art adversarial examples on four deep neural networks and compare the countermeasure against six existing ones. The experimental results demonstrate that the proposed countermeasure is universal and outperforms the other countermeasures. The experimental code is available at https://github.com/Afreadyang/IAED-GAN.
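The generator objective described above combines three terms: a WGAN-GP-style adversarial term, a classification loss, and a perceptual loss from the third-party classifier. The following NumPy sketch illustrates how such a composite objective fits together. It is not the paper's implementation: the linear toy critic, the weighting parameters `lam_cls` and `lam_perc`, and all function names are illustrative assumptions.

```python
import numpy as np

def gradient_penalty(w, lam=10.0):
    # WGAN-GP penalty for a *linear* toy critic f(x) = w @ x.
    # The gradient of f w.r.t. any input (including an interpolate
    # x_hat = eps * x_real + (1 - eps) * x_fake) is just w, so the
    # penalty lam * (||grad f(x_hat)|| - 1)^2 reduces to a closed form.
    return lam * (np.linalg.norm(w) - 1.0) ** 2

def classification_loss(probs, label):
    # Cross-entropy of the third-party classifier on the recovered
    # example: it should still be assigned its benign label.
    return float(-np.log(probs[label] + 1e-12))

def perceptual_loss(feat_benign, feat_recovered):
    # Mean squared distance between the classifier's feature maps of
    # the benign example and of the recovered example.
    return float(np.mean((feat_benign - feat_recovered) ** 2))

def generator_loss(critic_score, probs, label,
                   feat_benign, feat_recovered,
                   lam_cls=1.0, lam_perc=1.0):
    # Composite objective for the generator: fool the critic
    # (maximize its score, i.e. minimize its negation), keep the
    # recovered example correctly classified, and stay close to the
    # benign example in the classifier's feature space.
    return (-critic_score
            + lam_cls * classification_loss(probs, label)
            + lam_perc * perceptual_loss(feat_benign, feat_recovered))
```

A critic whose gradient already has unit norm pays no penalty, e.g. `gradient_penalty(np.array([0.6, 0.8]))` is `0.0`; in the real method the penalty is computed by differentiating the critic at random interpolates between benign and recovered examples.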




Updated: 2021-09-13