SpectralDefense: Detecting Adversarial Attacks on CNNs in the Fourier Domain
arXiv - CS - Artificial Intelligence. Pub Date: 2021-03-04, DOI: arxiv-2103.03000
Paula Harder, Franz-Josef Pfreundt, Margret Keuper, Janis Keuper

Despite the success of convolutional neural networks (CNNs) in many computer vision and image analysis tasks, they remain vulnerable to so-called adversarial attacks: small, crafted perturbations of the input images can lead to false predictions. A possible defense is to detect adversarial examples. In this work, we show how analysis of input images and feature maps in the Fourier domain can be used to distinguish benign test samples from adversarial images. We propose two novel detection methods: our first method employs the magnitude spectrum of the input images to detect an adversarial attack. This simple and robust classifier can successfully detect adversarial perturbations of three commonly used attack methods. The second method builds upon the first and additionally extracts the phase of the Fourier coefficients of feature maps at different layers of the network. With this extension, we are able to improve adversarial detection rates compared to state-of-the-art detectors on five different attack methods.
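
To make the first method concrete, here is a minimal Python sketch of the general idea: compute log-scaled Fourier magnitude spectra of input images and train a simple binary detector on them. The choice of logistic regression, all variable and function names, and the usage below are illustrative assumptions, not the authors' reference implementation.

# Minimal sketch: Fourier magnitude spectra of input images as features
# for a benign-vs-adversarial classifier. Detector choice and all names
# are assumptions for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

def magnitude_spectrum_features(images: np.ndarray) -> np.ndarray:
    """Flattened, log-scaled 2D Fourier magnitude spectra.

    images: array of shape (N, H, W) or (N, H, W, C).
    Returns an (N, D) feature matrix.
    """
    # FFT over the two spatial axes; fftshift moves the zero-frequency
    # component to the center of each spectrum.
    spectra = np.fft.fftshift(np.fft.fft2(images, axes=(1, 2)), axes=(1, 2))
    # log1p compresses the large dynamic range of the magnitude spectrum.
    return np.log1p(np.abs(spectra)).reshape(len(images), -1)

# Hypothetical usage, assuming benign_imgs and adv_imgs hold matched clean
# and adversarially perturbed test images (e.g. generated with FGSM):
#   X = np.vstack([magnitude_spectrum_features(benign_imgs),
#                  magnitude_spectrum_features(adv_imgs)])
#   y = np.array([0] * len(benign_imgs) + [1] * len(adv_imgs))
#   detector = LogisticRegression(max_iter=1000).fit(X, y)
# The second method described above would additionally append the phase,
# np.angle(...), of the FFT of feature maps from several network layers
# to the feature vector.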

Updated: 2021-03-05