Perturbation analysis of gradient-based adversarial attacks
Pattern Recognition Letters (IF 3.9), Pub Date: 2020-05-01, DOI: 10.1016/j.patrec.2020.04.034
Utku Ozbulak, Manvel Gasparyan, Wesley De Neve, Arnout Van Messem

After the discovery of adversarial examples and their adverse effects on deep learning models, many studies have focused on finding more diverse methods to generate these carefully crafted samples. Although empirical results on the effectiveness of adversarial example generation methods against defense mechanisms are discussed in detail in the literature, an in-depth study of the theoretical properties and the perturbation effectiveness of these adversarial attacks has largely been lacking. In this paper, we investigate the objective functions of three popular methods for adversarial example generation: the L-BFGS attack, the Iterative Fast Gradient Sign attack, and Carlini & Wagner’s attack. Specifically, we perform a comparative and formal analysis of the loss functions underlying the aforementioned attacks while laying out large-scale experimental results on the ImageNet dataset. This analysis exposes (1) the faster optimization speed as well as the constrained optimization space of the cross-entropy loss, (2) the detrimental effects of using only the sign of the gradient of the cross-entropy loss on optimization precision as well as optimization space, and (3) the slow optimization speed of the logit loss in the context of adversariality. Our experiments reveal that the Iterative Fast Gradient Sign attack, which is thought to be fast for generating adversarial examples, is in fact the worst attack in terms of the number of iterations required to create adversarial examples under an equal perturbation budget. Moreover, our experiments show that the underlying loss function of Carlini & Wagner’s attack, which is criticized for being substantially slower than other adversarial attacks, is not that much slower than the other loss functions. Finally, we analyze how well neural networks can identify adversarial perturbations generated by the attacks under consideration, thereby revisiting the idea of adversarial retraining on ImageNet.
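To make the comparison concrete, the following is a minimal PyTorch sketch, written for this summary rather than taken from the paper's implementation, of the Iterative Fast Gradient Sign attack (which ascends the cross-entropy loss using only the sign of its gradient) alongside a standalone version of Carlini & Wagner's logit (margin) loss. The function names and the hyperparameters eps, alpha, steps, and kappa are illustrative assumptions, not values used in the paper.

```python
import torch
import torch.nn.functional as F


def ifgsm_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
    """Iterative Fast Gradient Sign attack (sketch): ascend the cross-entropy
    loss using only the sign of the gradient, keeping the perturbation inside
    an L-infinity ball of radius eps around the clean image x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)        # cross-entropy loss of the true class
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()        # only the sign, not the magnitude
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)              # stay a valid image
        x_adv = x_adv.detach()
    return x_adv


def cw_logit_loss(logits, y, kappa=0.0):
    """Carlini & Wagner's logit (margin) loss (sketch): the gap between the
    true-class logit and the largest competing logit; driving it below -kappa
    corresponds to a misclassification with margin kappa."""
    true_logit = logits.gather(1, y.unsqueeze(1)).squeeze(1)
    other_max = logits.scatter(1, y.unsqueeze(1), float("-inf")).max(dim=1).values
    return torch.clamp(true_logit - other_max, min=-kappa)
```

The sign operation in the first function versus the full gradient of the cross-entropy or logit loss is precisely the design difference whose effect on optimization speed, precision, and optimization space the paper analyzes.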




Updated: 2020-05-01