Generalizable Adversarial Attacks with Latent Variable Perturbation Modelling
arXiv - CS - Cryptography and Security Pub Date : 2019-05-26 , DOI: arxiv-1905.10864
Avishek Joey Bose, Andre Cianflone, William L. Hamilton

Adversarial attacks on deep neural networks traditionally rely on a constrained optimization paradigm, where an optimization procedure is used to obtain a single adversarial perturbation for a given input example. In this work we frame the problem as learning a distribution of adversarial perturbations, enabling us to generate diverse adversarial distributions given an unperturbed input. We show that this framework is domain-agnostic in that the same framework can be employed to attack different input domains with minimal modification. Across three diverse domains---images, text, and graphs---our approach generates whitebox attacks with success rates that are competitive with or superior to existing approaches, with a new state-of-the-art achieved in the graph domain. Finally, we demonstrate that our framework can efficiently generate a diverse set of attacks for a single given input, and is even capable of attacking *unseen* test instances in a zero-shot manner, exhibiting attack generalization.
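The core idea above — conditioning a perturbation generator on both the input and a sampled latent code, so that different latent draws yield different adversarial perturbations for the same input — can be sketched as follows. This is a minimal NumPy illustration, not the paper's architecture: the generator weights here are random (in the paper's setting they would be trained to fool a victim model), and the linear "victim" classifier, the latent dimension, and the `tanh`-based epsilon-ball projection are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear "victim" classifier (stand-in for a deep network).
W = rng.normal(size=(10, 784))

def victim_logits(x):
    return W @ x

# Hypothetical perturbation generator: maps (input, latent z) -> perturbation.
# Weights are random here purely to show the data flow; in the latent-variable
# attack setting they would be trained to maximize the victim's loss.
LATENT_DIM = 8
G = rng.normal(size=(784, 784 + LATENT_DIM)) * 0.01

def generate_perturbation(x, z, eps=0.1):
    # tanh bounds each coordinate in (-1, 1); scaling by eps keeps the
    # perturbation inside the L-infinity ball of radius eps.
    raw = np.tanh(G @ np.concatenate([x, z]))
    return eps * raw

x = rng.normal(size=784)

# Sampling different latent codes gives a *distribution* of perturbations
# for the same unperturbed input, rather than a single optimized one.
deltas = [generate_perturbation(x, rng.normal(size=LATENT_DIM))
          for _ in range(5)]

for d in deltas:
    assert np.max(np.abs(d)) <= 0.1  # every sample stays in the eps-ball
print(len({d.tobytes() for d in deltas}))  # distinct perturbations sampled
```

Sampling many candidate perturbations per input is what enables the diverse-attack and zero-shot behavior the abstract describes: at test time, new latent codes can be drawn for an unseen input without re-running an optimization loop.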

Updated: 2020-01-22