Generating Adversarial Inputs Using A Black-box Differential Technique
arXiv - CS - Cryptography and Security. Pub Date: 2020-07-10. DOI: arxiv-2007.05315. Authors: João Batista Pereira Matos Júnior, Lucas Carvalho Cordeiro, Marcelo d'Amorim, Xiaowei Huang
Neural Networks (NNs) are known to be vulnerable to adversarial attacks. A malicious agent initiates these attacks by perturbing an input into another one such that the two inputs are classified differently by the NN. In this paper, we consider a special class of adversarial examples, which can exhibit not only the weakness of NN models, as typical adversarial examples do, but also the behavioral difference between two NN models. We call them difference-inducing adversarial examples, or DIAEs. Specifically, we propose DAEGEN, the first black-box differential technique for adversarial input generation. DAEGEN takes as input two NN models trained for the same classification problem and reports an adversarial example as output. The obtained adversarial example is a DIAE, so it represents a point-wise difference between the two NN models in the input space. Algorithmically, DAEGEN uses a local-search-based optimization algorithm to find DIAEs by iteratively perturbing an input to maximize the difference between the two models' predictions on that input. We conduct experiments on a spectrum of benchmark datasets (e.g., MNIST, ImageNet, and Driving) and NN models (e.g., LeNet, ResNet, Dave, and VGG). Experimental results are promising. First, we compare DAEGEN with two existing white-box differential techniques (DeepXplore and DLFuzz) and find that under the same setting, DAEGEN is 1) effective, i.e., it is the only technique that succeeds in generating attacks in all cases; 2) precise, i.e., the adversarial attacks are very likely to fool both machines and humans; and 3) efficient, i.e., it requires a reasonable number of classification queries. Second, we compare DAEGEN with state-of-the-art black-box adversarial attack methods (SimBA and TREMBA), adapting them to work in a differential setting. The experimental results show that DAEGEN performs better than both of them.
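The local-search procedure described above (iteratively perturbing an input to maximize the prediction difference of two black-box models, using only classification queries) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `daegen_sketch`, the L1 divergence measure, the hill-climbing move rule, and the two toy classifiers are all assumptions made for the example.

```python
import numpy as np

def divergence(p, q):
    # L1 distance between the two models' output distributions
    # (a stand-in for whatever difference measure the real tool uses).
    return float(np.abs(p - q).sum())

def daegen_sketch(model_a, model_b, x, eps=0.5, steps=200, seed=0):
    """Black-box local search: perturb x one coordinate at a time,
    keeping moves that increase the models' prediction divergence,
    and stop once the two models assign different labels (a DIAE)."""
    rng = np.random.default_rng(seed)
    best = x.copy()
    best_score = divergence(model_a(best), model_b(best))
    for _ in range(steps):
        cand = best.copy()
        i = rng.integers(len(cand))          # pick a random coordinate
        cand[i] += rng.choice([-eps, eps])   # perturb it by +/- eps
        score = divergence(model_a(cand), model_b(cand))
        if score > best_score:               # hill climbing: keep improving moves
            best, best_score = cand, score
        if model_a(best).argmax() != model_b(best).argmax():
            return best                      # difference-inducing input found
    return best

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Two toy black-box binary classifiers that differ only in their
# decision boundary (sum(x) > 0 vs. sum(x) > 1), so inputs with
# 0 < sum(x) < 1 are classified differently by the two models.
model_a = lambda x: softmax(np.array([x.sum(), 0.0]))
model_b = lambda x: softmax(np.array([x.sum() - 1.0, 0.0]))

x0 = np.full(4, 0.5)                 # sum = 2: both models agree here
adv = daegen_sketch(model_a, model_b, x0)
```

Starting from an input both models classify identically, the search drifts toward the region between the two decision boundaries, where the models disagree; the returned `adv` is a difference-inducing adversarial example in the sense of the paper.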
Updated: 2020-07-13