Generalizable Data-Free Objective for Crafting Universal Adversarial Perturbations
IEEE Transactions on Pattern Analysis and Machine Intelligence (IF 20.8), Pub Date: 2018-07-31, DOI: 10.1109/tpami.2018.2861800
Konda Reddy Mopuri , Aditya Ganeshan , R. Venkatesh Babu

Machine learning models are susceptible to adversarial perturbations: small changes to the input that can cause large changes in the output. It has also been demonstrated that there exist input-agnostic perturbations, called universal adversarial perturbations, which can change the inference of the target model on most data samples. However, existing methods to craft universal perturbations (i) are task specific, (ii) require samples from the training data distribution, and (iii) perform complex optimizations. Additionally, because of this data dependence, the fooling ability of the crafted perturbations is proportional to the amount of available training data. In this paper, we present a novel, generalizable, and data-free approach for crafting universal adversarial perturbations. Independent of the underlying task, our objective achieves fooling by corrupting the features extracted at multiple layers. Therefore, the proposed objective generalizes to crafting image-agnostic perturbations across multiple vision tasks such as object recognition, semantic segmentation, and depth estimation. In the practical black-box attack scenario (where the attacker has access to neither the target model nor its training data), we show that our objective outperforms data-dependent objectives in fooling the learned models. Further, by exploiting simple priors related to the data distribution, our objective remarkably boosts the fooling ability of the crafted perturbations. The significant fooling rates achieved by our objective emphasize that current deep learning models are at increased risk, since our objective generalizes across multiple tasks without requiring training data to craft the perturbations. To encourage reproducible research, we have released the code for our proposed algorithm.
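The core idea above, maximizing the strength of the features that a perturbation alone excites at multiple layers, subject to an imperceptibility constraint, can be sketched in a few lines. The following is a minimal illustrative NumPy toy, not the authors' released implementation: the two random linear-ReLU layers stand in for a real CNN, the objective is a sum of log feature-norms over layers, and the gradient is computed by finite differences for brevity (a real attack would backpropagate through the target network). All names (`craft_uap`, `eps`, layer sizes) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in "network": two fixed random linear layers with ReLU.
# (The paper attacks real CNNs; this toy only illustrates the objective.)
W1 = rng.standard_normal((16, 8)) * 0.5
W2 = rng.standard_normal((8, 4)) * 0.5

def activations(delta):
    """Per-layer activations produced by the perturbation alone (no data)."""
    a1 = np.maximum(W1.T @ delta, 0.0)  # layer-1 ReLU features
    a2 = np.maximum(W2.T @ a1, 0.0)     # layer-2 ReLU features
    return [a1, a2]

def objective(delta):
    """Data-free objective: sum of log feature-norms across layers.
    Maximizing it drives activations at every layer toward saturation,
    which is what corrupts the features a clean input would produce."""
    return sum(np.log(np.linalg.norm(a) + 1e-8) for a in activations(delta))

def craft_uap(eps=0.1, steps=200, lr=0.05):
    """Projected gradient ascent on the perturbation under ||delta||_inf <= eps.
    Uses a finite-difference gradient here purely to keep the sketch short."""
    delta = rng.uniform(-eps, eps, size=16)
    h = 1e-5
    for _ in range(steps):
        grad = np.zeros_like(delta)
        for i in range(delta.size):  # numerical gradient, coordinate by coordinate
            d = delta.copy()
            d[i] += h
            grad[i] = (objective(d) - objective(delta)) / h
        delta = np.clip(delta + lr * grad, -eps, eps)  # project back into the L_inf ball
    return delta
```

The same perturbation is then added to every test image, which is what makes it "universal": no per-image optimization is needed at attack time.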

Updated: 2024-08-22