Defending Against Adversarial Attack Towards Deep Neural Networks Via Collaborative Multi-Task Training
IEEE Transactions on Dependable and Secure Computing (IF 7.3), Pub Date: 2020-08-05, DOI: 10.1109/tdsc.2020.3014390
Derui Wang, Chaoran Li, Sheng Wen, Surya Nepal, Yang Xiang

Deep neural networks (DNNs) are known to be vulnerable to adversarial examples which contain human-imperceptible perturbations. A series of defending methods, either proactive defence or reactive defence, have been proposed in recent years. However, most of these methods can only handle specific attacks. For example, proactive defending methods are invalid against grey-box or white-box attacks, while reactive defending methods are challenged by low-distortion adversarial examples or transferring adversarial examples. This becomes a critical problem since a defender usually does not have a priori knowledge of the attack type. Moreover, existing two-pronged defences (e.g., MagNet), which take advantage of both proactive and reactive methods, have been reported as broken under transferring attacks. To address this problem, this article proposes a novel defensive framework based on collaborative multi-task training, aiming at providing defence against different types of attacks. The proposed defence first encodes training labels into label pairs and counters black-box attacks by leveraging adversarial training supervised by the encoded label pairs. The defence further constructs a detector to identify and reject high-confidence adversarial examples that bypass the black-box defence. In addition, the proposed collaborative architecture can prevent adversaries from finding valid adversarial examples when the defence strategy is exposed. In the experiments, we evaluated our defence against four state-of-the-art attacks on the MNIST and CIFAR10 datasets. The results show that our defending method achieved up to 96.3 percent classification accuracy on black-box adversarial examples and detected up to 98.7 percent of high-confidence adversarial examples, while decreasing the model's accuracy on benign examples by only 2.1 percent on the CIFAR10 dataset.
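The abstract only outlines the architecture, but its two core ideas, a shared model supervised by an encoded label pair and a consistency check used to reject suspicious inputs, can be illustrated with a minimal sketch. The sketch below assumes PyTorch, an MNIST-shaped input, the purely illustrative pairing function g(y) = (y + 1) mod C, and a disagreement-based rejection rule; none of these names or choices are taken from the paper itself.

```python
# Minimal sketch of a collaborative multi-task defence (illustrative only).
# Assumptions not taken from the paper: PyTorch, an MNIST-shaped input, the
# pairing function g(y) = (y + 1) mod C, and a disagreement-based detector.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES = 10

def encode_label_pair(y: torch.Tensor) -> torch.Tensor:
    # Encode each training label y into a second, paired label g(y).
    return (y + 1) % NUM_CLASSES

class CollaborativeNet(nn.Module):
    """Shared backbone with two heads trained on the encoded label pair."""
    def __init__(self, num_classes: int = NUM_CLASSES):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head_main = nn.Linear(64, num_classes)  # predicts y
        self.head_aux = nn.Linear(64, num_classes)   # predicts g(y)

    def forward(self, x):
        z = self.backbone(x)
        return self.head_main(z), self.head_aux(z)

def multitask_loss(logits_main, logits_aux, y):
    # Joint supervision by the encoded label pair (y, g(y)).
    return F.cross_entropy(logits_main, y) + \
           F.cross_entropy(logits_aux, encode_label_pair(y))

@torch.no_grad()
def detect_and_predict(model, x):
    # Reject inputs whose two heads no longer agree on a valid label pair;
    # a model trained on (y, g(y)) should keep the heads consistent on
    # benign inputs, so disagreement is treated as a red flag.
    logits_main, logits_aux = model(x)
    pred_main = logits_main.argmax(dim=1)
    pred_aux = logits_aux.argmax(dim=1)
    rejected = pred_aux != encode_label_pair(pred_main)
    return pred_main, rejected

if __name__ == "__main__":
    model = CollaborativeNet()
    x = torch.randn(8, 1, 28, 28)        # dummy MNIST-shaped batch
    y = torch.randint(0, NUM_CLASSES, (8,))
    loss = multitask_loss(*model(x), y)  # one training step, no optimiser shown
    loss.backward()
    preds, rejected = detect_and_predict(model, x)
    print(float(loss), preds.tolist(), rejected.tolist())
```

A real implementation would also fold in the adversarial-training component the abstract describes; the sketch is only meant to show label-pair supervision and a consistency-based rejection rule.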

Updated: 2020-08-05