Class-aware domain adaptation for improving adversarial robustness
Image and Vision Computing (IF 4.2), Pub Date: 2020-05-05, DOI: 10.1016/j.imavis.2020.103926
Xianxu Hou, Jingxin Liu, Bolei Xu, Xiaolong Wang, Bozhi Liu, Guoping Qiu

Recent work has demonstrated that convolutional neural networks are vulnerable to adversarial examples, i.e., inputs intentionally designed by an attacker to cause a machine learning model to make a mistake. To improve the adversarial robustness of neural networks, adversarial training has been proposed, which trains networks by injecting adversarial examples into the training data. However, adversarial training can overfit to a specific type of adversarial attack and can also reduce standard accuracy on clean images. To this end, we propose a novel Class-Aware Domain Adaptation (CADA) method for adversarial defense that does not directly apply adversarial training. Specifically, we learn domain-invariant features for adversarial examples and clean images via a domain discriminator. Furthermore, we introduce a class-aware component into the discriminator to increase the network's discriminative power on adversarial examples. We evaluate the proposed approach on multiple benchmark datasets. The results demonstrate that our method significantly improves state-of-the-art adversarial robustness against various attacks while maintaining high performance on clean images.
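The abstract gives no implementation details, but the two ingredients it names, crafting adversarial examples and aligning clean and adversarial features with a class-aware domain discriminator, can be sketched compactly. The PyTorch snippet below is a minimal illustration under assumptions of our own, not the authors' CADA code: FGSM stands in for the attack, a gradient-reversal layer implements the adversarial domain alignment, and the class-aware conditioning is approximated by giving the discriminator one clean-vs-adversarial logit per class. All names (fgsm_attack, ClassAwareDiscriminator, train_step) are hypothetical.

    # Minimal sketch of the ideas described above, NOT the authors' CADA
    # implementation. Assumptions: FGSM as the attack, a gradient-reversal
    # layer for domain alignment, a per-class domain logit as the
    # "class-aware" component, and a feat_net that outputs flat (N, feat_dim)
    # features. All names here are hypothetical.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def fgsm_attack(model, x, y, eps=8 / 255):
        """Craft adversarial examples with the Fast Gradient Sign Method."""
        x_adv = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        return (x_adv + eps * grad.sign()).clamp(0, 1).detach()

    class GradReverse(torch.autograd.Function):
        """Identity on the forward pass; negates and scales gradients on the
        backward pass, so the feature extractor learns to fool the
        discriminator while the discriminator learns to tell domains apart."""
        @staticmethod
        def forward(ctx, x, lam):
            ctx.lam = lam
            return x.view_as(x)
        @staticmethod
        def backward(ctx, grad_out):
            return -ctx.lam * grad_out, None

    class ClassAwareDiscriminator(nn.Module):
        """Domain discriminator with one clean-vs-adversarial logit per class,
        a simple way to make the domain alignment class-aware."""
        def __init__(self, feat_dim, num_classes):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(feat_dim, 256), nn.ReLU(),
                                     nn.Linear(256, num_classes))
        def forward(self, feats):
            return self.net(feats)

    def train_step(feat_net, clf_head, disc, opt, x, y, lam=0.1):
        model = nn.Sequential(feat_net, clf_head)
        x_adv = fgsm_attack(model, x, y)                  # adversarial twins of the batch
        f_clean, f_adv = feat_net(x), feat_net(x_adv)
        cls_loss = F.cross_entropy(clf_head(f_clean), y)  # standard loss on clean images
        # Domain labels: 0 = clean, 1 = adversarial. The gradient-reversal
        # layer pushes feat_net toward domain-invariant features.
        feats = GradReverse.apply(torch.cat([f_clean, f_adv]), lam)
        dom = torch.cat([torch.zeros(len(x)), torch.ones(len(x))]).to(x.device)
        logits = disc(feats)
        # Class-aware part: score only the logit of each sample's ground-truth class.
        idx = torch.cat([y, y]).unsqueeze(1)
        dom_logits = logits.gather(1, idx).squeeze(1)
        dom_loss = F.binary_cross_entropy_with_logits(dom_logits, dom)
        opt.zero_grad()
        (cls_loss + dom_loss).backward()
        opt.step()
        return cls_loss.item(), dom_loss.item()

In a full min-max formulation the discriminator and feature extractor would be trained in alternation; the gradient-reversal layer folds both roles into a single backward pass, a common shortcut in domain-adaptation code.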


