
Generating transferable adversarial examples based on perceptually-aligned perturbation

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Neural networks (NNs) are known to be susceptible to adversarial examples (AEs), which are intentionally crafted to deceive a target classifier by adding small perturbations to the inputs. Interestingly, AEs crafted for one NN can also mislead another model. This property, referred to as transferability, is often exploited to mount attacks in black-box settings. To mitigate the transferability of AEs, many approaches have been explored to enhance the robustness of NNs; in particular, adversarial training (AT) and its variants are shown to be the strongest defenses against such transferable AEs. To boost the transferability of AEs against robust models that have undergone AT, a novel AE generation method is proposed in this paper. Our method is motivated by the observation that models hardened by AT are more sensitive to perceptually relevant gradients, so it is reasonable to synthesize AEs from perturbations that carry perceptually aligned features. The proposed method proceeds in two steps. First, by optimizing the loss function over an ensemble of randomly noised inputs, we obtain perceptually aligned perturbations that are invariant to input noise. Second, we apply a Perona–Malik (P–M) filter to smooth the derived adversarial perturbations, so that their perceptually relevant features are significantly reinforced and their local oscillations are substantially suppressed. The method can be combined with any gradient-based attack. We carry out extensive experiments on the ImageNet dataset with various robust and non-robust models, and the results demonstrate the effectiveness of our method. In particular, by combining our method with the diverse inputs method and the momentum iterative fast gradient sign method, we achieve state-of-the-art performance in fooling the robust models.
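To make the two-step procedure above concrete, the following is a minimal sketch (not the authors' implementation) of how such an attack could be assembled in PyTorch: the loss gradient is averaged over an ensemble of randomly noised copies of the input to obtain a noise-invariant, perceptually aligned direction, smoothed with a Perona–Malik (anisotropic diffusion) filter, and then fed into an MI-FGSM-style update. All function names and hyperparameters (the noise scale sigma, the diffusion parameters kappa and gamma, the step sizes) are illustrative assumptions, and the diverse-inputs transformation is omitted for brevity.

# A minimal sketch, assuming a PyTorch classifier `model` and inputs `x` in [0, 1];
# all hyperparameters below are illustrative placeholders, not the authors' settings.

import torch
import torch.nn.functional as F


def noise_averaged_gradient(model, x, y, n_samples=8, sigma=0.1):
    # Average the loss gradient over randomly noised copies of the input, so the
    # resulting direction is invariant to small noise (perceptually aligned).
    grad = torch.zeros_like(x)
    for _ in range(n_samples):
        x_noisy = (x + sigma * torch.randn_like(x)).clamp(0, 1).requires_grad_(True)
        loss = F.cross_entropy(model(x_noisy), y)
        grad += torch.autograd.grad(loss, x_noisy)[0]
    return grad / n_samples


def perona_malik(g, n_iters=10, kappa=0.1, gamma=0.2):
    # Perona-Malik anisotropic diffusion: smooth the perturbation map while
    # preserving strong edges, i.e. suppress local oscillation but keep
    # perceptually relevant structure. In practice kappa should be on the
    # order of the typical gradient magnitude.
    for _ in range(n_iters):
        # Differences toward the four neighbours (replicated borders -> zero flux).
        d_n = F.pad(g, (0, 0, 1, 0), mode="replicate")[..., :-1, :] - g
        d_s = F.pad(g, (0, 0, 0, 1), mode="replicate")[..., 1:, :] - g
        d_w = F.pad(g, (1, 0, 0, 0), mode="replicate")[..., :, :-1] - g
        d_e = F.pad(g, (0, 1, 0, 0), mode="replicate")[..., :, 1:] - g
        # Edge-stopping function c(d) = exp(-(d / kappa)^2) slows diffusion at edges.
        g = g + gamma * sum(torch.exp(-(d / kappa) ** 2) * d
                            for d in (d_n, d_s, d_w, d_e))
    return g


def attack(model, x, y, eps=16 / 255, steps=10, mu=1.0):
    # MI-FGSM-style iterative attack driven by the smoothed, noise-averaged gradient.
    alpha = eps / steps
    x_adv, momentum = x.clone(), torch.zeros_like(x)
    for _ in range(steps):
        g = perona_malik(noise_averaged_gradient(model, x_adv, y))
        momentum = mu * momentum + g / (g.abs().mean(dim=(1, 2, 3), keepdim=True) + 1e-12)
        x_adv = torch.min(torch.max(x_adv + alpha * momentum.sign(), x - eps), x + eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

Under this sketch, the smoothed, noise-averaged gradient simply replaces the raw gradient inside the momentum update, which is why the construction can be wrapped around any gradient-based attack.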



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 62072127, 62002076), the National Natural Science Foundation for Outstanding Youth Foundation (No. 61722203), Project 6142111180404 supported by CNKLSTISS, the Science and Technology Program of Guangzhou, China (No. 202002030131), the Guangdong Basic and Applied Basic Research Fund Joint Fund Youth Fund (No. 2019A1515110213), and the Educational Commission of Guangdong Province of China (2016KZDXM036).

Author information


Correspondence to Xianmin Wang or Jin Li.



About this article


Cite this article

Chen, H., Lu, K., Wang, X. et al. Generating transferable adversarial examples based on perceptually-aligned perturbation. Int. J. Mach. Learn. & Cyber. 12, 3295–3307 (2021). https://doi.org/10.1007/s13042-020-01240-1
