
Generating transferable adversarial examples based on perceptually-aligned perturbation

  • Original Article
  • International Journal of Machine Learning and Cybernetics

Abstract

Neural networks (NNs) are known to be susceptible to adversarial examples (AEs), which are intentionally crafted to deceive a target classifier by adding small perturbations to the inputs. Interestingly, AEs crafted for one NN can also mislead another model. This property, referred to as transferability, is often exploited to mount attacks in black-box settings. To mitigate the transferability of AEs, many approaches have been explored to enhance the robustness of NNs; in particular, adversarial training (AT) and its variants are shown to be the strongest defenses against such transferable AEs. To boost the transferability of AEs against robust models that have undergone AT, a novel AE generation method is proposed in this paper. Our method is motivated by the observation that models hardened by AT are more sensitive to perceptually relevant gradients, so it is reasonable to synthesize AEs from perturbations that carry perceptually aligned features. The proposed method proceeds in two steps. First, by optimizing the loss function over an ensemble of randomly noised inputs, we obtain perceptually aligned perturbations that are invariant to input noise. Second, we apply a Perona–Malik (P–M) filter to smooth the derived adversarial perturbations, so that their perceptually relevant features are significantly reinforced and their local oscillations are substantially suppressed. The method can be combined with any gradient-based attack. We carry out extensive experiments on the ImageNet dataset with various robust and non-robust models, and the results demonstrate the effectiveness of our method. In particular, by combining our method with the diverse inputs method and the momentum iterative fast gradient sign method, we achieve state-of-the-art performance in fooling the robust models.
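To make the two-step procedure above concrete, the following is a minimal sketch (not the authors' implementation) of how such an attack could be assembled in PyTorch: the loss gradient is averaged over an ensemble of randomly noised copies of the input to obtain a noise-invariant, perceptually aligned direction, smoothed with a Perona–Malik (anisotropic diffusion) filter, and then fed into an MI-FGSM-style update. All function names and hyperparameters (the noise scale sigma, the diffusion parameters kappa and gamma, the step sizes) are illustrative assumptions, and the diverse-inputs transformation is omitted for brevity.

# A minimal sketch, assuming a PyTorch classifier `model` and inputs `x` in [0, 1];
# all hyperparameters below are illustrative placeholders, not the authors' settings.

import torch
import torch.nn.functional as F


def noise_averaged_gradient(model, x, y, n_samples=8, sigma=0.1):
    # Average the loss gradient over randomly noised copies of the input, so the
    # resulting direction is invariant to small noise (perceptually aligned).
    grad = torch.zeros_like(x)
    for _ in range(n_samples):
        x_noisy = (x + sigma * torch.randn_like(x)).clamp(0, 1).requires_grad_(True)
        loss = F.cross_entropy(model(x_noisy), y)
        grad += torch.autograd.grad(loss, x_noisy)[0]
    return grad / n_samples


def perona_malik(g, n_iters=10, kappa=0.1, gamma=0.2):
    # Perona-Malik anisotropic diffusion: smooth the perturbation map while
    # preserving strong edges, i.e. suppress local oscillation but keep
    # perceptually relevant structure. In practice kappa should be on the
    # order of the typical gradient magnitude.
    for _ in range(n_iters):
        # Differences toward the four neighbours (replicated borders -> zero flux).
        d_n = F.pad(g, (0, 0, 1, 0), mode="replicate")[..., :-1, :] - g
        d_s = F.pad(g, (0, 0, 0, 1), mode="replicate")[..., 1:, :] - g
        d_w = F.pad(g, (1, 0, 0, 0), mode="replicate")[..., :, :-1] - g
        d_e = F.pad(g, (0, 1, 0, 0), mode="replicate")[..., :, 1:] - g
        # Edge-stopping function c(d) = exp(-(d / kappa)^2) slows diffusion at edges.
        g = g + gamma * sum(torch.exp(-(d / kappa) ** 2) * d
                            for d in (d_n, d_s, d_w, d_e))
    return g


def attack(model, x, y, eps=16 / 255, steps=10, mu=1.0):
    # MI-FGSM-style iterative attack driven by the smoothed, noise-averaged gradient.
    alpha = eps / steps
    x_adv, momentum = x.clone(), torch.zeros_like(x)
    for _ in range(steps):
        g = perona_malik(noise_averaged_gradient(model, x_adv, y))
        momentum = mu * momentum + g / (g.abs().mean(dim=(1, 2, 3), keepdim=True) + 1e-12)
        x_adv = torch.min(torch.max(x_adv + alpha * momentum.sign(), x - eps), x + eps)
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

Under this sketch, the smoothed, noise-averaged gradient simply replaces the raw gradient inside the momentum update, which is why the construction can be wrapped around any gradient-based attack.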



Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 62072127, 62002076), the National Natural Science Foundation for Outstanding Youth Foundation (No. 61722203), Project 6142111180404 supported by CNKLSTISS, the Science and Technology Program of Guangzhou, China (No. 202002030131), the Guangdong Basic and Applied Basic Research Fund Joint Fund Youth Fund (No. 2019A1515110213), and the Educational Commission of Guangdong Province of China (2016KZDXM036).

Author information


Correspondence to Xianmin Wang or Jin Li.



About this article


Cite this article

Chen, H., Lu, K., Wang, X. et al. Generating transferable adversarial examples based on perceptually-aligned perturbation. Int. J. Mach. Learn. & Cyber. 12, 3295–3307 (2021). https://doi.org/10.1007/s13042-020-01240-1
