Information Fusion

Volume 76, December 2021, Pages 55-65

Full length article
A defense method based on attention mechanism against traffic sign adversarial samples

https://doi.org/10.1016/j.inffus.2021.05.005

Highlights

  • An attacker can fool a neural network by adding a tiny disturbance to an image.

  • If traffic signs are attacked, automatic driving will probably be misled.

  • The key areas of the image are extracted by the attention mechanism.

  • The method can, in theory, defend against unknown attacks.

Abstract

A traditional neural network cannot achieve invariance to image rotation and distortion well, so an attacker can fool the network by adding tiny disturbances to an image. If traffic signs are attacked, automatic driving will probably be misled, leading to disastrous consequences. Inspired by the principle of human vision, this paper proposes a defense method based on an attention mechanism for traffic sign adversarial samples. In this method, the affine coordinate parameters of the target objects in the images are extracted by a CNN, and then the target objects are redrawn by the coordinate mapping module. In this process, the key areas of the image are extracted by the attention mechanism, and the pixels are filtered by interpolation. Our model simulates the daily behavior of human beings, making it more intelligent in defending against adversarial samples. Experiments show that our method has a strong defense ability against traffic sign adversarial samples generated by various attack methods. Compared with other defense methods, our method is more universal and defends well against a variety of attacks. Moreover, our model is portable and can easily be implanted into neural networks in the form of a defense plug-in.

Introduction

Deep neural networks (DNNs) have achieved great success in various tasks such as semantic segmentation, target tracking and image retrieval [1], [2], [3]. However, as first shown by Szegedy et al. [4], an adversarial sample created by adding a subtle disturbance to an input sample can cause an image classification model to output the wrong class with high confidence. Since the added disturbance is very small, the adversarial sample looks almost the same as the original image. Subsequently, a large number of attack methods have been developed [5], [6], [7], [8], exposing a severe shortcoming of neural network recognition.

Intelligent transportation systems have been emerging gradually, and in particular, DNNs have been used in the control pipelines of automobiles [9], [10]. Traffic sign recognition is a key technology and one of the challenges in intelligent transportation research. As shown in Fig. 1, if an attacker generates adversarial samples of traffic signs, images that do not appear different from normal traffic signs to human eyes may be misjudged as the wrong signs by the automatic recognition system [11], which can clearly result in catastrophic consequences.

Currently, the existing defense methods are designed for specific adversarial samples. Experiments have shown that once the attack mode changes, the defense ability is greatly reduced or even lost. Therefore, we seek to explore a defense model that is independent of the attack mode. We speculate that applying a visual attention mechanism [12] in neural networks may provide a defense against such attacks. When people observe a target object, they often focus on its key parts and ignore the background information. As shown in Fig. 2, people will directly focus on the bird in the image and ignore the blue-sky background. Attention mechanisms have been widely used in the field of computer vision [13]. Inspired by this, we implant a visual attention mechanism into a DNN to guide it to focus on the key parts of the target object while ignoring the disturbance in the adversarial samples.

In this paper, we design a defense method based on an attention mechanism to eliminate the disturbance in adversarial samples and capture the main features of the correct objects. In our model, through end-to-end training, a CNN extracts the affine coordinate parameters of the target object in the image, and the target object is redrawn by the coordinate mapping module. During this process, the key areas of the image are extracted by the attention mechanism, and the pixels are filtered by interpolation to eliminate the misclassification caused by the disturbance in the adversarial samples. Different from the current conventional defense methods, our method does not target a specific neural network model or the structure of the adversarial samples; rather, it guides the network to pay attention to the key parts of the target object in a more intelligent manner and discards the non-vital parts to fundamentally defend against adversarial samples (a minimal code sketch of this idea follows the contribution list below). Equipped with this perspective, our contributions can be summarized as follows:

  • 1.

    We propose a defense method based on the attention mechanism that can guide the neural network to focus on the key parts of the target object in the image while ignoring the disturbance to achieve the defense goal;

  • 2.

    We propose a spatial transformation model that can transform the significant parts of the image into a new space and redraw them to generate new images;

  • 3.

    The experimental results show that compared with other defense models, our model has a balanced performance on multiple neural networks and datasets. This indicates that our model has better robustness and universality.
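To make the plug-in idea concrete before the detailed model description, the following is a minimal sketch, written for illustration and not taken from the paper's code, of how an attention-based spatial-transformation defense could be prepended to an unchanged traffic sign classifier. The class and variable names (`DefendedClassifier`, `defense`, `classifier`) are our assumptions.

```python
# A minimal sketch (our assumption, not the authors' code) of using the
# attention-based defense as a plug-in in front of an existing classifier.
import torch
import torch.nn as nn

class DefendedClassifier(nn.Module):
    """Prepends a defense module that redraws only the key region of the input."""
    def __init__(self, defense: nn.Module, classifier: nn.Module):
        super().__init__()
        self.defense = defense          # attention-based spatial transformation
        self.classifier = classifier    # any traffic-sign CNN, left unchanged

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_clean = self.defense(x)       # background and out-of-region noise removed
        return self.classifier(x_clean) # classification on the redrawn image
```

Because the defense only rewrites the input image, the downstream classifier needs no architectural change, which is consistent with the plug-in property described in the abstract.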


Background

In this section, we give a brief overview of adversarial samples and defense methods.

Inspiration of an attention mechanism

In this study, we try to explore a more intelligent and universal defense model. Next, we analyze the defects of CNNs and then introduce the inspiration for the attention mechanism.

The convolution and pooling mechanisms of CNNs are well suited to image recognition, so they have been widely used in the field of computer vision [34]. In an image, as shown in Fig. 2, the target object usually occupies only part of the space, and the rest is irrelevant background. However, in the process of

The proposed defense model

In this paper, we propose a defense model based on the attention mechanism. The workflow of our model is shown in Fig. 3. It includes three modules: the parameter regressing module, the coordinate mapping module and the image interpolation module. An image (or a feature map) is input into our model, and the first CNN in the parameter regressing module is responsible for regressing the coordinate parameters of the target object. Then, the reverse affine transformation in the coordinate mapping module maps the target object into a new coordinate space, and the image interpolation module redraws it by filtering the pixels with interpolation.
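The paper does not list source code for these three modules; the sketch below shows one plausible spatial-transformer-style realization in PyTorch that matches the description above: a localization CNN regressing 2x3 affine parameters, a coordinate-mapping step that builds the sampling grid, and bilinear interpolation (our assumption for the interpolation step) to redraw the attended region. The layer sizes and the identity initialization are our assumptions.

```python
# A minimal sketch of the three modules described above, implemented as a
# spatial-transformer-style block in PyTorch. Layer sizes, input channels and
# the identity initialization are our assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionSpatialDefense(nn.Module):
    def __init__(self, in_channels: int = 3):
        super().__init__()
        # (1) Parameter regressing module: a small CNN that regresses the
        #     2x3 affine coordinate parameters of the key (attended) region.
        self.localization = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.fc_theta = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 4 * 4, 64), nn.ReLU(),
            nn.Linear(64, 6),
        )
        # Start from the identity transform so training begins with the
        # original image and gradually learns to zoom onto the key region.
        self.fc_theta[-1].weight.data.zero_()
        self.fc_theta[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # (2) Coordinate mapping module: build the sampling grid from the
        #     regressed affine parameters (mapping output coords back to input).
        theta = self.fc_theta(self.localization(x)).view(-1, 2, 3)
        grid = F.affine_grid(theta, size=x.size(), align_corners=False)
        # (3) Image interpolation module: redraw the attended region with
        #     bilinear interpolation, which filters the perturbed pixels.
        return F.grid_sample(x, grid, mode="bilinear", align_corners=False)
```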

Experiments and results

In this section, we verify the performance of our model against various attack methods on the traffic sign database. To verify the robustness of the model in the real environment, we test its defense ability at different distances and angles in a simulated real traffic environment. In addition, we carry out experiments on a daily image database to verify its generality.
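As one concrete, hypothetical instance of such an evaluation (not the paper's actual experimental protocol), the sketch below crafts FGSM adversarial samples (Goodfellow et al.) against the undefended classifier and compares accuracy with and without the defense module. The data loader, the epsilon budget and the model objects are placeholders.

```python
# Hypothetical evaluation sketch: measure accuracy on FGSM adversarial samples
# with and without the defense module. `loader`, `classifier` and `defended`
# are placeholders for a traffic-sign data loader and the two models.
import torch
import torch.nn.functional as F

def fgsm(model, x, y, eps=4 / 255):
    """One-step FGSM: perturb x in the direction of the loss gradient sign."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

@torch.no_grad()
def accuracy(model, images, labels):
    return (model(images).argmax(dim=1) == labels).float().mean().item()

def evaluate(loader, classifier, defended, eps=4 / 255):
    plain_acc, defended_acc, n = 0.0, 0.0, 0
    for x, y in loader:
        # Adversarial samples are crafted against the undefended classifier.
        x_adv = fgsm(classifier, x, y, eps)
        plain_acc += accuracy(classifier, x_adv, y) * len(y)
        defended_acc += accuracy(defended, x_adv, y) * len(y)
        n += len(y)
    return plain_acc / n, defended_acc / n
```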

Conclusion

The defense model proposed in this paper is a spatial transformation module based on the attention mechanism. Through end-to-end learning, the main features of the object in the image are extracted, the object is then enlarged by spatial transformation, and the unimportant background and noise disturbance in the image are removed at the same time. As a plug-in, the module can be easily embedded into other neural networks, propagating the important features it extracts and using them as input to the subsequent network.
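The end-to-end learning mentioned above can be illustrated with the following hypothetical training sketch, in which the defense module and the classifier are optimized jointly with an ordinary classification loss; the optimizer choice and hyperparameters are our assumptions, not the paper's settings.

```python
# Hypothetical end-to-end training sketch: the defense module and classifier
# are optimized jointly with a standard classification loss, so the regressed
# affine parameters learn to focus on the region that helps classification.
import torch
import torch.nn as nn

def train_end_to_end(defended_model: nn.Module, loader, epochs: int = 10):
    optimizer = torch.optim.Adam(defended_model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()
    defended_model.train()
    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = criterion(defended_model(x), y)   # gradients flow through the
            loss.backward()                          # bilinear resampling into
            optimizer.step()                         # the localization CNN
    return defended_model
```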

Discussion

Let us analyze in depth why the proposed defense model can defend against adversarial samples. When an attacker perturbs an image, the location of the disturbance is not fixed. Our method enlarges the key parts of the object in the adversarial sample and then crops the original image. If the disturbance lies outside the cropping range, it is eliminated directly. If the disturbance lies within the cropping range, the interpolation in our model can change the disturbed pixel values, weakening the effect of the perturbation.
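A toy numerical illustration of this argument, using a hand-picked 2x zoom onto the central region instead of learned affine parameters, is given below; it is our own example, not an experiment from the paper.

```python
# Toy illustration: a perturbation placed near the image border disappears
# entirely after zooming onto the central region, because the resampling grid
# never visits those pixels; the 2x zoom factor is hand-chosen for illustration.
import torch
import torch.nn.functional as F

x = torch.zeros(1, 1, 32, 32)
x[:, :, 14:18, 14:18] = 1.0          # "target object" in the centre
delta = torch.zeros_like(x)
delta[:, :, 0:3, 0:3] = 0.5          # disturbance near the border
x_adv = x + delta

# Affine parameters that sample only the central half of the image
# (equivalent to a 2x zoom onto the attended region).
theta = torch.tensor([[[0.5, 0.0, 0.0],
                       [0.0, 0.5, 0.0]]])
grid = F.affine_grid(theta, size=x_adv.size(), align_corners=False)
x_clean = F.grid_sample(x_adv, grid, mode="bilinear", align_corners=False)
x_ref = F.grid_sample(x, grid, mode="bilinear", align_corners=False)

print(delta.abs().sum().item())             # energy of the added disturbance
print((x_clean - x_ref).abs().sum().item()) # 0.0: the border disturbance is gone
```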

CRediT authorship contribution statement

Jian Weng: Conception and design of study, Acquisition of data, Analysis and/or interpretation of data, Writing - original draft, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported in part by the National Key R&D Program of China under Grant 2017YFB0802200, the Major Program of Guangdong Basic and Applied Research, China under Grant 2019B030302008.

References (46)

  • Ren L. et al., Uniform and variational deep learning for RGB-D object recognition and person re-identification, IEEE Trans. Image Process. (2019)
  • Li H. et al., CIFAR10-DVS: an event-stream dataset for object classification, Front. Neurosci. (2017)
  • LeCun Y. et al., Deep learning, Nature (2015)
  • Reichstein M. et al., Deep learning and process understanding for data-driven Earth system science, Nature (2019)
  • Szegedy C. et al., Intriguing properties of neural networks (2014)
  • B. Heo, M. Lee, S. Yun, J.Y. Choi, Knowledge distillation with adversarial samples supporting decision boundary, in:...
  • Carlini N. et al., Towards evaluating the robustness of neural networks
  • Papernot N. et al., The limitations of deep learning in adversarial settings
  • C. Xie, J. Wang, Z. Zhang, Y. Zhou, L. Xie, A. Yuille, Adversarial examples for semantic segmentation and object...
  • Geiger A. et al., Are we ready for autonomous driving? The KITTI vision benchmark suite
  • Teichmann M. et al., MultiNet: Real-time joint semantic reasoning for autonomous driving
  • Eykholt K. et al., Robust physical-world attacks on deep learning models (2018)
  • Ungerleider K. et al., Mechanisms of visual attention in the human cortex, Annu. Rev. Neurosci. (2003)
  • Lopez P.R. et al., Pay attention to the activations: a modular attention mechanism for fine-grained image recognition, IEEE Trans. Multimed. (2019)
  • K. Eykholt, I. Evtimov, E. Fernandes, B. Li, Robust physical-world attacks on deep learning visual classification, in:...
  • Papernot N. et al., Transferability in machine learning: from phenomena to black-box attacks using adversarial samples (2016)
  • Goodfellow I.J. et al., Explaining and harnessing adversarial examples (2014)
  • Kurakin A. et al., Adversarial machine learning at scale (2016)
  • Su J. et al., One pixel attack for fooling deep neural networks, IEEE Trans. Evol. Comput. (2019)
  • Szegedy C. et al., Intriguing properties of neural networks (2013)
  • Fawzi A. et al., Robustness of classifiers: from adversarial to random noise
  • A. Nguyen, J. Yosinski, J. Clune, Deep neural networks are easily fooled: High confidence predictions for...
  • S.-M. Moosavi-Dezfooli, A. Fawzi, O. Fawzi, P. Frossard, Universal adversarial perturbations, in: Proceedings of the...