Abstract
Adversarial training has become one of the most widely used defenses against adversarial examples, owing to its ability to improve the robustness of neural networks. To this end, many representative works have sought to optimize the hyper-parameters of adversarial training so as to obtain the best trade-off between classification accuracy and robustness. However, existing approaches remain immature, particularly with respect to model accuracy and training efficiency. In this paper, we propose Specific Adversarial Training (SAT), a novel framework that addresses this challenge. Specifically, SAT improves the adversarial training process by crafting a specific perturbation and label for each data point. The generated samples approach and properly cross the decision boundary while receiving a well-matched soft label, which benefits adversarial training. Experimental results show that our method achieves 88.62% natural accuracy on CIFAR-10 while improving adversarial accuracy from 43.79% to 52.34%. Moreover, it is more efficient than prior works.
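The core idea described above (a per-sample perturbation grown until the example just crosses the decision boundary, paired with a softened label) can be sketched in miniature. This is not the authors' implementation; it is a minimal illustration on a linear logistic classifier, where `fgsm_step`, `eps_step`, `eps_max`, and the smoothing factor `alpha` are all assumed names chosen here for exposition.

```python
import numpy as np

def fgsm_step(w, b, x, y, eps):
    # For logistic loss, the input gradient is (sigmoid(w.x + b) - y) * w,
    # so a signed-gradient (FGSM-style) step perturbs x along sign(grad).
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))
    grad = (p - y) * w
    return x + eps * np.sign(grad)

def specific_example(w, b, x, y, eps_step=0.05, eps_max=1.0, alpha=0.7):
    """Grow the per-sample budget until the crafted example just crosses
    the decision boundary, then return it with a softened label."""
    eps, x_adv = 0.0, x.copy()
    while eps < eps_max:
        eps += eps_step
        x_adv = fgsm_step(w, b, x, y, eps)
        pred = int((w @ x_adv + b) > 0)
        if pred != y:          # boundary crossed: stop growing epsilon
            break
    # Soft label: keep most of the mass (alpha) on the true class,
    # acknowledging that the sample now sits just past the boundary.
    soft = np.array([1 - y, y]) * alpha + np.array([y, 1 - y]) * (1 - alpha)
    return x_adv, soft, eps
```

Each training point thus receives its own budget `eps` rather than a single global one, which is the "one radish, one hole" intuition in the title: a sample far from the boundary tolerates a larger perturbation than one already near it.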
Acknowledgments
This work is supported by the National Natural Science Foundation of China under Grants 62020106013, 61972454, 61802051, 61772121, and 61728102, Sichuan Science and Technology Program under Grants 2020JDTD0007 and 2020YFG0298, the Fundamental Research Funds for Chinese Central Universities under Grant ZYGX2020ZB027.
Cite this article
Zhang, Y., Li, H., Xu, G. et al. One radish, One hole: Specific adversarial training for enhancing neural network’s robustness. Peer-to-Peer Netw. Appl. 14, 2262–2274 (2021). https://doi.org/10.1007/s12083-021-01178-3