
Why Does Regularization Help with Mitigating Poisoning Attacks?

Neural Processing Letters

Abstract

We use distributionally robust optimization to mitigate the effect of data poisoning attacks in machine learning. We provide performance guarantees for the trained model on the original data (excluding the poisoned records) by training the model against the worst-case distribution in a Wasserstein-distance neighbourhood of the empirical distribution extracted from the poisoned training dataset. We relax the distributionally robust machine learning problem by upper-bounding the worst-case fitness with the empirical sample-averaged fitness plus a regularizer given by the Lipschitz constant of the fitness function with respect to the data (for fixed model parameters). For regression models, we prove that this regularizer equals the dual norm of the model parameters.
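To make the relaxation concrete, here is a sketch of the bound, with notation assumed for illustration rather than taken from the paper: let $\hat{P}_n$ be the empirical distribution of the $n$ (possibly poisoned) training records $\xi_1,\dots,\xi_n$, let $\varepsilon$ be the Wasserstein-ball radius, and let $\ell(\theta;\xi)$ be the fitness of model parameters $\theta$ on a data point $\xi$. A Kantorovich-Rubinstein-type duality argument gives, for any $\theta$,

$$\sup_{Q:\,W(Q,\hat{P}_n)\le\varepsilon} \mathbb{E}_{\xi\sim Q}\big[\ell(\theta;\xi)\big] \;\le\; \frac{1}{n}\sum_{i=1}^{n}\ell(\theta;\xi_i) \;+\; \varepsilon\,\mathrm{Lip}\big(\ell(\theta;\cdot)\big),$$

so minimizing the right-hand side is an ordinary regularized empirical-risk problem. For linear regression, the Lipschitz constant of $\xi \mapsto \ell(\theta;\xi)$ reduces, up to a constant depending on the transport cost, to the dual norm $\|\theta\|_*$ of the model parameters.

The following Python sketch illustrates this regularized relaxation on synthetic poisoned data. The radius eps, the pairing of an $\ell_\infty$ transport cost with its dual $\ell_1$ penalty, and the label-flipping attack are all assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

# Minimal sketch: minimize the sample-averaged loss plus eps * ||w||_*,
# where eps plays the role of the Wasserstein-ball radius and ||.||_* is
# the dual of the norm defining the transport cost on the data space.
# Here an l_inf transport cost is paired with its dual l_1 penalty; both
# choices are illustrative assumptions.

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

# Simulate a simple poisoning attack: flip the targets of a few records.
n_poison = 20
y[:n_poison] = -y[:n_poison]

def objective(w, X, y, eps):
    residual = X @ w - y
    empirical_loss = np.mean(residual ** 2)  # sample-averaged fitness
    dual_norm = np.sum(np.abs(w))            # l_1 = dual of l_inf
    return empirical_loss + eps * dual_norm

def subgradient(w, X, y, eps):
    residual = X @ w - y
    grad_loss = 2.0 * X.T @ residual / len(y)
    return grad_loss + eps * np.sign(w)      # subgradient of the l_1 term

# Plain subgradient descent; eps would be chosen from the assumed
# Wasserstein-ball radius around the empirical distribution.
w, eps, lr = np.zeros(d), 0.1, 0.01
for _ in range(2000):
    w -= lr * subgradient(w, X, y, eps)

print("regularized objective:", objective(w, X, y, eps))
```

The penalty term is what the dual-norm result buys: instead of searching over distributions in the Wasserstein ball, one trains with a familiar norm regularizer whose weight is the ball radius.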




Acknowledgements

Funding was provided by the University of Melbourne.

Author information

Corresponding author

Correspondence to Farhad Farokhi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Farokhi, F. Why Does Regularization Help with Mitigating Poisoning Attacks?. Neural Process Lett 53, 2933–2945 (2021). https://doi.org/10.1007/s11063-021-10539-1
