Wild Patterns Reloaded: A Survey of Machine Learning Security against Training Data Poisoning
ACM Computing Surveys ( IF 16.6 ) Pub Date : 2023-03-01 , DOI: 10.1145/3585385
Antonio Emanuele Cinà, Kathrin Grosse, Ambra Demontis, Sebastiano Vascon, Werner Zellinger, Bernhard A. Moser, Alina Oprea, Battista Biggio, Marcello Pelillo, Fabio Roli

The success of machine learning is fueled by the increasing availability of computing power and large training datasets. The training data is used to learn new models or update existing ones, assuming that it is sufficiently representative of the data that will be encountered at test time. This assumption is challenged by the threat of poisoning, an attack that manipulates the training data to compromise the model’s performance at test time. Although poisoning has been acknowledged as a relevant threat in industry applications, and a variety of different attacks and defenses have been proposed so far, a complete systematization and critical review of the field is still missing. In this survey, we provide a comprehensive systematization of poisoning attacks and defenses in machine learning, reviewing more than 100 papers published in the field in the last 15 years. We start by categorizing the current threat models and attacks, and then organize existing defenses accordingly. While we focus mostly on computer-vision applications, we argue that our systematization also encompasses state-of-the-art attacks and defenses for other data modalities. Finally, we discuss existing resources for research in poisoning, and shed light on the current limitations and open research questions in this research field.
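To make the threat concrete, the following is a minimal, hypothetical sketch (not taken from the survey) of one of the simplest poisoning strategies it covers, label flipping: an attacker who controls part of the training set flips a fraction of the training labels, degrading the model's accuracy on clean test data. The dataset, model choice, and 40% flip rate are illustrative assumptions only.

```python
# Hypothetical label-flipping poisoning sketch (illustrative, not the survey's method):
# train the same classifier on clean vs. partially label-flipped data and
# compare accuracy on an untouched test set.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Baseline: model trained on clean labels.
clean_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)

# Poisoning: the attacker flips 40% of the training labels at random.
y_pois = y_tr.copy()
flip_idx = rng.choice(len(y_tr), size=int(0.4 * len(y_tr)), replace=False)
y_pois[flip_idx] = 1 - y_pois[flip_idx]
pois_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_pois).score(X_te, y_te)

print(f"clean test accuracy:    {clean_acc:.3f}")
print(f"poisoned test accuracy: {pois_acc:.3f}")
```

More sophisticated attacks surveyed in the paper (e.g., clean-label or backdoor poisoning) keep the labels plausible and instead optimize the poisoned inputs themselves, which makes them far harder to detect than this naive baseline.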


