Model-Targeted Poisoning Attacks: Provable Convergence and Certified Bounds
arXiv - CS - Machine Learning. Pub Date: 2020-06-30, DOI: arxiv-2006.16469
Fnu Suya, Saeed Mahloujifar, David Evans, Yuan Tian

Machine learning systems that rely on training data collected from untrusted sources are vulnerable to poisoning attacks, in which adversaries who control some of the collected data can induce a corrupted model. In this paper, we consider poisoning attacks in which the adversary has a particular target classifier in mind and aims to induce a classifier close to that target by adding as few poisoning points as possible. We propose an efficient poisoning attack based on online convex optimization. Unlike previous model-targeted poisoning attacks, our attack provably converges to any achievable target classifier: the distance from the induced classifier to the target classifier is inversely proportional to the square root of the number of poisoning points. We also provide a certified lower bound on the minimum number of poisoning points needed to achieve a given target classifier. Our experiments show that the attack performs similarly to or better than state-of-the-art attacks in terms of attack success rate and distance to the target model, while offering provable convergence and the efficiency of an online attack that can determine near-optimal poisoning points incrementally.
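The abstract does not spell out the attack procedure, so the following is only a toy sketch of what an incremental, model-targeted poisoning loop could look like. Everything in it is an assumption made for illustration: the 1-D least-squares model, the four-point candidate set, the greedy "largest loss gap" selection rule, and the names `fit`, `loss`, and `poison`. It is not the authors' implementation and does not reproduce their convergence rate or certified bound.

```python
# Toy sketch (assumption): 1-D least-squares model y ~ theta * x.
# The greedy selection rule is illustrative, not taken from the abstract.

def fit(points):
    """Closed-form minimizer of sum((theta * x - y) ** 2) over the points."""
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    return sxy / sxx

def loss(theta, x, y):
    """Squared error of the model theta on a single point."""
    return (theta * x - y) ** 2

def poison(clean, theta_target, candidates, n_points):
    """Add n_points one at a time; return the points and per-step distances."""
    poison_pts, distances = [], []
    for _ in range(n_points):
        # Retrain on the clean data plus the poison chosen so far.
        theta = fit(clean + poison_pts)
        # Pick the candidate where the current model does worst
        # relative to the target model.
        best = max(
            candidates,
            key=lambda p: loss(theta, *p) - loss(theta_target, *p),
        )
        poison_pts.append(best)
        distances.append(abs(fit(clean + poison_pts) - theta_target))
    return poison_pts, distances

# Clean data lies on y = x, so the clean model is theta = 1.
clean = [(x / 10, x / 10) for x in range(-10, 11)]
# Hypothetical feasible set of poisoning points.
candidates = [(x, y) for x in (-1, 1) for y in (-1, 1)]
points, distances = poison(clean, theta_target=-1.0,
                           candidates=candidates, n_points=50)
```

In this toy setting every chosen point lies on the target line y = -x, and the distance between the retrained model and the target shrinks with each added point. Because the loop chooses points one at a time, stopping early still yields a usable poisoning set, which is the kind of incremental behavior the abstract attributes to an online attack.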

Updated: 2020-07-01