Fair Classification with Adversarial Perturbations
arXiv - CS - Computers and Society. Pub Date: 2021-06-10. DOI: arxiv-2106.05964. Authors: L. Elisa Celis, Anay Mehrotra, Nisheeth K. Vishnoi
We study fair classification in the presence of an omniscient adversary that,
given an $\eta$, is allowed to choose an arbitrary $\eta$-fraction of the
training samples and arbitrarily perturb their protected attributes. The
motivation comes from settings in which protected attributes can be incorrect
due to strategic misreporting, malicious actors, or errors in imputation; and
prior approaches that make stochastic or independence assumptions on errors may
not satisfy their guarantees in this adversarial setting. Our main contribution
is an optimization framework to learn fair classifiers in this adversarial
setting that comes with provable guarantees on accuracy and fairness. Our
framework works with multiple and non-binary protected attributes, is designed
for the large class of linear-fractional fairness metrics, and can also handle
perturbations besides protected attributes. We prove near-tightness of our
framework's guarantees for natural hypothesis classes: no algorithm can have
significantly better accuracy and any algorithm with better fairness must have
lower accuracy. Empirically, we evaluate the classifiers produced by our
framework for statistical rate on real-world and synthetic datasets for a
family of adversaries.
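To make the adversarial model concrete, here is a minimal, hypothetical sketch (not the authors' framework): an adversary perturbs the protected attribute of an arbitrary $\eta$-fraction of training samples, and we measure the statistical rate (a linear-fractional fairness metric) of a classifier's predictions on the true versus perturbed attributes. The sampling of which indices to flip is a stand-in for the worst-case adversary's choice.

```python
# Hedged illustration of the adversarial perturbation model described above.
# All function names here are ours, not the paper's; a true omniscient
# adversary would pick the worst eta-fraction, not a random one.
import numpy as np

def perturb_protected(z, eta, rng):
    """Flip the binary protected attribute for an eta-fraction of samples.

    A worst-case adversary chooses *which* samples to perturb; here we
    flip a randomly chosen ceil(eta * n) indices as a simple stand-in.
    """
    z = z.copy()
    n = len(z)
    k = int(np.ceil(eta * n))
    idx = rng.choice(n, size=k, replace=False)  # adversary's chosen samples
    z[idx] = 1 - z[idx]                         # arbitrary perturbation
    return z

def statistical_rate(y_pred, z):
    """min_a P(y_hat = 1 | z = a) divided by max_a P(y_hat = 1 | z = a)."""
    rates = [y_pred[z == a].mean() for a in np.unique(z)]
    return min(rates) / max(rates)

rng = np.random.default_rng(0)
n = 1000
z = rng.integers(0, 2, size=n)                         # true protected attribute
y_pred = (rng.random(n) < 0.3 + 0.2 * z).astype(int)   # deliberately biased predictions
z_adv = perturb_protected(z, eta=0.1, rng=rng)         # adversarially perturbed attribute

# The statistical rate measured on perturbed attributes can differ
# markedly from the rate on the true attributes, which is why methods
# assuming stochastic/independent errors can lose their guarantees.
print(statistical_rate(y_pred, z), statistical_rate(y_pred, z_adv))
```

This only illustrates why perturbed protected attributes distort fairness estimates; the paper's contribution is an optimization framework whose accuracy and fairness guarantees hold despite such perturbations.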
Updated: 2021-06-11