Balancing out Bias: Achieving Fairness Through Training Reweighting
arXiv - CS - Computation and Language | Pub Date: 2021-09-16 | DOI: arxiv-2109.08253
Xudong Han, Timothy Baldwin, Trevor Cohn

Bias in natural language processing arises primarily from models learning characteristics of the author such as gender and race when modelling tasks such as sentiment and syntactic parsing. This problem manifests as disparities in error rates across author demographics, typically disadvantaging minority groups. Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables. Moreover, evaluation of bias has been inconsistent in previous work, in terms of dataset balance and evaluation methods. This paper introduces a very simple but highly effective method for countering bias using instance reweighting, based on the frequency of both task labels and author demographics. We extend the method in the form of a gated model which incorporates the author demographic as an input, and show that while it is highly vulnerable to input data bias, it provides debiased predictions through demographic input perturbation, and outperforms all other bias mitigation techniques when combined with instance reweighting.
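To make the core idea concrete, below is a minimal sketch of instance reweighting based on the joint frequency of task labels and author demographics. This is an illustrative implementation of the general technique, not the paper's exact scheme: the weighting rule (inverse joint frequency, normalised to mean 1) and the function name are assumptions for the example.

```python
import numpy as np

def balanced_instance_weights(labels, demographics):
    """Compute per-instance weights that balance the joint distribution
    of task labels and author demographics.

    Instances from rare (label, demographic) combinations receive
    proportionally larger weights, so each combination contributes
    equally to the weighted training loss.
    (Hypothetical helper for illustration; the paper's exact weighting
    may differ.)
    """
    pairs = list(zip(labels, demographics))

    # Joint empirical count of each (label, demographic) pair.
    counts = {}
    for p in pairs:
        counts[p] = counts.get(p, 0) + 1

    n = len(pairs)
    n_groups = len(counts)

    # Weight each instance by the inverse of its group's frequency,
    # normalised so the average weight over the dataset is 1.
    return np.array([n / (n_groups * counts[p]) for p in pairs])

# Example: binary sentiment labels and a binary demographic attribute.
y = [1, 1, 1, 0, 0, 1, 0, 1]
z = ["A", "A", "B", "A", "B", "A", "A", "A"]
w = balanced_instance_weights(y, z)
# Rare combinations such as (0, "B") receive larger weights; training
# then minimises the weighted loss, e.g. sum(w[i] * loss[i]) using a
# per-sample (reduction="none") loss in any standard framework.
```

The design choice here is that balancing the joint (label, demographic) distribution, rather than either marginal alone, directly targets the correlation between author demographics and task labels that the abstract identifies as the source of disparate error rates.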

Updated: 2021-09-20