Fair is Better than Sensational: Man is to Doctor as Woman is to Doctor



Computational Linguistics (IF 3.7), Pub Date: 2020-06-01, DOI: 10.1162/coli_a_00379
Malvina Nissim, Rik van Noord, Rob van der Goot


Analogies such as "man is to king as woman is to X" are often used to illustrate the amazing power of word embeddings. Concurrently, they have also been used to expose how strongly human biases are encoded in vector spaces trained on natural language, with examples like "man is to computer programmer as woman is to homemaker". Recent work has shown that analogies are in fact not an accurate diagnostic for bias, but this does not mean that they are no longer used, or that their legacy is fading. Instead of focusing on the intrinsic problems of the analogy task as a bias detection tool, we discuss a series of issues involving implementation as well as subjective choices, which might have yielded a distorted picture of bias in word embeddings. We stand by the truth that human biases are present in word embeddings and, of course, by the need to address them. But analogies are not an accurate tool to do so, and the way they have most often been used has exacerbated some possibly non-existent biases and perhaps hidden others. Because they are still widely popular, and some of them have become classics within and outside the NLP community, we deem it important to provide a series of clarifications that should put well-known, and potentially new, analogies into the right perspective.
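The abstract does not spell out which implementation issues it has in mind. One convention that is common in analogy code, and that the title appears to allude to, is barring the query words themselves from being returned as the answer. The following is a minimal sketch of how that choice can change the result. It assumes hand-made 3-dimensional toy vectors and a hypothetical `analogy` helper with an `exclude_inputs` flag; none of these come from the paper, and the vectors are not real embeddings.

```python
import math

# Hypothetical toy vectors (axes: person-ness, gender, medical).
# These are illustrative assumptions, not trained embeddings.
TOY_VECTORS = {
    "man":       [1.0,  0.1, 0.0],
    "woman":     [1.0, -0.1, 0.0],
    "doctor":    [0.5,  0.0, 1.0],
    "nurse":     [0.5, -0.5, 0.8],
    "homemaker": [0.8, -0.5, 0.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, vectors, exclude_inputs=True):
    """Answer "a is to b as c is to ?" with the word nearest to b - a + c.

    With exclude_inputs=True the three query words cannot be returned,
    so the query can never be answered with b itself.
    """
    query = [vb - va + vc
             for va, vb, vc in zip(vectors[a], vectors[b], vectors[c])]
    banned = {a, b, c} if exclude_inputs else set()
    candidates = [word for word in vectors if word not in banned]
    return max(candidates, key=lambda word: cosine(query, vectors[word]))


# Same query, with and without the input-word restriction.
restricted = analogy("man", "doctor", "woman", TOY_VECTORS)
unrestricted = analogy("man", "doctor", "woman", TOY_VECTORS,
                       exclude_inputs=False)
print(restricted, unrestricted)
```

On these toy vectors the restricted query returns "nurse", while the unrestricted query returns "doctor". The point of the sketch is only that the two settings can disagree, so a reported analogy should state which one was used.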


Updated: 2020-06-01
