Argument from Old Man's View: Assessing Social Bias in Argumentation
arXiv - CS - Computation and Language | Pub Date: 2020-11-24, DOI: arxiv-2011.12014
Maximilian Spliethöver, Henning Wachsmuth

Social bias in language - towards genders, ethnicities, ages, and other social groups - poses a problem with ethical impact for many NLP applications. Recent research has shown that machine learning models trained on respective data may not only adopt, but even amplify the bias. So far, however, little attention has been paid to bias in computational argumentation. In this paper, we study the existence of social biases in large English debate portals. In particular, we train word embedding models on portal-specific corpora and systematically evaluate their bias using WEAT, an existing metric to measure bias in word embeddings. In a word co-occurrence analysis, we then investigate causes of bias. The results suggest that all tested debate corpora contain unbalanced and biased data, mostly in favor of male people with European-American names. Our empirical insights contribute towards an understanding of bias in argumentative data sources.
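For context, the WEAT metric referenced above (Caliskan et al., 2017) scores bias as a normalized difference in association between two target word sets (e.g., European-American vs. African-American names) and two attribute word sets (e.g., pleasant vs. unpleasant terms). The sketch below shows that standard formulation only, not the authors' implementation; `emb` is assumed to be any word-to-vector mapping, such as a word2vec model trained on one of the debate-portal corpora.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B, emb):
    """s(w, A, B): how much more strongly word w associates with
    attribute set A than with attribute set B."""
    return (np.mean([cosine(emb[w], emb[a]) for a in A])
            - np.mean([cosine(emb[w], emb[b]) for b in B]))

def weat_effect_size(X, Y, A, B, emb):
    """WEAT effect size d: difference in mean association of the two
    target sets X and Y, normalized by the pooled standard deviation."""
    assoc_X = [association(x, A, B, emb) for x in X]
    assoc_Y = [association(y, A, B, emb) for y in Y]
    pooled_std = np.std(assoc_X + assoc_Y, ddof=1)
    return (np.mean(assoc_X) - np.mean(assoc_Y)) / pooled_std

# Hypothetical usage with illustrative word sets (not those of the paper):
# d = weat_effect_size(["john", "paul"], ["jamal", "darnell"],
#                      ["pleasant", "joy"], ["unpleasant", "agony"], emb)
```

Effect sizes near zero indicate little measured association; larger positive values indicate stronger association of the first target set with the first attribute set.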

Updated: 2020-11-25