RECAST: Interactive Auditing of Automatic Toxicity Detection Models
arXiv - CS - Computers and Society | Pub Date: 2020-01-07 | DOI: arXiv:2001.01819
Austin P. Wright, Omar Shaikh, Haekyu Park, Will Epperson, Muhammed Ahmed, Stephane Pinel, Diyi Yang, Duen Horng Chau

As toxic language becomes nearly pervasive online, there has been increasing interest in leveraging advancements in natural language processing (NLP), such as very large transformer models, to automatically detect and remove toxic comments. Despite the fairness concerns, lack of adversarial robustness, and limited prediction explainability of deep learning systems, there is currently little work on auditing these systems and understanding how they work for both developers and users. We present our ongoing work, RECAST, an interactive tool for examining toxicity detection models by visualizing explanations for predictions and providing alternative wordings for detected toxic speech.
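To make the idea of "visualizing explanations for predictions" concrete, below is a minimal sketch of token-level toxicity attribution. It assumes an off-the-shelf Hugging Face checkpoint (`unitary/toxic-bert`) and a simple leave-one-word-out occlusion scheme; both are illustrative stand-ins, not the model or explanation method used in the RECAST paper.

```python
# Minimal sketch: score each word by how much removing it lowers the toxicity score.
# Assumptions (not from the paper): the "unitary/toxic-bert" checkpoint and a simple
# occlusion-based attribution stand in for RECAST's own model and explanation method.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "unitary/toxic-bert"  # assumed off-the-shelf toxicity classifier
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def toxicity_score(text: str) -> float:
    """Probability assigned to the 'toxic' label (index 0 for this checkpoint)."""
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.sigmoid(logits)[0, 0].item()

def word_attributions(text: str) -> list:
    """Attribute toxicity to words via leave-one-word-out occlusion."""
    base = toxicity_score(text)
    words = text.split()
    scores = []
    for i in range(len(words)):
        ablated = " ".join(words[:i] + words[i + 1:])
        scores.append((words[i], base - toxicity_score(ablated)))
    return scores

if __name__ == "__main__":
    comment = "you are a total idiot"
    for word, delta in word_attributions(comment):
        print(f"{word:>10s}  {delta:+.3f}")
```

A tool like RECAST would render such per-word scores as highlights over the comment and suggest lower-scoring rewordings; the sketch above only illustrates the attribution step.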

Updated: 2020-07-02