Calling Out Bluff: Attacking the Robustness of Automatic Scoring Systems with Simple Adversarial Testing, arXiv - CS - Artificial Intelligence

arXiv - CS - Artificial Intelligence. Pub Date: 2020-07-14, DOI: arxiv-2007.06796
Yaman Kumar, Mehar Bhatia, Anubha Kabra, Jessy Junyi Li, Di Jin, Rajiv Ratn Shah

Significant progress has been made in deep-learning-based Automatic Essay Scoring (AES) systems over the past two decades. Their performance, as commonly measured by standard metrics such as Quadratic Weighted Kappa (QWK) and accuracy, reflects this progress. However, testing these AES systems on common-sense adversarial examples reveals their lack of natural language understanding capability. Inspired by common student behaviour during examinations, we propose a task-agnostic adversarial evaluation scheme for AES systems to test their natural language understanding capabilities and overall robustness.
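
The abstract names Quadratic Weighted Kappa (QWK) without defining it. As a rough illustration only, the following is a minimal sketch of the standard QWK definition for two lists of integer scores; the function name, its parameters, and the handling of edge cases are assumptions made for this example and are not taken from the paper.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Sketch of QWK between two equal-length lists of integer scores.

    Hypothetical helper for illustration; not the paper's implementation.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty score lists of equal length")
    n_cat = max_rating - min_rating + 1
    if n_cat < 2:
        raise ValueError("need at least two distinct rating categories")
    n = len(rater_a)

    # Observed agreement matrix: observed[i][j] counts items scored
    # category i by rater A and category j by rater B.
    observed = [[0] * n_cat for _ in range(n_cat)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    # Marginal histograms of each rater's scores.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_cat)) for j in range(n_cat)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_cat):
        for j in range(n_cat):
            # Quadratic penalty grows with the squared distance between scores.
            weight = (i - j) ** 2 / (n_cat - 1) ** 2
            # Expected count if the two raters scored independently.
            expected = hist_a[i] * hist_b[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        # Both raters gave the same single score to every item
        # (assumption: treat this as perfect agreement).
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

Under this definition, identical score lists give 1.0, agreement no better than chance gives 0.0, and systematic disagreement gives a negative value. For example, `quadratic_weighted_kappa([1, 2], [2, 1], 1, 2)` evaluates to -1.0.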

Updated: 2020-11-12