Numerical reasoning in machine reading comprehension tasks: are we there yet?

arXiv - CS - Computation and Language. Pub Date: 2021-09-16, DOI: arxiv-2109.08207
Hadeel Al-Negheimish, Pranava Madhyastha, Alessandra Russo

Numerical reasoning based machine reading comprehension is a task that involves reading comprehension along with using arithmetic operations such as addition, subtraction, sorting, and counting. The DROP benchmark (Dua et al., 2019) is a recent dataset that has inspired the design of NLP models aimed at solving this task. The current standings of these models in the DROP leaderboard, over standard metrics, suggest that the models have achieved near-human performance. However, does this mean that these models have learned to reason? In this paper, we present a controlled study on some of the top-performing model architectures for the task of numerical reasoning. Our observations suggest that the standard metrics are incapable of measuring progress towards such tasks.

Updated: 2021-09-20
