Incremental BERT with commonsense representations for multi-choice reading comprehension
Multimedia Tools and Applications (IF 3.0) Pub Date: 2021-07-28, DOI: 10.1007/s11042-021-11197-0
Ronghan Li, Lifang Wang, Zejun Jiang, Dong Liu, Meng Zhao, Xinyu Lu

Compared to extractive machine reading comprehension (MRC), which is limited to text spans, multi-choice MRC is more flexible in evaluating a model’s ability to utilize external commonsense knowledge. On the one hand, existing methods leverage transfer learning and complicated matching networks to solve multi-choice MRC, which lack interpretability for commonsense questions. On the other hand, although Transformer-based pre-trained language models such as BERT have shown powerful performance in MRC, external knowledge such as unspoken commonsense and world knowledge still cannot be used explicitly for downstream tasks. In this work, we present three simple yet effective injection methods plugged into BERT’s structure to fine-tune multi-choice MRC tasks directly with off-the-shelf commonsense representations. Moreover, we introduce a mask mechanism for token-level multi-hop relationship search to filter external knowledge. Experimental results indicate that the incremental BERT outperforms the baseline by a considerable margin on DREAM and CosmosQA, two knowledge-driven multi-choice datasets. Further analysis shows the robustness of the incremental model in the case of an incomplete training set.
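The abstract does not spell out the three injection methods, so the following is only a minimal sketch of one plausible variant: token-aligned commonsense vectors (e.g., pretrained ConceptNet-style embeddings) are projected into BERT's hidden space and fused with the encoder output through a token-level gate before the multi-choice classifier. The gating formulation, the `kb_embeds`/`kb_mask` inputs, and the placement after the encoder are illustrative assumptions, not the authors' exact architecture.

```python
# Sketch (assumed design, not the paper's exact injection method): gated fusion
# of off-the-shelf commonsense embeddings with BERT outputs for multi-choice MRC.
import torch
import torch.nn as nn
from transformers import BertModel

class KnowledgeGatedBert(nn.Module):
    def __init__(self, bert_name="bert-base-uncased", kb_dim=100, num_choices=3):
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        hidden = self.bert.config.hidden_size
        self.kb_proj = nn.Linear(kb_dim, hidden)   # project commonsense vectors to BERT space
        self.gate = nn.Linear(2 * hidden, hidden)  # token-level gate over [h; k]
        self.classifier = nn.Linear(hidden, 1)     # one score per answer choice
        self.num_choices = num_choices

    def forward(self, input_ids, attention_mask, kb_embeds, kb_mask):
        # input_ids, attention_mask: (batch * num_choices, seq_len)
        # kb_embeds: (batch * num_choices, seq_len, kb_dim) token-aligned commonsense vectors
        # kb_mask:   (batch * num_choices, seq_len), 1 where a token has external knowledge
        h = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        k = self.kb_proj(kb_embeds) * kb_mask.unsqueeze(-1)  # mask out tokens without knowledge
        g = torch.sigmoid(self.gate(torch.cat([h, k], dim=-1)))
        fused = g * h + (1 - g) * k                 # gated injection of external knowledge
        logits = self.classifier(fused[:, 0])       # [CLS] representation scores each choice
        return logits.view(-1, self.num_choices)    # (batch, num_choices)
```

In use, each (passage, question, choice) triple is encoded as one sequence, the per-choice scores are reshaped to (batch, num_choices), and a standard cross-entropy loss over choices fine-tunes the whole model; the knowledge mask plays the filtering role the abstract attributes to its multi-hop mask mechanism.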

Updated: 2021-07-28