Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve
Computational Linguistics (IF 3.7). Pub Date: 2021-03-05. DOI: 10.1162/coli_a_00397
Oshin Agarwal, Yinfei Yang, Byron C. Wallace, Ani Nenkova

Named entity recognition systems achieve remarkable performance on domains such as English news. It is natural to ask: What are these models actually learning to achieve this? Are they merely memorizing the names themselves? Or are they capable of interpreting the text and inferring the correct entity type from the linguistic context? We examine these questions by contrasting the performance of several variants of architectures for named entity recognition, with some provided only representations of the context as features. We experiment with GloVe-based BiLSTM-CRF as well as BERT. We find that context does influence predictions, but the main factor driving high performance is learning the name tokens themselves. Furthermore, we find that BERT is not always better at recognizing predictive contexts compared to a BiLSTM-CRF model. We enlist human annotators to evaluate the feasibility of inferring entity types from context alone and find that humans are also mostly unable to infer entity types for the majority of examples on which the context-only system made errors. However, there is room for improvement: A system should be able to recognize any named entity in a predictive context correctly, and our experiments indicate that current systems may be improved by adding such a capability. Our human study also revealed that systems and humans do not always learn the same contextual clues, and context-only systems are sometimes correct even when humans fail to recognize the entity type from the context. Finally, we find that one issue contributing to model errors is the use of “entangled” representations that encode both contextual and local token information into a single vector, which can obscure clues. Our results suggest that designing models that explicitly operate over representations of local inputs and context, respectively, may in some cases improve performance. In light of these and related findings, we highlight directions for future work.
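The context-only ablation described in the abstract can be made concrete with a small sketch. The following is a minimal illustrative example, not the authors' released code: entity tokens (identified by their BIO tags) are replaced with a neutral mask symbol, so a tagger such as a GloVe-based BiLSTM-CRF or BERT sees only the surrounding context; the converse masking yields a name-only variant. The MASK symbol and function name are assumptions made for illustration.

```python
from typing import List, Tuple

MASK = "[MASK]"  # placeholder symbol; the concrete choice is an assumption here

def make_variants(tokens: List[str], bio_tags: List[str]) -> Tuple[List[str], List[str]]:
    """Return (context_only, name_only) copies of a BIO-tagged sentence.

    context_only hides every entity token, so a model must rely on context;
    name_only hides everything except entity tokens, isolating name memorization.
    """
    context_only = [MASK if tag != "O" else tok for tok, tag in zip(tokens, bio_tags)]
    name_only = [tok if tag != "O" else MASK for tok, tag in zip(tokens, bio_tags)]
    return context_only, name_only

if __name__ == "__main__":
    tokens = ["Spokesmen", "for", "Acme", "Corp", "declined", "to", "comment", "."]
    tags   = ["O", "O", "B-ORG", "I-ORG", "O", "O", "O", "O"]
    ctx, name = make_variants(tokens, tags)
    print("context-only:", " ".join(ctx))
    print("name-only:   ", " ".join(name))
```

Similarly, the suggestion to operate over local and contextual representations separately, rather than over a single entangled vector, might look like the following hypothetical scoring head (a sketch under assumed names and dimensions, not the paper's model):

```python
import torch
import torch.nn as nn

class DisentangledTypeScorer(nn.Module):
    """Hypothetical sketch: the local-token vector and the context vector enter
    as separate inputs and contribute separate, explicitly combined scores."""

    def __init__(self, dim: int, num_types: int):
        super().__init__()
        self.local_head = nn.Linear(dim, num_types)
        self.context_head = nn.Linear(dim, num_types)

    def forward(self, local_vec: torch.Tensor, context_vec: torch.Tensor) -> torch.Tensor:
        # Each evidence source gets its own score, so neither clue can be
        # obscured by being averaged into a single entangled representation.
        return self.local_head(local_vec) + self.context_head(context_vec)
```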




Updated: 2021-03-07