SEPT: Improving Scientific Named Entity Recognition with Span Representation
arXiv - CS - Information Retrieval. Pub Date: 2019-11-08. DOI: arXiv:1911.03353. Tan Yan, Heyan Huang, Xian-Ling Mao
We introduce a new scientific named entity recognizer called SEPT, which stands for Span Extractor with Pre-trained Transformers. Recent papers have demonstrated that span extractors are powerful models compared with sequence labeling models. However, we find that with the development of pre-trained language models, the performance of span extractors has become similar to that of sequence labeling models. To retain the advantages of span representation, we modify the model with under-sampling, which balances the positive and negative samples and reduces the search space. Furthermore, we simplify the original network architecture to combine the span extractor with BERT. Experiments demonstrate that even the simplified architecture achieves the same performance, and SEPT achieves a new state-of-the-art result in scientific named entity recognition even without relation information involved.
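The under-sampling idea in the abstract can be made concrete: a span-based NER model enumerates all candidate spans up to a maximum width, and since nearly all of them are non-entities, the negative spans are sub-sampled at a fixed ratio to the positives during training. The sketch below illustrates this; the function names, the `max_len` cap, and the 3:1 negative ratio are illustrative assumptions, not details taken from the paper.

```python
import random

def enumerate_spans(tokens, max_len=8):
    # All candidate (start, end) spans up to max_len tokens,
    # with inclusive end indices -- the usual span-extractor search space.
    return [(i, j) for i in range(len(tokens))
            for j in range(i, min(i + max_len, len(tokens)))]

def undersample_negatives(spans, gold_spans, neg_ratio=3, seed=0):
    # Keep every gold (positive) span; randomly keep at most
    # neg_ratio negatives per positive to balance the training set.
    gold = set(gold_spans)
    positives = [s for s in spans if s in gold]
    negatives = [s for s in spans if s not in gold]
    k = min(len(negatives), neg_ratio * max(len(positives), 1))
    random.seed(seed)  # deterministic sampling for the sketch
    return positives + random.sample(negatives, k)

# Example: a 6-token sentence with one gold entity span (0, 1).
tokens = ["the", "convolutional", "neural", "network", "was", "trained"]
spans = enumerate_spans(tokens)
kept = undersample_negatives(spans, gold_spans=[(0, 1)], neg_ratio=3)
```

For a 6-token sentence this shrinks the candidate set from 21 spans to 4 (one positive plus three sampled negatives), which is the search-space reduction the abstract refers to.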
Updated: 2020-10-14